Currently, we are unable to scale beyond 20 threads when using a WKS custom model for entity and relation prediction with the NLU service. Our documents are very large, and a single document can take up to 5 minutes to process. With multiple users and multiple documents, we either cannot process more than one document at a time or have to share the 20 threads among the documents. IBM employees told us that the only way to scale is to manually deploy WKS models to new NLU instances when our usage increases. So our choice is either to deploy WKS instances manually whenever usage increases, or to keep a high number of WKS instances deployed at all times and pay $800 × number of instances per month (even when we don't need them). An easy solution would be for you to provide an endpoint to duplicate a custom model, either from Watson Knowledge Studio directly or from the NLU service. That way, we can handle the scaling on our side, and we don't need to hire an employee whose amazing job would be to deploy custom models manually.
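To make the trade-off concrete, here is a rough sketch of the sizing arithmetic behind this request. It uses only the two figures quoted above (20 threads per deployed model and $800 per instance per month). The function names and the `threads_per_document` parameter are hypothetical and only for illustration; they are not part of any IBM API, and the default assumes one document takes all 20 threads.

```python
import math

THREADS_PER_INSTANCE = 20    # thread limit reported in this idea
COST_PER_INSTANCE_USD = 800  # monthly price per instance quoted in this idea


def instances_needed(concurrent_documents: int,
                     threads_per_document: int = THREADS_PER_INSTANCE) -> int:
    """Estimate how many deployed instances are needed to process the given
    number of documents at once (threads_per_document is an assumption)."""
    if concurrent_documents < 0:
        raise ValueError("concurrent_documents must be non-negative")
    if not 1 <= threads_per_document <= THREADS_PER_INSTANCE:
        raise ValueError("threads_per_document must be between 1 and 20")
    # How many documents one instance can process in parallel.
    docs_per_instance = THREADS_PER_INSTANCE // threads_per_document
    return math.ceil(concurrent_documents / docs_per_instance)


def monthly_cost_usd(instances: int) -> int:
    """Monthly cost of keeping this many instances deployed at all times."""
    return instances * COST_PER_INSTANCE_USD


if __name__ == "__main__":
    peak = instances_needed(concurrent_documents=5)
    print(f"{peak} instances, ${monthly_cost_usd(peak)} per month")
```

With an endpoint to duplicate a custom model, we could run this kind of calculation automatically and deploy only the number of instances the current load requires, instead of paying for the peak number all month.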
Why is it useful?
Who would benefit from this IDEA? As a customer, I want to be able to duplicate WKS custom model instances, so that we can scale our entity/relation prediction to more than one user.
How should it work?