Today, various teams, both internal and external, have developed their own harnesses: tools that measure the accuracy of their models as part of their testing efforts. This is not only redundant but likely not the best use of their development cycles.
Example: The TWC - Ads (Creative Labs) team spends about 2-3 weeks of each development cycle measuring, evaluating, and deploying new versions of Conversation workspaces to their customers.
a) Leverage the experiment, testing, and model-management framework from Watson Studio (part of Modeler) for all Watson services, starting with Watson Assistant and Watson Discovery.
b) Create a set of notebooks with pre-built code to run cross-fold validation, blind-set testing, and accuracy analysis.
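The cross-fold validation step in (b) could be sketched as follows. This is a minimal illustration, not the proposed notebooks themselves: `cross_fold_accuracy` and `majority_classifier` are hypothetical names, and the toy classifier stands in for a real Watson workspace, which would be trained and queried through the service API instead.

```python
import random
from collections import Counter

def cross_fold_accuracy(examples, classify, k=5, seed=42):
    """k-fold cross-validation: each fold in turn is held out as the
    test set while the remaining folds act as training data."""
    data = list(examples)
    random.Random(seed).shuffle(data)  # fixed seed for reproducible folds
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i, test_fold in enumerate(folds):
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        correct = sum(1 for text, label in test_fold
                      if classify(train, text) == label)
        scores.append(correct / len(test_fold))
    return sum(scores) / len(scores)  # mean accuracy across folds

def majority_classifier(train, text):
    """Toy stand-in for a trained workspace: always predicts the most
    common label seen in the training fold."""
    return Counter(label for _, label in train).most_common(1)[0][0]
```

In a real notebook, `classify` would wrap a call such as Watson Assistant's message endpoint against a workspace trained on the fold's training examples.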
What do we have now?
a) A shelved research offering called FARCAST (from Yorktown).
b) Notebooks developed by the With Watson team in GitHub for each of the services.
Why is it useful?
Today, every IBM product team is developing its own framework to train, test, and evaluate ML models built with Watson. That time would be better spent understanding the client's business and training needs than writing test harnesses.
Who would benefit from this idea?
As a Data Scientist, I want to run tests against Watson services to determine accuracy metrics for my deployment needs.
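The accuracy metrics in that user story could come from a blind-set analysis like the sketch below. This is an illustrative helper, not part of any existing harness: `blind_set_report` is a hypothetical name, and the (expected, predicted) pairs would in practice come from sending a held-out blind set to a deployed workspace.

```python
from collections import defaultdict

def blind_set_report(pairs):
    """Per-label precision and recall from (expected, predicted) pairs,
    e.g. a blind test set scored against a deployed workspace."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for expected, predicted in pairs:
        if expected == predicted:
            tp[expected] += 1
        else:
            fn[expected] += 1   # the true label was missed
            fp[predicted] += 1  # the predicted label was wrong
    report = {}
    for label in set(tp) | set(fp) | set(fn):
        p_denom = tp[label] + fp[label]
        r_denom = tp[label] + fn[label]
        report[label] = {
            "precision": tp[label] / p_denom if p_denom else 0.0,
            "recall": tp[label] / r_denom if r_denom else 0.0,
        }
    return report
```

A pre-built notebook could render this report per intent so teams compare workspace versions before deploying.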
How should it work?
See proposals (a) and (b) at the top of this section.
Submitter Tags: With Watson