A three-stage approach to make your business AI ready
The number of organizations implementing artificial intelligence (AI) has grown by 270% over the last four years, according to a recent Gartner survey. Even though AI adoption is a growing trend, 63% of organizations haven’t deployed the technology. What is holding them back: cost, a talent shortage, or something else?
For many organizations it is the inability to reach the desired confidence level in the algorithm itself. Data science teams often blow their budget, time and resources on AI models that never make it out of the beginning stages of testing. And even if projects make it out of the initial stage, not all projects are successful.
One example we saw last year was Amazon’s attempt to implement AI in their HR department. Amazon received a huge number of resumes for their thousands of open positions. They hypothesized that they could use machine learning to go through all of the resumes and find the top talent. While the system was able to filter the resumes and apply scores to the candidates, it also showed gender bias. While this proof of concept was approved, they didn’t watch for bias in their training data and the project was recalled.
Companies want to jump on the “Fourth Industrial Revolution” bandwagon and prove that AI will deliver ROI for their businesses. The truth is that AI is in its early stages and many companies are only now getting AI ready. For machine learning (ML) teams starting a project for the first time, a deliberate, three-stage approach to project evolution offers the most direct path to success:
1. Test the fundamental efficacy of your model with an internal Proof of Concept (POC)
The point of a POC is to prove that in a certain case it is possible to save money or improve a customer experience using AI. You are not attempting to get the model to the level of confidence needed to deploy it, but just to say (and show) the project can work.
A POC like this is all about testing things to see if a given approach produces results. There is no sense in making deep investments for a POC. You can use an off-the-shelf algorithm, find open source training data, purchase a sample dataset, create your own algorithm with limited functionality, and/or label your own data. Find what works for you to prove that your project will achieve the intended corporate goal. A successful POC is what is going to get the rest of the project funded.
In the grand scheme of your AI project, this step is the easiest part of your journey. Keep in mind, as you get further into training your algorithm, you will not be able to use sample data or prepare all of your training data yourself. The subsequent improvements in model confidence required to make your system production ready will take immense amounts of training data.
2. Prepare the data you’ll need to train your algorithm… and keep going
In this step the hard work really begins. Let’s say that your POC using pre-labeled data got your model to 60% confidence. 60% is not ready for prime time. In theory, that could mean that 40% of the interactions your algorithm has with customers will be unsatisfactory. How do you reach a higher level of confidence? More training data.
Proving AI will work for your business is a huge step toward implementing it and actually reaping the benefits. But don’t let it lull you into thinking that the next 10% of confidence will take one-sixth of the effort it took to reach the first 60%. The ugly truth is that models have an insatiable appetite for training data, and getting from 60% to 70% confidence could take more training data than it took to get to the original 60%. The data requirements grow exponentially.
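To make that arithmetic concrete, here is a minimal Python sketch. It assumes, purely for illustration, that model error shrinks as a power law in the number of labeled examples, and it treats "confidence" as plain accuracy. The function name `samples_needed` and the constants `A` and `B` are hypothetical, picked so that 10,000 examples correspond to 60%; real learning curves depend on the model and the data.

```python
# Hypothetical learning curve: error(n) = A * n ** (-B)
# A and B are illustrative assumptions, not measured values.
A = 4.0
B = 0.25


def samples_needed(target_accuracy, a=A, b=B):
    """Return the number of labeled examples needed to reach target_accuracy
    under the assumed power-law curve error(n) = a * n ** (-b)."""
    if not 0.0 < target_accuracy < 1.0:
        raise ValueError("target_accuracy must be between 0 and 1 (exclusive)")
    error = 1.0 - target_accuracy
    # Solve a * n ** (-b) = error for n.
    return (a / error) ** (1.0 / b)


previous = 0.0
for target in (0.60, 0.70, 0.80, 0.90):
    total = samples_needed(target)
    print(f"{target:.0%}: about {total:,.0f} examples in total, "
          f"{total - previous:,.0f} more than the previous step")
    previous = total
```

Under these assumed constants, the step from 60% to 70% requires more additional labeled examples than the entire first 60% did. Fitting a curve like this to your own POC results is one rough way to budget for labeling before you commit to a target.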
3. Watch out for possible roadblocks
Imagine: if it took tens of thousands of labeled images to prove one use case for a successful POC, it is going to take tens of thousands of images for each use case you need your algorithm to learn. How many use cases is that? Hundreds? Thousands? There are edge cases that will continually arise, and each of those will require training data. And on and on. It is understandable that data science teams often underestimate the quantity of training data they will need and attempt to do the labeling and annotating in-house. This could also partially account for why data scientists are leaving their jobs.
While not enough training data is one common pitfall, there are others. It is essential that you are watching for and eliminating any sample, measurement, algorithm, or prejudicial bias in your training data as you go. You’ll want to implement agile practices to catch these things early and make adjustments.
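One simple check you can run early is to compare how often each group in your training data carries a positive label. The sketch below is a hedged illustration, not a complete bias audit: the record layout, the function names `positive_rate_by_group` and `is_imbalanced`, and the `min_ratio` threshold are all assumptions made for this example, and the sample records are invented toy data.

```python
from collections import defaultdict


def positive_rate_by_group(records, group_key="group", label_key="label"):
    """Return the share of positively labeled records for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in records:
        totals[row[group_key]] += 1
        if row[label_key]:
            positives[row[group_key]] += 1
    return {group: positives[group] / totals[group] for group in totals}


def is_imbalanced(rates, min_ratio=0.8):
    """Flag the data when the lowest group rate falls below min_ratio
    times the highest group rate. min_ratio is a hypothetical threshold."""
    lowest, highest = min(rates.values()), max(rates.values())
    if highest == 0:
        return False  # no positive labels at all, nothing to compare
    return lowest / highest < min_ratio


# Invented toy data: label 1 means "shortlisted" in this example.
sample_records = [
    {"group": "A", "label": 1}, {"group": "A", "label": 1},
    {"group": "A", "label": 1}, {"group": "A", "label": 0},
    {"group": "B", "label": 1}, {"group": "B", "label": 0},
    {"group": "B", "label": 0}, {"group": "B", "label": 0},
]

rates = positive_rate_by_group(sample_records)
print(rates)
print("Review before training:", is_imbalanced(rates))
```

A flag here does not prove the model will be biased, and a pass does not prove it is fair. It only tells you where to look before you spend more on labeling and training.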
And one final thing to keep in mind: AI labs, data scientists, AI teams, and training data are expensive. Yet a Gartner report that places AI projects among the top three priorities also ranks AI thirteenth on the list of funding priorities. Yes, you’re going to need a bigger budget.
Author: Glen Ford