Everyone is talking about Artificial Intelligence, but what’s really going on? Is AI going to solve all your problems? And what even IS AI? What are the key things you need to take into account when building AI solutions? In practice, AI is often used as an umbrella term, and most of the solutions we refer to are actually machine learning – which has been around for some time. For example, the first artificial neural networks emerged in the mid-1900s.
Sure, methods and algorithms have evolved a lot, and perhaps even more significantly, computing power has increased drastically – but we have been using and developing machine learning models for quite some time. Even though there have been fantastic recent advances in ease of use through ready-made toolkits like TensorFlow and managed cloud services like SageMaker, I do think many of the fundamental lessons learned through experience over the years still apply – and I have sometimes seen those fundamentals get forgotten in all the excitement. I have been working in the field for a while now, and wanted to share a few principles that I think everyone should keep in mind when designing and building – or evaluating, if you are sitting on that side of the fence – successful AI/ML solutions.
1) There is no silver bullet
If you think that there is one algorithm or model that always works better than all others, you are wrong. Sorry. There is no silver bullet, no single algorithm that will solve every problem. Sure, everyone talks about Deep Learning nowadays, and it without doubt provides an unprecedented level of accuracy in many tasks. But it is still not the right solution for all problems. I believe that as a data scientist you need to build a solid understanding of underlying principles as well as knowledge of different types of approaches – their pros and cons, and where they work (and where they don’t).
And as a business owner evaluating a solution to buy or build, you should be a little sceptical when someone says anything along the lines of: “Our solution is the best because it uses ______.” It is just too easy to bolt “AI” onto any solution without generating any value – or at best not much more than marketing value – whereas building solutions that really, robustly deliver on their promise may be a whole different story.
2) Know your models
Whenever applying a model to a set of data, you should be aware of the underlying assumptions of your model and the conditions it operates under, as well as the most common pitfalls. Perhaps the most classic mistake in building machine learning models is “overfitting” a very complex model to a set of data, which may lead to poor ability to extrapolate outside of that particular set.
Here’s a very simple example of what it means: Looking at the figure below, let’s assume we have an underlying linear process, but with some noise on one point. We get four points of data (usually you won’t have such manageable numbers, which makes this much harder to see, but to illustrate the point please bear with me). A line won’t fit perfectly. Neither will a 2nd-degree polynomial – though it will be better in terms of training set error. But a 3rd-degree polynomial will fit exactly – zero error on the training data. But if your data actually comes from a linear process, how well will your model generalize to new values? Let’s see what happens if we take a couple more points and then look at our overly complex model – it doesn’t really work at all for the new data points.
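The effect described above is easy to reproduce yourself. Here is a minimal sketch (the specific numbers are made up for illustration): four points from a linear process, one of them noisy, fitted with both a straight line and a 3rd-degree polynomial.

```python
import numpy as np

# Hypothetical data: an underlying linear process y = 2x + 1,
# with noise added to one training point.
x_train = np.array([0.0, 1.0, 2.0, 3.0])
y_train = 2 * x_train + 1
y_train[2] += 1.5  # one noisy observation

# Fit a line (degree 1) and an overly complex cubic (degree 3).
line = np.polyfit(x_train, y_train, 1)
cubic = np.polyfit(x_train, y_train, 3)

def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

# On the training data the cubic "wins": 4 points, 4 coefficients,
# so it passes through every point exactly.
print(mse(line, x_train, y_train))   # small but non-zero
print(mse(cubic, x_train, y_train))  # essentially zero

# A couple of new points from the true linear process expose the overfit.
x_new = np.array([4.0, 5.0])
y_new = 2 * x_new + 1
print(mse(line, x_new, y_new))   # stays small
print(mse(cubic, x_new, y_new))  # blows up
```

The cubic has zero training error yet is wildly wrong on the new points, while the humble line stays close – exactly the picture in the figure.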
This is just one (overly simplified) example of building a complex solution that can seem, on your training data, to work very well – without understanding the whole situation, and what that can mean for the robustness of your model. It really does make sense to understand the full situation as well as possible before making a decision – especially if not only numbers but money, or even people’s lives, depend on it.
3) Training is key (know your data and how you train your model)
Even if you have the perfect model, it will learn from the data you give it to train on. You need to be very careful about how you train your model. Machine learning means a machine learning from data – the model will learn what is in the data you give it, and, at least until we truly start to see general intelligence, only what is in the data you give it. So in a way most of the current AI is not really all that smart… the outputs will reflect the data you use for training, and if there are biases in the training data (or process), it is only to be expected that they will also be present in the outputs of the model when it is applied.
There are just so many examples out there of situations where a great model has been trained on a subset of data that transfers a bias into the model. Image recognition models that flag people as criminals or terrorists based on ethnicity, for instance, make me feel ashamed to be honest. Military systems have reportedly been trained to discriminate between friendly and enemy tanks using pictures – good, sharp pictures of friendly tanks and some fuzzy long-distance images of enemy tanks. Guess what happens on a foggy day? There are too many examples to mention, but trust me on this! You need to be very careful, because without proper training – and the right training data – your model may become just a “garbage in, garbage out” machine instead of the truly intelligent solution you were looking to build.
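The tank anecdote above can be sketched in a few lines of code. This is a deliberately toy setup with invented features and numbers: imagine each photo reduced to two features, image sharpness and a silhouette score, where sharpness is confounded with the label because of how the photos were taken.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical features per image: [sharpness, silhouette_score].
# In this made-up training set, sharpness is confounded with the label:
# friendly tanks were photographed close up (sharp), enemy tanks from
# far away (blurry). Silhouette barely differs between the classes.
friendly = np.column_stack([rng.normal(0.9, 0.05, 50), rng.normal(0.6, 0.2, 50)])
enemy    = np.column_stack([rng.normal(0.2, 0.05, 50), rng.normal(0.5, 0.2, 50)])

# A nearest-centroid classifier happily latches onto the confound.
c_friendly = friendly.mean(axis=0)
c_enemy = enemy.mean(axis=0)

def predict(x):
    """Classify by distance to the nearest class centroid."""
    d_f = np.linalg.norm(x - c_friendly)
    d_e = np.linalg.norm(x - c_enemy)
    return "friendly" if d_f < d_e else "enemy"

# A friendly tank photographed on a foggy day: blurry, same silhouette.
foggy_friendly = np.array([0.2, 0.6])
print(predict(foggy_friendly))  # "enemy" – the model learned fog, not tanks
```

The model reaches near-perfect accuracy on its training data while having learned nothing about tanks at all – it learned photographic conditions. Only held-out data collected under different conditions reveals the problem.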
So, there you go – a few high-level principles. You probably knew all of this already, but I believe they are all so important that a little repetition won’t hurt. What do you think? What would be your key rules for building a reliable and robust AI solution? Drop me a message, or comment below – I would love to hear from you!
Oh, and also: we are hiring, just drop me a note if you would like to be a part of building the future of cyber security with the help of artificial intelligence!
Vice President, Artificial Intelligence