Khangjrakpam Arjun, South Asian University([email protected])
Harvard Business Review labelled the data scientist's profession “the sexiest job of the 21st century.” These days you must have come across a lot of confusing buzzwords in newspapers, on social media and on the internet: data science (DS), artificial intelligence (AI), machine learning (ML), deep learning (DL), image processing, pattern recognition, computer vision (CV) and so on. We hear of AI-driven cars, machines, healthcare, manufacturing, smart cities and smart traffic affecting almost every sphere of our lives, and of AI's potential to do more. Does it not strike you, then, to ask why AI-related jobs should not be the coolest jobs of the century, and why we should not be striving to work on these AI-driven applications? If AI has so much impact and data science is the coolest job, are the two the same?
As a newbie in the field of DS, DL, ML and AI, I was not sure which field I actually fit into, and the more I tried to separate one from the other, the less sense it made to me. Is it because these domains overlap in layers, and yet no two of them can be merged into a single term? Why do we have books titled “Pattern Recognition and Machine Learning”? Is pattern recognition the same as ML? If it is, why do we have different terms, and if it is not, why not have separate books for the two? When we hear the names of research institutes like the Alan Turing Institute for Data Science and Artificial Intelligence, does it strike you as well to ask why ‘Data Science’ and ‘Artificial Intelligence’ are used alongside each other? There is a lot of confusion around this jargon. Are you ready to bust it? Just hang on till the end.
Let’s begin by looking at the definitions of these terms. Merriam-Webster defines artificial intelligence as:
a branch of computer science dealing with the simulation of intelligent behaviour in computers.
It also describes machine learning as:
the process by which a computer is able to improve its own performance (as in analysing image files) by continuously incorporating new data into an existing statistical model.
Adam Piore said:
An entire speciality known as machine learning is devoted to building algorithms that allow computers to develop new behaviours based on experience.
So, is ML different from AI because it involves building algorithms? Does AI not involve building algorithms too? And when we study AI, does it not involve learning and data, which belong to the terms machine learning and data science? When we see the word simulation in the definition of AI, does the question not occur to you: “How does it simulate?” Just reading the definitions of these two terms gives us a very unclear picture. Wikipedia defines data science as:
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured.
But AI is also a multidisciplinary field, applied in healthcare, manufacturing, space technology, agriculture, defence, etc. Current areas of AI research include adaptive learning, text parsing, pattern recognition, expert systems, speech recognition, natural language processing, etc., which again makes AI multidisciplinary. Therefore, being multidisciplinary is not a criterion that distinguishes AI from data science.
Let’s take a dip into the definition of AI. We can break it down into three categories:
1. Abilities: the capabilities that a computer can have, such as computer vision (CV), speech, natural language processing (NLP), planning/decision making, etc.
2. Tasks: detecting text in a picture, recognising speech, recognising a particular language, etc.
3. Methods: expert systems, machine learning, deep learning, reinforcement learning, etc.
If a system can tell a bird from another animal using a support vector machine (SVM) or a neural network, it is only a small part of AI: it can do one specific task, unlike a human being, who can recognise patterns and speech, make decisions or plans, etc. Such a system is still an AI-powered system. Thus AI encompasses any or all of the different combinations of these abilities, tasks and methods.
Now let us look at an example of machine learning. Suppose we want a system to recognise a cat in a vast collection of different pictures. First, we have to supply labelled sets of data that contain pictures of cats. We also supply a function that we think can capture the relation between the pictures and their labels, for example an SVM equation or an artificial neural network (ANN) function. Given the labelled data sets and the function, we then tell the computer to compute the parameters of the function. We see that AI encompasses all the tasks that we have seen till now, including ML.
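To make the cat example concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: each picture is reduced to two made-up features (ear_pointiness and whisker_density are hypothetical names), the data is a toy set, and the chosen function is a single-neuron perceptron rather than a full SVM or ANN. The point is simply that we supply labelled data and a function, and the computer computes the parameters.

```python
# Toy labelled data: each picture is reduced to two made-up features
# (ear_pointiness, whisker_density); label 1 = cat, 0 = not a cat.
labelled_data = [
    ((0.9, 0.8), 1),
    ((0.8, 0.9), 1),
    ((0.7, 0.7), 1),
    ((0.2, 0.1), 0),
    ((0.1, 0.3), 0),
    ((0.3, 0.2), 0),
]


def predict(weights, bias, features):
    """The chosen function: a weighted sum followed by a threshold."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train(data, epochs=20, learning_rate=0.1):
    """Compute the function's parameters from the labelled examples."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for features, label in data:
            error = label - predict(weights, bias, features)
            # Perceptron update: the parameters change only when the guess is wrong.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias


weights, bias = train(labelled_data)
training_predictions = [predict(weights, bias, f) for f, _ in labelled_data]
print(training_predictions)
```

Once the parameters are learned, predict(weights, bias, new_features) can be called on features from a picture the system has never seen. A real system would learn from raw pixels and far more data, but the division of labour is the same.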
Turning to pattern recognition, a modern definition is:
The field of pattern recognition is concerned with the automatic discovery of regularities in data through the use of computer algorithms and with the use of these regularities to take actions such as classifying the data into different categories.
If we look at computer vision, speech and NLP, all of them have pattern recognition in their tasks: they discover regularities in the data through computer algorithms and use these regularities to take actions such as classification, regression, clustering, generation, etc. Hence, most AI tasks involve pattern recognition. Each of these tasks spans different abilities and may be solved by different methods.
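The two halves of that definition, discovering regularities and then acting on them, can be sketched in a few lines of Python. This is a hedged illustration, not a standard recipe from any of the quoted sources: the feature values are invented, and the "regularity" is taken to be nothing more than the average position (centroid) of each category, with a new sample assigned to the nearest one.

```python
from math import dist
from statistics import mean

# Hypothetical measurements (two features per sample), grouped by category.
samples = {
    "cat": [(0.9, 0.2), (0.8, 0.1), (1.0, 0.3)],
    "bird": [(0.2, 0.9), (0.1, 0.8), (0.3, 1.0)],
}


def find_regularities(samples_by_category):
    """Discover a regularity: summarise each category by its centroid."""
    return {
        category: tuple(mean(axis) for axis in zip(*points))
        for category, points in samples_by_category.items()
    }


def classify(centroids, point):
    """Act on the regularities: pick the category with the nearest centroid."""
    return min(centroids, key=lambda category: dist(centroids[category], point))


centroids = find_regularities(samples)
print(classify(centroids, (0.85, 0.25)))
```

Swapping the centroid rule for an SVM or a neural network changes the method, but not the nature of the task.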
Going on to image processing, is it different from computer vision? Image processing takes an image and gives out an image. In computer vision, if we are carrying out image classification, the system takes an image and gives out the class of the image. The two are not sealed off from each other, though: a result obtained by classical image processing can also be produced by a machine learning algorithm.
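The difference in inputs and outputs is easy to see in code. The sketch below is a toy under stated assumptions: the image is a hypothetical 3x3 grid of grayscale values, invert stands in for image processing, and classify_brightness uses a fixed threshold where a real computer vision system would use a trained model.

```python
# A tiny hypothetical grayscale image: pixel values from 0 (black) to 255 (white).
image = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]


def invert(img):
    """Image processing: an image goes in, an image comes out."""
    return [[255 - pixel for pixel in row] for row in img]


def classify_brightness(img, threshold=128):
    """Classification: an image goes in, a class label comes out.

    The fixed threshold is a stand-in for a trained model.
    """
    pixels = [pixel for row in img for pixel in row]
    average = sum(pixels) / len(pixels)
    return "bright" if average >= threshold else "dark"


inverted = invert(image)
print(inverted[0])
print(classify_brightness(image), classify_brightness(inverted))
```

Note that invert returns something of the same shape as its input, while classify_brightness collapses the whole image into a single label.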
Lastly, is data science a part of AI or is it the other way round or is it neither?
In data science, we can use ML algorithms to predict sales, business growth, profits, etc. from numeric and structured data. When we are dealing with picture data, the work is preferably called computer vision; with speech data, speech; with text data, natural language processing. Since text, speech, numbers and pictures are all data, and given the definition of science:
Science is the intellectual and practical activity encompassing the systematic study of the structure and behaviour of the physical and natural world through observation and experiment.
one can say that a person working in NLP is also working in data science, and so is a person working in computer vision. After breaking down the specificity of the tasks and the jargon associated with them, it is preferable and more accurate to identify someone dealing with NLP as a person working in NLP rather than in data science, which is a broader and more confusing term.
So, we see that all this jargon overlaps at various points. But when we are dealing with a specific task, there is a specific term for it, and it must not be confused with a term that has a broader meaning. I hope that I have, to some extent, succeeded in busting this riddle of jargon for you.