Final Project
1.1 INTRODUCTION
This introduction to machine learning provides an overview of its history, important definitions,
applications and concerns within businesses today.
Machine learning is a branch of artificial intelligence (AI) and computer science which focuses on the use
of data and algorithms to imitate the way that humans learn, gradually improving its accuracy.
IBM has a rich history with machine learning. One of its own, Arthur Samuel, is credited with coining the term “machine learning” through his research around
the game of checkers. Robert Nealey, the self-proclaimed checkers master, played the game on an IBM
7094 computer in 1962, and he lost to the computer. Compared to what can be done today, this feat almost
seems trivial, but it’s considered a major milestone within the field of artificial intelligence. Over the next
couple of decades, the technological developments around storage and processing power would enable some
innovative products that we know and love today, such as Netflix’s recommendation engine or self-driving
cars.
Machine learning is an important component of the growing field of data science. Through the use
of statistical methods, algorithms are trained to make classifications or predictions, uncovering key insights
within data mining projects. These insights subsequently drive decision making within applications and
businesses, ideally impacting key growth metrics. As big data continues to expand, the market demand for data scientists will increase; they will be needed to help identify the most relevant business questions and, subsequently, the data to answer them.
The way in which deep learning and machine learning differ is in how each algorithm learns. Deep
learning automates much of the feature extraction piece of the process, eliminating some of the manual
human intervention required and enabling the use of larger data sets. You can think of deep learning as
"scalable machine learning" as Lex Fridman notes in this MIT lecture (01:08:05) (link resides outside IBM).
Classical, or "non-deep", machine learning is more dependent on human intervention to learn. Human
experts determine the set of features to understand the differences between data inputs, usually requiring
more structured data to learn.
"Deep" machine learning can leverage labeled datasets, also known as supervised learning, to
inform its algorithm, but it doesn’t necessarily require a labeled dataset. It can ingest unstructured data in
its raw form (e.g. text, images), and it can automatically determine the set of features which distinguish
different categories of data from one another. Unlike machine learning, it doesn't require human
intervention to process data, allowing us to scale machine learning in more interesting ways. Deep learning
and neural networks are primarily credited with accelerating progress in areas such as computer vision,
natural language processing, and speech recognition.
Neural networks, or artificial neural networks (ANNs), are composed of node layers, containing
an input layer, one or more hidden layers, and an output layer. Each node, or artificial neuron, connects to
another and has an associated weight and threshold. If the output of any individual node is above the
specified threshold value, that node is activated, sending data to the next layer of the network. Otherwise,
no data is passed along to the next layer of the network. The “deep” in deep learning refers to the
depth of layers in a neural network. A neural network that consists of more than three layers—which would
be inclusive of the inputs and the output—can be considered a deep learning algorithm or a deep neural
network. A neural network that only has two or three layers is just a basic neural network.
See the blog post “AI vs. Machine Learning vs. Deep Learning vs. Neural Networks: What’s the
Difference?” for a closer look at how the different concepts relate.
UC Berkeley breaks out the learning system of a machine learning algorithm
into three main parts.
A Decision Process: In general, machine learning algorithms are used to make a prediction or
classification. Based on some input data, which can be labeled or unlabeled, your algorithm will produce
an estimate about a pattern in the data.
An Error Function: An error function serves to evaluate the prediction of the model. If there are
known examples, an error function can make a comparison to assess the accuracy of the model.
A Model Optimization Process: If the model can fit better to the data points in the training set, then
weights are adjusted to reduce the discrepancy between the known example and the model estimate. The
algorithm will repeat this evaluate and optimize process, updating weights autonomously until a threshold
of accuracy has been met.
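A toy gradient-descent loop makes these three parts concrete. The sketch below (Python with NumPy assumed; the data and learning rate are invented for illustration) fits a single weight to labeled examples: the prediction step is the decision process, the squared-error computation is the error function, and the weight update is the model optimization process.

import numpy as np

# Toy labeled data: learn y = 2x.
X = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 2.0, 4.0, 6.0])

w = 0.0        # model weight, adjusted by the optimizer
lr = 0.01      # learning rate (assumed hyperparameter)

for step in range(1000):
    pred = w * X                          # decision process: produce an estimate
    error = np.mean((pred - y) ** 2)      # error function: compare to known examples
    grad = np.mean(2 * (pred - y) * X)    # model optimization: compute the adjustment...
    w -= lr * grad                        # ...and reduce the discrepancy
    if error < 1e-6:                      # stop once a threshold of accuracy is met
        break

print(w)   # approaches 2.0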
2 MACHINE LEARNING METHODS
Supervised learning, also known as supervised machine learning, is defined by its use of labeled datasets
to train algorithms to classify data or predict outcomes accurately. As input data is fed into the model, it adjusts its weights until the model has been fitted appropriately. This occurs as part of the cross-validation
process to ensure that the model avoids overfitting or underfitting. Supervised learning helps organizations
solve for a variety of real-world problems at scale, such as classifying spam in a separate folder from your
inbox. Some methods used in supervised learning include neural networks, naïve Bayes, linear regression,
logistic regression, random forest, support vector machine (SVM), and more.
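As an illustration of supervised learning on the spam-filtering example, the following sketch (assuming scikit-learn is installed; the tiny e-mail dataset and labels are invented) trains a logistic regression classifier on labeled text and then predicts the label of a new message.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

emails = ["win a free prize now", "meeting agenda for Monday",
          "cheap pills free offer", "project report attached"]
labels = [1, 0, 1, 0]          # 1 = spam, 0 = not spam (labeled training data)

# Convert text to word counts, then fit a logistic regression classifier.
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(emails, labels)      # weights are adjusted to fit the labeled examples

print(model.predict(["free prize offer"]))   # likely output: [1]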
Unsupervised learning, also known as unsupervised machine learning, uses machine learning algorithms to
analyze and cluster unlabeled datasets. These algorithms discover hidden patterns or data groupings without
the need for human intervention. Its ability to discover similarities and differences in information makes it
the ideal solution for exploratory data analysis, cross-selling strategies, customer segmentation, image and
pattern recognition. It’s also used to reduce the number of features in a model through the process of
dimensionality reduction; principal component analysis (PCA) and singular value decomposition (SVD)
are two common approaches for this. Other algorithms used in unsupervised learning include neural
networks, k-means clustering, probabilistic clustering methods, and more.
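The sketch below illustrates two of these unsupervised techniques on synthetic, unlabeled data (scikit-learn and NumPy assumed): k-means clustering to discover groupings and PCA for dimensionality reduction.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

# Synthetic data: two hidden groups in five dimensions, with no labels provided.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 5)),
               rng.normal(5, 1, (50, 5))])

clusters = KMeans(n_clusters=2, n_init=10).fit_predict(X)   # discover groupings
X2d = PCA(n_components=2).fit_transform(X)                   # reduce 5 features to 2

print(clusters[:5], X2d.shape)   # cluster ids and the reduced (100, 2) representation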
Semi-supervised learning offers a happy medium between supervised and unsupervised learning. During
training, it uses a smaller labeled data set to guide classification and feature extraction from a larger,
unlabeled data set. Semi-supervised learning can solve the problem of having not enough labeled data (or
not being able to afford to label enough data) to train a supervised learning algorithm.
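A minimal semi-supervised sketch follows (scikit-learn assumed; the data points are invented). scikit-learn's SelfTrainingClassifier marks unlabeled examples with -1 and lets the small labeled set guide the labeling of the rest.

import numpy as np
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.linear_model import LogisticRegression

X = np.array([[0.0], [0.2], [0.9], [1.0], [0.1], [0.8]])
y = np.array([0, 0, 1, 1, -1, -1])   # a few labeled points plus unlabeled ones (-1)

# The base classifier is trained on the labeled points, then its confident
# predictions on the unlabeled points are added as pseudo-labels.
model = SelfTrainingClassifier(LogisticRegression())
model.fit(X, y)

print(model.predict([[0.05], [0.95]]))   # likely output: [0 1]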
For a deep dive into the differences between these approaches, check out "Supervised vs. Unsupervised
Learning: What's the Difference?"
2.2 REINFORCEMENT MACHINE LEARNING
Reinforcement machine learning is a behavioral machine learning model that is similar to supervised
learning, but the algorithm isn’t trained using sample data. This model learns as it goes by using trial and
error. A sequence of successful outcomes will be reinforced to develop the best recommendation or policy
for a given problem.
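A very small trial-and-error sketch (plain Python; the reward probabilities are invented) shows the idea: an agent tries two actions, occasionally explores, and reinforces whichever action has produced successful outcomes more often.

import random

rewards = {"A": 0.3, "B": 0.7}        # hidden probability that each action succeeds
value = {"A": 0.0, "B": 0.0}          # the agent's current estimate of each action
counts = {"A": 0, "B": 0}

for step in range(1000):
    if random.random() < 0.1:                      # explore occasionally
        action = random.choice(["A", "B"])
    else:                                          # otherwise exploit what worked before
        action = max(value, key=value.get)
    reward = 1.0 if random.random() < rewards[action] else 0.0
    counts[action] += 1
    value[action] += (reward - value[action]) / counts[action]   # reinforce successes

print(value)   # the estimate for "B" should end up higher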
The IBM Watson® system that won the Jeopardy! challenge in 2011 makes a good example. The
system used reinforcement learning to decide whether to attempt an answer (or question, as it were), which
square to select on the board, and how much to wager—especially on daily doubles.
Speech recognition: It is also known as automatic speech recognition (ASR), computer speech recognition,
or speech-to-text, and it is a capability which uses natural language processing (NLP) to process human
speech into a written format. Many mobile devices incorporate speech recognition into their systems to
conduct voice search—e.g. Siri—or provide more accessibility around texting.
Customer service: Online chatbots are replacing human agents along the customer journey. They answer
frequently asked questions (FAQs) around topics, like shipping, or provide personalized advice, cross-
selling products or suggesting sizes for users, changing the way we think about customer engagement across
websites and social media platforms. Examples include messaging bots on e-commerce sites with virtual
agents, messaging apps, such as Slack and Facebook Messenger, and tasks usually done by virtual assistants
and voice assistants.
Computer vision: This AI technology enables computers and systems to derive meaningful information
from digital images, videos and other visual inputs, and based on those inputs, it can take action. This ability
to provide recommendations distinguishes it from image recognition tasks. Powered by convolutional
neural networks, computer vision has applications within photo tagging in social media, radiology imaging
in healthcare, and self-driving cars within the automotive industry.
Recommendation engines: Using past consumption behavior data, AI algorithms can help to discover data
trends that can be used to develop more effective cross-selling strategies. This is used to make relevant add-
on recommendations to customers during the checkout process for online retailers.
Automated stock trading: Designed to optimize stock portfolios, AI-driven high-frequency trading
platforms make thousands or even millions of trades per day without human intervention.
As machine learning technology advances, it has certainly made our lives easier. However, implementing
machine learning within businesses has also raised a number of ethical concerns surrounding AI
technologies. Some of these include:
Privacy tends to be discussed in the context of data privacy, data protection and data security, and these
concerns have helped policymakers make more strides in recent years. For example, in 2016,
GDPR legislation was created to protect the personal data of people in the European Union and European
Economic Area, giving individuals more control of their data. In the United States, individual states are
developing policies, such as the California Consumer Privacy Act (CCPA), which require businesses to
inform consumers about the collection of their data. This recent legislation has forced companies to rethink
how they store and use personally identifiable information (PII). As a result, investments in security have
become an increasing priority for businesses as they seek to eliminate any vulnerabilities and opportunities
for surveillance, hacking, and cyberattacks.
Instances of bias and discrimination across a number of intelligent systems have raised many ethical
questions regarding the use of artificial intelligence. How can we safeguard against bias and discrimination
when the training data itself can lend itself to bias? While companies typically have well-meaning intentions
around their automation efforts, Reuters highlights some of the unforeseen
consequences of incorporating AI into hiring practices. In their effort to automate and simplify a process,
Amazon unintentionally biased potential job candidates by gender for open technical roles, and they
ultimately had to scrap the project. As events like these surface, Harvard Business Review has raised other pointed questions around the use of AI within hiring practices, such as what data you should be able to use when evaluating a candidate for a role.
Bias and discrimination aren’t limited to the human resources function either; they can be found in a number of applications, from facial recognition software to social media algorithms.
As businesses become more aware of the risks with AI, they’ve also become more active in the discussion around AI ethics and values. For example, in 2020 IBM’s CEO Arvind Krishna shared that
IBM has sunset its general purpose IBM facial recognition and analysis products, emphasizing that “IBM
firmly opposes and will not condone uses of any technology, including facial recognition technology
offered by other vendors, for mass surveillance, racial profiling, violations of basic human rights and
freedoms, or any purpose which is not consistent with our values and Principles of Trust and Transparency.”
To read more about this, check out IBM’s policy blog, relaying its point of view on “A Precision
Regulation Approach to Controlling Facial Recognition Technology Exports.”
3.4 ACCOUNTABILITY
Since there isn’t significant legislation to regulate AI practices, there is no real enforcement mechanism to
ensure that ethical AI is practiced. The current incentives for companies to adhere to these guidelines are
the negative repercussions that an unethical AI system can have on the bottom line. To fill the gap, ethical frameworks
have emerged as part of a collaboration between ethicists and researchers to govern the construction and
distribution of AI models within society. However, at the moment, these only serve to guide, and research
shows that the combination of distributed responsibility and lack
of foresight into potential consequences isn’t necessarily conducive to preventing harm to society.