Data Science in 2021


A Look at Data Science in 2021
In this e-guide:
As demand for data science expertise grows ever higher, the challenges data scientists are
being asked to tackle are becoming more and more complex.

Keep reading, and you’ll learn how you can tackle common data science problems, grow your
skillset via popular certifications, and bring innovative data science best practices to your
company.

• 14 most in-demand data science skills you need to succeed
• Certificates important to the data science learning path
• Data science vs. machine learning vs. AI: How they work together
• 4 data science project best practices to follow

14 most in-demand data science skills you need to succeed
Kathleen Walch, Principal analyst

As companies continue to grow their data assets, the need to extract meaningful information
-- and business value -- from that data is becoming increasingly important. Analyzing and
gleaning insights from data requires a different skill set than simply storing and managing it.
Many organizations are quickly realizing that they need talented analytics professionals who
have specific skills in scientific methods, statistical approaches, data analysis and other
data-centric methodologies.

Emerging a little over a decade ago, the field of data science focuses on uncovering
information and insights in large amounts of both structured data and unstructured data. It
enables organizations to get answers to business questions, spot trends and make intelligent
predictions based on analysis of their data.

Data science work is typically performed by data scientists. With backgrounds in
mathematics, statistics, data mining, advanced analytics, algorithms and now machine
learning and AI, data scientists can gain a comprehensive understanding of data and apply
their skills to find relevant analytics results.

For prospective data scientists, and organizations looking to hire them, the critical skills they
need to do their jobs effectively include both technical capabilities and soft skills -- personality
traits and characteristics that can help them achieve the desired outcomes and bridge the gap
between technologists and business executives and workers. Let's look more closely at these
key data science skills.

Data science technical skills
In order for data scientists to ask the right questions, develop good analytical models and
successfully analyze the findings, they must have a variety of "hard skills" that require specific
training and education. Here are eight technical skills that data scientists typically need.

Statistics. Since data scientists regularly apply statistical concepts and techniques, it should
come as no surprise that it's important for them to have a good understanding of statistics.
Being familiar with statistical analysis, distribution curves, probability and other elements of
statistics helps data scientists collect, organize, analyze, interpret and present data -- better
enabling them to work with the data to find useful results.
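
As a small illustration of the kind of statistical summary described above, the sketch below uses
Python's built-in statistics module. The sample values and the two-standard-deviation cutoff are
assumptions chosen for the example, not figures from this guide.

```python
import statistics

# Hypothetical sample: daily order counts for one week (illustrative values only).
daily_orders = [120, 135, 128, 410, 131, 126, 133]

mean_orders = statistics.mean(daily_orders)      # arithmetic mean
median_orders = statistics.median(daily_orders)  # middle value, robust to outliers
stdev_orders = statistics.stdev(daily_orders)    # sample standard deviation

# Assumed rule of thumb: flag values more than two standard deviations from the mean.
outliers = [x for x in daily_orders if abs(x - mean_orders) > 2 * stdev_orders]

print(f"mean={mean_orders}, median={median_orders}, outliers={outliers}")
```

The gap between the mean and the median hints that one unusual day is skewing the average, which
is the sort of observation that basic statistical literacy makes easy to spot.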

Calculus and linear algebra. Being able to apply mathematical concepts to understand and
optimize fitting functions for matching a model to a data set is incredibly important to getting
accurate predictions from the model. Additionally, data scientists should be versed in using
dimensionality reduction to simplify complicated analysis problems involving high-dimensional
data. These skills are also important in machine learning -- for example, to train an artificial
neural network on large volumes of data.

Relevant coding skills. Many data scientists learn programming out of necessity. They
typically aren't coding masters and usually don't have a degree in computer science, but they
are familiar with the basics. Popular programming skills for data scientists include knowledge
of the Python, R, SQL and Julia languages.

Predictive modeling. Being able to use data to make predictions and model different
scenarios and outcomes is a central part of data science. Predictive analytics looks for
patterns in existing or new data to forecast future events, behavior and results; it can be
applied to various use cases in different industries. As a result, predictive modeling skills are
heavily used by data scientists.

Machine learning and deep learning. While data scientists don't necessarily need to work
with AI technologies, they're increasingly being hired by companies looking to implement
machine learning applications, in which they train algorithms to learn about data sets and then
look for patterns, anomalies or insights in the data. As a result, demand is on the rise for data
scientists who are skilled in the supervised, unsupervised and reinforcement learning
methods used in machine learning. Skills in deep learning, which uses neural networks to
create complex analytical models, particularly help data scientists stand out.

Data wrangling. Over 80% of the time spent on data science projects is often devoted to
wrangling and preparing data for analysis. While most of the data preparation tasks fall on
data engineers, data scientists can benefit from being able to do basic data profiling,
cleansing and modeling tasks. That allows them to deal with imperfections in data, such as
missing fields, mislabeled fields or formatting issues. Data wrangling skills also involve
collecting data from multiple sources and massaging data formats to work with the required
algorithms.
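
To make those wrangling tasks concrete, here is a minimal sketch in plain Python that combines
records from two sources and handles a mislabeled field, inconsistent date formats and a missing
value. The records, field names and date formats are all hypothetical, and real projects would
typically lean on a dedicated data preparation tool or library.

```python
from datetime import datetime

# Hypothetical records from two sources with inconsistent field names and formats.
source_a = [{"customer": " Alice ", "signup": "2021-01-15", "spend": "120.50"}]
source_b = [{"Customer Name": "BOB", "signup": "03/02/2021", "spend": ""}]

# Assumed mapping from mislabeled field names to the names we want.
FIELD_ALIASES = {"Customer Name": "customer"}
# Assumed date formats; the second is read as US-style month/day/year.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")

def parse_date(text):
    """Try each known format; return None when nothing matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None

def clean(record):
    """Rename mislabeled fields, normalize text and dates, mark missing spend as None."""
    row = {FIELD_ALIASES.get(key, key): value for key, value in record.items()}
    return {
        "customer": row.get("customer", "").strip().title(),
        "signup": parse_date(row.get("signup", "")),
        "spend": float(row["spend"]) if row.get("spend") else None,
    }

cleaned = [clean(record) for record in source_a + source_b]
for row in cleaned:
    print(row)
```

Keeping missing values as None, rather than silently substituting zero, leaves the decision about
how to treat them to the analysis step.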

Model deployment and production. Data scientists spend much of their time building and
deploying models. They need to be able to select the correct algorithm and then use
training data for supervised learning approaches or run the algorithm to automatically find
clusters or patterns in unsupervised learning ones. Once a model produces the desired
results, data scientists, often working with data engineers, must deploy it in a production
environment to help their organizations make practical business decisions on an ongoing
basis.

Data visualization. Especially when working with big data sets that contain different data
types, being able to present analytics results in a visually appealing format is another
important data science skill. Data scientists must have the ability to use data storytelling
to highlight and explain the insights they've generated, and data visualization is a core way
that they communicate those insights to business executives and other stakeholders. As a
result, they should master the use of Tableau, D3.js or various other visualization tools that
are widely available to help with the process.

Nontechnical and soft skills
In addition to technical skills, it's just as important for data scientists to possess a set of soft
skills. As mentioned above, many data scientists need to be able to translate analytics
findings and report on them to their business colleagues. Additionally, certain innate traits
help them look at large pools of data with an inquiring mind, form analytics hypotheses and
find gems of knowledge hidden in the data. These six soft skills are part of the makeup of a
well-rounded data scientist.

Business knowledge. At many organizations, data science teams fall under a line of
business, rather than being in IT or a centralized analytics group. And even if that isn't the
case, their work still focuses on business issues. As such, data scientists need to have a
strong understanding of the business and the industry it's in. This helps them to ask better
data analysis questions, identify new ways that the company should use its data and know
which analytics problems to prioritize.

Problem solving. Data scientists are often asked to find information needles in very large
data haystacks. To do so, they come up with a hypothesis related to a business opportunity or
problem and then try to validate it by analyzing the data. As they work through the analytics
process, they need to have a keen mind for problem solving to figure out how various pieces
fit into the equation and determine which data should be included or left out, among other
tasks.

Curiosity. Being curious, asking questions and having a desire to continually learn are
important skills to possess as a data scientist. Curious minds are able to sift through large
amounts of data to find answers and insights. Data itself constantly changes, so it's important
not to be complacent in the ways you approach data or limit yourself to the current
conclusions derived from the data.

Critical thinking. Critical thinking skills are also crucial for data scientists. They need to be
able to assess data sets, analytics results and various additional information to form
judgments about their validity and relevance. Looking at data with a skeptical eye helps data
scientists reach accurate and unbiased conclusions.

Communication. Data scientists who work with data on a daily basis understand it, and its
nuances and intricacies, better than anyone else. The same, of course, goes for the findings
they produce as part of data science applications. They need to be able to successfully
communicate their understanding of the data and explain the analytics results so business
executives and workers can use the information to make good decisions.

Collaboration. Being able to work as part of a larger team is important, too. Data scientists
often need to collaborate with each other and with data analysts, business leaders, subject
matter experts, data engineers and other people in an organization.

Learning resources for data scientists
Because of the many technical skills that are required, data science isn't a field that one can
master in just a few weeks or through casual online courses, code academies and
bootcamps. Usually, data scientists have various academic degrees and certifications, and
they partake in continuous learning to stay up to date on the latest data science techniques
and tools. However, for those looking to get started, an increasing number of resources and
opportunities are now available.

Many universities offer degrees in data science at both the undergraduate and graduate
levels. Additionally, various online courses and other learning resources are available through
websites such as Coursera and Udemy. If you're looking to learn the fundamentals or basics
of data science, many analytics software vendors and traditional code academy programs
have also set up specific data science training courses.

And now is a good time to take advantage of those resources. As more and more companies
look to hire people with data science skills, and the talent crunch in this field continues, the
need for well-trained data scientists and other analytics professionals will only continue to
increase.

Certificates important to the data science learning path
Joseph Carew, Assistant Site Editor

Certificates can help establish a benchmark of a data scientist's technical ability and skill, but
their true impact on a resume remains unclear.

Organizations are looking for continued education and proof of technical ability for their open
data science positions. Certificates can give candidates the skills to set themselves apart but
-- for data scientists seeking an edge -- the data science learning path can never cease.
Certificates are a natural step on this journey but, above all, employers are looking for hard
evidence of technical skill to match their open positions.

The value of a certificate
When it comes to data science, having a wide array of skills and a drive to learn more
are crucial. This is especially true for those seeking new positions.

"Continuing and pushing your skills in this field is the key to success," said Josh St. John, vice
president of client delivery at Brooks Bell, an analytics and UX consultancy. "Data science,
today, involves ever-evolving techniques and technologies that can make large impacts in
what can be accomplished in business."

A common way for candidates to grow their skill set is through continuing traditional education
and earning certificates. However, certificates are less powerful on the data science learning
path than they used to be.

"Certificates have a muddled view of data science training," said Justin Richie, data science
director at Nerdery. "Cloud certifications from AWS, GCP and Azure have more value than
other certifications because they show an understanding of infrastructure."

The valuable certifications are the ones that can prove specific knowledge of a technical skill.
These help organizations better understand what data scientists are familiar with, but even
these certificates have limitations in their value.

Fortunately, most certificates that candidates can earn in the space are relevant to a specific
technology, not an overall methodology. This is helpful because they can provide some
evidence of technical familiarity, but they're tricky for those seeking a combination of skills.

Oftentimes certificates fall short of the value that on-the-job experience has in the hiring
process because of their limited scope. Candidates would have to spend significant time
collecting numerous certificates to truly stand out.

"Certificates are not as influential as solid technical ability and relevant project work," Richie
said. "Most of today's data scientists come from all walks of life, so it's very hard to gauge skill
set without more technical evaluations."

Getting the most out of certificates
To make the most use of certificates, it is important to understand that, while being certified
doesn't necessarily make or break a candidate in the interview process, it can provide
interviewers with insight to a candidate's knowledge level. Certifications can generally help
match candidates to a position's responsibilities.

"In the interview process, certificates are most influential when they're relevant to specific
skills required to fulfill the role responsibilities," St. John said.

But beyond that, having a long list of certifications does not make a candidate a preferred
choice when another person displays a strong core in the base requirements.

"While certificates and certificate training programs are useful in growing data scientist skill
sets, hands-on experience is always a much more effective teacher of these types of skills,"
St. John said. "When you combine the two, you get the well-rounded experience needed to
grow in your role."

What matters in the interview process?
Balancing experience with certifications is crucial for candidates interviewing in data science.
"Relevant work is critical," Richie said. "The most important item that stands out to me in
hiring decisions is a strong GitHub profile."

When it comes to large-scale employers such as Google and Facebook, proof of technical
ability outweighs traditional credentials like college degrees.

"Data preparation, analysis and presentation are baseline requirements for data scientists,"
St. John said. "This includes fundamental knowledge of statistical tools, databases, modeling
and -- of course -- mathematical ability."

It is the demonstration of the abilities associated with certifications that is key for the
candidate. Certificates are not enough if, during the interview, the skills can't be proved. And
the best data scientists can prove these skills, as well as translate their knowledge to a
nontechnical audience and therefore allow them to act on those recommendations
accordingly.

It is only through investing in oneself and expanding technical capabilities that a data scientist
can rise above other candidates.

"Continued education is essential," Richie said. "I envision continuing education and degrees
as being the catalyst for maturity in the industry."

Data science vs. machine learning vs. AI: How they work together
Ronald Schmelzer, Principal analyst

Today's organizations are awash in data. Just a decade ago, a gigabyte of data still seemed
like a large quantity. Nowadays, however, some large organizations are managing upward of
a zettabyte. To get a sense of how much data that is, if your typical laptop or desktop
computer has a 1 TB hard drive inside it, a zettabyte is equal to one billion of those hard
drives.
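
The comparison above can be checked with a few lines of arithmetic. This sketch assumes decimal
(SI) units, in which a terabyte is 10^12 bytes and a zettabyte is 10^21 bytes.

```python
# Decimal (SI) storage units, expressed in bytes.
TERABYTE = 10**12
ZETTABYTE = 10**21

# Number of 1 TB drives needed to hold one zettabyte.
drives_per_zettabyte = ZETTABYTE // TERABYTE

print(f"{drives_per_zettabyte:,} one-terabyte drives")  # 1,000,000,000
```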

How can organizations even hope to get any business value from so much data? They need
to be able to analyze it and identify needles of valuable knowledge in an almost infinite
haystack. That's where the combination of data science, machine learning and AI has become
remarkably useful -- but you don't need anywhere near a zettabyte of data for those three
things to be relevant.

Once relegated to esoteric corners of academia and research or the wonky side of IT and
data management, they've collectively emerged as crucial technology topics for organizations
of all types and sizes in various industries. However, there's often still confusion about data
science vs. machine learning vs. AI and what each involves. Understanding the nature and
purpose of these transformative concepts will point the way toward how to best apply them to
meet pressing business needs.

Let's look at each one, plus the differences between them and how they can be used
together.

Data science
While data has been central to computing since its inception, a separate field dealing
specifically with data analytics didn't emerge until many decades later. Rather than the
technical aspects of data management, data science focuses on statistical approaches,
scientific methods and advanced analytics techniques that treat data as a discrete resource,
regardless of how it's stored or manipulated.

At its core, data science aims to extract useful insights from data given the specific
requirements of business executives and other prospective users of those insights. What are
customers interested in purchasing? How is the business doing with a particular product or in
a geographic region? Is the COVID-19 pandemic straining or growing resources? These are
questions that can be answered using the mathematics, statistics and data analytics that are
part of the data science process.

Traditionally, organizations have depended on business intelligence systems to derive
insights from their growing pools of data. However, BI systems depend partly on humans to
spot trends in spreadsheets, dashboards, charts or graphs. They're also challenged by at
least four of the Vs of big data: volume, velocity, variety and veracity. As organizations store
data in increasing quantities and collect it at increasing speed from a wide variety of data
sources, in different formats and with different data quality levels, the conventional data
warehousing and business analytics approaches that BI is built on fall short.

By comparison, the experiences of leading-edge companies, such as Amazon, Google, Netflix
and Spotify, show how applying the fundamental aspects of data science can help uncover
deeper insights that provide significant competitive advantages over business rivals. They and
other organizations -- banking and insurance companies, retailers, manufacturers and many
more -- use data science to spot patterns in data sets, identify potentially anomalous
transactions, uncover missed opportunities with customers and create predictive models of
future behavior and events.

Likewise, healthcare providers rely on data science to help diagnose medical conditions and
improve patient care, while government agencies use it for things such as providing early
notification of potentially life-threatening situations and ensuring the safety and security of
critical systems and infrastructure.

Data science work is done primarily by data scientists. While there's no universal consensus
on their job description, this is the minimum set of skills that effective data scientists must
have:

• a firm grasp on statistics and probability;
• knowledge of various algorithmic approaches to analyzing data;
• the ability to use various tools, technologies and techniques to plumb large data sets
for the desired analytics results; and
• data visualization capabilities to provide visibility into the derived insights.

As part of data science teams, data scientists often work with data engineers to facilitate the
collection and wrangling of data from multiple source systems, as well as business analysts
who understand evolving business needs, data analysts who understand the characteristics of
changing data sets and developers who can help put the analytical models generated by data
science applications into production.

Increasingly, those models are being called on to do more than just provide a snapshot of
insights into the current state of data. Data scientists can train algorithms to learn patterns,
correlations and other characteristics about sample data and then analyze full data sets that
they haven't seen before. In this way, data science has contributed to the growth of artificial
intelligence and, in particular, the use of machine learning to support the goals of AI.

Machine learning
One of the hallmarks of intelligence is the ability to learn from experience. If machines can
identify patterns in data, they can then use those patterns to generate insights or predictions
on new data that they're run against. This is the fundamental idea behind machine learning.

Machine learning relies on algorithms that can encode learning from examples of good data
into models. The models can be used for a wide range of applications, such as classifying
data into categories ("Is this image a cat?"), predicting a value for some data given previously
identified patterns ("What is the probability that this transaction is fraudulent?"), and identifying
groups in a data set ("What other products can I recommend to those who have bought this
product?").

The core concepts of machine learning are embodied in the ideas of classification, regression
and clustering. A wide range of machine learning algorithms have been created to perform
those tasks across disparate data sets. The available algorithms include decision trees,
support vector machines, K-means clustering, K-nearest neighbors, Naïve Bayes classifiers,
random forests, Gaussian mixture models, linear regression, logistic regression, principal
component analysis and many others. Data scientists typically build and run the algorithms;
some data science teams now also include machine learning engineers, who help code and
deploy the resulting models.

The machine learning process involves different types of learning, with varying levels of
guidance by data scientists and analysts. The primary alternatives are:

• supervised learning, which starts with human-labeled training data that helps instruct
algorithms on what to learn;
• unsupervised learning, a method in which an algorithm is left to discover information on
its own using unlabeled training data; or
• reinforcement learning, which lets algorithms learn through trial and error with initial
instructions and ongoing oversight from data scientists.
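
The difference between the first two approaches can be sketched in a few lines of plain Python.
The example below is only illustrative: the transaction amounts and labels are hypothetical, the
supervised part is a bare-bones 1-nearest-neighbor lookup and the unsupervised part is a tiny
one-dimensional K-means with two clusters. Reinforcement learning is not shown.

```python
# Supervised: human-labeled examples (hypothetical transaction amounts) guide predictions.
labeled = [(12.0, "normal"), (15.0, "normal"), (980.0, "suspicious"), (1010.0, "suspicious")]

def predict_nearest(amount):
    """1-nearest-neighbor: return the label of the closest labeled example."""
    _, label = min(labeled, key=lambda pair: abs(pair[0] - amount))
    return label

# Unsupervised: no labels; a tiny 1-D K-means (k=2) discovers the groups on its own.
def two_means(values, iterations=10):
    centers = [min(values), max(values)]  # simple deterministic starting points
    groups = ([], [])
    for _ in range(iterations):
        groups = ([], [])
        for value in values:
            nearest = 0 if abs(value - centers[0]) <= abs(value - centers[1]) else 1
            groups[nearest].append(value)
        # Move each center to the mean of its group; keep it in place if the group is empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

unlabeled = [12.0, 15.0, 14.0, 980.0, 1010.0, 995.0]
centers, groups = two_means(unlabeled)

print(predict_nearest(20.0), predict_nearest(900.0))
print(centers, groups)
```

In the supervised case the labels tell the algorithm what "suspicious" means; in the unsupervised
case it only learns that the amounts fall into two groups, and a person still has to interpret them.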

Of late, no algorithmic approach has generated as much excitement and promise as the use
of artificial neural networks. Like the biological systems they're inspired by, neural networks
comprise neurons that can take input data, apply weights and bias adjustments to the inputs
and then feed the resulting outputs to additional neurons. Through a complex series of
interconnections and interactions among these neurons, the neural network can learn over
time how to adjust the weights and biases in a way that provides the desired results.
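
The neuron described above can be sketched as a weighted sum plus a bias, passed through an
activation function. In this minimal example the sigmoid activation, the input values and the
hand-picked weights are all assumptions for illustration; a real network would learn its weights
and biases from training data.

```python
import math

def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the activation function."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(total)

# Hypothetical, hand-picked values; in practice weights and biases are learned from data.
inputs = [0.5, -1.0]
hidden = [
    neuron(inputs, weights=[0.8, 0.2], bias=0.1),
    neuron(inputs, weights=[-0.4, 0.9], bias=0.0),
]
# The hidden neurons' outputs become the inputs of the next neuron.
output = neuron(hidden, weights=[1.5, -1.1], bias=-0.2)

print(hidden, output)
```

Training amounts to repeatedly nudging those weights and biases so that the final output moves
closer to the desired result.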

What started out in the 1950s simply as a single layer of neurons in the perceptron algorithm
has evolved into a much more complicated approach -- known as deep learning -- that uses
multiple layers to produce nuanced and sophisticated results. These multilayered neural nets
have shown a remarkable ability to learn from large data sets and enable uses such as facial
recognition, multilingual conversational systems, autonomous vehicles and advanced
predictive analytics.

With a significant push from data-drenched companies like Google, Netflix, Amazon, Microsoft
and IBM, what once seemed like a research hypothetical rapidly became the here-and-now
possible, really taking hold in the early 2000s. The availability of big data, capabilities of data
science and power of machine learning not only provide answers to today's organizational
challenges but also may help crack the longstanding challenge of making AI a full reality.

Artificial intelligence
AI is an idea older than computing itself: Is it possible to create machines that have the
cognitive ability of humans? The idea has long inspired academicians, researchers and
science fiction writers, and it emerged as a practical pursuit in the middle of the 20th century.
In 1950, computing pioneer and well-known code-cracker Alan Turing came up with a
fundamental test of machine intelligence, which became known as the Turing Test. The term
artificial intelligence was coined in the proposal for a seminal AI conference that took place at
Dartmouth in 1956.

AI remains a dream, at least in the form that many envisioned decades ago. The concept
of a machine with the full range of cognitive and intellectual capabilities that people have is
known as artificial general intelligence (AGI), or, alternatively, general AI. No one has yet built
such a system, and the development of AGI may be decades away, if it's feasible at all.

However, we have been able to tackle narrow AI tasks. Cognilytica, my research firm, has
defined seven patterns of AI that focus on specific needs for perception, prediction or
planning. For example, they include training machines to:

• accurately recognize images, objects and other elements in unstructured data;
• have meaningful conversational interactions with people;
• use derived insights to power predictive analytics systems;
• spot patterns and anomalies in large data sets;
• create detailed profiles of individuals for hyperpersonalization uses;
• power autonomous systems with minimal or no human involvement; and
• solve scenario simulations and other challenging goal-driven problems.

Each of these narrow use cases provides significant capabilities and value today, despite not
addressing the overarching goals of AGI. The development of machine learning has directly
led to the advancement of these narrow AI applications. And because data science has made
machine learning practical, it too has helped make them a reality.

Differences between data science, machine learning and AI
While data science, machine learning and AI have affinities and support each other in
analytics applications and other use cases, their concepts, goals and methods differ in
significant ways. To further differentiate between them, consider these lists of some of their
key attributes.
science skills you need to
Data science:
succeed
• focuses on extracting information needles from data haystacks to aid in decision-
• Certificates important to the making and planning;
• is applicable to a wide range of business issues and problems through descriptive,
data science learning path predictive and prescriptive analytics applications;
• deals with data at a small scale up through very large data sets; and
• Data science vs. machine • uses statistics, mathematics, data wrangling, big data analytics, machine learning and
various other methods to answer analytics questions.
learning vs. AI: How they
Machine learning:
work together

• 4 data science project best • focuses on providing a means for algorithms and systems to learn from experience
with data and use that experience to improve over time;
practices to follow • learns by examining data sets rather than explicit programming, which makes use of
data science methods, techniques and tools a key asset;
• can be done through supervised, unsupervised or reinforcement learning approaches;
and
• supports artificial intelligence uses, especially narrow AI applications that handle
specific tasks.

Artificial intelligence:

• focuses on giving machines cognitive and intellectual capabilities similar to those of
humans;
• encompasses a collection of intelligence concepts, including elements of perception,
planning and prediction;
• is capable of augmenting or replacing humans in specific tasks and workflows; and
• currently doesn't address key aspects of human intelligence, such as commonsense
understanding, applying knowledge from one context to another, adapting to change
and displaying sentience and awareness.

How data science, machine learning and AI can be combined

The power of data science on its own is significant. Combining it with machine learning adds
even more potential value for generating insights from ever-growing pools of data. Used
together, they also drive a variety of narrow AI applications and might eventually lead to
solving the challenge of general AI.

More specifically, here are some examples of how organizations are combining data science,
machine learning and AI to potent effect:

• predictive analytics applications that forecast customer behavior, business trends and
events based on analysis of constantly changing data sets;
• AI-enabled conversational systems that can engage in highly interactive
communications with customers, users, patients and other individuals;
• anomaly detection systems driven by machine learning and AI that can respond to
continually evolving threats and power adaptive cybersecurity and fraud detection
systems; and
• hyperpersonalization systems that enable targeted advertising, product
recommendations, financial guidance and medical care, plus other personalized
offerings to customers.

While data science, machine learning and AI are separate concepts that individually offer
powerful capabilities, using them together is transforming the way we manage organizations
and business operations -- and how we live, work and interact with the world around us.

4 data science project best practices to follow
Yujun Chen and Dawn Li

Automation is a key component of enterprises' push to transform their operations, and a driver
of that automation is data science. However, misconceptions about data science, AI and
machine learning persist, particularly when it comes to data projects.

To help address some of these, here are four data science project best practices for
organizations to follow.

1. Understand the business requirement

A common misconception about data scientists is that they simply grab data, run models and
then produce results. While they do all these things, the most important part of the job is to
first establish and understand the use case for a particular model. Put simply, what is the
business problem that needs to be addressed?

For data scientists, this process amounts to converting the business problem into a
mathematical one. But to do that, they must thoroughly understand the pain point of the
business or customer, as this will determine the data sets used to build the models.

Data scientists can only understand the business problem by fully understanding the market
the business operates in. Data scientists must also work closely with business teams, such as
product managers, to understand exactly how a customer views their problem.

2. Communicate effectively

Communicating with a business team is an important data science project best practice to
follow, but this has its difficulties. Data scientists typically have more technical backgrounds
than product managers, so communicating complex mathematical solutions effectively -- i.e.
in a way that can be understood and fed back to clients -- poses a challenge. They can't
simply point to a set of formulas and say, "These meet the customer's requirements, so we're
ready to get going."

Properly conveying how a model can answer a business problem is a soft skill that data
scientists should develop. When they do, the business team can help ask the right questions
that will enable data scientists to identify the right datasets for the models.

"We need an efficient way to do X" is a simplistic but typical starting point for any data project.
But it is generally understood that "X" is never clearly defined. This is when data scientists
work with the business teams to eliminate ambiguities and refine the use case.

Never underestimate the power of "Why?" It's sometimes the case that a customer's demand
doesn't address the problem. A data scientist may not have the datasets available to achieve
the best model, so an alternative and workable answer may be needed. Adjusting the target
to what's possible is essential in this instance and, again, requires effective communication
with the business team so that the technical constraints can be relayed to the client.

3. Avoid junk in, junk out


Data scientists are faced with many inherent constraints when it comes to getting the
information needed for the models, from gaining the right permissions to access certain
datasets and regulatory issues around sensitive data, to the disparate locations and formats
of the data required. Once they have this information in one place, they then manipulate the
data to identify the features that will become the input for the models.

This process can take up to 90% of a data scientist's time as they need to clean the data,
locate anomalies and missing values and merge datasets. Often, the tools and algorithms
needed to create a certain use case already exist through open source tools such as
Python, TensorFlow and PyTorch. This is why feature engineering, due diligence and data
manipulation are the most time-consuming parts of the job.
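
As a rough illustration of the cleaning work described above, the following Python sketch drops records with missing values, flags a likely anomaly and merges in a second dataset. The records, the field names, the `clean_and_merge` helper and the median-ratio rule are all hypothetical choices made for this example, not a method prescribed by the authors.

```python
from statistics import median

# Hypothetical raw records: monthly spend per customer, with a gap and an outlier.
spend = [
    {"customer_id": 1, "monthly_spend": 120.0},
    {"customer_id": 2, "monthly_spend": None},    # missing value
    {"customer_id": 3, "monthly_spend": 135.0},
    {"customer_id": 4, "monthly_spend": 9800.0},  # likely anomaly
    {"customer_id": 5, "monthly_spend": 110.0},
]
# Hypothetical second dataset to merge in, keyed by the same customer_id.
regions = {1: "north", 2: "south", 3: "north", 5: "west"}


def clean_and_merge(spend_rows, region_by_id, max_ratio=10.0):
    """Drop rows with missing spend, flag anomalies and attach a region."""
    present = [r for r in spend_rows if r["monthly_spend"] is not None]
    typical = median(r["monthly_spend"] for r in present)
    merged = []
    for row in present:
        merged.append({
            **row,
            # Flag values far above the median; the ratio is an arbitrary assumption.
            "is_anomaly": row["monthly_spend"] > max_ratio * typical,
            # Left-join style merge: keep the row even when no region is known.
            "region": region_by_id.get(row["customer_id"]),
        })
    return merged


features = clean_and_merge(spend, regions)
for row in features:
    print(row)
```

Even in a toy case like this, each step involves a judgment call (what counts as an anomaly, whether to keep unmatched rows), which is where knowledge of the business problem comes in.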

The feature engineering process is, of course, informed by data scientists' knowledge of the
business problem, which is why the first step -- understanding the business requirement -- is a
vital data science project best practice to follow. The quality of the data that data scientists feed
into an algorithm ultimately determines the success of the data science project, and quality is
determined not only by the accuracy of the data itself but also by its relevance to meeting the
business requirement.

Data scientists are aware that data scarcity and inaccurate data are the norm whenever they
embark on a project. Even when it comes to data recorded by advanced monitoring tools, it is
a fundamental principle of physics that a measurement is never 100% accurate, and this must
also be taken into consideration. Every model is "wrong" in some way, but models get data
science teams close enough to the answers to business problems so that effective data-driven
decisions can be made.

At some point, data scientists must make the decision that they have enough data to make a
workable model. But data works like a currency -- you get the closest to what you want by
using what you have.

4. Iterate and adapt to change


A characteristic of data-driven projects is that they cannot be built for perpetual use. There
could be a shift in business priorities that will require a data scientist to rebuild a model.

A recent example is the shifting behaviors of organizations and customers in the wake of the
COVID-19 pandemic. Statistical models that addressed certain problems before the crisis
have either been rebuilt or adjusted to address the new reality. As organizations continue to
adapt to the crisis, they need to re-engineer their models. The decision on when to do so
depends on the models' performance, which must be closely monitored.

Monitoring the effectiveness of an algorithm requires setting thresholds for performance,
which is quite simple. Once performance drops below a set threshold -- i.e. the minimum
required to deliver actionable insights -- it is time for a new iteration. From a business
perspective, this is key to delivering monetizable data offerings, as shifting data requirements
necessitate new models. To deliver new models, data scientists must, once again, understand
the new business requirements -- and thus the cycle begins again.
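
The threshold check described above might be sketched as follows. This is a minimal example under stated assumptions: the `needs_new_iteration` helper, the accuracy readings, the 0.80 minimum and the choice to average the last three readings (so that one bad week does not trigger a rebuild) are all hypothetical and not taken from the authors.

```python
def needs_new_iteration(scores, threshold, window=3):
    """Return True when the mean of the last `window` scores is below `threshold`."""
    if len(scores) < window:
        return False  # too few observations to judge
    recent = scores[-window:]
    return sum(recent) / window < threshold


# Hypothetical weekly accuracy readings for a deployed model.
weekly_accuracy = [0.91, 0.90, 0.88, 0.79, 0.76, 0.74]
MIN_ACCURACY = 0.80  # assumed minimum needed to deliver actionable insights

early = needs_new_iteration(weekly_accuracy[:3], MIN_ACCURACY)
latest = needs_new_iteration(weekly_accuracy, MIN_ACCURACY)
print(early, latest)
```

Which metric to track and where to set the minimum depend on the business requirement, so both would normally be agreed with the business team rather than fixed by the data scientist alone.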

About SearchBusinessAnalytics.com

SearchBusinessAnalytics.com is dedicated to serving the information needs of business
intelligence leaders, CIOs, business analysts, architects, data scientists, statisticians, and
data analytics professionals. We're a trusted source for advancing your knowledge with free
resources designed to accelerate business planning, forecasting and visibility, and create a
more data-driven culture throughout your organization.
