Market Guide For DSML Engineering Platforms - Gartner 2022
Published 2 May 2022 - ID G00763493 - 16 min read
Overview
Key Findings
• Creating an efficient and maintainable path from prototype to production for
artificial intelligence (AI) and machine learning (ML) models continues to be a
major challenge, fueled by the increasing complexity of data and
infrastructure.
• Match the skill sets of data science and ML engineering teams to DSML
engineering platform capabilities by assessing the level of collaboration and
abstraction provided across the model development and management life
cycle.
• Select a DSML engineering platform only when you have a pipeline of use
cases and a strategic business need for the delivery of machine
learning products with quantifiable scalability and performance requirements.
Market Definition
This document was revised on 3 May 2022. The document you are viewing is the
corrected version. For more information, see the Corrections page on gartner.com.
Capability: Data access and preparation
DSML Engineering Platform Focus: Data access is provided for streaming data and
unstructured data, typically achieved through prebuilt connectors provided with
the platform. Data-centric AI is supported through data labeling and synthetic
data generation. Metadata generated throughout the development life cycle is
stored and can be accessed programmatically. Feature stores are also provided.
Capability: Data exploration and visualization
DSML Engineering Platform Focus: Support for notebooks is the de facto way for
exploring and visualizing data. There are also platform-specific functions for a
variety of exploratory statistics and geolocation and graph analytics.
Integrations are provided for visualizations in external analytics platforms.
Market Direction
The AI and data science platform market is due to grow to over $10 billion by 2025 at
a 21.6% compound annual growth rate. [1] This growth in the market mirrors the
investments made by organizations in data science and ML initiatives, which are
largely turning from strategy to execution. The DSML engineering market is
representative of this shift in dynamics between business need and technical
implementation.
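The growth figure above can be read through the standard compound annual growth rate relation, end = start * (1 + rate) ** years. The sketch below is only illustrative: the source gives the 2025 value (over $10 billion) and the 21.6% rate but not the base year or base value, so the four-year window and the function names are assumptions made for the example, not figures from the forecast.

```python
def implied_start_value(end_value: float, rate: float, years: int) -> float:
    """Back out the starting value implied by an end value and a CAGR."""
    return end_value / (1.0 + rate) ** years


def project_value(start_value: float, rate: float, years: int) -> float:
    """Compound a starting value forward at a constant annual rate."""
    return start_value * (1.0 + rate) ** years


if __name__ == "__main__":
    END_VALUE_BN = 10.0   # "over $10 billion by 2025" (from the text)
    RATE = 0.216          # 21.6% CAGR (from the text)
    YEARS = 4             # ASSUMPTION: forecast window length is not stated in the text

    start = implied_start_value(END_VALUE_BN, RATE, YEARS)
    print(f"Implied starting market size: ${start:.2f} billion")
    print(f"Projected back to end year:   ${project_value(start, RATE, YEARS):.2f} billion")
```

Changing YEARS shows how sensitive the implied base value is to the length of the forecast window, which is why the base year matters when comparing growth forecasts.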
Buyers of these platforms have typically had success in building and deploying
DSML solutions in pockets within their enterprise and are now looking to formalize
DSML development practices, platforms and architectures to provide sustainable
growth in the use of DSML enterprisewide. DSML engineering platforms will continue
to focus on enterprisewide deployments, which are managed by centralized teams,
often within IT, but also give visibility to lines of business (LOBs) for decision making.
The capabilities that will drive the development of these platforms and have the most
impact for these users are:
• Data access across hybrid and multicloud data sources with provisioning for
on-demand scalable compute for data engineering and model training.
Market Analysis
The DSML engineering platform market is still an emerging and immature market but
has many established vendors that have been adding functionality to their DSML
platforms to ease the frustration organizations face when deploying and running
models in production. These issues include:
• Building a data pipeline that supports the usage of the model in the correct
context (low latency or high throughput, for example)
Not all barriers are technical in nature, and nontechnical barriers are often resolved
by improving processes and collaboration (see 4 Machine Learning Best Practices to
Achieve Project Success). This demand and gap in the market have also allowed smaller
vendors to create offerings, either as all-purpose platforms or targeted at certain
tasks in the ML model life cycle, referred to as MLOps. Gartner’s social media
analysis of MLOps and related terms shows that the topic of MLOps platforms had
the biggest share of voice in 2021, along with MLOps capabilities such as continuous
monitoring, model governance and continuous delivery, as shown in Figure 2.
Figure 2: Share of Voice for MLOps in Social Media 2021
The emergence of MLOps has fragmented the DSML market into four broad
categories:
DSML Engineering — These platforms are focused on serving the needs of expert
data scientists and delivery teams responsible for building and maintaining ML and
AI solutions. They provide an end-to-end platform with developer tools for managing
code, data, experiments, models, model outputs and associated pipelines, often
integrating with DevOps tools and open-source frameworks. They also provide and
manage their own compute servers, while remaining able to connect to external
compute resources.
Specialists — A number of tools and platforms within the MLOps category focus on a
subset of capabilities. This can include explainability, security, deployment,
monitoring and governance. More information on these capabilities and platforms
can be found in Market Guide for AI Trust, Risk and Security Management.
Notebooks are a key tool in a data scientist’s toolbox for data exploration,
experimentation, collaboration and sharing, and they remain front and center in
DSML engineering platforms. Recent notebook innovations include real-time
collaboration, autopackaging and deployment, and auditability. Future innovations
will continue to focus on bringing notebook-based experiments into live production
settings.
[Figure: Business Understanding, Data, ML Models, Model Outputs]
Further details on these providers and others can be found in the Gartner
research Tool: Vendor Identification for Data Science and Machine Learning
Platforms (the Market Guide is limited to a maximum of 40 vendors [see Note 1
and Note 2]).
Representative Vendors
The vendors listed in this Market Guide do not constitute an exhaustive list. This section
is intended to provide more understanding of the market and its offerings.
Market Introduction
Table 3: Representative Vendors in DSML Engineering Platform Market
Vendor                   Product(s)
Dataiku                  Dataiku
DataVision               BeeYard
Deepnote                 Deepnote
Exponential AI           Enso
MathWorks                MATLAB
Palantir Technologies    Foundry
Red Hat                  Red Hat OpenShift, Red Hat OpenShift Data Science
Valohai                  Valohai
Market Recommendations
Data and analytics leaders must capitalize on trends and configure their strategy for
DSML engineering platforms by:
• Assessing the current state of model development practices across data, data
science, machine learning engineering and operations. Assess the DSML
engineering platforms against current process limitations, and consider
specialists for acute needs such as explainability, testing and monitoring.
Evidence
Approved Methodology: Gartner conducts social listening analysis leveraging third-
party data tools to complement or supplement the other fact bases presented in this
document. Due to its qualitative and organic nature, the results should not be used
separately from the rest of this research. No conclusions should be drawn from this
data alone. Social media data in reference is from 1 January 2019 through 31
December 2021 in all geographies (except China) and recognized languages.
The SMA Team: Mani Ratnam and Talmeez Fahim from the Social Media Analytics
Team contributed to this research.
[1] Forecast Analysis: Artificial Intelligence Software, Worldwide