
Source: http://www.datamation.com/big-data/big-data-technologies.html

Big Data Technologies


By Cynthia Harvey
June 13, 2017

Most experts expect spending on big data technologies to continue at a breakneck pace through the rest of the decade. According to IDC's
Worldwide Semiannual Big Data and Analytics Spending Guide, enterprises will likely spend $150.8 billion on big data and business analytics in
2017, 12.4 percent more than they spent in 2016. And the firm forecasts a compound annual growth rate (CAGR) of 11.9 percent for the market
through 2020, when revenues will top $210 billion.

Surveys of IT leaders and executives also lend credence to the idea that enterprises are spending substantial sums on big data technology. The NewVantage Partners Big Data Executive Survey 2017 found that 95 percent of Fortune 1000 executives said their firms had invested in big data technology over the past five years. In some cases, those investments were large, with 37.2 percent of respondents saying their companies had spent more than $100 million on big data projects, and 6.5 percent saying they had invested more than $1 billion.

And the IDG Enterprise 2016 Data & Analytics Research found that this spending is likely to continue. Among those surveyed, 89 percent
expected that within the next 12 to 18 months their companies would purchase new solutions designed to help them derive business value from
their big data.

So what big data technologies are these companies buying?

15 Big Data Technologies to Watch


The list of technology vendors offering big data solutions is seemingly infinite. Many of the big data solutions that are particularly popular right
now fit into one of the following 15 categories:

1. The Hadoop Ecosystem

While Apache Hadoop may not be as dominant as it once was, it's nearly impossible to talk about big data without mentioning this open source
framework for distributed processing of large data sets. Last year, Forrester predicted, "100% of all large enterprises will adopt it (Hadoop and
related technologies such as Spark) for big data analytics within the next two years."

Over the years, Hadoop has grown to encompass an entire ecosystem of related software, and many commercial big data solutions are based on
Hadoop. In fact, Zion Market Research forecasts that the market for Hadoop-based products and services will continue to grow at a 50 percent
CAGR through 2022, when it will be worth $87.14 billion, up from $7.69 billion in 2016.

Key Hadoop vendors include Cloudera, Hortonworks and MapR, and the leading public clouds all offer services that support the technology.

Hadoop is a high-profile big data technology, though arguably it has lost some of its original dominance in the data analytics world.
2. Spark

Apache Spark is part of the Hadoop ecosystem, but its use has become so widespread that it deserves a category of its own. It is an engine for
processing big data within Hadoop, and it's up to one hundred times faster than the standard Hadoop engine, MapReduce.

In the AtScale 2016 Big Data Maturity Survey, 25 percent of respondents said that they had already deployed Spark in production, and 33
percent more had Spark projects in development. Clearly, interest in the technology is sizable and growing, and many vendors with Hadoop
offerings also offer Spark-based products.
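The map/shuffle/reduce model that both MapReduce and Spark build on can be illustrated with a toy, single-process word count in plain Python. This is only a sketch of the programming model (the function names and sample documents are invented for illustration); real frameworks distribute each phase across a cluster and run the phases in parallel:

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.lower().split():
            yield (word, 1)

def shuffle_phase(pairs):
    """Shuffle: group all values by key, as the framework would do across nodes."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["big data needs big tools", "spark processes big data"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
print(counts["big"])  # 3
```

Spark's speed advantage comes largely from keeping intermediate results like the shuffle output in memory, where classic MapReduce writes them to disk between phases.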

3. R

R, another open source project, is a programming language and software environment designed for working with statistics. The darling of data
scientists, it is managed by the R Foundation and available under the GPL 2 license. Many popular integrated development environments (IDEs),
including Eclipse and Visual Studio, support the language.

Several organizations that rank the popularity of various programming languages say that R has become one of the most popular languages in
the world. For example, the IEEE says that R is the fifth most popular programming language, and both Tiobe and RedMonk rank it 14th. This is
significant because the programming languages near the top of these charts are usually general-purpose languages that can be used for many
different kinds of work. For a language that is used almost exclusively for big data projects to be so near the top demonstrates the significance
of big data and the importance of this language in its field.

4. Data Lakes

To make it easier to access their vast stores of data, many enterprises are setting up data lakes. These are huge data repositories that collect
data from many different sources and store it in its natural state. This is different from a data warehouse, which also collects data from disparate
sources, but processes it and structures it for storage. In this case, the lake and warehouse metaphors are fairly accurate. If data is like water, a
data lake is natural and unfiltered like a body of water, while a data warehouse is more like a collection of water bottles stored on shelves.

Data lakes are particularly attractive when enterprises want to store data but aren't yet sure how they might use it. A lot of Internet of Things
(IoT) data might fit into that category, and the IoT trend is playing into the growth of data lakes.

MarketsandMarkets predicts that data lake revenue will grow from $2.53 billion in 2016 to $8.81 billion by 2021.

5. NoSQL Databases

Traditional relational database management systems (RDBMSes) store information in structured, defined columns and rows. Developers and
database administrators query, manipulate and manage the data in those RDBMSes using a special language known as SQL.

NoSQL databases specialize in storing unstructured data and providing fast performance, although they don't provide the same level of
consistency as RDBMSes. Popular NoSQL databases include MongoDB, Redis, Cassandra, Couchbase and many others; even the leading RDBMS
vendors like Oracle and IBM now also offer NoSQL databases.

NoSQL databases have become increasingly popular as the big data trend has grown. According to Allied Market Research the NoSQL market
could be worth $4.2 billion by 2020. However, the market for RDBMSes is still much, much larger than the market for NoSQL.

MongoDB is one of several well-known NoSQL databases.
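The schema-less, document-oriented style that stores like MongoDB popularized can be sketched in a few lines of plain Python. This toy class (its name and API are invented for illustration, not any real driver's interface) shows the key contrast with an RDBMS row: each document can carry different fields, and no schema is declared up front:

```python
class TinyDocumentStore:
    """A toy in-memory document store: records are schema-less dicts,
    so each document can carry different fields (unlike an RDBMS row)."""
    def __init__(self):
        self._docs = []

    def insert(self, doc):
        self._docs.append(dict(doc))

    def find(self, **criteria):
        """Return every document whose fields match all the criteria."""
        return [d for d in self._docs
                if all(d.get(k) == v for k, v in criteria.items())]

store = TinyDocumentStore()
store.insert({"name": "sensor-1", "type": "thermometer", "temp_c": 21.5})
store.insert({"name": "web-log", "type": "clickstream", "urls": ["/home"]})
print(len(store.find(type="thermometer")))  # 1
```

What this sketch omits, of course, is everything that makes real NoSQL systems hard: replication, sharding and the consistency trade-offs mentioned above.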

6. Predictive Analytics

Predictive analytics is a subset of big data analytics that attempts to forecast future events or behavior based on historical data. It draws on
data mining, modeling and machine learning techniques to predict what will happen next. It is often used for fraud detection, credit scoring,
marketing, finance and business analysis purposes.

In recent years, advances in artificial intelligence have enabled vast improvements in the capabilities of predictive analytics solutions. As a
result, enterprises have begun to invest more in big data solutions with predictive capabilities. Many vendors, including Microsoft, IBM, SAP,
SAS, Statistica, RapidMiner, KNIME and others, offer predictive analytics solutions. Zion Market Research says the predictive analytics market
generated $3.49 billion in revenue in 2016, a number that could reach $10.95 billion by 2022.
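In its simplest form, "forecasting from historical data" is just fitting a model to past observations and extrapolating. A minimal sketch, using ordinary least squares in plain Python (the sales figures are invented for illustration; real predictive analytics products use far richer models):

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x on historical (x, y) pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b

# Hypothetical monthly sales for months 1-5; predict month 6.
months = [1, 2, 3, 4, 5]
sales = [100, 110, 120, 130, 140]
a, b = fit_line(months, sales)
forecast = a + b * 6
print(forecast)  # 150.0
```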

7. In-Memory Databases

In any computer system, memory (RAM) is orders of magnitude faster than long-term storage. If a big data analytics
solution can process data that is stored in memory, rather than data stored on a hard drive, it can perform dramatically faster. And that's exactly
what in-memory database technology does.
Many of the leading enterprise software vendors, including SAP, Oracle, Microsoft and IBM, now offer in-memory database technology. In
addition, several smaller companies like Teradata, Tableau, VoltDB and DataStax offer in-memory database solutions. Research from
MarketsandMarkets estimates that total sales of in-memory technology were $2.72 billion in 2016 and may grow to $6.58 billion by 2021.
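The core idea is easy to demonstrate with Python's built-in sqlite3 module, which can host an entire relational database in RAM via the special `":memory:"` connection string (the table and data here are invented for illustration; enterprise in-memory engines add persistence, replication and much larger scale on top of this basic idea):

```python
import sqlite3

# ":memory:" keeps the whole database in RAM instead of on disk,
# which is the core idea behind in-memory database engines.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user TEXT, amount REAL)")
conn.executemany("INSERT INTO events VALUES (?, ?)",
                 [("alice", 10.0), ("bob", 5.0), ("alice", 7.5)])
total = conn.execute(
    "SELECT SUM(amount) FROM events WHERE user = ?", ("alice",)
).fetchone()[0]
print(total)  # 17.5
conn.close()
```

The trade-off, naturally, is that a purely in-memory database vanishes when the process exits, which is why commercial products pair RAM-resident working sets with durable logging.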

8. Big Data Security Solutions

Because big data repositories present an attractive target to hackers and advanced persistent threats, big data security is a large and growing
concern for enterprises. In the AtScale survey, security was the second fastest-growing area of concern related to big data.

According to the IDG report, the most popular types of big data security solutions include identity and access controls (used by 59 percent of
respondents), data encryption (52 percent) and data segregation (42 percent). Dozens of vendors offer big data security solutions, and Apache
Ranger, an open source project from the Hadoop ecosystem, is also attracting growing attention.

9. Big Data Governance Solutions

Closely related to the idea of security is the concept of governance. Data governance is a broad topic that encompasses all the processes related
to the availability, usability and integrity of data. It provides the basis for making sure that the data used for big data analytics is accurate and
appropriate, as well as providing an audit trail so that business analysts or executives can see where data originated.

In the NewVantage Partners survey, 91.8 percent of the Fortune 1000 executives surveyed said that governance was either critically important
(52.5 percent) or important (39.3 percent) to their big data initiatives. Vendors offering big data governance tools include Collibra, IBM, SAS,
Informatica, Adaptive and SAP.

10. Self-Service Capabilities

With data scientists and other big data experts in short supply (and commanding large salaries), many organizations are looking for big data analytics tools that allow business users to serve their own needs. In fact, a report from Research and Markets estimates that the self-service business intelligence market generated $3.61 billion in revenue in 2016 and could grow to $7.31 billion by 2021. And Gartner has noted,
"The modern BI and analytics platform emerged in the last few years to meet new organizational requirements for accessibility, agility and
deeper analytical insight, shifting the market from IT-led, system-of-record reporting to business-led, agile analytics including self-service."

Hoping to take advantage of this trend, multiple business intelligence and big data analytics vendors, such as Tableau, Microsoft, IBM, SAP,
Splunk, Syncsort, SAS, TIBCO, Oracle and others have added self-service capabilities to their solutions. Time will tell whether any or all of the
products turn out to be truly usable by non-experts and whether they will provide the business value organizations are hoping to achieve with
their big data initiatives.

11. Artificial Intelligence

While the concept of artificial intelligence (AI) has been around nearly as long as there have been computers, the technology has only become
truly usable within the past couple of years. In many ways, the big data trend has driven advances in AI, particularly in two subsets of the
discipline: machine learning and deep learning.

The standard definition of machine learning is that it is technology that gives "computers the ability to learn without being explicitly
programmed." In big data analytics, machine learning technology allows systems to look at historical data, recognize patterns, build models and
predict future outcomes. It is also closely associated with predictive analytics.
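The "look at historical data, recognize patterns, predict future outcomes" loop can be sketched with one of the simplest learning algorithms there is, a nearest-neighbor classifier in plain Python (the customer data and labels are invented for illustration; production machine learning uses far more sophisticated models and feature engineering):

```python
def predict(history, point):
    """1-nearest-neighbor: label a new point with the label of the most
    similar historical example (by squared Euclidean distance)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    _, label = min(history, key=lambda example: dist(example[0], point))
    return label

# Hypothetical historical (features, outcome) pairs:
# features = (site visits, purchases) for past customers.
history = [((1, 0), "churned"), ((2, 1), "churned"),
           ((9, 5), "retained"), ((8, 4), "retained")]
print(predict(history, (7, 3)))  # retained
```

Even this toy model was never "explicitly programmed" with a rule like "active customers stay"; the rule emerges from the historical examples, which is the essence of the definition above.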

Deep learning is a type of machine learning technology that relies on artificial neural networks and uses multiple layers of algorithms to analyze
data. As a field, it holds a lot of promise for allowing analytics tools to recognize the content in images and videos and then process it
accordingly.

Experts say this area of big data tools seems poised for a dramatic takeoff. IDC has predicted, "By 2018, 75 percent of enterprise and ISV
development will include cognitive/AI or machine learning functionality in at least one application, including all business analytics tools."

Leading AI vendors with tools related to big data include Google, IBM, Microsoft and Amazon Web Services, and dozens of small startups are
developing AI technology (and getting acquired by the larger technology vendors).

12. Streaming analytics

As organizations have become more familiar with the capabilities of big data analytics solutions, they have begun demanding faster and faster
access to insights. For these enterprises, streaming analytics, with its ability to analyze data as it is being created, is something of a holy grail. They are looking for solutions that can accept input from multiple disparate sources, process it and return insights immediately, or as close to that as possible. This is particularly desirable for new IoT deployments, which are helping to drive the interest in streaming big data analytics.
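A defining constraint of streaming analytics is that results must be produced incrementally, without ever holding the full stream in memory. A minimal sketch in plain Python, using a fixed-size window over a stream of readings (the readings are invented for illustration; real engines add windowing by time, out-of-order handling and distributed operation):

```python
from collections import deque

def moving_average(stream, window=3):
    """Emit the average of the last `window` readings as each new one
    arrives, without ever storing the full stream."""
    buf = deque(maxlen=window)  # old readings fall out automatically
    for reading in stream:
        buf.append(reading)
        yield sum(buf) / len(buf)

readings = [10, 12, 14, 20, 30]
averages = list(moving_average(readings))
print(averages)
```

Because `moving_average` is a generator over a bounded buffer, it can consume an unbounded source, a socket or a message queue, say, just as easily as the short list used here.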

Several vendors offer products that promise streaming analytics capabilities. They include IBM, Software AG, SAP, TIBCO, Oracle, DataTorrent,
SQLstream, Cisco, Informatica and others. MarketsandMarkets believes the streaming analytics solutions brought in $3.08 billion in revenue in
2016, which could increase to $13.70 billion by 2021.

13. Edge Computing

In addition to spurring interest in streaming analytics, the IoT trend is also generating interest in edge computing. In some ways, edge
computing is the opposite of cloud computing. Instead of transmitting data to a centralized server for analysis, edge computing systems analyze
data very close to where it was created — at the edge of the network.

The advantage of an edge computing system is that it reduces the amount of information that must be transmitted over the network, thus
reducing network traffic and related costs. It also decreases demands on data centers or cloud computing facilities, freeing up capacity for other
workloads and eliminating a potential single point of failure.
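The traffic-reduction argument can be made concrete with a small sketch: an edge node that condenses each batch of raw sensor readings into a single summary record before anything crosses the network (the function name, batch size and readings are invented for illustration):

```python
def summarize_at_edge(raw_readings, batch=60):
    """Instead of forwarding every raw reading, an edge node sends one
    summary record per batch, cutting transmitted volume by ~batch times."""
    summaries = []
    for i in range(0, len(raw_readings), batch):
        chunk = raw_readings[i:i + batch]
        summaries.append({
            "count": len(chunk),
            "min": min(chunk),
            "max": max(chunk),
            "mean": sum(chunk) / len(chunk),
        })
    return summaries

readings = [20.0 + (i % 5) * 0.1 for i in range(120)]  # 120 raw samples
out = summarize_at_edge(readings)
print(len(out))  # 2 summary records instead of 120 raw ones
```

The trade-off is that anything not captured in the summary is lost, so edge designs must decide up front which statistics (or anomalous raw readings) are worth forwarding.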

While the market for edge computing, and more specifically for edge computing analytics, is still developing, some analysts and venture
capitalists have begun calling the technology the "next big thing."

14. Blockchain

Also a favorite with forward-looking analysts and venture capitalists, blockchain is the distributed database technology that underlies the Bitcoin
digital currency. The unique feature of a blockchain database is that once data has been written, it cannot be deleted or changed after the fact.
In addition, it is highly secure, which makes it an excellent choice for big data applications in sensitive industries like banking, insurance, health
care, retail and others.
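The append-only property comes from each block embedding a cryptographic hash of its predecessor, so changing any historical record breaks every hash after it. A minimal sketch of that chaining in plain Python (a hypothetical single-node toy; real blockchains add distributed consensus, signatures and much more):

```python
import hashlib
import json

def add_block(chain, data):
    """Append a block whose hash covers its data AND the previous block's
    hash, so altering an earlier block invalidates everything after it."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = json.dumps({"data": data, "prev": prev_hash}, sort_keys=True)
    chain.append({"data": data, "prev": prev_hash,
                  "hash": hashlib.sha256(body.encode()).hexdigest()})

def is_valid(chain):
    """Recompute every hash and check each block points at its predecessor."""
    prev_hash = "0" * 64
    for block in chain:
        body = json.dumps({"data": block["data"], "prev": block["prev"]},
                          sort_keys=True)
        if block["prev"] != prev_hash:
            return False
        if hashlib.sha256(body.encode()).hexdigest() != block["hash"]:
            return False
        prev_hash = block["hash"]
    return True

chain = []
add_block(chain, "payment: A -> B, 10")
add_block(chain, "payment: B -> C, 4")
print(is_valid(chain))   # True
chain[0]["data"] = "payment: A -> B, 1000"  # tamper with history
print(is_valid(chain))   # False
```

This is why blockchains appeal to audit-heavy industries: tampering is not prevented so much as made immediately detectable.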

Blockchain technology is still in its infancy and use cases are still developing. However, several vendors, including IBM, AWS, Microsoft and
multiple startups, have rolled out experimental or introductory solutions built on blockchain technology.

15. Prescriptive Analytics

Many analysts divide big data analytics tools into four big categories. The first, descriptive analytics, simply tells what happened. The next type,
diagnostic analytics, goes a step further and provides a reason for why events occurred. The third type, predictive analytics, discussed in depth
above, attempts to determine what will happen next. This is as sophisticated as most analytics tools currently on the market can get.

However, there is a fourth type of analytics that is even more sophisticated, although very few products with these capabilities are available at
this time. Prescriptive analytics offers advice to companies about what they should do in order to make a desired result happen. For example,
while predictive analytics might give a company a warning that the market for a particular product line is about to decrease, prescriptive
analytics will analyze various courses of action in response to those market changes and forecast the most likely results.
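The "analyze various courses of action and forecast the most likely results" step can be sketched as a simple expected-value calculation: score each candidate action against probability-weighted scenarios and prescribe the best one. All figures here are invented for illustration; real prescriptive tools layer optimization and simulation over far richer forecasts:

```python
def recommend(actions, scenarios):
    """Score each candidate action by its probability-weighted payoff
    across forecast scenarios, and prescribe the highest-scoring one."""
    def expected_payoff(action):
        return sum(prob * payoffs[action] for prob, payoffs in scenarios)
    return max(actions, key=expected_payoff)

# Hypothetical responses to a predicted decline in product demand.
actions = ["cut_price", "hold_price", "discontinue"]
scenarios = [  # (probability of scenario, payoff per action in $k)
    (0.6, {"cut_price": 80, "hold_price": 40, "discontinue": 20}),
    (0.4, {"cut_price": 30, "hold_price": 70, "discontinue": 20}),
]
print(recommend(actions, scenarios))  # cut_price
```

The distinction from predictive analytics is visible in the shape of the output: not a forecast number, but a recommended action.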

Currently, very few enterprises have invested in prescriptive analytics, but many analysts believe this will be the next big area of investment
after organizations begin experiencing the benefits of predictive analytics.

The market for big data technologies is diverse and constantly changing. But perhaps one day soon predictive and prescriptive analytics tools will
offer advice about what is coming next for big data — and what enterprises should do about it.
