Guide To Big Data For Finance
Big Data for Finance
[Figure: The proverbial 3 V's of big data (volume, velocity and variety) and what constitutes each classification for the finance sector. Source: Exist.com, 2013]
www.inside-bigdata.com | 508-259-8570 | Kevin@insideBigData.com
There are many quality software tools allowing banking institutions to reap the benefits of big data. For example, the Kitenga Analytics Suite from Dell is an industry-leading big data search and analytics platform designed to integrate information of all types into easily deployed visualizations. With this content mining and analytics solution, banks are able to transform complex and time-consuming manipulation of web-scale data resources into a fast and intuitive process. Banks can harvest sentiments from Twitter feeds, blogs, news reports, CRM systems, and other sources, and combine them with demographic and regional data to better understand market traction and opportunities.

Kitenga can be characterized by the following high-level feature set:

• Analytics – Sophisticated natural language processing (NLP), machine learning, predictive modeling, sentiment analysis, social network analysis, and visualization.
• Search – Interactive and intuitive. The search interface allows business analysts to explore and exploit all data resources.
• Visualization – Interactive web-based authoring empowers business users to perform analysis, visualize results and make decisions.

Kitenga also helps to reduce the pain of solving big data problems by providing out-of-the-box capabilities for search and analysis that work over big data assets. These capabilities enable data analyst personnel to:

1. Author big data solutions by dragging and dropping analysis components that operate over Hadoop for solving everyday problems.
2. Execute analytical pipelines that operate over Hadoop to scale up by scaling out.
3. Monitor the jobs as they run.

Kitenga is Hadoop-enabled for big data scalability and allows for integration of disparate data sources and cost-efficient storage of growing data volumes. Kitenga can directly analyze Hadoop results using information visualization tools that bind directly to Hadoop Distributed File System (HDFS) files, as well as index the created data and metadata into a searchable form with embedded visualization capabilities. Kitenga acts as a wrapper of libraries on top of Hadoop, with a drag-and-drop interface to query, search, and analyze data directly against HDFS, without the need to create and script MapReduce jobs.

Kitenga enables financial institutions to integrate structured transaction data with a variety of unstructured private and public information sources to quickly discern patterns associated with fraud and immediately identify new transactions that meet the model and require additional review before being approved.

In the banking industry, it's easier to provide customers with a more personalized experience when you have the latest data. But since so much of that data lives in so many places throughout a business, it's hard to make the best decisions quickly. With the Dell Boomi master data management (MDM) solution, you can take the complexity out of integration. The software solution offers on-premise and cloud integration without appliances, software or coding for easier transfers between CRM and back-office applications.
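The fraud-review step described above, in which new transactions that match a pattern are held for additional review before approval, could be sketched roughly as follows. This is a minimal illustration in plain Python, not Kitenga's interface: the `Transaction` fields, the `amount_factor` threshold, and the `watchlist` (standing in for names mined from unstructured sources) are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Transaction:
    tx_id: str
    customer_id: str
    amount: float
    country: str
    merchant: str


def build_profiles(history):
    """Summarise each customer's past behaviour from structured transaction data."""
    by_customer = {}
    for tx in history:
        by_customer.setdefault(tx.customer_id, []).append(tx)
    return {
        cid: {
            "avg_amount": mean(t.amount for t in txs),
            "countries": {t.country for t in txs},
        }
        for cid, txs in by_customer.items()
    }


def needs_review(tx, profiles, watchlist, amount_factor=5.0):
    """Return the reasons (possibly none) why a new transaction should be held."""
    reasons = []
    profile = profiles.get(tx.customer_id)
    if profile is None:
        reasons.append("no history for customer")
    else:
        if tx.amount > amount_factor * profile["avg_amount"]:
            reasons.append("amount far above customer average")
        if tx.country not in profile["countries"]:
            reasons.append("unusual country")
    if tx.merchant.lower() in watchlist:
        reasons.append("merchant on watchlist")
    return reasons


# Made-up history and watchlist, for illustration only.
history = [
    Transaction("h1", "c1", 40.0, "NL", "Grocer"),
    Transaction("h2", "c1", 60.0, "NL", "Bookshop"),
]
watchlist = {"shadypay"}  # hypothetical names taken from news or blog text
profiles = build_profiles(history)

incoming = [
    Transaction("n1", "c1", 55.0, "NL", "Grocer"),
    Transaction("n2", "c1", 900.0, "BR", "ShadyPay"),
]
flagged = {tx.tx_id: needs_review(tx, profiles, watchlist) for tx in incoming}
held_for_review = [tx_id for tx_id, reasons in flagged.items() if reasons]
```

Returning the list of reasons rather than a bare yes/no keeps the decision explainable to the analyst who performs the additional review.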
Credit Scoring

Historically, loan and credit scoring methodology employed by credit bureaus and used by banks and other financial institutions has been based on a five-component composite score including (i) past loan and credit applications, (ii) on-time payments, (iii) types of loan and credit used, (iv) length of loan and credit history and (v) credit capacity used. Until the big data revolution, this approach saw little innovation, making scoring a commodity.

Today, new technology platforms have opened the doors for change in credit scoring, and big data scoring services are beginning to be available. Loan and credit decisions are determined in seconds using automated processes based on machine learning algorithms. The breadth of data that can be used for credit scoring has expanded considerably. For each scoring decision, big data applications collect data from a broad range of external data sources, including social networks, e-commerce data, economic databases, micro geographical statistics and other sources. In some cases, big data scoring technology can use upward of 10,000 data points in real time to assess a customer's creditworthiness.

As an extension to traditional scoring services, new technology companies using big data scoring are providing scoring-as-a-service options for online loan and credit decisions. This type of solution is provided to banks, debt collectors, e-commerce sites, leasing and other financial companies. These systems can integrate into the customer's existing systems and/or website.

For a valuable use case example of how big data has transformed the credit scoring arena, see the sidebar "Credit Scoring at Novum Bank." To assess credit applications, Novum Bank in the Netherlands recently started using Dell STATISTICA, an analytical software solution. In the interview, Chief Credit Risk Officer Joop Bruinzeel talks about micro-credit, the importance of credit scoring and the use of analytical software.

CASE STUDY
Credit Scoring at Novum Bank
by Marcel Wiedenbrugge

Imagine you are active in the provisioning of (micro) credit and a customer wants to borrow temporarily a few hundred euros from you. How do you determine whether it makes financial sense to do business with this customer? For Joop Bruinzeel, Chief Credit Risk Officer (CCRO) at Novum Bank, this question is just another day's issue. As a provider of micro-credit, Novum Bank daily provides relatively small amounts (from €100 to €600) to customers where traditional banks have no interest due to a high risk profile. Properly set up and tuned credit risk management is essential.

For assessing credit applications, Novum Bank recently started using STATISTICA, the analytical software solution from StatSoft (now a part of Dell). In this interview I speak with Joop Bruinzeel about micro-credit, the importance of credit scoring and the use of analytical software.

Why did you choose STATISTICA?
Joop: "Before I started working for Novum Bank, I immersed myself in modelling. I got back in touch with another company that was specializing in short-term loans and had (successfully) made use of STATISTICA. As the shareholders of Novum Bank required mature risk management, STATISTICA was perceived as a logical choice to go with."

What do you like about STATISTICA?
Joop: "The beauty of STATISTICA is that you can build decisioning models which you can test on older portfolios (also called backlog or backtesting). The workbench also offers the possibility to add your own insights to the models. That allows us to refine the models, so that we can achieve better results."

Read the entire case study: http://www.statsoft.com/Company/Press-Room/STATISTICA-News/EntryId/579/Credit-Scoring-at-Novum-Bank-Data-Mining-defines-success-in-high-risk-lending
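To make the idea of testing a decisioning model on an older portfolio more concrete, here is a small sketch that combines the five traditional score components from this section with a simple backtest. It is not how STATISTICA or any credit bureau computes scores: the weights, the 0-to-1 scaling of inputs, the cutoffs, and the sample portfolio are all invented for illustration.

```python
# Hypothetical weights for the five traditional components. Each input is
# assumed to be pre-scaled to the range 0.0 (worst) to 1.0 (best).
WEIGHTS = {
    "past_applications": 0.10,
    "on_time_payments": 0.35,
    "credit_types": 0.10,
    "history_length": 0.15,
    "capacity_used": 0.30,
}


def composite_score(applicant):
    """Weighted sum of the five components, scaled to 0-100."""
    return 100.0 * sum(WEIGHTS[name] * applicant[name] for name in WEIGHTS)


def backtest(portfolio, cutoff):
    """Replay an approval cutoff over loans whose outcome is already known."""
    approved = [loan for loan in portfolio if composite_score(loan) >= cutoff]
    defaults = sum(1 for loan in approved if loan["defaulted"])
    return {
        "approved": len(approved),
        "approval_rate": len(approved) / len(portfolio),
        "default_rate": defaults / len(approved) if approved else 0.0,
    }


# Tiny made-up historical portfolio, for illustration only.
old_portfolio = [
    {"past_applications": 0.9, "on_time_payments": 1.0, "credit_types": 0.8,
     "history_length": 0.7, "capacity_used": 0.9, "defaulted": False},
    {"past_applications": 0.5, "on_time_payments": 0.6, "credit_types": 0.5,
     "history_length": 0.4, "capacity_used": 0.5, "defaulted": False},
    {"past_applications": 0.2, "on_time_payments": 0.3, "credit_types": 0.4,
     "history_length": 0.2, "capacity_used": 0.1, "defaulted": True},
    {"past_applications": 0.6, "on_time_payments": 0.7, "credit_types": 0.6,
     "history_length": 0.5, "capacity_used": 0.4, "defaulted": True},
]

# Compare how different cutoffs would have performed on the old portfolio.
for cutoff in (40, 60, 80):
    print(cutoff, backtest(old_portfolio, cutoff))
```

Comparing approval rate against default rate across cutoffs is the kind of refinement loop the interview alludes to: a stricter cutoff approves fewer loans but, on this toy data, avoids the defaults.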
Intel has played an important role in the Hadoop industry, culminating with its alignment with Cloudera in 2014
data to use. Some, like the construction industry, are constrained by the amount of data they can capture and even more hamstrung by their ability to get value from it. Others, like the finance industry, both generate a lot of data and can put it to use. In the graphic below from the U.S. Bureau of Economic Analysis, it is clear that the finance industry ranks highest in terms of its ability to use and obtain value from big data.

[Figure: Bubble chart of industry sectors. Horizontal axis: big data value potential, lower to higher (reflects value of data and/or competitive advantage achieved). Vertical axis: big data ease of capture, lower to higher (reflects ability to own or access data and analytics). Bubble size = GDP. Legend: competitive intensity to adopt big data. Sectors shown: utilities; natural resources; information; health care providers; finance and insurance; manufacturing; transportation and warehousing; professional services; real estate and rental; management of companies; construction; administration, support, and waste management; accommodation and food; government; retail trade; educational services; arts and entertainment. Caption: Finance industry is highest ranking in terms of its ability to use and obtain value from big data. Source: US Bureau of Economic Analysis; McKinsey Global Institute analysis]

Determining the appropriate level of engagement for a big data deployment project is an important consideration toward ensuring the success of the project. For example, department-level big data projects generally are more successful than large-scale initiatives, which routinely fail. An incremental approach is better.

Here is a short list of guidelines toward the adoption of big data for finance:

• Develop methods and services for the valuation of data, and extend their role in compliance and internal control to the ethical and effective stewardship of data assets.
• Unite disparate data from a variety of systems designed to meet the diversity across regions regarding language, regulations, currency, time zone, etc.
• Use big data to offer more specialized decision-making support, often in real time, and decide when data can most usefully be shared with internal and external stakeholders and monetized as new products or services.
• To offset flat or declining revenue streams, financial services firms need to develop new big data centric products while also targeting existing products to new audiences.
• Use big data and its associated tools not only to identify risks in real time and improve forensic accounting abilities but also to evaluate the risks and rewards of long-term investment in new products and new markets.
• Gaining agility starts with an assessment of existing processes and systems. Financial industry firms must identify what existing practices will not support the progress they need. To get significant advantage in today's competitive landscape, they should pursue technologies that support new, innovative practices.
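As one small illustration of the guideline above about uniting data from regional systems, the sketch below converts records that differ in currency and time zone into a single comparable form. The field names, the sample records, and the exchange rates in `RATES_TO_USD` are hypothetical; a real system would load dated rates from a market data source and would also have to address language and regulatory differences, which this sketch ignores.

```python
from datetime import datetime, timezone
from decimal import Decimal

# Hypothetical exchange rates to a common reporting currency (USD).
RATES_TO_USD = {
    "USD": Decimal("1.00"),
    "EUR": Decimal("1.10"),
    "JPY": Decimal("0.01"),
}


def normalize(record):
    """Convert one regional record to UTC time and a USD amount."""
    # ISO 8601 timestamps with an explicit offset carry their own time zone.
    local_time = datetime.fromisoformat(record["timestamp"])
    if local_time.tzinfo is None:
        raise ValueError("timestamp must include a UTC offset")
    amount = Decimal(record["amount"])
    rate = RATES_TO_USD[record["currency"]]
    return {
        "source": record["source"],
        "time_utc": local_time.astimezone(timezone.utc),
        "amount_usd": (amount * rate).quantize(Decimal("0.01")),
    }


# Made-up records from three regional systems, for illustration only.
regional_records = [
    {"source": "new_york", "timestamp": "2014-03-03T04:00:00-05:00",
     "amount": "99.99", "currency": "USD"},
    {"source": "amsterdam", "timestamp": "2014-03-03T09:30:00+01:00",
     "amount": "250.00", "currency": "EUR"},
    {"source": "tokyo", "timestamp": "2014-03-03T17:45:00+09:00",
     "amount": "12000", "currency": "JPY"},
]

# Once normalized, records from different regions can be ordered and compared.
unified = sorted((normalize(r) for r in regional_records),
                 key=lambda r: r["time_utc"])
```

`Decimal` is used instead of floating point so that currency amounts round predictably to two places.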
Security Considerations
An important application area where big data is taking a firm foothold in many financial industry firms is information security. In conjunction with the traditional 3 Vs of big data, financial industry firms must consider a fourth V: vulnerability. To manage big data effectively, you must keep it secure and compliant with regulatory requirements at all times. Protecting a vast and growing volume of critical information, and being able to search and analyze it to detect potential threats, is more essential than ever. As the software platforms (e.g. Hadoop) supporting this quantity of data move to mainstream use, managing their security and availability becomes a big data challenge in and of itself, requiring continuous diagnostics and monitoring.

Banking and financial institutions need to secure the storage, transit and use of corporate and personal data across business applications, including online banking and electronic communications of sensitive information and documents. The typical IT environment consists of a mix of new and legacy systems and applications across highly distributed networks of branch offices, call centers and web portals. Many of the traditional point security solutions that are deployed add complexity and management costs, and leave gaps between systems and applications that are highly vulnerable to attack. The increasingly global nature of the financial services industry makes it necessary to comprehensively address international data security and privacy regulations.

Financial institutions are top targets of cybercrime. While all types of businesses are vulnerable to attacks by criminals, it's the security breaches at financial firms that elicit the most media attention, public scrutiny and legislator consternation. When threats occur, it's more than financial loss at stake. Customers question their trust in their bank's ability to provide security and protect their privacy. Hard-earned customer loyalty diminishes. How can you detect fraud and stop attackers before they threaten your financial institution and its customers? Big data technologies can help by enabling financial firms to not only capture in near real time every event that occurs across the entire organization but also provide context to understand these events so information can be shared to better issue alerts of potential and actual threats.

Many finance industry firms are using big data to detect and/or prevent fraud. Big data supports what's known as continuous or behavioral authentication, a process that can help prevent fraud. Further, detecting security breaches using huge volumes of security data along with unstructured social media data, combined with new big data tools such as Hadoop, enables financial industry firms to be more proactive about security. Big data can enhance data security for the finance industry through:

• Understanding activity patterns among customers and the broader industry.
• Sharing of data – critical especially about emerging attack vectors and threats.
• Increasing reliance on data to predict attacks, based on trends that are targeting the industry.

One particularly good solution for finance industry data security requirements is Dell's SharePlex Connector for Hadoop. Proactive security requires data analytics for a business intelligence advantage and essential decision-making insight. The Hadoop framework gives you that, but integrating data can be time-consuming, providing only snapshots that quickly become out of date. SharePlex Connector for Hadoop loads and continuously replicates changes from an Oracle database to a Hadoop cluster: in near real time to Hive and HDFS, and in real time to HBase. This gives you all the benefits of maintaining a real-time or near real-time copy of source tables, so your organization can efficiently and cost-effectively perform big data analytics in support of enterprise security.