08 Big Data Introduction

Big Data refers to large and complex datasets characterized by volume, velocity, and variety, which traditional tools struggle to manage. It includes structured, semi-structured, and unstructured data, with technologies like Hadoop and Spark facilitating analysis. Big Data is crucial for evidence-based decision-making, customer insights, and operational efficiency, despite challenges such as data quality and security.

Uploaded by ilalarukh21

Introduction to Big Data

• Big Data refers to extremely large and complex datasets beyond the capacity of traditional data tools
• Characterized by the 3 Vs:
  • Volume: Massive amounts of data (terabytes to zettabytes)
  • Velocity: High-speed data generation and processing
  • Variety: Structured, semi-structured, and unstructured data
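
To give the "terabytes to zettabytes" range a sense of scale, here is a minimal sketch that converts between decimal (SI) storage units. The `UNITS` table and the `convert` helper are illustrative names, not part of the slides.

```python
# Decimal (SI) storage units, expressed in bytes.
UNITS = {
    "KB": 10**3,
    "MB": 10**6,
    "GB": 10**9,
    "TB": 10**12,
    "PB": 10**15,
    "EB": 10**18,
    "ZB": 10**21,
}


def convert(amount: float, src: str, dst: str) -> float:
    """Convert a storage amount between two units listed in UNITS."""
    return amount * UNITS[src] / UNITS[dst]


# One zettabyte expressed in terabytes.
tb_per_zb = convert(1, "ZB", "TB")
print(f"1 ZB = {tb_per_zb:,.0f} TB")  # 1 ZB = 1,000,000,000 TB
```

In other words, the upper end of the Volume range is a billion times the lower end.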

Introduction to Big Data

• Sources of Big Data include:
  • Social media, e-commerce, mobile apps, IoT, and business systems
• Traditional tools such as Excel or a single-server SQL database struggle with big data
• Big Data technologies include Hadoop, Spark, and NoSQL databases
• Big Data Analytics refers to advanced techniques for analyzing large data sets to uncover patterns and trends

Types of data w.r.t. structure

• Big Data comes in many forms. Categorizing data helps determine how to store, process, and analyze it.
• Three major types:
  • Structured Data
  • Semi-Structured Data
  • Unstructured Data

What is Structured Data?

Structured data refers to data that is organized in a predefined format such as rows and columns. It follows a consistent schema, making it easy to enter, store, query, and analyze.

• Data organized into tables with rows and columns
• Follows a predefined schema
• Easy to store, search, and analyze
• Examples: SQL databases, Excel spreadsheets, transaction records

Characteristics of Structured Data

• Highly organized
• Stored in relational databases
• Easily queried using SQL
• Used in business operations and reporting
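
As a rough illustration of "predefined schema" and "easily queried using SQL", the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `transactions` table, its columns, and the sample rows are made up for this example.

```python
import sqlite3

# In-memory database; the table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    " id INTEGER PRIMARY KEY,"
    " customer TEXT NOT NULL,"
    " amount REAL NOT NULL)"
)
rows = [(1, "alice", 120.0), (2, "bob", 35.5), (3, "alice", 80.0)]
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", rows)

# Every row shares the same columns, so one SQL query can aggregate them all.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM transactions"
    " GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)  # [('alice', 200.0), ('bob', 35.5)]
conn.close()
```

The schema is declared before any data is inserted, which is what makes the later query simple.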

What is Semi-Structured Data?

Semi-structured data doesn't follow a strict tabular format but still contains tags or markers to separate elements and enforce hierarchies of records and fields.

• Does not conform to a strict table format
• Contains tags, markers, or metadata to separate elements
• More flexible than structured data
• Examples: XML, JSON, emails, documents in NoSQL databases
• Email is semi-structured because it has structured parts (subject, sender, receiver) and an unstructured part (the body)
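
The email example can be sketched with Python's standard `email` package. The addresses and message text below are invented for illustration.

```python
from email import message_from_string

# A made-up message: header lines, a blank line, then free text.
raw = (
    "From: sender@example.com\n"
    "To: receiver@example.com\n"
    "Subject: Quarterly report\n"
    "\n"
    "Hi, the numbers look good this quarter. Details to follow.\n"
)
msg = message_from_string(raw)

# Structured part: named header fields that can be read like columns.
headers = {name: msg[name] for name in ("From", "To", "Subject")}
# Unstructured part: free text with no fixed fields.
body = msg.get_payload()

print(headers)
print(body)
```

The headers can be filtered or sorted like table columns, while the body would need text analysis to extract anything further.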

Characteristics of Semi-Structured Data

• Flexible schema (structure can change)
• Typically not stored in traditional relational tables
• Partially organized
• Easier to analyze than unstructured data
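
A small sketch of what "flexible schema" can look like in practice, using JSON and Python's standard `json` module. The two records and their field names are hypothetical.

```python
import json

# Two records in the same collection with different fields (made-up data).
raw = """
[
  {"id": 1, "name": "Alice", "email": "alice@example.com"},
  {"id": 2, "name": "Bob", "phones": ["555-0100", "555-0101"],
   "address": {"city": "Springfield"}}
]
"""
records = json.loads(raw)

# Collect every key that appears, since no fixed schema is declared up front.
all_keys = sorted({key for record in records for key in record})
print(all_keys)  # ['address', 'email', 'id', 'name', 'phones']

# Missing fields have to be handled record by record.
cities = [r.get("address", {}).get("city") for r in records]
print(cities)  # [None, 'Springfield']
```

The keys act as the "tags or markers": they label each value, but nothing forces every record to carry the same set.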

What is Unstructured Data?

• No predefined structure or model
• Most difficult to store and analyze
• Requires advanced tools for processing
• Examples: Videos, images, social media posts, raw sensor data, PDFs

Characteristics of Unstructured Data

• No predefined schema or organization
• Cannot be easily stored in traditional databases
• Requires big data tools (Hadoop, Spark, AI/ML) for analysis
• Commonly estimated to make up 80–90% of all data today

Importance of Big Data in MIS

1. Enables organizations to make evidence-based decisions
2. Helps identify patterns and trends that are not visible in small datasets
3. Supports customer behavior analysis for personalized marketing
4. Enhances risk management and fraud detection
5. Improves operational efficiency through process optimization
6. Facilitates real-time decision-making and performance tracking
7. Drives innovation by revealing unmet customer needs
8. Supports predictive and prescriptive analytics for strategy formulation

Big Data Technologies

• Hadoop: Open-source framework for distributed storage and processing
• Spark: Fast in-memory data processing engine
• NoSQL Databases: Handle semi-structured and unstructured data efficiently (e.g., MongoDB, Cassandra)
• Data Lakes: Store raw, unprocessed data at scale
• ETL Tools: Extract, transform, and load data for analysis
• Cloud Platforms: AWS, Google Cloud, and Azure support big data processing and storage
• Data Warehouses: Organize structured data for efficient querying
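
To hint at how distributed processing frameworks split work, the sketch below imitates the map, shuffle, and reduce steps of the MapReduce programming model associated with Hadoop. It runs in a single Python process and does not use Hadoop's or Spark's APIs; the function names and sample documents are illustrative only.

```python
from collections import defaultdict


def map_phase(document: str):
    """Emit a (word, 1) pair for each word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group the emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


documents = ["big data big insights", "data drives decisions"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = reduce_phase(shuffle(pairs))
print(counts["big"], counts["data"])  # 2 2
```

In a real cluster, the map step would run on many machines in parallel, each over its own slice of the data, and the grouped keys would be distributed across reducers.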

Applications of Big Data in Business

• Customer Analytics: Understand preferences, predict churn, personalize experiences
• Supply Chain Optimization: Monitor and forecast logistics and inventory
• Financial Analysis: Detect fraud, manage risk, improve investment decisions
• Marketing Campaigns: Target audiences more accurately using data insights
• Healthcare Analytics: Enhance patient care and predict disease outbreaks
• HR Analytics: Measure employee performance, reduce turnover, recruit effectively
• Retail Intelligence: Manage pricing, product placement, and demand forecasting

Challenges in Big Data Management

1. Data Quality: Inaccurate or inconsistent data affects reliability
2. Data Security: Risk of breaches, leaks, and unauthorized access
3. Integration Issues: Difficulty in consolidating data from multiple sources
4. Data Storage: High cost and complexity of managing large volumes
5. Skilled Workforce: Need for data scientists and big data engineers
6. Regulatory Compliance: GDPR and other laws demand responsible data handling
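
The data quality challenge can be made concrete with a minimal sketch that counts three simple problems in a batch of records. The records, the field names, and the `quality_report` helper are hypothetical, and real pipelines would apply many more checks.

```python
# Made-up records containing one missing value, one repeated id,
# and one negative amount.
records = [
    {"id": 1, "email": "a@example.com", "amount": 120.0},
    {"id": 2, "email": None, "amount": 35.5},
    {"id": 2, "email": "b@example.com", "amount": 35.5},
    {"id": 3, "email": "c@example.com", "amount": -10.0},
]


def quality_report(rows):
    """Count rows with missing values, repeated ids, and negative amounts."""
    seen_ids = set()
    report = {"missing": 0, "duplicate_id": 0, "negative_amount": 0}
    for row in rows:
        if any(value is None for value in row.values()):
            report["missing"] += 1
        if row["id"] in seen_ids:
            report["duplicate_id"] += 1
        seen_ids.add(row["id"])
        if row["amount"] is not None and row["amount"] < 0:
            report["negative_amount"] += 1
    return report


report = quality_report(records)
print(report)  # {'missing': 1, 'duplicate_id': 1, 'negative_amount': 1}
```

Checks like these are usually run before analysis, since results built on inaccurate or inconsistent records inherit the same problems.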

Case Study: Netflix and Big Data

• Netflix uses Big Data to:
  • Recommend personalized content to users
  • Decide which original shows to produce
  • Analyze viewer behavior for engagement strategies
  • Optimize streaming quality across devices and regions
  • Reduce churn by predicting subscriber dissatisfaction
• Leverages predictive analytics and machine learning algorithms

Summary of Big Data and Analytics

• Big Data transforms how businesses make decisions and operate
• Unlocks valuable insights from complex and high-volume data
• Supports personalized customer experiences and operational efficiency
• Despite challenges, organizations that harness Big Data gain a competitive edge
• Integration with MIS supports a data-driven culture across departments
