Data Analytics

IITP

Uploaded by

challengesphere

LECTURE-4-6

Structured and Unstructured Data

Structured data and unstructured data are two different types of data that are sourced,
collected, and scaled in different ways. Structured data is highly organized and easy for
machine learning algorithms to process. It is typically quantitative and is managed using
Structured Query Language (SQL). Examples of structured data include dates, names,
addresses, and credit card numbers. Because its format is consistent, structured data is easy
for both machine learning algorithms and business users to work with, and more tools support
it than support unstructured data. However, structured data is less flexible in how it can be
used and stored. It is generally kept in storage systems with rigid schemas, so a change in
data requirements means updating all of the structured data to fit the new schema.
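
As a minimal sketch of what a rigid schema means in practice, the following uses Python's built-in sqlite3 module with an in-memory database. The table name `customers`, its columns, and the sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database; the table, columns, and rows are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "  id INTEGER PRIMARY KEY,"
    "  name TEXT NOT NULL,"
    "  signup_date TEXT NOT NULL"
    ")"
)
conn.executemany(
    "INSERT INTO customers (name, signup_date) VALUES (?, ?)",
    [("Asha", "2024-01-15"), ("Ravi", "2024-03-02")],
)

# Every row follows the same schema, so a SQL query can filter it directly.
rows = conn.execute(
    "SELECT name FROM customers WHERE signup_date >= ? ORDER BY name",
    ("2024-02-01",),
).fetchall()

# A new data requirement (here, a city) means changing the schema itself.
conn.execute("ALTER TABLE customers ADD COLUMN city TEXT")
columns = [row[1] for row in conn.execute("PRAGMA table_info(customers)")]
```

After the `ALTER TABLE` statement, every existing row carries the new `city` column (empty until filled in), which is one way to picture why schema changes touch all of the stored data.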
Unstructured Data

Unstructured data, on the other hand, has no predefined format or organization. It is more
difficult to collect, process, and analyze than structured data. Examples of unstructured
data include text documents, images, email messages, word-processing documents, and PDF
files. It can be collected from sources such as open-ended survey results, social media
comments, and email messages.
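
One reason unstructured data is harder to analyze is that any fields of interest must first be pulled out of free text. The sketch below, using Python's standard `re` module, shows the idea on a made-up survey comment. The comment text and both patterns are assumptions for illustration; real text usually needs far more robust handling.

```python
import re

# Hypothetical free-text survey comment: no fields, no schema.
comment = "Ordered on 2024-05-03, arrived late. Contact me at priya@example.com please."

# Deliberately simple patterns, for illustration only.
date_pattern = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
email_pattern = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

# Turn the free text into a small structured record.
record = {
    "dates": date_pattern.findall(comment),
    "emails": email_pattern.findall(comment),
}
```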
Collection of Data Through IoT

Data collection using the Internet of Things (IoT) involves the use of interconnected
devices, sensors, and systems to collect data from the physical world and transmit it over
the internet or other networks for storage, analysis, and decision-making. Here's how data
collection works in IoT:
Sensors

Sensor Deployment: IoT applications typically start with the deployment of various sensors
and devices in the physical environment. These sensors can include temperature sensors,
humidity sensors, motion detectors, GPS modules, cameras, and many others, depending on the
specific use case.
Data Collection Using Sensors

Data Acquisition: Sensors continuously monitor and collect data from the environment they
are placed in. This data can be in the form of temperature readings, motion events, location
coordinates, video feeds, or any other measurable parameter. The data is often in analog
form and may need to be converted into digital format using analog-to-digital converters.
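
The analog-to-digital step can be sketched as simple quantization. The model below is idealized and is not taken from any particular converter: the 3.3 V reference, the 10-bit resolution, and the function names `adc_convert` and `code_to_voltage` are all assumptions for illustration.

```python
def adc_convert(voltage: float, v_ref: float = 3.3, bits: int = 10) -> int:
    """Map an analog voltage in [0, v_ref] to an integer code in [0, 2**bits - 1]."""
    max_code = (1 << bits) - 1
    # Inputs outside the measurable range are clamped to its ends.
    clamped = min(max(voltage, 0.0), v_ref)
    return round(clamped / v_ref * max_code)


def code_to_voltage(code: int, v_ref: float = 3.3, bits: int = 10) -> float:
    """Recover the approximate voltage that a digital code represents."""
    return code / ((1 << bits) - 1) * v_ref


code = adc_convert(1.0)            # digital code for a 1.0 V input
approx = code_to_voltage(code)     # close to 1.0 V, within one quantization step
```

The difference between `approx` and the original 1.0 V is the quantization error; more bits give smaller steps and a smaller error.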
Data Processing at the Source

Data Processing at the Edge: In some IoT systems, data processing may occur at the edge,
i.e., on the devices themselves. This can involve basic data filtering, aggregation, or the
application of simple algorithms to reduce the volume of data that needs to be transmitted.
Edge processing helps in conserving bandwidth and reducing latency.
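
A minimal sketch of edge aggregation, assuming the device buffers readings and sends one summary per fixed-size window instead of every raw value. The window size, the sample readings, and the function name `aggregate_windows` are hypothetical.

```python
from statistics import mean


def aggregate_windows(readings, window_size):
    """Collapse consecutive readings into one min/max/mean summary per window."""
    summaries = []
    for start in range(0, len(readings), window_size):
        window = readings[start:start + window_size]
        summaries.append(
            {"min": min(window), "max": max(window), "mean": mean(window)}
        )
    return summaries


# Six raw temperature readings become two summaries to transmit.
raw = [21.0, 21.5, 22.0, 22.5, 30.0, 30.5]
summaries = aggregate_windows(raw, window_size=3)
```

Here six values are reduced to two messages, which is the bandwidth saving the slide refers to; the trade-off is that detail inside each window is lost.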
Data Transmission

Data Transmission: Processed or raw sensor data is transmitted to a central location or the
cloud using communication technologies such as Wi-Fi, Bluetooth, or cellular networks
(3G/4G/5G), depending on the IoT application and network availability.
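
Whatever network carries it, a reading has to be packaged into bytes before it is sent. The sketch below shows one common choice, JSON encoded as UTF-8, using only the standard library. The field names (`device_id`, `ts`, `value`) and the sample values are assumptions, and no actual network call is made.

```python
import json


def encode_reading(device_id, timestamp, value):
    """Serialize one sensor reading to compact JSON bytes ready for sending."""
    payload = {"device_id": device_id, "ts": timestamp, "value": value}
    return json.dumps(payload, separators=(",", ":")).encode("utf-8")


def decode_reading(message):
    """Reverse of encode_reading, as the receiving side would do."""
    return json.loads(message.decode("utf-8"))


message = encode_reading("sensor-01", 1700000000, 22.5)
received = decode_reading(message)
```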
Data Storage

Cloud Storage and Processing: Once the data reaches the cloud, it is typically stored in
databases or data lakes. Cloud-based servers and services then process, analyze, and store
the data. This can involve real-time data streaming, batch processing, and data warehousing
technologies.
Data Analysis

Data Analysis: IoT data can be analyzed for various purposes, such as monitoring, predictive
maintenance, anomaly detection, and more. Advanced analytics techniques, including machine
learning and artificial intelligence, may be applied to derive valuable insights and patterns
from the data.
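
As one simple illustration of anomaly detection (not a method prescribed by these notes), the sketch below flags readings that lie far from the mean in units of standard deviation, often called a z-score test. The threshold of 2.0 and the sample temperatures are assumptions.

```python
from statistics import mean, stdev


def find_anomalies(values, threshold=2.0):
    """Return indices of values more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical: nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical temperature log with one obvious spike at index 5.
temps = [21.0, 21.2, 20.9, 21.1, 21.0, 35.0, 21.1, 20.8]
anomalies = find_anomalies(temps)
```

With very small samples a single outlier also inflates the standard deviation, so the threshold has to be chosen with the sample size in mind.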
Data Visualization

Visualization and Reporting: IoT platforms often provide dashboards and visualization tools
to present the data in a user-friendly format. Users can monitor the data, set up alerts for
specific events, and generate reports for decision-making.
Smart Equipment

Feedback and Control: IoT systems can also be designed to provide real-time feedback and
control. For example, based on sensor data, an IoT system might adjust the temperature in a
smart thermostat, control the lighting in a smart home, or optimize the operation of
industrial equipment.
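
The thermostat example can be sketched as a small feedback loop. The version below assumes simple on/off control with a dead band around the setpoint, so the heater does not switch rapidly; the setpoint, band width, and readings are hypothetical.

```python
def thermostat_step(temperature, heater_on, setpoint=21.0, band=0.5):
    """Return the new heater state given the latest temperature reading."""
    if temperature < setpoint - band:
        return True          # too cold: switch the heater on
    if temperature > setpoint + band:
        return False         # too warm: switch the heater off
    return heater_on         # inside the band: keep the current state


readings = [19.0, 20.8, 21.2, 21.8, 21.0, 20.4]
state = False
states = []
for reading in readings:
    state = thermostat_step(reading, state)
    states.append(state)
```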
Security and Privacy

Security and Privacy: Given the sensitive nature of IoT data, security measures like
encryption, access controls, and secure authentication are crucial to protect data integrity
and user privacy.
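
As one narrow illustration of protecting data integrity, the sketch below attaches a keyed hash (HMAC-SHA256) to a message so the receiver can detect tampering. It uses only Python's standard `hmac` and `hashlib` modules. The key and the message are placeholders, and this covers integrity and authenticity only; it does not encrypt the data.

```python
import hashlib
import hmac

# Placeholder key for illustration; a real key must be secret and securely stored.
SECRET_KEY = b"demo-key-do-not-use"


def sign(message: bytes, key: bytes = SECRET_KEY) -> str:
    """Compute an HMAC-SHA256 tag for the message."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(message: bytes, tag: str, key: bytes = SECRET_KEY) -> bool:
    """Check the tag using a constant-time comparison."""
    return hmac.compare_digest(sign(message, key), tag)


reading = b'{"device_id":"sensor-01","value":22.5}'
tag = sign(reading)
tampered = b'{"device_id":"sensor-01","value":99.9}'
```

`verify(reading, tag)` succeeds, while `verify(tampered, tag)` fails because the content no longer matches the tag.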
Applications

Data collection using IoT has a wide range of applications, including smart cities,
industrial automation, agriculture, healthcare, environmental monitoring, and more. It
enables organizations and individuals to make data-driven decisions, automate processes, and
improve efficiency in various domains.
Collection of Data From the Web and Social Networks

Collecting data from the web and social networks involves various techniques and tools, and
it's essential to ensure that data collection is done ethically and within legal boundaries.
Here's an overview of how data can be collected from the web and social networks:
Web Crawlers

Web Crawlers: Web crawlers, also known as spiders or bots, are automated programs that
browse websites and collect information. They follow links on web pages and index content
for search engines. While web crawlers collect publicly available data, some websites have
measures in place to block or limit web crawling.
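
The link-following behavior can be sketched as a breadth-first traversal. To keep the example self-contained and avoid touching real websites, the "site" below is a hypothetical in-memory dictionary mapping each page to the pages it links to; a real crawler would fetch and parse each page instead.

```python
from collections import deque

# Hypothetical site map: page -> pages it links to.
SITE = {
    "/": ["/about", "/blog"],
    "/about": ["/"],
    "/blog": ["/blog/post-1", "/blog/post-2"],
    "/blog/post-1": ["/blog"],
    "/blog/post-2": ["/about"],
}


def crawl(start, get_links, max_pages=100):
    """Visit pages breadth-first, following links and skipping pages already seen."""
    visited = []
    seen = {start}
    queue = deque([start])
    while queue and len(visited) < max_pages:
        page = queue.popleft()
        visited.append(page)
        for link in get_links(page):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


order = crawl("/", lambda page: SITE.get(page, []))
```

The `seen` set is what stops the crawler from looping forever when pages link back to each other, and `max_pages` is a simple guard against crawling without bound.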
HTML (HyperText Markup Language) Parsing

HTML Parsing: Web scraping involves parsing the HTML structure of web pages to extract
specific data elements, such as text, images, and links. Tools like BeautifulSoup (Python)
and Cheerio (JavaScript) are commonly used for this purpose.

Parser: A computer program that breaks down text into recognized strings of characters for
further analysis.
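
A minimal parsing sketch follows. To stay dependency-free it uses `html.parser.HTMLParser` from Python's standard library rather than BeautifulSoup, which would typically be more convenient in practice. The class name `LinkAndTextExtractor` and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class LinkAndTextExtractor(HTMLParser):
    """Collect link targets and visible text while the parser walks the HTML."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        stripped = data.strip()
        if stripped:
            self.text_parts.append(stripped)


page = '<html><body><h1>News</h1><p>Read <a href="/story">the story</a>.</p></body></html>'
extractor = LinkAndTextExtractor()
extractor.feed(page)
extractor.close()
```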
DOM

The Document Object Model (DOM) is an application programming interface (API) for HTML and
XML (Extensible Markup Language) documents. It defines the logical structure of documents
and the way a document is accessed and manipulated.
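
Both halves of that definition, access and manipulation, can be sketched with `xml.dom.minidom`, a small DOM implementation in Python's standard library. The XML document and its element names are made up for illustration.

```python
from xml.dom.minidom import parseString

# Hypothetical XML document parsed into a DOM tree.
doc = parseString(
    "<library><book id='1'><title>Data Analytics</title></book></library>"
)

# Access: find elements by tag name and read their text nodes.
titles = [t.firstChild.data for t in doc.getElementsByTagName("title")]

# Manipulation: build a new element and attach it to the tree.
new_book = doc.createElement("book")
new_book.setAttribute("id", "2")
new_title = doc.createElement("title")
new_title.appendChild(doc.createTextNode("IoT Basics"))
new_book.appendChild(new_title)
doc.documentElement.appendChild(new_book)

book_ids = [b.getAttribute("id") for b in doc.getElementsByTagName("book")]
```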
Data Scraping

Data Scraping: Web scraping techniques can be used to collect public data from social media
profiles and posts. However, scraping social media data may violate platform terms of
service, so it's essential to review and comply with their policies.
Ethical Considerations

When collecting data from the web and social networks, keep the following ethical
considerations in mind:
1. Privacy: Respect user privacy and adhere to privacy laws and regulations. Avoid
collecting personal information without consent.
2. Terms of Service: Review and comply with the terms of service of websites and social
media platforms. Some platforms explicitly prohibit data scraping.
3. Data Usage: Be transparent about how you intend to use the data and ensure it's for
lawful and ethical purposes.
4. Data Quality: Ensure the data you collect is accurate and reliable. Scraper errors or
API limitations can affect data quality.
5. Data Storage: Securely store the data you collect to prevent data breaches or
unauthorized access.
6. Frequency: Be mindful of the frequency of data collection to avoid overloading servers
or violating platform rate limits.
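
The point about frequency can be put into practice with a small limiter that enforces a minimum gap between requests. This is a sketch under stated assumptions: the class name `MinIntervalLimiter` is hypothetical, and the right interval depends on each site's published limits. The clock and sleep functions are injectable so the behavior can be checked without real waiting.

```python
import time


class MinIntervalLimiter:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until at least min_interval has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

A collector would call `limiter.wait()` immediately before each request, for example with `limiter = MinIntervalLimiter(2.0)` to leave at least two seconds between requests.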

Data collection from the web and social networks can provide valuable insights for various
purposes, such as market research, sentiment analysis, trend monitoring, and more. However,
it's essential to follow ethical guidelines and legal requirements to ensure responsible data
collection and usage.
