0% found this document useful (0 votes)
23 views

Bi 4 5

Uploaded by

ikher.shivin
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Experiment - 4

Aim :- Data Visualization from ETL process


Software :- Microsoft Power BI

Theory:
Data visualization within the context of Business Intelligence (BI) is intricately tied to the Extract, Transform,
Load (ETL) process, contributing to the effective communication of insights. Theoretical foundations guide this
process:

1. Data Extraction (Extract): In the initial phase of ETL, data is sourced from diverse systems. Theoretical
principles of data compatibility and integration inform decisions on extracting relevant information from
structured and unstructured sources, ensuring data consistency.

2. Data Transformation (Transform): The transformation phase involves shaping and structuring the data.
Theoretical aspects of data cleaning, normalization, and enrichment are applied, aligning with principles from
relational database theory. Transformation also encompasses the preparation of data for optimal visual
representation.

3. Data Loading (Load): Loaded data is organized within BI systems, often adopting principles from data
modelling theories. A well-structured data model facilitates efficient querying and supports the creation of
meaningful visualizations.

4. Visualization Design: Theoretical concepts from visual perception and cognitive psychology guide the design
of data visualizations. Principles like Edward Tufte's data-ink ratio, color theory, and Gestalt principles are
employed to create visually appealing and informative dashboards and reports.

5. Interactivity and User Experience: Theoretical underpinnings of human-computer interaction influence the
incorporation of interactive elements. Users can explore data dynamically, allowing for a more intuitive and
engaging experience, aligning with usability and user experience design theories.
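
The three ETL phases described above can be sketched in a few lines of Python. This is only an illustration using the standard library, not the Power BI workflow itself: the CSV text, the column names (store, month, sales) and the table name are all invented for the example.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; the column names and values are invented for illustration.
RAW_CSV = """store,month,sales
 North ,Jan,1200.50
South,Jan,
north,Feb,1310.00
South,Feb,990.25
"""


def extract(text):
    """Extract: read rows from a CSV source into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize store names and drop rows with missing sales."""
    cleaned = []
    for row in rows:
        if not row["sales"].strip():
            continue  # data cleaning: skip incomplete records
        cleaned.append(
            (row["store"].strip().title(), row["month"].strip(), float(row["sales"]))
        )
    return cleaned


def load(records):
    """Load: store the cleaned records in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (store TEXT, month TEXT, sales REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.commit()
    return conn


conn = load(transform(extract(RAW_CSV)))

# Aggregate per store: the kind of query a chart would be built on.
totals = conn.execute(
    "SELECT store, SUM(sales) FROM sales GROUP BY store ORDER BY store"
).fetchall()
print(totals)  # [('North', 2510.5), ('South', 990.25)]
```

The cleaning step is what makes the aggregate meaningful: without it, " North " and "north" would be counted as two different stores.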

Procedure:
In Power BI Desktop, open the Retail Analysis Sample PBIX file in report view. In the Power BI
service, open the Retail Analysis Sample PBIX file and select Edit.

Power BI offers various visualization methods, including:
1. Area Chart

a. Add a new page to the report.

b. From the Fields pane, select Sales > Last Year Sales, and This Year Sales > Value.

c. Convert the chart to a basic area chart by selecting the Area chart icon from the Visualizations
pane.

d. Select Time > FiscalMonth to add it to the Axis well.


[Screenshot: area chart "Last Year Sales and This Year Sales by FiscalMonth", shown with the Visualizations and Fields panes; FiscalMonth is on the X-axis, and Last Year Sales and This Year Sales are the plotted values.]

e. To display the chart by month, select the ellipses (top right corner of the visual) and choose Sort
by > FiscalMonth. To change the sort order, select the ellipses again and select either Sort
ascending or Sort descending.
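
Sorting by FiscalMonth matters because month labels are text: ordered alphabetically, or by value, they no longer follow the calendar. The following minimal Python sketch shows the same idea outside Power BI. The sales figures and the month list are invented for illustration.

```python
# Hypothetical monthly totals in $M, keyed by abbreviated month name (values invented).
sales_by_month = {"Apr": 3.1, "Feb": 2.4, "Jan": 2.0, "Mar": 2.9}

# Explicit calendar order, so the result does not depend on locale settings.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
month_position = {name: i for i, name in enumerate(MONTHS)}

alphabetical = sorted(sales_by_month)
chronological = sorted(sales_by_month, key=month_position.get)
descending = sorted(sales_by_month, key=month_position.get, reverse=True)

print(alphabetical)   # ['Apr', 'Feb', 'Jan', 'Mar']
print(chronological)  # ['Jan', 'Feb', 'Mar', 'Apr']
print(descending)     # ['Apr', 'Mar', 'Feb', 'Jan']
```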
2. Doughnut chart
a. On the Visualizations pane, select the icon for doughnut chart to convert your bar chart to
a doughnut chart. If Last Year Sales isn't in the Values section of the Visualizations pane, drag it
there.

b. Select Item > Category to add it to the Legend area of the Visualizations pane.
[Screenshot: doughnut chart of Last Year Sales by Category, shown with the Visualizations and Data panes; Category is in the Legend well and Last Year Sales is in the Values well.]

c. Optionally, adjust the size and color of the chart's text.

3. Cards
a. On the Data pane, expand Store and select the Open Store Count checkbox. By default,
Power BI creates a clustered column chart with the single data value. You can convert the chart
to a card visualization.

[Screenshot: Visualizations and Data panes with the Store table expanded; Open Store Count is selected and appears in the Y-axis well.]

b. In the Visualizations pane, select the Card icon.

4. Combo chart
a. From the Fields pane, select Sales > This Year Sales > Value.
b. Select Sales > Gross Margin This Year and drag it to the Y-axis well.
c. Select Time > FiscalMonth and drag it to the X-axis well.
d. The visualization will be similar to this one.

[Figure: column chart "This Year Sales and Gross Margin This Year by FiscalMonth"; y-axis from $0M to $4M, x-axis from Jan to Aug.]

e. Select the ellipsis again and choose Sort axis > Sort ascending.
f. Convert the column chart to a combo chart. There are two combo charts available: Line and
stacked column and Line and clustered column. With the column chart selected, from the
Visualizations pane select the Line and clustered column chart.
g. In the upper-right corner of the visual, select the More options ellipsis (...) and select Sort axis >
FiscalMonth.
[Screenshot: More options menu listing Export data, Show as a table, Remove, Spotlight, and Sort axis, with the Sort axis submenu showing FiscalMonth, Total Units This Year, Sort descending, and Sort ascending.]
h. From the Fields pane, drag Sales > Last Year Sales to the Line y-axis bucket.
i. The combo chart should look something like this:

[Figure: combo chart showing This Year Sales, Gross Margin This Year, and Last Year Sales.]

Experiment - 5
Aim: Implementation of Classification algorithm.

Software used: Python


Theory:
In Business Intelligence (BI), classification algorithms are employed to analyze and categorize data, contributing to
informed decision-making. The theoretical foundation lies in machine learning, particularly supervised learning,
where algorithms learn patterns from labeled historical data to predict the class or category of new, unseen data.

1. Training Phase: The classification algorithm is initially trained on a labeled dataset, learning relationships between
input features and predefined classes. This phase aligns with statistical and probabilistic theories, aiming to identify
patterns and correlations within the data.

2. Feature Selection and Engineering: Theoretical principles of feature selection and engineering are applied to
optimize the model's ability to discriminate between classes. Relevant features are chosen based on their
significance, aligning with information theory and data dimensionality reduction principles.

3. Model Construction: The algorithm constructs a predictive model, often based on theoretical frameworks such as
decision trees, support vector machines, or neural networks. These models encapsulate the learned relationships and
form the basis for classifying new data.

4. Evaluation and Validation: Theoretical concepts from statistics guide the evaluation and validation of the
classification model. Metrics like precision, recall, and accuracy are employed to assess the model's performance and
generalization to new, unseen data.

5. Deployment in BI: The trained and validated classification model is integrated into BI systems, where it classifies
and categorizes incoming data. This theoretical application of machine learning enhances BI capabilities, providing
insights into customer segmentation, fraud detection, risk assessment, and other classification-based business
scenarios.
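
The training, model construction and evaluation phases listed above can be illustrated with a very small classifier. The sketch below uses a nearest-centroid rule, chosen here only because it fits in a few lines of standard-library Python. It is not one of the models named in the theory, and the customer data, feature names and segment labels are all invented.

```python
import math

# Hypothetical labeled data: (monthly_spend, visits) -> customer segment.
# All values are invented for illustration.
train = [
    ((20.0, 1.0), "low"), ((25.0, 2.0), "low"), ((30.0, 1.0), "low"),
    ((80.0, 6.0), "high"), ((90.0, 7.0), "high"), ((85.0, 8.0), "high"),
]
test = [
    ((22.0, 2.0), "low"), ((88.0, 6.0), "high"),
    ((60.0, 5.0), "high"), ((58.0, 4.0), "low"),
]


def fit_centroids(samples):
    """Training phase: compute the mean feature vector of each class."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc) for label, acc in sums.items()}


def predict(centroids, features):
    """Classify a sample as the class whose centroid is nearest (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


def evaluate(y_true, y_pred, positive):
    """Return accuracy, precision and recall for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


centroids = fit_centroids(train)
y_true = [label for _, label in test]
y_pred = [predict(centroids, features) for features, _ in test]
accuracy, precision, recall = evaluate(y_true, y_pred, positive="high")

print(y_pred)  # ['low', 'high', 'high', 'high']
print(accuracy, precision, recall)
```

On this toy data the last test customer is misclassified as "high", so accuracy is 0.75 and precision for "high" is 2/3, while recall stays at 1.0. This shows why the three metrics are reported separately rather than relying on accuracy alone.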

Procedure:
import pandas as pd
import matplotlib.pyplot as plt

# Define the rainfall data as a Python list.
rainfall = [799, 1174.8, 865.1, 1334.6, 635.4, 918.5, 685.5, 998.6,
            784.2, 985, 882.8, 1071]

# Create a pandas DataFrame with a DatetimeIndex to represent the time series data.
# Start from January 2012 with monthly frequency. "MS" (month start) yields the
# 12 dates 2012-01-01 through 2012-12-01, one per rainfall value. A month-end
# frequency would stop at 2012-11-30 and give only 11 dates for this end date.
start_date = "2012-01-01"
end_date = "2012-12-01"
date_range = pd.date_range(start=start_date, end=end_date, freq="MS")
rainfall_df = pd.DataFrame(rainfall, index=date_range, columns=["Rainfall"])

# Print the time series data.
print(rainfall_df)

# Create a plot of the time series.
plt.figure(figsize=(8, 4))  # Define the plot figure size (optional).
plt.plot(rainfall_df.index, rainfall_df["Rainfall"], marker="o", linestyle="-")
plt.xlabel("Date")
plt.ylabel("Rainfall")
plt.title("Monthly Rainfall Time Series")
plt.grid(True)

# Save the plot to a file.
plt.savefig("rainfall.png")

# Show the plot (optional).
plt.show()

Output:
When we execute the above code, it produces the following result and chart

        Jan     Feb    Mar     Apr    May    Jun    Jul    Aug    Sep
2012  799.0  1174.8  865.1  1334.6  635.4  918.5  685.5  998.6  784.2
        Oct     Nov     Dec
2012  985.0   882.8  1071.0

[Figure: line plot "Monthly Rainfall Time Series"; x-axis Date with ticks from 2012-03 to 2013-01, y-axis Rainfall from about 700 to 1300.]
