Bi 4 5
Theory:
Data visualization within the context of Business Intelligence (BI) is intricately tied to the Extract, Transform,
Load (ETL) process, contributing to the effective communication of insights. Theoretical foundations guide this
process:
1. Data Extraction (Extract): In the initial phase of ETL, data is sourced from diverse systems. Theoretical
principles of data compatibility and integration inform decisions on extracting relevant information from
structured and unstructured sources, ensuring data consistency.
2. Data Transformation (Transform): The transformation phase involves shaping and structuring the data.
Theoretical aspects of data cleaning, normalization, and enrichment are applied, aligning with principles from
relational database theory. Transformation also encompasses the preparation of data for optimal visual
representation.
3. Data Loading (Load): Loaded data is organized within BI systems, often adopting principles from data
modelling theories. A well-structured data model facilitates efficient querying and supports the creation of
meaningful visualizations.
4. Visualization Design: Theoretical concepts from visual perception and cognitive psychology guide the design
of data visualizations. Principles like Edward Tufte's data-ink ratio, color theory, and Gestalt principles are
employed to create visually appealing and informative dashboards and reports.
5. Interactivity and User Experience: Theoretical underpinnings of human-computer interaction influence the
incorporation of interactive elements. Users can explore data dynamically, allowing for a more intuitive and
engaging experience, aligning with usability and user experience design theories.
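The Extract, Transform, Load stages described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the workflow Power BI itself performs; the sample CSV text, the `sales` table, and the function names `extract`, `transform`, and `load` are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw data: inconsistent store names and one missing sales value.
RAW_CSV = """store,month,sales
 Store A ,Jan,1200
store a,Feb,
Store B,Jan,950
Store B,Feb,1010
"""

def extract(text):
    """Extract: read rows from a CSV source into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop rows without sales, normalize names, cast types."""
    cleaned = []
    for row in rows:
        if not row["sales"].strip():
            continue  # data cleaning: skip incomplete records
        store = row["store"].strip().title()  # normalization of store names
        cleaned.append((store, row["month"].strip(), float(row["sales"])))
    return cleaned

def load(rows):
    """Load: store the cleaned rows in an in-memory relational table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (store TEXT, month TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn

conn = load(transform(extract(RAW_CSV)))
# A query of the kind a visualization layer would issue against the model.
totals = conn.execute(
    "SELECT store, SUM(amount) FROM sales GROUP BY store ORDER BY store"
).fetchall()
print(totals)
```

The aggregated totals per store are the kind of result a chart or dashboard tile would then display.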
Procedure:
In Power BI Desktop, open the Retail Analysis Sample PBIX file in report view. In the Power BI
service, open the Retail Analysis Sample PBIX file and select Edit.
There are various methods available in Power BI for visualization; these include:
1. Area Chart
[Figure: area chart with FiscalMonth on the X-axis]
b. Select Item > Category to add it to the Legend area of the Visualizations pane.
[Figure: Visualizations pane (Build visual) with Item > Category in the Legend area]
e. Select the ellipsis again and choose Sort axis > Sort ascending.
f. Convert the column chart to a combo chart. There are two combo charts available: Line and
stacked column, and Line and clustered column. With the column chart selected, from the
Visualizations pane select the Line and clustered column chart.
g. In the upper-right corner of the visual, select the More options ellipsis (...) and select Sort axis >
FiscalMonth.
h. From the Fields pane, drag Sales > Last Year Sales to the Line y-axis bucket.
i. The combo chart should look something like this:
[Figure: resulting combo chart]
Experiment - 5
Aim: Implementation of Classification algorithm.
2. Feature Selection and Engineering: Theoretical principles of feature selection and engineering are applied to
optimize the model's ability to discriminate between classes. Relevant features are chosen based on their
significance, aligning with information theory and data dimensionality reduction principles.
3. Model Construction: The algorithm constructs a predictive model, often based on theoretical frameworks such as
decision trees, support vector machines, or neural networks. These models encapsulate the learned relationships and
form the basis for classifying new data.
4. Evaluation and Validation: Theoretical concepts from statistics guide the evaluation and validation of the
classification model. Metrics like precision, recall, and accuracy are employed to assess the model's performance and
generalization to new, unseen data.
5. Deployment in BI: The trained and validated classification model is integrated into BI systems, where it classifies
and categorizes incoming data. This theoretical application of machine learning enhances BI capabilities, providing
insights into customer segmentation, fraud detection, risk assessment, and other classification-based business
scenarios.
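The evaluation metrics named in step 4 can be computed directly from predicted and actual labels. The sketch below uses plain Python for a binary case; the label lists and the function names `confusion_counts` and `precision_recall_accuracy` are hypothetical and only illustrate the definitions, not any particular model from this experiment.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def precision_recall_accuracy(y_true, y_pred, positive):
    """Return (precision, recall, accuracy); empty denominators yield 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return precision, recall, accuracy

# Hypothetical labels for a fraud-detection style scenario.
y_true = ["fraud", "ok", "fraud", "ok", "ok", "fraud", "ok", "ok"]
y_pred = ["fraud", "ok", "ok", "ok", "fraud", "fraud", "ok", "ok"]

precision, recall, accuracy = precision_recall_accuracy(y_true, y_pred, "fraud")
print(precision, recall, accuracy)
```

Precision answers "of the records flagged as fraud, how many really were", while recall answers "of the real fraud cases, how many were flagged"; accuracy alone can be misleading when one class is rare.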
Procedure:
import pandas as pd
import matplotlib.pyplot as plt
# Define the rainfall data as a Python list.
rainfall = [799, 1174.8, 865.1, 1334.6, 635.4, 918.5, 685.5, 998.6,
            784.2, 985, 882.8, 1071]
end_date = "2012-12-01"
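The remaining steps of the listing did not survive, so the following is only a sketch of one way the rainfall list and end date could be combined into a time series. It assumes one reading per month with the last reading falling on the end date; the names `index` and `series` are hypothetical.

```python
import pandas as pd

rainfall = [799, 1174.8, 865.1, 1334.6, 635.4, 918.5,
            685.5, 998.6, 784.2, 985, 882.8, 1071]
end_date = "2012-12-01"

# Assumption: monthly readings ending at end_date ("MS" = month start).
index = pd.date_range(end=end_date, periods=len(rainfall), freq="MS")
series = pd.Series(rainfall, index=index, name="rainfall")

# Simple summaries that could feed a report or dashboard.
print(series.index[0].date(), "to", series.index[-1].date())
print("wettest month:", series.idxmax().strftime("%Y-%m"))
print("mean rainfall:", series.mean())
```

With matplotlib installed, calling `series.plot()` followed by `plt.show()` would draw the series as a line chart.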