Dvm Full Notes
Data Visualization is the graphical representation of information and data. By using visual
elements like charts, graphs, and maps, data visualization tools provide an accessible way to
see and understand trends, outliers, and patterns in data.
Key Features:
1. Clarity: Simplifies complex datasets into visual formats for easy interpretation.
Examples:
1. Improved Communication: Complex datasets become stories that are easier to share
and comprehend.
2. Pattern Recognition: Visual tools reveal trends, outliers, and correlations not obvious in
raw data.
4. Engagement: Visuals capture and retain attention better than textual or numeric data.
Real-World Relevance:
Visual Perception
Visual perception refers to how humans interpret and understand visual inputs. It plays a
critical role in designing effective data visualizations.
Key Principles:
1. Pre-attentive Processing:
o Certain visual properties (e.g., color, size, position) are noticed almost instantly.
2. Gestalt Principles:
o Continuity: Ensure visual flow aligns with natural reading patterns (e.g., left to
right).
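The pre-attentive effect of color and size can be tried out with a small base R sketch. The points below are synthetic, and the choice of which point to emphasize (target) is arbitrary; both are assumptions made only for illustration.

```r
# Synthetic points generated only for illustration
set.seed(1)
n <- 50
pts <- data.frame(x = runif(n), y = runif(n))

# Arbitrary index of the single point to emphasize
target <- 10

# Every point is grey and small except the target, which is red and larger
point_col <- rep("grey70", n)
point_col[target] <- "red"
point_size <- rep(1, n)
point_size[target] <- 2

plot(pts$x, pts$y, col = point_col, cex = point_size, pch = 19,
     main = "One highlighted point", xlab = "x", ylab = "y")
```

Because only one point differs in color and size, it tends to be spotted without scanning the others.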
Grammar of Graphics
The Grammar of Graphics is a framework for creating systematic and flexible visualizations. It
underlies many modern tools like ggplot2 in R.
Key Components:
2. Aesthetic Mappings:
3. Geometries:
4. Scales:
o Adjusting data to fit visual scales.
5. Facets:
6. Annotations:
Example Workflow:
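One possible workflow, sketched in ggplot2 (assumed to be installed). The built-in mtcars dataset and the chosen columns, axis limits, and labels are illustrative assumptions, not part of these notes.

```r
library(ggplot2)  # assumes the ggplot2 package is installed

# Data: mtcars ships with R and is used here only for illustration
p <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +  # aesthetic mappings
  geom_point() +                                  # geometry: one point per car
  scale_x_continuous(limits = c(1, 6)) +          # scale: fix the x-axis range
  facet_wrap(~ gear) +                            # facets: one panel per gear count
  labs(title = "Fuel economy vs. weight",         # annotations: title and labels
       x = "Weight (1000 lbs)",
       y = "Miles per gallon",
       color = "Cylinders")

print(p)
```

Each line adds one layer of the grammar, so a component can be changed or removed without rewriting the rest of the plot.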
Message to Charts
The "Message to Charts" principle emphasizes aligning the visualization with the intended
message. Each chart should tell a specific story based on the data.
4. Simplify:
Example:
• Chart: A line graph comparing Q1 and Q2 sales figures with an annotation marking the
growth.
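The chart described above could be sketched in base R as follows. The sales figures are hypothetical values made up for illustration; only the idea of annotating the growth comes from the example.

```r
# Hypothetical quarterly sales figures (made up for illustration only)
sales <- data.frame(
  quarter = c("Q1", "Q2"),
  revenue = c(100, 120)
)

# Percentage growth from Q1 to Q2, used as the annotation text
growth_pct <- (sales$revenue[2] - sales$revenue[1]) / sales$revenue[1] * 100
label <- sprintf("+%.0f%% growth", growth_pct)

# Line graph with the message stated in the title and marked on the chart
plot(seq_along(sales$revenue), sales$revenue, type = "b", xaxt = "n",
     xlim = c(0.5, 2.5), ylim = c(0, max(sales$revenue) * 1.2),
     main = "Sales grew from Q1 to Q2", xlab = "Quarter", ylab = "Sales")
axis(1, at = seq_along(sales$quarter), labels = sales$quarter)
text(2, sales$revenue[2], labels = label, pos = 3)
```

Putting the message in both the title and the annotation means the reader does not have to work out the growth from the axis values.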
UNIT – 2
Installing Power BI
Power BI is a powerful business analytics tool developed by Microsoft for creating
interactive reports and dashboards.
Steps to Install:
1. Download:
o Visit the official Microsoft Power BI website.
o Choose Power BI Desktop, which is a free download (Power BI Pro is a paid
license for sharing and collaboration, not a separate installer).
2. Install:
o Run the downloaded setup file and follow on-screen instructions.
o Ensure all dependencies (e.g., the .NET Framework) are up to date.
3. Sign In:
o Open Power BI Desktop and sign in with your Microsoft account.
System Requirements:
• Operating System: Windows 10 or later.
• Memory: Minimum 4 GB RAM (8 GB recommended).
• Disk Space: At least 2 GB available.
Installing Tableau
Tableau is a widely used data visualization tool that enables users to create interactive
and shareable dashboards.
Steps to Install Tableau:
1. Download:
o Visit the Tableau official website.
o Download Tableau Desktop (14-day free trial available).
2. Install:
o Run the installer and follow the prompts.
o Accept the license agreement and complete the setup.
3. Activate:
o Sign in with your Tableau account or enter the activation key provided.
System Requirements:
• Operating System: Windows 10 or macOS Mojave (or later).
• RAM: Minimum 4 GB (8 GB recommended).
• Disk Space: At least 1.5 GB free space.
Descriptive Statistics in R
Descriptive statistics summarize and describe the features of a dataset.
Common Descriptive Functions in R:
1. Basic Statistics:
o Mean: mean(data)
o Median: median(data)
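o Mode: Custom function needed. R has no built-in function for the statistical
mode; the built-in mode() returns an object's storage mode instead.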
2. Dispersion:
o Range: range(data) returns the minimum and maximum; diff(range(data)) gives
the difference between them.
o Variance: var(data)
o Standard Deviation: sd(data)
3. Summarize Data:
o summary(data) provides an overview, including minimum, maximum,
mean, median, and quartiles.
4. Frequency Tables:
o table(data) creates frequency counts.
Example Code:
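# Sample Data
values <- c(10, 20, 30, 40, 50)
# Descriptive Statistics
mean_value <- mean(values)        # 30
median_value <- median(values)    # 30
variance_value <- var(values)     # 250 (sample variance, n - 1 denominator)
sd_value <- sd(values)            # square root of the sample variance
# Display the results
print(c(mean = mean_value, median = median_value, variance = variance_value, sd = sd_value))
summary(values)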
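Since the statistical mode needs a custom function, one way to write it is sketched below on top of table(). The name stat_mode is a hypothetical helper, not a base R function, and returning all tied values is a design choice made here.

```r
# Hypothetical helper (not part of base R): returns the most frequent value(s)
stat_mode <- function(x) {
  x <- x[!is.na(x)]                    # ignore missing values
  counts <- table(x)                   # frequency count per distinct value
  modes <- names(counts)[counts == max(counts)]
  # table() stores the values as character names; convert back for numeric input
  if (is.numeric(x)) as.numeric(modes) else modes
}

values <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4)
table(values)                          # frequency table of the sample
stat_mode(values)                      # most frequent value
stat_mode(c("a", "b", "b", "a"))       # ties: every tied value is returned
```

Returning every tied value avoids silently picking one mode when the data are bimodal.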
Scatter Plots in R
Scatter plots visualize relationships between two continuous variables.
Creating a Scatter Plot:
Base R:
# Sample Data
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 6, 8, 10)
# Scatter Plot
plot(x, y, main = "Scatter Plot", xlab = "X-axis", ylab = "Y-axis", col = "blue")
ggplot2 Package:
library(ggplot2)
# Sample Data
data <- data.frame(x = c(1, 2, 3), y = c(2, 4, 6))
# Scatter Plot
ggplot(data, aes(x = x, y = y)) +
  geom_point(color = "red") +
  theme_minimal()
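A scatter plot is often read alongside a numeric summary of the relationship it shows. The sketch below reuses the base R sample data and adds a correlation coefficient and a fitted straight line; the use of cor() and lm() here is an illustrative addition, not a required step.

```r
# Same sample data as the base R example
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 6, 8, 10)

# Pearson correlation between the two variables
r <- cor(x, y)

# Least-squares straight line through the points
fit <- lm(y ~ x)

# Scatter plot with the fitted line drawn on top
plot(x, y, main = "Scatter Plot with Fitted Line",
     xlab = "X-axis", ylab = "Y-axis", col = "blue")
abline(fit, lty = 2)

print(r)
print(coef(fit))
```

For this sample the points lie exactly on a line, so the correlation is 1; real data will usually scatter around the fitted line.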
Histograms in R
Histograms represent the distribution of a dataset by dividing it into intervals (bins).
Creating a Histogram:
Base R:
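# Sample Data
values <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4)
# Histogram with one bin per integer value (breaks at 0.5, 1.5, ..., 4.5);
# the default breaks can leave empty gaps between bars for integer data
hist(values, breaks = seq(0.5, 4.5, by = 1), main = "Histogram",
     xlab = "Values", col = "green", border = "black")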
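To see what binning does numerically, the counts behind a histogram can be computed without drawing it. This is a minimal sketch; the break points (one bin per integer value) are an assumption chosen to suit this sample.

```r
# Same sample data as above
values <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4)

# Assumed break points: one bin per integer value
breaks <- seq(0.5, 4.5, by = 1)

# plot = FALSE returns the binning result instead of drawing it
h <- hist(values, breaks = breaks, plot = FALSE)

# One row per bin: lower edge, upper edge, number of observations
bins <- data.frame(
  lower = head(h$breaks, -1),
  upper = tail(h$breaks, -1),
  count = h$counts
)
print(bins)

# The same counts obtained with cut() and table()
table(cut(values, breaks = breaks))
```

Changing breaks changes the counts, which is why the same data can look quite different under different bin widths.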
ggplot2 Package:
library(ggplot2)
# Sample Data
data <- data.frame(values = c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4))
# Histogram
ggplot(data, aes(x = values)) +
  geom_histogram(binwidth = 1, fill = "blue", color = "black") +
  theme_light()
UNIT – 5
Visualization Approaches