Analytics Assignment 2

Contents

1 Introduction
2 Part 1 – recent development (data democratization)
3 Part 2 – comparative analysis of data visualization concepts
4 Part 3(a) – Visualizations and dashboard
4.1 First visualization (total number of bedrooms sold per ZIP code)
4.2 Second visualization (price of house sold vs. house year built)
4.3 Third visualization (sum price of houses sold by ZIP code)
4.4 Fourth visualization (median price of house sold by ZIP code)
5 Visualization dashboard
6 Part 3(b) – Interpretation of visualizations

1 Introduction

This assignment is a case study of housing unit sales in King County, US, between 2014 and 2015. The data were obtained from Kaggle, the visualizations were created with Tableau 22.0, and their managerial interpretations are discussed. The paper also gives a brief account of a recent development in visualization, namely data democratization, given its relevance to the housing data obtained from Kaggle. Some insights are also included on the relevance of the visualization concepts to the UAE Ministry of Health.

2 Part 1 – recent development (data democratization)

The concept most applicable to our visualizations and dataset is data democratization. In many cases, parts of the information in use may be difficult for everyone to understand. Research by Flaherty, Sturm, & Farries (2022) suggests that one of the challenges associated with data visualizations is the relatively limited understanding of individual employees who work in silos, a problem that can be solved through data democratization.

For instance, the chosen dataset, house sales in King County, US, can help us uncover insights about the housing market in that county. However, some aspects of the information may not be easily understood without specific knowledge, such as that held by some of the real estate agents in the county.

Data democratization therefore refers to enabling everyone in the company to be comfortable with the data, so that they can work with it, or at least interact with it, regardless of their technical knowledge of data analysis. In our present dataset, data democratization is important because some details of the housing sales in the county can be better understood by individuals who were involved in the transactions.

As per Sil, Sharma, Jhamb, Marathe, & Sharma (2021), data democratization in multiple contexts can help us gain a better understanding of the data, including the ways it is visualized or interpreted as well as the insights it provides.

3 Part 2 – comparative analysis of data visualization concepts

4 Part 3(a) – Visualizations and dashboard

For the chosen dataset, housing sales in King County, the student has elected to create four data visualizations with Tableau: (1) total number of bedrooms sold per ZIP code, (2) price of house sold vs. year of house built, (3) sum price of houses sold by ZIP code, and (4) median price of house sold by ZIP code. The four visualizations were then combined into a dashboard, which is included later in the section. The next section (3b) interprets the importance of each visualization from the managerial perspective and how it could support strategic decision-making in the company's boardroom.

4.1 First visualization (total number of bedrooms sold per ZIP code)

The first visualization focuses on the number of bedrooms sold per ZIP code. The student has constructed a heatmap to show in which localities the greatest number of bedrooms, and by extension houses, were sold. The heatmap visualization is shown below. The variables in the visualization are:

• 'SUM (Bedrooms)' – the total number of bedrooms sold, since this is a SUM function of the number of bedrooms sold

• Zip code – the respective ZIP code representing the locality in the county

The greater the number of bedrooms sold in a ZIP code, the darker the shade of blue of its box in the heatmap. A later visualization covers the total price of the houses sold per ZIP code, but sales volume and sales revenue need not coincide: some neighbourhoods may have a higher sales volume of housing units, while others may have higher sales revenue from housing units.
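
The aggregation behind this heatmap can be sketched outside Tableau. The snippet below is a minimal illustration, not the workbook's actual logic: the sale records are made up, the function names are hypothetical, and the linear shading scale is an assumption (Tableau's own colour mapping may differ).

```python
from collections import defaultdict

# Hypothetical sale records: (zip_code, bedrooms). These rows are made up
# for illustration and are NOT taken from the Kaggle dataset.
sales = [
    ("98052", 3),
    ("98052", 4),
    ("98006", 5),
    ("98039", 4),
    ("98006", 3),
]


def total_bedrooms_by_zip(records):
    """Return {zip_code: total bedrooms}, mirroring SUM(Bedrooms) per ZIP code."""
    totals = defaultdict(int)
    for zip_code, bedrooms in records:
        totals[zip_code] += bedrooms
    return dict(totals)


def shade(value, lowest, highest):
    """Map a total to an intensity in [0.0, 1.0]; 1.0 stands for the darkest blue.

    A linear scale is assumed here purely for illustration.
    """
    if highest == lowest:
        return 1.0
    return (value - lowest) / (highest - lowest)


totals = total_bedrooms_by_zip(sales)
low, high = min(totals.values()), max(totals.values())

# Print ZIP codes from the highest to the lowest bedroom total.
for zip_code, total in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(zip_code, total, round(shade(total, low, high), 2))
```

With real data, the list of records would be read from the Kaggle file instead of being typed in; the column names used there would need to be checked against the file itself.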

4.2 Second visualization (price of house sold vs. house year built)

The second visualization compares two variables, 'Yr_built' and 'SUM (Price)'. The two variables are defined as follows:

• Yr_built – the year in which the house was initially built (not renovated)

• SUM (Price) – the total sale price of all houses sold that were constructed in the respective year; the SUM function adds up the individual sale prices

The illustration is a line graph with a trend line showing how the year of construction relates to house prices, or what customers are willing to pay. The aim of this graph is to understand whether there is any trend in how houses are priced according to the year in which they were constructed.
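
Because SUM (Price) adds up every sale for a given build year, the total reflects both how many houses were sold and how much each one fetched. The sketch below illustrates this with made-up records; the function name and the data are hypothetical and are not drawn from the Kaggle dataset.

```python
from collections import defaultdict

# Hypothetical records: (yr_built, sale_price_usd). Made up for illustration.
sales = [
    (1995, 400_000),
    (1995, 450_000),
    (2014, 600_000),
    (2014, 650_000),
    (2014, 700_000),
    (2015, 800_000),
]


def summarize_by_year(records):
    """Return {yr_built: (count, total_price, mean_price)}, ordered by year."""
    grouped = defaultdict(list)
    for yr_built, price in records:
        grouped[yr_built].append(price)
    return {
        year: (len(prices), sum(prices), sum(prices) / len(prices))
        for year, prices in sorted(grouped.items())
    }


summary = summarize_by_year(sales)
for year, (count, total, mean) in summary.items():
    print(f"{year}: n={count}, SUM(Price)={total:,}, mean={mean:,.0f}")
```

In this toy data the single 2015 house has the highest price but the lowest total, which is why it can help to read the count and the mean alongside the sum.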

4.3 Third visualization (sum price of houses sold by ZIP code)

The third visualization shows the sum price of houses sold, classified by ZIP code. The variables included here were 'MEDIAN (Price)', 'SUM (Price)', and Zip code. MEDIAN (Price) is defined under the fourth visualization; the other two are defined below:

• SUM (Price) – the sum of the prices of all the houses sold in a ZIP code (in this visualization)

• Zip code – the ZIP code for the locality in King County

Since house sales bring profits for the real estate companies in the county, the visualization uses the automatic green colour palette. The higher the sum price of the houses sold in a ZIP code or locality, the greener its box in the heatmap. For instance, the 98004 ZIP code is greener than the 98039 ZIP code, because the sum price for 98004 is approximately $429 million, which is greater than that of 98039, at $108 million.

4.4 Fourth visualization (median price of house sold by ZIP code)

The fourth and final visualization again considers two variables, 'Zip code' and 'MEDIAN (Price)', defined as follows:

• Zip code – the ZIP code for the locality in King County

• MEDIAN (Price) – the median price of the houses sold in the said locality or ZIP code

The difference between the third and fourth visualizations is that the third is a heatmap of the most profitable ZIP codes or localities, those with the highest SUM (Price), while the fourth focuses on the median price of the houses sold per locality or ZIP code.

The visualization is shown below.
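
Separately from the Tableau figure, the distinction between SUM (Price) and MEDIAN (Price) can be sketched in a few lines. The example is a minimal illustration under stated assumptions: the locality labels `ZIP_A` and `ZIP_B` and all prices are made up, and the function name is hypothetical.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: (zip_code, sale_price_usd). Labels and prices are
# made up for illustration and are NOT taken from the Kaggle dataset.
sales = [
    ("ZIP_A", 300_000), ("ZIP_A", 320_000), ("ZIP_A", 350_000),
    ("ZIP_A", 330_000), ("ZIP_A", 310_000), ("ZIP_A", 340_000),
    ("ZIP_A", 360_000),
    ("ZIP_B", 900_000), ("ZIP_B", 1_100_000),
]


def price_stats_by_zip(records):
    """Return {zip_code: {"sum": total price, "median": median price}}."""
    grouped = defaultdict(list)
    for zip_code, price in records:
        grouped[zip_code].append(price)
    return {
        zip_code: {"sum": sum(prices), "median": median(prices)}
        for zip_code, prices in grouped.items()
    }


stats = price_stats_by_zip(sales)

# The locality with the highest total need not have the highest median.
highest_sum = max(stats, key=lambda z: stats[z]["sum"])
highest_median = max(stats, key=lambda z: stats[z]["median"])
print("Highest SUM(Price):", highest_sum)
print("Highest MEDIAN(Price):", highest_median)
```

Here the locality with many moderately priced sales leads on the sum, while the one with a few expensive sales leads on the median, which is the contrast the third and fourth visualizations are meant to bring out.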

5 Visualization dashboard

The visualization dashboard brings together the four visualizations: the number of bedrooms sold per ZIP code, the sum price of the houses sold per ZIP code, the median price of houses sold by ZIP code, and the price of house sold vs. the year the house was built.

6 Part 3(b) – Interpretation of visualizations

The purpose of this section is to explain how and why each of the four visualizations in the dashboard is important from the managerial perspective. In other words, each of those figures offers insights that the company could use to make better decisions in the boardroom.

The first visualization, which examines the number of bedrooms sold per ZIP code, indicates that ZIP codes 98052, 98038, and 98006 are the three localities with the highest sales volume (in terms of number of bedrooms sold), with 2,076, 2,072, and 1,913 units respectively. It therefore appears that residents in those localities may be purchasing more properties, so marketing campaigns could be directed at prospective buyers in those areas. In contrast, ZIP codes 98148 and 98039 had only 179 and 203 bedrooms sold. This can be interpreted as a sign that those localities have few buyers for housing properties, so further construction there could be halted or discontinued to save capital that may be reallocated elsewhere.
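
The ranking described above can be reproduced with a short sketch. It uses only the five bedroom totals quoted in this section, so the result is a ranking among those five ZIP codes, not across the full dataset; the function name `rank_zip_codes` is hypothetical.

```python
# Bedroom totals per ZIP code as quoted in the paragraph above.
bedrooms_sold = {
    "98052": 2076,
    "98038": 2072,
    "98006": 1913,
    "98148": 179,
    "98039": 203,
}


def rank_zip_codes(totals, n, highest=True):
    """Return the n ZIP codes with the highest (or lowest) totals."""
    ranked = sorted(totals, key=totals.get, reverse=highest)
    return ranked[:n]


top_three = rank_zip_codes(bedrooms_sold, 3)
bottom_two = rank_zip_codes(bedrooms_sold, 2, highest=False)
print("Highest volume:", top_three)
print("Lowest volume:", bottom_two)
```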

The second visualization helps us understand how the year of construction affects the price for which a house may be sold. The illustration shows that although there is an upward trend, houses built in specific years may be sold for very little in total. For instance, the houses built in 2014 were sold cumulatively for $382 million, but houses built in 2015 were sold cumulatively for only $28.87 million. It is also clear that the houses constructed in the 1990s were sold for much less. The managerial takeaway is that the company should not try to purchase older residences and properties with the intention of renovating and selling them, for two reasons: (1) the additional costs associated with renovating older properties, and (2) the overall drop in price due to the properties being dated in terms of their construction year.

The third visualization shows the cumulative sales of houses sold per ZIP code. Its purpose is to help us as managers analyse and understand where the total sales revenue from housing unit sales was highest. It is clear that ZIP codes 98004, 98006, and 98052 were among the highest grossing neighbourhoods, as the company was able to sell the most properties there. A connection can be drawn with the first visualization, where neighbourhoods 98006 and 98052 also accounted for some of the highest numbers of bedrooms sold. This further confirms that those ZIP codes should be targeted by the company for further property development, because there are buyers who have both the capital and the willingness to purchase houses.

Finally, the fourth visualization shows the median price of the houses sold per ZIP code. It was created to help us understand which localities have the wealthiest customers. The median price was highest in ZIP code 98039, at $1.892 million, though that is statistically an outlier. Nonetheless, ZIP codes 98004 and 98040 also had median prices of $1.15 million and $993,750 respectively, indicating that wealthier customers live in those ZIP codes. The company could therefore consider investing in luxury residential properties targeted at a relatively small market segment with adequate capital to purchase them. Apart from these, the median price ranges from $235,000 in ZIP code 98002 to $915,000 in ZIP code 98112, which gives us an idea of what housing prices in the county look like and the range within which most properties can be priced.
