CI Entrance Test


Database

Table Name dwd_login_event


Description Table records all login events on the Shopee App/Web
Primary keys: user_id, login_datetime
Column Name Data Type Description
user_id bigint user_id of the user who logged in
grass_date bigint Date of login event
login_datetime string Time of login event
login_platform string Platform of login: App/Web

Table Name order_item_mart


Description Table records all orders/transactions at product variation level from beginning of time
Primary keys: order_id, item_id, model_id

Column Name Data Type Description


order_id bigint order id of the order
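item_id bigint item id of the items purchased in the order Note: for bundle deal, the bundle is split into the actual item, and the actual itemid of the physical item is recorded
model_id bigint model id of the models purchased in the order Note: for bundle deal, the bundle is split into the actual item, and the actual model id of the physical item is recorded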
buyer_id bigint unique userid of the buyer
checkout_channel string Indicates which platform the checkout was performed on
is_web_checkout tinyint Whether the order was checked out from the web portal (web includes PC and Mobile Web)
create_timestamp bigint Timestamp of the order when it was created (epoch time)
create_datetime string Datetime of the order when it was created (local time in string)
complete_datetime string Datetime of the order when it was completed / accepted by the buyer after receiving all the parcels for the order (local time in string)
release_timestamp bigint Time when the order should already be completed and escrow started
release_datetime string Time when the order should already be completed and escrow started
cancel_datetime string Datetime of the order when it was canceled by the buyer (local time in string)
is_net_order string 1 is net, 0 is cancelled
shop_id bigint unique id of the shop
shop_name string name of the shop
is_official_shop tinyint Whether the order is from an official shop
is_cb_shop tinyint Whether the order is from a cross border shop
seller_id bigint unique userid of the seller
seller_name string username of the seller
seller_shipping_address_state string State of the shipping address of the seller
seller_shipping_address_city string City of the shipping address of the seller
buyer_shipping_address_state string State of the shipping address of the buyer
buyer_shipping_address_city string City of the shipping address of the buyer
gmv_usd double Gross merchandise value of the order in USD
item_amount bigint quantity of an item in order
main_category string category name of the item
estimate_shipping_fee_usd double The shipping fee calculated by Shopee system based on the buyer and seller location, parcel size
buyer_paid_shipping_fee_usd double Buyer paid shipping fee in USD
estimate_shipping_rebate_amt_us double Platform subsidy shipping fee in USD
voucher_rebate_usd double Platform subsidy product price by voucher in USD
fsv_promotion_id bigint Freeship voucher ID, null if not using
pv_promotion_id bigint Platform product discount voucher ID, null if not using

Table Name dim_user


Description Snapshot of user profile
Primary keys: user_id (buyer_id)

Column Name Data Type Description


user_id bigint unique user_id for each user
user_name string User's username
gender bigint User's gender: 1 is Male, 2 is Female, Null is Unknown
birthday date User's birthday
last_login_datetime string last login to app/web of platform
registration_datetime string User registration time - time at which account was created
default_delivery_address_state string State of the default delivery address that is set
default_delivery_address_city string City of the default delivery address that is set

Questions
Suggestions/Tips:
1 Think through the key metrics and the significant elements or components involved, with some reasonable assumptions to enable you to proceed with your thought process
2 Be clear on the rationale/reference for the assumptions

Q1 Generate monthly reports of DAU (daily unique login users), Daily buyers, Daily orders and Daily GMV
Break down the reports by Location
(Template & Queries are required)
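-- Assumes BigQuery Standard SQL; the *_datetime strings may need CAST(... AS DATETIME) before DATE()
WITH dau AS (
  -- Daily unique login users per location (user's default delivery address)
  SELECT
    FORMAT_DATE('%Y-%m', DATE(login_datetime)) AS month,
    DATE(login_datetime) AS date_,
    dim_user.default_delivery_address_state AS state,
    dim_user.default_delivery_address_city AS city,
    COUNT(DISTINCT user_id) AS dau
  FROM dwd_login_event
  JOIN dim_user
    USING(user_id)
  GROUP BY month, date_, state, city),
monthly_dau AS (
  -- Monthly average of the daily figures
  SELECT
    month,
    state,
    city,
    AVG(dau) AS avg_dau
  FROM
    dau
  GROUP BY
    month, state, city),
orders AS (
  -- Daily buyers, orders and GMV per location (buyer's shipping address)
  SELECT
    FORMAT_DATE('%Y-%m', DATE(create_datetime)) AS month,
    DATE(create_datetime) AS date_,
    buyer_shipping_address_state AS state,
    buyer_shipping_address_city AS city,
    COUNT(DISTINCT buyer_id) AS daily_unique_buyers,
    COUNT(DISTINCT order_id) AS daily_orders,
    SUM(gmv_usd) AS daily_gmv
  FROM
    order_item_mart
  GROUP BY
    month, date_, state, city),
monthly_orders AS (
  SELECT
    month,
    state,
    city,
    AVG(daily_unique_buyers) AS avg_daily_unique_buyers,
    AVG(daily_orders) AS avg_daily_orders,
    AVG(daily_gmv) AS avg_daily_gmv
  FROM
    orders
  GROUP BY
    month, state, city)
-- Combine both sides; a location may appear in only one of them
SELECT
  month,
  state,
  city,
  d.avg_dau,
  o.avg_daily_unique_buyers,
  o.avg_daily_orders,
  o.avg_daily_gmv
FROM
  monthly_dau d
FULL OUTER JOIN
  monthly_orders o
  USING(month, state, city)
ORDER BY
  month, state, city;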
Column Name Data Type Description
month STRING Month of the report in YYYY-MM format.
state STRING State of the user's default or buyer's shipping address.
city STRING City of the user's default or buyer's shipping address.
avg_dau FLOAT Monthly average of daily unique login users (DAU) for the specified month and location.
avg_daily_unique_buyers FLOAT Monthly average of daily unique buyers for the specified month and location.
avg_daily_orders FLOAT Monthly average of daily orders for the specified month and location.
avg_daily_gmv FLOAT Monthly average of daily Gross Merchandise Value (GMV) in USD for the specified month and location.
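The "monthly average of a daily metric" used in the template can be checked on a small sample outside the warehouse. The following is a minimal Python sketch, assuming login rows arrive as (user_id, 'YYYY-MM-DD') pairs; that layout, the function name and the sample rows are illustrative and not part of the schema above.

```python
from collections import defaultdict
from statistics import mean


def monthly_avg_dau(events):
    """events: iterable of (user_id, 'YYYY-MM-DD') login rows (hypothetical layout)."""
    users_per_day = defaultdict(set)
    for user_id, day in events:
        users_per_day[day].add(user_id)  # repeated logins on one day count once
    dau_per_month = defaultdict(list)
    for day, users in users_per_day.items():
        dau_per_month[day[:7]].append(len(users))  # 'YYYY-MM' prefix is the month
    # Average the daily counts within each month
    return {month: mean(counts) for month, counts in dau_per_month.items()}


# Illustrative rows only: user 1 logs in twice on the same day
events = [
    (1, "2024-01-01"), (1, "2024-01-01"), (2, "2024-01-01"),
    (1, "2024-01-02"),
    (3, "2024-02-01"),
]
result = monthly_avg_dau(events)
```

As in the query, days with no logins do not appear in the input, so they are not counted as zero in the average.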
Q2 Generate a report of daily new customers, and cohort retention from first order month
(Template & Queries are required)
WITH first_order AS (
  -- Cohort month = month of each buyer's first order
  SELECT
    buyer_id,
    FORMAT_DATE('%Y-%m', MIN(DATE(create_datetime))) AS cohort_month
  FROM order_item_mart
  GROUP BY buyer_id
),

cohort_size AS (
  -- Cohort size is computed once per cohort, not per order month
  SELECT
    cohort_month,
    COUNT(DISTINCT buyer_id) AS total_new_customers_acquired
  FROM first_order
  GROUP BY cohort_month
),

cohort_retention AS (
  SELECT
    f.cohort_month,
    FORMAT_DATE('%Y-%m', DATE(o.create_datetime)) AS order_month,
    DATE_DIFF(DATE(o.create_datetime), PARSE_DATE('%Y-%m', f.cohort_month),
      MONTH) AS month_number,
    COUNT(DISTINCT o.buyer_id) AS retained_customers
  FROM
    first_order f
  JOIN
    order_item_mart o
    ON f.buyer_id = o.buyer_id
  GROUP BY
    f.cohort_month, order_month, month_number
)

SELECT
  r.cohort_month,
  s.total_new_customers_acquired,
  r.order_month,
  r.month_number,
  r.retained_customers,
  ROUND(SAFE_DIVIDE(r.retained_customers, s.total_new_customers_acquired), 4) AS
    retention_rate
FROM
  cohort_retention r
JOIN
  cohort_size s
  ON r.cohort_month = s.cohort_month
ORDER BY
  r.cohort_month, r.order_month;
Column Name Data Type Description
cohort_month STRING The month in which new customers made their first purchase, formatted as 'YYYY-MM'.
total_new_customers_acquired INT The total number of new customers acquired in the cohort month.
order_month STRING The month in which subsequent orders were made, formatted as 'YYYY-MM'.
month_number INT The month difference from the cohort month to the order month.
retained_customers INT The number of customers from the cohort who made purchases in the order month.
retention_rate FLOAT The ratio of retained customers to total new customers acquired, rounded to four decimal places.
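The retention_rate column is a ratio of two counts taken at different grains: buyers active in an order month, over the fixed size of their cohort. A small Python sketch of that arithmetic follows, assuming orders are given as (buyer_id, 'YYYY-MM') pairs; the layout, names and sample rows are illustrative only.

```python
from collections import defaultdict


def cohort_retention(orders):
    """orders: iterable of (buyer_id, 'YYYY-MM') pairs (hypothetical layout)."""
    orders = list(orders)
    # Cohort month = earliest order month per buyer
    first_month = {}
    for buyer, month in orders:
        if buyer not in first_month or month < first_month[buyer]:
            first_month[buyer] = month
    # Cohort size is fixed per cohort month
    cohort_size = defaultdict(int)
    for month in first_month.values():
        cohort_size[month] += 1
    # Distinct buyers active per (cohort month, order month)
    active = defaultdict(set)
    for buyer, month in orders:
        active[(first_month[buyer], month)].add(buyer)
    return {key: len(buyers) / cohort_size[key[0]] for key, buyers in active.items()}


# Illustrative rows: buyers a and b start in January, only a returns in February
orders = [("a", "2024-01"), ("b", "2024-01"), ("a", "2024-02"), ("c", "2024-02")]
rates = cohort_retention(orders)
```

The denominator never changes with the order month, which is why the cohort size has to be computed separately from the per-month activity.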

Q3 Find out top 10 items by orders for each seller segment by month, together with average selling price/item, min price, max price, and orders and gmv coverage
Seller segments are defined by:
- Short Tail (>20 average daily orders)
- Mid Tail (between 10 - 20 average daily orders)
- Long Tail (< 10 average daily orders)
(Queries are required)
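WITH seller_daily AS (
  -- Orders per seller per day
  SELECT
    seller_id,
    DATE(create_datetime) AS order_date,
    COUNT(DISTINCT order_id) AS daily_orders
  FROM order_item_mart
  GROUP BY seller_id, order_date
),
seller_segment AS (
  -- Average is taken over days on which the seller had at least one order
  SELECT
    seller_id,
    CASE
      WHEN AVG(daily_orders) > 20 THEN 'Short Tail'
      WHEN AVG(daily_orders) >= 10 THEN 'Mid Tail'
      ELSE 'Long Tail'
    END AS segment
  FROM seller_daily
  GROUP BY seller_id
),
monthly_stats AS (
  SELECT
    s.segment,
    FORMAT_DATE('%Y-%m', DATE(o.create_datetime)) AS order_month,
    o.item_id,
    COUNT(DISTINCT o.order_id) AS total_orders,
    SUM(o.gmv_usd) AS total_gmv,
    AVG(SAFE_DIVIDE(o.gmv_usd, o.item_amount)) AS avg_price,
    MIN(SAFE_DIVIDE(o.gmv_usd, o.item_amount)) AS min_price,
    MAX(SAFE_DIVIDE(o.gmv_usd, o.item_amount)) AS max_price
  FROM order_item_mart o
  JOIN seller_segment s ON o.seller_id = s.seller_id
  GROUP BY s.segment, order_month, o.item_id
)
-- Top 10 items by orders per segment and month.
-- Coverage is interpreted here as the item's share of its segment's monthly item-level totals.
SELECT
  *,
  SAFE_DIVIDE(total_orders, SUM(total_orders) OVER (PARTITION BY segment, order_month)) AS orders_coverage,
  SAFE_DIVIDE(total_gmv, SUM(total_gmv) OVER (PARTITION BY segment, order_month)) AS gmv_coverage
FROM monthly_stats
QUALIFY ROW_NUMBER() OVER (PARTITION BY segment, order_month ORDER BY total_orders DESC) <= 10
ORDER BY segment, order_month, total_orders DESC;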
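The segment definitions depend on average daily orders rather than total orders, so the boundary handling is worth spelling out. Below is a minimal Python sketch of the thresholds, assuming the average is taken over days on which the seller had at least one order (an assumption; the question does not say which days count). The function names are illustrative.

```python
def average_daily_orders(daily_order_counts):
    """daily_order_counts: orders per active day for one seller (hypothetical input)."""
    return sum(daily_order_counts) / len(daily_order_counts)


def seller_segment(avg_daily_orders):
    """Map average daily orders to the segments defined in Q3."""
    if avg_daily_orders > 20:
        return "Short Tail"
    if avg_daily_orders >= 10:  # 10 to 20 inclusive
        return "Mid Tail"
    return "Long Tail"


# Illustrative seller with three active days
avg = average_daily_orders([30, 10, 20])
segment = seller_segment(avg)
```

A seller averaging exactly 20 orders per day falls in Mid Tail here, because Short Tail is defined as strictly more than 20.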

Q4 Segment platform orders and GMV by buyer segment

Candidates can define their own criteria for segmenting buyer groups, with an explanation of the rationale

(Criteria, Rationale for Criteria & Queries are required)

Criteria

1. Frequency of Purchase:
- Frequent Buyers: Purchased > 5 times
- Occasional Buyers: Purchased 2 to 5 times
- New Buyers: First purchase (1 order)

2. Monetary Value of Purchases:


- High-Value Customers: Total spending > $500
- Medium-Value Customers: Total spending between $100 and $500
- Low-Value Customers: Total spending < $100

3. Recency of Purchase:
- Active Customers: Purchased within the last month
- At-Risk Customers: Last purchase 1-3 months ago
- Inactive Customers: No purchases in the last 3 months

Rationale
This approach utilizes the RFM model, which is a popular method used to segment customers based on their purchasing behavior. This helps identify distinct groups of users, allowing for tailored campaigns.

Example
Frequent Buyers & High-Value: Optimize marketing strategies for these buyers to enhance retention and increase average order value. Campaigns may include exclusive offers or loyalty programs designed to reward these valuable customers.

Recent Buyers: They're often more likely to buy again, so targeted follow-up communications or tailored promotions can help maintain their interest.

Inactive Buyers: Special promotions or reminders about products they viewed can entice them back to the platform.

Note
While the answer below uses SQL queries, I would consider scoring customers on these three dimensions using Python as a more flexible solution. This would also allow developing more comprehensive scoring systems and adding other attributes, such as engagement metrics and demographic information, which give a fuller understanding of customer profiles.

WITH buyer_stats AS (
  -- Per-buyer totals; reconstructed from the columns used below
  SELECT
    buyer_id,
    COUNT(DISTINCT order_id) AS total_orders,
    SUM(gmv_usd) AS total_gmv,
    MAX(DATE(create_datetime)) AS last_order_date
  FROM
    order_item_mart
  GROUP BY
    buyer_id
),

buyer_segments AS (
  SELECT
    buyer_id,
    total_orders,
    total_gmv,
    last_order_date,
    CASE
      WHEN total_orders > 5 THEN 'Frequent'
      WHEN total_orders BETWEEN 2 AND 5 THEN 'Occasional'
      ELSE 'New'
    END AS frequency,
    CASE
      WHEN total_gmv > 500 THEN 'High-Value'
      WHEN total_gmv BETWEEN 100 AND 500 THEN 'Medium-Value'
      ELSE 'Low-Value'
    END AS monetary,
    CASE
      WHEN last_order_date >= DATE_SUB(CURRENT_DATE(),
        INTERVAL 1 MONTH) THEN 'Active'
      WHEN last_order_date >= DATE_SUB(CURRENT_DATE(),
        INTERVAL 3 MONTH) THEN 'At-Risk'
      ELSE 'Inactive'
    END AS recency
  FROM
    buyer_stats
)

SELECT
  frequency,
  monetary,
  recency,
  COUNT(buyer_id) AS number_of_buyers,
  SUM(total_orders) AS segment_orders,
  SUM(total_gmv) AS segment_gmv,
  AVG(total_orders) AS avg_orders,
  AVG(total_gmv) AS avg_gmv
FROM
  buyer_segments
GROUP BY
  frequency, monetary, recency
ORDER BY
  segment_gmv DESC, frequency, monetary, recency;
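The note above mentions Python as a more flexible way to score buyers. A minimal sketch of the same three-way segmentation is shown here, assuming per-buyer totals are already available, treating a single order as 'New', and approximating 1 and 3 months as 30 and 90 days; those approximations, the function name and the sample values are assumptions for illustration.

```python
from datetime import date


def rfm_segment(total_orders, total_gmv, last_order_date, today):
    """Return (frequency, monetary, recency) labels for one buyer."""
    if total_orders > 5:
        frequency = "Frequent"
    elif total_orders > 1:
        frequency = "Occasional"
    else:
        frequency = "New"  # assumption: exactly one order means first purchase

    if total_gmv > 500:
        monetary = "High-Value"
    elif total_gmv >= 100:
        monetary = "Medium-Value"
    else:
        monetary = "Low-Value"

    days_since = (today - last_order_date).days
    if days_since <= 30:      # assumption: 1 month ~ 30 days
        recency = "Active"
    elif days_since <= 90:    # assumption: 3 months ~ 90 days
        recency = "At-Risk"
    else:
        recency = "Inactive"
    return frequency, monetary, recency


today = date(2025, 1, 1)
labels = rfm_segment(6, 600.0, date(2024, 12, 20), today)
```

Keeping the thresholds in one function makes it straightforward to swap the fixed cut-offs for quantile-based scores or to add further attributes later.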

Q5 Open-Ended Question
Build a Projection Model (From Orders, GMV to P&L) For TikTokShop for 2025
To build a comprehensive projection model for TikTokShop’s performance in 2025, I would combine historical data, relevant market trends, and specific performance factors. This approach would allow us to segment projections by item category, shop metrics (like follower counts), and make comparisons with competitor platforms such as Shopee and Lazada. Below is a detailed breakdown of my proposed methodology:

1. Establish a Baseline
Data Collection & Cleaning: Gather historical data from 2023 to 2024 on key metrics like daily orders, monthly active users (MAUs), gross merchandise value (GMV), average order value (AOV), and other relevant financial metrics. Clean and prepare this data to ensure accuracy.

Identify Patterns and Seasonality: Visualize historical data to identify trends over time. This includes analyzing seasonal variations, such as sales spikes during campaigns, holidays, or specific promotional periods. Using seasonal decomposition techniques, we can break down data into trend, seasonal, and residual components to better understand how different factors impact overall performance.
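To make the trend/seasonal/residual split concrete, here is a minimal pure-Python sketch of a classical additive decomposition (centered moving average for the trend, per-position averages for the seasonal part). It is one textbook variant, not necessarily the technique the final model would use; the series below is synthetic and the function name is illustrative.

```python
def decompose_additive(series, period):
    """Classical additive decomposition: series = trend + seasonal + residual."""
    n = len(series)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half:t + half + 1]
        if period % 2:
            trend[t] = sum(window) / period
        else:
            # Even period: centered average with half weight on both end points
            trend[t] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
    # Seasonal effect: mean detrended value at each position in the cycle
    buckets = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(series[t] - trend[t])
    raw = [sum(b) / len(b) for b in buckets]
    offset = sum(raw) / period  # re-center so seasonal effects sum to zero
    seasonal = [raw[t % period] - offset for t in range(n)]
    residual = [None if trend[t] is None else series[t] - trend[t] - seasonal[t]
                for t in range(n)]
    return trend, seasonal, residual


# Synthetic series: linear trend plus a repeating 4-step pattern
pattern = [5, -3, -4, 2]
series = [100 + 2 * t + pattern[t % 4] for t in range(16)]
trend, seasonal, residual = decompose_additive(series, 4)
```

On this synthetic input the recovered seasonal component matches the pattern and the residual is essentially zero; on real order data the residual is where campaign spikes and one-off events would show up.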

Statistical Testing: Run statistical tests to assess stationarity in the time series data. If non-stationary, differencing (and, where the variance grows with the level, a log transform) may be required to stabilize the mean and variance before moving forward with further analysis.
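Differencing itself is simple to illustrate. The sketch below, in plain Python, shows how first differences remove a linear trend and how a seasonal lag removes a repeating pattern; the series are synthetic and the function name is illustrative. A formal stationarity test (for example an augmented Dickey-Fuller test) would normally come from a statistics library and is not reproduced here.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both values exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Synthetic trending series: the level rises by 2 each period
trending = [10, 12, 14, 16, 18]
first_diff = difference(trending)

# Synthetic seasonal series with a 4-step cycle on top of a trend
seasonal_series = [100 + 2 * t + [5, -3, -4, 2][t % 4] for t in range(12)]
seasonal_diff = difference(seasonal_series, lag=4)
```

After differencing, the trending series becomes a constant, and the seasonal series differenced at lag 4 becomes a constant as well, since both the trend step and the repeating pattern cancel.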

2. Forecasting Techniques
Model Selection: To find the optimal forecasting model, I would explore several approaches:

Linear Regression: Identify independent variables (e.g., MAUs, marketing spend) that may impact the dependent variable, such as GMV. By fitting a regression model, we can estimate how changes in each variable (e.g., a 1% increase in MAUs) might affect GMV or order counts.
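One way to read "a 1% increase in MAUs" off a regression is to fit it on logarithms, so the slope is an elasticity. The following pure-Python sketch uses ordinary least squares with a single predictor; the data are synthetic (generated with a known elasticity of 0.8) purely to show the mechanics, and the names are illustrative.

```python
import math


def ols_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x (single predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic data: GMV generated as 50 * MAU^0.8, so the true elasticity is 0.8
maus = [100.0, 120.0, 150.0, 200.0]
gmv = [50 * m ** 0.8 for m in maus]

# Log-log regression: the slope is the % change in GMV per 1% change in MAUs
elasticity, log_intercept = ols_fit([math.log(m) for m in maus],
                                    [math.log(g) for g in gmv])
```

With real data the fit would not be exact, and additional drivers such as marketing spend would call for a multiple regression rather than this single-variable form.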

Moving Averages: Use simple or weighted moving averages to smooth out short-term fluctuations, highlighting longer-term trends. A 3-month moving average, for example, can be useful to visualize underlying trends by minimizing seasonal noise.
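As a small illustration of the two variants, the sketch below computes a simple and a weighted trailing moving average in plain Python. The monthly values and the weights are made up for the example.

```python
def moving_average(series, window=3):
    """Simple trailing moving average; the first window-1 points have no value."""
    return [sum(series[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(series))]


def weighted_moving_average(series, weights):
    """Trailing weighted average; the last weight applies to the most recent point."""
    window = len(weights)
    total = sum(weights)
    return [sum(w * v for w, v in zip(weights, series[i - window + 1:i + 1])) / total
            for i in range(window - 1, len(series))]


# Made-up monthly order counts
monthly_orders = [10, 20, 30, 40]
sma = moving_average(monthly_orders, window=3)
wma = weighted_moving_average(monthly_orders, weights=[1, 2, 3])
```

The weighted version reacts faster to recent changes because the latest month carries the largest weight.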

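Advanced Time Series Models: Models like ARIMA (with a seasonal extension) or Holt-Winters Exponential Smoothing account for seasonality and trends, providing more accurate forecasts. Exponential smoothing models in particular assign greater weight to recent data, which is particularly valuable for forecasting in dynamic market conditions.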
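The "greater weight to recent data" idea can be seen in simple exponential smoothing, the building block that Holt-Winters extends with trend and seasonal terms. The sketch below implements only that simple form, not full Holt-Winters or ARIMA; the input values and the smoothing factor are illustrative.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: level = alpha * x + (1 - alpha) * previous level."""
    level = series[0]  # initialize with the first observation
    smoothed = [level]
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
        smoothed.append(level)
    return smoothed


# Illustrative values; alpha = 0.5 gives the latest point half of the weight
smoothed = exponential_smoothing([100, 110, 120], alpha=0.5)
next_period_forecast = smoothed[-1]  # the flat forecast is the last level
```

Each older observation is discounted by a further factor of (1 - alpha), so a larger alpha makes the forecast follow recent changes more closely.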

3. Key Revenue Drivers


Order Growth Rate: Project order growth based on historical seasonality and TikTok’s user growth estimates. For instance, if TikTok expects a 20% increase in user base in 2025, we can anticipate a related rise in orders, especially during high-traffic periods.

Revenue Streams: Estimate platform fees (e.g., 10% commission on GMV) and advertising revenue potential, particularly if ad spend is growing. For example, if ad engagement trends are rising, we could see an increase in ad revenue as part of total revenue.
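The chain from orders to GMV to commission revenue is plain arithmetic, sketched below in Python. The 20% growth and 10% commission are the example rates from the text; the baseline order count and the average order value are hypothetical placeholders, and orders are assumed to grow in line with users, which is a simplification.

```python
def project_revenue(base_orders, order_growth, aov_usd, commission_rate):
    """Project orders, GMV and commission revenue for one period."""
    orders = base_orders * (1 + order_growth)   # assumes orders grow with users
    gmv_usd = orders * aov_usd                  # GMV = orders x average order value
    commission_usd = gmv_usd * commission_rate  # platform fee on GMV
    return {"orders": orders, "gmv_usd": gmv_usd, "commission_usd": commission_usd}


# Hypothetical baseline: 1,000,000 orders at a 25 USD average order value
projection = project_revenue(
    base_orders=1_000_000,
    order_growth=0.20,      # example rate from the text
    aov_usd=25.0,           # hypothetical
    commission_rate=0.10,   # example rate from the text
)
```

Advertising revenue would be added as a separate line on top of the commission, since it is not derived from GMV.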

4. Key Cost Drivers


Fixed & Variable Costs: Account for both fixed and variable costs, such as marketing and user acquisition expenses, fulfillment and logistics, and platform maintenance. By analyzing past spend ratios (e.g., marketing costs as a percentage of GMV), we can project future expenses. If marketing expenses historically average 15% of GMV, this can guide budget estimates moving forward.
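Putting revenue and cost ratios together gives a first-pass P&L line. The Python sketch below applies the example ratios from the text (10% commission, 15% marketing) to a hypothetical GMV figure; the other variable cost ratio, the fixed costs and the GMV itself are placeholders, not estimates.

```python
def project_pnl(gmv_usd, commission_rate, marketing_ratio,
                other_variable_ratio, fixed_costs_usd):
    """Rough operating P&L from GMV using spend ratios (all inputs are assumptions)."""
    revenue = gmv_usd * commission_rate
    marketing = gmv_usd * marketing_ratio
    other_variable = gmv_usd * other_variable_ratio  # e.g. fulfillment and logistics
    operating_profit = revenue - marketing - other_variable - fixed_costs_usd
    return {
        "revenue_usd": revenue,
        "marketing_usd": marketing,
        "other_variable_usd": other_variable,
        "fixed_costs_usd": fixed_costs_usd,
        "operating_profit_usd": operating_profit,
    }


pnl = project_pnl(
    gmv_usd=30_000_000,         # hypothetical
    commission_rate=0.10,       # example rate from the text
    marketing_ratio=0.15,       # example rate from the text
    other_variable_ratio=0.02,  # hypothetical
    fixed_costs_usd=500_000,    # hypothetical
)
```

With these example ratios the operating profit is negative, which shows how sensitive the P&L is to the gap between the commission rate and the marketing ratio, and why ad revenue matters to the model.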
