Description
I performed a data analysis for theLook eCommerce, a dummy fashion e-commerce company with the goal of gaining valuable insights.
The analysis involved calculating the monthly growth of inventory by product category, identifying the category that showed the lowest business growth over the past year, and creating a retention cohort in 2022.
Project Details
Type | Tools |
---|---|
RevoU Assignment | Google BigQuery |
Google Sheets | |
Tableau |
Dataset Overview
theLook is a fictitious eCommerce clothing site developed by the Looker team.
The dataset contains information about customers, products, orders, logistics, web events and digital marketing campaigns.
The contents of this dataset are synthetic, and are provided to industry practitioners for the purpose of product discovery, testing, and evaluation.
This dataset is public and hosted in Google BigQuery.
Business Background
You are a data analyst in a fashion e-commerce company called theLook. Currently, the company is in the optimization mode caused by the potential crisis in 2023. The management has decided to cut off resources in some categories the lowest growth in the past 1 year. On another side, they want to continue the analysis by understanding their inventory stock growth and the retention behaviors of the users.
Objectives
- Find categories with the lowest business growth.
- Calculate monthly growth of inventory in percentage breakdown by product categories.
- Create monthly user retention cohorts.
Data used will be data from 2022.
1. Categories with lowest business growth
SQL Syntax
WITH main AS(
SELECT COUNT(oi.id) AS products_sold,
ii.product_category AS category,
DATE(DATE_TRUNC(o.created_at, MONTH)) AS year_month,
SUM(oi.sale_price) AS revenue,
SUM(oi.sale_price) - SUM(ii.cost) AS profit
FROM `bigquery-public-data.thelook_ecommerce.orders` AS o
JOIN `bigquery-public-data.thelook_ecommerce.order_items` AS oi
ON oi.order_id = o.order_id
JOIN `bigquery-public-data.thelook_ecommerce.inventory_items` AS ii
ON oi.inventory_item_id = ii.id
WHERE EXTRACT(YEAR FROM o.created_at) IN (2021, 2022)
AND o.status = 'Complete'
GROUP BY category, year_month
ORDER BY category, year_month
),
cte_revenue AS(
SELECT *
FROM(
SELECT * EXCEPT(profit),
LAG(revenue) OVER(PARTITION BY category ORDER BY year_month) AS previous_revenue
FROM main
)
),
cte_profit AS(
SELECT *
FROM(
SELECT * EXCEPT(revenue),
LAG(profit) OVER(PARTITION BY category ORDER BY year_month) AS previous_profit
FROM main
)
)
SELECT main.category,
main.year_month,
main.products_sold,
rev.revenue,
rev.previous_revenue,
pro.profit,
pro.previous_profit,
(rev.revenue - rev.previous_revenue) / rev.previous_revenue AS revenue_growth,
(pro.profit - pro.previous_profit) /pro.previous_profit AS profit_growth
FROM main
JOIN cte_revenue AS rev
ON main.category = rev.category AND main.year_month = rev.year_month
JOIN cte_profit AS pro
ON main.category = pro.category AND main.year_month = pro.year_month
WHERE EXTRACT(YEAR FROM main.year_month) IN (2022)
ORDER BY category, year_month
Visualization
Link the query to Google Sheets and create a visualization using Tableau.
There are 3 visualizations within this dashboard:
- BCG Matrix:
- Profit CMGR : profit in latest month / profit in first month 1 / [latest month - first month] - 1
- Market Share : revenue product / total revenue
- Profit / revenue growth time-series chart.
- Categories breakdown by its revenue, profit, and number of products sold.
2. Monthly Growth of Inventory
SQL Syntax
WITH main AS(
SELECT product_category AS category,
DATE(DATE_TRUNC(ii.created_at, MONTH)) AS year_month,
count(ii.id) AS stock
FROM `bigquery-public-data.thelook_ecommerce.inventory_items` AS ii
LEFT JOIN `bigquery-public-data.thelook_ecommerce.order_items` AS oi
ON ii.id = oi.inventory_item_id
WHERE EXTRACT(YEAR FROM ii.created_at) IN (2021, 2022)
AND ii.sold_at IS NULL
GROUP BY category, year_month
),
cte_inventory AS(
SELECT *
FROM(
SELECT *,
LAG(stock) OVER(PARTITION BY category ORDER BY year_month) AS previous_stock
FROM main
)
)
SELECT *,
(stock - previous_stock) / previous_stock AS inventory_growth
FROM cte_inventory
WHERE EXTRACT(YEAR FROM year_month) IN (2022)
ORDER BY category, year_month
Visualization
Link the query to Google Sheets and create a Pivot Table.
3. Monthly User Retention Cohorts
SQL Syntax
WITH first_purchase AS(
SELECT user_id,
MIN(DATE(DATE_TRUNC(created_at, MONTH))) AS purchase_month
FROM `bigquery-public-data.thelook_ecommerce.orders`
WHERE EXTRACT(YEAR FROM created_at) IN (2022)
AND status = 'Complete'
GROUP BY user_id
),
-- only 1 purchase per month tracked here
activity AS(
SELECT o.user_id,
DATE_DIFF(
DATE(DATE_TRUNC(o.created_at, MONTH)),
fp.purchase_month,
MONTH
) AS month_number
FROM `bigquery-public-data.thelook_ecommerce.orders` AS o
JOIN first_purchase AS fp
ON o.user_id = fp.user_id
WHERE EXTRACT(YEAR FROM o.created_at) IN (2022)
AND status = 'Complete'
GROUP BY o.user_id, month_number
),
cohort_size AS(
SELECT purchase_month,
COUNT(user_id) AS num_user
FROM first_purchase
GROUP BY purchase_month
ORDER BY purchase_month
),
retention AS(
SELECT fp.purchase_month,
activity.month_number,
COUNT(activity.user_id) AS num_user
FROM activity
JOIN first_purchase AS fp
ON activity.user_id = fp.user_id
GROUP BY fp.purchase_month, activity.month_number
ORDER BY fp.purchase_month, activity.month_number
)
SELECT retention.purchase_month,
size.num_user AS cohort_size,
retention.month_number,
retention.num_user,
retention.num_user/size.num_user AS retention_percentage
FROM retention
JOIN cohort_size AS size
ON retention.purchase_month = size.purchase_month
ORDER BY purchase_month, month_number
Visualization
Link the query to Google Sheets and create a Pivot Table.
Slide Deck (with insights)
Thank you for taking the time to visit my portfolio website and view my projects!
I appreciate your interest and support. If you have any feedback or questions, please don’t hesitate to reach out to me. Once again, thank you for your time and consideration.
» Let’s connect on LinkedIn! «
» Visit my Tableau Public profile to see my other dashboards «