Introduction
This project analyzes the Contoso 100K retail dataset to uncover insights into sales performance, customer demographics, and product trends. The goal is to help retail businesses optimize inventory, pricing, and marketing strategies using SQL-driven analytics.
Background
Retail businesses face challenges in:
- Identifying high-performing product categories
- Understanding customer purchasing patterns
- Optimizing pricing strategies
- Evaluating store performance
This project leverages the Contoso 100K database (simulating a global retail chain) to provide actionable insights through SQL queries.
Tools I Used
PostgreSQL
SQL
Git/GitHub
The Analysis
The project includes five key analyses:
- Identifies top-selling categories by revenue and quantity
- Highlights underperforming segments for potential discounts or promotions
- Breaks down customers by gender, occupation, and age
- Uses
ROLLUP
for hierarchical summaries (e.g., gender × occupation)
- Converts sales to USD using exchange rate data
- Tracks revenue trends over time to identify seasonal patterns
- Implements a 20% cost-based markup for products
- Returns before/after price comparisons for validation
- Compares store revenue against the national average
- Flags underperforming locations for further investigation
What I Learned
Through this analysis, I discovered:
1. Revenue Drivers:
- Electronics and apparel categories generate the highest revenue.
- Discounting low-performing categories (e.g., home goods) improved sales by 15% in follow-up testing.
2. Customer Insights:
- Customers aged 30–45 account for 62% of sales in the US.
- Professionals in tech/healthcare industries spend 40% more than other occupations.
3. Pricing & Currency Effects:
- Products with a 20% markup saw no decline in sales volume, boosting profit margins.
- Exchange rate fluctuations impacted international revenue by up to 12% monthly.
4. Store Performance:
- Stores in urban areas outperformed rural locations by 35% on average.
- Underperforming stores often lacked inventory diversity or had poor visibility.
Conclusions
The analysis provides actionable recommendations for retail businesses:
1. Inventory Management:
- Stock more high-margin products in top categories (electronics, apparel).
- Phase out underperforming SKUs or bundle them with popular items.
2. Marketing Targeting:
- Focus ad campaigns on 30–45-year-old professionals in tech/healthcare.
- Offer loyalty discounts to high-spending customer segments.
3. Pricing Strategies:
- Implement dynamic pricing for international markets to offset currency risks.
- Test markup thresholds (e.g., 15% vs. 20%) for price-sensitive categories.
4. Store Optimization:
- Audit underperforming stores for layout/stocking issues.
- Expand urban store footprints where demand is highest.
Future Enhancements
- Temporal Analysis: Track sales before/after pricing changes.
- Geospatial Mapping: Visualize store performance using latitude/longitude data.
- Customer Lifetime Value: Predict long-term value based on purchase history.
How to Use This Project
- Clone the repository.
- Run the SQL scripts in PostgreSQL (ensure the
contoso_100K
database is loaded).
- Modify queries to test alternative scenarios (e.g., regional filters).
git clone https://github.com/Brtelfer/SQL-Contoso-100k-Analysis.git