Split
Data Cleaning
String Manipulation
- Applied rstrip(), lstrip(), contains(), and split() to clean string fields in marathon data
- Developed standardized cleaning pipelines for inconsistent data entries
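
A minimal sketch of what such a cleaning step might look like, assuming the marathon data sits in a pandas DataFrame. The column names and sample values are hypothetical and are not taken from the original project.

```python
import pandas as pd

# Hypothetical marathon records; column names and values are made up.
df = pd.DataFrame({
    "runner": ["  Ann Lee", "Bob Ray  ", " Cy Dee "],
    "finish": ["3:05:10 ", " 4:12:45", "DNF"],
})

# Trim stray whitespace from the left and right of each field.
df["runner"] = df["runner"].str.lstrip().str.rstrip()
df["finish"] = df["finish"].str.lstrip().str.rstrip()

# Flag rows whose finish field looks like a clock time.
df["finished"] = df["finish"].str.contains(":", regex=False)

# Split "H:MM:SS" into integer parts for the rows that finished.
parts = df.loc[df["finished"], "finish"].str.split(":", expand=True).astype(int)

# Index alignment leaves NaN for rows without a clock time.
df["seconds"] = parts[0] * 3600 + parts[1] * 60 + parts[2]
```

Keeping the "finished" flag separate from the numeric conversion means entries such as "DNF" stay in the table instead of raising an error during the split.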
Python
JSON
Data Parsing
- Utilized json, flatten_json, and json2txttree libraries for complex data extraction
- Built reusable access paths for navigating nested JSON structures
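
To illustrate the idea of flattening nested JSON, here is a small standard-library sketch. It is a hand-rolled stand-in written for this example, not the flatten_json or json2txttree API, and the sample record is hypothetical.

```python
import json


def flatten(obj, prefix="", sep="_"):
    """Collapse nested dicts and lists into a single-level dict."""
    if isinstance(obj, dict):
        items = obj.items()
    elif isinstance(obj, list):
        items = enumerate(obj)
    else:
        # Leaf value: store it under the path built so far.
        return {prefix: obj}
    flat = {}
    for key, value in items:
        path = f"{prefix}{sep}{key}" if prefix else str(key)
        flat.update(flatten(value, path, sep))
    return flat


# Hypothetical nested record.
raw = '{"id": 7, "owner": {"name": "Ann", "tags": ["a", "b"]}}'
record = flatten(json.loads(raw))
```

Each key in the result spells out the path to its value, so a nested field can be read with a single lookup such as record["owner_name"].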
Python
Pandas
RegEx
- Applied conditional string replacements across large datasets
- Wrote regex patterns to match and transform text fields
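
A conditional, regex-based alteration in pandas could be sketched as follows. The DataFrame, its columns, and the rules applied are assumptions chosen for illustration only.

```python
import pandas as pd

# Hypothetical contact data; columns and values are made up.
df = pd.DataFrame({
    "kind": ["phone", "phone", "email"],
    "value": ["(555) 010-2000", "555.010.3000", "A.B@Example.com"],
})

is_phone = df["kind"] == "phone"

# Phone rows only: drop every non-digit character.
df.loc[is_phone, "value"] = (
    df.loc[is_phone, "value"].str.replace(r"\D", "", regex=True)
)

# All other rows: normalise to lowercase.
df.loc[~is_phone, "value"] = df.loc[~is_phone, "value"].str.lower()
```

Restricting each change with a boolean mask keeps the regex from touching rows it was not written for.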
Python
Seaborn
Matplotlib
- Created interactive visualizations for complex datasets
- Developed dashboard-ready figures for research presentations
Python
Pandas
Market Analysis
- Conducted market basket analysis using advanced grouping techniques
- Identified product associations through transactional data clustering
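
One simple way to surface product associations is to group line items into baskets and count co-occurring pairs. The sketch below assumes a hypothetical transactions table with order_id and product columns. It uses plain pair counting rather than any particular clustering or association-rule library.

```python
from collections import Counter
from itertools import combinations

import pandas as pd

# Hypothetical transaction line items.
tx = pd.DataFrame({
    "order_id": [1, 1, 1, 2, 2, 3, 3],
    "product": ["bread", "butter", "jam", "bread", "butter", "bread", "jam"],
})

# Group line items into one sorted, de-duplicated basket per order.
baskets = tx.groupby("order_id")["product"].apply(lambda s: sorted(set(s)))

# Count how often each pair of products shares a basket.
pair_counts = Counter()
for items in baskets:
    pair_counts.update(combinations(items, 2))

# Support: the fraction of orders that contain the pair.
n_orders = len(baskets)
support = {pair: count / n_orders for pair, count in pair_counts.items()}
```

Sorting each basket before pairing ensures that ("bread", "butter") and ("butter", "bread") are counted as the same pair.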
SQL
Data Analysis
Business Intelligence
- Analyzed a 100,000-row retail dataset to uncover sales trends and patterns
- Developed SQL queries to extract meaningful business insights
- Created visualizations to communicate findings to stakeholders
SQL
Data Analysis
Market Research
- Examined job market data to identify trends in SQL-related positions
- Developed queries to analyze salary ranges, skill requirements, and job locations
- Provided insights for career development in data-related fields
SQL
pgAdmin 4
Market Research
- Created a mock database to simulate real-world raw data
- Automated the cleaning process and split the cleaned data into two tables
- Used psycopg2 and SQLAlchemy to load pandas DataFrames into a PostgreSQL database managed through pgAdmin 4
- Wrote the tables with to_sql(), then queried the resulting database with SQL
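
The clean, split, load, and query flow can be sketched end to end. To keep the example self-contained, it swaps in an in-memory SQLite database for PostgreSQL; with PostgreSQL, the connection would instead come from a SQLAlchemy engine backed by psycopg2. The table names, column names, and sample rows are hypothetical.

```python
import sqlite3

import pandas as pd

# Hypothetical raw data with messy titles and a missing salary.
raw = pd.DataFrame({
    "job_id": [1, 2, 3],
    "title": [" Data Analyst", "SQL Developer ", "Data Analyst"],
    "salary": ["70000", "85000", None],
})

# Clean, then split the result into two tables.
clean = raw.assign(title=raw["title"].str.strip())
jobs = clean[["job_id", "title"]]
salaries = (
    clean.dropna(subset=["salary"])[["job_id", "salary"]]
    .astype({"salary": int})
)

# SQLite stand-in for the PostgreSQL connection (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
jobs.to_sql("jobs", conn, index=False, if_exists="replace")
salaries.to_sql("salaries", conn, index=False, if_exists="replace")

# to_sql() writes tables; reading them back goes through read_sql_query().
result = pd.read_sql_query(
    "SELECT j.title, AVG(s.salary) AS avg_salary "
    "FROM jobs j JOIN salaries s ON s.job_id = j.job_id "
    "GROUP BY j.title ORDER BY j.title",
    conn,
)
conn.close()
```

Splitting before loading keeps rows with missing salaries out of the salaries table while still preserving every job in the jobs table.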