Data Analysis & Engineering


Recommendation System


Recommender System using Association Rule Mining

Generate frequent itemsets and association rules for a recommender system.


Movie Recommendation Systems with Collaborative Filtering

Apply cosine similarities in user-based and item-based collaborative filtering to build a movie recommendation system.



Marketing Analysis

Effectiveness Analysis of Marketing Mix Activities

Analyzed the ad effectiveness and carryover effect of marketing channels, and proposed a marketing resource allocation to maximize advertising efficiency.

Conjoint Analysis in TV Design

Use conjoint analysis to determine the reaction of the market to a new TV design.



Pricing Analytics

Product Discount Optimization Model

Developed a new discount strategy for the company’s four main products to improve the expected revenue.



A/B Testing

Udacity Free Trial Screener

Designed and analyzed an A/B test of Udacity's new Free Trial Screener feature.

Mobile Games A/B Testing with Cookie Cats

Analyzed the impact of the placement of an in-app purchase ad on player retention.





Text Mining


Text Mining America's Toughest Game Show

Spot trends in the types of questions asked in the game show

Text Mining AI Startup Names on AngelList

Find out the most popular words used by AI startups in their names using text mining



Financial Analysis

Modeling the Volatility of US Bond Yields

Build a model to study the nature of volatility in the case of US government bond yields

Risk and Returns: The Sharpe Ratio

Calculating the Sharpe Ratio for the stocks of Facebook and Amazon



Network Analysis

A Network Analysis of Game of Thrones

Analyzed the co-occurrence of characters in Game of Thrones.



Other Case Studies

Kidney Stone Treatment and Simpson's Paradox

Explore Simpson’s paradox using multiple regression and other statistical tools



Data Engineering

Build a Print Date Pipeline with Airflow

A Simple Print Date Pipeline

Song Track Popularity Analysis Database

Prepared an informative database to help find important song features, leveraging open-source web APIs and information scraped from the web.

GCP Dataflow with Python

Implemented a word counting workflow on GCP Dataflow with Python.

Data Clustering in San Francisco Neighborhoods

Get the most common venue categories in each neighborhood in San Francisco and visualize the neighborhoods and their emerging clusters



MOOC


Introduction to Airflow in Python

The course covers concepts of Airflow and the implementation of data engineering workflows in production


Building Data Engineering Pipelines in Python

Ingesting data with Singer, creating data pipelines with PySpark, testing data pipelines, and managing and orchestrating a workflow.


Market Basket Analysis in Python

Perform Market Basket Analysis using the Apriori algorithm, standard and custom metrics, association rules, aggregation and pruning, and visualization