Data Science

All projects listed on this page can be found on my GitHub.

Fandango
Movie Ratings Analysis

While the median is the same for both distributions, the mean and mode are higher in 2015 (by approximately 5% and 11%, respectively) relative to 2016.

In October 2015, a data journalist named Walt Hickey analyzed movie ratings data and found strong evidence to suggest that Fandango's rating system was biased and dishonest. He published his findings in this article.

In this project, I used sample statistics and kernel density plots to show that Fandango's ratings for popular movies decreased on average by approximately 5% in 2016, a year after Hickey's report.
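A minimal sketch of the kind of sample-statistics comparison described here. The function names and the ratings are made-up placeholders for illustration, not the Fandango data or the project's actual code, and the kernel density plots are not covered.

```python
from statistics import mean, median, mode

def summarize(ratings):
    """Return the mean, median, and mode of a list of star ratings."""
    return {"mean": mean(ratings), "median": median(ratings), "mode": mode(ratings)}

def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100

# Made-up placeholder ratings, NOT the real Fandango samples.
ratings_2015 = [4.5, 4.5, 4.0, 4.5, 3.5, 4.0, 5.0]
ratings_2016 = [4.0, 4.0, 4.0, 3.5, 4.5, 4.0, 3.5]

stats_2015 = summarize(ratings_2015)
stats_2016 = summarize(ratings_2016)

# Compare each summary statistic year over year.
for name in ("mean", "median", "mode"):
    change = percent_change(stats_2015[name], stats_2016[name])
    print(f"{name}: {stats_2015[name]:.2f} -> {stats_2016[name]:.2f} ({change:+.1f}%)")
```

Reporting the change relative to the earlier year keeps the percentages comparable across the three statistics.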

Market Analysis
for Targeted Ads

The US has the largest percentage of new users. India has almost twice as many new users as Canada. It is unclear which is the second-best country to advertise in.

In this project, I analyze customer survey data for an e-learning company to determine the top two markets for targeted advertising.

After performing outlier treatment to obtain a representative sample, I suggest three advertising strategies based on the findings from my analysis along with a dashboard for easy interpretation.
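One simple way to rank candidate markets is by each country's share of survey respondents. The sketch below assumes a plain list of country labels and a hypothetical helper named `market_shares`. The responses are placeholders, not the survey data, and the outlier treatment is not shown.

```python
from collections import Counter

def market_shares(countries):
    """Return (country, percentage of respondents) pairs, largest share first."""
    counts = Counter(countries)
    total = sum(counts.values())
    return [(country, n / total * 100) for country, n in counts.most_common()]

# Placeholder responses, NOT the real survey data.
responses = ["Country A", "Country B", "Country A", "Country C", "Country A", "Country B"]

for country, share in market_shares(responses):
    print(f"{country}: {share:.1f}%")
```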

6/49 Lottery
Mobile App Predictions

Assorted colors of lottery balls

The purpose of this project is to contribute to the development of a mobile app that helps users better estimate their chances of winning the 6/49 Lottery. Using historical data from the national 6/49 lottery game in Canada, I create the logical core of the app and calculate probabilities that let users answer questions like:

  • What is the probability of winning the big prize if we play 40 different tickets (or any other number)?
  • What is the probability of having at least five (or four, or three, or two) winning numbers on a single ticket?
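Both questions reduce to counting combinations: a 6/49 draw picks 6 numbers out of 49 without replacement, so there are C(49, 6) = 13,983,816 equally likely outcomes. The sketch below shows one way to compute them. The function names are illustrative and not taken from the app's code.

```python
from math import comb

POOL_SIZE = 49      # numbers to choose from
TICKET_SIZE = 6     # numbers on a ticket and in a draw
TOTAL_OUTCOMES = comb(POOL_SIZE, TICKET_SIZE)  # 13,983,816

def multi_ticket_probability(n_tickets):
    """Probability that one of n different tickets matches all six numbers."""
    return n_tickets / TOTAL_OUTCOMES

def probability_exactly(k):
    """Probability that a single ticket has exactly k winning numbers."""
    # Choose k of the 6 drawn numbers and the remaining 6 - k from the 43 others.
    ways = comb(TICKET_SIZE, k) * comb(POOL_SIZE - TICKET_SIZE, TICKET_SIZE - k)
    return ways / TOTAL_OUTCOMES

def probability_at_least(k):
    """Probability that a single ticket has at least k winning numbers."""
    return sum(probability_exactly(j) for j in range(k, TICKET_SIZE + 1))

print(f"40 tickets: {multi_ticket_probability(40):.8%}")
for k in (5, 4, 3, 2):
    print(f"at least {k} winning numbers: {probability_at_least(k):.6%}")
```

Because different tickets are mutually exclusive ways of winning the big prize, their probabilities simply add, which is why the multi-ticket case is a single division.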

CIA Factbook
SQL Analysis

Satellite photo of Earth taken from space.

In this project, I use sqlite3 to query data from the CIA World Factbook for exploratory data analysis.

The analysis looks for patterns and anomalies in the dataset.
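A rough sketch of how such an anomaly check might look with Python's sqlite3 module. The `facts` table, its columns, and every row below are assumptions made up for illustration, built in an in-memory database. They are not the Factbook's real schema or data.

```python
import sqlite3

# In-memory database with a hypothetical schema and placeholder rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (name TEXT, population INTEGER, area INTEGER)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("Country A", 1000, 50),
        ("Country B", 0, 200),               # suspicious: zero population
        ("Country C", 250000, 10),
        ("Aggregate row", 9000000, None),    # suspicious: a total, not a country
    ],
)

# Step 1: summary statistics reveal the extremes.
min_pop, max_pop = conn.execute(
    "SELECT MIN(population), MAX(population) FROM facts"
).fetchone()

def rows_with_population(connection, value):
    """Return the names of rows whose population equals the given value."""
    cursor = connection.execute(
        "SELECT name FROM facts WHERE population = ? ORDER BY name", (value,)
    )
    return [row[0] for row in cursor.fetchall()]

# Step 2: look up which rows produce those extremes.
extremes = rows_with_population(conn, min_pop) + rows_with_population(conn, max_pop)
print(min_pop, max_pop, extremes)
conn.close()
```

Querying the minimum and maximum first, then inspecting the rows behind them, is a cheap way to surface entries that do not belong in a per-country comparison.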