Predicting Enron Fraud

Tested and evaluated several machine learning techniques to find the best option for predicting fraud occurrences in the Enron email dataset. The most effective predictor turned out to be an AdaBoost algorithm with 50 n_estimators. This method, using a decision tree as the 'weak learner', came out with about 85% accuracy, a p-value of 39, and an r-squared of around 32. Originally conducted for a Udacity Nanodegree project.
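
For readers unfamiliar with boosting, here is a minimal sketch of the idea behind the classifier described above: 50 one-split decision trees ('stumps') are trained in sequence, each one concentrating on the examples the earlier ones got wrong. This is not the project's code. The n_estimators wording suggests scikit-learn's AdaBoostClassifier, but that is an assumption; the sketch uses only the Python standard library, runs on synthetic data rather than the Enron dataset, and every function name in it is hypothetical.

```python
import math
import random


def fit_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) for the best one-split rule."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for sign in (1, -1):
                # The rule predicts `sign` when the feature is at or below t.
                err = sum(
                    wi for wi, row, yi in zip(w, X, y)
                    if (sign if row[j] <= t else -sign) != yi
                )
                if best is None or err < best[0]:
                    best = (err, j, t, sign)
    return best


def stump_predict(stump, row):
    _, j, t, sign = stump
    return sign if row[j] <= t else -sign


def adaboost_fit(X, y, n_estimators=50):
    """Train a list of (alpha, stump) pairs; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform example weights
    ensemble = []
    for _ in range(n_estimators):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)     # this stump's vote weight
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified examples, lower the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def adaboost_predict(ensemble, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


def evaluate(ensemble, X, y):
    """Return (accuracy, precision, recall) with +1 as the positive class."""
    preds = [adaboost_predict(ensemble, row) for row in X]
    tp = sum(1 for p, t in zip(preds, y) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(preds, y) if p == 1 and t == -1)
    fn = sum(1 for p, t in zip(preds, y) if p == -1 and t == 1)
    accuracy = sum(1 for p, t in zip(preds, y) if p == t) / len(y)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Synthetic stand-in data: two random features, positive when their sum exceeds 1.
random.seed(0)
points = [(random.random(), random.random()) for _ in range(120)]
labels = [1 if a + b > 1.0 else -1 for a, b in points]
X_train, y_train = points[:80], labels[:80]
X_test, y_test = points[80:], labels[80:]

ensemble = adaboost_fit(X_train, y_train, n_estimators=50)
train_metrics = evaluate(ensemble, X_train, y_train)
test_metrics = evaluate(ensemble, X_test, y_test)
print("train (accuracy, precision, recall):", train_metrics)
print("test  (accuracy, precision, recall):", test_metrics)
```

Reporting precision and recall alongside accuracy matters on data like the Enron set, where the positive class is rare and a model can score high accuracy while flagging almost no one.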


Predicting a Country's Happiness

A look at the World Happiness Index and an evaluation of the factors that contribute to the general happiness of a country's population. The variables considered were primarily economic, political, and egalitarian. Originally conducted as a project for my SNHU Applied Stats II class. As the full project ran to around 30 pages, only a subset is included here.


Coffee Buyer Survey

Analysis of a quick three-day survey conducted on Google Forms regarding customer preferences and reasons for buying coffee beans. The sample is not necessarily random, as the survey was distributed through social media, yet it still offers an interesting view of people's preferences when considering where to purchase coffee beans. Project originally conducted for a research methodologies class at Southern New Hampshire University.


US Name Trends

Interactive D3.js visualization that accepts several names as input and plots their popularity over time. Data obtained from the kaggle.com 'US baby names' dataset. While this chart is interesting, it does not necessarily provide a deep look into name trends throughout the US; it was primarily done for practice with D3.js's interactive options. I also found I get an absurd amount of enjoyment from comparing name popularity after finishing this project.

US Birth Rates

Created for practice with D3.js after exploring the US baby names dataset obtained from Kaggle.com. I thought it would be interesting to view birth rates in the US, split by region and state. Also included is a hover function that displays the most popular names in each of these subsets per year.

Goodreads Exploration

Exploratory data analysis of the Goodreads.com 'Best Books Ever' ranking list. User opinions and votes on books were scraped with the help of the Kimono API, then cleaned and aggregated to facilitate analysis. Originally conducted for a Udacity Nanodegree project.