Forecasting Chicago Crime Rates with SARIMA

My last project, automating a data import pipeline for Chicago's crime data, set up the perfect environment for using past crime rates to predict future ones. I use SARIMA time-series forecasting to predict the city's weekly crime rates six months out, and create a heatmap of location by time to further identify crime trends in Chicago.

Automating an ETL Pipeline w/ Python and MySQL

I often find myself working with data that is updated on a regular basis. Rather than manually running through the ETL process every time I want to update my locally stored data, I thought it would be beneficial to work out a system that updates the data through an automated script. I use Python and MySQL to automate this ETL process using the City of Chicago's crime data.
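As a rough illustration of what an automated, re-runnable ETL step can look like, here is a minimal sketch. It is not the script from the post: it swaps MySQL for Python's built-in sqlite3 module so that it runs without a database server, and the sample rows, column names, and function names are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for a fresh download of the crime data.
SAMPLE_CSV = """id,date,primary_type
1,2019-06-01,THEFT
2,2019-06-01,BATTERY
"""


def extract(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Cast ids to int and normalize the crime type to title case."""
    return [
        (int(r["id"]), r["date"], r["primary_type"].strip().title())
        for r in rows
    ]


def load(conn, records):
    """Upsert records so that re-running the script does not duplicate rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS crimes ("
        "id INTEGER PRIMARY KEY, date TEXT, primary_type TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO crimes VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(conn, text):
    """Run extract, transform, and load, then return the table's row count."""
    load(conn, transform(extract(text)))
    return conn.execute("SELECT COUNT(*) FROM crimes").fetchone()[0]


# sqlite3 is an assumption made for portability; the post itself uses MySQL.
conn = sqlite3.connect(":memory:")
first = run_etl(conn, SAMPLE_CSV)
second = run_etl(conn, SAMPLE_CSV)  # same rows again: count should not grow
```

Keying the table on the record id and upserting is one way to make the job safe to schedule repeatedly, since a rerun over overlapping data leaves the row count unchanged.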

2019 TUN Data Challenge

A brief overview of the TUN Data Challenge, which I completed with a team as part of an SNHU experiential learning course. We worked with client, marketing, and interaction data from the non-profit Hire Heroes USA to answer business problems specified by the organization. Our process involved defining goals, cleaning data, exploratory data analysis, statistical analysis, data visualization, and communicating results. Final results from the contest are still pending and are expected in late July.

Identifying Advertisements with ANNs

I pull data from the UCI Machine Learning Repository and use it to train a model that can identify advertisements based on their image size and URL terminology. I work through cleaning the data, try a few different fitting algorithms, and end with some parameter tuning of an ANN. My final model achieves over 97% accuracy in classifying advertisements in testing. Originally completed for a machine learning course at SNHU that focused on the R language.

Parallel Coordinates Plot using Plotly

Over the holiday season, I heard several discussions about which charities are best to donate to and why some are better than others. With this in mind, I thought it would be interesting to examine the stats that set one charity apart from another and find a way to visualize them effectively. With some help from Charity Navigator, I was able to source and collect the appropriate information, and thought it a great time to finally give Plotly a go.

Multiple Linear Regression to Predict Consumer Spending

As in the last post, here's some more work in Excel with economic variables. This time I use forecast values of 30-year mortgage, unemployment, and personal income rates, figured in a similar manner as before (annual growth/change rates - 10-year moving averages), to predict future levels of personal consumption expenditures (PCE). I run a multiple linear regression to forecast PCE from the three independent variables and end up with some pretty strong results, including an adjusted R-squared of 0.974.
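For readers unfamiliar with adjusted R-squared, the short sketch below shows the standard formula, which penalizes plain R-squared for the number of predictors in the model. The function name and the input values are hypothetical and chosen only for illustration; they are not the figures behind the result reported in the post.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical inputs for illustration only: a plain R^2 of 0.98 from
# 40 observations and 3 independent variables.
example = adjusted_r_squared(r_squared=0.98, n_obs=40, n_predictors=3)
```

With only three predictors and a reasonable number of observations, the adjustment is small, so a high adjusted value still indicates a close fit to the sample.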