Forecasting Economic Time-Series Variables in Excel

I grabbed some economic data from FRED and attempt to forecast future values for each variable using Excel. Econ variables of focus for this project include GDP, CPI, and the unemployment rate for the U.S. This was originally done for my Macro-econometrics class and provided me some nice time with Excel which I admittedly use too little.

more ...

Character Co-occurrence Network Diagram w/ NetworkX in Python

Being a big fan of fantasy novels, I've always had an interest in how characters within books with massive character lists all interweave and connect together. I've also had interest for awhile now in visualizing some type of complex network with networkX in Python. Oathbringer is the most recent book from Brandon Sanderson's Stormlight Archive series and it was a perfect option for me to combine both of these interests. The following is the code I used to parse through the etext of the novel and create a character co-occurence network diagram. Although certain decisions made throughout the process may not be perfect for representing direct 'co-occurrences', I found the resulting visual to be an interesting look at relationships seen throughout the book.


Classifying Comment Toxicity w/ NLTK and NB

I've wanted to get more practice with natural language processing, so I grabbed a dataset of Wikipedia comments from a past Kaggle challenge to attempt to classify toxicity of each comment. Train data was a collection of over 150k comments connected to user-defined classifications of 'toxic', 'severe toxic', 'obscene', 'insult', 'threat', 'hate'. Here's the process I took to create a model which identifies these particular classifications of future comments!

First is to import libraries we'll be using


Nightingale Rose in R

Summary: I recently tracked my daily coffee consumption and thought it would be interesting to find a fun way to visual it. After some exploration of options in R, I decided to give a Nightingale Rose Diagram a try.


Titanic Survival Classification w/ XGBoost

Summary: It's been alittle over three years since I attempted my first ML project of classifying survival of the titanic.. A pretty famous dataset and task for this field I'd say. I achieved a accuracy of around 75% in my first attempt at this (I believe with a random forest model). I though it would be nice to revisit this and give it another go now that I'm a little more experienced. Additionally, I've wanted to explore ensemble learned some more, particularly XGBoost. Let's dive in!


Lyric Analysis

An assortment of word clouds I made from lyrics I scraped for a school project. Size of each word is relative to frequency of usage by the artist in (up to) 150 of their most recent songs. Mount Eerie references nature a lot while Aesop Rock just references A LOT of things. Harry and the Potters, well... .

more ...