Flexdashboards

Everyone knows high-level languages are wonderful. The issues of yesteryear are long gone; we no longer have to build dashboards from scratch. The ability to query data and display it directly through R is here, and I don’t see it going anywhere anytime soon. Flexdashboard is an open-source R package that lets users display dashboards directly in the browser. It also has publishing support through RStudio. Assuming you have moderate traffic (like me), it’s perfect!

Mixed Effects & Air Fares

In the United States, various factors affect the airfare prices that consumers see. One issue is that a single linear model may not fit the data, as prices can vary drastically across routes. We will obtain Quarterly US Air Fare and Volume data and use a mixed effects model to predict airfare prices. The original dataset contains 108,602 observations and 14 attributes, with no missing values. Because the original dataset contains 4,177 routes, for the sake of simplicity we consider only a random sample of 50 routes in this paper.

Plotly for College Analysis

For many, community college is a stepping stone to the larger dream of a four-year university. However, many of the risk factors that prevent students from succeeding in school are highly prevalent in the student body of two-year institutions. In this post, we explore open college data. The dataset includes the name of each college and the aggregated number of people who passed a particular subject, broken down by transfer level and race.

Predicting Churn with KNN

Among the many examples of applied Data Science, churn prediction is one of the classic applications that works. Churn prediction gives businessmen and businesswomen the power to catch those consumers who are likely to leave the company. They can in turn take appropriate measures to keep their business. In this project, we consider a dataset of 7,043 observations. The goal here is to predict, with a fair amount of accuracy, which observations are likely to churn.

Python in R Markdown?

R Markdown supports Python integration through the reticulate package. The nice thing about R Markdown is that it is the first scripting language I ever learned. Therefore, I am almost too comfortable with it. It is my go-to for most of my projects because it’s so easy to get a proof of concept for new ideas. The downside, until now, was that I was unaware that we could use PYTHON IN R MARKDOWN!

SQL, Databases & R

SQL is a query language used to query and manipulate databases. On its own, it is a powerful tool in any Data Scientist’s toolbox. When you pair SQL with R (or Python), you have an indispensable foundation for conducting even the most basic analytics. This is not so much a lesson on SQL as a lesson on connecting SQL with R via various packages. Moreover, we will use the packages dbplyr and dplyr to manipulate the datasets in a later post.
