Posts

Structural Machine Learning in R: Predicting Probabilistic Offender Profiles using FBI's NIBRS Data

What is Structural Machine Learning? Most machine learning tasks are designed for classifying data, but what if you have multiple outcomes? Not only do you have multiple outcomes (Y's or output variables), but you also need to enforce specific relationships among those outcomes and your predictors (X). Traditional ML can do many things, but this is not one of them. Traditional ML classifiers have one outcome and they attempt to classify that univariate outcome. A genre of machine learning that handles multiple outcomes and allows data scientists to specify a structure among all variables of interest is called "structured prediction". This genre of ML has existed in the literature for many years, but isn't something I've come across much, so I figured I would present a simple form of structured prediction and motivate it with an illustrative example. The Problem I'll motivate this with the idea of criminal profiling. It's an interesting subject and somethin

In God We Trust ... Not Partisanship

The United States of America has always been about its people and relied upon a common spirit of gratitude to our forefathers and God for inspiration and guidance. In recent times, however, it seems people are more inclined to rely upon partisan thought and policy for how we should think and act. Before anyone can engage politically, they are asked to declare to which party they belong. 2020 has provided extraordinary challenges to all, which would be difficult in any time or circumstance. These issues are further exacerbated by the petty squabbling and divisive speech and actions of elected officials. Current events seem to push every American citizen to choose a side. But this raises another question. Why are there sides? Are we not all citizens of this country? Do we not all wish for a better tomorrow and a better world? Is it requisite that we all be the same, or agree on every issue, to live in unity and peace? Why is the nation more divided than ever, when we need to be banding tog

Two-Step fix for rJava library installation on Mac OS

I've been using R for some time and over the years I've had one consistent nagging problem: rJava, perhaps the single most temperamental library in the whole history of R. If you are like me, you likely try to avoid anything Java-based, like using openxlsx instead of xlsx. I don't use Java, but a number of libraries I do use have it as a dependency. For example, I like to use qdap because it has a lot of nice tools for qualitative analysis, and it of course uses Java. The big problem is that rJava never installs properly and gives some error along the lines of not being able to find JDK files, jni.h, or Java home when you try to call the library. I have a couple of quick steps here that can get rJava up and running quickly. I haven't noticed this issue in Windows, which suggests the library is probably written for Windows and the developer hasn't bothered to make it function out-of-the-box on Mac OS. Two quick steps and you can get rJava working in R on Mac OS. Downloa

Mine Cryptocurrency with your Raspberry Pi!

A year or so ago I got a Raspberry Pi for the purpose of building an emulation station, which works wonderfully. As time has gone on, I've wondered if there isn't more I could be doing with my Raspberry Pi, as it is capable of much more than nostalgic gaming. At the same time, I've become more of a crypto enthusiast as blockchain technology continues to change the way people can improve and simplify financial transactions, among other aspects of the computing industry. My intention isn't to provide a plug for blockchain tech, but rather to describe a way to combine two interests. Before I get into how to get your Raspberry Pi to mine cryptocurrency, let's first be honest about the profitability of such an endeavor. Many mining algorithms are very complex and take a lot of computational power, not to mention the cost of powering such a system. For this reason, people have to build mining farms to make such an endeavor profitable. Whereas, raspberry

Integrating Data Management and Data Analytics with R and postgreSQL.

Those wanting to be successful in data analytics increasingly have to become well versed in managing their data. That means it is no longer sufficient to just learn R or an analytic platform. You also need to be competent with SQL or some similar database platform. As I am a huge advocate of open source applications, I will be using postgreSQL, although that certainly isn't the only SQL platform that R can work with. I've got R to work with MySQL, Oracle SQL, postgreSQL, and Microsoft SQL Server. I already have a post on how to connect R to these platforms, though I don't get into Microsoft SQL Server because it is a painful (not worth it) process to do this if you are running OSX. (https://www.lazybayesian.com/2019/05/connecting-r-to-sql-database-postgresql.html) You would end up using the RODBC package or something similar to get it done, but you end up needing to use Homebrew to install other tools on your machine even to get that to work, and it just keeps going. If y

Which Game is the Scariest? Alien: Isolation, Dead Space, Dead Space 2, or Silent Hill 2? An R Halloween Analysis!

I wanted to get into the Halloween spirit by doing some kind of horror-themed analytics post. The idea of combining R, data analytics, and the macabre isn't as straightforward as some may think (yes, that was a joke). While I don't care for horror movies, for some reason, I enjoy survival horror video games. Not playing them, of course. I'm far too squeamish for that. I usually watch YouTube videos of other people playing them to spare myself a panic attack. I'm the kind of guy who would start playing the game and, once the atmosphere became intense, would just go, "NOPE", turn off the game and walk away. Of the survival horror games I've seen, the Dead Space franchise is up there. I also love the Alien franchise, though that franchise has suffered from a number of awful releases (including movies). Alien: Isolation is a gem, whose intense atmosphere makes every footstep nerve-racking. Lastly, I wanted to include another game that I haven't see

Online Statistics Tutor: Linear Regression - Understanding and Interpreting Linear Regression

Simple Linear Regression is a staple in every statistical toolbox. The idea is to estimate a linear relationship between a dependent variable (Y, or your outcome) and an independent variable (X, or your predictor variable). That is, we estimate the equation of a line through the data points that minimizes the sum of squared vertical distances from the data points to that line. From this we can better understand how X affects Y. This analysis can be used for predictive purposes, as well. In this post I plan on only addressing some basic principles about regression in order to best understand what it is and how to use it. I will focus on: scatterplots and linear relationships; the point-slope equation for a line and how it works; estimating slope coefficients; interpreting the slope; and a brief mention of other regression concepts (which I may address in later posts). Scatterplots and Linear Relationships If you are not already familiar with what a scatterplot is, it is merely a graphical method t