Prediction Using Naive Bayes

Let’s go back in time to 1980. Justice Brennan had been on the Supreme Court since 1956 and remained on the bench until 1990. Let’s try to predict his votes in the 1980s based on his voting history.

In the dataset, note that the voting code works as follows:

[1] means that he voted with the majority.

[2] means he dissented.

We will use three different machine learning algorithms on the same data to predict his votes and see which algorithm performs best. We start again with Naive Bayes.


# Load packages.


library(e1071)

We begin by dividing our dataset into two parts: one pre-1980 and one post-1980. We use the pre-1980 part to train our model and the post-1980 part to test its predictions.
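A split by fixed row numbers only works if the rows are sorted chronologically. As a minimal sketch on made-up data, the same kind of split can be written against a year column. The data frame `toy` and its `year` column are hypothetical; the real `voting` data may record the date differently.

```r
# Toy data standing in for the voting records (hypothetical columns).
toy <- data.frame(
  year = c(1975, 1978, 1979, 1980, 1983, 1988),
  vote = c("majority", "dissent", "majority", "majority", "dissent", "majority")
)

# Split on the year instead of on fixed row numbers.
toy_pre1980  <- toy[toy$year < 1980, ]
toy_post1980 <- toy[toy$year >= 1980, ]

nrow(toy_pre1980)   # 3 rows for training
nrow(toy_post1980)  # 3 rows for testing
```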

# Create the training and test sets.
# Column 9 of each subset holds the outcome (the vote); the other columns are predictors.

voting_pre1980 <- voting[1:3368, 2:10]

voting_post1980 <- voting[3369:4746, 2:10]

We first train our model on the training data.

model <- naiveBayes(voting_pre1980[,-9], as.factor(voting_pre1980[,9]))
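To illustrate what a Naive Bayes classifier estimates, here is a minimal base-R sketch on a made-up data set with two categorical predictors. The data frame `toy`, the columns `issue` and `lower_court`, and all values are invented for illustration. The sketch skips smoothing and is not the e1071 implementation; it only shows the idea of multiplying a class prior by per-feature conditional probabilities.

```r
# Made-up training data: two categorical predictors and the vote.
toy <- data.frame(
  issue       = c("civil", "civil", "civil", "crime", "crime", "crime", "civil", "crime"),
  lower_court = c("liberal", "liberal", "conservative", "conservative",
                  "liberal", "conservative", "conservative", "liberal"),
  vote        = c("majority", "majority", "dissent", "dissent",
                  "majority", "dissent", "majority", "majority")
)

# Prior: share of each vote in the training data.
prior <- prop.table(table(toy$vote))

# Conditional probabilities P(feature value | vote), one table per predictor.
cond_issue <- prop.table(table(toy$vote, toy$issue), margin = 1)
cond_court <- prop.table(table(toy$vote, toy$lower_court), margin = 1)

# Score a new case (issue = "crime", lower_court = "conservative").
# The "naive" step treats the predictors as independent given the vote.
score <- prior * cond_issue[, "crime"] * cond_court[, "conservative"]
posterior <- score / sum(score)

posterior
names(which.max(posterior))  # predicted class for the new case: "dissent"
```

With these toy counts, the unnormalized scores are 0.25 for a dissent and 0.05 for a majority vote, so the new case is classified as a dissent.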

Next, we predict out of sample based on our test data.


prediction <- predict(model, newdata = voting_post1980[,-9])
 
prediction_Bayes <- as.data.frame(prediction)

Finally, we compare the actual votes to our predictions.

 
 
prediction_Bayes <- cbind(voting_post1980$vote,prediction_Bayes)
 
colnames(prediction_Bayes) <- c("vote","prediction")
head(prediction_Bayes)
 ##
       vote prediction
1  majority   majority
2  majority   majority
3  majority    dissent
4  majority   majority
5  majority   majority
6  majority   majority
 

So, how well did our prediction perform? To assess its quality, we count the correct predictions and divide by the total number of votes.

# We again calculate the share of correct assignments.

hits <- 0
for (row in seq_len(nrow(prediction_Bayes))) {
  if (prediction_Bayes$vote[row] == prediction_Bayes$prediction[row]) {
    hits <- hits + 1
  }
}

correctness_Bayes <- hits/length(prediction_Bayes$vote)

correctness_Bayes
 ## [1] 0.6748911
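The same share can be computed without a loop, because the mean of a logical vector is the proportion of `TRUE` values. A minimal sketch on made-up vectors (`actual` and `predicted` are illustrative names, not objects from the analysis):

```r
# Made-up actual votes and predictions.
actual    <- c("majority", "majority", "dissent", "majority", "dissent")
predicted <- c("majority", "dissent",  "dissent", "majority", "majority")

# Share of positions where the prediction matches the actual vote.
accuracy <- mean(actual == predicted)
accuracy  # 0.6
```

Applied to the data frame above, this would presumably read `mean(prediction_Bayes$vote == prediction_Bayes$prediction)`. If both columns are factors with different level sets, wrapping each in `as.character()` avoids a comparison error.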


In about 67% of cases our prediction proved correct. This is far from perfect, but better than a 50:50 guess.

 

To further evaluate the performance of the algorithm, we can take a look at the confusion matrix. If the algorithm had predicted all values correctly, every actual decision (rows) would match the predicted decision (columns), and the lower-left and upper-right cells would be 0.

# Compare the results in a confusion matrix

table(prediction_Bayes$vote,prediction_Bayes$prediction)
 ##
           dissent majority
  dissent      184      275
  majority     173      746
 

 

We see that the algorithm got it wrong both ways: 275 dissents were mistakenly predicted as majority votes, and 173 majority votes were mistakenly predicted as dissents.
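The four counts in the confusion matrix also allow a comparison against a simple baseline: always predicting the more frequent vote. The sketch below re-enters the counts from the table above by hand; the names `cm`, `baseline`, and `recall` are introduced here for illustration only.

```r
# Counts copied from the confusion matrix (rows = actual, columns = predicted).
cm <- matrix(c(184, 275,
               173, 746),
             nrow = 2, byrow = TRUE,
             dimnames = list(actual    = c("dissent", "majority"),
                             predicted = c("dissent", "majority")))

accuracy <- sum(diag(cm)) / sum(cm)      # share of correct predictions
baseline <- max(rowSums(cm)) / sum(cm)   # always predicting the more frequent actual vote
recall   <- diag(cm) / rowSums(cm)       # share of each actual class that was predicted correctly

round(accuracy, 4)  # 0.6749
round(baseline, 4)  # 0.6669
round(recall, 4)    # dissent 0.4009, majority 0.8118
```

Under these counts, always predicting a majority vote would already be right about 66.7% of the time, so the model's 67.5% is only marginally better than that baseline, and it recovers only about 40% of the actual dissents.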

Last update: February 16, 2021.
