Data Science · NLP

Twitter Election Analysis

Research Overview

In my latest research project, I investigated whether social media analytics can predict election outcomes. By analyzing millions of tweets about the 2020 US presidential candidates, I found notable correlations between Twitter engagement and actual voting patterns.

Using natural language processing and machine learning techniques, I built models that predict state-level electoral results with high accuracy. This work shows how data and sentiment analysis can broaden our understanding of political behavior in the digital age.

Twitter's Political Mood

My analysis revealed clear differences in how Twitter users perceived the candidates. A larger share of tweets about Biden were positive (39.1%) than of tweets about Trump (34.4%), while Trump drew a larger share of negative tweets (36.5% vs. Biden's 25.9%). These sentiment patterns were consistent with the final election result.

The bar plot below shows the distribution of positive, neutral, and negative tweets for each candidate during the election period. The larger share of negative sentiment toward Trump points in the same direction as the final electoral outcome.

Sentiment Analysis Comparison
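
The per-candidate sentiment shares described above could be computed along these lines. This is only a minimal sketch: the `labeled_tweets` data and the `sentiment_shares` helper are hypothetical, and it assumes each tweet has already been assigned one of three labels by some sentiment classifier, which the write-up does not name.

```python
from collections import Counter

# Hypothetical labeled tweets: (candidate, sentiment label) pairs.
# Real input would come from a sentiment classifier run on the collected tweets.
labeled_tweets = [
    ("Biden", "positive"), ("Biden", "positive"),
    ("Biden", "neutral"), ("Biden", "negative"),
    ("Trump", "positive"), ("Trump", "neutral"),
    ("Trump", "negative"), ("Trump", "negative"),
]

LABELS = ("positive", "neutral", "negative")


def sentiment_shares(tweets):
    """Return {candidate: {label: percentage of that candidate's tweets}}."""
    counts = {}
    for candidate, label in tweets:
        counts.setdefault(candidate, Counter())[label] += 1

    shares = {}
    for candidate, counter in counts.items():
        total = sum(counter.values())
        shares[candidate] = {
            label: round(100 * counter[label] / total, 1) for label in LABELS
        }
    return shares


shares = sentiment_shares(labeled_tweets)
for candidate, dist in shares.items():
    print(candidate, dist)
```

Normalizing within each candidate, rather than across all tweets, keeps the comparison fair when one candidate is mentioned far more often than the other.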

Geographic Patterns: Popular Vote

A geographic heat map of the US is one of the clearest ways to see how the popular vote played out. The map below shows, state by state, which candidate won the most votes. Red represents the Republican nominee, Donald Trump, and blue the Democratic nominee, Joe Biden. The results vary strongly by region: states like New York and California lean heavily toward Biden, while Texas and Florida lean toward Trump.

Popular Vote

Sentiment & Engagement Realities

My research mapped sentiment and engagement across the United States, revealing fascinating regional variations. Coastal states like New York and California generally showed more positive sentiment toward Biden, while sentiment in central and southern regions was more mixed.

Analysis of Twitter engagement metrics (likes, retweets, and replies) showed that Biden generated higher engagement than Trump in almost all states. The top states by engagement volume show a consistent pattern: Biden outperformed Trump in 9 of the 10 states analyzed.

Geographic Heat Map
Engagement Metrics by State
Twitter Engagement by State
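
A state-level engagement comparison like the one above might be sketched as follows. The records are made up for illustration, and summing likes, retweets, and replies into one engagement figure is an assumption; the write-up lists the three metrics but does not say how they were combined.

```python
from collections import defaultdict

# Hypothetical per-tweet records: (state, candidate, likes, retweets, replies).
tweets = [
    ("CA", "Biden", 120, 40, 10),
    ("CA", "Trump", 80, 30, 20),
    ("TX", "Biden", 60, 10, 5),
    ("TX", "Trump", 90, 25, 15),
    ("NY", "Biden", 200, 50, 30),
    ("NY", "Trump", 70, 20, 10),
]


def engagement_by_state(records):
    """Sum likes + retweets + replies per state and candidate (assumed metric)."""
    totals = defaultdict(lambda: defaultdict(int))
    for state, candidate, likes, retweets, replies in records:
        totals[state][candidate] += likes + retweets + replies
    return totals


totals = engagement_by_state(tweets)

# States where Biden-related tweets drew more total engagement than Trump-related ones.
biden_leads = sorted(
    state for state, by_cand in totals.items()
    if by_cand["Biden"] > by_cand["Trump"]
)
print(f"Biden leads in {len(biden_leads)} of {len(totals)} states: {biden_leads}")
```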

Predictive Correlation & Machine Learning

The correlation plots show a negative relationship between engagement with Trump-related tweets and relative vote share: states with lower engagement on Trump-related tweets tend to have a higher relative vote share. This underlines that not all engagement is positive.

Relationship Between Twitter Engagement and Voting Patterns
Relationship Between Twitter Sentiment and Voting Patterns
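
To make the idea of a negative relationship concrete, here is a small sketch that computes a Pearson correlation coefficient by hand. The choice of Pearson's r is an assumption (the write-up does not say which correlation measure was used), and the per-state values below are invented purely for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-state values: engagement with Trump-related tweets
# (arbitrary units) and relative vote share. Not taken from the study.
trump_engagement = [1.0, 2.0, 3.0, 4.0, 5.0]
relative_vote_share = [0.60, 0.55, 0.52, 0.47, 0.41]

r = pearson_r(trump_engagement, relative_vote_share)
print(f"r = {r:.3f}")  # a value below zero indicates a negative relationship
```

A coefficient near -1 means vote share falls almost linearly as engagement rises; a value near 0 means no linear relationship.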

Finally, to combine the engagement and sentiment metrics, I trained several machine learning algorithms on the data. The ensemble methods (Random Forest, Gradient Boosting, and a Tuned Model) achieved perfect accuracy, correctly predicting the winner in every state in the test set. Even the simpler Logistic Regression model reached 93.33% accuracy.

Machine Learning Model Performance
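
For readers unfamiliar with the metric, accuracy here is simply the fraction of test-set states whose winner was predicted correctly. The sketch below uses a hypothetical 15-state test set; the write-up does not state the test set size, so treat that number as an assumption chosen only because one miss out of 15 happens to give 93.33%.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical held-out test set of 15 states (size is an assumption).
actual = ["Biden"] * 8 + ["Trump"] * 7

# A model that gets every state right.
perfect = list(actual)

# A model that gets exactly one state wrong.
one_miss = list(actual)
one_miss[0] = "Trump"

print(f"{accuracy(actual, perfect):.2%}")
print(f"{accuracy(actual, one_miss):.2%}")
```

With a test set this small, each state is worth several percentage points, so differences between models should be read with that granularity in mind.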

Impact & Conclusions

This research demonstrates that systematic analysis of Twitter sentiment and engagement can yield substantial predictive power for electoral outcomes. The implications extend beyond electoral predictions, suggesting new methods for understanding public opinion, tracking political sentiment in real-time, and potentially enhancing democratic processes through data science.

For more details, access the full research paper here ↗.