From: 11th May 2019
To: 6th June 2019
Data size: 3.3 GB
Users: 194k unique
On 23 June 2016, the United Kingdom held a referendum on whether to stay in the European Union or leave it, and 51.9% of voters supported leaving. As a result, the Government invoked Article 50 of the Treaty on European Union, starting a two-year process that was due to conclude with the UK’s exit on 29 March 2019. This process has since been referred to as “Brexit”, a shorthand for the UK leaving the EU formed by merging the words “Britain” and “exit”.
In this project, we study how the public’s views on Brexit developed before, during and after the European Elections of 26 May 2019, elections that the British Parliament wanted to avoid but had to take part in nevertheless. We studied the public’s sentiment through the views expressed on Twitter, a popular social network where people are free to express their views in short messages. To predict the sentiment, the emotion and the geolocation of the public’s views, we employed a set of machine learning algorithms.
Number of Tweets
The above statistics confirm the idea that the European elections on the 26th of May largely influenced the public’s interest in Brexit. There was a lot of interest in the week preceding the elections (21/05/2019 to 24/05/2019), when people were preoccupied with how the elections would go and what impact they would have on UK-EU politics. On the day before the elections (25/05/2019) most people took a break (!) from the subject, and on the day of the elections (26/05/2019), as well as the next, people were trying to understand the results. This interest slowly faded and activity returned to normal on the 30th of May.
As expected, hashtags related to Brexit and the European Elections were the most popular. In more detail, #brexit, #euelections2019, #remain, #peoplesvote and #theresamay were among the most popular. Hashtags about politicians with a strong voice in favour of Brexit, such as #farage, #nigelfarage and #borisjohnson, also had a lot of tweets. Among the UK political parties, the most popular was the Brexit Party, with the #brexitparty hashtag; it was also the winner of the EU elections in Britain. Donald Trump’s visit was also discussed a lot, with hashtags such as #trumpukvisit, #trumpvisituk, #trump and #trumpvisit. Finally, hashtags taking a clear stance for or against Brexit, such as #stopbrexit, #leave, #nodeal and #remainernow, had a strong presence.
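A hashtag ranking like the one described above can be produced with a simple counter. The following is a minimal sketch, not the project’s actual pipeline: the function name `count_hashtags`, the lowercasing, and the choice to count a hashtag at most once per tweet are all assumptions made for illustration, and the sample tweets are invented.

```python
import re
from collections import Counter

# Matches a '#' followed by word characters; the tag itself is captured.
HASHTAG_RE = re.compile(r"#(\w+)")


def count_hashtags(tweets):
    """Count hashtags across tweet texts, case-insensitively.

    Assumption: a hashtag repeated inside one tweet counts once for that tweet.
    """
    counts = Counter()
    for text in tweets:
        # Lowercase so that #Brexit and #BREXIT are counted together.
        counts.update({tag.lower() for tag in HASHTAG_RE.findall(text)})
    return counts


# Invented sample data, for illustration only.
sample = [
    "Polls open today #Brexit #EUelections2019",
    "Still undecided #brexit #remain",
    "#BREXIT means #leave",
]
print(count_hashtags(sample).most_common(3))
```

Counting once per tweet keeps a single tweet that repeats a hashtag from inflating its rank; counting every occurrence would be an equally reasonable choice.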
For the sentiment extraction task, our focus is to identify the overall sentiment of each tweet and classify it into one of two main categories: negative or positive. The dataset we used to train our models for this task is the Large Movie Review Dataset v1.0.
As we can see from the chart on the left, BERT is the best performing model with an accuracy of 0.8720, followed by the Linear SVM and Naïve Bayes models. Random Forest was the worst performing model by a significant margin. The voting classifier, which selects the prediction with the majority of votes, achieved a lower accuracy than our best model. Although BERT was the top performing model, its fine-tuning phase takes significantly more time than training the other models.
In the bar chart below, we can see the average sentiment score per day and the number of tweets from each side (positive/negative). There is a clear trend for positive sentiment to dominate every day except one: the 2nd of June 2019. That day saw a cascade of events, starting with a UK poll that put Nigel Farage’s Brexit Party in the lead with 26%, followed by Donald Trump’s support for Farage and his claim that the UK should be ready to leave the EU without any deal.
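To make the voting step concrete, here is a minimal sketch of majority (hard) voting over per-model predictions. The function name `majority_vote` and the tie-breaking rule (ties go to the label of the earliest-listed model) are assumptions for illustration; the text does not say how ties were handled.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by most models.

    Assumption: on a tie, the label of the earliest-listed model wins.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Predictions for one tweet, listed from strongest to weakest model (hypothetical).
print(majority_vote(["positive", "negative", "negative"]))
```

The example also shows how voting can score below the best single model: when two weaker models agree on a wrong label, they outvote the stronger one.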
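The daily figures behind such a chart can be computed by grouping classified tweets by date. This is a minimal sketch under stated assumptions: the function name `daily_sentiment`, the `(day, label)` input shape, and the scoring of positive as +1 and negative as -1 are illustrative choices, not necessarily how the scores in the chart were computed.

```python
from collections import defaultdict


def daily_sentiment(records):
    """Summarise (day, label) pairs, where label is 'positive' or 'negative'.

    Returns {day: {"positive": n, "negative": m, "mean_score": s}} where
    s averages +1 for positive and -1 for negative tweets (an assumption).
    """
    counts = defaultdict(lambda: {"positive": 0, "negative": 0})
    for day, label in records:
        counts[day][label] += 1

    summary = {}
    for day in sorted(counts):
        pos = counts[day]["positive"]
        neg = counts[day]["negative"]
        summary[day] = {
            "positive": pos,
            "negative": neg,
            "mean_score": (pos - neg) / (pos + neg),
        }
    return summary


# Invented sample data, for illustration only.
records = [
    ("2019-06-01", "positive"),
    ("2019-06-01", "positive"),
    ("2019-06-01", "negative"),
    ("2019-06-02", "negative"),
]
for day, stats in daily_sentiment(records).items():
    print(day, stats)
```

A day is dominated by negative sentiment exactly when its `mean_score` falls below zero.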
We trained four machine learning algorithms to identify six primary emotions in tweets: anger, disgust, fear, joy, sadness and surprise. The dataset we used for training was the Hashtag Emotion Corpus (also known as the Twitter Emotion Corpus, or TEC). It contains 21,000 tweets from about 19,000 different people, each labeled with one of the six emotions. This dataset was created by selecting tweets that had been tagged with emotion hashtags and using that hashtag as the label.
As with the sentiment task, BERT was the best performing model on the emotion dataset. Multinomial Naïve Bayes and Linear SVM were close in performance, and Random Forest was again the worst performing model, as in the sentiment task. The disadvantage of the BERT model is that its fine-tuning phase took considerably more time than the training of the other models.
Looking at the data, one clear conclusion is that fear was the main emotion on the topic of #Brexit during the period before and after the EU elections. We can see three peaks on the fear curve: on the 22nd and 27th of May and on the 2nd of June. The 22nd of May was the day that Theresa May, the Prime Minister of the United Kingdom, announced the “New Brexit Deal”, and the 27th was the day after the EU elections. On the 2nd of June, as we mentioned in the sentiment analysis section, a poll was released that put the Brexit Party in top spot for the next general election, and Donald Trump announced his support for Brexit.
It comes as no surprise that Brexit is not a local event that concerns the people of Europe alone; the rest of the world is also aware of the situation and the ongoing events. Apart from Europe, the USA was very active in commenting on the events, especially on the dates when its president visited the UK and offered support to pro-Brexit politicians. In addition, India, Saudi Arabia and Japan were also very active in commenting on the news.
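The hashtag-as-label idea can be sketched as follows. This is only an illustration of the general technique: the exact selection rules used to build the corpus are not given here, so the function name `label_from_hashtag`, the requirement that exactly one emotion hashtag is present, and the removal of that hashtag from the text are all assumptions.

```python
import re

EMOTIONS = ("anger", "disgust", "fear", "joy", "sadness", "surprise")


def label_from_hashtag(text):
    """Return (clean_text, emotion) if exactly one emotion hashtag is present.

    Returns None when the tweet has no emotion hashtag or several different
    ones (an assumed rule, to avoid ambiguous labels).
    """
    tags = [t.lower() for t in re.findall(r"#(\w+)", text)]
    found = {t for t in tags if t in EMOTIONS}
    if len(found) != 1:
        return None
    emotion = found.pop()
    # Remove the label hashtag so a model cannot simply read the answer.
    clean = re.sub(rf"#{emotion}\b", "", text, flags=re.IGNORECASE)
    clean = " ".join(clean.split())
    return clean, emotion


# Invented sample tweets, for illustration only.
print(label_from_hashtag("Can't believe this result #fear"))
print(label_from_hashtag("Just another day #brexit"))
```

Stripping the label hashtag from the training text matters: if it were left in, a classifier could reach high accuracy by matching the hashtag alone.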
EU election results as of right now: Nigel Farage leading in England Marine Le Pen leading in France Salvini leading in Italy A global mass awakening is happening and there is nothing that the global elites or their media henchmen can do to stop it. #BREXIT #MAGA
👋 Huge poll Please only vote if you don’t mind retweeting Not necessarily who you would vote for, but Who would win a general election tomorrow? #politics #GeneralElectionNow #brexit #remain #leave
Watch the former head of the WTO, sat next to Andrea Leadsom back in 2016, perfectly predict our current situation, 3 years down the line. Amazing. #brexit #BrexitShambles #BorisJohnson #EUelections2019 #RemainSurge #RemainerNow #RevokeArticle50 @mrjamesob #TrumpVisitUK https://t.co/S2xltHYUrB
After acquiring the Twitter handles of most of the MPs, we used the Twitter API to download all of their tweets and then kept only those that contain the word “Brexit”.
On the map on the right, we can see the number of tweets and the tweet sentiment presented on the UK map, based on each MP’s constituency. While most tweets clearly come from the MPs of the Greater London area, there are also many tweets from the MPs of other important cities in Scotland and Wales, namely Edinburgh and Cardiff. Surprisingly, there are not many tweets from the MPs of Northern Ireland. There are also not many tweets from the MPs of rural UK.
On the graph below, we modelled the sentiment of the MPs based on the party they belong to. Focusing on the two main UK parties (Conservative and Labour), we see a great difference in sentiment between them. There are instances (especially during the EU elections on 26/05/2019 and the visit of Donald Trump on 03/06/2019) where their sentiment differs sharply. Other parties show a smaller difference in sentiment.
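The filtering step can be sketched as below. The download itself is omitted, since it depends on the Twitter API client and credentials; the sketch assumes the tweets are already available as dictionaries. The field names `handle` and `text`, the function name `filter_brexit_tweets`, and the case-insensitive whole-word match are assumptions for illustration.

```python
import re

# Case-insensitive whole-word match; "#Brexit" matches because '#' is not a
# word character. Assumption: compounds such as "#BrexitParty" are excluded.
BREXIT_RE = re.compile(r"\bbrexit\b", re.IGNORECASE)


def filter_brexit_tweets(tweets):
    """Keep only the tweets whose text contains the word 'Brexit'."""
    return [t for t in tweets if BREXIT_RE.search(t["text"])]


# Invented sample data with hypothetical handles, for illustration only.
mp_tweets = [
    {"handle": "@example_mp_1", "text": "My statement on #Brexit and the EU elections"},
    {"handle": "@example_mp_2", "text": "Great to visit a local school today"},
    {"handle": "@example_mp_1", "text": "BREXIT talks continue this week"},
]
for tweet in filter_brexit_tweets(mp_tweets):
    print(tweet["handle"], tweet["text"])
```

A plain substring test would also keep compound hashtags such as #brexitparty; which behaviour is preferable depends on how broadly “contains the word” is meant.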
Business Solutions Consultant at Gas Distribution Company
Director of Product Development at Innovation Accelerator Foundation
Data Scientist at Hattrick Ltd