Before the 2019 Australian Federal Election this weekend, we dive into the twittersphere to see what the candidates, parties and public have been saying and apply a Machine Learning algorithm to mimic our would-be leaders’ tweets!

Social media has become a driving force in the world of politics. For better or for worse, politicians now have a platform they can use to share their message, connect with their constituents and influence those they wish to represent.

With the 2019 Australian Federal Election on this Saturday, 18 May, a few members of the young Data Analytics Working Group (yDAWG) put their collective minds together to investigate what the candidates, parties and the public are saying about the election. This article is the first of the yDAWG’s 2019 election coverage, with a post-election analysis planned to look at the results and the reactions.

Rather than trying to predict Saturday’s result, this pre-election piece showcases the kinds of analysis that can be undertaken on the Twitter coverage of the election in the weeks leading up to the vote.

Twitter and Political Strategy

Campaign strategy differs between the two major parties on the brink of the Federal Election. We could spend hours searching the internet and deconstructing the press to understand what the parties are focusing on; alternatively, we could apply text analysis techniques to tweets posted on Twitter. By looking at the distribution of words or phrases used, is there a way for the time-poor to quickly see what audiences each party is targeting, or what policy areas they are emphasising, ahead of the Election?

To investigate this, tweets posted over the last 14 months were sourced from both the Labor (@AustralianLabor) and Liberal (@LiberalAus) party accounts. After some data cleansing and the removal of retweets (which mostly repeated existing tweets), the two major parties accounted for just under 5000 tweets.
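The cleansing step can be sketched roughly as follows. This is a minimal illustration and not the working group's actual pipeline: the tweet dictionary layout, the `is_retweet` flag, the "RT @" prefix check and the sample tweets are all assumptions made for the example.

```python
import re

def is_retweet(tweet):
    """Treat a tweet as a retweet if it is flagged as one or starts with 'RT @'."""
    return bool(tweet.get("is_retweet")) or tweet["text"].startswith("RT @")

def clean_text(text):
    """Strip URLs and collapse runs of whitespace; hashtags and mentions are kept."""
    text = re.sub(r"https?://\S+", "", text)
    return re.sub(r"\s+", " ", text).strip()

def prepare(tweets):
    """Drop retweets and tweets left empty after cleaning; return cleaned copies."""
    cleaned = []
    for tweet in tweets:
        if is_retweet(tweet):
            continue
        text = clean_text(tweet["text"])
        if text:
            cleaned.append({**tweet, "text": text})
    return cleaned

# Invented sample records, for illustration only.
sample = [
    {"text": "RT @AustralianLabor: example retweeted text", "is_retweet": True},
    {"text": "Our plan for the economy   https://example.com/plan", "is_retweet": False},
    {"text": "https://example.com/only-a-link", "is_retweet": False},
]
prepared = prepare(sample)
print(len(prepared), "tweet(s) kept")
```

Keeping hashtags and mentions is a deliberate choice here, since terms such as #auspol are themselves part of what the later word-frequency comparison looks at.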

6434 – number of tweets posted altogether by the official Labor and Liberal Twitter accounts in the last 14 months

Both parties are emphasising that there are significant policy differences between them this election. Tax seems to be a key policy differentiator between the two main parties, with clear policies that affect low and middle-income earners as well as housing investors and retirees. This is followed by differing appetites for action on climate change. Other differentiators include union priorities, education, health and immigration. But is this evident in the parties’ Twitter campaigns?


Using a Python package called Scattertext, tweets can be visualised in a way that is both informative and interactive. In the context of the upcoming Federal Election, the text visualisation lets you quickly see which words are used frequently or infrequently by each party, which hopefully points readers to the populations each party is targeting and the policies it is campaigning on.

What does this graphic actually show us? Words or phrases which appear close to the upper-left and lower-right corners differentiate the parties in terms of policy divisions. In the upper-left corner, words like “climate” and “tafe” (shown as a dot; see if you can find it) are frequently used by Labor but infrequently or never used by the Liberals. Likewise, terms frequently used by the Liberals and infrequently by Labor occupy the bottom-right corner. These include, as noted above, “higher taxes” (an attack on Labor over the tax policy divide) and “building our economy” (a clear echo of the Liberals’ campaign slogan).

Terms are coloured by their association. Those that are more associated with Labor are blue, and those more associated with Liberal are red.

Terms that are frequent and used commonly by both parties are displayed on the far right of the visualisation. For example, #auspol appears there because both parties liked to use this hashtag. Moreover, judging by the spacing between words, most of these shared terms seem to be hashtags that aren’t really issue-focused.
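The idea behind the chart can be illustrated without the Scattertext package itself. The sketch below is a simplified stand-in, not Scattertext's own scoring: it counts terms per party, places each term on a 0 to 1 scale by the rank of its count, and treats the Labor position as the vertical axis and the Liberal position as the horizontal axis. The tokeniser, the function names and the four sample tweets are assumptions invented for the example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and keep word-like tokens, including hashtags."""
    return re.findall(r"#?[a-z][a-z']+", text.lower())

def term_counts(tweets):
    """Count how often each term appears across a party's tweets."""
    counts = Counter()
    for text in tweets:
        counts.update(tokenize(text))
    return counts

def frequency_scale(counts):
    """Map each term to a position in (0, 1] using the dense rank of its count."""
    levels = sorted(set(counts.values()))
    rank = {count: i for i, count in enumerate(levels, start=1)}
    return {term: rank[count] / len(levels) for term, count in counts.items()}

def contrast(labor_tweets, liberal_tweets):
    """Return {term: (x, y, score)} with x = Liberal position, y = Labor position.

    A term a party never uses gets position 0 on that party's axis.
    score = y - x, so positive values lean Labor and negative values lean Liberal.
    """
    labor = frequency_scale(term_counts(labor_tweets))
    liberal = frequency_scale(term_counts(liberal_tweets))
    points = {}
    for term in set(labor) | set(liberal):
        x, y = liberal.get(term, 0.0), labor.get(term, 0.0)
        points[term] = (x, y, y - x)
    return points

# Invented sample tweets, for illustration only.
labor_sample = [
    "Real action on climate change and funding for TAFE #auspol",
    "Labor will invest in TAFE and climate action",
]
liberal_sample = [
    "Building our economy, not higher taxes #auspol",
    "Higher taxes under Labor would hurt our economy",
]
points = contrast(labor_sample, liberal_sample)
for term in ("climate", "taxes", "#auspol"):
    print(term, points[term])
```

In this toy data “climate” lands in the upper-left (high for Labor, absent for the Liberals), “taxes” lands in the lower-right, and “#auspol” sits on the diagonal because both parties use it equally often.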

Public Sentiment of Leaders

Understanding public sentiment occupies hours upon hours for political staffers, strategists, pundits and the media, as well as a few actuaries suffering from insomnia. Fortunately for all of them, Twitter provides the perfect platform to understand how voters react to the leaders’ campaigns, announcements and failures, albeit with an obvious bias towards social media users. By leveraging out-of-the-box machine learning models from Google’s Cloud Natural Language API, public sentiment can be estimated from public tweets that mention the two Prime Ministerial contenders.

Across 32,000 tweets collected over a 10-day period during the election campaign, average public sentiment was negative for both major parties and for mentions of the two leaders, a worrying sign for both political hopefuls.
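The averaging step can be sketched as below. The sketch takes any scoring function that returns a value between -1 (negative) and 1 (positive), which is the range of the document sentiment score returned by the Cloud Natural Language API. To keep the example self-contained it uses a tiny word-list scorer as a stand-in; that scorer is emphatically not Google's model, and the keyword lists, function names and sample tweets are all assumptions invented for illustration.

```python
import re
from collections import defaultdict
from statistics import mean

# Hypothetical keywords used to decide which leader a tweet mentions.
LEADER_KEYWORDS = {
    "Scott Morrison": ("morrison", "scomo"),
    "Bill Shorten": ("shorten",),
}

# Toy word lists for the stand-in scorer only.
POSITIVE = {"great", "good", "strong"}
NEGATIVE = {"terrible", "awful", "weak"}

def toy_score(text):
    """Stand-in scorer (NOT the Google model): returns a value in [-1, 1]."""
    words = re.findall(r"[a-z]+", text.lower())
    pos = sum(word in POSITIVE for word in words)
    neg = sum(word in NEGATIVE for word in words)
    return 0.0 if pos + neg == 0 else (pos - neg) / (pos + neg)

def mentioned_leaders(text):
    """Return the leaders whose keywords appear anywhere in the tweet."""
    lowered = text.lower()
    return [leader for leader, keys in LEADER_KEYWORDS.items()
            if any(key in lowered for key in keys)]

def average_sentiment(tweets, score_fn):
    """Average the sentiment score of the tweets that mention each leader."""
    scores = defaultdict(list)
    for text in tweets:
        leaders = mentioned_leaders(text)
        if not leaders:
            continue
        score = score_fn(text)
        for leader in leaders:
            scores[leader].append(score)
    return {leader: mean(values) for leader, values in scores.items()}

# Invented sample tweets, for illustration only.
public_sample = [
    "Great speech from Morrison today",
    "Morrison's plan is terrible",
    "Shorten was awful in the debate",
    "Lovely weather in Sydney",
]
averages = average_sentiment(public_sample, toy_score)
print(averages)
```

Passing the scorer in as a function means the word-list stand-in could be swapped for a wrapper around a real sentiment service without changing the aggregation logic.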

Will the real ScoMo please stand up?

Armed with this understanding of sentiment and word-choice, the power of data can be leveraged to build models and deliver outcomes – for good or for bad. No article covering the election is complete without a discussion on fake news, or bots taking over the results. Unfortunately, we did not identify any suspect behaviour.

Instead, we created our own Twitter model to produce tweets that mimic those from the accounts of Australia’s potential leaders: a fake Scott Morrison and a fake Bill Shorten. To show the lighter side of data analytics, we implemented a Machine Learning algorithm (a Markov Chain generator) that produces synthetic tweets from an existing library of historical tweets, about 3200 from each candidate.
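For readers curious about the mechanics, a word-level Markov chain generator can be sketched in a few lines. This is a minimal first-order version written for illustration, and the working group's own model may well differ (for example, in how many preceding words it conditions on). The function names, start and end markers, and the tiny sample corpus are all invented for the example.

```python
import random
from collections import defaultdict

START, END = "<s>", "</s>"  # markers for the beginning and end of a tweet

def build_chain(tweets):
    """Map each word to the list of words that followed it in the corpus."""
    chain = defaultdict(list)
    for text in tweets:
        words = [START] + text.split() + [END]
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain

def generate(chain, rng, max_words=40):
    """Walk the chain from START, picking each next word at random.

    Words that followed the current word more often in the corpus are
    proportionally more likely to be picked, because they appear more
    often in its list of followers.
    """
    if not chain[START]:
        return ""  # empty corpus: nothing to generate
    word, output = START, []
    while len(output) < max_words:
        word = rng.choice(chain[word])
        if word == END:
            break
        output.append(word)
    return " ".join(output)

# Invented sample corpus, for illustration only (not real tweets).
corpus = [
    "we will invest in schools and hospitals",
    "we will deliver a stronger economy",
    "a stronger economy means more jobs",
    "more jobs and better schools for all",
]
chain = build_chain(corpus)
rng = random.Random(2019)  # seeded so repeated runs give the same tweets
for _ in range(3):
    print(generate(chain, rng))
```

Because each word depends only on the one before it, the output is locally plausible but has no memory of where the sentence started, which is exactly why such a model can confidently stitch together claims nobody ever made.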

The process is quite simple, but the results are likely to leave many readers amused and worried in equal measure. The model produces some hilarious and ridiculous phrases, and many that are incomprehensible or simply wrong, such as Fake Bill Shorten declaring that “Labor will introduce a 15% GST”.

There are some interesting results, however. For example, the Fake Scott Morrison announced: “That’s why today we’ve announced $78m to help those on lower incomes, on pensions…”

Even a simple model produces some poetic results, such as the Fake Bill Shorten claiming “I seek to lead, ready to govern”. Clearly, there is an element of humour in this generated tweet, but as with many applications of data analytics, models such as these raise an important question. If this is a simple model, how challenging would it be to build a model that could drive meaningful and potentially harmful changes in public sentiment with fake tweets that fool the typical politically disengaged citizen?

Clearly more analysis is needed – fortunately, we have an election’s worth of data right around the corner.

This is the first article in a series produced by the young Data Analytics Working Group investigating the applications of data to the 2019 Federal Election.
Look out for the next article in this series by yDAWG, as well as an Analytics Snippet coming shortly, to show you exactly how they created the Twitter model to produce synthetic tweets!

CPD: Actuaries Institute Members can claim two CPD points for every hour of reading articles on Actuaries Digital.