Nowadays, we spend a surprising amount of time online, connected to the World Wide Web and various social media platforms: filling idle minutes on the bus by browsing a social media feed, or procrastinating on the web when our concentration on the task at hand fails. Social media has indeed come to occupy a more central role in our lives than ever before.
Some have seen this as an opportunity to unify people across different parts of the world and to foster livelier public debate about societal issues. However, Web 2.0 did not turn out exactly as these optimists pictured. We have witnessed that the freedom of interaction provided by the Internet has its pros and cons. While people do engage with strangers online to deliberate matters both political and recreational, this connectivity also has a darker side. The global web plays host to a range of human behaviors, both constructive and unconstructive.
We have seen that social media can be used as a tool for political manipulation, persuasion, and promotion, in order to win attention, people’s allegiance, or market share. These efforts are often disguised to hide the true intentions behind what is communicated online: for example, fabricating political support through fake networks of social media user accounts, or manufacturing fake photos, as in the case of the fabricated Pentagon explosion image.
We have also seen that social media can be used to mobilize people, both for well-intentioned activism and to target and harass individuals – or to troll others online. This dissertation has focused on uncovering the inner workings of trolling in online conversations: how trolling plays out in different online forums, what its conversational characteristics are, and how it can be detected using computational modeling.
Many of us might remember how Russian propaganda campaigns in 2014 sought to defame Finnish child protective services to garner attention, influence public opinion, and direct discussion away from the war in Ukraine. In the wake of these events, ‘trolls’ became a regular topic of discussion, as journalist Jessikka Aro and others investigated the online propaganda machine. They were also widely targeted by Internet trolls, who attempted to harass them into silence and stop them from reporting on the workings of the troll factory.
The 2016 U.S. presidential election brought trolls to global attention, when vast networks of fake Twitter user accounts – called ‘trolls’ – were used to amplify pro-Trump content and to defame Hillary Clinton’s campaign.
Although trolls ‘escaped the Internet’ and entered the collective consciousness in the wake of these events, trolling is not a novel phenomenon. As early as the 1990s, research showed that ‘trolling’, in its various forms, is ubiquitous on online platforms – a common Internet phenomenon.
There are various forms of trolling, and researchers have made many attempts to define what trolling is. Some define it as ‘baiting newbies’ in chat rooms, some as aggressive behavior online, some as automated fake user networks on Twitter, and some as identity deception. Although trolling may not always be problematic, recent events have shown that trolling that seeks to manipulate societal and political conversations, and trolling that targets minorities or specific groups of people, can do a lot of damage. This makes it important to find novel methods for identifying, moderating, and mitigating such behaviors.
According to Claire Hardaker, who conducted in-depth qualitative research on trolling well before Twitter trolls became famous, a troll is an Internet user who acts as if they sincerely wish to be part of a group, but whose real intention is to cause disruption and to trigger or exacerbate conflict for their own amusement.
Based on this definition, my colleagues and I endeavored to gather a dataset representing such trolling behaviors on several online platforms that involve conversational interactions between users, ranging from online news comment sections to conversation spaces on Reddit. We compiled a significant number of interactions portraying different types of trolling: from aggressive trolling to hypocritical nitpicking, and from trolling that endangers others to trolling that continuously digresses online discussions from their original course. We also found trolling in conversations with different themes: trolls target not only societal and political discussions around matters such as Brexit or climate change; more commonplace, everyday conversations are also bombarded by these troublemakers. In fact, hobby- and leisure-oriented discussion communities are frequently harassed by trolls.
Comparing the trolling interactions that took place, I found that trolls selectively use different strategies when targeting different conversations. In societal and political discussions, trolls tended to use more indirect or covert trolling styles: for example, subtle digressions of the conversation, hypocritical nitpicking, or manipulation of the discussion by creating an antipathetic frame for the topic. Meanwhile, in less serious free-time and hobby-related conversations, trolling styles tended to be more overt: directly aggressing against or provoking others, endangering others by giving out bad advice, or posting shocking or taboo content. This suggests that the unwritten principles of social behavior considered most important can differ across online conversation spaces. Trolls, aware of this, know how to select the right strategies to disrupt each space, based on which rules carry weight there.
Given that trolling is acted out very differently in different conversation forums, and that there are many different types of trolling, how could we possibly capture all of these under one definition, or use one method to identify them all?
Let us focus on conversational norms for a bit.
Norms are an important part of human interaction. There are many unwritten (and written) rules outlining how we are expected to act in social situations. Some rules depend on time, place, and cultural context, such as whether we should shake someone’s hand or kiss their cheek in greeting. Some are pervasive across contexts, like the rule that we should attempt to be polite and avoid offensive or aggressive language. However, as mentioned, trolling can often be indirect, which means that it cannot always be identified based on verbal aggression or the use of profanity. Some trolls act covertly, deceptively manipulating their interlocutors.
Seeing trolling as deceptive manipulation, I was interested in what exactly makes interaction with trolls so elusive.
Human interaction tends to strive toward symmetry: by symmetry I mean a coherent exchange of ideas and relevance between action pairs such as question–answer. These tendencies are common across languages.
I wanted to investigate whether I could find common patterns in how trolls respond to others by analyzing action pairs that are commonly used in conversation, both offline and online.
The unwritten norms of conversation, according to Conversation Analysis research, dictate that certain actions taken in conversation expect a certain type of response. When one asks a question, one expects an informative answer. Similarly, an accusation of misbehavior expects an account or explanation, and a request expects either acceptance or an explanation of why the request cannot be fulfilled.
Although this is, of course, a simplification of the complex nature of interaction, we expected that trolls might repeatedly defy these norms to disrupt conversations, more often than other participants do. Indeed, in my research I found that conversations with trolls included a significantly higher number of cases where these norms of responding were broken. In other words, whereas ordinary constructive conversation strives for symmetry between action pairs, trolling seemed to repeatedly strive to break this symmetry. This was achieved in different ways: ignoring actions that required an answer, challenging others, and providing answers that were mismatched or unexpected in relation to what had previously been said. Trolls frequently and selectively used asymmetric responses to manipulate others into responding and continuing the interaction. This was common to all the types of trolling included in our dataset.
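To make this concrete, the sketch below shows one simplified way such norm checks could be expressed in code. It is a minimal illustration rather than the coding scheme used in the actual study: the action labels, the expectation table, and the toy conversation are all simplified assumptions of my own.

```python
# A minimal sketch of checking action-pair symmetry in a conversation.
# The action labels and the expectation table are simplified assumptions,
# not the annotation scheme used in the dissertation.

# Which response types satisfy the expectation set up by a first action.
EXPECTED_RESPONSES = {
    "question": {"answer"},
    "accusation": {"account", "explanation"},
    "request": {"acceptance", "explanation"},
}

def count_asymmetric_pairs(conversation):
    """Count responses that ignore or mismatch the expectation
    set up by the preceding action."""
    violations = 0
    for first, response in zip(conversation, conversation[1:]):
        expected = EXPECTED_RESPONSES.get(first["action"])
        if expected and response["action"] not in expected:
            violations += 1
    return violations

# Toy conversation: a question met with a challenge instead of an answer.
convo = [
    {"user": "A", "action": "question"},
    {"user": "B", "action": "challenge"},   # asymmetric response
    {"user": "A", "action": "accusation"},
    {"user": "B", "action": "account"},     # symmetric response
]
print(count_asymmetric_pairs(convo))  # -> 1
```

A count like this, normalized by conversation length, gives a rough per-conversation measure of how often the symmetry of action pairs is broken.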
Our tendency to follow the unwritten principles of conversation keeps interaction coherent and understandable. However, we often follow these principles unconsciously. We also tend to interpret violations of them in a positive light, assuming there is a good reason for each slight – for example, that the other person did not see our message. Trolls are quite happy to take advantage of this tendency.
We also wanted to know whether these findings could be used to identify trolling through computational detection. Definitions of trolling have often emphasized the intentionality of the provocative, digressive, or deceptive behaviors related to trolling. However, since intentionality is an elusive concept and difficult to capture in computational modeling, we decided to focus on features that are observable in interaction. So, here “by trolling we refer to conversational trolling, which manifests as strategically deceptive interaction online, aiming to confuse, provoke and manipulate others into participating in pointless and even harmful discussions” [1].
To identify trolling, we analyzed what a conversation was like: how often conversational norms were broken, how many asymmetric responses there were, what actions participants took, and what politeness strategies they used. These features proved important when using machine learning and natural language processing methods to computationally detect trolling in online conversations. By computationally analyzing the paired actions described earlier, and the norms related to them, the model we developed correctly identified 92% of all trolling and non-trolling conversations in our dataset.
This makes the model better at detecting trolling than previous trolling detection models in the field. Metrics aside, trolling can be effectively characterized by the continuous use of asymmetric responses and by repeated disruptions of the conversation and its norms.
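As an illustration of how conversation-level features like these could feed a detector, here is a minimal sketch using scikit-learn. It is a hedged toy example, not the model from the dissertation: the feature set, the toy values, and the choice of logistic regression are assumptions of my own.

```python
# A toy sketch of feature-based trolling detection. The features and
# values are illustrative assumptions, not the dissertation's data.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical per-conversation features:
# [norm violation count, asymmetric-response rate, politeness-marker rate]
X = np.array([
    [7, 0.60, 0.10],  # troll-involved conversations
    [6, 0.55, 0.15],
    [1, 0.05, 0.40],  # ordinary conversations
    [0, 0.10, 0.35],
])
y = np.array([1, 1, 0, 0])  # 1 = trolling, 0 = non-trolling

clf = LogisticRegression().fit(X, y)

# Score a new conversation with frequent asymmetric responses.
new_convo = np.array([[5, 0.50, 0.12]])
print(clf.predict(new_convo))        # -> [1], i.e. flagged as trolling
print(clf.predict_proba(new_convo))  # class probabilities
```

In practice one would of course need many more labeled conversations and careful validation; the point here is only that norm violations and response asymmetry can be turned into simple numeric features.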
Finally, it must be noted that deception and manipulation are, of course, far from novel phenomena. They are part of human interaction.
The famous philosopher Paul Grice once said that the nature of communication is to influence and to be influenced by others. In other words, the concept of influence is not black and white; rather, it exists on a continuum.
This means that it is important to understand the different shades of human interaction in their communicative context. Trolling, for instance, is an Internet-born concept, but the strategies of interaction it utilizes are not new. As I have shown in my dissertation, it is important to pay attention to the features of conversational interaction, and how norms are followed or broken, to be able to identify manipulative online behaviors like trolling more efficiently. Bearing this in mind, I think, will make it easier to make sense of the range of online behaviors that may or may not aim to influence the way we think or behave.
Henna Paakki, 29.11.2024
The full dissertation is available at: https://urn.fi/URN:ISBN:978-952-64-2083-7
[1] Henna Paakki, Heidi Vepsäläinen, Antti Salovaara, and Bushra Zafar. 2024. Detecting Covert Disruptive Behavior in Online Interaction by Analyzing Conversational Features and Norm Violations. ACM Trans. Comput.-Hum. Interact. 31, 2, Article 20 (January 2024), 43 pages. https://doi.org/10.1145/3635143

