Dataset (CSV) | Annotation Guidelines | Annotator Feedback | Explore Dataset Online | Most Frequent Schemes Online | MediaEval 2026

Overview

This dataset consists of political tweets annotated for the presence of enthymemes: arguments in which a key component is left implicit. For each tweet, multiple independent annotators determine whether an implicit premise, an implicit conclusion, or no implicit component is present, reconstruct the full propositional structure of the argument, and identify the underlying Walton argumentation scheme.

The tweets cover two topics in British political discourse: immigration policy and COVID-19 vaccination. They were drawn from the tropes corpus of Flaccavento et al. (2025) and selected to balance both topics and enthymeme types across the dataset.

A central design principle is the preservation of annotator disagreement. Rather than reducing multiple judgements to a single ground truth, all individual labels and reconstructions are retained and released alongside the data, enabling research into annotation variation as a substantive signal rather than noise to be discarded.

⚠ The dataset contains language directed at immigrants that some readers may find offensive. This reflects the nature of the source material and has not been filtered.

Annotation Schema

Each tweet is annotated independently by multiple annotators. Train and dev instances are annotated by five annotators each; test instances by three. Every annotator provides the following for each tweet:

Enthymeme Type

One of three labels: implicit_premise (an unstated supporting assumption the argument relies on), implicit_conclusion (a claim that follows from stated premises but is never expressed), or none (all components are explicit, or no argument is present).
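Because every annotator's label is kept, one simple way to work with the three labels is to turn the labels given to a single tweet into a relative-frequency distribution instead of a majority vote. The sketch below is only an illustration: the function name `label_distribution` and the example votes are hypothetical and are not part of the released dataset or its tooling.

```python
from collections import Counter

# The three enthymeme-type labels defined in the annotation schema.
LABELS = ("implicit_premise", "implicit_conclusion", "none")


def label_distribution(annotations):
    """Turn one tweet's annotator labels into relative frequencies per label."""
    if not annotations:
        raise ValueError("at least one annotation is required")
    unknown = set(annotations) - set(LABELS)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    counts = Counter(annotations)
    total = len(annotations)
    # Labels nobody chose get 0.0, so every tweet has the same keys.
    return {label: counts[label] / total for label in LABELS}


# Hypothetical votes for one train/dev tweet (five annotators).
votes = [
    "implicit_premise",
    "implicit_premise",
    "implicit_conclusion",
    "implicit_premise",
    "none",
]
dist = label_distribution(votes)
print(dist)
```

A distribution like this keeps minority readings visible, which fits the dataset's stated aim of preserving disagreement; whether it is the right training target depends on the task.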

Argument Reconstruction

The annotator writes out the full set of propositions (premises and conclusion) constituting the argument. The implicit component is marked with the tag (implicit). The example below illustrates a complete reconstruction.

Tweet
"Deterring the plans of illegal people smugglers is essential to controlled immigration. We should support all plans to stop them."
Reconstructed argument
Premise 1 (implicit): Controlled immigration is desirable.
Premise 2 (explicit): Deterring the plans of illegal people smugglers is essential to controlled immigration.
Conclusion (explicit): We should support all plans to stop them.

Walton Argumentation Scheme

Annotators classify the argument using Walton's taxonomy of argumentation schemes. The most frequently attested schemes in the dataset include Argument from Cause to Effect, Argument from Inconsistent Commitment, Argument from Motive, Argument from Source Credibility, and Argument from Consequences. The full taxonomy, critical questions, and abstract scheme forms used in annotation are documented in the annotation guidelines.

The complete annotation schema, scheme definitions, worked examples, and edge-case rules are available in the Annotation Guidelines (PDF).

Data Splits & Release Schedule

The dataset is released in three stages. The train and dev sets released in mid-March are supersets of the initial sample.

Data Format

The dataset is distributed as CSV files, one per annotator per split, alongside a merged file aggregating all annotations. Each row corresponds to one tweet as annotated by one annotator.

Field | Description
tweet_id | Unique tweet identifier
tweet_text | Raw tweet content
topic | immigration or vaccine
annotator_id | Anonymised annotator code
label | implicit_premise, implicit_conclusion, or none
scheme | Walton scheme name, or None
prop_1 … prop_3 | Reconstructed propositions with inline role tags
implicit_text | Extracted text of the implicit proposition (convenience field)

Within proposition fields, the role of each proposition is marked inline. The implicit component carries the tag (implicit) appended to its text, e.g. "Controlled immigration is desirable. (implicit)".
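As a rough sketch of how a row in this format might be read, the snippet below parses a small in-memory CSV with the fields listed above and separates the (implicit) tag from the proposition text. Several things are assumptions: the sample row is built from the worked example above and is not an actual record, the `annotator_id` and `scheme` values are illustrative only, the helper `split_propositions` is hypothetical, and only the (implicit) tag is handled because other inline role tags are not specified here.

```python
import csv
import io

IMPLICIT_TAG = "(implicit)"

# In-memory stand-in for one row of an annotation file. The column names
# follow the field table; the annotator_id and scheme values are illustrative.
SAMPLE_CSV = '''tweet_id,tweet_text,topic,annotator_id,label,scheme,prop_1,prop_2,prop_3,implicit_text
t1,"Deterring the plans of illegal people smugglers is essential to controlled immigration. We should support all plans to stop them.",immigration,A1,implicit_premise,Argument from Consequences,Controlled immigration is desirable. (implicit),Deterring the plans of illegal people smugglers is essential to controlled immigration.,We should support all plans to stop them.,Controlled immigration is desirable.
'''


def split_propositions(row):
    """Return (text, is_implicit) pairs for the non-empty prop_* fields of a row."""
    props = []
    for key in ("prop_1", "prop_2", "prop_3"):
        text = (row.get(key) or "").strip()
        if not text:
            continue  # Fewer than three propositions were written.
        is_implicit = text.endswith(IMPLICIT_TAG)
        if is_implicit:
            # Drop the trailing tag and the whitespace before it.
            text = text[: -len(IMPLICIT_TAG)].rstrip()
        props.append((text, is_implicit))
    return props


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
props = split_propositions(rows[0])
implicit = [text for text, flag in props if flag]

for text, flag in props:
    print("implicit" if flag else "explicit", "|", text)
```

To read a real file, `io.StringIO(SAMPLE_CSV)` would be replaced by an opened file handle. Comparing the recovered implicit proposition with `implicit_text` is a cheap consistency check, assuming that field holds the same text without the tag.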

Motivation

Enthymemes are among the most pervasive, and most underexplored, features of persuasive discourse. By leaving a key premise or conclusion unstated, an argument invites the reader to supply it themselves, producing the subjective impression that the inference is their own. This mechanism is especially effective in short-form political communication, where space is constrained and emotional register is prioritised over logical explicitness.

Detecting and reconstructing implicit argument components is directly relevant to computational fact-checking, misinformation research, and argument mining more broadly. A system capable of recovering the unstated premise underlying a political claim has taken a meaningful step toward auditing that claim's logical structure.

Most existing argument mining corpora treat annotation disagreement as noise to be minimised. This dataset treats it as a feature: genuine interpretive plurality is preserved and the resource is designed to support research into learning from disagreement rather than collapsing it into a single authoritative label.

References
