Post Annotations

Overview

Annotations have been added to the Post object from all v2 endpoints that return a Post object. Post annotations offer a way to understand contextual information about the Post itself. Though 100% of Posts are reviewed, due to the contents of Post text, only a portion are annotated.

Entity annotations (NER): Entities include people, places, products, and organizations, and are delivered in the entity payload section. They are programmatically assigned based on what is explicitly mentioned (named-entity recognition) in the Post text.
Context annotations: Derived from analyzing a Post’s text, context annotations include a domain and entity pairing to help discover Posts on topics that may have been previously difficult to surface. We currently use 80+ domains to categorize Posts. A CSV file of available context annotation entities is available in our Github repository.

Post Annotation Types

Entities

Entity annotations are programmatically defined entities within the entities field and are reflected as annotations in the payload. Each annotation has a confidence score and indicates where in the Post text the entities were identified (using start and end fields). The entity annotation types include:

Person - Examples: Barack Obama, Daniel, George W. Bush
Place - Examples: Detroit, Cali, San Francisco
Product - Examples: Mountain Dew, Mozilla Firefox
Organization - Examples: Chicago White Sox, IBM
Other - Examples: Diabetes, Super Bowl 50

Context

Last updated: June 2022 Context annotations are delivered in the context_annotations field of the payload. They are inferred based on semantic analysis of keywords, hashtags, handles, etc., in the Post text and result in domain and/or entity labels. Currently, we use 80+ domains, as shown in the table below.

Domain Categories	Domain Codes
3: TV Shows	46: Brand Category
4: TV Episodes	47: Brand
6: Sports Events	48: Product
10: Person	54: Musician
11: Sport	55: Music Genre
12: Sports Team	56: Actor
13: Place	58: Entertainment Personality
22: TV Genres	60: Athlete
23: TV Channels	65: Interests and Hobbies Vertical
26: Sports League	66: Interests and Hobbies Category
27: American Football Game	67: Interests and Hobbies
28: NFL Football Game	68: Hockey Game
29: Events	71: Video Game
31: Community	78: Video Game Publisher
35: Politicians	79: Video Game Hardware
38: Political Race	83: Cricket Match
39: Basketball Game	84: Book
40: Sports Series	85: Book Genre
43: Soccer Match	86: Movie
44: Baseball Game	87: Movie Genre
45: Brand Vertical	88: Political Body
46: Brand Category	89: Music Album
47: Brand	90: Radio Station
48: Product	91: Podcast
54: Musician	92: Sports Personality
55: Music Genre	93: Coach
56: Actor	94: Journalist
58: Entertainment Personality	95: TV Channel [Entity Service]
60: Athlete	109: Reoccurring Trends
65: Interests and Hobbies Vertical	110: Viral Accounts
66: Interests and Hobbies Category	114: Concert
67: Interests and Hobbies	115: Video Game Conference
68: Hockey Game	116: Video Game Tournament
71: Video Game	117: Movie Festival
78: Video Game Publisher	118: Award Show
79: Video Game Hardware	119: Holiday
83: Cricket Match	120: Digital Creator
84: Book	122: Fictional Character
85: Book Genre	130: Multimedia Franchise
86: Movie	131: Unified Twitter Taxonomy
87: Movie Genre	136: Video Game Personality
88: Political Body	137: eSports Team
89: Music Album	138: eSports Player
90: Radio Station	139: Fan Community
91: Podcast	149: Esports League
92: Sports Personality	152: Food
93: Coach	155: Weather
94: Journalist	156: Cities
95: TV Channel [Entity Service]	157: Colleges & Universities
109: Reoccurring Trends	158: Points of Interest
110: Viral Accounts	159: States
114: Concert	160: Countries
115: Video Game Conference	162: Exercise & Fitness
116: Video Game Tournament	163: Travel
117: Movie Festival	164: Fields of Study
118: Award Show	165: Technology
119: Holiday	166: Stocks
120: Digital Creator	167: Animals
122: Fictional Character	171: Local News
130: Multimedia Franchise	172: Global TV Show
131: Unified Twitter Taxonomy	173: Google Product Taxonomy
136: Video Game Personality	174: Digital Assets & Crypto
137: eSports Team	175: Emergency Events
138: eSports Player

Note: Domain 131 (Unified Twitter Taxonomy) refers to X’s User Facing Interest Taxonomy. This taxonomy helps to power features on the platform such as Topics.

Requesting Annotations

Sample Request

curl --location --request GET 'https://api.x.com/2/tweets/1212092628029698048?tweet.fields=context_annotations,entities' --header 'Authorization: Bearer $BEARER_TOKEN'

Sample Response

{
    "data": {
        "context_annotations": [
            {
                "domain": {
                    "id": "119",
                    "name": "Holiday",
                    "description": "Holidays like Christmas or Halloween"
                },
                "entity": {
                    "id": "1186637514896920576",
                    "name": "New Years Eve"
                }
            },
            {
                "domain": {
                    "id": "119",
                    "name": "Holiday",
                    "description": "Holidays like Christmas or Halloween"
                },
                "entity": {
                    "id": "1206982436287963136",
                    "name": "Happy New Year: It’s finally 2020 everywhere!",
                    "description": "Catch fireworks and other celebrations as people across the globe enter the new year.\nPhoto via @GettyImages"
                }
            },
            {
                "domain": {
                    "id": "45",
                    "name": "Brand Vertical",
                    "description": "Top level entities that describe a Brand's industry"
                }
            },
            {
                "domain": {
                    "id": "46",
                    "name": "Brand Category",
                    "description": "Categories within Brand Verticals that narrow down the scope of Brands"
                },
                "entity": {
                    "id": "781974596752842752",
                    "name": "Services"
                }
            },
            {
                "domain": {
                    "id": "47",
                    "name": "Brand",
                    "description": "Brands and Companies"
                },
                "entity": {
                    "id": "10045225402",
                    "name": "Twitter"
                }
            }
        ],
        "entities": {
            "annotations": [
                {
                    "start": 144,
                    "end": 150,
                    "probability": 0.626,
                    "type": "Product",
                    "normalized_text": "Twitter"
                }
            ],
            "urls": [
                {
                    "start": 222,
                    "end": 245,
                    "url": "https://t.co/yvxdK6aOo2",
                    "expanded_url": "https://x.com/LovesNandos/status/1211797914437259264/photo/1",
                    "display_url": "pic.x.com/yvxdK6aOo2"
                }
            ]
        },
        "id": "1212092628029698048",
        "text": "We believe the best future version of our API will come from building it with YOU. Here’s to another great year with everyone who builds on the Twitter platform. We can’t wait to continue working with you in the new year. https://t.co/yvxdK6aOo2"
    }
}

Sample App

Check out the Post Entity Extractor on Glitch to easily discover context annotations in Posts and see how this feature works.

Frequently asked questions

Context annotations

The questions below are specific to the context annotations element of Tweet annotations. For more details, please see the Overview page.

How does Twitter context annotations work?

Twitter classifies Tweets semantically, meaning that we curate lists of keywords, hashtags, and @handles that are relevant to a given topic. If a Tweet contains the text we’ve specified, it will be labeled appropriately. This differs from a machine learning approach where a model is trained specifically to classify text (in this case, Tweets) and produce a probability score alongside the output/classification.

How do I know that your data is complete and trustworthy?

Twitter’s annotations are curated by domain experts using research and QA processes that have been refined over the course of several years. The process is supported by custom tooling to scale data tracking as far as we are able to maintain excellent precision and recall. In addition, our data is audited regularly by an internal team, having received a precision score of ~80% for the past several quarters.

How do you ensure precision?

Team members QA our entities on a daily basis to ensure high precision and recall. Additionally, our work is audited quarterly by an internal team, which manually reviews 10,000 Tweets across all of our domains to calculate a precision score.

How do you decide what to track?

For some domains, like sports and TV, we rely on automated ingest to build out our graph. In the News domain, we track data around stories published by the Twitter Moments team. Otherwise, the team uses a variety of research methods to identify topics to track that garner a high amount of conversation on the platform.

What historical support is available with Tweet Annotations?

Data tracking begins the moment an entity is published; therefore, we do not annotate Tweets that were published prior to a given entity being tracked. For example, if an upstart brand/company is added to the taxonomy, we will not retroactively annotate Tweets about that brand prior to when the annotation was added.

Is Twitter able to annotate Tweets in non-english languages? If so, which languages and does the coverage of Tweets being annotated change?

Yes. Language coverage can vary depending on the domain and the market. English and Japanese are included in the majority of the biggest entities. Below, is a list of the languages and main markets that are covered today:

English (US, UK)
Japanese (Japan)
Portuguese (Brazil)
Spanish (Argentina, Mexico, Spain)
Hindi (India)
Arabic (Saudi Arabia)
Turkish (Turkey)
Indonesian (Indonesia)
Russian (Russia)
French (France)

Coming soon (~H2 2021):

German (Germany)
Tamil (India)

Below is a table of the top 15 countries ordered by the most coverage of annotated Tweets:

Rank	Country code	Country	% of Tweets annotated
1	IN	India	41%
2	VN	Vietnam	36%
3	GB	Great Britain	36%
4	EC	Ecuador	35%
5	PE	Peru	33%
6	US	United States	32%
7	CA	Canada	32%
8	AU	Australia	31%
9	JP	Japan	31%
10	PH	Philippines	30%
11	SG	Singapore	30%
12	MY	Malaysia	30%
13	MX	Mexico	30%
14	GB	Great Britain	29%
15	NG	Nigeria	29%

What underlying 'semantics' does Twitter rely on to annotate a Tweet?

Tweet annotations consist of the following semantics to annotate a Tweet:

Accounts - we can annotate tweets from a given handle or mentioning this handle
Hashtags
Keywords/phrases

For customers that are familiar with the filtered steaming APIs such as PowerTrack, the semantics used by annotations are similar in principle to the boolean rules defined to filter a stream of Tweets. If a Tweet matches the underlying semantic conditions, it will be tagged accordingly.

Why do some Tweets have entities associated with them while others do not?

The goal is to annotate as many Tweets as possible; however, there are several reasons why some Tweets are not annotated:

Some Tweets aren’t semantically rich enough to be labelled and can’t be tagged with our current annotation rules
Some Tweets aren’t topical
The Tweet is about a very ephemeral topic that’s not in our graph
We don’t cover the language/market
We cover the language/market but we’re missing a topic or a specific term/account/hashtag related to a topic we already track

When there are multiple domains (for example, [3,30]), does the Entity ID remain the same?

An entity can be part of multiple domains. The domain IDs will change but the entity ID remains the same. Donald Glover is a person (domain 10), an actor (domain 56) and a musician (domain 54) but his entity ID is still 875072662527029248.

Do you have an established timeline for show/movie tracking? In other words, how long is a show/movie tracked before/after release?

Tracking starts a month prior to the release. For popular blockbusters, like a Marvel movie, we can start tracking them as soon as they start teasing about an upcoming release.

Do movies have a locale filter similar to the one for TV shows?

No, they do not.

Overview

Posts

Users

Direct Messages

Likes

Lists

Spaces

Communities

Community Notes

Trends

Media

X Activity

Usage

Stream Connections

Compliance

Enterprise v2

Likes Streams

GNIP 2.0

Post Annotations

Overview

Post Annotation Types

Entities

Context

Requesting Annotations

Sample Request

Sample Response

Sample App

Frequently asked questions

Context annotations

Overview

Posts

Users

Direct Messages

Likes

Lists

Spaces

Communities

Community Notes

Trends

Media

X Activity

Usage

Stream Connections

Compliance

Enterprise v2

Likes Streams

GNIP 2.0

​Overview

​Post Annotation Types

​Entities

​Context

​Requesting Annotations

​Sample Request

​Sample Response

​Sample App

​Frequently asked questions

​Context annotations

Overview

Post Annotation Types

Entities

Context

Requesting Annotations

Sample Request

Sample Response

Sample App

Frequently asked questions

Context annotations