Overview

Annotations have been added to the Post object on all v2 endpoints that return a Post object. Post annotations offer a way to understand contextual information about the Post itself. Although every Post is reviewed, only a portion are annotated, depending on the contents of the Post text. There are two types of annotations:
  1. Entity annotations (NER): Entities include people, places, products, and organizations, and are delivered in the entity payload section. They are programmatically assigned based on what is explicitly mentioned (named-entity recognition) in the Post text.
  2. Context annotations: Derived from analyzing a Post’s text, context annotations include a domain and entity pairing to help discover Posts on topics that may have been previously difficult to surface. We currently use 80+ domains to categorize Posts. A CSV file of available context annotation entities is available in our GitHub repository.

Post Annotation Types

Entities

Entity annotations are programmatically identified entities, delivered as annotations within the entities field of the payload. Each annotation has a confidence score (the probability field) and indicates where in the Post text the entity was identified (using the start and end fields). The entity annotation types include:
  • Person - Examples: Barack Obama, Daniel, George W. Bush
  • Place - Examples: Detroit, Cali, San Francisco
  • Product - Examples: Mountain Dew, Mozilla Firefox
  • Organization - Examples: Chicago White Sox, IBM
  • Other - Examples: Diabetes, Super Bowl 50
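
As a minimal sketch of how the start, end, and probability fields might be used, the snippet below recovers the annotated text from a Post and drops low-confidence annotations. The helper names and the 0.5 cutoff are hypothetical. It also assumes that end is inclusive, which matches the sample response later in this document (start 144, end 150 for the seven-character "Twitter") but is not stated explicitly there.

```python
from typing import Dict, List


def annotation_text(post_text: str, annotation: Dict) -> str:
    """Return the part of post_text that an entity annotation covers.

    Assumption: `end` is inclusive, as in this document's sample response.
    """
    return post_text[annotation["start"]: annotation["end"] + 1]


def confident_annotations(annotations: List[Dict], min_probability: float) -> List[Dict]:
    """Keep annotations whose probability meets a caller-chosen cutoff."""
    return [a for a in annotations if a["probability"] >= min_probability]


# Text and annotation copied from the sample response in this document
# (the trailing sentence and URL are omitted; indices are unaffected).
post_text = (
    "We believe the best future version of our API will come from building it "
    "with YOU. Here’s to another great year with everyone who builds on the "
    "Twitter platform."
)
annotations = [
    {"start": 144, "end": 150, "probability": 0.626,
     "type": "Product", "normalized_text": "Twitter"},
]

for a in confident_annotations(annotations, min_probability=0.5):
    print(a["type"], annotation_text(post_text, a))
```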

Context

Last updated: June 2022

Context annotations are delivered in the context_annotations field of the payload. They are inferred based on semantic analysis of keywords, hashtags, handles, etc., in the Post text and result in domain and/or entity labels. Currently, we use 80+ domains, as shown in the table below.
Domain ID: Domain Category
3: TV Shows
4: TV Episodes
6: Sports Events
10: Person
11: Sport
12: Sports Team
13: Place
22: TV Genres
23: TV Channels
26: Sports League
27: American Football Game
28: NFL Football Game
29: Events
31: Community
35: Politicians
38: Political Race
39: Basketball Game
40: Sports Series
43: Soccer Match
44: Baseball Game
45: Brand Vertical
46: Brand Category
47: Brand
48: Product
54: Musician
55: Music Genre
56: Actor
58: Entertainment Personality
60: Athlete
65: Interests and Hobbies Vertical
66: Interests and Hobbies Category
67: Interests and Hobbies
68: Hockey Game
71: Video Game
78: Video Game Publisher
79: Video Game Hardware
83: Cricket Match
84: Book
85: Book Genre
86: Movie
87: Movie Genre
88: Political Body
89: Music Album
90: Radio Station
91: Podcast
92: Sports Personality
93: Coach
94: Journalist
95: TV Channel [Entity Service]
109: Reoccurring Trends
110: Viral Accounts
114: Concert
115: Video Game Conference
116: Video Game Tournament
117: Movie Festival
118: Award Show
119: Holiday
120: Digital Creator
122: Fictional Character
130: Multimedia Franchise
131: Unified Twitter Taxonomy
136: Video Game Personality
137: eSports Team
138: eSports Player
139: Fan Community
149: Esports League
152: Food
155: Weather
156: Cities
157: Colleges & Universities
158: Points of Interest
159: States
160: Countries
162: Exercise & Fitness
163: Travel
164: Fields of Study
165: Technology
166: Stocks
167: Animals
171: Local News
172: Global TV Show
173: Google Product Taxonomy
174: Digital Assets & Crypto
175: Emergency Events
Note: Domain 131 (Unified Twitter Taxonomy) refers to X’s User Facing Interest Taxonomy. This taxonomy helps to power features on the platform such as Topics.
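
One way to work with these domain IDs in code is sketched below: a small lookup of a few IDs copied from the table above, plus a check for whether a Post's context annotations include a given domain. The names DOMAIN_NAMES and has_domain are hypothetical. Note that the payload delivers domain IDs as strings, so the lookup uses string keys.

```python
from typing import Dict, List

# A few domain IDs copied from the table above; extend as needed.
DOMAIN_NAMES: Dict[str, str] = {
    "10": "Person",
    "47": "Brand",
    "119": "Holiday",
    "131": "Unified Twitter Taxonomy",
}


def has_domain(context_annotations: List[Dict], domain_id: str) -> bool:
    """True if any context annotation belongs to the given domain ID."""
    return any(item["domain"]["id"] == domain_id for item in context_annotations)


# Two items shaped like entries of the context_annotations field.
context_annotations = [
    {"domain": {"id": "119", "name": "Holiday"},
     "entity": {"id": "1186637514896920576", "name": "New Years Eve"}},
    {"domain": {"id": "47", "name": "Brand"},
     "entity": {"id": "10045225402", "name": "Twitter"}},
]

for domain_id, name in DOMAIN_NAMES.items():
    print(name, has_domain(context_annotations, domain_id))
```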

Requesting Annotations

Sample Request

curl --location --request GET 'https://api.x.com/2/tweets/1212092628029698048?tweet.fields=context_annotations,entities' --header "Authorization: Bearer $BEARER_TOKEN"
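
For readers who prefer Python, the same request can be sketched with the standard library. The function name build_annotation_request and the placeholder token are assumptions for illustration; the endpoint, query parameter, and header come from the curl command above. The snippet only builds the request and does not send it.

```python
import urllib.parse
import urllib.request


def build_annotation_request(post_id: str, bearer_token: str) -> urllib.request.Request:
    """Build a GET request for one Post with both annotation fields requested."""
    query = urllib.parse.urlencode(
        {"tweet.fields": "context_annotations,entities"},
        safe=",",  # keep the comma literal, as in the curl example
    )
    url = f"https://api.x.com/2/tweets/{post_id}?{query}"
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {bearer_token}"}
    )


# "YOUR_BEARER_TOKEN" is a placeholder, not a working credential.
request = build_annotation_request("1212092628029698048", "YOUR_BEARER_TOKEN")
print(request.full_url)
# To send it: urllib.request.urlopen(request), then parse the body with json.load().
```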

Sample Response

{
    "data": {
        "context_annotations": [
            {
                "domain": {
                    "id": "119",
                    "name": "Holiday",
                    "description": "Holidays like Christmas or Halloween"
                },
                "entity": {
                    "id": "1186637514896920576",
                    "name": "New Years Eve"
                }
            },
            {
                "domain": {
                    "id": "119",
                    "name": "Holiday",
                    "description": "Holidays like Christmas or Halloween"
                },
                "entity": {
                    "id": "1206982436287963136",
                    "name": "Happy New Year: It’s finally 2020 everywhere!",
                    "description": "Catch fireworks and other celebrations as people across the globe enter the new year.\nPhoto via @GettyImages"
                }
            },
            {
                "domain": {
                    "id": "45",
                    "name": "Brand Vertical",
                    "description": "Top level entities that describe a Brand's industry"
                }
            },
            {
                "domain": {
                    "id": "46",
                    "name": "Brand Category",
                    "description": "Categories within Brand Verticals that narrow down the scope of Brands"
                },
                "entity": {
                    "id": "781974596752842752",
                    "name": "Services"
                }
            },
            {
                "domain": {
                    "id": "47",
                    "name": "Brand",
                    "description": "Brands and Companies"
                },
                "entity": {
                    "id": "10045225402",
                    "name": "Twitter"
                }
            }
        ],
        "entities": {
            "annotations": [
                {
                    "start": 144,
                    "end": 150,
                    "probability": 0.626,
                    "type": "Product",
                    "normalized_text": "Twitter"
                }
            ],
            "urls": [
                {
                    "start": 222,
                    "end": 245,
                    "url": "https://t.co/yvxdK6aOo2",
                    "expanded_url": "https://x.com/LovesNandos/status/1211797914437259264/photo/1",
                    "display_url": "pic.x.com/yvxdK6aOo2"
                }
            ]
        },
        "id": "1212092628029698048",
        "text": "We believe the best future version of our API will come from building it with YOU. Here’s to another great year with everyone who builds on the Twitter platform. We can’t wait to continue working with you in the new year. https://t.co/yvxdK6aOo2"
    }
}
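
A response like the one above can be reduced to domain and entity pairs with a few lines of Python. This is a sketch under the assumption that the entity object may be missing from a context annotation, as it is for the "Brand Vertical" item in the sample; the function name context_pairs is hypothetical, and the payload below is a trimmed copy of the sample.

```python
from typing import Dict, List, Optional, Tuple


def context_pairs(payload: Dict) -> List[Tuple[str, Optional[str]]]:
    """Return (domain name, entity name) pairs; entity name is None if absent."""
    pairs = []
    for item in payload.get("data", {}).get("context_annotations", []):
        entity = item.get("entity")  # not every annotation carries an entity
        pairs.append((item["domain"]["name"], entity["name"] if entity else None))
    return pairs


# Trimmed copy of the sample response above.
sample = {
    "data": {
        "context_annotations": [
            {"domain": {"id": "119", "name": "Holiday"},
             "entity": {"id": "1186637514896920576", "name": "New Years Eve"}},
            {"domain": {"id": "45", "name": "Brand Vertical"}},
            {"domain": {"id": "47", "name": "Brand"},
             "entity": {"id": "10045225402", "name": "Twitter"}},
        ],
        "id": "1212092628029698048",
    }
}

for domain_name, entity_name in context_pairs(sample):
    print(domain_name, "->", entity_name)
```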

Sample App

Check out the Post Entity Extractor on Glitch to easily discover context annotations in Posts and see how this feature works.

Frequently asked questions

Context annotations

The questions below are specific to the context annotations element of Tweet annotations. For more details, please see the Overview page.
Twitter classifies Tweets semantically, meaning that we curate lists of keywords, hashtags, and @handles that are relevant to a given topic. If a Tweet contains the text we’ve specified, it will be labeled appropriately. This differs from a machine learning approach where a model is trained specifically to classify text (in this case, Tweets) and produce a probability score alongside the output/classification.
Twitter’s annotations are curated by domain experts using research and QA processes that have been refined over the course of several years. The process is supported by custom tooling that lets us scale data tracking as far as we can while maintaining excellent precision and recall. In addition, our data is audited regularly by an internal team and has received a precision score of ~80% for the past several quarters.
Team members QA our entities on a daily basis to ensure high precision and recall. Additionally, our work is audited quarterly by an internal team, which manually reviews 10,000 Tweets across all of our domains to calculate a precision score.
For some domains, like sports and TV, we rely on automated ingest to build out our graph. In the News domain, we track data around stories published by the Twitter Moments team. Otherwise, the team uses a variety of research methods to identify topics to track that garner a high amount of conversation on the platform.
Data tracking begins the moment an entity is published; therefore, we do not annotate Tweets that were published prior to a given entity being tracked. For example, if an upstart brand/company is added to the taxonomy, we will not retroactively annotate Tweets about that brand prior to when the annotation was added.
Context annotations are available in multiple languages. Language coverage can vary depending on the domain and the market. English and Japanese are included in the majority of the biggest entities. Below is a list of the languages and main markets that are covered today:
  1. English (US, UK)
  2. Japanese (Japan)
  3. Portuguese (Brazil)
  4. Spanish (Argentina, Mexico, Spain)
  5. Hindi (India)
  6. Arabic (Saudi Arabia)
  7. Turkish (Turkey)
  8. Indonesian (Indonesia)
  9. Russian (Russia)
  10. French (France)
Coming soon (~H2 2021):
  1. German (Germany)
  2. Tamil (India)
Below is a table of the top 15 countries ordered by the most coverage of annotated Tweets:
Rank  Country code  Country         % of Tweets annotated
1     IN            India           41%
2     VN            Vietnam         36%
3     GB            Great Britain   36%
4     EC            Ecuador         35%
5     PE            Peru            33%
6     US            United States   32%
7     CA            Canada          32%
8     AU            Australia       31%
9     JP            Japan           31%
10    PH            Philippines     30%
11    SG            Singapore       30%
12    MY            Malaysia        30%
13    MX            Mexico          30%
14    GB            Great Britain   29%
15    NG            Nigeria         29%
Tweet annotations consist of the following semantics to annotate a Tweet:
  • Accounts - we can annotate Tweets from a given handle or Tweets that mention this handle
  • Hashtags
  • Keywords/phrases
For customers that are familiar with the filtered streaming APIs such as PowerTrack, the semantics used by annotations are similar in principle to the boolean rules defined to filter a stream of Tweets. If a Tweet matches the underlying semantic conditions, it will be tagged accordingly.
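
To make the idea of rule-based tagging concrete, here is a toy sketch of a matcher built on the three semantics listed above (accounts, hashtags, keywords/phrases). It is an illustration only and not the actual annotation system: the TopicRule class, the "Example Cola" topic, and its handle and hashtag are all made up, and real rules are presumably far richer.

```python
import re
from dataclasses import dataclass, field
from typing import Set


@dataclass
class TopicRule:
    """A hypothetical topic rule; all values are expected in lowercase."""
    name: str
    handles: Set[str] = field(default_factory=set)
    hashtags: Set[str] = field(default_factory=set)
    keywords: Set[str] = field(default_factory=set)

    def matches(self, author: str, text: str) -> bool:
        """True if the author, a mention, a hashtag, or a keyword matches."""
        lowered = text.lower()
        mentions = {m.lower() for m in re.findall(r"@(\w+)", text)}
        tags = {t.lower() for t in re.findall(r"#(\w+)", text)}
        return (
            author.lower() in self.handles       # Tweet from a tracked handle
            or bool(mentions & self.handles)     # Tweet mentioning the handle
            or bool(tags & self.hashtags)        # tracked hashtag
            or any(k in lowered for k in self.keywords)  # keyword or phrase
        )


rule = TopicRule(
    name="Example Cola",
    handles={"examplecola"},
    hashtags={"examplecola"},
    keywords={"example cola"},
)
print(rule.matches("someone", "Trying the new #ExampleCola flavor"))
print(rule.matches("someone", "Nothing topical here"))
```

A Tweet that satisfies none of the conditions is simply left untagged, which mirrors why Tweets that are not semantically rich enough go unannotated.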
The goal is to annotate as many Tweets as possible; however, there are several reasons why some Tweets are not annotated:
  • Some Tweets aren’t semantically rich enough to be labelled and can’t be tagged with our current annotation rules
  • Some Tweets aren’t topical
  • The Tweet is about a very ephemeral topic that’s not in our graph
  • We don’t cover the language/market
  • We cover the language/market but we’re missing a topic or a specific term/account/hashtag related to a topic we already track
An entity can be part of multiple domains. The domain IDs will change but the entity ID remains the same. Donald Glover is a person (domain 10), an actor (domain 56) and a musician (domain 54) but his entity ID is still 875072662527029248.
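
Because the entity ID is stable across domains, it can serve as a grouping key. The sketch below collects the domain IDs seen for each entity ID; the function name domains_by_entity is hypothetical, and the input reuses the Donald Glover example from the paragraph above in the shape of the context_annotations field.

```python
from collections import defaultdict
from typing import Dict, List, Set


def domains_by_entity(context_annotations: List[Dict]) -> Dict[str, Set[str]]:
    """Map each entity ID to the set of domain IDs it appears under."""
    grouped = defaultdict(set)
    for item in context_annotations:
        entity = item.get("entity")
        if entity:  # skip annotations that carry only a domain
            grouped[entity["id"]].add(item["domain"]["id"])
    return dict(grouped)


glover = {"id": "875072662527029248", "name": "Donald Glover"}
context_annotations = [
    {"domain": {"id": "10", "name": "Person"}, "entity": glover},
    {"domain": {"id": "56", "name": "Actor"}, "entity": glover},
    {"domain": {"id": "54", "name": "Musician"}, "entity": glover},
]

print(domains_by_entity(context_annotations))
```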
For movies, tracking starts a month prior to the release. For popular blockbusters, like a Marvel movie, we can start tracking them as soon as an upcoming release is teased.
No, they do not.
I