Benefits: Our data is based on qualitative signals in the text itself, not on metadata. We indicate how likely a piece of content is to be computer-generated or human-written.

Format & attributes: CSV with UID + score + confidence.

Coverage: Currently ~4,000 news outlets (can be customised/expanded on request). English and Spanish available.

Scale: From 1 to 100,000 rows at a time.

Content: Any text in English or Spanish, from a single sentence to articles thousands of words in length.

Data uniqueness: We use custom-built and trained NLP algorithms to assess human-effort metrics that are inherent in text content. We focus on what's in the text, ignoring metadata such as author, publication, date, word count, shares and so on, to provide a clean and maximally unbiased assessment of how much human effort has been invested in content. Our AI algorithms are co-created by NLP and journalism experts, and our datasets have all been human-reviewed and labeled.

Dataset: Provided in CSV/RSS/JSON format; one row = one scored article. Each row contains the URL and/or body text, with the attributed score as an integer and model confidence as a percentage. Integrity indicators are provided as integers on a 1–5 scale. Custom models with 35 categories can be added on request.

Data sourcing: Public websites, crawlers, scrapers and other partnerships where available. We can generally assess content both behind and outside paywalls. We source from ~4,000 news outlets, including Bloomberg, CNN and the BBC.

Countries: All English-speaking markets worldwide, including English-language content from non-English-majority regions such as Germany, Scandinavia and Japan. Also available in Spanish on request.

Use cases: Assessing the implicit integrity and reliability of an article.
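To illustrate the dataset layout described above, here is a minimal sketch of consuming one of the CSV deliveries: one row per scored article, with URL and/or body text, an integer score, model confidence as a percentage, and a 1–5 integrity indicator. The column names and sample values are illustrative assumptions, not the vendor's actual schema.

```python
import csv
import io

# Hypothetical sample in the described format; headers and values are
# assumptions for illustration only, not the real delivery schema.
sample = """uid,url,score,confidence,integrity
a1b2,https://example.com/feature-article,87,96,5
c3d4,https://example.com/wire-copy,12,91,2
"""

rows = list(csv.DictReader(io.StringIO(sample)))
for row in rows:
    score = int(row["score"])             # attributed score as an integer
    confidence = int(row["confidence"])   # model confidence, percentage
    integrity = int(row["integrity"])     # integrity indicator, 1-5 scale
    print(row["uid"], score, confidence, integrity)
```

Because each row is a self-contained scored article, batches of anywhere from 1 to 100,000 rows can be processed with the same loop.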
There is a correlation between integrity and human value: we have shown that articles scoring highly on our scales sustain higher end-user engagement over time. Clients also use this to assess journalistic output and publication relevance, and to create datasets of 'quality' journalism. Overtone provides a range of qualitative metrics for journalistic, newsworthy and long-form content. We find, highlight and synthesise content that shows added human effort and, by extension, added human value.