The Brain Sentiment Indicator extracts sentiment and related metrics, such as buzz and volume, from public financial news for 6000+ US stocks, drawing on many thousands of financial media sources in 33 languages.
The sentiment scoring technology is based on a combination of various natural language processing techniques. The sentiment score assigned to each stock is a value ranging from -1 (most negative) to +1 (most positive) and is updated daily. For each stock, the sentiment score is the average of the sentiment of the individual news articles, and it is available on two time scales: 7 days and 30 days.
Additional fields measuring the number of stories published and the level of attention (buzz) received from financial media are also available.
This dataset contains historical data from August 2016 until July 2022 for trial purposes.
In production mode, the live dataset is updated daily, with new files delivered by 6am UTC.
The main schema is called SENTIMENTS and contains two tables:
- SENTIMENTS_DAYS_7 contains the metrics calculated from the news articles of the past 7 days
- SENTIMENTS_DAYS_30 contains the metrics calculated from the news articles of the past 30 days
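To make the two horizons concrete, the following is a minimal Python sketch of how a 7-day or 30-day window could map onto a set of articles. It is not Brain's implementation: the function names `news_window` and `window_metrics`, the input shape (pairs of UTC timestamp and per-article sentiment), and the assumption that the window covers the N full UTC days ending the day before the calculation date are all illustrative.

```python
from datetime import date, datetime, time, timedelta, timezone
from statistics import mean


def news_window(calculation_date, days):
    """Return the half-open UTC window [start, end) covering the `days`
    full days before `calculation_date` (assumed boundary convention)."""
    end = datetime.combine(calculation_date, time.min, tzinfo=timezone.utc)
    start = end - timedelta(days=days)
    return start, end


def window_metrics(articles, calculation_date, days):
    """Compute (volume, average sentiment) for one company.

    `articles` is a hypothetical iterable of (published_at_utc, sentiment)
    pairs, with sentiment in [-1, 1]. Returns sentiment None if no article
    falls inside the window.
    """
    start, end = news_window(calculation_date, days)
    scores = [score for published_at, score in articles if start <= published_at < end]
    volume = len(scores)
    sentiment = mean(scores) if scores else None
    return volume, sentiment
```

Under these assumptions, `window_metrics(articles, date(2018, 10, 10), 7)` would mimic a row of SENTIMENTS_DAYS_7, and the same call with `30` a row of SENTIMENTS_DAYS_30.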
Some of the key fields in each table are:
- CALCULATION_DATE: The calculation date for the sentiment score and related metrics, in the format YYYY-MM-DD. For example, a CALCULATION_DATE equal to '2018-10-10' means that for each company the metrics are calculated from the news articles of the previous time interval (7 days or 30 days, depending on the selected table). For the 7-day horizon, this corresponds to the news articles between 2018-10-03 00:00:00 UTC and 2018-10-09 23:59:59 UTC.
- NAME: The company name.
- COMPOSITE_FIGI: The FIGI composite code (https://www.openfigi.com) that identifies the stock across related exchanges in the same country.
- PRIMARY_EXCHANGE_TICKER: The stock ticker on the primary exchange, e.g. AAPL.
- VOLUME: Number of news articles detected for the company in the previous time interval (7 days or 30 days, depending on the selected table).
- SENTIMENT_SCORE: Sentiment score from -1 to 1, where 1 is the most positive and -1 the most negative. The sentiment score is calculated as the average sentiment of the news articles collected for the specific company in the previous time interval (7 days or 30 days, depending on the selected table).
- BUZZ_VOLUME: Buzz score that quantifies how much attention, in terms of news VOLUME, a company is receiving compared to its past. It is based on the VOLUME distribution of the past six months: the buzz is the current VOLUME minus the average VOLUME over the past six months, expressed in units of standard deviations. A value close to 0 means that the stock is covered by a VOLUME of stories similar to its past average; a value larger than 0 indicates how many standard deviations the current VOLUME is above that average. The value is reported only if there are enough stories in the past to estimate a reliable value.
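The BUZZ_VOLUME description amounts to a z-score of the current VOLUME against its own six-month history. A minimal Python sketch follows, under stated assumptions: the function name `buzz_volume`, the `min_history` threshold standing in for "enough stories in the past", and the use of the population standard deviation are all illustrative choices, since the dataset description does not specify them.

```python
from statistics import mean, pstdev


def buzz_volume(current_volume, past_volumes, min_history=30):
    """Return (current - mean) / std over `past_volumes`, or None.

    `past_volumes` is assumed to hold the VOLUME values observed over the
    past six months. `min_history` is a hypothetical cutoff below which no
    value is reported.
    """
    if len(past_volumes) < min_history:
        return None  # not enough history for a reliable estimate
    average = mean(past_volumes)
    spread = pstdev(past_volumes)  # assumption: population standard deviation
    if spread == 0:
        return None  # constant history: the z-score is undefined
    return (current_volume - average) / spread
```

For instance, with a history alternating between 8 and 12 stories (average 10, standard deviation 2), a current VOLUME of 14 gives a buzz of 2, i.e. two standard deviations above the usual coverage.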
DISCLAIMER
The content of this dataset is not intended as investment advice. The material is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory or other services by Brain. Brain makes no guarantees regarding the accuracy and completeness of the information expressed in the dataset.