The Brain proprietary machine learning platform is used to generate a daily stock ranking based on the predicted future returns of a universe of the 1000 largest U.S. stocks over five time horizons: 2, 3, 5, 10 and 21 trading days. The universe is updated yearly.
The model implements a voting scheme of machine learning classifiers that nonlinearly combine a variety of features, together with a series of techniques aimed at mitigating the well-known overfitting problem for financial data with a low signal-to-noise ratio.
Model inputs include stock-specific features such as fundamentals and price- and volume-related metrics, market data such as volatility and other financial stress indicators, and calendar-related signals such as day-of-week or month anomalies.
This dataset contains historical data from August 2016 onward and live data updated daily by 4 a.m. UTC.
It is important to note that the ranking score is meaningful only when used to compare different stocks in order to rank them. For example, a typical use case is to download the stock ranking for a large stock universe for a given day, e.g. 500 stocks or the full universe of 1000 stocks, then order the stocks by ranking score (field "ML_ALPHA", see the field descriptions below or the data dictionary) and go long the top K stocks, or build a long-short strategy going long the top K and short the bottom K stocks.
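The use case above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `long_short_baskets` and the sample rows are hypothetical, the rows are assumed to have already been fetched for a single CALCULATION_DATE, and a higher ML_ALPHA is assumed to indicate a stronger expected outperformance.

```python
from typing import Dict, List, Tuple


def long_short_baskets(rows: List[Dict], k: int) -> Tuple[List[str], List[str]]:
    """Return (long_tickers, short_tickers) for one calculation date.

    Assumption: a higher ML_ALPHA means a better rank.
    """
    if k <= 0 or 2 * k > len(rows):
        raise ValueError("k must be positive and at most half the number of stocks")
    # Sort by score, best first; ties are broken by ticker for determinism.
    ranked = sorted(rows, key=lambda r: (-r["ML_ALPHA"], r["TICKER"]))
    longs = [r["TICKER"] for r in ranked[:k]]    # top K stocks
    shorts = [r["TICKER"] for r in ranked[-k:]]  # bottom K stocks
    return longs, shorts


# Made-up rows for illustration only; not real data from the dataset.
sample = [
    {"TICKER": "AAA", "ML_ALPHA": 0.12},
    {"TICKER": "BBB", "ML_ALPHA": -0.05},
    {"TICKER": "CCC", "ML_ALPHA": 0.30},
    {"TICKER": "DDD", "ML_ALPHA": -0.21},
    {"TICKER": "EEE", "ML_ALPHA": 0.02},
]
longs, shorts = long_short_baskets(sample, k=2)
```

Only the ordering of the scores is used here, not their magnitude, which is consistent with the note that the score is meaningful only for comparing stocks.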
The main schema is called "STOCK_RANKING" and contains six tables: five ranking tables, one for each prediction time horizon, and one table with the stock universe:
- "STOCK_RANKING_NEXT_DAYS_2" contains the stock rankings based on the predicted future returns for the next 2 trading days
- "STOCK_RANKING_NEXT_DAYS_3" contains the stock rankings based on the predicted future returns for the next 3 trading days
- "STOCK_RANKING_NEXT_DAYS_5" contains the stock rankings based on the predicted future returns for the next 5 trading days
- "STOCK_RANKING_NEXT_DAYS_10" contains the stock rankings based on the predicted future returns for the next 10 trading days
- "STOCK_RANKING_NEXT_DAYS_21" contains the stock rankings based on the predicted future returns for the next 21 trading days
- "STOCK_UNIVERSE" contains the stock universe, i.e. the set of stocks for which the system provides a prediction on the given date. The stock universe is updated annually. In general, approximately 98% of the stock universe is covered on any given day.
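As a small illustration of the table layout, the sketch below maps a horizon to its ranking table name and computes the share of the universe covered on a given day. The helper names `ranking_table` and `coverage` are hypothetical, and the inputs are plain ticker lists because the columns of "STOCK_UNIVERSE" are not described here.

```python
from typing import Iterable

# Prediction horizons (in trading days) that have a ranking table.
HORIZONS = (2, 3, 5, 10, 21)


def ranking_table(horizon_days: int) -> str:
    """Return the ranking table name for a supported horizon."""
    if horizon_days not in HORIZONS:
        raise ValueError(f"unsupported horizon: {horizon_days}")
    return f"STOCK_RANKING_NEXT_DAYS_{horizon_days}"


def coverage(universe_tickers: Iterable[str], ranked_tickers: Iterable[str]) -> float:
    """Fraction of the universe that has a ranking score on a given day."""
    universe = set(universe_tickers)
    if not universe:
        raise ValueError("universe is empty")
    return len(universe & set(ranked_tickers)) / len(universe)


# Made-up tickers for illustration only.
table = ranking_table(5)
share = coverage(["A", "B", "C", "D"], ["A", "B", "C", "X"])
```

A coverage check of this kind may be useful before building a portfolio, since a stock in the universe can be missing from the ranking on a particular day.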
The key fields of the stock ranking tables are:
- CALCULATION_DATE: The calculation date for the stock ranking score in format YYYY-MM-DD.
- COMPOSITE_FIGI: The composite FIGI code (https://www.openfigi.com) that uniquely identifies the company stock across related exchanges in the US.
- TICKER: The stock ticker.
- ML_ALPHA: Score related to the predicted return over the next N trading days, where N = 2, 3, 5, 10 or 21 depending on the selected table. More specifically, ML_ALPHA is related to the confidence of a machine learning classifier in assigning the stock to class 0 (underperforming the median of the universe over the next N days) or class 1 (outperforming the median of the universe over the next N days). The ranking score is meaningful only when used to compare different stocks in order to rank them. A typical use case is to download the stock ranking for a large stock universe for a given day, e.g. 500 stocks or the full universe of 1000 stocks, then order the stocks by ML_ALPHA score and go long the top K stocks, or build a long-short strategy going long the top K and short the bottom K stocks.
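To make the two classes concrete, the following sketch labels each stock by comparing its realized N-day return with the median of the universe. It illustrates the class definition only and is not Brain's labeling code: the function name `median_class_labels` is hypothetical, the returns are made up, and assigning a stock exactly at the median to class 0 is an assumption.

```python
from statistics import median
from typing import Dict


def median_class_labels(forward_returns: Dict[str, float]) -> Dict[str, int]:
    """Label each stock 1 if its N-day return beats the universe median, else 0.

    Assumption: a return exactly equal to the median is labeled 0.
    """
    med = median(forward_returns.values())
    return {ticker: int(ret > med) for ticker, ret in forward_returns.items()}


# Made-up N-day returns for illustration only.
returns = {"AAA": 0.04, "BBB": -0.01, "CCC": 0.02, "DDD": -0.03}
labels = median_class_labels(returns)
```

Because the classes are defined relative to the universe median, a stock with a negative return can still fall in class 1 when most of the universe did worse, which is another reason to read ML_ALPHA as a relative rather than absolute signal.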
DISCLAIMER
The content of this dataset is not intended as investment advice. The material is provided for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation or endorsement for any security or strategy, nor does it constitute an offer to provide investment advisory or other services by Brain. Brain makes no guarantees regarding the accuracy and completeness of the information expressed in the dataset.