World Quant Brain Alpha Documentation
Table of Contents
1. Research Consultant
2. Finance Basics
3. Quantitative Analysis
4. How to Use the Brain Platform
5. Methods of Analyzing the Stock Market
The objective of this guide is to provide a basic overview of the Brain consultant program
and simple bite-sized concepts to help you get started on alpha making. This is by no
means an exhaustive list, but after completing this guide, you will have sufficient
knowledge to try your hand at simulating ideas. It will be a learning process, but many have
done this before, so you can do it too, regardless of your background. Good luck!
How to Use the BRAIN Platform – Fast Expression, Operator and Data Field
Research Consultant
Become a Consultant with us
A Global Community
Who is eligible?
Users need to score at least 10,000 points on the WorldQuant Challenge to be eligible for
this role. At present, we are offering this opportunity exclusively to residents of:
Mainland China
Hong Kong
Taiwan
Kenya
Korea
Indonesia
India
Malaysia
Singapore
UK
Vietnam
Thailand
USA
Finance Basics
The stock market refers to public markets that exist for issuing, buying, and selling stocks
that trade on a stock exchange or over-the-counter. Stocks, also known as equities,
represent fractional ownership in a company, and the stock market is a place where
investors can buy and sell ownership of such investible assets [1].
Seeking Returns
Investors seek to profit from stocks by selling them at a higher price than they paid.
For example, if an investor buys shares of a
company’s stock at $10 a share and the price of the stock subsequently rises to $15 a
share, the investor can then realize a 50% profit on their investment by selling their
shares [1]. WorldQuant defines Returns as the return on capital traded: annualized PnL
divided by half of the book size. It signifies the amount made or lost during the period
and is expressed as a percentage.
Long a Stock
Taking a long position in a stock simply means buying it, and if the stock increases in
value, you will make money.
Short a Stock
On the other hand, taking a short position in a stock means borrowing shares that you
do not own, usually from your broker, selling them, and hoping that the stock declines in value.
When that happens, you can buy the shares back at a lower price than you sold them for and
return the borrowed shares to your broker, keeping the difference.
Defining Volume
Volume is the amount of an asset or security that changes hands over some period of
time, often over the course of a day. For instance, stock trading volume could refer to the
number of shares of a security traded between its daily open and close. Trading volume,
and changes to volume over the course of time, are important inputs for technical traders
[2].
The opening price is the price at which a security first trades upon the opening of an
exchange on a trading day. The closing price is the price of the final trade before the close
of the trading session. These prices are important because they are used to create
traditional line stock charts, as well as when calculating moving averages and other
technical indicators [3, 4].
[2] What is volume of a stock, and why does it matter to investors? (2003, November 23).
Investopedia.
https://www.investopedia.com/terms/v/volume.asp#:~:text=Volume%20is%20the%20amount%20of,its%20daily%20open%20and%20close
[4] Opening price: Definition, example, trading strategies. (2005, July 3). Investopedia.
https://www.investopedia.com/terms/o/openingprice.asp#:~:text=The%20opening%20price%20is%20the,is%20its%20daily%20opening%20price
Quantitative Analysis
There are many methods to determine whether to long (buy) a stock or to short it.
Quantitative analysis (QA) in finance is an approach that emphasizes mathematical and
statistical analysis to help determine the value of a stock. Quantitative trading analysts
(also known as "quants") use a variety of data, including historical investment and stock
market data, to develop trading algorithms and computer models. The information
generated by these computer models helps investors analyze investment opportunities
and develop what they believe will be a successful trading strategy [5].
Utilizing the quantitative analysis approach, the BRAIN platform is a web-based simulator
of global financial markets that was created to explore Alpha research. It accepts an Alpha
expression as input and plots its Profit and Loss (PnL) as output.
The input expression is evaluated for each financial instrument, every day over historical
dates, and a portfolio is constructed accordingly. BRAIN platform invests in each financial
instrument according to the value of the expression. It takes positions (either buying or
short selling) and assigns weights to each instrument.
What is an Alpha?
Weights
Imagine market data being a matrix, with each row representing one date and each
column representing one stock. For example, the matrix for close price data could look
like this:
The role of the Alpha expression is to transform the input matrix to an output vector of
weights, with each weight corresponding to one of the stocks. The Alpha output vector,
having weights as values corresponding to each instrument in the Universe, could look
something like this:
Table: Output vector to indicate the direction as well as sizing for company A, B and C.
Once we have got the weights of the stock from the Alpha expression, the next step is to
get each day’s profit and loss (PnL).
From the above table, I have weight_A = 0.2, weight_B = -0.5 and weight_C = 0.3. Now the
amount of money I have got to invest is called the "booksize". Suppose my booksize is
100 USD. So I calculate the money I want to invest in each of the stocks:
Now, I buy 20 USD worth of A, sell 50 USD worth of B and buy 30 USD worth of C. Now I
have a portfolio, which is worth a total of 100 USD.
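The allocation step described above can be sketched in a few lines of Python. This is an illustrative sketch rather than BRAIN's implementation: the function name `dollar_positions` is hypothetical, and the weights and the 100 USD booksize are the ones from the example.

```python
def dollar_positions(weights, booksize):
    """Scale each weight by the booksize; negative amounts are short positions."""
    return {name: w * booksize for name, w in weights.items()}

# Weights and booksize from the example in the text.
weights = {"A": 0.2, "B": -0.5, "C": 0.3}
positions = dollar_positions(weights, booksize=100.0)

# Gross exposure adds up long and short amounts without their sign.
gross_exposure = sum(abs(amount) for amount in positions.values())

for name, amount in positions.items():
    side = "buy" if amount > 0 else "sell"
    print(f"{side} {abs(amount):.0f} USD of {name}")
print(f"gross exposure: {gross_exposure:.0f} USD")
```

The 100 USD total is the sum of the absolute position sizes, which is why a mix of long and short positions can still add up to the full booksize.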
I keep this portfolio for one full day, and sell it the next day in the simulation period. Now
in one day, the prices of stocks A, B, C have changed. So the total value of my portfolio
has also changed, say from 100 USD to 105 USD. So, I have made a profit of 5 USD on that
day.
Now I again calculate the Alpha values for the stocks, and again calculate weights, and
again trade 100 USD worth of portfolio. [Note: In BRAIN platform, we use constant book
size for all the days, regardless of whether your portfolio makes money or loses money.]
This is repeated for each day in the simulation period to calculate and plot the cumulative
PnL.
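The daily loop described above can be sketched as follows. This is a simplified illustration, not the simulator's code: the function names are hypothetical, the day-1 weights come from the example, and all returns and the day-2 weights are made up. It assumes the PnL of a position is its dollar amount times the stock's one-day return.

```python
def daily_pnl(weights, returns, booksize):
    """One day's PnL: dollar position in each stock times that stock's return."""
    return sum(weights[s] * booksize * returns[s] for s in weights)

def cumulative_pnl(daily_weights, daily_returns, booksize):
    """Trade the same booksize every day and accumulate the daily PnL."""
    total = 0.0
    curve = []
    for weights, returns in zip(daily_weights, daily_returns):
        total += daily_pnl(weights, returns, booksize)
        curve.append(total)
    return curve

# Day 1 uses the weights from the text; the returns and day-2 weights are made up.
daily_weights = [
    {"A": 0.2, "B": -0.5, "C": 0.3},
    {"A": 0.4, "B": -0.4, "C": 0.2},
]
daily_returns = [
    {"A": 0.05, "B": -0.05, "C": 0.05},
    {"A": -0.01, "B": 0.01, "C": 0.02},
]
curve = cumulative_pnl(daily_weights, daily_returns, booksize=100.0)
print(curve)
```

With these made-up returns, the first day earns 5 USD as in the example. The booksize stays at 100 USD on the second day even though the portfolio made money on the first.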
A good Alpha would ideally have consistently increasing PnL, high Annual Return, and
more importantly, few fluctuations in the cumulative profit graph. If the standard
deviation is low, there will be fewer fluctuations in the graph. If the graph shows high
fluctuations/volatility, despite the returns being high, the Alpha will not be deemed good
enough.
WorldQuant aims to develop equity long-short market neutral alphas that have low
volatility and risk. Such investments are attractive because they are expected to produce
substantially better risk-adjusted returns than long-only portfolios. Equity long-short
market neutral strategy is used commonly by hedge funds, with the goal of minimizing
exposure to the market and profiting from the changes in the spread between two stocks.
Enough theory for now, let’s delve into how to write your first alpha on the BRAIN
platform.
To all the non-coders, the good news of using BRAIN is that no prior coding experience is
required.
BRAIN uses the fast expression language that consists of two main elements: Data fields
and Operators.
One of our users has asked us: Are there plans to allow users to utilize Python/MATLAB/R
to interact with the BRAIN API, to analyze datasets and submit the alpha vectors?
Our reply: the BRAIN platform is currently available for use only with the Fast
Expression language. Regarding APIs, we currently do not prohibit programmatic access
to the BRAIN platform when API communication is conducted with low intensity.
Data fields refer to a named collection of data, for example 'open price' or 'close price'.
Datasets are a collection of data fields. For example, ‘open price’ and ‘close price’ can be
found in the price volume dataset here. Most users typically start out with price volume
and fundamental datasets.
Operators
Here are some common examples of alphas built using data fields and operators:
You can try out some sample alphas by clicking on the Example button (bottom left
corner) on the simulator page. Leverage the hint and test out a few simulations! Click
here to try now: Simulate Page
You might ask us, how do you come up with ideas for new alphas? This section will take
you through two alpha ideas utilizing technical analysis and fundamental analysis, and
explain the thought process behind them.
Technical Analysis
Across the industry, there are hundreds of patterns and signals that have been developed
by researchers to support technical analysis trading. These include trend lines, channels,
moving averages, and momentum indicators [6].
Volume as an indicator
If a company’s stock has high volume, it means that many people are buying and selling
the stock. Suppose our hypothesis is that the company with more shares traded is more
desirable than another company with low volume. We will then allocate more weight to
the company with higher volume.
One way to express this idea is through the alpha expression here:
volume
Fundamental Analysis
Analysts could compare the company’s growth rates to the industry and sector that it
operates in, along with the other information provided, to see if the company is valued
correctly [7].
Inventory Turnover
Financial ratios are ratios of fundamental data which give insights into the health and
investment decisions of the company. One common financial ratio is the inventory
turnover. It is a type of activity ratio that measures how quickly a company sells through
and replaces its inventory. It is calculated as:
The hypothesis is that a stock with higher inventory turnover ratio will have a better
performance and thus be allocated more weight.
The alpha expression is as such:
inventory_turnover
Additionally, there are two frequently used operator categories: time series and cross
sectional.
Time series analysis can be useful to see how a given variable changes over time.
Suppose you wanted to analyze a time series of daily closing stock prices for a given stock
over a period of one year. You could obtain a list of all the closing prices for the stock
from each day for the past year and analyze the time series data with technical analysis
tools to know whether the stock’s time series shows any seasonality. This will help you to
determine if the stock goes through peaks and troughs at regular times each year [8].
Alternatively, you can use cross-sectional analysis, where you compare a particular
company to its industry peers. Cross-sectional analysis may focus on a single company
for head-to-head analysis with its biggest competitors or it may approach it from an
industry-wide lens to identify companies with a particular strength. Essentially, cross-
sectional analysis shows an investor which company is best given the metrics you care
about [9].
We hope you’ve enjoyed this guide! To get started, you may click on the Example Button
(bottom left corner) on the Simulate Page. There you will find some sample alphas with
hints to improve them.
If you have any research related questions, you can check out our Community forum and
post your questions there.
[6] Technical analysis: What it is and how to use it in investing. (2003, November 24).
Investopedia. https://www.investopedia.com/terms/t/technicalanalysis.asp
[7] Fundamental analysis: Principles, types, and how to use it. (2003, November 23).
Investopedia. https://www.investopedia.com/terms/f/fundamentalanalysis.asp
[9] What is cross sectional analysis and how does it work? (2007, May 21). Investopedia.
https://www.investopedia.com/terms/c/cross_sectional_analysis.asp
Introduction to Alphas
Table of Contents
1. Alphas
2. Alpha Lifecycle
3. Weights
4. Assigning weights
5. Positive and Negative weights
6. BRAIN platform
BRAIN platform is a web-based tool for backtesting trading ideas. An Alpha is a concrete
trading idea that can be simulated historically.
Alphas
In BRAIN platform, an 'Alpha' refers to a mathematical model or strategy, written as an
expression, which places different bets (weights) on different instruments (stocks), and is
expected to be profitable in the long run. After a user enters an Alpha expression that
consists of data, operators and constants, the input code is evaluated for each instrument
to construct a portfolio. Then BRAIN platform makes investments in each instrument for a
one-day period in proportion to the values of the expression. The process repeats each
day.
Alpha Lifecycle
First, one might peruse blogs, journals and research papers on the internet to come up
with an idea. The Alpha expression is entered in BRAIN platform and operations (like
truncation, neutralization, decay) are performed on the raw Alpha. BRAIN platform makes
investments (goes long or short) for all the instruments of the universe chosen in the
Settings panel and the PnL is simulated. Then the performance is calculated (Sharpe,
Turnover, Returns) as seen in the Simulation Results page. If the Alpha is not deemed
worthy, the Alpha idea is revised. Otherwise, it enters production.
Weights
In simple terms, BRAIN platform uses an Alpha to create a vector of weights, with each
weight corresponding to one of the stocks in the selected universe. These weights may or
may not be market neutralized, as per your neutralization setting (by market, industry,
sub-industry or none). This creates a portfolio for each day in the simulation period,
which can then be used to calculate that day's Profit and Loss (PnL).
Assigning weights
Suppose in the Expression box, you type in 1/close, and set the simulation settings as
follows: Region = US, Universe = TOP3000, Delay = 1, Decay = 0, Neutralization = None,
Truncation = 0. Now, once you hit "Simulate" button, then for each day in the Simulation
Duration (5 years), the simulator does the following:
It calculates 1/close (using the closing price for the previous day), for each instrument in
the basket "US: TOP3000" (i.e. top 3000 stocks in the US, by market capitalization). This
creates a vector of 3000 values (one for each stock). This vector is then normalized, i.e.
divided by the sum of its values (so that all the values sum up to 1). This creates a vector
of "weights" for all the stocks, which is called a "Portfolio". Each weight represents the
fraction of money invested in that stock. If our booksize is $20 Million, then the money
invested in each stock is $20M x (weight of that stock in the portfolio). This is done for
each day in the simulation period, and at the end of each day the total profit or loss made
by our portfolio is calculated.
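The normalization step above can be illustrated with a small sketch. It is not the simulator's code: the function name `inverse_close_weights` is hypothetical, the three-stock universe and its prices are made up, and only the $20 million booksize comes from the text.

```python
def inverse_close_weights(close_prices):
    """Compute 1/close per stock, then divide by the sum so the weights add up to 1."""
    raw = {stock: 1.0 / price for stock, price in close_prices.items()}
    total = sum(raw.values())
    return {stock: value / total for stock, value in raw.items()}

# Hypothetical previous-day closing prices for a three-stock universe.
previous_close = {"A": 10.0, "B": 20.0, "C": 40.0}
weights = inverse_close_weights(previous_close)

booksize = 20_000_000  # the $20 million booksize mentioned in the text
dollars = {stock: w * booksize for stock, w in weights.items()}
print(weights)
print(dollars)
```

Because the expression is 1/close, the cheapest stock receives the largest weight.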
It is easy to invest $100 in stock, but negative positions (shorts) are common too. E.g. one
can get $100 now by shorting a stock (i.e. investing -$100), which must be bought back
later. PnL would be the opposite of $100 invested, as seen in the table below. Negative
weights are called short positions and positive weights are called long positions. Typically
investors take short positions when they expect stock price to decrease and long
positions (i.e. buying stocks) when they expect price to increase. Please refer to
Investopedia for more details on short selling.
The below table gives an example of payoffs from a $100 short and a $100 long position,
for a 1% price change in each direction. For simplicity, it does not account for dividends,
margin and financing costs.
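The four payoffs can be worked out with a short sketch. It makes the same simplification as the text (no dividends, margin or financing costs) and treats a short position as a negative dollar amount; the function name `position_pnl` is hypothetical.

```python
def position_pnl(dollars_invested, price_change):
    """PnL of a position; a short is a negative dollar amount."""
    return dollars_invested * price_change

for label, dollars in (("long $100", 100.0), ("short $100", -100.0)):
    for change in (0.01, -0.01):
        pnl = position_pnl(dollars, change)
        print(f"{label}, price {change:+.0%}: PnL {pnl:+.2f} USD")
```

The short position gains when the price falls and loses when it rises, the mirror image of the long position.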
BRAIN platform
BRAIN platform is a web-based simulator of global financial markets that was created to
explore Alpha research. It accepts an Alpha expression as input and plots its Profit and
Loss (PnL) as output. The input expression is evaluated for each financial instrument,
every day over historical dates, and a portfolio is constructed accordingly. BRAIN platform
invests in each financial instrument according to the value of the expression. It takes
positions (either buying or short selling) and assigns weights to each instrument. The
weights are then scaled to book size (amount of money invested), based on which a PnL
graph is plotted. These weights are not constant; they change over time based on current
information and the history of the changes of some variables (such as prices, volumes,
etc.).
Table of Contents
1. What is Fast Expression?
2. Characteristics of Fast Expression
Data fields
Operators
3. Further Knowledge of Fast Expression
The goal of using “Fast expression” on BRAIN is to provide a clear and concise way to
express complex ideas and algorithms that can be easily understood by other developers
and researchers. By abstracting away the details of the underlying implementation, it can
allow BRAIN users to focus on the high-level logic of their algorithms, rather than getting
bogged down in the implementation details.
Just like how an English sentence consists of a subject, verb and object; Fast expression
can include data fields, operators and numerical values.
Data fields
Data fields refer to a named collection of data, for example 'open price' or 'close price'.
Operators
/* helps to create block comments that span multiple lines of text, while */ denotes
the end of the comment. Comments consist of explanatory text to help understand
what the code does. [1]
; (semicolon) acts like a full stop in a sentence, separating the end of one sentence
from the beginning of another sentence. For the last line of the code (line 13) ; is not
needed. [2]
The last sentence of the entire expression is the alpha expression that the BRAIN
simulator uses to calculate the positions to take in each stock. [3]
Lastly, Fast expression does not have classes, objects, pointers, or functions.
In summary, Fast expression provides a clear and concise way for users to express
complex ideas and algorithms. Don’t worry if you’re not familiar with Fast expression yet.
With a bit of practice, we believe you’ll pick it up in no time!
Table of Contents
1. Understanding Your Results
2. Passing IS Stage and Troubleshooting
3. Common Error Messages
This Intermediate guide aims to further your understanding of the alphas you have
simulated. The documentation will provide you with an in-depth understanding of
commonly used operators and get you up to speed to improve your ability to create a
high-performing alpha.
If you’ve followed the examples in the Starter pack, chances are you’ve ended up with the
first 2 graphs. What both graphs have in common is that they have multiple periods of
significant losses, producing a graph with high fluctuations. This means that your
simulated portfolio could lose a large percentage of its value in one day, and that
wouldn’t be ideal. Rather, a good alpha should produce a steadily rising PnL chart (3rd
graph) with few fluctuations and no major drawdown.
In-sample (IS) Summary
In-sample simulation uses data over a 5-year timeframe, and tests out how well your
alpha performs in the historical period. After the simulation, you will see the IS Summary
row with 6 metrics: Sharpe, Turnover, Fitness, Returns, Drawdown, and Margin.
Sharpe
This ratio measures the excess return (or risk premium) per unit of deviation of returns of
an Alpha. It takes the mean of the PnL divided by the standard deviation of the PnL. The
higher the Sharpe Ratio or Information Ratio (IR), the more consistent the Alpha’s returns
are likely to be, and consistency is an ideal trait. The passing requirement for
Sharpe on the BRAIN platform is to be above 1.25.
$$\text{Sharpe} = \sqrt{252} \cdot \frac{\text{Mean}(\text{PnL})}{\text{Stdev}(\text{PnL})}$$
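As a rough illustration, the Sharpe formula can be computed from a list of daily PnL values. This sketch assumes the sample standard deviation (`statistics.stdev`); the text does not say which estimator the platform uses, and the PnL values are made up.

```python
import math
import statistics

def sharpe(daily_pnl):
    """Annualized Sharpe: sqrt(252) * mean(PnL) / stdev(PnL)."""
    # Assumption: sample standard deviation; the text does not specify the estimator.
    return math.sqrt(252) * statistics.mean(daily_pnl) / statistics.stdev(daily_pnl)

# Hypothetical daily PnL values in USD.
pnl = [1.0, 2.0, 3.0]
print(round(sharpe(pnl), 2))
print(sharpe(pnl) > 1.25)  # the passing requirement stated in the text
```

The factor sqrt(252) scales a daily ratio to a yearly one, taking 252 as the number of trading days in a year.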
Turnover
Turnover of an Alpha is a metric that measures the simulated daily trading activity, i.e., how
often the Alpha trades. It can be defined as the ratio of value traded to book size. The
higher the turnover, the more often a trade occurs. Since trading incurs transaction costs,
reducing turnover is generally an ideal trait. The passing requirement for turnover on the
BRAIN platform is to be between 1% and 70%.
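One way to read "ratio of value traded to book size" is sketched below. It assumes that the value traded on a day is the sum of absolute changes in dollar positions, and that the 1% and 70% bounds are inclusive; both are assumptions, and the function names and positions are hypothetical.

```python
def daily_turnover(previous_positions, new_positions, booksize):
    """Dollar value traded to move between two portfolios, divided by the booksize."""
    stocks = set(previous_positions) | set(new_positions)
    traded = sum(
        abs(new_positions.get(s, 0.0) - previous_positions.get(s, 0.0)) for s in stocks
    )
    return traded / booksize

def passes_turnover_test(turnover):
    """Passing range stated in the text: between 1% and 70% (bounds assumed inclusive)."""
    return 0.01 <= turnover <= 0.70

# Hypothetical dollar positions on two consecutive days.
yesterday = {"A": 20.0, "B": -50.0, "C": 30.0}
today = {"A": 30.0, "B": -50.0, "C": 20.0}
turnover = daily_turnover(yesterday, today, booksize=100.0)
print(f"{turnover:.0%}", passes_turnover_test(turnover))
```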
Fitness
Fitness of an Alpha is a function of Returns, Turnover & Sharpe. Fitness is defined as:
$$\text{Fitness} = \text{Sharpe} \cdot \sqrt{\frac{\lvert \text{Returns} \rvert}{\max(\text{Turnover},\ 0.125)}}$$
Good Alphas generally have high fitness. You can seek to improve the performance of
your Alphas by increasing Sharpe (or returns) and reducing turnover. The passing
requirement for fitness on the BRAIN platform is to be greater than 1.0.
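The Fitness formula translates directly into code. In this sketch, returns and turnover are passed as fractions (0.10 means 10%), which is an assumption about units; the input values are made up.

```python
import math

def fitness(sharpe, returns, turnover):
    """Fitness = Sharpe * sqrt(abs(Returns) / max(Turnover, 0.125))."""
    # Assumption: returns and turnover are fractions (0.10 means 10%).
    return sharpe * math.sqrt(abs(returns) / max(turnover, 0.125))

# Hypothetical Alpha: lowering turnover from 40% to 20% raises fitness,
# but going below the 12.5% floor gives no further benefit.
print(fitness(2.0, 0.10, 0.40))
print(fitness(2.0, 0.10, 0.20))
print(fitness(2.0, 0.10, 0.05) == fitness(2.0, 0.10, 0.125))
```

The `max(Turnover, 0.125)` term means that reducing turnover helps fitness only down to 12.5%.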
Returns
Returns is the amount made or lost by the Alpha during a defined period and is
expressed in percentages. BRAIN defines returns as:
$$\text{AnnualReturn} = \frac{\text{AnnualizedPnL}}{0.5 \cdot \text{BookSize}}$$
Drawdown
Drawdown of an Alpha is the largest reduction in simulated PnL during a given period,
expressed as a percentage. It is calculated as follows:
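The exact calculation is not reproduced here. One possible reading of "largest reduction in simulated PnL" is the largest peak-to-trough drop of the cumulative PnL curve, sketched below. Dividing by half the booksize is an assumption made to mirror the Returns formula; the function name and the PnL curve are hypothetical.

```python
def max_drawdown(cumulative_pnl, booksize):
    """Largest peak-to-trough drop of the cumulative PnL, as a fraction of capital."""
    peak = 0.0          # running maximum; starts at zero PnL (before the first day)
    largest_drop = 0.0
    for value in cumulative_pnl:
        peak = max(peak, value)
        largest_drop = max(largest_drop, peak - value)
    # Assumption: divide by half the booksize, mirroring the Returns formula.
    return largest_drop / (0.5 * booksize)

# Hypothetical cumulative PnL curve in USD.
curve = [1.0, 3.0, 2.0, 0.5, 4.0]
print(f"{max_drawdown(curve, booksize=100.0):.1%}")
```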
Margin
Margin is the simulated profit per dollar traded of an Alpha; calculated as:
$$\text{Margin} = \frac{\text{PnL}}{\text{TotalDollarsTraded}}$$
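The Returns and Margin formulas can be sketched together. The annualization method is an assumption (average daily PnL scaled to 252 trading days, consistent with the factor in the Sharpe formula); the function names and all input values are hypothetical.

```python
def annual_return(daily_pnl, booksize, trading_days=252):
    """AnnualReturn = AnnualizedPnL / (0.5 * BookSize)."""
    # Assumption: annualize by scaling the average daily PnL to 252 trading days.
    annualized_pnl = sum(daily_pnl) / len(daily_pnl) * trading_days
    return annualized_pnl / (0.5 * booksize)

def margin(daily_pnl, daily_dollars_traded):
    """Margin = total PnL / total dollars traded."""
    return sum(daily_pnl) / sum(daily_dollars_traded)

pnl = [0.05, -0.01, 0.02, 0.04]    # hypothetical daily PnL in USD
traded = [20.0, 15.0, 25.0, 20.0]  # hypothetical dollars traded each day
print(f"{annual_return(pnl, booksize=100.0):.1%}")
print(f"{margin(pnl, traded):.5f}")
```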
One of the most common challenges users face is a low Sharpe: the Sharpe ratio
falls below the specified cutoff. How do you get a higher
Sharpe? You can either increase your Alpha's returns or reduce its
volatility. Read more here.
Another challenge is the weight test that measures the capital concentration in each
stock. You might see these error messages in your IS tests: “Maximum weight on an
instrument is greater than 10%” OR “Weight is too strongly concentrated” OR “Too
few instruments are assigned weight.” Common fixes include adding range-normalized
functions such as rank, setting truncation at 0.1, and using ts_backfill.
Read more here.
Another difficulty is that the Sub-universe Sharpe is not above cutoff. This means
that the Sharpe in the sub-universe must be higher than at least one threshold.
There are 2 thresholds that scale down Sharpe with sub-universe size.
Thus, you can try to improve the Sub-Universe Sharpe by increasing the Universe of
instruments (i.e. selecting Top3000).
Check your spelling of the data fields and operators and ensure that your expression is
logical. The tokens (operators and keywords) allowed in your Alpha expression can be
found in the Available Market Data and Available Operators pages. Alpha expressions
also accept integers and floating point numbers.
Unit warnings are provided for reference in simple cases and do not prevent submission.
Usually, this warning appears when data fields having two different units are added or
multiplied, e.g. if you add "close" to "cap": "close" has units of price but "cap" has units of
price*shares. You can safely ignore these warnings if you're sure the alpha correctly
handles data units.
Table of Contents
1. Use Different Operators
2. Change Simulation Settings
Divide (/)
Imagine market data being a matrix, with each row representing one date and each
column representing one stock. For example, the matrix for close price data of stocks in
universe US TOP3000 would look like this:
And the matrix for open data of above stocks would look like this:
Say you enter an Alpha expression like close/open in the Simulate page found in the
Alphas dropdown tab. When you click Simulate, the BRAIN platform will evaluate the
Alpha expression against the matrix of market data for each date and each stock.
Rank(x)
Description: the Rank operator ranks the value of the input data x for the given stock
among all instruments, and returns float numbers equally distributed between 0.0 and
1.0
For example:
The numbers imply that if you have $126, you must use $100 to go long stock E (~80% of
your total capital). So, your strategy would depend crucially on how the last stock
performs. But, isn’t that too risky? Applying the rank function to the alpha expression
rank(sales/assets), you get:
This time you see that the stock with the largest weight occupies only 40% of your
portfolio.
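The effect of ranking on concentration can be reproduced with a small sketch. The raw values are hypothetical, chosen so that stock E accounts for 100 out of 126 as in the example. The `rank` function here is a simplified stand-in for the platform's operator: it assumes no ties and at least two stocks.

```python
def rank(values):
    """Map each value to its rank, evenly spaced from 0.0 (smallest) to 1.0 (largest)."""
    # Assumption: no ties and at least two stocks; tie handling is not covered here.
    ordered = sorted(values, key=values.get)
    step = 1.0 / (len(ordered) - 1)
    return {stock: position * step for position, stock in enumerate(ordered)}

def normalize(values):
    """Divide by the sum so the weights add up to 1."""
    total = sum(values.values())
    return {stock: value / total for stock, value in values.items()}

# Hypothetical raw Alpha values chosen so that stock E dominates (100 out of 126).
raw = {"A": 4.0, "B": 6.0, "C": 7.0, "D": 9.0, "E": 100.0}
print(normalize(raw)["E"])        # share of E without rank
print(normalize(rank(raw))["E"])  # share of E after rank
```

With five stocks the ranks are 0, 0.25, 0.5, 0.75 and 1, which sum to 2.5, so the top-ranked stock ends up with 1 / 2.5 = 40% of the portfolio.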
In your first Alpha simulation, you left the simulation settings on default. Changing certain
simulation settings may help you improve your Alpha results. We will go through Region,
Universe, Neutralization, Decay and Truncation. The other settings will be covered in a
later guide.
Region
Region refers to the market in which the Alpha will simulate trades, for example, the U.S.
equity market or Chinese equity market.
Universe
Universe is a set of trading instruments ranked by their liquidity. For example, “US:
TOP3000” represents the top 3,000 most liquid stocks in the U.S. market.
Decay
Decay is used for averaging the Alpha signal within a specified time window. The settings
perform linear decay on the Alpha. Tip: Decay can be used to reduce turnover, but decay
values that are too large will attenuate the signal.
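A linear decay can be sketched as a weighted average over the last few days, with the newest value weighted most. The exact weighting used by the platform is not given in the text, so the weights 1, 2, ..., n below are an assumption; the function name and the signal values are hypothetical, and `window` is assumed to be at least 1.

```python
def linear_decay(signal_history, window):
    """Weighted average of the last `window` values; the newest value weighs most."""
    recent = signal_history[-window:]
    weights = range(1, len(recent) + 1)  # oldest gets 1, newest gets len(recent)
    return sum(w * x for w, x in zip(weights, recent)) / sum(weights)

# Hypothetical Alpha values for one stock over five days (oldest first).
signal = [1.0, -1.0, 1.0, -1.0, 1.0]
smoothed = [linear_decay(signal[: day + 1], window=3) for day in range(len(signal))]
print(smoothed)
```

The raw signal flips between +1 and -1 every day, while the decayed signal moves in a narrower band, which is how decay can reduce turnover at the cost of a weaker signal.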
Truncation
Truncation sets the maximum weight for each stock in the overall portfolio. It aims to
guard against excessive exposure to movements in individual stocks. The recommended
setting is between 0.05 and 0.1 (entailing 5-10%).
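In its simplest form, truncation can be pictured as clipping each weight to a maximum absolute size. This sketch only clips; whether and how the platform redistributes the clipped weight is not described in the text. The function name and weights are hypothetical.

```python
def truncate(weights, max_weight):
    """Clip each weight to the range [-max_weight, +max_weight]."""
    return {s: max(-max_weight, min(max_weight, w)) for s, w in weights.items()}

# Hypothetical weights with one oversized long and one oversized short position.
weights = {"A": 0.50, "B": -0.30, "C": 0.05}
print(truncate(weights, max_weight=0.10))
```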
Neutralization
Market risks and industry specific risks are prevalent risks within equities. However, these
risks can be reduced by creating long-short neutral portfolios using a concept called
neutralization. After neutralizing the portfolios to market or industry specific groups, no
net position is taken with respect to that group, i.e. allotting the same amount of dollars
in long (buying) and short (selling) positions. That way, you are less exposed to risk,
whether the entire market goes up or down.
If the hypothetical booksize is $20 million, we would end up investing $10 million in long
positions and $10 million in short positions. Thus, no net position is taken with respect to
the market. In other words, the long exposure cancels out the short exposure completely,
making this hypothetical strategy market neutral.
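Market neutralization can be sketched as subtracting the average Alpha value from every stock and then rescaling. This is a simplified illustration, not the platform's implementation: the function name and raw values are hypothetical, and it assumes the values are not all equal (otherwise the gross sum would be zero).

```python
def market_neutralize(raw_values):
    """Subtract the mean, then scale so the absolute weights add up to 1."""
    mean = sum(raw_values.values()) / len(raw_values)
    demeaned = {s: v - mean for s, v in raw_values.items()}
    gross = sum(abs(v) for v in demeaned.values())  # assumed non-zero
    return {s: v / gross for s, v in demeaned.items()}

# Hypothetical raw Alpha values for four stocks.
raw = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 6.0}
weights = market_neutralize(raw)

booksize = 20_000_000  # the $20 million booksize from the text
long_dollars = sum(w * booksize for w in weights.values() if w > 0)
short_dollars = sum(w * booksize for w in weights.values() if w < 0)
print(long_dollars, short_dollars)
```

After demeaning, the positive and negative weights cancel, so the long side and the short side each take half of the booksize.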
The three different neutralization methods determine which groups are used for
neutralizing Alpha values. The correct choice of neutralization depends on the logic or
formula used by the Alpha. The results should indicate which neutralization will be most
effective.
WorldQuant Challenge
Table of Contents
1. Overview
2. Scoring criteria
Summary
Details
Overview
The WorldQuant Challenge is a perpetual, online, solo competition. Users can submit
Alphas to improve their scores and ranking.
Individuals who score 10,000 points may be eligible to receive an invitation for the
research consultant opportunity, subject to other criteria (e.g. if they are residents in
countries where the BRAIN consultant program is offered). Users who make it to Gold
and Silver levels will have access to special training sessions and videos through the
Events page.
New users are automatically enrolled into the challenge. The Leaderboard ranks all
eligible users and can be filtered by country, university and/or city.
Scoring criteria
Summary
1. Your score is based on the quantity and quality (performance in the 5 year in-
sample period) of Alphas that you submit on the platform
2. Your score also depends on quantity and quality of Alphas submitted by other users
that day
3. Score is calculated per day (EST timezone), and not per Alpha
4. Highest daily score you can achieve is 2,000. Typically, this involves submitting 1 to 2
alphas a day
5. There are no negative points. Your score cannot decrease
6. Scores refresh once every day at 3 AM EST
7. Participants with the same score will have the same rank
8. You can reach 3 levels in WorldQuant Challenge:
1. Bronze (score > 1,000)
2. Silver (score > 5,000)
3. Gold (score > 10,000)
Details
Each day, all Alphas submitted by a user are accumulated and two factors are calculated:
Quantity factor: The larger the number of Alphas you submit during a day, the larger the
factor and the higher your score.
Quality factor: Quality factor is calculated as an average of the quality factor of all Alphas
submitted during the day. The larger the factor, the higher your score. It depends on the
following settings and results in the in-sample period:
Both factors are then normalized across all the users who submitted at least one Alpha
on that particular day. Your final daily score is then a function of the normalized Quantity and
Quality Factors. The daily score is capped at 2,000 points.
Alphas are created and simulated on the Simulate page in the Alphas dropdown tab. To
run your first simulation, click on the gear icon at the top right-hand side corner. This will
open the settings panel. Here, select “US: TOP3000” for Region and Universe,
“Subindustry” for Neutralization and apply your settings. Make sure both Code and Result
are ticked by clicking on them. In the Alpha expression text box, enter -Delta(close, 5) for
now and click on "Simulate". The Simulation Result page will show a graph for Cumulative
Profit. This graph can be zoomed in to plot area for shorter time periods (1 month or 1
year).
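To see what the example expression computes for a single stock, here is a minimal sketch. It assumes Delta(x, d) means today's value minus the value d days ago, which should be checked against the operator documentation; the function name and prices are hypothetical.

```python
def negative_delta(close, days):
    """Return -(close[t] - close[t - days]) per day; None until enough history exists."""
    return [
        None if t < days else -(close[t] - close[t - days])
        for t in range(len(close))
    ]

# Hypothetical closing prices for one stock (oldest first).
close = [10.0, 10.5, 11.0, 10.8, 10.2, 9.9, 10.4]
print(negative_delta(close, days=5))
```

Under this reading, a stock that fell over the last five days gets a positive value (a long position) and a stock that rose gets a negative one, so the expression bets on recent price moves reversing.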
The display consists of 2 graphs, one for PnL vs. Time and the other for Sharpe Ratio vs.
Time.
In the Stats tab, a good Alpha would have consistently increasing PnL and high Annual
Return, Sharpe Ratio, % Profitable Days and Profit per Dollar Traded. It should have low
Drawdown and Turnover. And more importantly, it shouldn’t have high fluctuations in the
cumulative profit graph. If the standard deviation is low, there will be fewer
fluctuations in the graph. If the graph shows high fluctuations/volatility, despite the
returns being high, the Alpha will not be deemed good enough. An Alpha is considered to
be “good” if:
The graph above for Alpha expression -Delta(close, 5) shows several significant
drawdowns, as well as a flattening of returns in 2017. The table below marks this Alpha as
Inferior (Needs Improvement). PnL and Sharpe for 2017 drop low, and drawdown is large
in 2014 and 2015. This Alpha is Inferior (Needs Improvement) due to high volatility and
low returns.
-delta(close, 5)
Use the green refreshing button in the Correlation block to get the information about the
correlation of the currently simulated Alpha with the Alphas in your own OS (Out-of-
Sample) pool. This will be explained further in the Simulation Results page.
The image below shows the Properties of the Alpha. You can name your Alpha, assign a
category and color code, and add user-defined tags to it. You can also add a brief
description of your Alpha for your reference. Suggestion: keep the number of user-defined
tags low so that they don't proliferate and are easily searchable in the My Alphas
page.
To Submit Alpha for OS Test, click the "Submit Alpha" button in the Submission tab of the
results panel. This will check if the Alpha meets the Correlation and Sharpe criteria before
submitting it.
Test Period
The Test Period is a feature designed to enhance your Alpha and SuperAlpha testing
process. This tool allows you to set a separate test period from your IS period, providing a
more flexible approach to testing your research ideas.
The Test Period feature is designed to help you avoid overfitting. It allows you to divide
your In-Sample (IS) period into a Train and Test period. The Train period can be utilized to
develop your Alphas and SuperAlphas, while the Test period is ideal for validating them.
An Alpha or SuperAlpha that is developed based on the simulation results of the Train
period and performs well in both periods is likely a strong candidate for submission and
may have avoided overfitting.
While choosing a Test period does not directly affect the simulation, it influences the
statistics and the visualization. The submission tests will run on the entire 5-year period,
with the simulation running on the entire 5-year IS. However, if a testing period is chosen,
the simulation stats will be divided into two sections: one covering the training period and
another for the test period.
Navigating the Feature:
1. Selecting the Test Period: In Simulation Settings, you can define a test period
corresponding to the final 0-5 years of the IS period. By default, no test period is set
(0 years).
2. Visualizing the Test Period: The Stats Summary defaults to the training period. You
can view the stats for the test period by clicking on the “Show test period” button.
3. Identifying the Test Period on Graphs: The lines representing the test period on the
graphs are colored orange.
4. Choosing the Stats Summary: You can select between the Stats Summary for the
test period or the entire IS period by choosing the “TEST” or “IS” in the Summary
section, respectively.
5. Hiding the Test Period: A button “Hide test period” allows you to hide the test
period, if desired. Note that an Alpha or SuperAlpha can only be submitted when
the Test Period is revealed by clicking on the “Show test period” button.
6. Understanding the Stats: The yearly IS stats are divided between Train and Test
periods, represented by blue and orange indicators respectively.
A. Orange - test period PnL, Blue - Train period PnL. B. View IS summary by selecting
different periods
Table of Contents
1. Language
2. Instrument type
3. Region and Universe
4. Delay
5. Decay
6. Truncation
7. Neutralization
8. Pasteurize
9. NaN Handling
10. Unit Handling
The settings panel can be found by clicking the Settings button at the top right hand
corner of the Simulate page. You can specify parameters like language, instrument type,
universe, delay, neutralization, etc., which will be applied to your next simulation after
clicking the "Apply" button.
Language
Fast Expression is available on the BRAIN platform. To learn more, refer to Available
Operators.
Instrument type
Only the Equity instrument type can be used at the moment.
Region and Universe
A universe is a set of trading instruments prepared by the BRAIN platform. For example, "US:
TOP3000" represents the 3000 most liquid stocks in the US market (determined by
highest average daily dollar volume traded).
Delay
Delay refers to the availability of data, relative to decision time. In other words, delay is
the assumption of when we can trade stock once we decide on a position.
Assume that you are looking at the data today before market close and you decide that
you want to buy stock. We can choose an aggressive approach and trade stock in the time
left till market close. In this case, the position is based on data available on the same day
(today). This is called Delay 0 simulation.
Alternatively, we could choose a conservative trading strategy and trade stock the next
day (tomorrow). Then the position is achieved tomorrow and it is based on today’s data. In
this case, there is a lag of 1 day. This is called Delay 1 simulation. In the expression language,
delay is applied automatically, so you do not need to handle it yourself.
Decay
This performs a linear decay function over the past n days by combining today’s value
with previous days’ decayed value. It performs the following function:
Legal values for Decay: Integer 'n' where n >= 0. NOTE: Using negative or non-integer
values for Decay will break simulations.
Tip: Decay can be used to reduce turnover, but decay values that are too large will
attenuate the signal.
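The decay formula itself is not reproduced in the text above. As a minimal sketch, assuming the usual linearly weighted average in which today's value gets weight n, yesterday's gets n-1, and so on down to 1, the setting could be illustrated as follows. The helper name `linear_decay` is hypothetical and is not a BRAIN operator; treating a decay of 0 or 1 as "no decay" is also an assumption.

```python
def linear_decay(values, n):
    """Linearly weighted average of the last n values (most recent last).

    Assumed form: (n*x[t] + (n-1)*x[t-1] + ... + 1*x[t-n+1]) / (n + ... + 1).
    A decay of 0 or 1 is treated here as "no decay".
    """
    if n <= 1:
        return values[-1]
    window = values[-n:]                          # oldest ... newest
    weights = range(n - len(window) + 1, n + 1)   # newest value gets weight n
    total = sum(w * x for w, x in zip(weights, window))
    return total / sum(weights)


# Hypothetical alpha values for one stock over three days (oldest first).
decayed = linear_decay([0.10, 0.40, -0.20], 3)
```

Because older values still contribute, the decayed value changes less from day to day than the raw value, which is why decay tends to lower turnover.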
Truncation
The maximum weight for each stock in the overall portfolio. When it is set to 0, there is no
restriction.
Legal values for Truncation: Float 'x' where 0 <= x <= 1 (NOTE: Any values for Truncation
outside this range can impact/break simulations.)
Tip: Truncation guards against being too exposed to movements in individual stocks.
The recommended setting is between 0.05 and 0.1 (that is, 5-10%).
Neutralization
Neutralization is an operation used to make our strategy market/industry/sub-industry
neutral. When Neutralization = “Market” it does the following operation:
Basically, it makes the mean of the Alpha vector zero. Thus no net position is taken with
respect to the market. In other words, the long exposure cancels out the short exposure
completely, making our strategy market neutral.
When Neutralization = Industry or Subindustry, all the instruments in the Alpha vector are
grouped into smaller buckets corresponding to industry or sub-industry and
neutralization is applied separately to each of the buckets. For illustration of
industry/subindustry classification, see GICS (note: this is not necessarily the same
classification standard used by BRAIN platform).
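The operation described above, subtracting the mean so that each group sums to zero, can be sketched in a few lines. This is an illustration only: the function name `neutralize`, the dictionary layout and the sample values are assumptions, and NaN values are not handled.

```python
from collections import defaultdict


def neutralize(alpha, groups=None):
    """Subtract the group mean from each value so every group sums to zero.

    alpha  : dict of instrument -> raw alpha value
    groups : dict of instrument -> group label; None means one market-wide group
    """
    members = defaultdict(list)
    for name in alpha:
        label = "market" if groups is None else groups[name]
        members[label].append(name)
    result = {}
    for names in members.values():
        mean = sum(alpha[n] for n in names) / len(names)
        for n in names:
            result[n] = alpha[n] - mean
    return result


# Hypothetical alpha values and industry labels.
raw = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 6.0}
by_market = neutralize(raw)
by_industry = neutralize(raw, {"A": "x", "B": "x", "C": "y", "D": "y"})
```

With market neutralization the whole vector sums to zero; with industry neutralization each industry bucket sums to zero on its own.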
Pasteurize
Pasteurization replaces input values with NaN (pasteurizes them) for instruments not in the
Alpha universe. When Pasteurize = ‘On’, inputs will be converted to NaN for
instruments not in the universe selected in Simulation Settings. When Pasteurize = ‘Off’,
this operation does not happen and all available inputs are used.
Pasteurized data has non-NaN values only for instruments in the Alpha universe. While
pasteurized data contains less information, it may be more appropriate when considering
cross-sectional or group operations. The default Pasteurize setting is ‘On’. Researchers
can switch it to ‘Off’ and use pasteurize(x) operator for manual pasteurization.
Example
Assume the following settings are used: Universe TOP500, Pasteurize: ‘Off’. The following
code calculates the difference between sector rank of sales_growth in Alpha universe and
sector rank of sales_growth among all instruments:
group_rank(pasteurize(sales_g
The pasteurize operator in the first group_rank pasteurizes input to the Alpha universe
(TOP500), while the second group_rank ranks sales_growth among all instruments.
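To make the effect of pasteurization concrete, here is a small sketch in Python rather than Fast Expression. The function below only mimics the behavior described above; the tickers, values and universe membership are hypothetical.

```python
import math


def pasteurize(values, universe):
    """Keep values for instruments in the universe; replace the rest with NaN."""
    return {name: (x if name in universe else math.nan)
            for name, x in values.items()}


sales_growth = {"AAA": 0.12, "BBB": 0.05, "CCC": -0.03}
alpha_universe = {"AAA", "CCC"}          # hypothetical universe membership
pasteurized = pasteurize(sales_growth, alpha_universe)
```

Any cross-sectional or group operation applied to `pasteurized` would then see only the instruments in the Alpha universe.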
NaN Handling
NaNHandling replaces NaN values with other values. If NaNHandling: ‘On’, NaN values are
handled based on operator type. For time series operators, if all inputs are NaN, 0 is
returned. For group operators returning one value per group (e.g. groupmedian,
groupcount), if the input value for an instrument is NaN, the value for the group is
returned.
If NaNHandling : ‘Off’, NaNs are preserved. For time series operators, if all inputs are NaN,
NaN is returned. For group operators, if the input value for an instrument is NaN, NaN is
returned. Researchers should handle NaNs manually in this case. The default
NaNHandling setting is ‘Off’. Some ways of manually handling NaN values can replicate the “On”
behavior.
Example
ts_zscore(etz_eps, 252)
Assume NaNHandling = ‘On’. Then for a stock with etz_eps == NaN for all 252 days, 0 is
returned. However, ts_zscore(x, d) also returns 0 when x == tsmean(x, d), which is
different from x == NaN (“no data is available”). This means that NaNHandling = ‘On’
increases coverage, but may introduce ambiguous information into the Alpha.
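The ambiguity can be reproduced with a simplified stand-in for the operator. The function below is not the platform's implementation: it only follows the all-NaN rule described above, and its treatment of partially missing windows and of zero variance is an assumption.

```python
import math
import statistics


def ts_zscore(window, nan_handling=False):
    """Z-score of the latest value against the non-NaN values in the window."""
    valid = [x for x in window if not math.isnan(x)]
    if not valid:
        # All inputs are NaN: 0 with NaN handling on, NaN with it off.
        return 0.0 if nan_handling else math.nan
    latest = window[-1]
    if math.isnan(latest):
        return math.nan
    std = statistics.pstdev(valid)
    if std == 0:
        return 0.0
    return (latest - statistics.fmean(valid)) / std


no_data = [math.nan] * 5        # no data available at all
at_mean = [1.0, 3.0, 2.0]       # latest value equals the window mean

same_output = ts_zscore(no_data, nan_handling=True) == ts_zscore(at_mean, nan_handling=True)
```

Here `same_output` is true: with NaN handling on, "no data" and "exactly at the mean" both produce 0, while with NaN handling off the first case stays NaN and remains distinguishable.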
Example
groupmax(sales, industry)
When NaNHandling = ‘Off’ and sales is NaN for a given instrument, the operator’s output
is NaN. When NaNHandling = ‘On’ and sales is NaN for a given instrument, the operator’s
output is the maximum value of sales in the instrument’s industry.
Unit Handling
The Unit Handling option raises a warning when incompatible units are used in an
operator. The warning appears if an expression combines data fields with incompatible units;
for example, a warning will be shown for an attempt to add price to volume.
The below post illustrates in detail how the BRAIN platform works and what happens in
the background when you simulate an alpha. Even though you’ll never need to do these
calculations yourselves, developing an intuition for them will help you in the alpha making
process.
Imagine market data being a matrix, with each row representing one date and each
column representing one stock. For example, the matrix for close price data of stocks in
universe US TOP3000 would look like this:
When you input the simulation settings and click “Simulate”, the BRAIN platform will
evaluate the Alpha expression against the matrix of market data for each date in a 5 year
span, taking a long or short position for each financial instrument to generate the PnL
chart.
Behind the scenes, seven steps, or operations, are performed before the final PnL chart is
generated.
Normally, in an alpha simulation, there would be between 200 and 3,000 stock
instruments in the universe. But to better understand this concept, we’ll assume a
hypothetical scenario in which the simulation universe has only eight stocks. We simulate
the expression rank(-returns) with market neutralization, Delay 1 and Decay 0 settings
for now.
The hypothesis in this expression is that we want to buy, or go long on, those stocks
tomorrow that had negative or comparatively lower returns today, and we want to sell, or
go short on, those stocks tomorrow that had positive or comparatively higher returns
today.
We’ve used the rank operator, which ranks the input values and
returns values equally distributed between zero and 1. This is an example of a reversion
idea.
In Column B, we have the eight stocks in the alpha vector. Column C shows the returns of
these stocks as of February 1st. These serve as the input data of the alpha expression.
Step 1: Evaluate the expression for each stock to generate the alpha vector for the given
date.
In our case, this date would be February 2nd, because we’ve assumed Delay 1 settings.
The Delay 1 setting uses data as of T-1 date to create the alpha vector as of T date.
To produce the alpha vector, the simulator performs the rank operation on negative
returns and produces a vector of values corresponding to each stock.
The resulting vector depends on the operators used in the alpha expression. In our case,
since we’ve used the rank operator, we see equally distributed values between 0 and 1 in
Column D. Note that the stock with the lowest return has the highest value, and vice
versa, in line with our hypothesis.
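Step 1 can be sketched as follows. The `rank` function here is a simplified Python stand-in for the platform operator (ties are not handled), and the eight returns are hypothetical values, not the figures from the worksheet.

```python
def rank(values):
    """Rank values and map them to evenly spaced numbers in [0, 1].

    Needs at least two values; ties are not handled specially in this sketch.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranked = [0.0] * len(values)
    for position, i in enumerate(order):
        ranked[i] = position / (len(values) - 1)
    return ranked


# Hypothetical returns for Stock 1 ... Stock 8 as of the previous day.
returns = [0.04, 0.03, 0.02, 0.01, -0.01, -0.02, -0.03, -0.04]
alpha = rank([-r for r in returns])
```

With these inputs, Stock 1 (highest return) receives 0 and Stock 8 (lowest return) receives 1, which matches the reversion hypothesis.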
Step 2: From each value in the vector, subtract the average of the vector values in the
group, so that the sum of all vector values is 0. This is called neutralization.
The group can be the entire market, but we can also perform this neutralization
operation on sector, industry or subindustry groupings of stocks.
Since we have only eight stocks in our simulation universe, we neutralize
the stocks over the whole market.
So we take the average of the numbers (shown in Cell D12) and subtract it from each
stock. This gives us a new vector in Column F. Note that both the sum and the average of
these new numbers are now zero. Also, the sum of the positive values is equal in magnitude
to the sum of the negative values.
Step 3: The resulting values are scaled, or ‘normalized’, so that the absolute sum of the
alpha vector values is 1. These values are called normalized weights.
That is, we add up the absolute values of all rows, which gives 2.3. Then
we divide each row by this sum, which results in normalized values. By normalized, we
mean that the total absolute sum of Column H is 1. We can also call this vector a
normalized vector of weights.
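Steps 2 and 3 can be checked numerically. The sketch below assumes the eight ranks are exactly 0, 1/7, ..., 1 (evenly spaced, as the rank operator produces when there are no ties); the variable names are illustrative.

```python
# Ranks of eight stocks, evenly spaced between 0 and 1 (as in Step 1).
ranks = [i / 7 for i in range(8)]

# Step 2: subtract the mean so the vector sums to zero.
mean = sum(ranks) / len(ranks)
neutralized = [r - mean for r in ranks]

# Step 3: divide by the sum of absolute values so the weights add up to 1
# in absolute terms.
abs_sum = sum(abs(x) for x in neutralized)
weights = [x / abs_sum for x in neutralized]
```

Under this assumption `abs_sum` is 16/7, which rounds to the 2.3 quoted above, and the resulting weights sum to zero while their absolute values sum to 1.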
Note: On each iteration/day, the expression rank(-returns) will have access to all the data
for returns up to that day, and the matrix will grow by one line every day until it reaches
the most recent date. The role of the expression is to transform the input matrix to an
output vector of weights as we see in this hypothetical example.
Step 4: Using normalized weights, the BRAIN simulator allocates capital (from a fictitious
book of $20 million) to each stock to construct a portfolio.
Column J has a total of $20 million of fictional money allocated to the stocks, using the
normalized weights in Column H. This means we have a position of minus $4.4 million in
Stock 1 — that is, we’ve shorted $4.4 million worth of Stock 1 — and a long position of
$0.6 million in Stock 5. That is, we’ve invested $0.6 million in Stock 5.
This is called long-short market neutralization, and it’s the backbone of creating these
predictive models, or alphas, on BRAIN. With this technique, a strategy can have the
potential to be profitable regardless of the direction of the market.
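The dollar allocation in Step 4 is a multiplication of each weight by the book size. The sketch below rebuilds the normalized weights for eight evenly ranked stocks, assuming Stock 1 has the lowest rank and Stock 8 the highest; the variable names are illustrative.

```python
BOOK_SIZE = 20_000_000  # fictitious book, in dollars

# Normalized weights for eight evenly ranked stocks:
# (rank - mean rank) / sum of absolute deviations, where that sum is 16/7.
weights = [(i / 7 - 0.5) / (16 / 7) for i in range(8)]

positions = [w * BOOK_SIZE for w in weights]
long_total = sum(p for p in positions if p > 0)
short_total = -sum(p for p in positions if p < 0)
```

Under these assumptions Stock 1 gets about -$4.375 million and Stock 5 about $0.625 million, in line with the rounded -$4.4 million and $0.6 million above. The long and short sides each come to $10 million, half of the book.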
Step 5: Calculate the next day's PnL generated by the alpha, based on the stock returns
observed the next day.
That is, after allocating dollar positions to the stocks, we calculate the PnL generated by
each stock, based on the returns each stock had that day.
Suppose the actual returns on these stocks as of February 2nd are as shown in Column K.
We see that although we expected Stock 1 and Stock 2 to fall in price, they actually went
up, so we had a loss, shown in Column L.
We expected Stock 6 to go up in price, but it stayed flat. So we were wrong about three
stocks, but we were right about five. In total, we made a gain of $0.03 million on this day
with our alpha, calculated by adding the PnLs of all stocks in our vector. This is how the
simulator calculates the PnL generated by the alpha for any given date.
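The daily PnL is the sum over all stocks of dollar position times that day's return. A minimal sketch with three hypothetical stocks (the positions and returns below are made up for illustration and are not the worksheet values):

```python
# Hypothetical dollar positions (negative = short) and next-day returns.
positions = [-4_000_000, 1_000_000, 3_000_000]
next_day_returns = [0.01, 0.00, 0.02]

# PnL per stock: a short position loses money when the stock goes up.
pnl_per_stock = [p * r for p, r in zip(positions, next_day_returns)]
daily_pnl = sum(pnl_per_stock)
```

In this made-up case the short position loses $40,000, the flat stock contributes nothing, and the long position gains $60,000, for a daily PnL of $20,000.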
Step 6: Perform the operations in Step 1 to Step 5 for each date in a several-year history
span, also called the In-Sample (IS) period, to get the daily PnL for each day.
For each day, the expression is evaluated and the values in the Alpha output vector
represent the weights to allocate to each stock. Alpha weights are not how much you
want to buy or sell, but a weighting position you would reach this day. These weights are
multiplied by book size (total money invested in the portfolio) to get the dollar value held
in each stock. For example, if the Alpha weight (after neutralization and scaling) for MSFT
is 0.2423, then we’ll have MSFT stocks with the total value 0.2423*book size.
The weight can be negative, meaning you would take a short position on these stocks. If
the value is positive, you would take a long position on these stocks, i.e. buy the stocks. A
NaN value would mean no weight is allocated to that instrument (i.e. no money is
allocated). The value of stocks you buy/sell on a particular day is determined by the
difference between weights today and weights yesterday. The percentage of your
portfolio traded in a day (by dollar value) is called ‘turnover’. The turnover reported in
simulation results is the average daily turnover over the simulation.
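As a rough sketch of the turnover calculation described above: the dollar value traded is taken as the absolute change in weight times book size, and turnover is that amount divided by book size. The function name is hypothetical, and the sketch ignores the drift in position values caused by price moves between the two days.

```python
def daily_turnover(weights_today, weights_yesterday, book_size):
    """Dollar value traded today as a fraction of book size."""
    traded = sum(abs(t - y) * book_size
                 for t, y in zip(weights_today, weights_yesterday))
    return traded / book_size


# Hypothetical two-stock portfolio on a $20 million book.
yesterday = [0.50, -0.50]
today = [0.25, -0.75]
turnover = daily_turnover(today, yesterday, 20_000_000)
```

Here each weight moves by 0.25, so $10 million is traded on a $20 million book and the daily turnover is 50%.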
Step 7: Calculate the cumulative PnL of the alpha from the start of the in-sample period
to get the PnL chart of the alpha.
Based on those daily positions, PnL is calculated and displayed. By default, the BRAIN
platform will normalize your weights (according to the operations you enter) and create a
portfolio of $20 million (total booksize) worth of equity. (Note that a portfolio is just a
collection of securities.)
This can be better understood with the help of the PnL chart of the alpha in our example,
rank(-returns).
In this chart, we have an IS period of five years, from February 2016 to January 2021.
Using the steps we discussed in our example, the simulator would calculate the daily PnL
of the alpha and derive the cumulative PnL chart, as we see here. Note that the two years
from February 2021 to January 2023 are not visible to us in the simulation window. That’s
called the out-of-sample, or the OS, period. After you submit an alpha, several tests are
run to analyze the alpha’s performance in the OS period. An alpha that passes both the
in-sample and out-of-sample tests can be said to be a robust alpha.
This is how the BRAIN simulator creates the PnL chart from an alpha.
In our example, we’ve assumed that we’re using market neutralization and Decay 0
settings. But if we used any other neutralization settings, the same operations would be
performed on the alpha.
Say we have 80 stocks in our simulation universe: ten industries with eight stocks each.
The simulator would perform the same operations (Step 1 to Step 5) on each of the
ten groups and finally add the PnL from each group to get the daily PnL of the alpha and
create the cumulative PnL chart (Step 6 and Step 7).
However, if we introduce decay into our alpha settings, an additional step must be
performed to get the final alpha vector.
Suppose we use a decay of 3 in our simulation settings. The final vector of weights in the
alpha would be calculated by combining today’s value with the previous day’s decayed
value. In our example, we calculated the normalized weights in the alpha as of February
2nd. Let’s assume that the normalized weights of stocks in the alpha vector on February
1st and January 31st are as shown in Columns N and O, respectively.
Then the final weights in the alpha would be calculated using the given weighted average
formula:
which is implemented in Column P. Using this new derived vector, the simulator would
calculate the daily PnL and consequently the cumulative PnL chart. Note that even if
decay is used, more weight is assigned to the most recent values. So decay is a very
important factor in reducing transaction costs or turnover, as it includes information from
previous days, preventing the alpha from being very reactive.
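The weighted average formula is not reproduced in the text. Assuming the linearly weighted form, in which the most recent day receives the largest weight, a decay of 3 would correspond to the following, where $w_t$ is the normalized weight on February 2nd, $w_{t-1}$ the weight on February 1st (Column N) and $w_{t-2}$ the weight on January 31st (Column O):

$$w^{\text{final}}_{t} = \frac{3\,w_{t} + 2\,w_{t-1} + 1\,w_{t-2}}{3 + 2 + 1}$$

For instance, under this assumption a stock with weights 0.12, 0.06 and 0.00 on the three days would end up with a final weight of (0.36 + 0.12 + 0.00) / 6 = 0.08.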
To summarize, once we input the alpha expression and simulation settings in the BRAIN
simulator, it performs the operations discussed above to take long or short positions for
each financial instrument and generates the PnL chart.
Table of Contents
1. Distribution of Active Alphas
The "Alphas" page shows a summary of the Alphas simulated so far. The controls
available on this page are:
Stage: The Alphas are grouped based on the stage - Unsubmitted and Submitted.
Favorite: Alphas can be marked as favorite by clicking the
Hidden: Alphas can be hidden by selecting their checkbox and clicking the
button. Clicking it again will un-hide the Alpha.
Columns, Filters and Sorting: Columns can be added or removed in the Alphas
page from 4 categories – Summary, Settings, Performance, Properties
Filters can be used to view a subset of all Alphas. The parameters by which you can filter
are: Name, Category, Code, Language, Color, Date Created, Decay, Drawdown,
Favorite, Fitness, Hidden, Margin, Neutralization, PnL, Region, Returns, Sharpe,
Status, Tags, Truncation, Turnover and Universe. [Note: Margin is the PnL divided by
dollars traded. It is the same as the “Profit per $ traded” column on the results page. The
unit bpm stands for basis points (margin), where each basis point represents 1/100th of
1% margin. I.e. 500 bpm would represent 5% margin.]
Fields to be sorted and the order of sorting can be chosen in this pop-up or by clicking on
the column header to cycle between increasing and decreasing order of sorting. The
number of Alphas to be displayed per page can also be adjusted in this section.
icon to view the code for an expression. You can then click the Alpha name to open the Alpha
information in a separate block, and use the "Clone Alpha" button to open this Alpha in the
simulation window.
Alpha Lists: To compare the performance of two or more Alphas, first add the
Alphas to the list by selecting the Alphas and clicking the
This page visually represents the distribution of your submitted Alphas across three key
dimensions: Region, Delay, and Data Category. This graphical visualization is divided into
three layers, each representing one of these dimensions. The number displayed on each
segment, visible upon hovering your mouse over it, indicates the number of Alphas in
that subset and their proportion within the larger set.
This feature not only provides an overview of your submitted Alphas but also guides you
in diversifying your research. Here's how:
- Avoid submitting more than 30% of your Alphas in any single intersection of Region,
Delay, and Data Category.
This feature provides a color-coded visual cue to highlight the concentration of Alphas in
a specific intersection of Region, Delay, and Data Category. The color gradient ranges
from blue to red, with red indicating high concentration.
Table of Contents
1. EBIT vs CAPEX
2. Future Investment and Dividend
3. Room for growth
4. Pay back the debt
5. Retained earnings
6. Pretax Income
EBIT vs CAPEX
Hypothesis
A higher EBIT relative to CapEx can be a sign that the company is not investing
much in growth, so the stock may not grow as much; thus we should sell those stocks.
Implementation
-rank(ebit/capex)
Future Investment and Dividend
Hypothesis
Retained Earnings is a measure of a firm's net income that it has accrued over time and
saved after the distribution of dividend. So Retained Earnings are an indicator of the
capability of future investment and dividend.
Implementation
Use sharesout to normalize the amount of retained earnings; then apply ts_delta()
operator to capture the change of the ratio over the last three months (one quarter).
Finally use rank() operator to normalize the result.
Think about how many trading days there are in one quarter. There may be fewer than 90
trading days in three months.
rank(ts_delta(retained_earnin
Room for growth
Hypothesis
Companies with high asset fair value but comparatively lower EBIT are likely companies
that have invested for growth over the past years and have more room for growth in the
future
Implementation
Rank domestic and foreign EBIT separately, grouped by industry, because different
industries may have different splits between domestic and foreign EBIT. Set a lower alpha
signal for companies with lower asset fair value.
Can the alpha work well when comparing among companies that are already more sizeable
(TOP500)? Companies that are too small may consider this metric too risky, as
it is a sign of lower cashflows.
alpha = -group_rank(fnd2_ebit
group_rank(fn_assets_fair_val
Pay back the debt
Hypothesis
It is usually safer to go long on companies that can easily pay back their short-term debt
using highly liquid assets.
Implementation
The z-score of the ratio of cash to short-term debt is calculated; higher readings
indicate a higher ratio compared with the market.
Try comparing a stock with its peers instead of the whole market.
zscore(cash_st/debt_st)
Retained earnings
Hypothesis
Retained earnings are the cumulative net earnings of a company after dividend payments
to the shareholders. In certain situations, shareholders might want to receive their
dividends and realize their profit. Thus, some shareholders might not be happy with
excessive retained earnings.
Implementation
We should go long on stocks whose retained earnings are decreasing, and vice versa.
-ts_rank(retained_earnings,25
Pretax Income
Hypothesis
Implementation
Use the time-series rank operator to compare the trend of pretax income over the past 2
years, and use the quantile operator to normalize the result.
Hints to Implement
Boost the signal with sales data. For example, if the company has higher sales, it is more
likely to outperform.
quantile(ts_rank(pretax_incom
Table of Contents
1. Tests on alphas excluding CHN region
2. Alphas in CHN region
3. Tests on SECTOR_UTILITIES_TOP3000 Universe of USA Region
4. Self-correlation
5. Weight
6. Sub Universe Test
7. Interpreting Status Messages in Simulation Results
8. Selecting Alphas For Submission
The following performance tests are run till the end of the In-Sample Period.
The 'Submit Alpha' button (in the Submission tab of the simulation results panel) is
used to start out-of-sample (OS) testing for Alphas meeting the performance and
correlation cutoffs
Only submitted Alphas are considered for scoring. Submitted Alphas show up on
the Out-of-Sample tab of the Alphas page.
The table below lists the submission tests for Alphas excluding the CHN region:
Fitness: At least “Average”: greater than 1.3 for Delay-0 Alphas or greater than 1 for Delay-1 Alphas.
Sharpe: Greater than 2 for Delay-0 Alphas or greater than 1.25 for Delay-1 Alphas.
Weight test: Max weight in any stock < 10%. This measures if a sufficient number of stocks are assigned weight for sufficient days in a year. The number varies with the simulation universe (Top 3000, Top 2000, etc.).
Sub universe test: The Sharpe in the sub universe must be higher than at least one threshold. These thresholds scale down Sharpe with sub universe size. You can find a detailed example below.
Self Correlation: PnL correlation < 0.7, or Sharpe at least 10% greater than that of other correlated Alphas submitted by the user.
The China market has a high cost of trading and therefore requires higher returns than
other regions. The submission criteria are Sharpe >= 1.625, Returns >= 6.3%
and Fitness >= 1.0 for D1; Sharpe >= 2.6, Returns >= 8.9% and Fitness >= 1.3 for
D0.
Apart from usual robustness tests such as sub universes, turnover, fitness and
weight, there is an additional test exclusive to the China research region: Robust
universe test performance: Alphas are considered good if the robust universe
component retains at least 40% of the returns and Sharpe of the submission
version.
Turnover: 1% < Turnover < 70%
Fitness and IS Ladder Tests: Not mandatory, but a suboptimal value triggers a warning note.
Self-correlation
Alphas can also qualify if their Sharpe is greater, by 10% or more, than that of all
Alphas with which their correlation is higher than the cutoff.
For example, if your earlier submitted Alpha X has a Sharpe of 3.18, you can
submit a highly correlated Alpha Y if its Sharpe is 3.5 or more.
This allows for making improvements to an existing Alpha.
The Sharpe value used for this comparison (3.18) is visible in the correlation
summary table in the simulation results.
Self correlation operates on a two year window whereas the inner correlation
operates on the intersect of the selected Alpha's pnl time periods.
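The 10% rule can be expressed as a small check. This is a sketch of the rule as described above, not the platform's actual test: the function name, the 0.7 cutoff taken from the submission table, and the treatment of a correlation exactly at the cutoff are assumptions.

```python
def passes_self_correlation(new_sharpe, correlated, cutoff=0.7, margin=0.10):
    """Return True if the new Alpha clears the self-correlation rule.

    correlated: list of (correlation, sharpe) pairs for previously
    submitted Alphas. For every Alpha whose correlation is at or above
    the cutoff, the new Sharpe must be at least `margin` higher.
    """
    return all(new_sharpe >= sharpe * (1 + margin)
               for corr, sharpe in correlated if corr >= cutoff)


# Alpha X was submitted earlier with Sharpe 3.18; Alpha Y is highly
# correlated with it (the 0.85 correlation is a made-up value).
submitted = [(0.85, 3.18)]
y_ok = passes_self_correlation(3.5, submitted)
y_rejected = passes_self_correlation(3.4, submitted)
```

Since 3.18 × 1.1 ≈ 3.498, a Sharpe of 3.5 clears the bar while 3.4 does not.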
Weight
Alphas are also tested on the distribution of Alpha weights across stocks. Alphas can fail
this test if:
Too few stocks are assigned weight for a significant number of days in a year. Note
that assigning zero weights to all stocks at the start of the simulation does not fail
this condition; it only applies after the Alpha starts assigning weights. The exact
minimum number of stocks varies with the simulation universe.
Alpha weight is too concentrated in any one stock. For example, if one stock has 30
percent of all Alpha weight, it will fail this test.
Do not submit Alphas as soon as they clear the performance cutoff. Improve the idea
until you have the best version in terms of both performance and correlation.
However, do not spend an extraordinary amount of time improving a single idea either:
It is generally better to try out new ideas with low correlation to previous ones
than to improve the performance of Alphas with high correlation.
Generally, low correlation is more important than a minor increase in
performance. Example: An Alpha with slightly better performance but high
correlation is worse than an Alpha with slightly lower performance but much
lower correlation.
Table of Contents
1. Return
2. Sharpe and IR
3. Fitness
4. Cumulative PnL Chart
5. IS Summary
6. Self Correlation
In the Simulation result page, you will find a ratings panel in the Stats tab of Results that
says Spectacular, Excellent, Good, Average or Needs Improvement depending on your
Alpha’s Fitness as shown below:
Return
Return is the gain or loss of a security or portfolio in a particular period. Return consists
of the income received plus capital gains, relative to the amount of the investment. In
the BRAIN platform, return = annualized PnL / half of book size.
Sharpe and IR
Information ratio (IR) measures the prediction ability of a model. In BRAIN platform, it is
defined as the ratio of a portfolio’s mean daily returns to the volatility of those returns:
$$\mathrm{IR} = \frac{\operatorname{mean}(\mathrm{PnL})}{\operatorname{stdev}(\mathrm{PnL})}$$
Sharpe is the annualized version of the IR statistic, i.e. Sharpe = sqrt(252)*IR ≈ 15.87*IR,
where 252 is the average number of trading days (days the markets are open) in the USA
in a year.
Note: Sharpe and IR may be defined somewhat differently elsewhere than in BRAIN
platform.
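The two definitions can be sketched directly from a list of daily PnL values. The function names are illustrative, and using the sample standard deviation (rather than the population one) is an assumption, since the text does not say which the platform uses.

```python
import math
import statistics


def information_ratio(daily_pnl):
    """Mean of daily PnL divided by its (sample) standard deviation."""
    return statistics.fmean(daily_pnl) / statistics.stdev(daily_pnl)


def sharpe(daily_pnl, trading_days=252):
    """Annualized IR, using 252 trading days per year."""
    return math.sqrt(trading_days) * information_ratio(daily_pnl)


# Hypothetical daily PnL values, in any consistent unit.
pnl = [1.0, 3.0, 2.0, 4.0, 0.0]
ir = information_ratio(pnl)
annualized = sharpe(pnl)
```

Five data points are far too few for a meaningful statistic; the example only shows the mechanics of the calculation.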
Fitness
Fitness of an Alpha is a function of Returns, Turnover and Sharpe:
$$\text{Fitness} = \text{Sharpe} \cdot \sqrt{\frac{\lvert \text{Returns} \rvert}{\max(\text{Turnover},\, 0.125)}}$$
Good Alphas have high fitness. You can optimize the performance of your Alphas by
increasing Sharpe (or returns) and reducing turnover. Improving one factor normally
has an adverse impact on the other factor. As you work on optimizing your Alpha, an
improvement in its fitness is an indication that your changes are having a positive impact.
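A small sketch of the Fitness formula shows how the 0.125 floor on Turnover works. It assumes Returns and Turnover are expressed as fractions (0.10 for 10%), which the text does not state explicitly; the input values are hypothetical.

```python
import math


def fitness(sharpe, returns, turnover):
    """Fitness = Sharpe * sqrt(abs(Returns) / max(Turnover, 0.125))."""
    return sharpe * math.sqrt(abs(returns) / max(turnover, 0.125))


# Same Sharpe and Returns, two different turnover levels.
high_turnover = fitness(sharpe=2.0, returns=0.10, turnover=0.40)
low_turnover = fitness(sharpe=2.0, returns=0.10, turnover=0.05)
```

With 40% turnover the fitness is 2.0 × sqrt(0.25) = 1.0. With 5% turnover the floor applies, so the denominator is 0.125 rather than 0.05; pushing turnover below 12.5% does not raise fitness any further.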
Cumulative PnL Chart: A graph (shown below) of an Alpha’s performance (PnL) over
entire simulation. This graph can be zoomed in by clicking and dragging below the plot
area. Start and end dates for PnL plotting can also be changed here. Selecting Sharpe
Ratio in the dropdown menu at the upper right of the PnL graph displays the Sharpe ratio
graph (Sharpe over time). Make sure that the PnL graph has an upward trend, the Sharpe
is high and the Drawdown is kept to a minimum.
IS Summary
IS Summary: Scrolling down to the Stats block (shown below) of the simulation results
shows various metrics about the Alpha's performance.
Year: The year of data over which the Alpha was simulated. The last row shows the Alpha's
performance over all years.
Turnover: Turnover signifies how often one trades. It can be defined as the ratio of value
traded to book size. Daily Turnover = dollar trading volume / book size. Good Alphas have
low turnover, since low turnover means lower transaction costs.
Margin: The profit per dollar traded; calculated as PnL divided by total dollars traded for
a given time period.
PnL: Profit and Loss (PnL) is the money that the positions and trades generate (which
means it is the amount of money you lost or made during the year), expressed in dollars.
daily_PnL = sum of (size of position * daily_return) for all instruments, where the daily
return per instrument = (today’s close / yesterday’s close) – 1.0.
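The Turnover, Margin and PnL definitions above can be sketched as follows. The tickers `AAA` and `BBB`, the prices, the position sizes and the function names are all hypothetical, and positions are assumed to be held in dollars (negative for short positions).

```python
def daily_pnl(positions, prev_close, close):
    """Sum over instruments of position size * (close / previous close - 1.0)."""
    return sum(
        size * (close[ticker] / prev_close[ticker] - 1.0)
        for ticker, size in positions.items()
    )


def daily_turnover(dollars_traded, book_size):
    """Daily Turnover = dollar trading volume / book size."""
    return dollars_traded / book_size


def margin(pnl, dollars_traded):
    """Margin = PnL / total dollars traded over the same period."""
    return pnl / dollars_traded


# Hypothetical $2M book: $1M long AAA, $1M short BBB.
positions = {"AAA": 1_000_000.0, "BBB": -1_000_000.0}
prev_close = {"AAA": 100.0, "BBB": 50.0}
close = {"AAA": 102.0, "BBB": 49.0}

pnl = daily_pnl(positions, prev_close, close)
print(round(pnl, 2))
print(daily_turnover(500_000.0, 2_000_000.0))
print(round(margin(pnl, 500_000.0), 4))
```

In this example both legs gain: the long position rises 2% and the short position falls 2%, so each contributes about $20,000 to the day's PnL.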
Self Correlation
Generate Self Correlation: Clicking the Down Arrow button in a Self Correlation row will
produce a table with the performance statistics of up to the 5 most correlated Alphas you
submitted that qualified for OS testing. This information is meant to help the user ensure
they have a diverse set of Alphas. This information can also be accessed by clicking on the
Alpha in the Alphas page.
Table of Contents
1. IS, Semi-OS & OS
2. Alpha Statuses
The rolling 5-year In-Sample simulation period begins 7 years ago and ends 2 years ago,
updating daily. Using simulation settings, you can divide your In-Sample (IS) period into a
Train period and a Test period. The Train period can be used to develop your Alphas and
SuperAlphas, while the Test period is suited to validating them. An Alpha that is developed
using only the Train period simulation results and performs well in both periods is likely a
strong candidate for submission and may have avoided overfitting.
The latest 2 years of data, the Semi-OS, are hidden for scoring and testing purposes.
Consultants have access to a 10-year In-sample period, instead of 5-year.
Keeping the last 2 years of data hidden leads to higher confidence in the Out-Sample (OS)
performance of Alphas and their scores. Statistics shown in the OS tab of the My Alphas
page will be populated as data becomes available with each passing day.
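To make the rolling windows concrete, the sketch below derives the In-Sample and Semi-OS date ranges from a given date. The function names, the exact boundary dates and the handling of 29 February are assumptions for illustration; the platform may place the boundaries differently.

```python
from datetime import date


def shift_years(d, years):
    """Move a date back by whole years, mapping 29 Feb to 28 Feb if needed."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        return d.replace(year=d.year - years, day=28)


def sample_windows(today, is_years=5, hidden_years=2):
    """Return (start, end) pairs for the IS and Semi-OS periods."""
    is_end = shift_years(today, hidden_years)
    is_start = shift_years(today, hidden_years + is_years)
    return {"IS": (is_start, is_end), "Semi-OS": (is_end, today)}


# Default 5-year In-Sample period as of a hypothetical date.
print(sample_windows(date(2024, 6, 30)))

# Consultants: 10-year In-Sample period.
print(sample_windows(date(2024, 6, 30), is_years=10))
```

Because the windows are computed from the current date, both periods move forward by one day each day, which matches the "updating daily" behavior described above.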
Alpha Statuses
Following successful simulation, the Alpha is labeled as "UNSUBMITTED."
Upon submission, the Alpha is assigned the "ACTIVE" status.
For consultants, ACTIVE Alphas are qualified to accumulate weight and are eligible to
contribute to the consultants' quarterly payments, as further described in their
respective consulting or service agreements. An Alpha keeps its ACTIVE status until the
dataset it relies upon is decommissioned or WorldQuant otherwise
decommissions the Alpha, in its discretion.
If the dataset is no longer available or the Alpha shows prolonged underperformance
in the Out-Sample period, the Alpha's status is revised to
"DECOMMISSIONED".
Decommissioned Alphas do not accrue weight and are not eligible to contribute to
your quarterly payment.