Best Practices in defining and testing parameters
Parameters are the foundation of factor investing. They are the indicators used to measure the presence of desirable factors, either individually or in combination with other parameters. Defining parameters and testing them appropriately is therefore one of the most crucial elements of success in factor investing: well-constructed parameters provide meaningful, quantifiable barometers against which the efficacy of different factors can be tested.
As with any process that must be carried out in a systematic and controlled manner, best practices have emerged over time. The following sequence has proven to be a highly reliable one.
Ideation: Like factors themselves, parameters are based on sound investment logic and investment experience. Just as it is logical that high-quality companies should generate more shareholder wealth, it is equally logical to assume that a business able to generate consistently high Return on Equity (ROE) will do the same. The endeavor is to take well-accepted investment knowledge and test whether it actually works.
Parameter Definition: The next step is to define the parameter itself. While this seems like a straightforward exercise given the depth in which ratio analysis has been studied, in the real world this is an intense and decisive exercise. For example, to define ROE, one needs to choose the numerator from over 60 ways in which profit after tax can be expressed. Choosing the denominator poses a similar challenge. Further, inconsistencies in accounting standards and disclosures across time and between industries make this even more challenging and critical. Getting this wrong can endanger the efficacy of the parameter and everything that follows.
Data Audit: Once the relevant data fields have been shortlisted, a thorough verification of the data in these fields must be conducted. This is essential to identify data gaps and erroneous values that may be populated in these fields. Numerous observational and statistical techniques are used to ensure that the data falls within well-defined error tolerances. A meticulous data audit is the most pivotal step of the entire parameter-testing process, since negligence in handling the raw data can lead to a completely false and misleading analysis.
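The kinds of checks involved can be sketched in a few lines. The sketch below (plain Python, with a hypothetical "pat" field for profit after tax and illustrative figures) flags missing values and simple statistical outliers; real audits use far richer observational and statistical techniques than this.

```python
from statistics import mean, stdev

def audit(records, field, z_threshold=2.5):
    """Flag records where `field` is missing, and records whose value lies
    more than z_threshold sample standard deviations from the mean."""
    missing = [r for r in records if r.get(field) is None]
    values = [r[field] for r in records if r.get(field) is not None]
    mu, sigma = mean(values), stdev(values)
    outliers = [r for r in records
                if r.get(field) is not None
                and sigma > 0
                and abs(r[field] - mu) / sigma > z_threshold]
    return missing, outliers

# Hypothetical raw records: eight plausible values, one gap, one bad figure.
records = (
    [{"ticker": f"T{i}", "pat": 100.0 + 5 * i} for i in range(8)]
    + [{"ticker": "GAP", "pat": None},       # data gap
       {"ticker": "ERR", "pat": 9_999.0}])   # likely mis-keyed figure

missing, outliers = audit(records, "pat")
```

A simple z-score rule like this is only a starting point; thresholds and techniques must be tuned to the error tolerances defined for each field.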
Coding the Parameter: Vast amounts of data cannot be analysed without computer-based software engines. Software code, often complex code, must be written so that all the stocks in the investment universe can be measured and ranked on the parameter being tested. Software development best practices are necessary at this stage to keep error rates low and avoid costly rework. Writing comprehensive code also allows researchers to back-test parameters flexibly and dynamically.
The code is then run on the data, and the output is verified against calculations made separately on a sample basis. This is essential to confirm that the code has been written correctly and returns the desired output. Multiple scenarios are included in the sample to ensure coverage of all possibilities.
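As a minimal illustration of the coding and sample-verification steps, the sketch below computes one hypothetical ROE definition, ranks a toy universe on it, and checks one output against a calculation made separately by hand. The tickers, figures and field names are illustrative, not a real data set.

```python
def compute_roe(pat, equity):
    """One of many possible ROE definitions: profit after tax / equity."""
    return pat / equity

# Hypothetical universe: profit after tax and equity per ticker.
stocks = {
    "AAA": {"pat": 150.0, "equity": 600.0},   # ROE = 0.25
    "BBB": {"pat": 90.0,  "equity": 900.0},   # ROE = 0.10
    "CCC": {"pat": 200.0, "equity": 1000.0},  # ROE = 0.20
}

# Rank descending on the parameter: rank 1 = highest ROE.
ranked = sorted(stocks, key=lambda t: compute_roe(**stocks[t]), reverse=True)
ranks = {ticker: i + 1 for i, ticker in enumerate(ranked)}

# Sample-based verification: compare the code's output against a value
# calculated separately (150 / 600 = 0.25).
assert abs(compute_roe(150.0, 600.0) - 0.25) < 1e-12
```

In practice the sample checks would span multiple scenarios (negative equity, missing fields, ties in the ranking) rather than a single happy-path value.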
Robustness Testing: The selected parameter is now ready to be tested. A robust parameter is one that works across time and across different stock universes. The ideal way of testing a parameter for robustness is to take a stock universe, rank the stocks, divide the stocks into multiple slices based on the rank and compare the performance of the slices with each other across various periods of time, both overlapping and discrete.
The thought behind this is a simple one. If the parameter contributes positively to performance, then the top slice should outperform the one below it, which should outperform the one below it and so on. Similarly, volatility should increase as one goes down the ladder. It is important to slice the universe into portfolios having an equal number of stocks (i.e. terciles, quartiles, quintiles etc.) to ensure meaningful robustness testing results.
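The slicing step described above can be sketched as follows, assuming each stock already carries a parameter score. The tickers and scores are hypothetical; slice 0 holds the best-ranked stocks.

```python
def make_slices(scores, n_slices=3):
    """Rank a universe on a parameter score (higher = better) and cut it
    into n_slices equal-sized slices. Returns a list of ticker lists,
    with slice 0 containing the top-ranked stocks."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    size = len(ranked) // n_slices
    return [ranked[i * size:(i + 1) * size] for i in range(n_slices)]

# Hypothetical universe of 150 stocks: S000 scores best, S149 worst.
scores = {f"S{i:03d}": 150 - i for i in range(150)}
slices = make_slices(scores)  # three terciles of 50 stocks each
```

Each slice is then treated as a distinct portfolio whose performance is measured across overlapping and discrete periods.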
The example below shows how a universe of Top 150 companies by market capitalisation can be broken down into 3 slices (terciles) of 50 stocks each for testing a hypothetical parameter ABC for robustness. Each slice represents a portfolio of 50 stocks in terms of their ranking based on the parameter ABC, with the top 50 ranked stocks (1st Slice) and bottom 50 ranked stocks (3rd Slice) representing the best 50 and worst 50 stocks based on the parameter, respectively.
| Portfolio Slices | 15 Years Annualised Return | 5 Years Rolling Mean Return | 15 Years Annualised Volatility | Maximum Drawdown |
| --- | --- | --- | --- | --- |
| Parameter ABC 1st Slice (Stocks Ranked 1 to 50) | 16.34% | 15.64% | 12.05% | -45.50% |
| Parameter ABC 2nd Slice (Stocks Ranked 51 to 100) | 10.99% | 7.24% | 14.96% | -52.75% |
| Parameter ABC 3rd Slice (Stocks Ranked 101 to 150) | 8.45% | 5.76% | 17.01% | -67.85% |
The above example demonstrates that parameter ABC exhibits a strong degree of robustness, since the stocks ranked higher on parameter ABC have a superior risk and return profile compared with the lower-ranked stocks. In other words, this justifies using parameter ABC to select stocks when creating portfolios, since including higher-ranked stocks clearly has the potential to generate higher returns, lower volatility and shallower drawdowns for the portfolio.
Robustness testing must be repeated with at least one other universe of stocks to complete the exercise. Each slice of each universe is a distinct portfolio and with multiple lookback periods and rebalancing frequencies, performance for a plethora of portfolios must be generated and compared to determine whether a parameter is robust or not and to what degree.
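Under simple assumptions, the slice-level statistics compared in the table can be computed from a series of periodic portfolio returns. The sketch below uses a monthly return series that is purely illustrative, and the formulas (geometric annualisation, sample-standard-deviation volatility scaled by √12, peak-to-trough drawdown) are one common convention, not necessarily the exact methodology behind the figures above.

```python
from math import prod, sqrt
from statistics import stdev

def annualised_return(monthly_returns):
    """Geometric annualised return from monthly returns."""
    growth = prod(1 + r for r in monthly_returns)
    return growth ** (12 / len(monthly_returns)) - 1

def annualised_volatility(monthly_returns):
    """Sample standard deviation of monthly returns, scaled to annual."""
    return stdev(monthly_returns) * sqrt(12)

def max_drawdown(monthly_returns):
    """Worst peak-to-trough decline of the cumulative return series."""
    level, peak, worst = 1.0, 1.0, 0.0
    for r in monthly_returns:
        level *= 1 + r
        peak = max(peak, level)
        worst = min(worst, level / peak - 1)
    return worst

# Illustrative monthly returns for one slice portfolio.
rets = [0.02, -0.01, 0.03, -0.05, 0.01, 0.04]
stats = (annualised_return(rets), annualised_volatility(rets), max_drawdown(rets))
```

Running these three functions for every slice of every universe, across multiple lookback periods and rebalancing frequencies, produces the comparison grid on which robustness is judged.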
Robustness testing plays a very important role in shortlisting parameters for selection of stocks while creating portfolios, since it helps separate the strong (robust) parameters from the weak parameters that have been defined.
Move to Production: Once a parameter goes through the rigorous process defined above, it is made available to researchers for their use.
Research, database and coding teams must work together to ensure that the process from ideation to production happens as efficiently as possible. If these practices are followed, the chances of getting things right increase substantially.