Building a Trading Model
All it takes to get started is a hypothesis. Maybe I suspect there should be a relationship between two instruments. Or maybe a new instrument on the market is becoming popular, or an unusual macroeconomic factor is driving micro-price behavior. So I write down an equation, a model, that attempts to capture the relationship. It is typically a process equation showing how variables evolve over time, with a stochastic (random) component.
Next, I look for a closed-form solution. Sometimes it is simple; other times it takes days or weeks of algebra. And sometimes there is no solution at all, and I must settle for an approximation. This is the stage where I find Mathematica's symbolic manipulation tools very helpful. Now I have a model of the market, and it's time to verify that it is realistic. Matlab is my preferred tool for this task: I make assumptions about the values of the various parameters and run simulations. Are the simulations reasonable? Do they resemble, at the very least, actual market dynamics?
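As a concrete (and entirely hypothetical) example of this sanity-check step, suppose the process equation were an Ornstein-Uhlenbeck model for a spread, dx = θ(μ − x)dt + σ dW. A short simulation can then be compared against the closed-form stationary variance σ²/(2θ). The parameter values below are placeholders, not calibrated figures:

```python
import math
import random

def simulate_ou(theta=2.0, mu=0.0, sigma=0.5, dt=1/252, n=200_000, seed=42):
    """Euler discretization of a hypothetical mean-reverting (OU) spread."""
    rng = random.Random(seed)
    x, path = mu, []
    sqrt_dt = math.sqrt(dt)
    for _ in range(n):
        x += theta * (mu - x) * dt + sigma * sqrt_dt * rng.gauss(0, 1)
        path.append(x)
    return path

path = simulate_ou()
sample_var = sum(p * p for p in path) / len(path)
theoretical_var = 0.5 ** 2 / (2 * 2.0)  # sigma^2 / (2*theta) = 0.0625
```

If the simulated variance lands far from the closed-form value, either the algebra or the discretization is wrong, which is exactly the kind of disagreement this stage is meant to catch.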
If the model passes this sanity test, it's time to move beyond blue-sky exploration and ideation into formal research.
What does "formal research" mean? Why is this necessary?
This is the transformation from an abstract, stylized representation of the market into a concrete, unambiguous one. Building genuinely predictive models is hard, but it is very easy to make a mistake and believe you have built one. This is why most "systems" fail in real life, and it is not what I want to happen to my model: I will be putting real money at risk. Over the years, I have developed a steady, consistent, and systematic approach that minimizes my chances of being fooled. This is what I refer to as "formal research".
Which steps should you include in your formal research?
Early on, my greatest fear was data contamination. Historical data is a scarce resource: once you have exhausted it, you cannot generate more, so I am paranoid about running out of uncontaminated data. I begin by breaking my historical data into non-overlapping chunks. Then I randomize the chunks so that I cannot tell which is which. This protects against subconscious biases: for instance, I might be risk-averse if I knew my test data was from 2008, or risk-seeking if it was from 2009.
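The chunk-and-blind step can be sketched in a few lines. The function name, chunk count, and seed here are my own illustration, not the author's actual tooling:

```python
import random

def make_blind_chunks(rows, n_chunks=5, seed=7):
    """Split ordered historical rows into non-overlapping chunks, then
    shuffle the chunk order so the blinded labels no longer reveal
    which historical period each chunk came from."""
    size = len(rows) // n_chunks
    chunks = [rows[i * size:(i + 1) * size] for i in range(n_chunks)]
    rng = random.Random(seed)
    order = list(range(n_chunks))
    rng.shuffle(order)
    # "chunk_0", "chunk_1", ... no longer correspond to chronology
    return {f"chunk_{k}": chunks[j] for k, j in enumerate(order)}

data = list(range(100))  # stand-in for dated price rows
blind = make_blind_chunks(data)
```

The point of the shuffle is purely psychological: the researcher calibrates and tests on blinded labels, so knowledge of famous market years cannot leak into the choices made.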
One chunk becomes my calibration set. I calibrate in Python, using the built-in optimization tools plus a few I have written myself. My parameters are typically both constrained and correlated, which calls for a two-step optimization in the style of the EM algorithm. Optimizers are also sensitive to initial conditions, so I run Monte Carlo over a variety of starting points in the solution space. All of this is straightforward in Python.
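The Monte Carlo treatment of starting points amounts to random-restart local optimization. The toy objective below is my own invention for illustration: it has two basins, a single gradient-descent run can land in the wrong one, but the best of many random starts reliably finds the global minimum near x ≈ −2:

```python
import random

def f(x):
    # toy two-basin objective: (x^2 - 4)^2 + 0.3*x
    return (x * x - 4) ** 2 + 0.3 * x

def grad(x):
    # derivative of f
    return 4 * x * (x * x - 4) + 0.3

def local_descent(x, lr=0.01, steps=500):
    """Plain gradient descent: converges to whichever basin x starts in."""
    for _ in range(steps):
        x -= lr * grad(x)
    return x

rng = random.Random(3)
starts = [rng.uniform(-3, 3) for _ in range(20)]  # Monte Carlo starting points
candidates = [local_descent(x0) for x0 in starts]
best_x = min(candidates, key=f)  # keep the best local optimum found
```

A real calibration would replace `f` with the model's likelihood or fitting error, but the restart logic is the same: many cheap local searches, keep the best.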
Calibration should produce a set of "model parameters": numerical values that can be combined with actual market data to predict market prices. After calibrating the model, I test it out of sample. Is the model stable? Are the residuals mean-reverting and stationary? If the answer is no, the model won't work. I also use various "tricks" to try to break the model. For example, I calibrate on monthly data and test on daily data, or I use Canadian market data to test parameters calibrated on US data. If the model accurately reflects economic reality, these attacks should not be a problem (economics doesn't change when you cross a border).
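One simple way to check that residuals mean-revert is to fit an AR(1) and confirm the autoregressive coefficient is well below one. This is a sketch of the idea only; the synthetic residuals and the coefficient itself are invented for illustration:

```python
import random

def ar1_coefficient(resid):
    """OLS estimate (through the origin) of phi in r_t = phi * r_{t-1} + eps.
    phi well below 1 suggests the residuals mean-revert rather than drift."""
    num = sum(a * b for a, b in zip(resid[:-1], resid[1:]))
    den = sum(a * a for a in resid[:-1])
    return num / den

# synthetic residuals with known persistence phi = 0.8
rng = random.Random(0)
r, resid = 0.0, []
for _ in range(5000):
    r = 0.8 * r + rng.gauss(0, 1)
    resid.append(r)

phi = ar1_coefficient(resid)
```

In practice one would use a proper stationarity test (e.g. an augmented Dickey-Fuller test) rather than eyeballing a single coefficient, but the estimate above captures what "mean-reverting residuals" means operationally.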
So: in-sample and out-of-sample data are kept clearly separated, and Monte Carlo and other robustness tricks guard against biases at the start. What else do you do to make sure you aren't fooling yourself?
I am fanatical about parsimony. If a model has too many parameters or too many degrees of freedom, it isn't really a model anymore. I am constantly looking for factors to eliminate. If the model continues to work (and remains "rich") after several factors have been removed, it is likely a good one. Another indicator of robustness is that the model works regardless of the trading strategy layered on top of it. If you can only make money with a non-linear scaling rule full of edge conditions, that is a sign of a lack of robustness. And there is no substitute for data. In my mind, I enumerate every out-of-sample dataset I could possibly test the model on: different countries, different instruments, different date ranges and frequencies. A good model should work on all of them.
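The factor-elimination idea can be shown with a toy regression: fit a model containing one genuine factor and one spurious factor, and check that dropping the spurious one costs essentially nothing. All data, coefficients, and thresholds here are invented for illustration:

```python
import random

rng = random.Random(5)
n = 200
x = [rng.gauss(0, 1) for _ in range(n)]  # genuine factor
z = [rng.gauss(0, 1) for _ in range(n)]  # spurious factor (no real effect)
y = [1.5 * xi + rng.gauss(0, 0.5) for xi in x]

# two-factor OLS via the 2x2 normal equations
sxx = sum(a * a for a in x)
szz = sum(a * a for a in z)
sxz = sum(a * b for a, b in zip(x, z))
sxy = sum(a * b for a, b in zip(x, y))
szy = sum(a * b for a, b in zip(z, y))
det = sxx * szz - sxz * sxz
a_hat = (szz * sxy - sxz * szy) / det  # loading on the genuine factor
b_hat = (sxx * szy - sxz * sxy) / det  # loading on the spurious factor

# one-factor fit after dropping the spurious factor
a_small = sxy / sxx
```

The spurious loading `b_hat` comes out near zero and the reduced model recovers almost the same genuine loading, which is the parsimony test passing: the model still works after a factor is removed.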
It sounds complete. Now, what?
Once I have a calibrated model, the next step is to simulate P&L. Mean-reverting residuals alone may not be enough: the opportunity may be too small to cover the bid-ask spread, or occasional losses may wipe out the gains. So I need to test a trading strategy built on the model. This is where I have to be careful: it's easy to bias the results by adding new variables, by subconscious knowledge, or by wishing away outliers. Simplicity, strict separation, and intellectual honesty are key.

I do my back-testing in Excel, and that is a conscious choice. Excel isn't as powerful as Python, which puts a limit on how complex my trading rules can be. This is a positive thing: a strategy that requires complexity in order to be profitable is probably not a good strategy. Excel also forces me to see the assumptions I am making; it's easy to lose sight of such things when they are buried in code. Finally, Excel lets me quickly visualize performance statistics: returns, risks, drawdowns, Sharpe ratio, and capital efficiency. These statistics matter even if the model itself works.
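The author does this step in Excel; purely to illustrate the same arithmetic, here is a toy threshold strategy on synthetic mean-reverting residuals, with a bid-ask cost charged on every change of position. Every number is a placeholder:

```python
import random

def backtest_threshold(resid, entry=1.0, cost=0.02):
    """Toy mean-reversion backtest: short the residual above +entry,
    long below -entry, flat inside the band. A spread cost is paid
    on every unit of position change."""
    pos, pnl = 0, 0.0
    for prev, cur in zip(resid[:-1], resid[1:]):
        pnl += pos * (cur - prev)            # mark-to-market on the move
        new_pos = -1 if cur > entry else (1 if cur < -entry else 0)
        pnl -= cost * abs(new_pos - pos)     # pay the spread to trade
        pos = new_pos
    return pnl

# synthetic mean-reverting residuals (AR(1) with phi = 0.8)
rng = random.Random(1)
r, resid = 0.0, []
for _ in range(5000):
    r = 0.8 * r + rng.gauss(0, 1)
    resid.append(r)

total = backtest_threshold(resid)
```

With strongly mean-reverting residuals and a small cost, the toy strategy is profitable; the point of the real exercise is precisely to discover when the costs and occasional losses make it not so.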
Few trading models pass all of the above: blue-sky formulation and sanity check, historical calibration and out-of-sample performance, trading-strategy back-testing and profitability. The few that do make it to production, and that is a completely different ballgame.