A few weeks ago some readers and I decided to explore doing our own options backtesting in order to challenge or validate some research by Tastytrade and to explore our own ideas. One detail still to be decided is how to test. Do we buy a dataset and manually crunch numbers, build a rudimentary automation solution, or invest in an off-the-shelf automation tool such as OptionStack or ORATS?
Manually backtesting complex strategies takes a lot of time, as does building a custom automation solution. Thus, an off-the-shelf solution is of interest.
OptionStack seemed to stand out from the crowd and could be run in a browser, so I started there. Using a demo account, let’s see how this tool performs. By the way, this is NOT a sponsored post.
Upon launching the platform I was greeted with a quick-start dialog which contained a few helpful videos. After watching the two brief clips I had a general feel for how the platform worked and was ready to start exploring.
After dismissing the welcome screen, the first thing I noticed was an already-configured backtest ready for action, followed by several more pre-built backtests in the left-hand column.
The creators of this tool are familiar with Tastytrade. The opening configuration is called “TastyCoveredCall” and there is a dedicated folder of preconfigured backtests covering some of their strategies.
From Idea to GUI
It took a bit of finagling to get my idea into the tool’s GUI. For example, I had to figure out how to describe “open a trade every trading day” using the available logic constructs. I initially started with “Days Past Since Last Trade >= 1”, but that didn’t work because closing trades reset the timer and prevented new positions from being opened the same day other positions are closed. The solution was to use “Day of the Week is between 1 and 7.”
Another example, which I haven’t been able to solve yet, is to replicate trading every Monday, Wednesday and Friday. The logic to open the trades works but the backtester doesn’t seem to be capable of opening positions on Friday that expire Monday. It thinks no such positions exist.
With the recent elevation of VIX I’ve been able to consistently sell puts 10% OTM or more with deltas in the 9-12 range, well above my typical target of 5. This got me thinking, should I place trades based on delta or % OTM?
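We know that at order entry delta approximates the probability of an option expiring in the money, so for a short put the probability of profit (POP) is roughly 1 minus delta; a 5 delta put implies about 95% POP. Would % OTM at order entry be a better approximation of POP?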
- A 5 delta 45 DTE SPY put at VIX 11.95 is ~10% OTM
- A 12 delta 45 DTE SPY put at VIX 20.14 is ~10% OTM
Asked another way, is 10% OTM a better approximation of 95% POP when accounting for VIX dynamics?
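To see why delta drifts while % OTM stays fixed, here is a rough sketch using the textbook Black-Scholes put delta. It assumes a flat implied volatility equal to the VIX reading, a zero interest rate and no dividends, none of which hold for real SPY options. Because OTM puts carry volatility skew, the absolute numbers it prints will come out lower than the 5 and 12 deltas quoted above; only the direction of the effect is the point. The function name `bs_put_delta` is my own.

```python
from math import log, sqrt
from statistics import NormalDist


def bs_put_delta(spot, strike, vol, days, rate=0.0):
    """Black-Scholes delta of a European put with no dividends.

    vol is annualized implied volatility as a decimal (0.20 = 20%).
    Assumption: a single flat vol, which ignores the put skew.
    """
    t = days / 365.0
    d1 = (log(spot / strike) + (rate + 0.5 * vol ** 2) * t) / (vol * sqrt(t))
    return NormalDist().cdf(d1) - 1.0


spot = 100.0
strike = 90.0  # a put struck 10% below spot, i.e. 10% OTM

# Same strike distance and DTE, two different volatility regimes.
for vol in (0.1195, 0.2014):
    delta = bs_put_delta(spot, strike, vol, days=45)
    print(f"vol={vol:.2%}  put delta={delta * 100:.2f}")
```

At the higher volatility the same 10% OTM strike carries a noticeably larger delta, which is the behavior the two examples above describe.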
Let’s run a backtest and find out!
Setting up the Backtest
I populated the backtester with my trade logic and enabled / disabled the highlighted sections for easy toggling between tests. The transparent elements are disabled.
Why do I use a range for OTM and delta instead of a fixed single value?
If a position doesn’t match the criteria EXACTLY the backtester doesn’t place the trade. Hence, I needed a small range to work around the issue. The width of my range was chosen after empirical testing; there’s nothing special about the selection other than “it works.”
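The effect of an exact match versus a small band can be sketched as follows. This is not how OptionStack selects contracts internally; `PutQuote`, `pick_put` and the sample chain are hypothetical names and made-up values used only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class PutQuote:
    strike: float
    dte: int        # days to expiration
    delta: float    # absolute delta in "delta points", e.g. 5.0
    pct_otm: float  # percent out of the money, e.g. 10.0


def pick_put(
    chain: Sequence[PutQuote],
    target_dte: int = 45,
    delta_range: Optional[Tuple[float, float]] = None,
    otm_range: Optional[Tuple[float, float]] = None,
) -> Optional[PutQuote]:
    """Return the quote nearest target_dte inside the given band, or None."""

    def in_band(q: PutQuote) -> bool:
        if delta_range and not (delta_range[0] <= q.delta <= delta_range[1]):
            return False
        if otm_range and not (otm_range[0] <= q.pct_otm <= otm_range[1]):
            return False
        return True

    candidates = [q for q in chain if in_band(q)]
    if not candidates:
        return None  # no trade is placed that day
    return min(candidates, key=lambda q: abs(q.dte - target_dte))


# Made-up chain: no strike has a delta of exactly 5.0.
chain = [
    PutQuote(strike=220.0, dte=44, delta=4.4, pct_otm=10.9),
    PutQuote(strike=221.0, dte=44, delta=4.9, pct_otm=10.5),
    PutQuote(strike=222.0, dte=44, delta=5.6, pct_otm=10.1),
]

print(pick_put(chain, delta_range=(5.0, 5.0)))  # exact match finds nothing
print(pick_put(chain, delta_range=(4.6, 5.2)))  # a small band finds a strike
```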
Open 1 SPY put closest to 45 DTE daily with a max of 21 open positions:
- Delta 4.6-5.2, close at max profit 25% or 21 DTE
- OTM 9.8-10.5%, close at max profit 25% or 21 DTE
- Delta 4.6-5.2, close at max profit 40% or 21 DTE
- OTM 9.8-10.5%, close at max profit 40% or 21 DTE
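The management rules shared by the four scenarios can be written down as a small sketch. I am reading “close at max profit 25%” as “close once 25% of the credit received has been captured”, which is an assumption about the wording; `should_close` and `can_open` are hypothetical helper names, not OptionStack functions.

```python
def should_close(credit, current_price, dte, profit_target=0.25, exit_dte=21):
    """Exit rule for a short put.

    credit: premium received per share when the put was sold
    current_price: current cost per share to buy the put back
    Returns True once profit_target of the credit is captured
    or the position has reached exit_dte days to expiration.
    """
    profit_fraction = (credit - current_price) / credit
    return profit_fraction >= profit_target or dte <= exit_dte


def can_open(open_positions, max_open=21):
    """Entry cap: one new put per day, up to max_open positions at once."""
    return open_positions < max_open


# Sold for 1.00, now worth 0.75 with 40 DTE: 25% captured, so close.
print(should_close(1.00, 0.75, dte=40))
# Same position under the 40% target stays open.
print(should_close(1.00, 0.75, dte=40, profit_target=0.40))
# A losing position is still closed once it reaches 21 DTE.
print(should_close(1.00, 1.50, dte=21))
```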
Here’s what the tool provided as “summary” data:
The date range is capped at ~1 year or less, within the dates above, due to my account being a trial account. Paid accounts are able to use the full date range, which goes back to 2011.
I found it curious that the tool didn’t automatically calculate the win / loss percentages and average trade P/L.
The output files include:
- Log: informational messages such as “no trades met your criteria for trading day ‘x’”
- Statistics: daily values for: portfolio, net change, cumulative P/L
- Transactions: execution time, buy/sell, qty, fill price, instrument, comments
- Summary: the values in the screenshot above
- User Plots: values used to construct charts, in this case SPY P/L and portfolio return
Using scenario 1 as an example, when I dug into the transaction log I found significant discrepancies and misleading interpretations compared to the summary file. In no particular order:
- Max Drawdown is not the same as max loss. Drawdown appears to be calculated based on unrealized P/L. Max position loss is nowhere to be found.
- Num. Winners is not accurate. With the exception of a single position, all were winners. I’m not sure how the depicted number is derived.
- Num. of Losers is not accurate; I’m not sure how this number is derived. As a hunch I suspected it was the count of times the portfolio encountered a negative combined unrealized+realized P/L, but that’s not the case as the statistics file shows 6 instances of negative values. A manual review of the transactions demonstrates there was only one losing trade for a total loss of $37.50 (ignoring txn fees).
- Average Gain and Average Loss are inaccurate based on the previous observations.
- Portfolio Return is therefore not accurate either.
- Start Date is accurate. I looked up the closing value of SPY on July 3, 2017 and compared it with the strike price of that option. It checked out. That was the only spot check I performed on this attribute.
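The difference between drawdown and max position loss can be made concrete with a short sketch. It assumes drawdown is the largest peak-to-trough decline in daily portfolio value (which includes unrealized P/L), while max position loss is the largest realized loss on a single closed trade. The sample numbers are invented for illustration and are not taken from my backtest files.

```python
def max_drawdown(equity):
    """Largest peak-to-trough decline in a series of portfolio values."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, peak - value)
    return worst


def max_position_loss(trade_pnls):
    """Largest realized loss among closed trades, as a positive number."""
    return max((-p for p in trade_pnls if p < 0), default=0.0)


# Invented daily portfolio values, marked to market.
daily_equity = [10000.0, 10050.0, 9900.0, 10020.0, 10100.0]
# Invented realized P/L per closed trade.
closed_trades = [25.0, 30.0, -37.5, 28.0]

print("max drawdown:", max_drawdown(daily_equity))
print("max position loss:", max_position_loss(closed_trades))
```

A portfolio can show a sizable drawdown on paper even when nearly every trade closes as a winner, which is why the two figures should be reported separately.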
Not listed in the summary is the mechanic of plotting total risk throughout the life of the backtest. There is an explicit feature for this in the tool, but it only seems to record values every couple of weeks as opposed to daily, which makes it unsuitable for risk management.
The transaction log does seem to be a solid resource. Identifying the positions for use in running calculations is half the battle, which OptionStack does nicely (as long as it’s not opening a 3 DTE trade on Fridays). By using the transaction log one can calculate any number of [accurate] statistics themselves.
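I did my number crunching in Excel, but the same idea can be sketched in a few lines of Python. This version assumes the log has been reduced to chronological `(side, qty, price, instrument)` tuples, that every position is a short put opened with a sell and closed with a buy, and that fees are ignored. The function names, the instrument labels and the sample rows are all hypothetical.

```python
from collections import defaultdict, deque

MULTIPLIER = 100  # standard US equity option contract size


def trade_pnls(transactions):
    """Pair each sell-to-open with the next buy on the same instrument (FIFO)."""
    open_credits = defaultdict(deque)
    pnls = []
    for side, qty, price, instrument in transactions:
        for _ in range(qty):
            if side == "sell":
                open_credits[instrument].append(price)
            else:
                credit = open_credits[instrument].popleft()
                pnls.append((credit - price) * MULTIPLIER)
    return pnls


def summarize(pnls):
    """Win rate and average gain / loss per closed trade."""
    wins = [p for p in pnls if p > 0]
    losses = [p for p in pnls if p <= 0]
    return {
        "trades": len(pnls),
        "win_rate": len(wins) / len(pnls) if pnls else 0.0,
        "avg_gain": sum(wins) / len(wins) if wins else 0.0,
        "avg_loss": sum(losses) / len(losses) if losses else 0.0,
    }


# Made-up rows in chronological order.
transactions = [
    ("sell", 1, 0.50, "PUT_A"),
    ("sell", 1, 0.50, "PUT_B"),
    ("buy", 1, 0.25, "PUT_A"),
    ("buy", 1, 0.875, "PUT_B"),
]

print(summarize(trade_pnls(transactions)))
```

FIFO matching matters here because a daily entry rule can sell the same strike and expiration on more than one day.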
I pulled together my own “dashboard” after spending some time in Excel. Going forward I should be able to reuse the formulas and build dashboards (or more accurately, crunch the data behind the dashboard) much faster.
Based on available data, it appears that selecting trades by % OTM generates a greater return with less volatility than selecting them by delta. I’m curious how this would have played out during 2008.
For those interested in reviewing the data for each of the backtests, it’s available here. As usual, the file is read-only. You will need to make a copy to make any edits.
I know the options trading crowd is a small world; I don’t intend for this to be a bash session on a particular tool. If you’re involved with the OptionStack team and reading this, there are several other bits of feedback I’d like to provide ranging from UI elements and UX to additional data points to help improve the product. Feel free to reach out. I’m happy to be a [back]tester.