Using regression to find optimal parameters for a trading strategy based on market regime

  • StabMe

I am still fairly new to the field, so forgive me if the whole post and my questions sound stupid.

A bit of explanation first.

So I have a trading strategy, an extension of the Avellaneda-Stoikov model, which I thoroughly tested in a backtest simulating an HFT environment; it is also trading live on crypto exchanges with decent results. Sometimes it loses money due to sudden volatility spikes, which I can manage by detecting them and halting order submission while they last. But I also notice that when the market is less volatile, for example, the strategy would have made more money had it placed bid/ask orders more aggressively, and that during strong price trends it tends to be less profitable. In other words, by switching between more and less aggressive trading modes based on the market regime, the strategy could have earned more, or lost less by being more (or less) risk averse.

So I had an idea: run a series of backtests (which, by the way, align quite nicely with live trading in terms of how live and backtest PnL match) with different strategy parameters, collect features describing the market state along with the equity earned during each step, and save the data, together with the parameters, into a dataset. I chose a step of 5 seconds, during which roughly 10 to 100 orders are placed; features are collected and the equity earned is calculated per step. I decided to try 3 combinations of two parameters, namely the gamma and delta of the Avellaneda-Stoikov model: the combination that had shown good performance initially, plus a more and a less risk-averse variant. A series of such backtests thus yields a series of datasets, which are then concatenated into a single dataset.
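
A minimal sketch of how such a dataset could be assembled; the run_backtest helper, the parameter values and the column names are all hypothetical, not the actual implementation:

Code:
import pandas as pd

# Hypothetical grid: the empirically good (gamma, delta) pair plus a more
# and a less risk-averse variant.
PARAM_GRID = [(0.10, 0.5), (0.20, 0.5), (0.05, 0.5)]

def build_dataset(market_data):
    frames = []
    for gamma, delta in PARAM_GRID:
        # run_backtest is assumed to return one row per 5-second step,
        # holding the market-state features and the equity earned
        # during that step (step_reward).
        df = run_backtest(market_data, gamma=gamma, delta=delta)
        df["gamma"], df["delta"] = gamma, delta
        frames.append(df)
    return pd.concat(frames, ignore_index=True)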

So we have a large dataset of backtest results with columns describing the state of the market, a step reward (equity earned) and the gamma/delta values. This dataset is then used to train a RandomForestRegressor model. The target is step_reward, shifted by one position, so that we can predict which feature/parameter combination results in which step_reward.
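
In code this step might look roughly like the following, continuing the hypothetical names above and applying the decile discretization described in the next paragraph:

Code:
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

FEATURES = ["rsi", "volume", "vwap"]  # illustrative subset of the features listed below

df = build_dataset(market_data)

# Decile-bucket each feature (indices 0-9), as described in the next paragraph.
for col in FEATURES:
    df[col + "_b"] = pd.qcut(df[col], 10, labels=False, duplicates="drop")

# Shift the reward back one step: features observed at step t are paired
# with the equity earned over step t+1.
df["target"] = df["step_reward"].shift(-1)
df = df.dropna(subset=["target"])

X_COLS = [c + "_b" for c in FEATURES] + ["gamma", "delta"]
model = RandomForestRegressor(n_estimators=200, n_jobs=-1, random_state=0)
model.fit(df[X_COLS], df["target"])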

Before training, features are discretized into deciles and an array of all possible feature combinations is created. Every feature combination is then tried with every parameter combination, and the combination that predicts the highest step_reward is saved into a lookup table. This lookup table is then used in the backtester (or could be used in a live trading environment) to answer the question: which gamma and delta values are going to be most profitable given this set of discretized features? The lookup table is effectively a typed dict where the set of features is encoded as a base-10 value and used as the key, and gamma/delta are stored as a list in the value. This dict is then passed to the backtester.
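
A sketch of the lookup-table construction, reusing the hypothetical names from the snippets above:

Code:
import itertools
import numpy as np
import pandas as pd

N_BUCKETS = 10  # deciles

def encode_key(buckets):
    # Pack a tuple of bucket indices (each 0-9) into one base-10 integer key.
    key = 0
    for b in buckets:
        key = key * 10 + int(b)
    return key

lookup = {}
# Note: this loop has 10**len(FEATURES) iterations, so with 5-6 features the
# table already holds 10^5-10^6 keys.
for combo in itertools.product(range(N_BUCKETS), repeat=len(FEATURES)):
    # Score every parameter combination for this market state and keep the best.
    candidates = pd.DataFrame(
        [list(combo) + [g, d] for g, d in PARAM_GRID], columns=X_COLS
    )
    preds = model.predict(candidates)
    lookup[encode_key(combo)] = list(PARAM_GRID[int(np.argmax(preds))])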

Along with the lookup table, I also create a bucket_ranges table which maps real feature values into the discretized buckets. Discretization was performed on the whole dataset, so discretizing on the fly would produce different bucket boundaries. That is why, as a quick workaround, I decided to use a bucket_ranges table which simply converts a real feature value to a bucket number, which is then used as part of the key when the optimal gamma/delta are requested.
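
One possible shape for that workaround, again with assumed names: store the decile edges computed once on the full training set and use them to bucket live values on the fly.

Code:
import numpy as np

# Decile edges per feature, computed once on the whole training dataset.
bucket_ranges = {
    col: np.quantile(df[col], np.linspace(0.0, 1.0, 11)) for col in FEATURES
}

def to_bucket(col, value):
    # Map a raw live value onto the training-time decile edges; clipping
    # keeps out-of-range live values inside buckets 0-9.
    edges = bucket_ranges[col]
    return int(np.clip(np.searchsorted(edges, value, side="right") - 1, 0, 9))

# Usage: build the base-10 key from live feature values, fetch gamma/delta.
# live = {"rsi": 55.2, "volume": 1.3e6, "vwap": 101.7}
# key = encode_key(tuple(to_bucket(c, live[c]) for c in FEATURES))
# gamma, delta = lookup[key]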

Instead of predicting the highest step_reward, I also tried predicting the sign of step_reward.

This idea is loosely based on this article, where the authors trained a reinforcement learning model to predict the sign of the equity earned during a step.

When I train a model to predict the sign of the equity, I can reach an accuracy of up to 80% with a RandomForestClassifier, but this requires too many buckets and too many features, which makes building a lookup table impractical: its length grows to a few hundred million records. With only 5-6 features discretized into buckets, accuracy drops to about 60% over the period I tested.
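
The sign-prediction variant is a small change on top of the regression sketch above, with the same assumed names:

Code:
from sklearn.ensemble import RandomForestClassifier

df["sign"] = (df["target"] > 0).astype(int)
clf = RandomForestClassifier(n_estimators=200, n_jobs=-1, random_state=0)
clf.fit(df[X_COLS], df["sign"])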

When I use a generated lookup table in the backtester, the predicted gamma/delta values produce a significant increase in Sharpe ratio and a drop in maximum drawdown. However, the improvement only holds for the period the model was trained on, which is not a surprise. When the period is shifted, performance drops even below what I get with the single, empirically found parameter combination I used before.

Features I used (all of them or various subsets): RSI, volume, VWAP, mean position, number of fills during a step, volatility (long and short windows), and mean book imbalance during a step (at the BBO level, 5 levels deep, and 2.5% deep from the midprice).

So my question is: is it common practice to enrich a trading strategy with a regression model where what we predict is not the next price move, but rather a combination of parameters with a high probability of producing a higher profit? Or is this an otherworldly idea that doesn't stand up to criticism? I do use Bayesian optimization when searching for optimal parameters in backtests, but those parameters are set once and never changed, so I thought I could 'teach' the model to be more flexible across market regime changes.

TLDR: Is using a regression model to predict optimal trading strategy parameters, from collected features describing the market state and the profits earned over the same period, a bad idea?
 

Unreplied Threads

Risk management tools for long term Gamma/Vega sellers subject to margin calls

  • Lisa Ann
  • Finance
  • Replies: 0
TL;DR: if you're a retail investor who systematically sells long-term vertical spreads while staying Delta-neutral, your main risk comes from Vega and from the Gamma of opening gaps, which can land you in a margin call. Which model can help you size the liquidity buffer needed to survive these risks while still allowing acceptable returns on capital?



Background

For a while, I've been investigating which trading strategies retail options traders use to steadily make meaningful profits by selling Gamma on European index options. So, before the actual question, I would like to share some of my findings. I stress the word "retail" because the perks and flaws are different from those of professional players (like hedge funds):

  • usually, transaction costs are a serious problem. This means, for instance, that the frequency of Delta hedging has to be low. Moreover, synthetic exposures to Vega, such as those provided by proper long/short positions along the whole chain, are unworkable. Even an Iron Condor can be too expensive at some moneyness, that is, its payoff doesn't properly reward the risks taken once you subtract transaction costs from the final expected gain;
  • there is no lifeboat. If your mark-to-market current loss exceeds your current margin requirements and you don't have a liquidity buffer, the game is over even if your sold options are still OTM;
  • market liquidity is almost never a problem. It's hard to find a strike with a reasonable moneyness where the market maker isn't buying or selling an amount suitable to your needs. Therefore, you're never supposed to give some market makers a call to trade OTC.

So, according to my research, natural selection kept alive only a small fraction of Gamma sellers. This Darwinian process killed those who took the shortcut of the proverbial "Picking Up Nickels In Front Of A Steam Roller" by keeping themselves too close to the crushing machine, that is, selling high-Gamma options (even with "wings" properly hedged, e.g. vertical spreads). As you already know, bid-ask spreads make this short-term game unfair, and Delta hedging cannot work while you're approaching the so-called "pin risk". In short, if you systematically sell short-term Gamma to harvest some variance risk premium, you end up... dead. That's what I've seen occur with scary frequency.

This pushed the survivors to move to longer maturities: they saw how much easier it was to hedge the Delta and the Gamma while keeping transaction costs low. However, there's no free lunch:

  • those who set aside a lot of buffer capital found out that time decay is a negligible source of profits when you're far from the expiration date, and that a Treasury bill would have given them better returns;
  • those who set aside only a small multiple of the margins required by their brokers met their new horrible reality: as soon as the implied volatility term structure makes a spike, Delta and Gamma neutrality means nothing and they blow up anyway;
  • even with enough capital set aside and a meticulous Delta management, there's no way you can deal with a gargantuan opening gap unless already hedged somehow.

This triggered another natural selection round. The (few) survivors put together the puzzle pieces and - according to my research - the highest survivability rate so far has been achieved by those who roughly trade like this:

  1. start from a medium or long-term implied volatility skew setup which allegedly bodes well for a mean reversion (a sketch follows the figure below). Two easy examples: (1) a rolling skew time series built as the difference between constant-moneyness implied volatility time series; (2) a large difference between the implied risk-neutral density and the historical density. Generally speaking, playing with percentiles can show something worth trying;
  2. open a simple even-legs vertical spread, e.g. short 95% Put and long 90% Put. The long protective option is a necessary evil because you cannot forecast gargantuan opening gaps;
  3. don't just reset Delta to zero. Instead, overhedge it to make it slightly negative: as implied volatility is usually correlated with negative underlying returns, you would want a slightly negative Delta to partially hedge Vega in case of an implied volatility spike and a market drop.

You could end up with a very simple position like this (Delta -5% is just an example; you should run a regression to estimate the beta between implied volatility and negative returns):

[image: example position with slightly negative net Delta]
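
For step 1 above, a minimal sketch of the rolling-skew percentile signal; the constant-moneyness implied-volatility series iv_95 and iv_90 are assumed inputs, not real data:

Code:
import pandas as pd

# iv_95, iv_90: assumed daily implied-volatility series at constant 95% and
# 90% moneyness for a fixed time to maturity (pandas Series indexed by date).
skew = iv_95 - iv_90

# Percentile of today's skew within its own one-year rolling history; an
# extreme reading is a candidate mean-reversion setup.
pct = skew.rolling(252).apply(lambda w: (w <= w.iloc[-1]).mean(), raw=False)
setup = pct > 0.95  # skew unusually steep -> candidate bet on reversion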

Question

From a broader perspective, it's clear what these traders are striving to build: homemade high-yield bonds. They're always seeking a fixed income because of their risk aversion, and this feels more natural to them than small bleeds followed by occasional large returns.

Ok, it seems easy, right? Wrong! Aside from the blatant risks (e.g. Delta hedging leaves at least one "wing" open to the downside), here comes the hard part: if you have $M$ dollars and one of those trades requires $m<M$ dollars of margin, how many of those trades can you open with your broker to optimize the trade-off between returns and risks?

Possible insights

It seems that the main variable here is margin volatility. If you can estimate a density for $m$, you can optimize the use of your capital buffer. Margins can be calculated according to many methodologies, but according to my investigations almost every possible algorithm involves stress testing (see examples from Interactive Brokers):

  1. unfavourable set of underlying movements;
  2. unfavourable set of implied volatility spikes (or unfavourable risk neutral density tails enlargement, if you're working under a model free framework);
  3. portfolio re-pricing;
  4. $m$ is the loss under the worst case scenario.

Trying to quantitatively capture retail traders' rules of thumb led me to assume some usage of Extreme Value Theory applied to scenarios that have already been stressed. If the margin required by the broker is like a Value-at-Risk, then the problem here is to estimate an Expected Shortfall.
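
To make that concrete, a rough peaks-over-threshold sketch; the margin history m_hist is an assumed input, and the VaR/ES expressions are the standard POT closed forms, valid for a tail index 0 < xi < 1:

Code:
import numpy as np
from scipy.stats import genpareto

m_hist = np.asarray(m_hist)            # assumed history of daily margin requirements
u = np.quantile(m_hist, 0.95)          # EVT threshold
excesses = m_hist[m_hist > u] - u

# Fit a generalized Pareto distribution to the exceedances over the threshold.
xi, _, beta = genpareto.fit(excesses, floc=0)

# Closed-form POT estimates of VaR and Expected Shortfall at level q; the ES
# is a candidate size for the liquidity buffer on top of the current margin.
q = 0.99
tail_frac = excesses.size / m_hist.size
var_q = u + (beta / xi) * (((1 - q) / tail_frac) ** (-xi) - 1)
es_q = var_q / (1 - xi) + (beta - xi * u) / (1 - xi)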

Moreover, I found out that using the risk-neutral density instead of stochastic or local volatility could simplify the job. For example, if you stick to a lognormal mixture framework (see Lognormal-mixture dynamics and calibration to market volatility smiles), a shock scenario doesn't need to disturb the SDE's parameters nor alter a volatility surface in a consistent (and arbitrage-free) manner: just inflate the tails, shift the weighted mean(s) and rebuild your stressed chain to get the new worst-case prices. A stress test scenario could be parsimoniously described by just five parameters (two standard deviations, two means, one weight) instead of a whole set of perturbed volatilities.
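
A toy version of that repricing step, under heavy simplifications (zero rates, a two-component lognormal mixture in the Brigo-Mercurio style of the cited paper; the martingale constraint tying the component means together is glossed over, so this is illustration only):

Code:
import numpy as np
from scipy.stats import norm

def black_call(f, k, sigma, t):
    # Undiscounted Black-76 call on forward f.
    if sigma * np.sqrt(t) <= 0:
        return max(f - k, 0.0)
    d1 = (np.log(f / k) + 0.5 * sigma**2 * t) / (sigma * np.sqrt(t))
    return f * norm.cdf(d1) - k * norm.cdf(d1 - sigma * np.sqrt(t))

def mixture_call(f, k, t, w, sig1, sig2, mu1=0.0, mu2=0.0):
    # Weighted sum of Black prices, one per lognormal component; mu1/mu2
    # shift the component forwards. These are the five stress parameters
    # mentioned in the text (two sigmas, two means, one weight).
    return (w * black_call(f * np.exp(mu1), k, sig1, t)
            + (1 - w) * black_call(f * np.exp(mu2), k, sig2, t))

# Base chain vs. a stress scenario: inflate the tail component, shift it down.
base = mixture_call(100.0, 95.0, 0.5, w=0.7, sig1=0.15, sig2=0.35)
stressed = mixture_call(100.0, 95.0, 0.5, w=0.7, sig1=0.15, sig2=0.55, mu2=-0.05)
print(f"option repriced from {base:.2f} to {stressed:.2f} under the stress")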

As you can see, I grope around in the dark. Any effective tool will be the answer.

How do I calculate the output impedance of the colpitts oscillator if I want to match it to the 50 ohm input of a spectrum analyzer?

  • help_me_learn
  • Physics
  • Replies: 0
I am still working on the 400+ MHz Colpitts oscillator, and I think I may have found the problem with the circuit I designed on the PCB. This will involve 2 questions; the first question is below. I will create a separate post about the swing of the oscillator and how the Q factor of the inductor may have been the problem.

To properly measure this signal: First, the oscillator is supposed to produce more than 400 MHz. The oscilloscope I was using to measure the PCB design's signal can't measure signals above 300 MHz. Second, the probes connected to the oscilloscope are rated for 60 MHz, and I am not sure what would happen if I used them for signals above that. Next, I have access to a spectrum analyzer that can work with signals above 1 GHz, but it requires SMA connectors. I also have access to cables, but I have to order SMA connectors and redesign my PCB if I plan to test the oscillator with SMA. This is where I have a question:

1) How would I calculate the output impedance of the Colpitts oscillator? I need this impedance to design a matching network that matches the output to the 50 ohm input of the spectrum analyzer.

Here is the Colpitts oscillator:

[image: Colpitts oscillator schematic]

Here is the equivalent circuit i was thinking for the output impedance:

[image: equivalent circuit for the output impedance]

and simplifying it would lead to:

[image: simplified equivalent circuit]

Since the inductors and capacitors cancel each other out at the resonance frequency of the LC tank, the final result would be (?):

[image: final resistive equivalent circuit]

Would the final result be Rp||R1||R2||R3||R4?
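
If the circuit really does reduce to pure resistances at resonance, the numeric estimate is a one-liner; the component values below are placeholders, not taken from the schematic:

Code:
def parallel(*rs):
    # Equivalent resistance of resistors in parallel: 1 / sum(1/R_i).
    return 1.0 / sum(1.0 / r for r in rs)

# Placeholder values for Rp (tank loss resistance) and the bias resistors.
r_out = parallel(5_000, 10_000, 10_000, 1_000, 470)
print(f"estimated output impedance ~ {r_out:.0f} ohm")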

What components will I require to ascertain the capacity of a lithium iron battery?

  • JoeyB
  • Physics
  • Replies: 0
I see quite a few posts on measuring battery capacity. However, I cannot find one that shows how this is practically achieved (if there is a post, kindly inform me).

I am assuming that in order to ascertain the capacity I would need the voltage and the current flowing to the load (measured using a shunt). In addition, I would require a microcontroller to use these two parameters to calculate the capacity.

How would I therefore use these two parameters in order to determine the battery's capacity?

Do I need other parameters as well? If so, which ones?

I want to determine the "actual" capacity for a 12V 8Ah and 18Ah lithium battery.
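
For reference, the usual approach is coulomb counting: discharge into a known load and integrate the shunt current over time until the cutoff voltage is reached. A minimal sketch, with the ADC callbacks assumed rather than real hardware calls:

Code:
import time

def measure_capacity_ah(read_voltage, read_current, cutoff_v, dt=1.0):
    # Coulomb counting: integrate discharge current over time.
    # read_voltage/read_current are assumed callbacks returning volts and
    # amps from the microcontroller's ADC and the shunt; returns amp-hours
    # delivered until the pack hits its cutoff voltage.
    ah = 0.0
    while read_voltage() > cutoff_v:
        ah += read_current() * dt / 3600.0
        time.sleep(dt)
    return ah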

Can I read a PIC24FJ128GA010 hex file from the microcontroller using a PICkit 3?

  • SeanJ
  • Physics
  • Replies: 0
I'm working on a project where I have a functional device and some old code with no instructions and no .x project file (an abandoned project).

I want to recompile the old code and reprogram the PIC24FJ128GA010 before altering the source code. Unfortunately, this risks bricking the old device, especially as the previously used toolchain is unknown.

I have been advised to first read the hex file from the microcontroller and keep it as a backup: if the code I compile fails to program correctly, I can reprogram the device with the code that was on it.

I am using a PICkit 3 and MPLAB X to work with the PIC24FJ128GA010, and I have been experimenting on an old Explorer 16 board with a similar chipset on a PIM.

[image: Explorer 16 board with PIM]

I see no option in MPLAB X to read a hex file off the PIC. Is there a way to read the hex file from a PIC using a PICkit 3?

Switching on high side and low side of the load

  • Alex Drake
  • Physics
  • Replies: 0
I am working on a safety-related circuit that turns a load (the solenoid of a hydraulic valve) on and off. I am wondering how I could implement this type of circuit using MOSFETs. Here is an equivalent circuit for reference. The load voltage is 28 V DC, and the 2 control signals are 12 V DC.

[image: equivalent switching circuit]

Stored procedure with a SELECT in SQL Server

  • Bruno Henrique
  • Technology
  • Replies: 0
Good evening. I am trying to create a stored procedure that takes one of its inputs from a SELECT. The reason is that I have an items table, and I need to insert all of the item codes into another table.

The idea is roughly this: in the item table, only the "ItemCodigo" column matters, but the table has many columns and around 170k rows, each with a different itemcodigo.

The ItensUsuario table holds the link between the user and the item, plus some other information.

I need to create a stored procedure that inserts every itemcodigo into ItensUsuario.

I tried a few approaches, but I got an error converting varchar to int (I believe it is caused by the SELECT I pass in).

Code:
create procedure inserirtodositens
@idusuario int,
@idproduto int
as 
begin
insert into xx.dbo.itensusuario
values (@idusuario, @idproduto, 0, getdate(),0,-1)
end 

execute inserirtodositens 594, 'select itemcodigo from xx.dbo.itens'

The error returned is "Error converting data type varchar to int."

If I change "@idproduto int" to "@idproduto varchar(8000) = NULL", the error returned is "Conversion failed when converting the varchar value 'select itemcodigo from xx.dbo.itens' to data type int."
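
For what it's worth, the usual pattern here is a set-based INSERT ... SELECT inside the procedure rather than passing a query string as a parameter; a sketch reusing the names from the post, with the trailing constants copied from the original VALUES list:

Code:
CREATE PROCEDURE inserirtodositens
    @idusuario INT
AS
BEGIN
    -- One row per item code for the given user; the SELECT replaces the
    -- single @idproduto parameter from the original attempt.
    INSERT INTO xx.dbo.itensusuario
    SELECT @idusuario, i.itemcodigo, 0, GETDATE(), 0, -1
    FROM xx.dbo.itens AS i;
END
GO

EXECUTE inserirtodositens 594;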

Turn off the "follow mouse" feature when I press transformation shortcuts?

I'm more experienced with Maya and am trying to transition to Blender. Blender's default behavior for the move, rotate and scale hotkeys (G, R, S) is weird to me: when I press them, the selected object follows my cursor. I want to change the hotkey settings so that pressing those keys just turns on the manipulators rather than making the object follow my cursor, just like in Maya.