In most advanced EMSes, the behaviour of benchmark and SOR algorithms is managed via a large number of parameters rather than being hard-coded. In Quod Financial EMS, for example, there are over 100 parameters to manage a single trading algorithm. An example of a parameter is “Level of Aggressivity”, defined as the amount of liquidity that a trader wants to “destroy” during the execution. For a given instrument (e.g. a currency), the aggressivity takes into account the available liquidity as well as the urgency of completing the execution: it is common practice to be less aggressive at the start of the execution and to increase aggressivity towards the end.
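As an illustration only, the "less aggressive at the start, more aggressive towards the end" practice could be sketched as a simple schedule. The linear ramp, the 0.25 and 0.75 bounds, and the function names below are assumptions made for this example; they are not Quod Financial parameters.

```python
def aggressivity(elapsed_fraction: float, start: float = 0.25, end: float = 0.75) -> float:
    """Hypothetical schedule: ramp aggressivity linearly from `start` to `end`."""
    # Clamp so that times outside the execution window do not extrapolate.
    t = min(max(elapsed_fraction, 0.0), 1.0)
    return start + (end - start) * t


def child_quantity(available_liquidity: float,
                   remaining_quantity: float,
                   elapsed_fraction: float) -> float:
    """Quantity to take now: a share of the available liquidity,
    capped by what is still left to execute."""
    target = aggressivity(elapsed_fraction) * available_liquidity
    return min(remaining_quantity, target)
```

In a real algorithm the shape of the ramp and its bounds would themselves be tunable parameters, which is exactly the kind of setting the rest of this paper proposes to optimise.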
Liquidity is a dynamic feature of the market. It reflects different underlying forces, including:
- The performance/behaviour of the single instrument
- Rule modifications at the venue level, e.g. changes to margin or risk levels
- An increase or decrease in overall volume or participation
This constant change in liquidity patterns creates the need to review the performance of the algorithms frequently and to tune the governing parameters. This is a labour-intensive quantitative analysis job, done by data scientists (also called ‘Quants’). We believe that the parameters and performance of a fair number of algorithms are not reviewed, and that the parameters are consequently tuned infrequently.
The case for applying AI/ML to the optimisation of algorithmic trading parameters is therefore very strong. Additional arguments include:
- Trading algorithms produce a lot of data, including Best Execution and Transaction Cost Analysis data, which can be enriched with market data and benchmarks. Note that the data set is not as large as mainstream ML requires, but Reinforcement Learning (RL) is well suited to this setting.
- ML algorithms can be trained easily: the training can use the existing algorithmic trading testing and back-testing environments (which are part of the regulatory requirements for algorithmic trading). These environments provide a good way to replicate the behaviour of multiple agents and venues, together with a replay mechanism.
Our own trials
In our first trials, we decided to use an RL technique called Contextual Bandits, which follows this process:
- The algorithm observes a context: the features extracted from market data and order input properties.
- It then makes a decision, choosing one action from a number of alternatives: here, which parameters are best.
- Finally, it observes the outcome of that decision.
The outcome defines a reward. The objective of the training is therefore to maximise the average reward (or, equivalently, to minimise the average cost); in our case, for example, the aim is to minimise the slippage (the impact of the execution) of the algo order.
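The observe, decide, reward loop described above can be sketched as follows. This is a minimal epsilon-greedy illustration, not the implementation used in our trials: the two contexts, the three parameter sets, the simulated slippage numbers, and every name in the code are hypothetical, and the toy simulator merely stands in for a back-testing environment.

```python
import random
from collections import defaultdict


class EpsilonGreedyBandit:
    """Tabular contextual bandit: one running mean reward per (context, action)."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)
        self.means = defaultdict(float)

    def choose(self, context):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.means[(context, a)])

    def update(self, context, action, reward):
        key = (context, action)
        self.counts[key] += 1
        # Incremental update of the mean reward for this (context, action).
        self.means[key] += (reward - self.means[key]) / self.counts[key]


def simulated_slippage_bps(context, action, rng):
    """Toy stand-in for a back-testing environment (hypothetical numbers)."""
    best = {"liquid": "aggressive", "illiquid": "passive"}[context]
    base = 2.0 if action == best else 6.0
    return base + rng.uniform(-1.0, 1.0)


def train(rounds=5000, seed=0):
    rng = random.Random(seed)
    bandit = EpsilonGreedyBandit(["passive", "neutral", "aggressive"], seed=seed)
    for _ in range(rounds):
        context = rng.choice(["liquid", "illiquid"])              # 1. observe a context
        action = bandit.choose(context)                           # 2. choose a parameter set
        slippage = simulated_slippage_bps(context, action, rng)   # 3. observe the outcome
        bandit.update(context, action, -slippage)                 # reward = negative slippage
    return bandit
```

Using the negative slippage as the reward turns "minimise slippage" into "maximise average reward". A production system would work with continuous features rather than two discrete context labels, which typically calls for a model-based bandit instead of a lookup table.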
This technique works well with sparse data and with outcomes that are only reached after a long series of actions, like a game of chess, where you do not know the result until you win, lose, or draw. Contextual Bandits algorithms are evolving rapidly to address complex decisions such as those in algorithmic trading. For instance, the notion of Regret (the gap between the outcome achieved and the outcome the best available action would have achieved) makes it possible to recalibrate the outcomes.
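For reference, Regret is commonly written as the cumulative gap between the expected reward of the best action in each context and that of the action actually chosen. The symbols below are notation introduced for this note, not taken from our trials: x_t is the context at round t, a_t the chosen action, a_t^* the best action for that context, and r the reward.

```latex
R_T = \sum_{t=1}^{T} \Bigl( \mathbb{E}[\, r \mid x_t, a_t^{*} \,] - \mathbb{E}[\, r \mid x_t, a_t \,] \Bigr),
\qquad a_t^{*} = \arg\max_{a} \, \mathbb{E}[\, r \mid x_t, a \,]
```

As a purely hypothetical example, if the reward is the negative slippage, the best parameter set would have cost 2 bps on average, and the chosen one costs 6 bps, that round contributes (-2) - (-6) = 4 bps of regret.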
Building on this first project, we are now working on a second AI/ML project, the Algorithmic Trading Selector. It aims to help the trader (human or machine-replicated) select the best algo, taking into account the execution objectives and the context (e.g. liquidity).
More information about this Whitepaper can be found at: http://www.quodfinancial.com/aiml-july18