Friday, June 15, 2007

MiFID: did anyone notice yet?

OK, so I am assuming that people reading this know what the Markets in Financial Instruments Directive (MiFID) is. Looking on Google is only going to make things more confusing for you, so if you don't have a clue, just walk away now. Nothing is happening here, move along there.


For those of you who decided to keep reading, this particular post is not really about MiFID directly. It has more to do with the possible after-effects of MiFID, and with what people in the financial services industry might want to start thinking about when they come to do their budgets for next year.

The Boat Project is an exceptionally serious attempt to provide a reporting platform for pre- and post-trade reporting requirements under MiFID. ABN-Amro, Citibank, Credit Suisse, Deutsche Bank, Goldman Sachs, HSBC, Merrill Lynch, Morgan Stanley, UBS, Barclays Capital, BNP, DKIB, JP-Morgan and RBS are a suitably heavyweight collection of players. I can imagine that they will manage to produce some suitably large collections of data for the markets to consume and analyse.
Whilst not all of these firms are directly involved in Turquoise, I can easily imagine all of them using the Turquoise platform as a means to generate a lot more tick and quote information for the markets.
We will need to wait and see what kind of role the Equiduct and Instinet alternatives will play. There will, however, be a marketing requirement for all of the new Multilateral Trading Facilities (MTFs) in a post-MiFID world. They will need to show the depth of liquidity that their particular facility is able to offer. An easy way to do this is to publish the Tick & Quote data for your particular platform.
Obviously, if you want the sub-millisecond latency for your clever little Algo-Trading box, then you will need to pay the overhead that is normally associated with very low latency tick data. If on the other hand you are happy to see 0.5 to 3 second latency, then you might just find that this is a very cheap data source. If you can afford to wait until just after the markets have closed, you might even find that this data is provided "at cost".
Hopefully, you have at this point managed to catch up with the theme here. In 2008 we will be seeing a massive growth in the quantity of tick and quote information and a reduction in the costs associated with the provision of this data.

Being able to aggregate all of these different sources might make sense for reasons other than MiFID.
Most of the Risk Management, Risk Analysis and Risk Reporting applications used in the financial markets (for exchange-traded instruments) are based on a snapshot: an end-of-day aggregate data set that tells us the High, Low, Open and Close (HLOC) for a given instrument. This is a somewhat simplistic view, but it rests on the assumption that the current data collections are incomplete. Depending on which analyst, report or study you have been reading lately, the London Stock Exchange accounts for just under half the total volume of trades done in London each day. The rest of the trades are done on the phone in the over-the-counter markets that we sometimes refer to as Dark Liquidity. It makes no sense to look at anything more than HLOC when half the possible data is not available. In a post-MiFID world, where the light of transparency is shone on the dark liquidity, most of this information will suddenly be available. Looking at HLOC will seem a little simplistic.
It will begin to make sense to have some sensible form of aggregation.
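To make the point about incomplete data concrete, here is a minimal Python sketch of how an HLOC summary is built from individual ticks, and how it shifts once the off-exchange trades are included. The prices and the `Bar` and `summarise` names are made up for illustration; nothing here comes from a real feed or a real risk system.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Bar:
    """High/low/open/close summary for one instrument over one period."""
    open: float
    high: float
    low: float
    close: float
    ticks: int


def summarise(prices: Iterable[float]) -> Optional[Bar]:
    """Collapse a time-ordered run of trade prices into a single bar.

    Returns None when there are no prices at all.
    """
    bar = None
    for price in prices:
        if bar is None:
            # The first tick sets every field of the bar.
            bar = Bar(open=price, high=price, low=price, close=price, ticks=1)
        else:
            bar.high = max(bar.high, price)
            bar.low = min(bar.low, price)
            bar.close = price  # the last tick seen is the close
            bar.ticks += 1
    return bar


# Made-up prices for one instrument on one day (illustrative only).
exchange_prices = [100.0, 100.5, 99.8, 100.2]           # on-exchange trades
all_prices = [100.0, 101.4, 100.5, 99.8, 99.1, 100.2]   # plus off-exchange trades

exchange_bar = summarise(exchange_prices)
full_bar = summarise(all_prices)

print("exchange only:", exchange_bar)
print("all trades:   ", full_bar)
```

In this toy data the open and close agree, but the high and low do not, which is the sort of gap that an exchange-only snapshot cannot show.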
The European exchange based markets currently produce something in the order of 600,000 ticks per second. The 27th of February 2007 was a record for most markets and this number jumped to periods of 800,000 ticks per second. So we can expect to have to deal with 1,200,000 to 1,800,000 ticks per second. In a 9-hour day this would be 16,200,000 on a busy day. That is 81,000,000 in a week, or 324,000,000 in a month and 3,888,000,000 for the year. This looks like it might be a big problem. Those are just the ticks for the underlying instruments. If we add the assorted collections of indexes to this, we can quickly get to a point where we are talking about truly vast time series. This is a field where every tiny little detail is no longer trivial.
It is a bit like nailing jelly to a tree, which is the title of my blog, so let's carry on here.
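The running totals in the tick-count paragraph above are easier to check when written out. This short Python sketch reproduces the figures as quoted, which come from multiplying the upper rate by the 9 hours of the trading day. As a separate assumption that is not in the post, it also shows the totals if that rate really were sustained for every second of the session. Markets burst to a peak and fall back, so the second set is best read as an upper bound rather than a forecast. The 5-day week and 4-week month are inferred from the ratios between the quoted figures.

```python
SECONDS_PER_HOUR = 3_600
HOURS_PER_DAY = 9       # trading day length used in the post
DAYS_PER_WEEK = 5       # implied by 81,000,000 / 16,200,000
WEEKS_PER_MONTH = 4     # implied by 324,000,000 / 81,000,000
MONTHS_PER_YEAR = 12    # implied by 3,888,000,000 / 324,000,000


def totals(ticks_per_day: int) -> dict:
    """Scale a daily tick count up to week, month and year."""
    week = ticks_per_day * DAYS_PER_WEEK
    month = week * WEEKS_PER_MONTH
    year = month * MONTHS_PER_YEAR
    return {"day": ticks_per_day, "week": week, "month": month, "year": year}


peak_rate = 1_800_000   # upper figure quoted in the post

# Reading 1: the post's own arithmetic (rate multiplied by hours).
as_quoted = totals(peak_rate * HOURS_PER_DAY)

# Reading 2 (assumption): the rate holds for every second of the day.
sustained = totals(peak_rate * HOURS_PER_DAY * SECONDS_PER_HOUR)

for label, result in (("as quoted", as_quoted), ("sustained per second", sustained)):
    print(label)
    for period, count in result.items():
        print(f"  {period:<6}{count:>24,}")
```

Whichever reading is closer to reality, the conclusion is the same: the series is vast, and the second reading only makes it more so.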

Analysis of the various different market segments shows that there is no common format for the tick and quote information. It is possible to standardise the format of equity information from each of the current 44 exchanges in Europe. Money Markets, Bonds, Repos, Options and Futures all have different quote conventions, and it makes no sense to try to aggregate or massage them into a common format. Initial analysis suggests that there are seven basic forms of quote convention. So I need to have a minimum of seven tables and a total capacity of some 4 billion rows per year. In a simple snapshot view of static price information it should be possible to model this with just about any of the basic database applications. But we are not talking about a snapshot. We are talking about a truly vast time series with quasi-real-time updates of anything upwards of 1,200,000 updates per second. Some of the bigger database vendors out there are able to sustain these types of loading, just so long as nobody wants to ask us any questions whilst we are busy pumping the constant stream of updates into the system.
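The idea of one table per quote convention, rather than one massaged common format, can be sketched in a few lines of Python. Everything named here is hypothetical: the three record types, their fields and the `TickStore` class are illustrative assumptions, not the seven conventions from the analysis and not any vendor's schema. An in-memory list stands in for each database table.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EquityQuote:      # hypothetical: quoted as bid/ask prices
    symbol: str
    timestamp: float
    bid: float
    ask: float


@dataclass(frozen=True)
class BondQuote:        # hypothetical: quoted as bid/ask yields
    isin: str
    timestamp: float
    bid_yield: float
    ask_yield: float


@dataclass(frozen=True)
class OptionQuote:      # hypothetical: needs strike and expiry as well
    underlying: str
    timestamp: float
    strike: float
    expiry: str
    premium: float


class TickStore:
    """Keeps one append-only table per quote convention."""

    def __init__(self):
        self._tables = defaultdict(list)

    def append(self, quote):
        # Route on the record type, so each convention keeps its own columns.
        self._tables[type(quote).__name__].append(quote)

    def rows(self, table):
        return list(self._tables.get(table, []))

    def counts(self):
        return {name: len(rows) for name, rows in self._tables.items()}


store = TickStore()
store.append(EquityQuote("ABC", 1.0, 150.25, 150.50))      # made-up data
store.append(EquityQuote("ABC", 1.5, 150.30, 150.55))
store.append(BondQuote("XX0000000000", 1.2, 4.51, 4.49))
store.append(OptionQuote("ABC", 1.3, 155.0, "2007-12-21", 3.20))

print(store.counts())
```

The design choice is simply that nothing is forced into columns it does not have; a query that spans conventions has to say how it wants them reconciled.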
Fortunately, there is ONE platform that can cope with the size, the rate of update and the collections of complex queries that users might want to ask. For some years now the nice people at Sybase have been selling something called Sybase-IQ. This is a product that has matured to a point where it is now the right product at the right time in the right market.

Truly Vast Time Series seem to have been designed specifically to take advantage of Sybase-IQ (or was it the other way round?).
As expected, the leading database vendor in the financial markets has not been sitting around waiting for users to realise what a super product this was. For instances where you want something a bit more complex than a big clever bucket, the people at Sybase have put together a collection of interesting Sybase technology that they are calling the Real-Time Analytics Platform (RAP).
At this point I need to point out that I have provided some input to the design of this solution. I am a little biased (actually more than a little biased) but the proof is, as they say, in the pudding. For anyone who is seriously interested, I would be more than happy to arrange a little race where we show just how fast this particular solution is. It should be possible for those people who really are interested in doing this to find my real-world credentials.

We think that we have a really sensible aggregation platform and would welcome the opportunity to benchmark it with real data. Is anyone up for the challenge?

Are there any risk management professionals who can see the benefit of having access to such a collection of time-series information?

Is there any interest from the current collection of traditional data vendors in being able to apply this solution to their current collections of data?

Perhaps you are an existing exchange that sees the need to be able to offer enhanced data services as a means to offset the reductions in transaction costs that MiFID will bring.

Or perhaps you are one of the proposed new MTF platforms that sees the merit of being able to offer value-added data services based on a rather sensible data aggregation platform.
