We Trade on the Market Data We Provide… So It Has to Be Right

March 13, 2025 · 6 min read

We’re heading to New York next week for Digital Asset Summit, an institutional-focused event by Blockworks attended by banks, fund managers, and trading firms. It’s an event where people inherently understand the importance of good data, but they’re usually getting it from companies we’ve bought data from ourselves. Those experiences, and their shortcomings, led us to create our own market data service, and that’s what we want to discuss: the advantages of market data built by a firm that actually uses it.

The Market Data Problem

Market data sits at the core of everything we do. If you’re running a trading firm, your systems act on that data. So it needs to be reliable, structured, and easy to work with. That sounds simple, but in practice, it’s anything but.

Our founders, Tim and Marcus, have both spent large parts of their careers doing quant research: building models, analysing signals, understanding microstructure, and trying to draw inference out of large amounts of data. To do that, you first have to get the data. You have to build an environment in which you can handle it, visualise it, and analyse it.

But getting the data, and making sure it’s of a quality that’s useful, has always been a bugbear for quant researchers, because the volumes involved are large. Mid-frequency signals need a long history; high-frequency signals need a lot of ticks. Either way, it’s a lot of data relative to the compute and memory you have available on your machine.

These are problems we’ve butted up against throughout our careers, and the industry as a whole has been getting better at solving them. Data pipelines at really good quant firms are quite sophisticated now: the larger, well-evolved firms work hard to make calling up a time series trivially easy for quant researchers, so it’s never a blocker.

The meme is that you spend half an hour or more mucking around with the data before you’re comfortable doing any calculations on it: have I got missing pieces? If I visualise it, are there obvious outliers I need to understand?
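As a minimal sketch of what that first half hour tends to look like, here’s the kind of quick pass we mean, in pandas. The column names and thresholds are illustrative only, not any particular tool of ours:

```python
import pandas as pd

def sanity_check(trades: pd.DataFrame, ts_col: str = "timestamp", px_col: str = "price") -> None:
    """Quick pre-analysis pass: duplicates, gaps, and obvious outlier prints."""
    trades = trades.sort_values(ts_col)

    # Duplicated timestamps are a common ingestion artifact.
    n_dupes = int(trades[ts_col].duplicated().sum())

    # Inter-arrival times far beyond the typical spacing suggest missing data.
    deltas = trades[ts_col].diff().dropna()
    n_gaps = int((deltas > deltas.median() * 10).sum())

    # Returns many standard deviations out are usually bad prints, not alpha.
    rets = trades[px_col].pct_change().dropna()
    n_outliers = int((rets.abs() > 10 * rets.std()).sum())

    print(f"duplicate timestamps: {n_dupes}, "
          f"suspicious gaps: {n_gaps}, outlier prints: {n_outliers}")
```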

One practical interview question we’ve used in the past is to give candidates a modest dataset, deliberately perturb some of it so it isn’t easy to analyse, and see whether they pick that up. Because in a research job, you’re so often handed data that isn’t ready to use.

Why is that the case? Because the data is collected from different sources using different technologies, stored in different formats and in different places, sampled at different frequencies, time-stamped with different precision, and held in different data types: floats versus ints, big numbers versus strings. All of these are problems we have struggled with.
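To make that concrete, here’s a toy sketch of the normalisation step. The two venue schemas are invented for illustration; the point is the pattern of funnelling everything into one canonical record, with one timestamp convention and exact decimals:

```python
from dataclasses import dataclass
from decimal import Decimal

@dataclass(frozen=True)
class Trade:
    ts_ns: int      # one convention everywhere: nanoseconds since epoch, UTC
    venue: str
    symbol: str
    price: Decimal  # exact decimals, never a mix of floats, ints, and strings
    size: Decimal

def from_venue_a(raw: dict) -> Trade:
    # Hypothetical feed: millisecond timestamps, prices quoted as strings.
    return Trade(
        ts_ns=int(raw["ts_ms"]) * 1_000_000,
        venue="venue_a",
        symbol=raw["sym"],
        price=Decimal(raw["px"]),
        size=Decimal(raw["qty"]),
    )

def from_venue_b(raw: dict) -> Trade:
    # Hypothetical feed: float seconds, sizes as integer base units (8 decimals).
    return Trade(
        ts_ns=int(raw["time"] * 1e9),
        venue="venue_b",
        symbol=raw["symbol"],
        price=Decimal(str(raw["price"])),
        size=Decimal(raw["amount"]) / Decimal(10**8),
    )
```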

And that’s a problem when you’re trying to run a trading firm.

Why We Built It Ourselves

As you build a trading business, a capital markets business, this market data is your information. It is the thing your systems primarily act on. Regardless of whether we had a business case for externalising it, we needed to make sure that internally the data was easy to use and “a joy to work with,” so we didn’t embed that half-hour of bullshit into everyone working in the business day-to-day.

When we were at Maven, and in the early days of LO:TECH, we bought market data from a small firm that at the time was one of the best in the space for granularity. The delivery mechanism was okay, and it had been constructed reasonably well. But we were still paying for a service. And if we’re connecting to these venues ourselves for trading purposes, which as a high-frequency trading firm we have to do, we can’t really have the data delivered by someone who has made the connection themselves and then normalised it, because that adds another hop into the process.

So we built it ourselves. And if we take in market data and we have a use case for looking at it at some point in the future from an analysis perspective, we might as well store it properly. We might as well make it easy to analyse. We might as well make that a low-friction process.

In our previous roles we’ve been customers of all the data providers in this space, and all of those learnings fed into this decision. If we’re going to build a data pipeline, let’s make it a good one, so that when a new quant researcher joins, they can go, “Oh my goodness, this is great. I can call data from Binance. I can call it from Uniswap. It all comes in the same shape, the timestamps are beautifully aligned, and any missing pieces of data have been handled. I can start my analysis within five minutes of getting the time series onto my machine or an analysis box.” That was the idea.
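As an illustration of the “beautifully aligned” part, hypothetically and not our actual client API, putting per-venue series onto one clock might look like this:

```python
import pandas as pd

def align_venues(prices: dict[str, pd.Series], freq: str = "1s") -> pd.DataFrame:
    """Resample per-venue price series (each with a DatetimeIndex) onto one
    clock, so cross-venue analysis is a column operation, not a joining job."""
    frame = pd.DataFrame(
        {venue: s.resample(freq).last() for venue, s in prices.items()}
    )
    # Bounded forward-fill: short gaps carry the last print forward,
    # long gaps stay visible as NaN instead of being silently papered over.
    return frame.ffill(limit=5)
```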

The Key Difference

The care piece is important because yes, we are selling this as a client product, and we want it to be good for them. Happy clients renew agreements, and word of mouth brings us more of them.

But the care really comes from the fact that we trade on this data ourselves. The pipeline that feeds our trading systems is the same infrastructure that feeds the storage that goes out to clients. So that pipe has to construct order books and handle timestamps really well: if we mess it up, we introduce bad data not only to our client base but to our own trading systems. We don’t want that.
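For a sense of what “constructs order books” means at its simplest, here’s a toy L2 book fed by delta messages. It sketches the core invariant only; a real feed handler also deals with sequence numbers, snapshot recovery, and checksums:

```python
# Toy L2 book maintained from hypothetical (side, price, size) deltas,
# where size == 0 removes a price level.
class Book:
    def __init__(self) -> None:
        self.bids: dict[float, float] = {}
        self.asks: dict[float, float] = {}

    def apply(self, side: str, price: float, size: float) -> None:
        levels = self.bids if side == "bid" else self.asks
        if size == 0.0:
            levels.pop(price, None)   # level deleted from the book
        else:
            levels[price] = size      # level inserted or updated

    def best_bid(self) -> float | None:
        return max(self.bids) if self.bids else None

    def best_ask(self) -> float | None:
        return min(self.asks) if self.asks else None
```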

That’s what makes our market data different: an extra level of care and attention goes into the way it is stored and handled. We’re a trading firm, and we care about data in a particular way. That might be different from someone who is just aggregating data to sell it. If that’s your business, fine, more power to you. But we take the data in to trade on it, and to do that we need to understand it.

That’s why we care about it so much, and we think that’s important for businesses like ours. A nice by-product is that the service that goes out to clients is good. We’ve also heard independently from clients that they like the fact that we trade on the data, because it gives them implicit trust that it’s decent.

We don’t just collect and normalise data to sell it; we collect it to trade on it.

And that means it has to be good. It has to be clean, structured, and reliable. Because if it isn’t, our trading systems don’t work.

That’s the standard we hold it to.

Want to partner with LO:TECH?

