TSDB to Parquet
I’ve been reading a LOT of articles lately and I think this gave me an intense urge to write something. I’ve decided that writing is one of those skills which I really want to get good at.
I’ve been working on Cortex Metrics, a CNCF project that provides long-term storage for Prometheus. I thought it would be cool to write something about my recent work.
Lately I’ve been working on issues related to converting TSDB blocks into the Parquet file format. I’ll explain TSDB first.
What’s a Series?
If you have worked with Prometheus, you must have come across something like this:
http_requests_total{instance="1",method="GET"}
This is called a series. When you query it in Prometheus, you get back some samples.
Samples are basically pairs of a timestamp and a numeric value:
10:00 -> 5
10:01 -> 7
10:02 -> 9
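As a rough mental model (not how Prometheus represents things internally), a series can be sketched in Python as a set of labels plus a list of samples. The `Series` class and its field names are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Series:
    # Label name -> label value; the metric name lives under "__name__".
    labels: dict[str, str]
    # Samples as (timestamp, value) pairs, kept in time order.
    # Real Prometheus timestamps are milliseconds since the Unix epoch;
    # "HH:MM" strings just keep the example readable.
    samples: list[tuple[str, float]] = field(default_factory=list)


series = Series(
    labels={"__name__": "http_requests_total", "instance": "1", "method": "GET"},
    samples=[("10:00", 5.0), ("10:01", 7.0), ("10:02", 9.0)],
)

for timestamp, value in series.samples:
    print(f"{timestamp} -> {value:g}")
```

The labels identify the series, and the samples are the data points recorded for it over time.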
Converting Series into TSDB
There can be millions and millions of samples. Storing them directly is really expensive, so Prometheus compresses them into chunks.
I won’t go deep into how it’s done. The Prometheus TSDB format docs are a great place to learn more about it.
Series:
__name__ = http_requests_total
tenant = tenant_a
job = api
method = GET
status = 200
Chunks:
Chunk 1 -> samples from 10:00 to 10:30
Chunk 2 -> samples from 10:30 to 11:00
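To make the chunk idea concrete, here is a minimal sketch that groups samples into 30-minute windows like the two chunks above. It only shows the grouping: there is no compression, and the fixed 30-minute window is an assumption taken from this example, not the rule Prometheus actually uses to cut chunks. The helper names are hypothetical.

```python
from collections import defaultdict

WINDOW_MINUTES = 30  # assumption: matches the 30-minute chunks in the example


def to_minutes(hhmm: str) -> int:
    """Convert an "HH:MM" string into minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def split_into_chunks(samples):
    """Group (timestamp, value) samples by the 30-minute window they fall in."""
    chunks = defaultdict(list)
    for timestamp, value in samples:
        window_start = to_minutes(timestamp) // WINDOW_MINUTES * WINDOW_MINUTES
        chunks[window_start].append((timestamp, value))
    # Return the chunks ordered by window start.
    return [chunks[start] for start in sorted(chunks)]


samples = [("10:00", 5), ("10:01", 7), ("10:29", 9), ("10:30", 12), ("10:45", 15)]
chunks = split_into_chunks(samples)
for number, chunk in enumerate(chunks, start=1):
    print(f"Chunk {number} -> {chunk[0][0]} to {chunk[-1][0]}, {len(chunk)} samples")
```

A sample at exactly 10:30 lands in the second chunk here, so each window includes its start and excludes its end.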
A TSDB block stores many such series. At a high level, a block looks like this:
01HXYZ.../
├── meta.json
├── index ← all series + where their chunks are
├── chunks/
│ └── 000001 ← actual compressed samples
└── tombstones
Parquet Format
In a traditional SQL database, the data is stored row by row:
row 1: tenant_a, http_requests_total, api, 200
row 2: tenant_a, http_requests_total, api, 500
row 3: tenant_b, cpu_usage, worker, null
If you want to read only the status column for all rows, you still have to scan every row and skip over the other columns. That’s inefficient.
The Parquet file format is an Apache project. It’s really efficient for querying large datasets because it uses a columnar format.
Conceptually, the Parquet layout looks like this:
column tenant:
tenant_a
tenant_a
tenant_b
column __name__:
http_requests_total
http_requests_total
cpu_usage
column job:
api
api
worker
column status:
200
500
null
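The difference in read cost can be sketched with plain Python lists. This is only an illustration of the two layouts, not how Parquet or a SQL engine reads data: the inner loop stands in for a reader that has to pass over every field of a stored row to reach the one it wants.

```python
# The three rows from the row-by-row example, as (tenant, __name__, job, status).
rows = [
    ("tenant_a", "http_requests_total", "api", 200),
    ("tenant_a", "http_requests_total", "api", 500),
    ("tenant_b", "cpu_usage", "worker", None),
]
column_names = ["tenant", "__name__", "job", "status"]

# Pivot the rows into one list per column.
columns = {name: [row[i] for row in rows] for i, name in enumerate(column_names)}

# Row layout: reading "status" means walking every field of every row.
status_index = column_names.index("status")
fields_touched = 0
status_from_rows = []
for row in rows:
    for i, field_value in enumerate(row):
        fields_touched += 1
        if i == status_index:
            status_from_rows.append(field_value)

# Column layout: the "status" values are already stored together.
status_from_columns = columns["status"]

print(status_from_rows == status_from_columns)
print(fields_touched, "fields touched in row layout")
print(len(status_from_columns), "values touched in column layout")
```

Both layouts give the same status values, but the row layout touches all 12 fields while the column layout touches only 3.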
Conclusion
TSDB series labels
↓
Parquet columns
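Here is a minimal sketch of that mapping in plain Python. It does not write a real Parquet file and it is not the converter used in Cortex. The `labels_to_columns` function is hypothetical and only shows the shape of the transformation: every label name becomes a column, and a series that lacks a label gets a null in that column.

```python
def labels_to_columns(series_labels):
    """Turn a list of label dicts into {label name: [one value per series]}."""
    # Every label name seen on any series becomes a column.
    names = sorted({name for labels in series_labels for name in labels})
    # A series that lacks a label gets None in that column.
    return {name: [labels.get(name) for labels in series_labels] for name in names}


series_labels = [
    {"__name__": "http_requests_total", "tenant": "tenant_a", "job": "api", "status": "200"},
    {"__name__": "http_requests_total", "tenant": "tenant_a", "job": "api", "status": "500"},
    {"__name__": "cpu_usage", "tenant": "tenant_b", "job": "worker"},
]

columns = labels_to_columns(series_labels)
for name, values in columns.items():
    print(name, values)
```

The `cpu_usage` series has no `status` label, so its entry in the `status` column is `None`, just like the `null` in the layout above.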
The use of Parquet in observability was inspired by Shopify’s engineering team. This video is really cool: Deep dive into long term metrics for planet-scale commerce, with Filip Petkovski
Fun fact: ClickHouse also uses a columnar format under the hood :)