Time series data presents unique challenges compared to other types of data. Write throughput is incredibly high, and read throughput is higher still. Deleting large ranges of data on a regular basis makes updating indexes particularly hard.
In this talk I'll cover how the time series data use case differs from normal database use cases. I'll go into detail on the tradeoffs we made in InfluxDB to create a distributed system that is highly available and maximizes scale and throughput.
On the distributed side of things, I'll cover how we handle consistency and coordination in the cluster, and how we built a specialized MapReduce framework for time series data.