Eventually, time will kill your data processing

Key takeaways
  • You will learn the many ways in which time causes problems in data processing.
  • You will learn how to avoid data loss caused by timing issues in data collection.
  • You will learn how to mitigate time-related issues in batch processing pipelines with the aid of workflow orchestration.
  • You will learn the tradeoffs involved in handling timing issues in stream processing.

Race conditions and intermittent failures, daylight saving time, time zones, leap seconds, overload conditions: time is a factor in many of the most annoying problems in computer systems. Data engineering is not exempt from these problems, and it has a slew of unique time-related problems of its own. In this presentation, we will enumerate the time-related problems that we have seen cause trouble in the components of data processing systems, including data collection, batch processing, workflow orchestration, and stream processing. We will also provide a handful of tools and tricks for avoiding timing issues in data processing systems.

Lars Albertsson
