Engineering data quality

Key takeaways
  • We will describe the different ways in which technical systems can cause data quality to deteriorate.
  • You will learn how to test and monitor data quality in production.
  • We will also describe the tradeoffs between data quality, delivery latency, and availability.
  • You will learn architectural patterns and data engineering methods for improving data quality.

Garbage in, garbage out: we have all heard about the importance of data quality. High-quality data is essential for all types of use cases, whether reporting, anomaly detection, or avoiding bias in machine learning applications. But where does high-quality data come from? How can one assess data quality, improve it where necessary, and prevent bad data from slipping in? Obtaining good data quality involves several engineering challenges. In this presentation, we will go through tools and strategies that help us measure, monitor, and improve data quality. We will enumerate the factors in data collection and data processing that can cause data quality issues, and we will show how to use engineering to detect and mitigate them.

Lars Albertsson
