Skip to content


O’Reilly: The state of data quality in 2020

 The state of data quality in 2020

O’Reilly survey highlights the increasing attention organizations are giving to data quality and how AI both exacerbates and alleviates data quality issues.

https://www.oreilly.com/radar/the-state-of-data-quality-in-2020/

“Key survey results:

  • The C-suite is engaged with data quality. CxOs, vice presidents, and directors account for 20% of all survey respondents. Data scientists and analysts, data engineers, and the people who manage them comprise 40% of the audience; developers and their managers, about 22%.
  • Data quality might get worse before it gets better. Comparatively few organizations have created dedicated data quality teams. Just 20% of organizations publish data provenance and data lineage. Most of those who don’t say they have no plans to start.
  • Adopting AI can help data quality. Almost half (48%) of respondents say they use data analysis, machine learning, or AI tools to address data quality issues. Those respondents are more likely to surface and address latent data quality problems. Can AI be a catalyst for improved data quality?
  • Organizations are dealing with multiple, simultaneous data quality issues. They have too many different data sources and too much inconsistent data. They don’t have the resources they need to clean up data quality problems. And that’s just the beginning.
  • The building blocks of data governance are often lacking within organizations. These include the basics, such as metadata creation and management, data provenance, data lineage, and other essentials.

The top-line good news is that people at all levels of the enterprise seem to be alert to the importance of data quality. The top-line bad news is that organizations aren’t doing enough to address their data quality issues. They’re making do with inadequate—or non-existent—controls, tools, and practices. They’re still struggling with the basics: tagging and labeling data, creating (and managing) metadata, managing unstructured data, etc.”

Read more: https://www.oreilly.com/radar/the-state-of-data-quality-in-2020/

Stephen

Posted on: February 17, 2020, 6:50 am Category: Uncategorized

0 Responses

Stay in touch with the conversation, subscribe to the RSS feed for comments on this post.