What is data health and why is it critical to your business?


Every great data insight or data visualization starts with good, clean data. Whether you want to understand lifetime value, design upsell and cross-sell strategies, define personas, or develop sophisticated data models, having clean and consolidated data will enable better analytics, improve the performance of your marketing campaigns and maximize your marketing ROI.

Clean data is particularly crucial for CRM, ERP, sales and IT systems that hold customer data. For example, planning and cleansing your customer data from the beginning will keep you from falling behind on your CRM implementation. Your data needs to be reviewed, filtered and cleaned to ensure that bogus data is not transferred. The cost of processing errors to the business can be measured in the time spent on manual troubleshooting and forced ETL re-runs and, at worst, in incorrect or invalid data being presented to customers or employees who rely on it for business decisions.

  • How do you ensure your data is not wrong or incomplete when you ingest data from various third-party sources, especially sources like FTP and AWS S3 which (unlike an API) do not always deliver data in a fixed structure?

  • How do you successfully migrate data from an old system to a new one?

It is safe to say that the majority of data flows have a set of expected data types defined, and very often an expected value range as well.

One option is to use SQL or Python transformations, but such a hard-coded approach can be very time-consuming and lacks the flexibility and simplicity needed for reuse. Additionally, it would not be obvious which rows and columns contain rogue values until the transformations run into an error (or you would have to design a specific workflow to off-load them).

Another option is to describe the data and set up value and type conditions for it in the form of rules. Once that’s done, all you need to do is make sure your data flows apply these rule checks every time you run the orchestration (ETL process). The KBC Data Health App has been designed to help you automate this data check process.

Typical use cases:

  • Ensuring data quality from systems with data collected by users (internal IT systems, CRM, user forms, etc.)

  • Migrating data from legacy systems - data migration assumptions vs. reality check

  • Validating crucial fields for report buildup

KBC Data Health Application

The Data Health Application is an app designed to help users produce a clean data file. To boost productivity, it gives users a simple and convenient way to cleanse or filter data instead of writing multiple long queries in transformations to obtain the same results. Some primary features include:

  • Filtering data based on user-configured rules to match business needs

  • Running on a schedule

  • Generating a report with descriptions and reasons why rows were rejected

Because many users have no prior knowledge of SQL, the application provides basic SQL functionality through simple user interface inputs. The application does not ship with any pre-configured rules, so users are free to create rules tailored to their needs. Combined with KBC orchestration, the application can be triggered on a daily or weekly basis, depending on the user’s business requirements. The result is an automated process that generates “clean” data for in-depth analysis, without the worry of handling corrupted data or outliers.

Supported Rules:

  1. List Comparison

  2. Digit Count

  3. Numeric Comparison (Value comparison)

  4. Regular Expression (Regex)

  5. Column Type (Applicable value type)
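
To make the five rule types more concrete, here is a minimal Python sketch of how each could be expressed as a simple check on a single value. This is only an illustration and not the application's actual implementation: the function names are hypothetical, and the exact meaning of each rule (for example, that Digit Count counts the digit characters in a value) is an assumption.

```python
import operator
import re

# Hypothetical rule builders. Names and semantics are assumptions made for
# illustration; they are not the Data Health App's real API.

def in_list(allowed):
    """List Comparison: the value must be one of the allowed values."""
    allowed = set(allowed)
    return lambda value: value in allowed

def digit_count(n):
    """Digit Count: the value must contain exactly n digit characters."""
    return lambda value: sum(ch.isdigit() for ch in str(value)) == n

def numeric_compare(op, threshold):
    """Numeric Comparison: the value, read as a number, must satisfy op."""
    ops = {"<": operator.lt, "<=": operator.le, "==": operator.eq,
           ">=": operator.ge, ">": operator.gt}
    compare = ops[op]
    def check(value):
        try:
            return compare(float(value), threshold)
        except (TypeError, ValueError):
            return False  # non-numeric values fail the rule
    return check

def matches_regex(pattern):
    """Regular Expression: the whole value must match the pattern."""
    compiled = re.compile(pattern)
    return lambda value: compiled.fullmatch(str(value)) is not None

def has_type(caster):
    """Column Type: the value must be convertible with caster (int, float...)."""
    def check(value):
        try:
            caster(value)
            return True
        except (TypeError, ValueError):
            return False
    return check
```

Each builder returns a function that takes one value and returns True or False, so a set of rules for a table can be kept as plain data (column name plus check) and reused across data flows.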


Input Table:

[Screenshot: input table]


  • User wants anything within the “Western Europe” Region

  • User is only interested in countries placed within the top 10 happiness rank

  • Happiness score cannot be empty
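
The three requirements above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not how the application works internally: the column names (Region, Happiness Rank, Happiness Score), the helper apply_rules, and the sample rows are all made up for the example and are not taken from the screenshots.

```python
# Made-up sample rows for illustration only; values are not real data.
rows = [
    {"Country": "Country A", "Region": "Western Europe",
     "Happiness Rank": "1", "Happiness Score": "7.5"},
    {"Country": "Country B", "Region": "Western Europe",
     "Happiness Rank": "12", "Happiness Score": "7.0"},
    {"Country": "Country C", "Region": "North America",
     "Happiness Rank": "5", "Happiness Score": "7.4"},
    {"Country": "Country D", "Region": "Western Europe",
     "Happiness Rank": "3", "Happiness Score": ""},
]

# Each rule: (column, check, reason reported when the check fails).
RULES = [
    ("Region", lambda v: v == "Western Europe",
     "Region is not Western Europe"),
    ("Happiness Rank", lambda v: v.isdigit() and 1 <= int(v) <= 10,
     "Happiness Rank is not within the top 10"),
    ("Happiness Score", lambda v: v.strip() != "",
     "Happiness Score is empty"),
]

def apply_rules(rows, rules):
    """Split rows into clean rows and rejected rows with rejection reasons."""
    clean, rejected = [], []
    for row in rows:
        reasons = [reason for column, check, reason in rules
                   if not check(row.get(column, ""))]
        if reasons:
            rejected.append({**row, "reasons": reasons})
        else:
            clean.append(row)
    return clean, rejected

clean, rejected = apply_rules(rows, RULES)
```

With these sample rows, only Country A ends up in clean. The other three rows land in rejected, each carrying the reason it failed, which mirrors the idea of a report that explains why rows were rejected.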

Output table:

[Screenshot: output table]

If you’re already a KBC user, you can find the Data Health app alongside the rest of our data applications. Not yet a user and want to learn more? Contact us to discuss.