Workshop: Bulletproof Data Validation in Python

Bad data can easily yield wrong results even when the code is well written. To be reliable, a solution must therefore catch bad data before it reaches the solution engine.

The best place to catch and handle bad data is right in the input schema. To achieve that, beyond declaring primary keys, we need to establish relationships between tables via foreign key constraints, precisely define the data type of every column, and define additional data predicates where appropriate.

The key idea is to be as precise as you can about what you expect the data to look like, rather than guessing what can go wrong with the data and adding a catch for each exception.
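To make the four kinds of checks mentioned above concrete (primary keys, foreign keys, data types, and row predicates), here is a minimal sketch in plain Python. It is a hand-rolled illustration using only the standard library, not the ticdat API, and the tables, column layout, and helper names are all hypothetical. The point is only that each expectation is stated once, declaratively, and then applied to the data.

```python
from collections import Counter
from numbers import Real

# Hypothetical sample data with deliberate mistakes.
# foods columns: (name, cost); the primary key is name.
foods_rows = [
    ("apple", 0.5),
    ("bread", -1.0),     # negative cost
    ("milk", "cheap"),   # non-numeric cost
    ("apple", 0.7),      # duplicate primary key
]
# nutrition columns: (food, category, min_qty, max_qty).
nutrition_rows = [
    ("apple", "calories", 10, 50),
    ("pizza", "calories", 5, 20),    # "pizza" is not a known food
    ("bread", "calories", 80, 30),   # min_qty exceeds max_qty
]


def find_duplicates(rows, key_len):
    """Return primary-key tuples that appear more than once."""
    counts = Counter(row[:key_len] for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)


def is_nonnegative_number(value):
    """Data type rule: a real number (bool excluded) that is >= 0."""
    return isinstance(value, Real) and not isinstance(value, bool) and value >= 0


def find_data_type_failures(rows, column, check):
    """Return rows whose value in `column` violates the data type rule."""
    return [row for row in rows if not check(row[column])]


def find_foreign_key_failures(child_rows, child_col, parent_rows, parent_col=0):
    """Return child rows that reference a key missing from the parent table."""
    parent_keys = {row[parent_col] for row in parent_rows}
    return [row for row in child_rows if row[child_col] not in parent_keys]


def find_row_predicate_failures(rows, predicate):
    """Return rows for which a whole-row predicate is False."""
    return [row for row in rows if not predicate(row)]


duplicate_foods = find_duplicates(foods_rows, key_len=1)
bad_costs = find_data_type_failures(foods_rows, 1, is_nonnegative_number)
orphans = find_foreign_key_failures(nutrition_rows, 0, foods_rows)
bad_ranges = find_row_predicate_failures(nutrition_rows, lambda r: r[2] <= r[3])
```

Each helper returns the offending rows instead of raising on the first problem, so all data issues can be reported together before any solution engine runs.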

See our LinkedIn post for an example of how we would prevent some of the 10+ data mistakes in the tiny data set below.

In this workshop, you will learn how to implement bulletproof data integrity checks in Python.

We will use a package called ticdat, which is easy to use and has been around for quite some time, yet remains highly underutilized by the data science and operations research communities.

Agenda:

  • 5 min: Brief introductions (name, country, and what you do);

  • 5 min: Additional motivation to implement data validation;

  • 25 min: Defining and implementing data schemas with data validation using ticdat;

  • 15 min: Embedding schemas into solution engines;

  • 10 min: Questions and exchange of experiences.

The participants are welcome to remain on the call after the workshop to network or to ask additional questions.

More details:

  • Free of cost;

  • At most 20 participants per session;

  • Virtual (on Zoom) and the call will NOT be recorded;

  • Led by Aster Santana with support from the Mip Wise team;

  • The material used during the workshop (slides and sample scripts) will be shared with all attendees after the session.

Sessions:

Session #1: Mar 7, 2023, Noon - 1:00 pm ET (UTC -5)

Session #2: Mar 9, 2023, 8:00 am - 9:00 am ET (UTC -5)

Registration:

Please fill out the form below if you would like to attend the workshop.

By submitting the form below, you acknowledge that the number of spots available in each session is limited and that registering does not guarantee you a spot.