Training, Validation & Test Data in Machine Learning

Why It Matters for Model Quality

In machine learning, model quality doesn’t start with algorithms — it starts with how data is structured and used across the lifecycle.

Datasets

A fundamental concept in the ML workflow is the use of three distinct datasets:

  • Training dataset — used to train the model
  • Validation dataset — used to evaluate and tune the model
  • Test dataset — used to assess final model performance
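The three roles can be illustrated with a toy workflow (a minimal sketch — the data and the threshold "model" are hypothetical stand-ins for real training):

```python
import random

random.seed(0)
# Hypothetical toy data: (feature, label) pairs with label = feature > 0.5
data = [(x := random.random(), int(x > 0.5)) for _ in range(300)]
train, val, test = data[:180], data[180:240], data[240:]

def accuracy(threshold, rows):
    return sum((x > threshold) == bool(y) for x, y in rows) / len(rows)

# Training/tuning: choose the threshold that scores best on the validation set
candidates = [0.3, 0.4, 0.5, 0.6, 0.7]
best = max(candidates, key=lambda t: accuracy(t, val))

# Final assessment: the test set is touched exactly once, at the very end
print(f"chosen threshold: {best}, test accuracy: {accuracy(best, test):.2f}")
```

Note that the test rows play no part in choosing the threshold — only in the final report.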

This separation is critical: the test dataset must remain independent and must not be used during training or tuning.

Why?

Because it provides a true reflection of model quality, free from bias introduced during development.

How Is Data Typically Split?

When sufficient data is available, datasets are commonly split using ratios (Training : Validation : Test) such as:

  • 60:20:20
  • 80:10:10
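One common way to produce such a split is to shuffle the data and slice it by the chosen fractions (a minimal pure-Python sketch; the function name and default fractions are illustrative):

```python
import random

def three_way_split(examples, train_frac=0.6, val_frac=0.2, seed=42):
    """Shuffle and split examples into training, validation and test sets.

    The remainder after train_frac and val_frac becomes the test set.
    """
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # set aside; never touched during tuning
    return train, val, test

train, val, test = three_way_split(list(range(100)))
print(len(train), len(val), len(test))  # 60 20 20
```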

These splits are usually done randomly, unless:

  • The dataset is small
  • There is a risk of non-representative samples

Real-World Constraints Change the Approach

In practice, data is rarely unlimited. When data is constrained:

  • Training and validation datasets are often derived from a single combined dataset
  • The test dataset remains separate to preserve objectivity

A Practical Solution for Limited Data

  • Training and validation datasets are combined.
  • Multiple split combinations are created (e.g. 80% training / 20% validation).
  • Models are trained and tuned across these combinations.
  • Final performance is calculated as an average across runs.
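The steps above can be sketched as repeated random splits of a combined training/validation pool, with scores averaged across runs (a minimal illustration — the threshold "training" step is a hypothetical stand-in for a real model):

```python
import random
from statistics import mean

random.seed(1)
# Hypothetical pool: training + validation data combined (test set held elsewhere)
pool = [(x := random.random(), int(x > 0.5)) for _ in range(200)]

def accuracy(threshold, rows):
    return sum((x > threshold) == bool(y) for x, y in rows) / len(rows)

def train_and_score(train_rows, val_rows):
    # "Training": pick the threshold that fits the training rows best,
    # then report how well it generalises to the held-out validation rows
    best = max([i / 10 for i in range(1, 10)], key=lambda t: accuracy(t, train_rows))
    return accuracy(best, val_rows)

scores = []
for run in range(5):  # several 80/20 split combinations
    shuffled = pool[:]
    random.shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    scores.append(train_and_score(shuffled[:cut], shuffled[cut:]))

print(f"mean validation accuracy over {len(scores)} runs: {mean(scores):.3f}")
```

Averaging over several splits reduces the chance that one unlucky split dominates the reported result.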

This approach improves model reliability and robustness, even with limited data.

What This Means for AI Testing & Quality

For testing professionals, this isn’t just a data science concept — it’s a quality control mechanism.

How data is split directly impacts:

  • Model accuracy
  • Reliability of results
  • Confidence in testing outcomes
  • Real-world performance

Poor dataset separation = misleading test results.
Strong dataset discipline = trustworthy AI systems.

The COEQ Perspective

At COEQ, we believe that in AI systems, data is the test environment. If your datasets are not structured correctly, you are not truly testing the model — you are validating assumptions.