picostitch
crafting (and) JavaScript

ML for VRT - Part 2: Learning Keras

In part 1 I started my naive investigation into how to apply machine learning to make visual regression tests (VRT) better. I described the problem to solve, explored Keras very superficially, and also touched on the complexity of doing ML myself, as opposed to having colleagues who are experts and who throw around phrases like "train a model" and "predict".
Oh boy, did I underestimate this.

Keras - A Deep Learning API

The above paragraph is gibberish? Let's take a step back again.

Since Kamal had pointed me to Keras, I go with the flow: I trust his expertise and start reading up on what it is.

Keras is a deep learning API written in Python, running on top of the machine learning platform TensorFlow. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result as fast as possible is key to doing good research.

Sounds like what I need. And if my VRT runs on the server I am fine with Python, which is a great language! Though I had to ask Kamal: what about JavaScript? I know there is TensorFlow for JS, and he said I should read this, but from all I hear and have learned, Python seems to be the go-to language. So I stick to it, since I also want to learn fast. So I started digging out my rusty Python knowledge :).

Next I read Introduction to Keras for Engineers. The first important thing I learned is what the guide will teach me, which sounds like the steps I need:

  1. Prepare your data before training a model
  2. Do data preprocessing
  3. Build a model that turns your data into useful predictions
  4. Train your model
  5. Evaluate your model on test data
  6. Customize
  7. Speed up training by leveraging multiple GPUs.
  8. Refine your model

I guess I have to start taking some screenshots to do step 1, "Prepare data".

Preprocessing

The next step is preprocessing data, the guide says:

In general, you should seek to do data preprocessing as part of your model as much as possible, not via an external data preprocessing pipeline.

On the other hand, this might mean a lot of data. Imagine every image has a million pixels; won't that be slow as hell? So I asked Kamal again, since that was not clear from the guide:

Me: How do I preprocess my screenshots?
Kamal: Keras preprocessing will do that for you.
Me: I expect images to have many pixels and also varying sizes, do I have to preprocess those?
Kamal: The library takes care of it.

The answer came later in the guide too:

In Keras, you do in-model data preprocessing via preprocessing layers
[..]
The key advantage of using Keras preprocessing layers is that they can be included directly into your model, either during training or after training, which makes your models portable.

Makes sense to me, but it still feels like it will be computation-intensive. Let's see. The guide then lists some code that looks readable, but what's under the hood is magic to me. Let me get through the process first; eventually it will reveal its magic, I have learned that. The alternative would be to go deep into the science behind it, but then I would not get done in the next two years ;).
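To put a rough number on the "million pixels" worry, here is a back-of-the-envelope calculation. The 1000x1000 RGB screenshot, the 4 bytes per value (32-bit floats) and the batch of 32 are all my own assumptions for illustration, not figures from the guide.

```python
# Rough memory estimate for screenshots held as dense float arrays.
# All sizes below are assumptions for illustration only.

def image_bytes(width, height, channels=3, bytes_per_value=4):
    """Bytes needed for one image stored as a dense array."""
    return width * height * channels * bytes_per_value

one_image = image_bytes(1000, 1000)  # a "million pixel" RGB screenshot
batch = 32 * one_image               # a hypothetical batch of 32 screenshots

print(f"one image: {one_image / 1e6:.0f} MB")
print(f"batch of 32: {batch / 1e6:.0f} MB")
```

Under these assumptions a single screenshot is about 12 MB and a batch of 32 is about 384 MB, so it is not crazy to wonder about speed. Cropping or downscaling early in the pipeline shrinks those numbers quickly.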

Building Models

This is just step three of the eight steps listed above.

A "layer" is a simple input-output transformation (such as the scaling & center-cropping transformations above).
[..]
You can think of a model as a "bigger layer" that encompasses multiple sublayers and that can be trained via exposure to data.
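To make the "layer" and "bigger layer" idea tangible for myself, here is a minimal sketch in plain Python. It is not Keras and not how Keras implements anything; the names `rescaling`, `center_crop` and `sequential` are hypothetical helpers I made up, and the sketch leaves out the "trained via exposure to data" part entirely.

```python
# Plain-Python sketch: a "layer" is a function from input to output,
# and a "model" chains several layers. This is NOT Keras code.

def rescaling(max_value):
    """Layer that divides every pixel by `max_value` (e.g. 255 -> range 0..1)."""
    def layer(image):
        return [[pixel / max_value for pixel in row] for row in image]
    return layer

def center_crop(height, width):
    """Layer that keeps the central `height` x `width` region of the image."""
    def layer(image):
        top = (len(image) - height) // 2
        left = (len(image[0]) - width) // 2
        return [row[left:left + width] for row in image[top:top + height]]
    return layer

def sequential(*layers):
    """A "bigger layer": applies the given layers one after another."""
    def model(image):
        for layer in layers:
            image = layer(image)
        return image
    return model

# A tiny 4x4 grayscale "screenshot" with pixel values 0..255.
screenshot = [
    [0, 0, 0, 0],
    [0, 255, 51, 0],
    [0, 102, 204, 0],
    [0, 0, 0, 0],
]

preprocess = sequential(center_crop(2, 2), rescaling(255))
result = preprocess(screenshot)
print(result)
```

The composed `preprocess` is itself just another function from image to image, which is how I read the "bigger layer" sentence: callers hand in the raw screenshot and never need to know which steps run inside.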

Sounds like Docker, hehe. Next, some code that I understand: