Learning What I Don't Know In ML.NET

ML.NET was first released in 2018 so that .NET developers could easily do machine learning. Mostly out of excitement, I ran through the samples and even tried to apply it to a system I was heavily iterating on. Sadly, as much as I was impressed with what the samples could do, I simply didn't have enough data to apply it to anything I was working on.

Apart from the odd virtual conference I've watched since, machine learning mostly faded away for me. That was until I stumbled across a YouTuber called Code Bullet. His language and tone won't be for everyone, but for me his funny way of trying to solve problems got me excited again. (I also like how he admits 'things' are hard, but that is one for another time.)

The Goal...

So my goal is to understand enough to do 'basic' machine learning, or at least to know how to implement other people's models in .NET.

One of the classic simple machine learning problems is to take images of drawn numbers and train a model to output which number (0-9) it is. I chose to try and do this in ML.NET as this fits better with my .NET background and looks to be the current main .NET implementation.

The MNIST Database

So first off we need a lot of data. Luckily, as this is the classic problem, there is a free data set called The MNIST Database of handwritten digits. It's a set of images to train your model and another set to test it.

They have used their own file format to make it a small download which is also a great format for inputting into a training model. I however want to work with real images, as for me this is closer to a real project.

So the first thing is to extract the images with a horrendous bit of code, which can be found in this gist. This outputs each image to its own file, with the value of the image in the filename.
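For anyone curious what that custom file format looks like, here is a rough sketch of reading it. This is not the code from the gist. It assumes the files have already been decompressed (they are distributed gzipped), that header fields are big-endian 32-bit integers, and that the magic numbers are 2051 for image files and 2049 for label files. The class and method names (IdxReader, ReadImages, ReadLabels) are made up for illustration.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;

// Hypothetical helper for illustration only; not the code from the gist.
public static class IdxReader
{
    // Header fields are stored as big-endian 32-bit integers.
    private static int ReadInt32BigEndian(BinaryReader reader)
    {
        byte[] bytes = reader.ReadBytes(4);
        if (bytes.Length != 4)
            throw new EndOfStreamException("Truncated header.");
        return BinaryPrimitives.ReadInt32BigEndian(bytes);
    }

    // Returns one byte array per image: rows * cols grayscale pixels (0-255).
    public static byte[][] ReadImages(Stream stream, out int rows, out int cols)
    {
        using var reader = new BinaryReader(stream);

        int magic = ReadInt32BigEndian(reader);
        if (magic != 2051)
            throw new InvalidDataException($"Unexpected image magic number {magic}.");

        int count = ReadInt32BigEndian(reader);
        rows = ReadInt32BigEndian(reader);
        cols = ReadInt32BigEndian(reader);

        var images = new byte[count][];
        for (int i = 0; i < count; i++)
        {
            images[i] = reader.ReadBytes(rows * cols);
            if (images[i].Length != rows * cols)
                throw new EndOfStreamException($"Image {i} is truncated.");
        }
        return images;
    }

    // Returns one label byte (0-9) per image, in the same order as the image file.
    public static byte[] ReadLabels(Stream stream)
    {
        using var reader = new BinaryReader(stream);

        int magic = ReadInt32BigEndian(reader);
        if (magic != 2049)
            throw new InvalidDataException($"Unexpected label magic number {magic}.");

        int count = ReadInt32BigEndian(reader);
        byte[] labels = reader.ReadBytes(count);
        if (labels.Length != count)
            throw new EndOfStreamException("Label data is truncated.");
        return labels;
    }
}
```

Pairing images[i] with labels[i] gives everything needed to write each digit out as its own image file with the label in the filename.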

Training

So the training program is going to take the input data and then ML.NET is going to train and output a model.

The first step here is to define a container and set up the input data.

public class ImageData
{
    public string Label { get; set; }
    public UInt32 LabelAsKey { get; set; }
    public string Path { get; set; }
    public string Filename { get; set; }
    public byte[] Image { get; set; }
}

Then the following to grab all the images.

var files = Directory.GetFiles(@"C:\Users\Jonathandent\Desktop\LearningML\ExampleData\");
var data = files.Select(x => new ImageData
{
    Path = x,
    Label = ProcessFilename(x),
    Filename = Path.GetFileNameWithoutExtension(x)
});

It uses a rough little helper function just to strip the filename down to its value.

public static string ProcessFilename(string path)
{
    var filename = Path.GetFileNameWithoutExtension(path);
    var end = filename.Replace("image.", "");
    var single = end.First();
    return single.ToString();
} 
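If the folder might contain stray files, a slightly more defensive variant could look like the sketch below. It assumes filenames start with "image." followed by the digit, which is what the helper above implies; the exact naming comes from the extraction step, so the example name "image.7_00012.bmp" is purely hypothetical, as are the LabelParser and TryGetLabel names. Unlike the original, it only accepts the prefix at the start of the name.

```csharp
using System;
using System.IO;

// Hypothetical defensive variant of ProcessFilename; names are illustrative.
public static class LabelParser
{
    // Returns true and sets label to a single digit ("0".."9") when the
    // filename looks like "image.<digit>...", otherwise returns false.
    public static bool TryGetLabel(string path, out string label)
    {
        label = null;
        const string prefix = "image.";

        var filename = Path.GetFileNameWithoutExtension(path);
        if (filename == null
            || !filename.StartsWith(prefix, StringComparison.Ordinal)
            || filename.Length == prefix.Length)
        {
            return false;
        }

        // The character straight after the prefix should be the digit.
        char first = filename[prefix.Length];
        if (first < '0' || first > '9')
        {
            return false;
        }

        label = first.ToString();
        return true;
    }
}
```

Files that fail the check can then be skipped with a Where clause before building the ImageData list, rather than ending up as a bogus label in the training data.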

Now we need to start setting up the ML.NET side of things.

We need to grab these package references and add them to our project file.

<PackageReference Include="Microsoft.ML" Version="1.5.2" />
<PackageReference Include="Microsoft.ML.ImageAnalytics" Version="1.5.2" />
<PackageReference Include="Microsoft.ML.Vision" Version="1.5.2" />
<PackageReference Include="SciSharp.TensorFlow.Redist" Version="2.3.1" />

Next we create an ML context and supply it with our image data.

var mlContext = new MLContext();

var trainingData = mlContext.Data.LoadFromEnumerable<ImageData>(data);

From reading up here, it's a good idea to shuffle the data so that training doesn't work through all the ones, then all the twos, and so on, which can skew the results.

As ML.NET works using a pipeline system, we end up setting up a lot of options which we will plug in later.

var shuffledData = mlContext.Data.ShuffleRows(trainingData);

After this we need to set up the label of the image and also set up reading in the bytes of the images.

var preprocessingPipeline = mlContext.Transforms.Conversion.MapValueToKey(
                    inputColumnName: "Label",
                    outputColumnName: "LabelAsKey")
                .Append(mlContext.Transforms.LoadRawImageBytes(
                    outputColumnName: "Image",
                    imageFolder: @"C:\Users\Jonathandent\Desktop\LearningML\ExampleData\",
                    inputColumnName: "Path"));

Both of these steps make up our pre-processed data.

var preProcessedData = preprocessingPipeline.Fit(shuffledData).Transform(shuffledData);

Next we split this data into batches, including a batch that ML.NET can use to validate how good the model is.

var trainSplit = mlContext.Data.TrainTestSplit(data: preProcessedData, testFraction: 0.3);
var validationTestSplit = mlContext.Data.TrainTestSplit(trainSplit.TestSet);
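To make the proportions concrete: the first split holds back 30% of the rows, and the second call doesn't pass a testFraction, so it uses TrainTestSplit's default of 0.1. A small sketch of the arithmetic (the SplitMath and Shares names are made up for illustration, and the real splits are only approximately these sizes):

```csharp
using System;

// Hypothetical helper showing the expected share of rows after two chained splits.
public static class SplitMath
{
    public static (double Train, double Validation, double Test) Shares(
        double firstTestFraction, double secondTestFraction)
    {
        // Rows kept for training by the first split.
        double train = 1.0 - firstTestFraction;

        // The held-back rows are split again: the larger part is used for
        // validation, the smaller part is the final test set.
        double validation = firstTestFraction * (1.0 - secondTestFraction);
        double test = firstTestFraction * secondTestFraction;

        return (train, validation, test);
    }
}
```

Calling SplitMath.Shares(0.3, 0.1) gives roughly 70% for training, 27% for validation and 3% for the final test set.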

Now we can pass in a few options and train the model.

var trainSet = trainSplit.TrainSet;
var validationSet = validationTestSplit.TrainSet;
var testSet = validationTestSplit.TestSet;

var classifierOptions = new ImageClassificationTrainer.Options()
{
    FeatureColumnName = "Image",
    LabelColumnName = "LabelAsKey",
    ValidationSet = validationSet,
    Arch = ImageClassificationTrainer.Architecture.ResnetV2101,
    MetricsCallback = (metrics) => Console.WriteLine(metrics),
    TestOnTrainSet = false,
    ReuseTrainSetBottleneckCachedValues = true,
    ReuseValidationSetBottleneckCachedValues = true
};

var trainingPipeline = mlContext.MulticlassClassification.Trainers.ImageClassification(classifierOptions)
                .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel"));

var trainedModel = trainingPipeline.Fit(trainSet);

Last, and very important as training can take quite a while (hours...), we save our model.

mlContext.Model.Save(trainedModel, trainingData.Schema, @"C:\Users\Jonathandent\Desktop\LearningML\model.zip");

Once this program has run we will get a nice zip file with our model.

Using the model

One of the more important things I learnt here is that how you pre-process your data when training will affect the steps you need to take to use the model. You need to match the pre-processing steps so the inputs into the model are the same.

Using the model is pretty straightforward.

var mlContext = new MLContext();

// Load trained model
var trainedModel = mlContext.Model.Load(@"C:\Users\Jonathandent\Desktop\LearningML\model.zip", out var modelSchema);

var currentFile = @"C:\Users\Jonathandent\Desktop\LearningML\testImage.bmp";

//Load the image information
var image = new ImageData
{
	Path = currentFile,
	Filename = System.IO.Path.GetFileNameWithoutExtension(currentFile)
};

var data = mlContext.Data.LoadFromEnumerable<ImageData>(new List<ImageData> { image });

//Load the bytes and pre-process the same way
var preprocessingPipeline = mlContext.Transforms.LoadRawImageBytes(
	outputColumnName: "Image",
	imageFolder: @"C:\Users\Jonathandent\Desktop\LearningML\ExampleData\",
	inputColumnName: "Path");

var preProcessedData = preprocessingPipeline.Fit(data).Transform(data);

var predictionData = trainedModel.Transform(preProcessedData);

//Get a prediction..
var predictions = mlContext.Data.CreateEnumerable<ImageRowOutput>(predictionData, false);

var outputLabel = "";
foreach(var item in predictions)
{
	outputLabel = item.PredictedLabel;
	break;
}
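The ImageRowOutput type isn't shown above. A minimal sketch of what it could look like is below, assuming all it needs is a string property whose name matches the "PredictedLabel" column produced by the MapKeyToValue step. The PredictionHelper class and its FirstLabelOrEmpty method are made-up names, showing a LINQ alternative to the foreach/break loop.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shape of the output row: the property name has to match the
// "PredictedLabel" column so the column can be mapped onto it.
public class ImageRowOutput
{
    public string PredictedLabel { get; set; }
}

// Hypothetical helper; equivalent to the foreach/break loop above.
public static class PredictionHelper
{
    // Returns the first predicted label, or an empty string if there are no rows.
    public static string FirstLabelOrEmpty(IEnumerable<ImageRowOutput> predictions)
    {
        return predictions.Select(p => p.PredictedLabel).FirstOrDefault() ?? "";
    }
}
```

With this in place the loop could become a single line: var outputLabel = PredictionHelper.FirstLabelOrEmpty(predictions);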

With a bit of luck the outputLabel should match the number drawn in the image.

I wrapped this code in a little rough WPF application to make it easier to test lots of images.

As much as the above works really well, at this point I don't really have an understanding of what it's doing to create this model.

One for another day.