Preprocessing and Hyper-parameter Tuning
In this notebook we'll discuss our full model pipeline:

- sampling
- preprocessing
- transformation/augmentation
- balancing/oversampling
We'll also cover the hyper-parameter tuning approach we took to select the optimal transformations and parameters for our model.
Data Sampling
We've split the dataset into training and test samples, originally aiming for a test sample of around 15% of the data. The actual sample ratios used in our model training:

- Test: 20% of the full dataset
- Training: 80%
During hyper-parameter tuning we've further split the training dataset (a minimal sketch of the full split follows this list):

- Validation: 20% of the training samples (i.e. 16% of the full dataset)
- Remaining images used for training: 64% of the full dataset
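A minimal sketch of how such a split could be produced, assuming a pandas DataFrame `df` with one row per image (the variable names and random seed are illustrative):

```python
from sklearn.model_selection import train_test_split

# 80/20 split of the full dataset into training and test samples.
train_df, test_df = train_test_split(df, test_size=0.20, random_state=42)

# During hyper-parameter tuning, hold out 20% of the training split for
# validation (16% of the full dataset), leaving 64% of it for training.
train_df, val_df = train_test_split(train_df, test_size=0.20, random_state=42)
```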
Full Pipeline and Model:
Preprocessing and balancing:
- Validation/test samples are normalized and resized (a minimal sketch follows this list)
- Additionally, a variable number of transformations are applied to the training samples
- Optionally, augmentation-based oversampling can be used, based on a selected binned continuous variable (age bins in our case); this generates additional transformed duplicate images
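To illustrate the first point, the deterministic validation/test preprocessing could look roughly like this; the target size and normalization statistics are assumptions (ImageNet defaults), not necessarily the exact values we used:

```python
import torchvision.transforms as T

# Deterministic preprocessing for validation/test images: resize + normalize only.
# Image size and normalization statistics below are illustrative assumptions.
eval_transform = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
```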
Model:
Base Model (MobileNetV3 Small):
- Initial Conv2d layer
- Series of Inverted Residual Blocks (specific to MobileNetV3 architecture)
- Final Conv2d layer
- (The classifier part of the original MobileNetV3 was removed)
Global Pooling Layer:

- AdaptiveAvgPool2d(1) (reduces the model output shape from (batch_size, 576, height, width) to (batch_size, 576, 1, 1))

Two Custom Heads:
Gender Classifier:
- Dropout layer
- Linear layer: (576 -> 2)
- Input features: 576 (MobileNetV3 Small's last_channel)
- Output features: 2 (for binary gender classification)
Age Regressor:
- Dropout layer
- Linear layer: (576 -> 1)
- Input features: 576
- Output features: 1 (for age regression)
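Putting these pieces together, a minimal sketch of the architecture in PyTorch (the dropout probability and pretrained weights are assumptions; the layer shapes follow the description above):

```python
import torch.nn as nn
from torchvision.models import mobilenet_v3_small

class AgeGenderNet(nn.Module):
    """MobileNetV3-Small backbone with a gender-classification head and an
    age-regression head. Dropout probability is an illustrative assumption."""

    def __init__(self, dropout: float = 0.2):
        super().__init__()
        # Keep only the convolutional feature extractor (drops the original
        # classifier); it outputs tensors of shape (batch, 576, H, W).
        self.backbone = mobilenet_v3_small(weights="DEFAULT").features
        self.pool = nn.AdaptiveAvgPool2d(1)  # -> (batch, 576, 1, 1)
        self.gender_head = nn.Sequential(nn.Dropout(dropout), nn.Linear(576, 2))
        self.age_head = nn.Sequential(nn.Dropout(dropout), nn.Linear(576, 1))

    def forward(self, x):
        feats = self.pool(self.backbone(x)).flatten(1)  # (batch, 576)
        return self.gender_head(feats), self.age_head(feats)
```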
Visualizing Torchvision Transforms
As part of our hyper-parameter tuning process we've selected the optimal transformations to use with our model.
Each enabled transformation is applied sequentially to each image:
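A minimal sketch of how such a visualization could be produced, assuming `img` is a PIL image loaded from the dataset (the transform choices and parameters here are illustrative):

```python
import matplotlib.pyplot as plt
import torchvision.transforms as T

# Illustrative chain of enabled transforms; parameters are assumptions.
steps = [
    ("color_jitter", T.ColorJitter(brightness=0.3, contrast=0.3)),
    ("gaussian_blur", T.GaussianBlur(kernel_size=5)),
    ("rotation", T.RandomRotation(degrees=15)),
    ("perspective", T.RandomPerspective(p=1.0)),
]

fig, axes = plt.subplots(1, len(steps) + 1, figsize=(3 * (len(steps) + 1), 3))
axes[0].imshow(img)  # `img`: a PIL image loaded from the dataset (assumption)
axes[0].set_title("original")
current = img
for ax, (name, tf) in zip(axes[1:], steps):
    current = tf(current)  # apply transforms cumulatively, in order
    ax.imshow(current)
    ax.set_title(name)
for ax in axes:
    ax.axis("off")
plt.show()
```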
Selecting the Best Transforms:
(For val_age_mae, red is good, i.e. we want to minimize it; for val_gender_acc, red is bad, since we want to maximize accuracy.)
We can see that there is significant variance in the effect of each transform depending on whether we're classifying gender or estimating age.
We have chosen to exclude the transforms that have a significant negative effect:
include_center_crop: True
include_color_jitter: True
include_gaussian_blur: True
include_gaussian_noise: False
include_random_affine: False
include_random_erasing: False
include_random_grayscale: True
include_random_horizontal_flip: False
include_random_perspective: True
include_random_rotation: True
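A sketch of how a flag configuration like the one above could be turned into a training transform pipeline (the transform parameters, image size, and `cfg` dict are illustrative assumptions):

```python
import torchvision.transforms as T

def build_train_transform(cfg: dict) -> T.Compose:
    """Compose the enabled transforms in order; parameters are illustrative."""
    steps = []
    if cfg.get("include_center_crop"):
        steps.append(T.CenterCrop(200))
    if cfg.get("include_color_jitter"):
        steps.append(T.ColorJitter(brightness=0.3, contrast=0.3))
    if cfg.get("include_gaussian_blur"):
        steps.append(T.GaussianBlur(kernel_size=5))
    if cfg.get("include_random_grayscale"):
        steps.append(T.RandomGrayscale(p=0.1))
    if cfg.get("include_random_perspective"):
        steps.append(T.RandomPerspective(p=0.5))
    if cfg.get("include_random_rotation"):
        steps.append(T.RandomRotation(degrees=15))
    # Deterministic final steps shared with validation/test preprocessing.
    steps += [T.Resize((224, 224)), T.ToTensor()]
    return T.Compose(steps)
```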
Model Hyper-parameter Tuning:
You can view detailed charts for the final runs used to select the optimal configurations here: Summary in Wandb
(This report shows the final runs with most of the parameters stabilized to very narrow ranges; more extensive tuning was used to select the appropriate base_lr, schedulers, batch_size, etc.)
This report has pretty much all of the parameters that have been tried (note that it contains runs from multiple different tuning sessions/sweeps that might have used different samples etc., so it's included to give a broad, approximate picture).
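For reference, a sweep over those parameters could be configured programmatically with wandb roughly as follows (the metric, ranges, project name, and `train` entry point are illustrative assumptions, not our exact sweep):

```python
import wandb

# Illustrative sweep definition; the actual sweeps spanned multiple
# sessions with different samples and wider parameter ranges.
sweep_config = {
    "method": "bayes",
    "metric": {"name": "val_age_mae", "goal": "minimize"},
    "parameters": {
        "base_lr": {"distribution": "log_uniform_values", "min": 1e-5, "max": 1e-2},
        "batch_size": {"values": [32, 64, 128]},
        "scheduler": {"values": ["cosine", "step", "plateau"]},
    },
}

sweep_id = wandb.sweep(sweep_config, project="age-gender")  # project name is illustrative
wandb.agent(sweep_id, function=train, count=20)  # `train` is the training entry point (assumed)
```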
Additional Transformations used with Augmentation-Based Oversampling:
We've chosen to include two final versions of our model:

- V1/baseline: no oversampling; all images in the dataset are used only once per epoch, with randomly selected transformations applied to a random proportion of them.
- V2/improved: same as above, but additional augmented samples are used for all age bins.
Training sample summary for both models:
Without oversampling:
| Age Range | Count |
|---|---|
| 0-9 | 2452 |
| 10-19 | 1268 |
| 20-29 | 5816 |
| 30-39 | 3586 |
| 40-49 | 1837 |
| 50-59 | 1845 |
| 60-69 | 1068 |
| 70-79 | 543 |
| 80-89 | 541 |
With augmented samples and oversampling:
| Age Range | Count |
|---|---|
| 0-9 | 4042 |
| 10-19 | 3213 |
| 20-29 | 6397 |
| 30-39 | 4836 |
| 40-49 | 3611 |
| 50-59 | 3617 |
| 60-69 | 3073 |
| 70-79 | 2705 |
| 80-89 | 2704 |
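A conceptual sketch of how augmentation-based oversampling could balance the age bins, assuming a DataFrame `df` with an `age` column (the helper and target count are illustrative; the counts above come from the actual pipeline):

```python
import pandas as pd

def oversample_by_age_bin(df: pd.DataFrame, target: int) -> pd.DataFrame:
    """Duplicate rows of under-represented age bins up to `target` samples;
    each duplicate later receives a randomly chosen augmentation, so the
    extra images are transformed rather than exact copies."""
    bins = pd.cut(df["age"], bins=range(0, 100, 10), right=False)
    parts = [df]
    for _, group in df.groupby(bins, observed=True):
        deficit = target - len(group)
        if deficit > 0 and len(group) > 0:
            parts.append(group.sample(n=deficit, replace=True, random_state=0))
    return pd.concat(parts, ignore_index=True)
```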
Examples of additional augmented duplicate samples that could be included: