
Project: U-Net Segmentation



Python, PyTorch, Artificial Intelligence (AI), Image Segmentation, Carvana Image Masking Challenge, Cityscapes Dataset

Last Updated: December 2023

Backup of Old Project (December 2023)

This is a backup of an old project focused on training a U-Net model from scratch for semantic segmentation on the Cityscapes and Carvana datasets. The images are DOWNSCALED to speed up training for learning purposes.

The model has been trained and tested with the following results:

[!WARNING] If you are looking for a high-quality model, this is NOT the place. This was only a practice exercise from when I was in year 2.

[!NOTE] Training was done long ago, and some recorded parameters, such as the number of epochs, may not be accurate.

[!IMPORTANT] This project was done a few years before this documentation was written. Back then, the training details were not documented properly and the Jupyter notebook results were not saved for each experiment. The results are not excellent either. Sorry for the inconvenience.

Experiments

Carvana Dataset (Carvana Image Masking Challenge)

Dataset information

  • Original image dimension: 1918 x 1280 (width x height)
  • Training image dimensions: 160 x 240 and 320 x 480 (height x width)

Validation accuracies (highest)

  • Validation Pixel Accuracy: 0.9955
  • Validation Dice Score: 0.9911
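For reference, the two metrics above can be computed from a pair of binary masks as in the sketch below. This is a minimal NumPy illustration, not the original evaluation code; the function names and the `eps` smoothing term are assumptions made for the example.

```python
import numpy as np


def pixel_accuracy(pred, target):
    """Fraction of pixels where the predicted mask equals the reference mask."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    return float((pred == target).mean())


def dice_score(pred, target, eps=1e-8):
    """Dice = 2 * |pred AND target| / (|pred| + |target|) for binary masks.

    eps (an assumption of this sketch) avoids division by zero when both
    masks are empty; in that case the function returns 0.0.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    return float(2.0 * intersection / (pred.sum() + target.sum() + eps))


# Tiny hypothetical 2 x 4 masks (1 = car, 0 = background).
example_pred = np.array([[1, 1, 0, 0],
                         [0, 1, 0, 0]])
example_target = np.array([[1, 0, 0, 0],
                           [0, 1, 1, 0]])

print(pixel_accuracy(example_pred, example_target))
print(dice_score(example_pred, example_target))
```

In the example, 6 of 8 pixels agree and 2 foreground pixels overlap out of 3 predicted and 3 reference, so the accuracy is 0.75 and the Dice score is about 0.667.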

Results

Results on test set: (Top: Prediction, Bottom: Reference)

[Figure: predictions and reference masks on the test set]

Results on train set: (Top: Prediction, Bottom: Reference)

[Figure: predictions and reference masks on the train set]

Inspection of intermediate layers: (In the order: Prediction, Output of Downsampling Block 1, Output of Bottleneck Block, Output of Upsampling Block 1, Reference)

[Figure: intermediate layer outputs]

Details of the experiments
Normalization: mean 0, std 1
Augmentations:
transforms.RandomHorizontalFlip(p=0.5),
transforms.RandomVerticalFlip(p=0.1),
transforms.RandomRotation(degrees=35),
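For segmentation, a geometric augmentation has to be applied identically to the image and its mask, otherwise the labels no longer line up with the pixels. The original pairing code is not recorded here, so the following is only a sketch of the idea in NumPy, covering the two flips with the probabilities listed above. The function name `paired_random_flips` is hypothetical, and the rotation step is left out to keep the example short.

```python
import numpy as np


def paired_random_flips(image, mask, rng, p_horizontal=0.5, p_vertical=0.1):
    """Apply the same random flips to an (H, W, C) image and its (H, W) mask.

    Each random decision is drawn once and used for both arrays, so the mask
    stays aligned with the image.
    """
    if rng.random() < p_horizontal:
        image = image[:, ::-1]  # flip left-right (width axis)
        mask = mask[:, ::-1]
    if rng.random() < p_vertical:
        image = image[::-1]     # flip top-bottom (height axis)
        mask = mask[::-1]
    # Return contiguous copies so the result no longer shares
    # negative-stride views with the inputs.
    return np.ascontiguousarray(image), np.ascontiguousarray(mask)


# Hypothetical toy data: a 2 x 4 RGB image and a matching 2 x 4 mask.
demo_image = np.arange(24).reshape(2, 4, 3)
demo_mask = np.arange(8).reshape(2, 4)

rng = np.random.default_rng(0)
aug_image, aug_mask = paired_random_flips(demo_image, demo_mask, rng)
print(aug_image.shape, aug_mask.shape)
```

With torchvision, a similar effect is commonly obtained by drawing the random decision once and calling the functional flip on both tensors, rather than passing the image and the mask through two independent random transforms.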

**29/12/2023 17:15: result12_**
Downscaled image dimension: 160 x 240
Batch size: 32
Learning rate: 5e-7
Decay: StepLR: step_size=5, gamma=0.85
Loss: BCEWithLogitsLoss
Epochs: 42
Final: Test accuracy: 90.9%, Avg loss: 0.403125, Test recall: 0.5831877589225769, precision: 0.980972170829773
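To see what the StepLR setting in this run implies, the schedule can be written in closed form: the rate is multiplied by gamma once every step_size epochs. The sketch below assumes zero-indexed epochs and one scheduler step per epoch; the helper name `step_lr` is hypothetical, and the recorded epoch count may itself be inaccurate.

```python
def step_lr(initial_lr, epoch, step_size, gamma):
    """Learning rate at a zero-indexed epoch under a step decay schedule."""
    return initial_lr * gamma ** (epoch // step_size)


# Settings recorded for this run: lr=5e-7, step_size=5, gamma=0.85.
for epoch in (0, 5, 20, 41):
    print(epoch, step_lr(5e-7, epoch, step_size=5, gamma=0.85))
```

Under these assumptions the rate has been decayed 8 times by the 42nd epoch, leaving roughly 27% of an already very small starting value.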

**30/12/2023 08:40: result13_**
Downscaled image dimension: 160 x 240
Batch size: 32
Learning rate: 1e-4
Decay: ReduceLROnPlateau: patience=5
Loss: BCEWithLogitsLoss
Epochs: 50
Final: Test accuracy: 81.0%, Avg loss: 0.775568, Test recall: 0.1010913997888565, precision: 0.9727051258087158
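All runs here use BCEWithLogitsLoss, which takes raw logits and combines the sigmoid with binary cross-entropy in a numerically stable way. The sketch below writes out that per-pixel formula in plain Python for illustration; it is not the PyTorch implementation, and the function names are hypothetical.

```python
import math


def bce_with_logits(logit, target):
    """Binary cross-entropy on a raw logit, for a target in [0, 1].

    Equivalent to -[t * log(sigmoid(x)) + (1 - t) * log(1 - sigmoid(x))],
    rearranged so that exp() never receives a large positive argument.
    """
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def mean_bce_with_logits(logits, targets):
    """Average the per-pixel loss over flat sequences of logits and targets."""
    losses = [bce_with_logits(x, t) for x, t in zip(logits, targets)]
    return sum(losses) / len(losses)


# A logit of 0 means "50/50", so the loss is log(2) whatever the target is.
print(bce_with_logits(0.0, 1.0))
# A confident wrong prediction is penalised roughly linearly in the logit.
print(bce_with_logits(1000.0, 0.0))
print(mean_bce_with_logits([2.0, -2.0], [1.0, 0.0]))
```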

**30/12/2023 11:33: result14_**
Downscaled image dimension: 160 x 240
Batch size: 32
Learning rate: 1e-4
Decay: ReduceLROnPlateau: patience=5
Loss: BCEWithLogitsLoss
Epochs: 100
Final: Test accuracy: 96.6%, Avg loss: 0.210099, Test recall: 0.9129086136817932, precision: 0.927651584148407

**30/12/2023 15:47: 1703922437**
Downscaled image dimension: 160 x 240
Batch size: 16
Learning rate: 1e-4
Decay: None
Loss: BCEWithLogitsLoss
Final: Test accuracy: 99.5%, Avg loss: 0.013175, Test recall: 0.9918522238731384, precision: 0.9872172474861145, dice_score: 0.9895293116569519

**30/12/2023 16:49: 1703926149**
Downscaled image dimension: 320 x 480
Batch size: 8
Learning rate: 1e-4
Decay: None
Loss: BCEWithLogitsLoss
Final: Test accuracy: 99.6%, Avg loss: 0.009994, Test recall: 0.9942919611930847, precision: 0.9879629611968994, dice_score: 0.9911173582077026
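For a binary mask, the Dice score is the harmonic mean of precision and recall (the F1 score), and the dice_score values recorded for the last two runs agree with that relation. The sketch below checks this and applies the same formula to the earlier runs, where no Dice score was recorded. The function name is hypothetical, and the values for the earlier runs are derived here, not taken from the original logs.

```python
def dice_from_precision_recall(precision, recall):
    """Dice (F1) score as the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# (precision, recall) pairs as recorded for each run.
runs = {
    "result12_": (0.980972170829773, 0.5831877589225769),
    "result13_": (0.9727051258087158, 0.1010913997888565),
    "result14_": (0.927651584148407, 0.9129086136817932),
    "1703922437": (0.9872172474861145, 0.9918522238731384),
    "1703926149": (0.9879629611968994, 0.9942919611930847),
}

for name, (precision, recall) in runs.items():
    print(name, round(dice_from_precision_recall(precision, recall), 4))
```

This also shows why the first two runs look better on pixel accuracy than they really are: high precision with low recall means most car pixels were missed, which pulls the Dice score down sharply.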

Cityscapes Dataset

  • Test Pixel Accuracy: 0.8669
  • The code for evaluating mAP on the Cityscapes dataset has been lost, so the code in this repository does not reflect the true results.
1704279189_cityscapes/model_1704284109.h5: 

**Documented:**
Test acc: 0.8402184247970581, Test loss: 0.6274673556908965