Image Classification Project_Imagenette

May 11, 2021

Hello, this is Tony, and I am a newcomer to machine learning. My full Jupyter Notebook can be found at https://jovian.ai/wjshku/proj-cov-imagenette. Let’s begin.

Explore our Dataset

I found this dataset through FastAI; the source is here: https://github.com/fastai/imagenette.

This is a simplified dataset containing 10 classes of images (each class has a training set of around 1000 images and a validation set of around 500 images). The classes are: classes = ['tench', 'English springer', 'cassette player', 'chain saw', 'church', 'French horn', 'garbage truck', 'gas pump', 'golf ball', 'parachute'].

Prepare our data

After finishing most of my code and running the evaluate function, I found that the images do not all share the same size. That is a real problem, because a batch can only be stacked from tensors of the same shape.

After a long time, I finally figured out the solution: use the transforms module from torchvision to resize the image files. (I had even written a transpose function earlier, only to find that the image sizes were still not equal after transposing.) If you want to learn more about transforms, see the video linked below.

https://www.youtube.com/watch?v=X_QOZEko5uE

from torchvision import transforms
from torchvision.datasets import ImageFolder
transform = transforms.Compose([transforms.Resize((64,64)),transforms.ToTensor()])
dataset = ImageFolder(data_dir+'/train', transform=transform)

Write Model & Upload to GPU

To handle a large number of images efficiently, we should make use of Google Colab’s GPU. After writing the model and dataloader in the usual way, we need an additional to_device() function and a wrapper class for the dataloader. See below.

def to_device(data, device):
    """Move tensor(s) to chosen device"""
    if isinstance(data, (list,tuple)):
        return [to_device(x, device) for x in data]
    return data.to(device, non_blocking=True)

class DeviceDataLoader():
    """Wrap a dataloader to move data to a device"""
    def __init__(self, dl, device):
        self.dl = dl
        self.device = device
        
    def __iter__(self):
        """Yield a batch of data after moving it to device"""
        for b in self.dl: 
            yield to_device(b, self.device)

    def __len__(self):
        """Number of batches"""
        return len(self.dl)
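The code above needs a device to move things to. The post does not show how that device is chosen, so the following is only a sketch: get_default_device is a hypothetical helper name, and the tiny model and batch are stand-ins used to show the transfer.

```python
import torch
import torch.nn as nn

def get_default_device():
    """Pick the GPU if one is available, otherwise fall back to the CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")

device = get_default_device()

# A tiny stand-in model and batch, only to show the device transfer.
model = nn.Linear(4, 2).to(device)
batch = torch.randn(8, 4).to(device, non_blocking=True)

outputs = model(batch)
print(device, outputs.shape)
```

With the wrapper class above, an existing dataloader (say, one called train_dl, a name assumed here) would be wrapped as DeviceDataLoader(train_dl, device), and the model would be moved with to_device(model, device).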

Train our Model!

So finally, we can train the model. Considering the complexity of the image dataset, I chose five convolutional blocks (two convolutional layers each, followed by max pooling) and three linear layers.

self.network = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2, 2), # output: 64 x 32 x 32

            nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2, 2), # output: 128 x 16 x 16

            nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2, 2), # output: 256 x 8 x 8

            nn.Conv2d(256, 512, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv2d(512, 512, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2, 2), # output: 512 x 4 x 4
            
            nn.Conv2d(512, 1024, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv2d(1024, 1024, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2, 2), # output: 1024 x 2 x 2

            nn.Flatten(), 
            nn.Linear(1024*2*2, 1024),
            nn.ReLU(),
            nn.Linear(1024, 128),
            nn.ReLU(),
            nn.Linear(128, 10))
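The shapes through the network can be checked with a little arithmetic. This sketch uses the standard output-size formula for convolution and pooling layers and assumes the 64 x 64 input size from the resize step; conv_output_size is a helper name introduced here for illustration.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (square input)."""
    return (size + 2 * padding - kernel_size) // stride + 1

size = 64  # images are resized to 64 x 64
block_channels = [64, 128, 256, 512, 1024]  # channels leaving each block

for channels in block_channels:
    # Two 3x3 convolutions with padding=1 keep the spatial size unchanged.
    size = conv_output_size(size, kernel_size=3, stride=1, padding=1)
    size = conv_output_size(size, kernel_size=3, stride=1, padding=1)
    # MaxPool2d(2, 2) halves it.
    size = conv_output_size(size, kernel_size=2, stride=2)
    print(f"{channels} x {size} x {size}")

flattened = block_channels[-1] * size * size
print(flattened)  # 4096, matching nn.Linear(1024*2*2, 1024)
```

The last block leaves a 1024 x 2 x 2 feature map, which is why the first linear layer takes 1024*2*2 inputs.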

The learning rate of my model is 0.0001, which is quite low, so more epochs are needed.

One of the difficulties I faced was choosing the learning rate. A learning rate that is too large can make the loss oscillate or diverge, or settle quickly on a poor solution, while one that is too small makes training very slow. So we should run an experiment when choosing lr. If the loss and accuracy do not improve at all, or the loss even increases, consider decreasing the lr (we normally use values of the form 1e-x with x > 1). See the article below to learn more.

Understand the Impact of Learning Rate on Neural Network Performance - Machine Learning Mastery (machinelearningmastery.com)
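The effect of the learning rate is easy to see on a toy problem. The sketch below is not the image model from this post: it runs plain gradient descent on f(w) = w**2, with learning rates chosen only to illustrate the three cases (too large, reasonable, too small).

```python
def gradient_descent(lr, steps=20, start=1.0):
    """Minimise f(w) = w**2 with plain gradient descent; return the final loss."""
    w = start
    for _ in range(steps):
        grad = 2 * w          # derivative of w**2
        w = w - lr * grad     # gradient descent update
    return w ** 2

for lr in (1.1, 0.1, 0.0001):
    print(f"lr={lr}: final loss = {gradient_descent(lr):.6f}")
```

With lr=1.1 every step overshoots the minimum and the loss grows, with lr=0.1 the loss drops close to zero within 20 steps, and with lr=0.0001 the loss has barely moved from its starting value of 1.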

Summary

The final model accuracy is almost 70%, which is pretty good. But obviously we could do better. I will update the model after I have learned some more advanced techniques such as regularization.

Last but not least, if you are interested, you can also take a look at the training tutorial from FastAI (its final result is better than mine, of course, but it also uses some more advanced functions): https://docs.fast.ai/tutorial.imagenette.html
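For reference, classification accuracy is usually computed as the fraction of images whose highest-scoring class matches the label. The post does not show the exact function used, so this is a sketch under that assumption, with made-up scores for four samples and three classes.

```python
import torch

def accuracy(outputs, labels):
    """Fraction of samples whose highest-scoring class matches the label."""
    preds = torch.argmax(outputs, dim=1)
    return (preds == labels).float().mean().item()

# Hand-made scores for 4 samples and 3 classes (illustrative values only).
outputs = torch.tensor([[2.0, 0.1, 0.3],
                        [0.2, 1.5, 0.1],
                        [0.9, 0.2, 0.4],
                        [0.1, 0.3, 2.2]])
labels = torch.tensor([0, 1, 2, 2])

print(accuracy(outputs, labels))  # 0.75
```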