
11. Image Classification with Deep Learning: CNN in TensorFlow Using Python

Problem Statement

Develop an automated image classification system to accurately distinguish between various fruits and vegetables using deep learning techniques. Leveraging Convolutional Neural Networks (CNNs) implemented in TensorFlow with Python, this system will classify images of different fruits and vegetables, aiding in applications such as inventory management, quality control, and dietary analysis. The model will be trained, validated, and tested on a diverse dataset, ensuring high accuracy and generalization across a wide range of real-world images.


  1. Device Selection
  2. Data Preprocessing:
    • Transform the Training Dataset
    • Transform the Validation Dataset
    • Load the Datasets
  3. Model Architecture:
    • Build the Model
    • Train the Model
  4. Training and Validation
  5. Evaluation:
    • Evaluate the Test Dataset
  6. Model Saving
  7. Single Image Prediction

1. Device Selection 

The device is selected based on whether a GPU (NVIDIA CUDA or Apple Metal/MPS) is available, defaulting to the CPU if neither is found. For setup details, see GPU Support in TensorFlow for NVIDIA and macOS.
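A minimal sketch of this check is shown below. It assumes a standard TensorFlow install; note that Apple's tensorflow-metal plugin also registers the accelerator as a "GPU" device, so a single GPU check covers both NVIDIA CUDA and Apple Silicon:

```python
import tensorflow as tf

# Prefer a GPU if one is registered (NVIDIA CUDA builds and Apple's
# tensorflow-metal plugin both expose the accelerator as "GPU");
# otherwise fall back to the CPU.
gpus = tf.config.list_physical_devices("GPU")
device = "/GPU:0" if gpus else "/CPU:0"
print(f"Using device: {device}")
```

Operations placed inside a `with tf.device(device):` block will then run on the selected device.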

2. Data Preprocessing

In this project, we employ three distinct datasets to develop and evaluate our image classification model: training, validation, and test datasets.

2.1 Transform the Training Dataset

The training dataset undergoes comprehensive transformations to enhance model robustness and generalization. These transformations include data augmentation techniques such as rotation, shifting, and flipping to simulate various real-world conditions and prevent overfitting.


The ImageDataGenerator in TensorFlow is a powerful tool for real-time data augmentation, which is essential for training deep learning models, especially when you have a limited dataset. Data augmentation involves creating new training samples through random transformations of the existing data, helping the model generalize better by seeing more variations of the data.
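A training-set generator combining all of the augmentations discussed in this project might be configured as follows (a sketch; the parameter values match those explained below):

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation pipeline for the training dataset. Each parameter is
# explained in detail in the list that follows.
train_datagen = ImageDataGenerator(
    rescale=1./255,          # normalize pixel values to [0, 1]
    rotation_range=20,       # random rotations up to ±20 degrees
    width_shift_range=0.2,   # horizontal shifts up to 20% of width
    height_shift_range=0.2,  # vertical shifts up to 20% of height
    shear_range=0.2,         # random shearing
    zoom_range=0.2,          # random zoom in/out up to 20%
    horizontal_flip=True,    # random horizontal mirroring
    fill_mode="nearest",     # fill new pixels with the nearest valid pixel
)
```

The validation and test generators would typically use only `rescale=1./255`, since augmentation is meant to diversify training data, not evaluation data.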

Here’s what each parameter in the ImageDataGenerator does:

1. rescale=1./255

  • Purpose: This rescales the pixel values of the images from their original range (0 to 255) to a range of 0 to 1.
  • Why It's Important: Normalizing the pixel values to a range between 0 and 1 helps in stabilizing the training process by ensuring that the inputs to the model have a consistent range. Most deep learning models, including CNNs, perform better when the input data is normalized.

2. rotation_range=20

  • Purpose: This randomly rotates the image by up to 20 degrees in either direction (i.e., within −20 to +20 degrees).
  • Why It's Important: Rotation helps the model learn that the object of interest (e.g., a fruit or vegetable) can appear at different angles, improving the model's ability to recognize objects regardless of their orientation.

3. width_shift_range=0.2

  • Purpose: This shifts the image horizontally by a fraction of the total width, up to 20% of the width.
  • Why It's Important: Horizontal shifts help the model learn to recognize objects that may not be perfectly centered in the image.

4. height_shift_range=0.2

  • Purpose: This shifts the image vertically by a fraction of the total height, up to 20% of the height.
  • Why It's Important: Vertical shifts, like horizontal shifts, help the model handle variations in the positioning of objects within the image.

5. shear_range=0.2

  • Purpose: Shearing involves shifting one part of the image more than another, effectively slanting the image along the x or y-axis. This parameter specifies the shear angle in degrees.
  • Why It's Important: Shearing makes the model more robust to slight distortions or slants in the object’s shape.

6. zoom_range=0.2

  • Purpose: This randomly zooms in or out on the image by up to 20%.
  • Why It's Important: Zooming helps the model learn to recognize objects at different scales, making it more adaptable to varying object sizes in the images.

7. horizontal_flip=True

  • Purpose: This randomly flips images horizontally, each with a 50% chance.
  • Why It's Important: Horizontal flipping is useful for images where the object of interest can appear mirrored, such as a fruit or vegetable viewed from either side. It increases the variety of training samples and helps the model generalize better.

8. fill_mode='nearest'

  • Purpose: When an image is transformed (e.g., rotated or shifted), there may be pixels in the output image that do not have corresponding pixels in the input image. fill_mode='nearest' fills these newly created pixels with the nearest pixel value from the valid part of the image.
  • Why It's Important: This ensures that the augmented image does not contain blank spaces, which could confuse the model during training.
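To see these transformations in action, the generator can be applied to a dummy image array (a minimal sketch; `flow` is used on an in-memory batch here purely for illustration, instead of the directory-based loading used in the project):

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,
    horizontal_flip=True,
)

# A dummy batch of one 64x64 RGB "image" with pixel values in [0, 255].
batch = np.random.randint(0, 256, size=(1, 64, 64, 3)).astype("float32")

# Draw one augmented batch; the shape is preserved and, thanks to
# rescale, the pixel values now lie in [0, 1].
augmented = next(datagen.flow(batch, batch_size=1, seed=42))
print(augmented.shape)  # (1, 64, 64, 3)
```

Each call to `next()` yields a freshly augmented batch, which is how the model sees a different variation of each image every epoch.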

2.2 Transform the Validation Dataset