Object Detection with Python

Object detection is the process of finding instances of real-world objects in still images or videos. When we look at an image, our brain instantly finds the objects it contains. A machine, by contrast, needs a lot of time and training data to learn to find object instances. Recent advances in computer vision, however, have made this task much easier.

clear
% Load the ground truth data
load('DaataAndCars.mat')
load('rcnnStopSigns.mat')
% Update the path to the image files to match the local file system
DaataAndCars.imageFilename = fullfile('carData/', DaataAndCars.imageFilename);

% Display a summary of the ground truth data
summary(DaataAndCars)
Variables:
imageFilename: 41×1 cell array of character vectors
stopSign: 41×1 cell
carRear: 41×1 cell
carFront: 41×1 cell

% Only keep the image file names and the car rear ROI labels
carFront = DaataAndCars(:, {'imageFilename', 'carRear'});
% Display one training image and the ground truth bounding boxes
I = imread(carFront.imageFilename{1});

I = insertObjectAnnotation(I, 'Rectangle', carFront.carRear{1}, ...
    'Car Rear', 'LineWidth', 24);

figure
imshow(I)

% Set training options. The detector is trained here rather than
% loaded from disk, so this step takes some time.
options = trainingOptions('sgdm', ...
    'MiniBatchSize', 128, ...
    'InitialLearnRate', 1e-3, ...
    'LearnRateSchedule', 'piecewise', ...
    'LearnRateDropFactor', 0.1, ...
    'LearnRateDropPeriod', 100, ...
    'MaxEpochs', 100, ...
    'Verbose', true);
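
The 'piecewise' learning rate schedule configured above can be illustrated in a few lines of plain Python, independent of the toolbox calls in this post. This is only a sketch under one assumption: the rate is multiplied by the drop factor once for every full drop period of epochs already completed, with epochs counted from 1. The function name piecewise_learn_rate is hypothetical.

```python
def piecewise_learn_rate(epoch, initial=1e-3, drop_factor=0.1, drop_period=100):
    """Return the learning rate for a 1-indexed epoch under a step schedule.

    Assumption: the rate is scaled by drop_factor once per completed
    drop_period of epochs. Defaults mirror the options set above.
    """
    if epoch < 1:
        raise ValueError("epoch is 1-indexed and must be >= 1")
    completed_periods = (epoch - 1) // drop_period
    return initial * drop_factor ** completed_periods


# Rates at a few epochs; epoch 101 is hypothetical, since MaxEpochs is 100.
rates = {epoch: piecewise_learn_rate(epoch) for epoch in (1, 50, 100, 101)}
for epoch, rate in rates.items():
    print(f"epoch {epoch:3d}: learning rate {rate:.4f}")
```

Under this assumption, a drop period of 100 combined with 100 epochs means the rate stays at its initial value for the whole run. A shorter drop period would be needed for the schedule to have any effect.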


% Train an R-CNN object detector. This will take several minutes.
rcnn = trainRCNNObjectDetector(carFront, cifar10Net, options, ...
    'NegativeOverlapRange', [0 0.3], 'PositiveOverlapRange', [0.5 1])
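
The two overlap ranges passed to the training call decide which region proposals become positive and negative training samples. The plain-Python sketch below shows the idea, assuming overlap is measured as intersection over union (IoU) on boxes in [x y width height] form and that each proposal is compared against its best-matching ground truth box. The helper names iou and label_proposal are hypothetical, and the boxes are made-up illustration values rather than data from the training set.

```python
def iou(box_a, box_b):
    """Intersection over union of two [x, y, width, height] boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = min(ax + aw, bx + bw) - max(ax, bx)
    inter_h = min(ay + ah, by + bh) - max(ay, by)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # the boxes do not overlap
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union


def label_proposal(proposal, ground_truth_boxes,
                   negative_range=(0.0, 0.3), positive_range=(0.5, 1.0)):
    """Classify a proposal by its best IoU against the ground truth boxes."""
    best = max((iou(proposal, gt) for gt in ground_truth_boxes), default=0.0)
    if positive_range[0] <= best <= positive_range[1]:
        return "positive"
    if negative_range[0] <= best <= negative_range[1]:
        return "negative"
    return "ignored"  # falls between the two ranges, so it is not used


ground_truth = [[0, 0, 100, 100]]          # illustrative box only
print(label_proposal([10, 10, 100, 100], ground_truth))  # IoU about 0.68
print(label_proposal([30, 30, 100, 100], ground_truth))  # IoU about 0.32
print(label_proposal([60, 60, 100, 100], ground_truth))  # IoU about 0.09
```

With the ranges used in this post, proposals whose overlap falls between 0.3 and 0.5 are left out, which keeps ambiguous boxes from confusing the classifier.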


*******************************************************************
Training an R-CNN Object Detector for the following object classes:

* carRear

--> Extracting region proposals from 41 training images...done.

--> Training a neural network to classify objects in training data...

Training on single GPU.
Initializing input data normalization.
|============================================================================|
| Epoch | Iteration | Time Elapsed | Mini-batch | Mini-batch | Base Learning |
|       |           | (hh:mm:ss)   | Accuracy   | Loss       | Rate          |
|============================================================================|
|     1 |         1 |     00:00:00 |     62.50% |     0.6585 |        0.0010 |
|    25 |        50 |     00:00:15 |    100.00% |     0.0018 |        0.0010 |
|    50 |       100 |     00:00:31 |    100.00% |     0.0008 |        0.0010 |
|    75 |       150 |     00:00:46 |    100.00% |     0.0014 |        0.0010 |
|   100 |       200 |     00:01:02 |    100.00% |     0.0001 |        0.0010 |
|============================================================================|

Network training complete.

--> Training bounding box regression models for each object class...100.00%...done.

Detector training complete (with warnings):


*******************************************************************


rcnn = 
 rcnnObjectDetector with properties:

Network: [1×1 SeriesNetwork]
RegionProposalFcn: @rcnnObjectDetector.proposeRegions
ClassNames: {'carRear' 'Background'}
BoxRegressionLayer: 'conv_2'

% Read test image
testImage = imread('image003.jpg');
% Detect car rears in the test image
[bboxes,score,label] = detect(rcnn,testImage,'MiniBatchSize',128)
bboxes = 2×4
1016 427 609 378
782 363 406 407

score = 2×1 single column vector

0.9983
0.5087

label = 2×1 categorical array
 carRear

% Display the detection results
[score, idx] = max(score);
bbox = bboxes(idx, :);
annotation = sprintf('%s: (Confidence = %f)', label(idx), score);
outputImage = insertObjectAnnotation(testImage, 'rectangle', bbox, annotation);
figure
imshow(outputImage)
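
The selection step above (keep the highest-scoring box and build its caption) can be mirrored in plain Python using the boxes and scores printed earlier. This is only a sketch: the helper names best_detection and to_corners are hypothetical, the class name is assumed to be carRear because it is the only object class the detector reports, and Python indices start at 0 rather than 1.

```python
def best_detection(bboxes, scores):
    """Return (index, box, score) of the highest-scoring detection."""
    if not bboxes or len(bboxes) != len(scores):
        raise ValueError("bboxes and scores must be non-empty and equal in length")
    idx = max(range(len(scores)), key=lambda i: scores[i])
    return idx, bboxes[idx], scores[idx]


def to_corners(box):
    """Convert an [x, y, width, height] box to (x1, y1, x2, y2) corners."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


# Values copied from the detect() output shown earlier in this post.
bboxes = [[1016, 427, 609, 378], [782, 363, 406, 407]]
scores = [0.9983, 0.5087]

idx, bbox, score = best_detection(bboxes, scores)
annotation = "%s: (Confidence = %f)" % ("carRear", score)
print(idx, to_corners(bbox), annotation)
```

Keeping only the top box suits an image with a single car. For scenes with several cars, filtering by a score threshold instead would be a reasonable alternative.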