This article was written by Ankit Sachan.
In this post, I explain object detection and several algorithms for it, including Faster R-CNN, YOLO, and SSD. We start at a beginner's level and work up to the state of the art in object detection, covering the intuition, approach, and salient features of each method.
What is Image Classification?
Image classification takes an image and predicts the class of the object in it.
The problem of identifying the location of an object (given the class) in an image is called localization. However, if the object class is not known, we have to not only determine the location but also predict the class of each object.
Predicting the location of the object along with its class is called object detection. Instead of predicting only the class of the object in an image, we now have to predict the class as well as a rectangle (called a bounding box) containing that object. It takes 4 variables to uniquely identify an axis-aligned rectangle. So, for each instance of an object in the image, we predict the following variables:
- class_name,
- bounding_box_top_left_x_coordinate,
- bounding_box_top_left_y_coordinate,
- bounding_box_width,
- bounding_box_height
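To make the five variables above concrete, here is a minimal Python sketch of what a detector's output might look like. The `Detection` class, its shortened field names, and the sample values are hypothetical and chosen for illustration only. The sketch also assumes the usual image convention that the origin is the top-left corner, with x growing to the right and y growing downward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One detected object: a class label plus a bounding box.

    Hypothetical record type; the fields mirror the variables listed above.
    """
    class_name: str
    x: float       # bounding_box_top_left_x_coordinate
    y: float       # bounding_box_top_left_y_coordinate
    width: float   # bounding_box_width
    height: float  # bounding_box_height

    def corners(self):
        """Return (x_min, y_min, x_max, y_max), an equivalent 4-number encoding."""
        return (self.x, self.y, self.x + self.width, self.y + self.height)

    def area(self):
        """Area of the bounding box in square pixels."""
        return self.width * self.height


# A detector returns one record per object instance, so an image that
# contains two objects yields two detections (values are made up).
detections = [
    Detection("dog", x=40, y=60, width=120, height=90),
    Detection("cat", x=200, y=80, width=60, height=50),
]

for det in detections:
    print(det.class_name, det.corners())
```

The top-left-plus-size encoding and the two-corner encoding carry the same information, which is why four numbers are enough to pin down the rectangle in either form.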
This article explains the following techniques:
- Object Detection using HOG Features
- Region-based Convolutional Neural Networks (R-CNN)
- Spatial Pyramid Pooling (SPP-net)
- Fast R-CNN
- Faster R-CNN and Regression-based Detectors
- YOLO (You Only Look Once)
- Single Shot Detector (SSD)
The full article is available here.