
Feature Descriptor : SIFT (Scale Invariant Feature Transform) Part 1 : Introduction to SIFT

Features and interest points are important information that can be extracted from an image to provide a “feature” description of an object in that image. This description can then be used to locate the object in another image, a task usually called image matching. Image matching is a fundamental aspect of many problems in computer vision, including object or scene recognition, solving for 3D structure from multiple images, and motion tracking. To achieve a good matching result, there are many considerations when extracting and recording features from an image.

Matching features across different images is a common problem in computer vision. When the images are similar in nature, meaning they have the same scale and orientation, a simple corner detector can be used to extract features from both images. However, when the images differ in scale and rotation, a simple corner detector is no longer enough. Some corner detectors, such as the Harris detector, are rotation invariant, which means we can still find the same corner even if the image is rotated. This makes sense because corners remain corners in a rotated image. However, Harris is not scale invariant, because a corner may no longer be a corner once the image is scaled. A corner that fits inside a small window in a small image may look flat when the image is zoomed in and viewed through the same window, as illustrated in the picture below:

Picture 1 – Illustration of image scaling
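
The effect described above can be reproduced numerically. The following is a minimal sketch, not the implementation of any particular library: it evaluates the Harris response (determinant minus k times the squared trace of the summed gradient products) inside a fixed 7×7 window placed on the edge of a curved shape, first at a small scale and then after zooming in by a factor of 10. The soft-edged disc, the uniform window (Harris detectors normally use Gaussian weighting), the helper names, and k = 0.04 are all assumptions chosen for illustration.

```python
import math


def disc_intensity(x, y, radius):
    """Brightness of a soft-edged disc centred at the origin: 1 inside, 0 outside."""
    distance = math.hypot(x, y)
    # Linear one-pixel ramp across the boundary so the gradients are smooth.
    return min(1.0, max(0.0, radius + 0.5 - distance))


def harris_response(radius, half_window=3, k=0.04):
    """Harris response in a fixed window centred on the topmost point of the disc."""
    center_x, center_y = 0, -radius
    sxx = sxy = syy = 0.0
    for y in range(center_y - half_window, center_y + half_window + 1):
        for x in range(center_x - half_window, center_x + half_window + 1):
            # Central-difference image gradients.
            ix = (disc_intensity(x + 1, y, radius) - disc_intensity(x - 1, y, radius)) / 2
            iy = (disc_intensity(x, y + 1, radius) - disc_intensity(x, y - 1, radius)) / 2
            # Accumulate the entries of the 2x2 second-moment matrix (uniform window).
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
    determinant = sxx * syy - sxy * sxy
    trace = sxx + syy
    return determinant - k * trace * trace


small_response = harris_response(radius=3)    # small image: the curve bends inside the window
zoomed_response = harris_response(radius=30)  # same shape zoomed 10x, same 7x7 window

print("small :", small_response)
print("zoomed:", zoomed_response)
```

A positive response indicates a corner-like patch and a negative response an edge-like patch. In this sketch the small-scale window gives a positive value, while the zoomed window, which sees an almost straight edge, gives a negative one.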

SIFT, which stands for Scale Invariant Feature Transform, is a popular interest point detector and descriptor that is widely used because of its scale and rotation invariance. SIFT was developed by David Lowe of the University of British Columbia, and the journal paper describing it was published in 2004 (Lowe, 2004). An example of SIFT's robustness against rotation and scale transformation is shown in the picture below:

Picture 2 – Robustness of SIFT against rotation and scale transformation

In short, SIFT is expected to achieve three goals:

  • To extract distinctive invariant features which can be correctly matched against a large database of features, providing a basis for object and scene recognition
  • To extract features which are invariant to image scale and rotation
  • To extract features which are robust against affine distortion, change in 3D viewpoint, and noise

The Advantages of SIFT

Besides its scale and rotation invariance, SIFT also has several other advantages:

  • Locality
    Before going into details, we should first know what a local feature is and how local and global features differ.
    Basically, two types of features can be extracted from an image: global and local features. Global features describe the image as a whole in order to generalize the entire object. They include contour representations, shape descriptors, and texture features; examples are shape matrices and Histogram of Oriented Gradients (HOG). Local features describe image patches (key points in the image) of an object. Examples of local features are SIFT and SURF. Generally, global features are used for low-level applications such as object detection and classification, while local features are used for higher-level applications such as object recognition because they are more robust to occlusion and clutter than global features. SIFT features are local, so they share this robustness.
  • Distinctive
    Each individual feature extracted by SIFT has a highly distinctive descriptor, which allows a single feature to find its correct match with good probability in a large database of features.
  • Quantity
    One major advantage of SIFT is that it can generate large numbers of features that densely cover the image over the full range of scales and locations. For instance, it is possible to collect about 2000 stable features from a typical image of size 500×500 pixels. The quantity of features is particularly important for object recognition: to detect small objects in a cluttered background, at least 3 features from each object must be correctly matched for reliable identification.
  • Efficiency
    The performance of SIFT is close to real-time.
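
To make the idea of matching a distinctive descriptor against a database more concrete, here is a minimal sketch of nearest-neighbour matching using Euclidean distance. The function name `match_descriptor`, the toy 4-dimensional vectors, and the `max_ratio` parameter are assumptions for illustration only (real SIFT descriptors have 128 dimensions). The check that the nearest neighbour must be clearly closer than the second nearest, with a threshold of 0.8, follows the ratio test described in Lowe (2004) rather than anything stated in this article.

```python
import math


def match_descriptor(query, database, max_ratio=0.8):
    """Return the index of the nearest database descriptor, or None if the match is ambiguous."""
    # Euclidean distance from the query to every stored descriptor, nearest first.
    ranked = sorted((math.dist(query, candidate), index)
                    for index, candidate in enumerate(database))
    (best_distance, best_index), (second_distance, _) = ranked[0], ranked[1]
    # Accept only if the nearest neighbour is clearly closer than the runner-up.
    if best_distance < max_ratio * second_distance:
        return best_index
    return None


# Toy 4-dimensional descriptors (hypothetical values, not real SIFT output).
database = [
    [0.9, 0.1, 0.0, 0.2],
    [0.1, 0.8, 0.3, 0.0],
    [0.0, 0.2, 0.9, 0.1],
]

distinct_query = [0.85, 0.15, 0.05, 0.2]   # clearly closest to entry 0
ambiguous_query = [0.5, 0.45, 0.15, 0.1]   # about equally close to entries 0 and 1

print(match_descriptor(distinct_query, database))
print(match_descriptor(ambiguous_query, database))
```

The distinctive query is matched to a single database entry, while the ambiguous query is rejected because two entries are about equally close. This is why a highly distinctive descriptor matters when the database is large.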

The details of the SIFT algorithm will be explained in part 2.


Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2), 91-110.

Written By
Irene Anindaputri Iswanto, S.Kom., M.Eng.
Lecture Specialist S2 | School of Computer Science
