Wednesday, 27 March 2013

Feature extraction: Completed

Hello again.

Extracting features in the correct way involves several steps.

After computing the channels, I merge them into a single matrix with 10 layers (one per channel I am computing). This lets me use the full potential of OpenCV, an open source library for computer vision applications, in the following steps.

The features are nothing more than simple local sums calculated over rectangular areas in the image. The fastest way to do this is the integral image trick introduced by Viola and Jones in their face detection work. Once the integral image is built, any local sum can be obtained from just four lookups and three additions or subtractions, whatever the size of the rectangle, which makes the features extremely fast to compute.

"To calculate the summation of the pixels in the black box, you take the corresponding box in the integral. You sum as follows: (Bottom right + top left – top right – bottom left).

So for the 3,5,4,1 box, the calculations would go like this: (30+0-17-0 = 13). For the 4,1 box, it would be (0+15-10-0 = 5)." - Taken from here.
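
To make the trick concrete, here is a minimal sketch in plain C++ (standard library only, no OpenCV). The names IntegralImage, computeIntegral and boxSum are made up for illustration and are not from my project code. It assumes a row-major single-channel image and pads the integral with a row and column of zeros on the top and left, so the four-corner formula needs no special cases at the borders.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Integral image padded with one extra row and column of zeros (top and left).
// Entry (r, c) holds the sum of all original pixels in rows [0, r) and cols [0, c).
struct IntegralImage {
    std::size_t rows, cols;      // size of the ORIGINAL image
    std::vector<double> sum;     // (rows + 1) x (cols + 1), row-major

    double at(std::size_t r, std::size_t c) const {
        return sum[r * (cols + 1) + c];
    }
};

// Builds the integral image in a single pass over the pixels.
IntegralImage computeIntegral(const std::vector<double>& img,
                              std::size_t rows, std::size_t cols) {
    assert(img.size() == rows * cols);
    IntegralImage ii{rows, cols,
                     std::vector<double>((rows + 1) * (cols + 1), 0.0)};
    for (std::size_t r = 0; r < rows; ++r) {
        double rowSum = 0.0;  // running sum of the current row
        for (std::size_t c = 0; c < cols; ++c) {
            rowSum += img[r * cols + c];
            // value directly above (already complete) plus this row so far
            ii.sum[(r + 1) * (cols + 1) + (c + 1)] = ii.at(r, c + 1) + rowSum;
        }
    }
    return ii;
}

// Sum of the pixels in rows [r0, r1) and cols [c0, c1):
// bottom right + top left - top right - bottom left.
double boxSum(const IntegralImage& ii,
              std::size_t r0, std::size_t c0,
              std::size_t r1, std::size_t c1) {
    assert(r0 <= r1 && r1 <= ii.rows);
    assert(c0 <= c1 && c1 <= ii.cols);
    return ii.at(r1, c1) + ii.at(r0, c0) - ii.at(r0, c1) - ii.at(r1, c0);
}
```

Building the integral costs one pass over the image; after that, every call to boxSum costs the same regardless of how large the rectangle is.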

OpenCV provides a function to obtain the integral of an image, and just by doing:
cv::integral(MergedChannels, IntegralChns);  // arguments: source, destination

I get a 10-layered matrix with the integral channel images in the correct order. Awesome, right?

OpenCV also allows access to all of a multichannel matrix's elements at a given coordinate at once. This comes in handy when it's time to calculate the features: all the channels can be handled at the same time, and the features are stored naturally and easily in a coherent manner.
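
The idea of handling every channel at a coordinate in one go can be sketched without OpenCV. The snippet below is only an illustration in plain C++: Pixel, integralChannels and channelBoxSums are hypothetical names, and std::array stands in for OpenCV's per-pixel vector type. It assumes an interleaved, row-major image and an integral padded with a zero row and column on the top and left.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kChannels = 10;         // 1 magnitude + 6 orientation + 3 LUV
using Pixel = std::array<double, kChannels>;  // all channel values at one coordinate

// Integral of an interleaved multi-channel image.
// The result is (rows + 1) x (cols + 1), row-major, with zeros in row 0 and column 0.
std::vector<Pixel> integralChannels(const std::vector<Pixel>& img,
                                    std::size_t rows, std::size_t cols) {
    assert(img.size() == rows * cols);
    const std::size_t stride = cols + 1;
    std::vector<Pixel> sum((rows + 1) * stride, Pixel{});
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            for (std::size_t k = 0; k < kChannels; ++k) {
                // pixel + sum above + sum to the left - sum counted twice
                sum[(r + 1) * stride + (c + 1)][k] =
                    img[r * cols + c][k]
                    + sum[r * stride + (c + 1)][k]
                    + sum[(r + 1) * stride + c][k]
                    - sum[r * stride + c][k];
            }
        }
    }
    return sum;
}

// Local sums over rows [r0, r1) and cols [c0, c1), one per channel.
// `cols` is the width of the ORIGINAL image.
Pixel channelBoxSums(const std::vector<Pixel>& sum, std::size_t cols,
                     std::size_t r0, std::size_t c0,
                     std::size_t r1, std::size_t c1) {
    assert(r0 <= r1 && c0 <= c1 && c1 <= cols);
    const std::size_t stride = cols + 1;
    assert(sum.size() >= (r1 + 1) * stride);
    Pixel out{};
    for (std::size_t k = 0; k < kChannels; ++k) {
        out[k] = sum[r1 * stride + c1][k] + sum[r0 * stride + c0][k]
               - sum[r0 * stride + c1][k] - sum[r1 * stride + c0][k];
    }
    return out;
}
```

Each call to channelBoxSums returns ten local sums for the same rectangle, one per channel, so the values for a rectangle stay together and in channel order.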

This analysis has to be done not only over the whole image with a sliding window, but also over the same image at several different scales. This leads to tens of thousands of sums being calculated per image, and each 640x480 image takes about half a second to process (this depends on the size of the sliding window, the number of scales and other parameters).
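
To get a feeling for how quickly the work adds up, here is a rough sketch that only counts how many windows a sliding-window search visits across scales. Everything in it is an assumption made for illustration: the function names, the idea of shrinking the image by a fixed factor at each step, and whatever window size, stride and scale step are passed in. It is not the exact scheme I use.

```cpp
#include <cassert>
#include <cstddef>

// Number of window positions along one axis for a given stride.
std::size_t positionsAlongAxis(std::size_t imageLen, std::size_t windowLen,
                               std::size_t stride) {
    assert(stride > 0);
    if (windowLen > imageLen) return 0;  // window does not fit at all
    return (imageLen - windowLen) / stride + 1;
}

// Total number of windows visited when the image is repeatedly shrunk by
// `scaleStep` until the window no longer fits inside it.
std::size_t countWindows(std::size_t width, std::size_t height,
                         std::size_t winW, std::size_t winH,
                         std::size_t stride, double scaleStep) {
    assert(winW > 0 && winH > 0);
    assert(scaleStep > 1.0);
    std::size_t total = 0;
    double scale = 1.0;
    while (true) {
        const auto w = static_cast<std::size_t>(width / scale);
        const auto h = static_cast<std::size_t>(height / scale);
        if (w < winW || h < winH) break;  // window larger than scaled image
        total += positionsAlongAxis(w, winW, stride)
               * positionsAlongAxis(h, winH, stride);
        scale *= scaleStep;
    }
    return total;
}
```

Multiplying the result by the number of rectangles summed inside each window and by the 10 channels gives an estimate of the number of local sums per image, which is why the cost of each individual sum matters so much.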

Now on to the final step of the algorithm: training and testing a machine learning method.


Thursday, 21 March 2013

Channels: completed

You could say that, in terms of implementation, the Integral Channel Features algorithm is divided into three major parts:
  • Computation of channels - DONE
  • Feature extraction
  • Classification
In the paper, many channels were computed and tested for performance, but since that work is already done there is no need to repeat it. I computed only the ones that achieved the best results: the gradient magnitude, the gradient histogram channels with 6 orientation bins, and the LUV colour channels.
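
As a rough illustration of the gradient channels, the plain C++ sketch below computes a gradient magnitude channel and 6 orientation channels from a grayscale image. It is a simplified stand-in, not my implementation and not necessarily what the paper does: it assumes central differences, unsigned orientation in [0, pi), hard assignment of each pixel to a single bin, and it leaves border pixels at zero. The LUV conversion is left out. GradientChannels and computeGradientChannels are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

constexpr std::size_t kBins = 6;  // number of orientation bins

struct GradientChannels {
    std::vector<double> magnitude;               // rows * cols values
    std::vector<std::vector<double>> histogram;  // kBins channels of rows * cols values
};

GradientChannels computeGradientChannels(const std::vector<double>& gray,
                                         std::size_t rows, std::size_t cols) {
    assert(gray.size() == rows * cols);
    const double pi = std::acos(-1.0);

    GradientChannels out;
    out.magnitude.assign(rows * cols, 0.0);
    out.histogram.assign(kBins, std::vector<double>(rows * cols, 0.0));

    // Border pixels are skipped and stay at zero to keep the sketch short.
    for (std::size_t r = 1; r + 1 < rows; ++r) {
        for (std::size_t c = 1; c + 1 < cols; ++c) {
            const std::size_t i = r * cols + c;
            // Central differences in x and y.
            const double gx = (gray[i + 1] - gray[i - 1]) / 2.0;
            const double gy = (gray[i + cols] - gray[i - cols]) / 2.0;
            const double mag = std::hypot(gx, gy);

            // Fold the angle into [0, pi): opposite directions share a bin.
            double angle = std::atan2(gy, gx);
            if (angle < 0.0) angle += pi;
            if (angle >= pi) angle -= pi;

            std::size_t bin = static_cast<std::size_t>(angle / pi * kBins);
            bin = std::min(bin, kBins - 1);  // guard against rounding at the edge

            out.magnitude[i] = mag;
            out.histogram[bin][i] = mag;  // the magnitude goes to a single bin
        }
    }
    return out;
}
```

Together with the three LUV colour channels, this gives the 1 + 6 + 3 = 10 channels.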

Now it's time to extract features from those channels. Those features consist of local sums, computed quickly using the integral of the image.

I'm not yet sure of how this is going to be done.


PS: I'll be happy to send samples of code to anyone interested.

Monday, 18 March 2013

What is this blog gonna be about?

Hello everyone!

My name is Pedro Silva, I'm 24 years old and I study Mechanical Engineering at Aveiro University, Portugal. Over the next few months I will be working on my final project to obtain my master's degree.

The title of the project is "Visual Recognition of Pedestrians for a Driver Assistance System" and my main goal is to create a software application that uses Computer Vision tools to detect pedestrians in images of urban settings. This application is to be tested and (hopefully) validated in the ATLASCAR, an ongoing project of the Mechanical Engineering Department whose goal is to create an autonomous car. This car has participated in several robotics competitions and won many prizes.

In this blog I will be posting regular updates on the development of my work. I will be writing in English so that the content is accessible to anyone, and hopefully I'll get some feedback!

Pedestrian Detection is a field that has been the subject of a great deal of research in the past decade, and because of that the development of this technology has been extraordinary. After reviewing the literature I was able to create a work plan.

As a first step, I will implement the Integral Channel Features, an algorithm that takes advantage of the wealth of information contained in various channels of an image. The paper describing this method can be found here.

Since this method runs slowly, due to the need to evaluate each image several times at different scales, I will then implement the FPDW - Fastest Pedestrian Detector of the West. This algorithm speeds up the process through a number of simplifications and approximations applied to the first one. This method is reported to run at 6 FPS on 640x480 images.

I will be developing the program in the ROS (C++) environment, and at the moment I've assembled a platform for advertising, subscribing to, processing and publishing images, which will be the base for all the future work.

The introductory presentation I gave to the laboratory's team can be downloaded here, although it is in Portuguese.

And this is it.