Extracting features in the correct way involves several steps.
Once the channels are computed, I merge them into a single matrix with 10 layers (one per channel I am computing). This allows me to use the full potential of OpenCV in future steps. OpenCV is an open source library for computer vision applications.
The features are nothing more than simple local sums calculated over rectangular areas in the image. The fastest way to do this is by using the integral image trick popularized by Viola and Jones in their face detection work. With an integral image, any rectangular sum takes just four lookups and three additions or subtractions, regardless of the rectangle's size, which makes the features extremely fast to compute.
"To calculate the summation of the pixels in the black box, you take the corresponding box in the integral. You sum as follows: (Bottom right + top left – top right – bottom left).
So for the 3,5,4,1 box, the calculations would go like this: (30+0-17-0 = 13). For the 4,1 box, it would be (0+15-10-0 = 5)." - Taken from here.
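To make the four-corner rule concrete, here is a minimal single-channel sketch in plain C++, without OpenCV. The names `Grid`, `integralImage` and `boxSum` are made up for this illustration, and the extra row and column of zeros is an assumption borrowed from the layout OpenCV's integral output uses; it is not the code from my project.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper type: a 2D grid of integer pixel values.
using Grid = std::vector<std::vector<long long>>;

// Build an integral image padded with one extra row and column of zeros,
// so ii[y][x] holds the sum of all pixels strictly above and left of (x, y).
Grid integralImage(const Grid& img) {
    const std::size_t rows = img.size();
    const std::size_t cols = rows ? img[0].size() : 0;
    Grid ii(rows + 1, std::vector<long long>(cols + 1, 0));
    for (std::size_t y = 0; y < rows; ++y) {
        for (std::size_t x = 0; x < cols; ++x) {
            ii[y + 1][x + 1] = img[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x];
        }
    }
    return ii;
}

// Sum of the w-by-h box whose top-left pixel is (x, y):
// bottom right + top left - top right - bottom left.
long long boxSum(const Grid& ii, std::size_t x, std::size_t y,
                 std::size_t w, std::size_t h) {
    return ii[y + h][x + w] + ii[y][x] - ii[y][x + w] - ii[y + h][x];
}
```

For the 3x3 image with rows 1 2 3, 4 5 6 and 7 8 9, `boxSum(ii, 1, 1, 2, 2)` adds up 5, 6, 8 and 9 using only the four corner values, no matter how large the box is.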
OpenCV provides a function to obtain the integral of an image, and just by doing:
integral(MergedChannels, IntegralChns); // arguments: source, destination
I get a 10-layer matrix with the integral channel images in the correct order. Awesome, right?
OpenCV also allows access to all of a multichannel matrix's elements at a given coordinate at once. This comes in handy when it's time to calculate the features: all the channels can be handled at the same time, and the features end up stored naturally in a coherent order.
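The idea of handling every channel at one coordinate in a single step can be sketched without OpenCV by storing a small fixed-size array per pixel. Everything here (`kChannels`, `Pixel`, `ChannelGrid`, `channelIntegral`, `channelBoxSums`) is a hypothetical stand-in for illustration; in OpenCV the per-pixel element would be some multichannel vector type, and the exact type depends on how the matrix was created.

```cpp
#include <array>
#include <cstddef>
#include <vector>

constexpr std::size_t kChannels = 10;              // one entry per channel
using Pixel = std::array<double, kChannels>;       // all channels at one coordinate
using ChannelGrid = std::vector<std::vector<Pixel>>;

// Per-channel integral image, padded with one zero row and one zero column.
ChannelGrid channelIntegral(const ChannelGrid& img) {
    const std::size_t rows = img.size();
    const std::size_t cols = rows ? img[0].size() : 0;
    ChannelGrid ii(rows + 1, std::vector<Pixel>(cols + 1, Pixel{}));
    for (std::size_t y = 0; y < rows; ++y)
        for (std::size_t x = 0; x < cols; ++x)
            for (std::size_t c = 0; c < kChannels; ++c)
                ii[y + 1][x + 1][c] = img[y][x][c] + ii[y][x + 1][c]
                                    + ii[y + 1][x][c] - ii[y][x][c];
    return ii;
}

// One feature per channel for the w-by-h box with top-left pixel (x, y).
// The four corner elements are fetched once and reused for every channel.
Pixel channelBoxSums(const ChannelGrid& ii, std::size_t x, std::size_t y,
                     std::size_t w, std::size_t h) {
    const Pixel& br = ii[y + h][x + w];
    const Pixel& tl = ii[y][x];
    const Pixel& tr = ii[y][x + w];
    const Pixel& bl = ii[y + h][x];
    Pixel sums{};
    for (std::size_t c = 0; c < kChannels; ++c)
        sums[c] = br[c] + tl[c] - tr[c] - bl[c];
    return sums;
}
```

The returned array keeps the features in channel order, which is the "coherent manner" of storage described above.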
This analysis has to be done not only over the whole image with a sliding window, but also on the same image at several different scales. This leads to tens of thousands of sums being calculated per image, and each 640x480 image takes about half a second to process (this depends on the size of the sliding window, the number of scales and other parameters).
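To get a feel for how quickly the work adds up, the sketch below counts the sliding-window positions over a set of scales. The function name `countWindows` and all parameter values used with it (window size, stride, scale factors) are assumptions picked for illustration, not the settings I actually used.

```cpp
#include <cstddef>
#include <vector>

// Count sliding-window positions over every rescaled copy of the image.
// Scales that shrink the image below the window size contribute nothing.
std::size_t countWindows(std::size_t imgW, std::size_t imgH,
                         std::size_t winW, std::size_t winH,
                         std::size_t stride,
                         const std::vector<double>& scales) {
    if (stride == 0) return 0;  // guard against division by zero
    std::size_t total = 0;
    for (double s : scales) {
        const auto w = static_cast<std::size_t>(imgW * s);
        const auto h = static_cast<std::size_t>(imgH * s);
        if (w < winW || h < winH) continue;
        total += ((w - winW) / stride + 1) * ((h - winH) / stride + 1);
    }
    return total;
}
```

With a hypothetical 64x128 window, a stride of 8 pixels and the scales 1.0 and 0.5, a 640x480 image yields 3285 + 495 = 3780 windows, and each window then needs several box sums per channel.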
Now on to the final step of the algorithm: training and testing a machine learning method.