Problem:

Write an application that uses a dataset of at least 20 images downloaded from the internet with various content (landscapes, objects, people, etc.). The application should take an image as an input argument and determine the set of similar images from the database by doing histogram comparisons. Suppose the set of all images is I = {I_1, I_2, …, I_n}. Extract an image Q from I and set it as the query image. Do the following steps:

1. Compute the histograms of all images in I, including the query image Q.

2. Compare the histogram of Q with itself and with the histograms of all other images in the set I, using all metrics available in OpenCV for histogram comparison.

3. Take comp(Q,Q) as the maximum similarity and normalize the results of the other comparisons against it. (Beware: different metrics have different numerical values for perfect similarity: some zero, others large positive values, some 1, some large negative values, etc.)

4. Do a second run with a color reduction mechanism, where the histograms are computed on a smaller number of bins, e.g. 64 or 32 bins (each bin covering 256/64 = 4 or 256/32 = 8 color values, respectively).
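
The binning in steps 1 and 4, and the warning in step 3, can be illustrated with a minimal pure-Python sketch. It is not the required OpenCV solution: the function names `histogram`, `intersection` and `chi_square` are hypothetical, the two metrics are written to follow the usual intersection and chi-square formulas, and the pixel lists are made-up values. With OpenCV, the same bin reduction would normally be obtained by passing a smaller `histSize` to `cv2.calcHist`.

```python
def histogram(pixels, bins=256):
    """Count 8-bit values (0..255) into `bins` equal-width bins."""
    if bins <= 0 or 256 % bins != 0:
        raise ValueError("bins must be a positive divisor of 256")
    width = 256 // bins  # 4 values per bin for 64 bins, 8 for 32 bins
    hist = [0] * bins
    for value in pixels:
        hist[value // width] += 1
    return hist


def intersection(h1, h2):
    """Sum of bin-wise minima: larger means more similar."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def chi_square(h1, h2):
    """Sum of (a - b)^2 / a over bins with a != 0: zero means identical."""
    return sum((a - b) ** 2 / a for a, b in zip(h1, h2) if a != 0)


# Made-up intensity values standing in for two tiny grayscale images.
query = [0, 3, 4, 128, 255, 255]
other = [10, 12, 130, 131, 200, 255]

for bins in (256, 64, 32):
    hq = histogram(query, bins)
    ho = histogram(other, bins)
    # comp(Q,Q) differs per metric: total pixel count vs. zero.
    print(bins, intersection(hq, hq), chi_square(hq, hq),
          intersection(hq, ho), chi_square(hq, ho))
```

Note that comp(Q,Q) is the pixel count for intersection but 0 for chi-square, which is why the normalization in step 3 cannot be a plain division for every metric.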

Compare the results and explain them for each metric. How do they change after color reduction? Can we say that image retrieval based on graphical content can be performed using histogram comparison?

You should print the results in tabular numerical format, with the comp(Q,Q) value displayed first as the reference value, followed by all other similarities in normalized form.
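
One possible layout for such a table is sketched below in plain Python. The normalization rule is an assumption, not something the problem statement prescribes: when comp(Q,Q) is non-zero the raw value is divided by it, and when comp(Q,Q) is zero (a distance-like metric) the value d is mapped to 1 / (1 + |d|), so that a perfect match always yields 1. The helper names `normalize_score` and `build_table`, the metric labels, and all numbers are hypothetical placeholders.

```python
def normalize_score(value, reference):
    """Map a raw metric value so that comp(Q,Q) becomes 1.0.

    Assumed rule: divide by a non-zero reference; for a zero
    reference treat the value as a distance and use 1 / (1 + |d|).
    """
    if reference != 0:
        return value / reference
    return 1.0 / (1.0 + abs(value))


def build_table(references, raw_scores):
    """Return the result table as a string.

    references: {metric: comp(Q,Q)}
    raw_scores: {image name: {metric: comp(Q, image)}}
    """
    metrics = list(references)
    header = f"{'image':<12}" + "".join(f"{m:>14}" for m in metrics)
    ref_row = f"{'comp(Q,Q)':<12}" + "".join(
        f"{references[m]:>14.4f}" for m in metrics)
    lines = [header, ref_row]
    for image, scores in raw_scores.items():
        cells = "".join(
            f"{normalize_score(scores[m], references[m]):>14.4f}"
            for m in metrics)
        lines.append(f"{image:<12}{cells}")
    return "\n".join(lines)


# Placeholder numbers for illustration only, not measured results.
references = {"correlation": 1.0, "chi-square": 0.0, "intersection": 12.0}
raw_scores = {
    "img_01": {"correlation": 0.5, "chi-square": 3.0, "intersection": 6.0},
    "img_02": {"correlation": 0.25, "chi-square": 1.0, "intersection": 9.0},
}
print(build_table(references, raw_scores))
```

The first data row shows the raw comp(Q,Q) reference per metric; every following row shows normalized values, so the columns can be compared across metrics.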