
Colour indexing is a topic in computer vision which is discussed in CVonline.[1]

Colour indexing is the process of ranking, or otherwise representing, a set of images by their colour content for use in image retrieval tasks such as automated searches. The process typically involves using low-level data within an image (such as colour, contrast, shape, and texture) to generate an index for the image. The concept of using low-level colour data to index sets of images has been around since the beginnings of modern computer vision; however, the technological restrictions of that time meant that the concept has only recently become practical to implement and utilise.

Uses of colour indexing

Colour indexing a set of images has the potential to be beneficial to a large range of practical applications in image analysis and image retrieval. Below are four practical applications of colour indexing:[2]

Image Retrieval

Indexing an image by colour could allow a set of images to be stored and represented in a database by their indices. The retrieval of images similar to, or matching, a query image could then be performed by applying the same indexing method to the query image and searching the database for images with similar or matching indices. For example, consider a hypothetical application tasked with retrieving all frames in a video sequence that contain an image of an apple, where the only information currently known is a single frame containing the image of the apple to be detected. By creating indices for each of the frames, as well as for the given frame, the system would be able to retrieve the frames in the video with similar or matching colour indices. Currently, colour indexing is not commonly used in most cases where image retrieval is needed. Instead, meta-data tagging (the process of annotating an image by describing its contents, usually by hand) is used to generate a set of tags for an image, so that words can be used to query for an image in a similar way to other data retrieval methods.
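The retrieval scheme above can be sketched in a few lines of Python. This is a minimal illustration using histogram intersection as the similarity measure; the database, image names, and histogram values are all hypothetical, and a real system would extract histograms from pixel data rather than hard-code them.

```python
# Sketch: retrieving images by comparing normalised colour histograms.
# Similarity is histogram intersection: the sum of per-bin minima,
# which is 1.0 for identical distributions and 0.0 for disjoint ones.

def normalise(hist):
    total = sum(hist)
    return [h / total for h in hist]

def intersection(h1, h2):
    return sum(min(a, b) for a, b in zip(h1, h2))

# Hypothetical database: image name -> raw 3-bin (red, green, blue) counts.
database = {
    "apple_frame": [9000, 800, 200],
    "sky_frame":   [500, 1500, 8000],
    "grass_frame": [700, 8500, 800],
}
index = {name: normalise(h) for name, h in database.items()}

# Index the query image the same way and rank the database against it.
query = normalise([8800, 900, 300])
ranked = sorted(index, key=lambda n: intersection(index[n], query),
                reverse=True)
print(ranked[0])  # → apple_frame
```

The same ranking loop would run over every frame of a video sequence to retrieve the frames whose indices best match the query frame.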

Object Recognition

By maintaining a dataset of images where the content of each image is known to be that of a single 'model' type, an image, or an object within an image, may be recognised computationally purely by using a colour indexing approach. For example, consider using a set of 'training' images for a hypothetical application tasked with recognising a specific object within a 'testing' image. Each training image is individually tagged by hand with meta-data describing the entities which make up the image. A colour index is then generated for each of the training images, as well as for the testing image. The testing image's index is then matched to the closest index within the training set, returning a subset of training images which closely resemble the colouring of the testing image. From the meta-data attached to this subset of training images, information can be obtained regarding the contents of the testing image.
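A minimal sketch of this nearest-index matching, with hypothetical colour indices and hand-assigned tags; here Euclidean distance between feature vectors stands in for whichever index-matching measure a real system would use:

```python
# Sketch: recognising an object by matching a test image's colour index
# against a hand-tagged training set (all data here is hypothetical).
import math

def distance(v1, v2):
    # Euclidean distance between two colour feature vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))

# Training set: (colour index, meta-data tag assigned by hand).
training = [
    ([0.90, 0.05, 0.05], "red apple"),
    ([0.10, 0.80, 0.10], "green pear"),
    ([0.10, 0.10, 0.80], "blueberry"),
]

# The tag of the closest training index describes the testing image.
test_index = [0.85, 0.10, 0.05]
tag = min(training, key=lambda item: distance(item[0], test_index))[1]
print(tag)  # → red apple
```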

Object Localisation

Colour indexing may also be used to locate a specified region in a sequence of images. This allows for applications such as tracking an object's position on screen as it moves through a video clip made up of a sequence of individual frames. As an example, consider an application developed to track a figure skater as they skate around in front of the camera. The skater's location could be tracked through each frame by using colour indexing to search for a region defined within the first frame of the video sequence. Each subsequent frame could then be analysed for colour indices similar to that of the tracked region, using the region's colour, texture, and shape to detect a region within the current frame consisting of similar colour, texture, and shape.
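The frame-by-frame search can be sketched as a sliding-window histogram match. This is a deliberately simplified illustration: frames are reduced to 1-D rows of colour labels (real frames are 2-D pixel arrays), and only colour is compared, not texture or shape.

```python
# Sketch: localising a tracked region in a frame by sliding a window
# across it and comparing each window's colour histogram to the
# histogram of the target region defined in the first frame.
from collections import Counter

def histogram(pixels):
    counts = Counter(pixels)
    return [counts[c] / len(pixels) for c in ("r", "g", "b")]

def match(h1, h2):
    # Histogram intersection: higher means more similar.
    return sum(min(a, b) for a, b in zip(h1, h2))

target = histogram(["r", "r", "g"])   # region defined in the first frame

frame = ["b", "b", "b", "r", "r", "g", "b", "b"]
w = 3                                  # window width = target region width
best = max(range(len(frame) - w + 1),
           key=lambda i: match(histogram(frame[i:i + w]), target))
print(best)  # → 3, the position of the best-matching window
```

Repeating this search on every frame yields the tracked object's position through the clip.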

Video annotation

By a method similar to that used for object recognition in images, previously unseen video clips could be annotated automatically. By treating a video as a sequence of individual images, a video could be tagged, or otherwise categorised, by its contents or topic. Given a set of example video sequences of known topic, an unknown video could be tagged by analysing each frame of the unknown sequence against frames of the videos whose contents are known. See Object Recognition.

Colour indexing techniques

The process of assigning an index to an image based on its colour, and on other low-level data elements such as texture and shape, can be achieved in different ways.

Histogram based indexing

One common approach is to generate feature descriptors which encode value-based information about the different parameters of the image. In colour indexing, notable features are usually represented as N-colour feature descriptors[3] or texture feature descriptors, which are extracted from the image by applying a range of feature detection techniques. The feature descriptor values obtained for each image are commonly represented as feature vectors. The indexing of the image is achieved by observing the image's feature vector in terms of the feature space, which allows analysis of inter-image data to generate representative indices for each image in a set of images.

Below is an example of a simple 3-colour image feature vector of the form (red, green, blue, width, height), including size parameters, representing some of the possible low-level data which could be retrieved from an image for use in colour indexing:

This feature vector represents the red, green, blue, width, and height components of an image respectively, where the values for red, green, and blue are the frequency values for the number of pixels which fell into those colour categories. Note that the values for red, green, and blue sum to 10,000, which is equal to the number of pixels in the image, but more colour categories could also be added to generate a higher degree feature vector with N colour categories.
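As an illustration, here is a minimal sketch of constructing such a feature vector from a synthetic 100×100 image. The binning rule (each pixel counted under its dominant channel) and all pixel values are hypothetical; real systems use finer colour quantisation.

```python
# Sketch: building a (red, green, blue, width, height) feature vector.
# Each pixel is binned by its dominant channel, so the three colour
# frequencies sum to the total number of pixels in the image.

def feature_vector(image):
    height, width = len(image), len(image[0])
    bins = [0, 0, 0]  # red, green, blue frequency counts
    for row in image:
        for (r, g, b) in row:
            bins[max(range(3), key=lambda i: (r, g, b)[i])] += 1
    return bins + [width, height]

# Synthetic 100x100 image: top 60 rows reddish, bottom 40 rows bluish.
image = [[(200, 10, 10)] * 100 for _ in range(60)] + \
        [[(10, 10, 200)] * 100 for _ in range(40)]
print(feature_vector(image))  # → [6000, 0, 4000, 100, 100]
```

Note that the three colour counts sum to 10,000, the number of pixels, as described above.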

Another common technique for generating a colour index for an image is a region-based approach. This again involves generating a feature vector containing information such as colour and texture at each pixel. Regions are then defined by detecting similar pixels close together and grouping them. Over a number of iterations, these regions are allowed to expand by region growing until every pixel in the image is assigned to one of N regions. Statistics regarding each region's location, size, colour, shape, texture, and so on are then used as the index for the image.[4]
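The grouping step can be sketched as a flood fill over neighbouring pixels. For simplicity this sketch grows a region only over pixels with exactly equal values; a real region-growing system would use a similarity threshold over colour and texture features. The tiny 3×3 image is hypothetical.

```python
# Sketch: grouping similar neighbouring pixels into regions by flood
# fill, a simple form of the region-growing step described above.
from collections import deque

def label_regions(image):
    h, w = len(image), len(image[0])
    labels = [[None] * w for _ in range(h)]
    region = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy][sx] is not None:
                continue
            # Grow a new region from this seed over equal-valued
            # 4-connected neighbours until it can expand no further.
            labels[sy][sx] = region
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and labels[ny][nx] is None
                            and image[ny][nx] == image[y][x]):
                        labels[ny][nx] = region
                        queue.append((ny, nx))
            region += 1
    return labels, region

image = [["r", "r", "b"],
         ["r", "b", "b"],
         ["g", "g", "b"]]
labels, n = label_regions(image)
print(n)  # → 3 regions: the red, blue, and green pixel groups
```

Per-region statistics (size, mean colour, bounding box, and so on) would then be computed from `labels` to form the image's index.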

Multi-Modal Neighbourhood Signatures

(Left) An image for which we wish to produce a multi-modal neighbourhood signature. Neighbouring regions are selected from this image for analysis. (Middle) The colour density function for the image is generated, and its modal values are calculated in order to produce measurements. (Right) A signature is produced containing data such as the modal colours of the neighbouring regions.

Rather than the de facto standard of using colour histograms to generate an index for an image, a more probabilistic approach can also be used to gather the required low-level data. This approach involves calculating the colour density functions for a set of neighbourhood regions of an image. The density functions are used to calculate a probabilistic set of possible colour values, generally using the modal values of each density function to generate illumination-invariant measurements. The measurements are then combined to produce a signature for the image or image set. Multi-modal neighbourhood signatures can also be calculated for a set of images by taking the union of each neighbourhood for each image in the set and calculating the density functions for the union sets. This method has many advantages in image retrieval tasks where the illumination of the images, or the quality of the information, is likely to be erratic or prone to error: multi-modal neighbourhood signatures allow outlier images within the training set, which would otherwise obscure the desired results, to have a lesser effect on the signature generated.[5][6]
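The mode-finding step can be illustrated with a rough sketch. This is only an approximation of the idea: the colour density function is stood in for by a coarse binned histogram of one channel's samples, and the sample values are hypothetical; the actual MNS method uses proper density estimation over the neighbourhood's colour values.

```python
# Sketch: estimating the modal colour values of a neighbourhood sample
# by binning the samples and keeping bins that are local maxima of the
# (approximated) colour density function.
from collections import Counter

def modal_values(samples, bin_width=10):
    counts = Counter(s // bin_width for s in samples)
    modes = []
    for b, c in counts.items():
        # A bin is a mode if no adjacent bin has a higher count.
        if c >= counts.get(b - 1, 0) and c >= counts.get(b + 1, 0):
            modes.append(b * bin_width + bin_width // 2)
    return sorted(modes)

# Hypothetical neighbourhood with two dominant colours (0-255 channel).
samples = [52, 55, 58, 54, 200, 198, 205, 202, 57]
print(modal_values(samples))  # → [55, 205]
```

The modal values found for each neighbourhood would then be combined into the image's signature.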

References

  1. ^ R. B. Fisher, "CVonline: an overview", Int. Assoc. of Pat. Recog. Newsletter, 27(2), April 2005.
  2. ^ D. Koubaroulis, J. Matas and J. Kittler, "The MNS model of object colour appearance and applications", CVSSP, University of Surrey, Guildford GU2 7XH, Surrey, UK; CMP, Czech Technical University, Prague, Czech Republic.
  3. ^ Surrey Image/Video Database Retrieval System, University of Surrey, Guildford, 1998.
  4. ^ Surrey Image/Video Database Retrieval System, University of Surrey, Guildford, 1998.
  5. ^ D. Koubaroulis, "The Multimodal Neighbourhood Signature for modelling object colour appearance and applications in computer vision", PhD thesis, University of Surrey, October 2001.
  6. ^ D. Koubaroulis, J. Matas and J. Kittler, "The Multi-modal Signature Method: An Efficiency and Sensitivity Study", Centre for Vision, Speech and Signal Processing, University of Surrey, Guildford; CMP, CTU Prague, 2000.

Bibliography

  • Datta, Ritendra (2008). "Image Retrieval: Ideas, Influences, and Trends of the New Age". ACM Computing Surveys. 40 (2): 1–60. doi:10.1145/1348246.1348248.
