User:DrTrigonBot/doc


This page still has to be translated from de:User:DrTrigonBot/Doku. Further technical details are documented here.

Discussion/Talk-Summary

Soft redirect to:de:User:DrTrigonBot/Doku#Diskussions-Zusammenfassung
This page is a soft redirect.

Individual Sandboxes (e.g. for users)

Soft redirect to:de:User:DrTrigonBot/Doku#Individuelle Spielwiese (z.B. für Benutzer)
This page is a soft redirect.

SubsterBot

Soft redirect to:de:User:DrTrigonBot/Doku#SubsterBot
This page is a soft redirect.

Individual Tasks/Jobs

Soft redirect to:de:User:DrTrigonBot/Doku#Individuelle Aufträge
This page is a soft redirect.

Categorization

The goal is to have a bot that works automatically, processing page by page (for more detail see also the bot flag request), and that uses clever algorithms (as mentioned on commons:User:DrTrigonBot/ToDo) to categorize media by machine (see e.g. the journal "Computer Vision and Image Understanding").

First the bot applies various detection and recognition algorithms to the image content to retrieve as much data as possible. In a second step the bot judges the reliability of those data, and in a final step it uses them to categorize the image. If successful, the category, along with all data relevant for the categorization, is written to the image description page. The data is added using {{FileContentsByBot}}.

So the procedure reads: image download/retrieval → feature detection/extraction → classification → categorization → report
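The procedure above can be read as a chain of stages. The following is a minimal sketch only, not the bot's actual code: all function names, the `Detection` record and the 0.5 reliability threshold are hypothetical and chosen for illustration. The download step is left out and the detector is a stub returning fixed values.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Detection:
    """One piece of data extracted from a file (hypothetical record)."""
    kind: str          # e.g. "Faces", "Chessboard"
    confidence: float  # 0.0 .. 1.0, as judged by the detector
    category: str      # category suggested by this detection


Detector = Callable[[bytes], List[Detection]]


def extract_features(data: bytes, detectors: List[Detector]) -> List[Detection]:
    """Run every detector once on the already downloaded file content."""
    found: List[Detection] = []
    for detect in detectors:
        found.extend(detect(data))
    return found


def classify(detections: List[Detection], threshold: float = 0.5) -> List[Detection]:
    """Keep only detections considered reliable (threshold is an assumption)."""
    return [d for d in detections if d.confidence >= threshold]


def categorize(reliable: List[Detection]) -> List[str]:
    """Turn reliable detections into a sorted list of unique categories."""
    return sorted({d.category for d in reliable})


def report(categories: List[str], reliable: List[Detection]) -> Dict[str, object]:
    """Bundle the categories and the supporting data for the description page."""
    return {"categories": categories,
            "data": [(d.kind, d.confidence) for d in reliable]}


def process(data: bytes, detectors: List[Detector]) -> Dict[str, object]:
    """feature extraction -> classification -> categorization -> report."""
    reliable = classify(extract_features(data, detectors))
    return report(categorize(reliable), reliable)


def stub_detector(data: bytes) -> List[Detection]:
    """Stand-in detector returning fixed results, for demonstration only."""
    return [
        Detection("Faces", 0.9, "Category:Unidentified people"),
        Detection("Chessboard", 0.2, "Category:Chessboards"),
    ]
```

Calling `process(b"", [stub_detector])` keeps the face detection and drops the low-confidence chessboard detection, so only one category ends up in the report.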

In order to do its job the bot has to download every single image; we therefore follow the principle of extracting as much information as possible once the file is downloaded.

User:Multichill is working on classification based on OpenCV face detection too; see User:DrTrigonBot/doc#From User:Multichill/Using OpenCV to categorize files.

Logging/debug results at: commons:User:DrTrigon/User:DrTrigonBot/logging - hist.

{{FileContentsByBot}}

Properties

Very basic checks like: file size (os), pixel size (PIL, rsvg), palette, SVG validity (py_w3c), ...
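As an illustration of such basic checks, the sketch below reads the file size via `os` and, for PNG files only, the pixel size. Note the assumptions: `basic_properties` is a hypothetical helper, and the pixel size is parsed directly from the PNG header with `struct` instead of PIL, purely to keep the example free of dependencies.

```python
import os
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def basic_properties(path):
    """Return the file size and, for PNG files, the pixel size.

    Hypothetical helper: the pixel size is read from the IHDR chunk
    instead of using PIL, to keep the sketch dependency-free.
    """
    props = {"size_bytes": os.path.getsize(path), "width": None, "height": None}
    with open(path, "rb") as handle:
        header = handle.read(24)
    # A PNG starts with an 8-byte signature, followed by the IHDR chunk:
    # 4-byte length, 4-byte type "IHDR", then width and height (big-endian).
    if (len(header) == 24 and header[:8] == PNG_SIGNATURE
            and header[12:16] == b"IHDR"):
        props["width"], props["height"] = struct.unpack(">II", header[16:24])
    return props
```

For any other file type the width and height simply stay `None`; a real implementation would hand those files to a library that understands the format.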

Categories: Category:Animated GIF, Category:Animated PNG

Conditional Categories: Category:PDF files, Category:TIFF images (only those ones, see this talk)

Examples: File:MORPH.gif, ...

Faces

Pre-trained Haar cascade detection (OpenCV) for frontal and profile faces, along with eye detection (within the face region) in order to reach sufficient quality.
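Using the eye detection as a plausibility check for each face can be sketched as follows. This is an illustration under assumptions, not the bot's code: rectangles are `(x, y, width, height)` tuples as OpenCV's cascade detectors return them, all rectangles are taken to be in image coordinates, and `contains`, `confirm_faces` and the `min_eyes` parameter are hypothetical names.

```python
def contains(outer, inner):
    """True if rectangle `inner` lies completely inside rectangle `outer`."""
    ox, oy, ow, oh = outer
    ix, iy, iw, ih = inner
    return ox <= ix and oy <= iy and ix + iw <= ox + ow and iy + ih <= oy + oh


def confirm_faces(faces, eyes, min_eyes=1):
    """Keep only faces that contain at least `min_eyes` detected eyes."""
    confirmed = []
    for face in faces:
        eyes_inside = [eye for eye in eyes if contains(face, eye)]
        if len(eyes_inside) >= min_eyes:
            confirmed.append((face, eyes_inside))
    return confirmed
```

A face candidate without any eye inside it is discarded, which trades some missed faces (e.g. closed eyes, sunglasses) for fewer false positives.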

The metadata (ExifTool, maybe pyexiv2) of images taken with modern digital cameras may also contain face detection data (detection done by the camera software). Those data are extracted and processed too (like commons:User:DschwenBot does for GPS data).

For further information on how face detection works, see e.g. this. More about extraction of camera face detection info.

Categories: Category:Unidentified people, Category:Groups, Category:Faces, Category:Portraits

Examples: File:Morningside City Councilman Kevin D Kline.jpg, File:Newsom Brown rally.jpg, ...

ColorAverage

The color histogram (PIL) is used to calculate the image's average RGB color value. This is compared to a predefined color palette (Pantone color matching system) by calculating the color difference (python-colormath) and finding the closest match.
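The two steps can be sketched in plain Python. The histogram layout (256 counts each for R, G and B, concatenated) is the one PIL's `Image.histogram()` uses for RGB images. Everything else is an assumption for illustration: the tiny `PALETTE` stands in for the Pantone palette, and the squared Euclidean distance in RGB stands in for the color difference that python-colormath would compute.

```python
def average_color(histogram):
    """Average RGB value from concatenated per-channel histograms.

    `histogram` holds 3 * 256 counts (R, then G, then B).
    """
    channels = []
    for band in range(3):
        counts = histogram[band * 256:(band + 1) * 256]
        total = sum(counts)
        weighted = sum(value * count for value, count in enumerate(counts))
        channels.append(weighted / total)
    return tuple(channels)


# Tiny stand-in palette (hypothetical); the bot compares against Pantone.
PALETTE = {
    "red": (255, 0, 0),
    "green": (0, 255, 0),
    "blue": (0, 0, 255),
    "white": (255, 255, 255),
    "black": (0, 0, 0),
}


def closest_color(rgb, palette=PALETTE):
    """Name of the palette entry with the smallest squared RGB distance."""
    def distance(name):
        return sum((a - b) ** 2 for a, b in zip(rgb, palette[name]))
    return min(palette, key=distance)
```

Plain RGB distance does not match perceived color difference well, which is why a perceptual measure is the better choice in practice.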

Further info on color palettes can be seen at RGB Chart & Multi Tool. Maybe NCS would be more suitable? In general, a palette with constant distances between all colors should be preferred over Pantone.

Categories: Category:Graphics

Examples: File:Mortaisage-3 couteaux.jpg, File:New Figure 13.png, ...

ColorRegions

First an image segmentation algorithm (JSEG project, maybe SLIC) is applied, possibly incrementally. Then the same as in User:DrTrigonBot/doc#ColorAverage is done for every region/segment. Afterwards the position and size of all the regions are calculated to complete the data.
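The last step, deriving position and size per region, can be sketched as below. It assumes the segmentation result is available as a 2D label map (a list of rows, one segment id per pixel); this input format and the name `region_stats` are assumptions for illustration, and the segmentation itself is not shown.

```python
def region_stats(labels):
    """Area, bounding box and centroid for every label in a 2D label map.

    `labels` is a list of rows; each entry is the segment id of one pixel.
    """
    raw = {}
    for y, row in enumerate(labels):
        for x, label in enumerate(row):
            s = raw.setdefault(label, {"n": 0, "sx": 0, "sy": 0,
                                       "x0": x, "x1": x, "y0": y, "y1": y})
            s["n"] += 1
            s["sx"] += x
            s["sy"] += y
            s["x0"], s["x1"] = min(s["x0"], x), max(s["x1"], x)
            s["y0"], s["y1"] = min(s["y0"], y), max(s["y1"], y)
    return {
        label: {
            "area": s["n"],
            # Bounding box as (x, y, width, height).
            "bbox": (s["x0"], s["y0"],
                     s["x1"] - s["x0"] + 1, s["y1"] - s["y0"] + 1),
            "centroid": (s["sx"] / s["n"], s["sy"] / s["n"]),
        }
        for label, s in raw.items()
    }
```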

This procedure is based on "Automatic Categorization of Image Regions using Dominant Color based Vector Quantization", e.g. in its use of JSEG and GLA. Read "Uni-Modal Versus Joint Segmentation for Region-Based Image Fusion" for more info.

Categories: (does not work very well, thus switched off at the moment)

Examples: ...

People

To implement people/pedestrian detection, we use the pre-trained HOG descriptors (OpenCV) and complement them with Haar cascade full-body detection (similar to User:DrTrigonBot/doc#Faces).
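One possible way to combine the two detectors is sketched below. How the bot actually merges the results is not documented here, so this is an assumption: Haar detections are added only where they do not largely overlap an already accepted box, measured by intersection over union (IoU). The names `iou` and `merge_detections` and the 0.5 threshold are hypothetical; rectangles are `(x, y, width, height)` tuples.

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = min(ax + aw, bx + bw) - max(ax, bx)
    inter_h = min(ay + ah, by + bh) - max(ay, by)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    return inter / (aw * ah + bw * bh - inter)


def merge_detections(hog_boxes, haar_boxes, max_overlap=0.5):
    """HOG boxes plus those Haar boxes that do not duplicate a kept box."""
    merged = list(hog_boxes)
    for box in haar_boxes:
        if all(iou(box, kept) < max_overlap for kept in merged):
            merged.append(box)
    return merged
```

This way a person found by both detectors is counted once, which matters when the number of people decides between categories such as Category:Unidentified people and Category:Groups.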

Categories: Category:Unidentified people, Category:Groups

Examples: File:Bhubaneswar WikiFotoWalk2.jpg, File:Funeral of the Cardinal Schaepman in Utrecht.jpg, ...

Chessboard

Detection of chessboard patterns in any kind of scene is a fundamental and crucial task for camera calibration. As such, a separate algorithm dedicated solely to this purpose has been implemented (OpenCV) and can be used here as well.

Categories: Category:Chessboards

OpticalCodes

Automated detection of 1D and 2D optical codes (such as barcodes, data matrices, ...) is essential for a lot of applications, and those algorithms (zbar, pydmtx) are used here as well.

Categories: Category:Barcode

Examples: File:El caso.jpg, ...

Text

(PDF only at present)

Categories: Category:Books (literature) in PDF

Examples: File:Job-110359 Report Wikipedia English V2.pdf, ...

Streams

(...)

Categories: Category:Videos, Category:Ogg sound files

Examples: File:More Instructions for Using the Columbus State University Writing Center Calendar.ogv, ...

(conditional)

All kinds of categories (e.g. file formats) that are not worth adding on their own and therefore require other categories to be present already (i.e. they are added only if one of the categories above was found).
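The rule can be expressed in a few lines. This is a sketch with a hypothetical function name, assuming the categories found by the sections above arrive as a list of strings.

```python
def apply_conditional(primary, conditional):
    """Add conditional categories only if a primary category was found.

    `primary` are the categories found by the detectors above,
    `conditional` the ones not worth adding on their own.
    """
    if not primary:
        return []
    return list(primary) + [c for c in conditional if c not in primary]
```

With an empty `primary` list nothing is added at all, so a file never ends up with a file-format category as its only category.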

Categories: Category:JPEG, Category:PNG, ...

Examples: ...

(switched off for the unspecific ones; only the more specific ones are handled now, which is more or less nothing ;)

(generic)

This is a section for new, experimental or other kinds of methods that do not have a specialized template yet. Such a template can be set up by anybody. Its absence indicates that something went wrong and the bot fell back to this "emergency" mode in order to produce at least some output. It is also used on the logging/debug page commons:User:DrTrigon/User:DrTrigonBot/logging.

Examples: ...

Belonging here (parts in development, experimental, or not or only partly implemented yet):

Libraries and external code (credits)

Before categorizing the bot tries to gather as much information about an image file and its content as possible by means of the following libraries and methods:

  • python default packages (e.g. PIL)
  • pywikipedia framework packages
  • additional python packages (more exotic ones)
  • modules needing compilation (C/C++ code)
    • JSEG algorithm from the University of California (with kind thanks for the permission to use it), turned into a Python wrapper/bindings
    • pydmtx libdmtx Python Wrapper (needs to be compiled because of the missing Debian/TS package)
    • zbar Python Wrapper (needs to be compiled because of the missing Fedora/devel environment package)
    • OpenCV Object Categorization by BoW, turned into a Python wrapper/bindings because it is not included in the official ones
    • SLIC Superpixels for Python Wrapper (needs to be compiled because of the missing package; is in early development)
  • DrTrigonBot framework packages
  • external programs (binaries)
    • ExifTool by Phil Harvey (since it is the only one capable of handling face recognition meta data)
    • pdftotext from poppler library
    • ffprobe from FFmpeg library
    • ImageMagick

Machine learning

I installed OpenCV from the Linux distro repos:

  • Ubuntu and Fedora have OpenCV Python bindings
  • in the samples directory are some folders with example python, C and C++ programs (fun and useful to play around with!)

Do face detection in combination with Pywikipedia to fill Category:Unidentified people (maybe Category:Unidentified people (bot tagged)?). The next step is probably to start training some filters based on Commons images. For more details on tests done e.g. on Fedora 15 with face detection and the 'bag of words' method, see the code for the pywikipedia bot framework available at https://jira.toolserver.org/browse/DRTRIGON-120. Most recent code available at:

From User:Multichill/Using OpenCV to categorize files

At the time of writing Commons contains about 150,000 uncategorized files. This is only about 1.25% of all files, but it's always nice to be able to lower the number even further. A lot of categorization work has already been done by the CategorizationBot, but this work is all based on the usage of a file. No categorization has been done based on the contents of the file itself.

OpenCV (Open Source Computer Vision) is a library of programming functions for real time computer vision. It can be used to "recognize" images. OpenCV could be used to move uncategorized files to one of the unidentified topics categories based on the image characteristics. OpenCV contains several approaches we could use to "recognize" images.

Some frequently occurring subjects in uncategorized files:

MailerBot

Soft redirect to:de:User:DrTrigonBot/Doku#MailerBot
This page is a soft redirect.