DocFetcher

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Fairemblem at 20:28, 4 December 2016 (Added a couple citations). It may differ significantly from the current revision.

DocFetcher
Developer(s): DocFetcher project
Stable release: 1.1.17 / February 12, 2016
Written in: Java
Operating system: MS Windows, Mac OS X, Linux
License: Eclipse Public License
Website: http://docfetcher.sourceforge.net/

DocFetcher is an open source desktop search application that runs on Microsoft Windows, Mac OS X and Linux.[1] It is written in Java[2] and has a graphical user interface based on the Standard Widget Toolkit (SWT).

DocFetcher's indexing and searching facilities are based on Apache Lucene, a widely used open source search engine.

Features

  • Supports major document formats, including PDF, HTML, Microsoft Office, OpenOffice.org, RTF, and EPUB
  • Searches a variety of other formats, including Visio, JPEG, MP3, and SVG
  • Supports the archive formats zip, 7z, rar, and tar.*
  • Searches Outlook emails (PST files)
  • Can be customized to index any kind of source code file
  • Automatically updates its indexes whenever files are modified
  • Excludes files from indexing based on regular expressions
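Regular-expression-based exclusion can be illustrated with a short Java sketch. This is not DocFetcher's own code: the class name ExclusionFilterSketch, its methods, and the sample patterns are hypothetical, and it assumes that a file is skipped when any pattern matches its entire file name. DocFetcher's actual matching rules may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of DocFetcher.
final class ExclusionFilterSketch {
    private final List<Pattern> exclusions = new ArrayList<>();

    // Compile each exclusion rule once so it can be reused for every file.
    ExclusionFilterSketch(List<String> regexes) {
        for (String regex : regexes) {
            exclusions.add(Pattern.compile(regex));
        }
    }

    // Assumption: a file is excluded when any pattern matches its whole name.
    boolean isExcluded(String fileName) {
        for (Pattern pattern : exclusions) {
            if (pattern.matcher(fileName).matches()) {
                return true;
            }
        }
        return false;
    }

    // Returns the file names that would still be passed on to the indexer.
    List<String> filter(List<String> fileNames) {
        List<String> kept = new ArrayList<>();
        for (String name : fileNames) {
            if (!isExcluded(name)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Sample rules: skip *.tmp files and Office lock files starting with "~$".
        ExclusionFilterSketch filter =
                new ExclusionFilterSketch(Arrays.asList(".*\\.tmp", "~\\$.*"));
        System.out.println(
                filter.filter(Arrays.asList("report.pdf", "draft.tmp", "~$report.docx")));
    }
}
```

Compiling the patterns up front keeps the per-file check cheap, which matters when a folder tree with many files is scanned.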

Portability

DocFetcher is available as a portable version that lets users bundle DocFetcher with their personal files to create a portable, searchable document repository. For instance, the user can carry this repository around on a USB drive or synchronize it across multiple computers via a file synchronization service. Because DocFetcher is Java-based, the repository can also be accessed from different platforms, e.g. from Windows as well as from Mac OS X or Linux.

Pairing of HTML files

By default, DocFetcher treats an HTML file and its companion folder (e.g. a file named foo.html and a folder named foo_files) as a single document. This improves the quality of the search results by hiding the files inside the companion folder, which users are usually not interested in and which can therefore be considered "noise".
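The pairing rule can be sketched in Java as follows. This is a minimal illustration and not DocFetcher's implementation: the class HtmlPairingSketch and the method findPairs are hypothetical, the "_files" suffix is taken from the example above, and accepting the ".htm" extension in addition to ".html" is an assumption.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of DocFetcher.
final class HtmlPairingSketch {

    // Maps each HTML file name to its companion folder name, if one exists
    // in the same directory listing. Unpaired files are left out of the map.
    static Map<String, String> findPairs(Collection<String> fileNames,
                                         Collection<String> folderNames) {
        Set<String> folders = new HashSet<>(folderNames);
        Map<String, String> pairs = new TreeMap<>();
        for (String file : fileNames) {
            String lower = file.toLowerCase(Locale.ROOT);
            // Assumption: both ".html" and ".htm" count as HTML files.
            if (!lower.endsWith(".html") && !lower.endsWith(".htm")) {
                continue;
            }
            // "foo.html" -> "foo" -> "foo_files"
            String baseName = file.substring(0, file.lastIndexOf('.'));
            String candidate = baseName + "_files";
            if (folders.contains(candidate)) {
                pairs.put(file, candidate);
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        Map<String, String> pairs = findPairs(
                Arrays.asList("foo.html", "bar.html", "notes.txt"),
                Arrays.asList("foo_files", "images"));
        System.out.println(pairs);
    }
}
```

An indexer built on this idea would show only the HTML file in its results and treat the contents of each paired folder as part of that document.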

References

  1. DocFetcher homepage, retrieved 2016-12-04
  2. "Portable desktop search: Make the most of DocFetcher", TechRepublic, 2012-11-10, retrieved 2016-12-04