Searchblox

From Wikipedia, the free encyclopedia

SearchBlox is a provider of an open source enterprise content search engine built on top of Apache Lucene/Solr[1] and Elasticsearch.[2] Its software implements search indexing for websites, e-commerce sites, intranets, Twitter, flat files and databases. Search capabilities include flexible crawling, metadata indexing, search result customization, multilingual support, reporting, analytics, faceted search, and featured results. The software also provides tools for search customization, integration, and administration.

SearchBlox external connectors can index Alfresco, CMIS, Documentum (EMC), databases (Oracle, MySQL, PostgreSQL and other JDBC-supported databases), FileNet (IBM), LiveLink (OpenText), Meridio (Autonomy), SharePoint, wikis, and Windows shares (Windows, Samba, NetApp and other NAS systems). SearchBlox can be deployed on Amazon EC2 using automated, semi-automatic or manual deployment methods to set up cloud search.[3]

Features

End-user features

  • Seamlessly search across websites, databases, emails, RSS and Atom web feeds, file systems, Twitter, flat files and custom content
  • Advanced Search – Search by file format, language, keyword occurrence and last modified date
  • Faceted Search – Term, Number/Date range and Date Histogram based faceted search results
  • Synonyms – Synonyms can be mapped for each collection
  • Automatic highlighting of user search query terms in HTML and PDF documents
  • Keyword-in-Context Display – search results are displayed with areas of content where the keyword occurs
  • Supports Boolean (AND, OR, NOT), fuzzy, wildcard and fielded searches
  • Email Alert – Set up keyword alerts for new content that is discovered and indexed
  • Content tag cloud – Text tag cloud based on the content within the collections
  • Spelling suggestions – Suggestions based on the indexed content
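
The Boolean, fuzzy, wildcard and fielded searches listed above correspond to query types in the classic Lucene query syntax. The following is a minimal sketch, assuming the search box accepts Lucene-style query strings (an assumption, since the list does not specify the query syntax). The helper names and the field name `title` are hypothetical and chosen only for illustration.

```python
# Hypothetical helpers that compose Lucene-style query strings.
# Assumption: the engine accepts classic Lucene query syntax.

def boolean_and(*terms: str) -> str:
    """Require every term, e.g. 'enterprise AND search'."""
    return " AND ".join(terms)

def boolean_not(include: str, exclude: str) -> str:
    """Match documents containing `include` but not `exclude`."""
    return f"{include} NOT {exclude}"

def fuzzy(term: str) -> str:
    """Match terms spelled similarly to `term` (trailing tilde)."""
    return f"{term}~"

def wildcard(prefix: str) -> str:
    """Match any term starting with `prefix` (trailing asterisk)."""
    return f"{prefix}*"

def fielded(field: str, phrase: str) -> str:
    """Restrict a quoted phrase to a single field."""
    return f'{field}:"{phrase}"'

if __name__ == "__main__":
    for query in (
        boolean_and("enterprise", "search"),
        boolean_not("search", "solr"),
        fuzzy("serch"),
        wildcard("index"),
        fielded("title", "annual report"),
    ):
        print(query)
```

Keeping each query type in its own small function makes it easy to combine them, for example `boolean_and(fielded("title", "annual report"), wildcard("index"))`.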

Administrator features

  • Easy to use and intuitive web console to manage all aspects of the search application
  • Built-in replication to synchronize search indexes across multiple instances of SearchBlox
  • Customization of search results using CSS or XSL stylesheets. Results are also available as XML/JSON data for complete control of presentation or for programmatic access
  • Built-in crawlers to index HTTP, HTTPS, file system, email (PST files), database (Oracle, MySQL and MS SQL Server), Twitter, RSS and Atom web feed content
  • Built-in file serving of documents in File System Collections and Outlook PST archives (including attachments)
  • Support for indexing content through Proxy Servers
  • Selective indexing of sections of HTML pages using <noindex> </noindex> or <!--stopindex--> <!--startindex--> tags
  • API – Create, update and delete collections programmatically
  • Advanced Reporting – real-time reporting for top queries and zero match queries on a per collection basis
  • Set up distributed clusters of search infrastructure for crawling and search
  • Set up Featured Results for targeted search hits
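
Selective indexing, as listed above, means that marked sections of an HTML page are left out of the index. The sketch below illustrates the idea in plain Python by removing everything between a `<noindex>` pair or between the comment-style `stopindex` and `startindex` markers before the page text would be indexed. It is an illustration of the concept only, not SearchBlox's crawler code, and the function name `strip_excluded_sections` is hypothetical.

```python
import re

# Patterns for the two marker styles; matching is case-insensitive and
# spans line breaks. Non-greedy so each marked section is removed separately.
_NOINDEX = re.compile(r"<noindex>.*?</noindex>", re.IGNORECASE | re.DOTALL)
_STOP_START = re.compile(
    r"<!--\s*stopindex\s*-->.*?<!--\s*startindex\s*-->",
    re.IGNORECASE | re.DOTALL,
)

def strip_excluded_sections(html: str) -> str:
    """Return the page with all marked sections removed."""
    html = _NOINDEX.sub("", html)
    return _STOP_START.sub("", html)

page = (
    "<p>Indexed intro.</p>"
    "<noindex><nav>Menu</nav></noindex>"
    "<!--stopindex--><footer>Footer</footer><!--startindex-->"
    "<p>Indexed body.</p>"
)
cleaned = strip_excluded_sections(page)

if __name__ == "__main__":
    print(cleaned)
```

A regular expression is enough for this illustration; a production crawler would more likely handle the markers while parsing the HTML.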

Supported file formats

  • HTML
  • XML
  • Word
  • Excel
  • PowerPoint
  • Visio
  • PDF
  • Text
  • RTF
  • EPUB
  • AutoCAD (DWG)
  • OpenOffice
  • iWork (Pages, Numbers, Keynote)
  • WordPerfect
  • Images (BMP, JPEG, TIFF, GIF, PNG, SVG, PSD)
  • Audio (AIF, MP3, MP4, MIDI, WAV)
  • Video (MPEG, FLV)
  • 32-bit and 64-bit Outlook PST Email Archive files (including attachments)

Supported Languages

Arabic, Bengali, Chinese (Simplified), Chinese (Traditional), Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Gujarati, Hebrew, Hindi, Hungarian, Italian, Japanese, Kannada, Korean, Latvian, Lithuanian, Malayalam, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, Tamil, Telugu, Thai, Turkish

Latest software release

SearchBlox version 8.1 Build 1 was released in October 2014 for Windows, Unix and Mac OS X, and as a WAR file, together with a WordPress plugin, a Drupal module and PHP examples for the REST API. This version adds faceted search without the need to manage a schema, and scales horizontally without manual configuration or external software or scripts. It supports distributed indexing and searching without the separate scripts or programs used with SolrCloud. SearchBlox also provides on-demand dynamic faceting of fields, without the fields having to be specified in a configuration file or script.
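
On-demand faceting means that counts can be computed for any field at query time, rather than only for fields declared ahead of time. The following conceptual sketch in plain Python shows a term facet and a monthly date histogram computed over a small result set. It does not use SearchBlox or Elasticsearch code; the sample documents, field names and function names are hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical search results; any field can be faceted without
# being declared beforehand.
docs = [
    {"format": "pdf", "lang": "en", "modified": date(2014, 9, 3)},
    {"format": "html", "lang": "en", "modified": date(2014, 9, 21)},
    {"format": "pdf", "lang": "fr", "modified": date(2014, 10, 5)},
]

def term_facet(documents, field):
    """Count how many documents carry each distinct value of `field`."""
    return Counter(doc[field] for doc in documents if field in doc)

def month_histogram(documents, field):
    """Bucket a date field by calendar month (YYYY-MM) and count each bucket."""
    return Counter(
        doc[field].strftime("%Y-%m") for doc in documents if field in doc
    )

if __name__ == "__main__":
    print(term_facet(docs, "format"))
    print(term_facet(docs, "lang"))
    print(month_histogram(docs, "modified"))
```

Because the field name is passed as an argument, the same functions serve every field in the result set, which is the essence of faceting without a predefined schema.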

History

SearchBlox Software, Inc.[4] was founded in 2003 with the aim of developing commercial search solutions based on Apache Lucene. SearchBlox currently uses Elasticsearch as the underlying search API.

Availability

SearchBlox Software provides search to over 300 customers in 30 countries. The software supports 37 languages and is available on Windows, Linux and Mac platforms.

References

  1. Apache Lucene
  2. Elasticsearch
  3. Amazon Web Services / EC2
  4. SearchBlox Website

External links

  1. Official web site for Searchblox Software
  2. Official web site for Searchblox Help Center
  3. SearchBlox Wiki