Java Open Source Projects Directory

...dedicated to Java open source projects

Search Engines

lucene

Jakarta Lucene is a high-performance, full-featured text search engine library written entirely in Java. It is a technology suitable for nearly any application that requires full-text search, especially cross-platform.
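As a rough illustration of what a full-text search library does at its core, the sketch below builds a tiny inverted index in plain Java. It does not use Lucene's API; the class name InvertedIndexSketch and its add and search methods are hypothetical, and the tokenization is deliberately naive.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only: a minimal inverted index, not Lucene's API.
public class InvertedIndexSketch {
    // Maps each lowercase term to the IDs of the documents that contain it.
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Split the text on non-alphanumeric characters and record every term.
    public void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Return the IDs of documents that contain every query term (AND semantics).
    public Set<Integer> search(String query) {
        Set<Integer> result = null;
        for (String term : query.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (term.isEmpty()) {
                continue;
            }
            Set<Integer> docs = postings.getOrDefault(term, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(docs); // copy so the index is not modified
            } else {
                result.retainAll(docs);
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Lucene is a text search engine library");
        index.add(2, "YaCy is a distributed web crawler");
        index.add(3, "regain is a search engine on top of Lucene");
        System.out.println(index.search("search engine"));
        System.out.println(index.search("crawler"));
    }
}
```

A production library adds analyzers, relevance ranking, and on-disk index formats on top of this basic idea.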

 

yacy

YaCy is a distributed web crawler and also a caching HTTP proxy. Its online interface lets you configure your personal settings, proxy settings, access control and crawling properties. You can also use the interface to start crawls, send messages to other peers and monitor your index, cache status and crawling processes. Most importantly, you can use the search page to search either your own index or the global index.

 

zilverline

Zilverline is what you could call a 'Reverse Search Engine'. It indexes documents from your local disks (and UNC path style network disks), and allows you to search through them locally or, if you're away from your machine, through a webserver on your machine. Zilverline supports collections. A collection is a set of files and directories in a directory. PDF, Word, txt, java, CHM and HTML files are supported, as well as zip and rar files. A collection can be indexed and searched. The results of the search can be retrieved from local disk or remotely, if you run a webserver on your machine. Files inside zip, rar and chm files are extracted and indexed, and can be cached. The cache can be mapped to sit behind your webserver as well.

 

oxyus

Oxyus Search Engine is a Java-based application for indexing web documents so they can be searched from an intranet or the Internet, similar to proprietary search engines in the industry. Oxyus has a web module that presents search results to clients through web browsers, using a Java server that accesses a JDBC repository through Java Beans.

 

hounder

Hounder is a simple and complete search system. Out of the box, Hounder crawls the web targeting only those documents of interest, and presents them through a simple search web page and through an API, ideal for integrating into other projects. It is designed to scale on all fronts: the number of the indexed pages, the crawling speed and the number of simultaneous search queries. It is in use in many large scale search systems.

 

piscator

Piscator is a small SQL/XML search engine. Once an XML feed is loaded, it can be queried using plain SQL. The setup is almost identical to the DB2 side tables approach.

 

bddbot

BDDBot is a web robot, search engine, and web server written entirely in Java. It was written as an example for a chapter on how to write your own search engine, and as such it is very simplistic.

 

regain

regain is a fast search engine built on top of Jakarta Lucene. It crawls files or web pages using a plugin architecture of preparators for several file formats and data sources. Search requests are handled through a browser-based user interface using Java Server Pages. regain is released under the LGPL and comes in two versions: (1) a standalone desktop search program that includes a crawler and an HTTP server, and (2) a server-based installation that provides full-text search for a website or intranet file server, configured through XML files.

 