Recoll is an open-source application designed from the ground up to provide users with a personal full-text search utility for their GNU/Linux desktop environments. It is based on the well-known Xapian backend.
Key features include easy installation, support for popular UNIX-based systems, a modern graphical user interface powered by Qt, support for the Unity user interface of the Ubuntu operating system, support for common file types, and support for multiple languages.
In addition, the application features support for multiple character encodings, state-of-the-art query functionality, support for email attachments, transparent handling of common UNIX archive formats, including gzip and bzip2, file and folder filters, and comprehensive documentation.
Natively indexed file types include Plain Text, HTML, Mailbox, Maildir, Pidgin and Purple log files, Man Pages, Dia diagrams, PowerPoint and Excel documents. Thanks to external helpers, the program is also capable of indexing documents from well-known apps, such as AbiWord, KWord, OpenOffice, Gnumeric and Okular.
Among other features, Recoll can be used inside the Mozilla Firefox web browser, through an add-on, allowing users to easily index visited web pages. It can also process email attachments, search multiple selectable databases at once, and perform wildcard searches.
It provides users with a modern graphical user interface written with the Qt GUI toolkit and designed to index the user's home directory from the get-go. The indexing process can be stopped at any time, and it is also possible to erase the search and document history, as well as to view indexed types and missing helpers.
Search results can be easily sorted by date, oldest or newest first, viewed as a table, or saved as a spreadsheet file in the common CSV (Comma-Separated Values) file format. From the Preferences dialog, users can configure the indexing schedule, the graphical user interface, and external tools.
What is new in this release:
- Added GUI dialog to perform partial indexing.
- Fixed advanced search in "Any Clause" mode: the directory filter would not filter but add an ORed clause.
- Fixed bogus syntax errors about parentheses around phrases.
- Fixed a few boundary conditions detected by VC++.
- Misc other small fixes, see commit log.
What is new in version 1.20.4:
- 1.20.4 has a fix to skip compressed file system images like xxx.img.gz by default. This should have been in 1.20.3.
What is new in version 1.20.1:
- An Open With entry was added to the result list and result table popup menus. This lets you choose an alternative application to open a document. The list of applications is built from the information inside the /usr/share/applications desktop files.
- A new way of specifying multiple terms to be searched inside a given field: it used to be that an entry lacking whitespace but splittable, like [term1,term2], was transformed into a phrase search, which made sense in some cases, but not so many. The code was changed so that [term1,term2] now means [term1 AND term2], and [term1/term2] means [term1 OR term2]. This is useful for field searches where you would previously be forced to repeat the field name for every term. [somefield:term1 somefield:term2] can now be expressed as [somefield:term1,term2].
- (1.20.1) The Query Fragments tool was added to the GUI. This is a window with customizable buttons to add arbitrary query language fragments to the current search. The buttons and fragments are defined in an XML file inside the Recoll configuration directory (~/.recoll/fragbuts.xml). This makes it easy to define "pre-cooked" filters for things that you need repeatedly. See the manual for more details.
- We changed the way terms are generated from a compound string (e.g. an email address). Previously, for an address like jfd@recoll.org, only the simple terms and the terms anchored at the start were generated (jfd, recoll, org, jfd@recoll, jfd@recoll.org). The new text splitter generates all the other possible terms (here, recoll.org only), so that it is now possible to search for left-truncated versions of the compound, e.g., all emails from a given domain.
- (1.20.1) New keyboard accelerators for the result table: Ctrl+r switches the focus from the search entry to the table, Ctrl+o opens the document for the current line, Ctrl+Shift+o opens document and closes recoll, Ctrl+d previews the document.
- (1.20.1) A special term is now indexed for results from the web history: use "-rclbes:BGL" to exclude the web results, "rclbes:BGL" to restrict the results to the web ones. This is difficult to remember, but the Query Fragments feature means that you don't need to (this is in the sample Query Fragments file).
- Recoll now indexes #hashtags as such.
- It is now possible to configure the GUI in wide form factor by dragging the toolbars to one of the sides (their location is remembered between sessions), and moving the category filters to a menu (can be set in the "Preferences->GUI configuration" panel).
- We added the indexedmimetypes and excludedmimetypes variables to the configuration GUI, which was also compacted a bit. A bunch of uninteresting variables were also removed.
- When indexing, we no longer add the top container file name as a term for the contained sub-documents (if any). This made no sense in most cases, as it meant that you would get hits on all the sections from a chm or epub when the top file name matched the search, when you probably wanted only the parent document in this case.
- However, the container file name was sometimes useful for filtering results, and it is still accessible, in a different way: the top container file name is added as a term to all the sub-documents, only for searching with a prefix. The field name is containerfilename, and no match on the sub-documents will occur if the field is not specified (this is different from previous filename processing, which was indexed as a general term). containerfilename is also set on files without sub-documents (e.g. a pdf).
- A new attribute, pfxonly, was created to support the above change. This can be set on any metadata field inside the [prefixes] section of the fields file. The affected field terms will be indexed only with a prefix, so they will cause a hit only for a field search (the general behaviour is that field terms are indexed both prefixed and not, so they can also cause a hit when searched as general terms).
- A new [queryaliases] section was created in the fields file, for defining field name aliases to be used only at query time (to avoid unwanted collection of data on random fields during indexing). The section is empty by default, but 2 obvious aliases are commented: filename=fn and containerfilename=cfn. Setting them in your personal file may save you some typing if you search on file names.
- You can now use both -e and -i for erasing then updating the index for the given file arguments with the same recollindex command.
- We now allow access to the Xapian docid for Recoll documents in recollq and Python API search results. This allows writing scripts which combine Recoll and pure Xapian operations. A sample Python program to find document duplicates using MD5 terms was added. See src/python/samples/docdups.py.
- The command used to identify the MIME types of files when the internal method fails is file -i by default. It is now possible to customize this command by setting the systemfilecommand in the configuration. A suggested value would be xdg-mime, which sometimes works better than file.
- The result list has two new elements: %P substitution for printing the parent folder name, and an F link target which will open the parent folder in a file manager window, for example as a link labeled "Open parent directory".
- /media was added to the default skippedPaths list mostly as a reminder that blindly processing these with the general indexer is a bad idea (use separate indexes instead).
- recollq and recoll -t get a new option -N to print field names between values when -F is used. In addition, -F "" is taken as a directive to print all fields.
- Unicode hyphen (0x2010) is now translated to ASCII minus during indexing and searching. There is no good way to handle this character, given the various misuses of minus and hyphen. This choice was deemed "less bad" than the previous one.
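The change to compound-string term generation listed above can be illustrated with a small Python sketch. This is not Recoll's actual text splitter; the function names anchored_terms and all_span_terms are hypothetical, and the sketch assumes a compound made of alphanumeric words joined by separator characters. It only reproduces the jfd@recoll.org example from the notes.

```python
import re


def _parts(compound):
    # Split while keeping separators: words land at even indices,
    # separators at odd indices, e.g. ['jfd', '@', 'recoll', '.', 'org'].
    return re.split(r"([^A-Za-z0-9]+)", compound)


def anchored_terms(compound):
    """Previous behaviour as described in the notes (hypothetical sketch):
    the simple words plus the spans anchored at the start."""
    parts = _parts(compound)
    n = (len(parts) + 1) // 2  # number of words
    terms = {parts[2 * i] for i in range(n)}
    terms |= {"".join(parts[: 2 * j + 1]) for j in range(n)}
    return terms


def all_span_terms(compound):
    """New behaviour as described in the notes (hypothetical sketch):
    every contiguous span of words, separators included."""
    parts = _parts(compound)
    n = (len(parts) + 1) // 2
    return {
        "".join(parts[2 * i : 2 * j + 1])
        for i in range(n)
        for j in range(i, n)
    }


if __name__ == "__main__":
    address = "jfd@recoll.org"
    # Terms that only the new scheme generates
    print(sorted(all_span_terms(address) - anchored_terms(address)))
```

For the sample address, the only additional term is recoll.org, which is what makes a search for all emails from a given domain possible.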
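The comma and slash shorthand for field searches introduced in 1.20.1 can be pictured as a simple rewrite. The Python sketch below is an illustration only, not Recoll's query parser: expand_field_clause is a hypothetical helper, and applying the slash form to a field clause (field:a/b becoming field:a OR field:b) is an assumption extrapolated from the unfielded rule stated in the notes.

```python
def expand_field_clause(clause):
    """Expand [field:a,b] to [field:a field:b] and [field:a/b] to
    [field:a OR field:b]. Hypothetical sketch, not Recoll's parser."""
    field, colon, value = clause.partition(":")
    if colon:
        prefix = field + ":"
    else:
        # No field name: the whole clause is the value
        prefix, value = "", clause

    if "," in value and "/" not in value:
        # Comma means AND, which the query language expresses implicitly
        return " ".join(prefix + term for term in value.split(",") if term)
    if "/" in value and "," not in value:
        # Slash means OR
        return " OR ".join(prefix + term for term in value.split("/") if term)
    # Plain terms and mixed forms are left untouched in this sketch
    return clause


if __name__ == "__main__":
    print(expand_field_clause("somefield:term1,term2"))
    print(expand_field_clause("term1/term2"))
```

A real parser also has to deal with quoting and with values that legitimately contain slashes, such as directory paths, which this sketch deliberately ignores.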
What is new in version 1.19.14:
- 1.19.14 fixes relatively minor but annoying issues in indexing, plus a few other glitches:
- The use of a separate read-only Database object for querying the index while indexing would trigger Xapian errors (bad block reads) and subsequent up-to-date check failures (leading to unnecessary reindexing). The jury is out as to the cause, but using the same object for reading and writing seems to eliminate the problem.
- An unnecessary log message in the child process between forking and executing the filter could block on a mutex, and lead to a 20-minute timeout for the affected parent process thread (happened only in multithread mode).
- Also fixed a possible overflow of the filter stack. This could only really happen in pathological situations (hand-crafted recursive zip file...).
What is new in version 1.19.13:
- This hopefully fixes the last remaining bug in the multithreading code, which was causing quite rare, but annoying crashes. You definitely want to upgrade to this version if you are running recoll 1.19.
What is new in version 1.19.11:
- Case/diacritics sensitivity is still off by default for this release. It can be turned on only by editing recoll.conf (see the manual). If you do so, you must then reset the index.
What is new in version 1.19.9:
- This release fixes a number of significant bugs (query date condition handling, possible GUI crashes...).
What is new in version 1.19.2:
- This release fixes a bug in path translations for additional indexes.
What is new in version 1.18.1:
- This version brings optional case- and diacritics-sensitive searches, complex search history, and direct access to hit pages for PDF documents.
What is new in version 1.17.3:
- Release 1.17.3 mostly fixes an indexing crash that sometimes occurred while processing email.
What is new in version 1.17.2:
- Fixes a few bugs and adds a small feature for handling characters that should not be unaccented in your language (e.g. å in Swedish). See unac_except_trans in the manual configuration section. Also adds a new rcldia filter for Dia files.
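The idea behind the accent-handling exception mentioned for 1.17.2 can be sketched in Python: diacritics are stripped from text, except for characters that are distinct letters in a given language. This is an illustration under stated assumptions, not Recoll's implementation and not the syntax of its configuration variable; strip_diacritics and its keep parameter are hypothetical names.

```python
import unicodedata


def strip_diacritics(text, keep=""):
    """Remove combining marks from text, leaving characters listed in
    `keep` untouched. Hypothetical sketch of the exception idea."""
    # Work on composed characters so that 'å' is a single code point
    text = unicodedata.normalize("NFC", text)
    keep = set(unicodedata.normalize("NFC", keep))
    out = []
    for ch in text:
        if ch in keep:
            out.append(ch)
            continue
        # Decompose the character and drop the combining marks
        decomposed = unicodedata.normalize("NFD", ch)
        out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    return "".join(out)


if __name__ == "__main__":
    word = "sm\u00f6rg\u00e5sbord"  # "smörgåsbord"
    print(strip_diacritics(word))                        # no exceptions
    print(strip_diacritics(word, keep="\u00e5\u00f6"))   # keep å and ö
```

With no exceptions, the Swedish letters are flattened to plain o and a; listing them in keep preserves them, so that words differing only by those letters stay distinct in the index.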
Requirements:
- Xapian and Omega
- Qt