5. Keyword search
To search for text, enter a query in the Search panel and click the Search button.
Note: If a complex query takes a long time to evaluate, refreshing the page or closing the browser tab during the evaluation cancels the query, and it will not appear in the results list.
For query syntax rules, see the Search query syntax section below.
5.1. Search options
With search options you can limit keyword searching to specific item parts or attributes:
- Text
- Title / Subject
- Path & File name
- Summary & Description
- Message Headers
- Raw Data (e.g. low-level data from PST files, MS Office documents, vCards)
- Authors & E-mail Addresses
- Each of the From, Sender, To, Cc and Bcc fields separately
- Comments
- Export (searches in the export IDs of the items that are part of any export set)
To see the search options, click the drop-down button next to the Search button and then select Set search options from the drop-down menu.
The options box will be displayed as a popup window.
Select the properties that you want to include in your search and deselect those you want to exclude. Your selected search options will be stored and used for future searches until the time you change them again.
Note: As a reminder, a warning label will be shown when not all options are selected.
The Enable paragraph exclusion checkbox excludes paragraphs that have been marked for exclusion, as described in the Previewing results section.
5.2. Search query syntax
In the text field of the Search panel you can use a special query syntax to perform complex multi-term queries and use other advanced search features.
5.2.1. Use of multiple terms (AND/OR operators)
By default, a query containing multiple terms matches with items that contain all terms anywhere in the item. For example, searching for:
John Johnson
returns all items that contain both “John” and “Johnson.” There is no need to add AND (or “&&”) because terms are combined this way by default, although adding it does no harm.
If you want to find items containing at least one term but not necessarily both, use one of the following queries:
John OR Johnson
John || Johnson
5.2.2. Minus sign (NOT operator)
The NOT operator excludes items that contain the term after NOT:
John NOT Johnson
John -Johnson
Both queries return items that contain the word “John” and not the word “Johnson.”
John -"John goes home"
This returns all items that contain “John,” excluding items that contain the phrase “John goes home.” The NOT operator cannot be used in a query that consists only of the excluded term. For example, the following queries return no results:
NOT John
NOT "John Johnson"
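The Boolean behaviour described in the two sections above can be pictured with set operations. The following Python sketch is an illustration only, not Intella's implementation: the `ITEMS` data and the `search` helper are hypothetical. It assumes that excluded terms are subtracted from the items matched by the positive terms, which is one way to see why a query consisting only of NOT returns nothing.

```python
# Hypothetical mini-index: item id -> set of lowercase terms in that item.
ITEMS = {
    1: {"john", "johnson"},
    2: {"john", "smith"},
    3: {"johnson"},
    4: {"mary"},
}

def search(all_of=(), any_of=(), none_of=()):
    """Return ids of items matching the required, optional and excluded terms."""
    # Assumption: without any positive term there is nothing to subtract from.
    has_positive = bool(all_of) or bool(any_of)
    matched = set()
    for item_id, terms in ITEMS.items():
        ok_all = all(t in terms for t in all_of)
        ok_any = not any_of or any(t in terms for t in any_of)
        if has_positive and ok_all and ok_any:
            matched.add(item_id)
    # Exclusions only remove items from the positive matches.
    return {i for i in matched if not any(t in ITEMS[i] for t in none_of)}

both = search(all_of=["john", "johnson"])                # John Johnson
either = search(any_of=["john", "johnson"])              # John OR Johnson
without = search(all_of=["john"], none_of=["johnson"])   # John -Johnson
only_not = search(none_of=["john"])                      # NOT John
```

In this sketch `only_not` is empty because no positive term selects any items to begin with.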
5.2.3. Phrase search
To search for a certain phrase (a list of words appearing right after each other and in that particular order), enter the phrase within double quotes in the search field:
"John goes home"
will match with the text “John goes home after work” but will not match the text “John goes back home after work.” Phrase searches also support the use of nested wildcards, e.g.
"John* goes home"
will match both “John goes home” and “Johnny goes home”.
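As a rough illustration of phrase matching with nested wildcards, the sketch below slides a window over the words of a text and compares each word with the corresponding phrase term. The `phrase_matches` helper is hypothetical, and splitting on whitespace is a simplifying assumption; Intella's own tokenization is not described here.

```python
from fnmatch import fnmatchcase

def phrase_matches(phrase, text):
    """True if the phrase terms occur consecutively and in order in the text."""
    wanted = phrase.lower().split()
    tokens = text.lower().split()  # simplified tokenization (assumption)
    for start in range(len(tokens) - len(wanted) + 1):
        window = tokens[start:start + len(wanted)]
        # fnmatchcase lets a phrase term contain "*" or "?" wildcards.
        if all(fnmatchcase(tok, pat) for tok, pat in zip(window, wanted)):
            return True
    return False

hit = phrase_matches("John goes home", "John goes home after work")
miss = phrase_matches("John goes home", "John goes back home after work")
wild = phrase_matches("John* goes home", "Johnny goes home")
```

Here `hit` and `wild` are true, while `miss` is false because the word “back” breaks the sequence.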
5.2.4. Grouping
You can use parentheses to control how your Boolean queries are evaluated:
(desktop OR server) AND application
retrieves all items that contain “desktop” and/or “server,” as well as the term “application.”
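The effect of the parentheses can be shown with a small Python sketch in which each term selects a set of items. The `ITEMS` data and the `having` helper are hypothetical and exist only for this illustration.

```python
# Hypothetical mini-index: item id -> set of terms in that item.
ITEMS = {
    1: {"desktop", "application"},
    2: {"server", "application"},
    3: {"desktop"},
    4: {"application"},
}

def having(term):
    """Return the ids of all items that contain the term."""
    return {i for i, terms in ITEMS.items() if term in terms}

# (desktop OR server) AND application
grouped = (having("desktop") | having("server")) & having("application")

# desktop OR (server AND application)
regrouped = having("desktop") | (having("server") & having("application"))
```

With this data, `grouped` contains items 1 and 2, whereas `regrouped` also contains item 3, which has “desktop” but not “application.”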
5.2.5. Single and multiple character wildcard searches
To perform a single character wildcard search you can use the “?” symbol. To perform a multiple character wildcard search you can use the “*” symbol.
To search for “next” or “nest,” use:
ne?t
To search for “text”, “texts” or “texting” use:
text*
The “?” wildcard matches with exactly one character. The “*” wildcard matches zero or more characters.
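The two wildcards can be pictured as a translation into a regular expression, where “?” becomes “exactly one character” and “*” becomes “zero or more characters.” The Python sketch below is only an illustration of these rules; the helper names and the sample term list are hypothetical.

```python
import re

def wildcard_to_regex(pattern):
    """Translate a ?/* wildcard pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")    # exactly one character
        elif ch == "*":
            parts.append(".*")   # zero or more characters
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)

def matching_terms(pattern, terms):
    """Return the terms that match the wildcard pattern as a whole."""
    rx = wildcard_to_regex(pattern)
    return [t for t in terms if rx.fullmatch(t)]

terms = ["next", "nest", "neat", "net", "text", "texts", "texting", "context"]
single = matching_terms("ne?t", terms)
multiple = matching_terms("text*", terms)
```

In this sketch “net” is not matched by `ne?t` because “?” requires exactly one character, and “context” is not matched by `text*` because the pattern must match the whole term.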
5.2.6. Fuzzy search
Intella supports fuzzy queries, i.e., queries that roughly match the entered terms. For a fuzzy search, you use the tilde (“~”) symbol at the end of a single term:
roam~
returns items containing terms like “foam,” “roams,” “room,” etc.
The required similarity can be controlled with an optional numeric parameter. The value is between 0 and 1, with a value closer to 1 resulting in only terms with a higher similarity matching the specified term. The parameter is specified like this:
roam~0.8
The default value of this parameter is 0.5.
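To give a feel for how a similarity threshold narrows the results, the sketch below scores candidate terms with an edit distance. This is an assumption made for illustration: the exact similarity formula used by Intella is not stated in this section, and the `similarity` and `fuzzy_matches` helpers are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimal number of single-character edits."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def similarity(a, b):
    """Assumed score in [0, 1]: 1 minus distance relative to the shorter term."""
    shortest = min(len(a), len(b))
    if shortest == 0:
        return 0.0
    return max(0.0, 1.0 - edit_distance(a, b) / shortest)

def fuzzy_matches(term, candidates, min_similarity=0.5):
    """Return the candidates whose similarity reaches the threshold."""
    return [c for c in candidates if similarity(term, c) >= min_similarity]

candidates = ["foam", "roams", "room", "table"]
loose = fuzzy_matches("roam", candidates)        # like roam~ (default 0.5)
strict = fuzzy_matches("roam", candidates, 0.8)  # like roam~0.8
```

Under this assumed formula each of “foam,” “roams” and “room” scores 0.75 against “roam,” so they pass the default threshold but not a threshold of 0.8.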
5.2.7. Proximity search
Intella supports finding items based on words that are within a specified maximum distance from each other in the item’s text. This can be seen as a generalization of a phrase search.
To do a proximity search you place a tilde (“~”) symbol at the end of a phrase, followed by the maximum word distance:
"desktop application"~10
returns items in which these two words occur at most 10 words apart.
Like phrase searches, proximity searches also support nested wildcards.
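A proximity check can be sketched as a comparison of word positions. The Python example below is a simplification under stated assumptions: it counts the words between the two terms, ignores their order, and splits the text on whitespace. The precise distance measure used by Intella is not specified in this section, and the `within_distance` helper is hypothetical.

```python
def within_distance(text, first, second, max_distance):
    """True if the two words occur with at most max_distance words between them."""
    tokens = text.lower().split()  # simplified tokenization (assumption)
    pos_first = [i for i, t in enumerate(tokens) if t == first.lower()]
    pos_second = [i for i, t in enumerate(tokens) if t == second.lower()]
    # abs(a - b) - 1 is the number of words between the two positions.
    return any(abs(a - b) - 1 <= max_distance
               for a in pos_first for b in pos_second)

text = "The desktop version of this application"
near = within_distance(text, "desktop", "application", 10)
far = within_distance(text, "desktop", "application", 2)
```

In this text three words separate “desktop” and “application,” so the check succeeds for a distance of 10 and fails for a distance of 2. With a distance of 0 the sketch only accepts adjacent words, which is how it relates to a phrase search.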
5.2.8. Field-specific search
Intella’s Keyword Search searches in document texts, titles, paths, etc. By default, all of these text types are searched. You can override this globally by deselecting some of the fields in the Options, or for an individual search by entering the field name in your search.
title:intella
returns all items that contain the word “intella” in their title.
The following field names are available:
- text - searches in the item text
- title - searches in titles and subjects
- path - searches in file and folder names
- summary - searches in descriptions, metadata keywords, etc.
- agent - searches in authors, contributors and email senders and receivers
- from - searches in email From fields
- sender - searches in email Sender fields
- to - searches in email To fields
- cc - searches in email Cc fields
- bcc - searches in email Bcc fields
- headers - searches in the raw email headers
- rawdata - searches in raw document metadata
- comment - searches in all comments made by reviewer(s)
- export - searches in the export IDs of the items that are part of any export set
You can mix the use of various fields in a single query:
intella agent:john
searches for all items containing the word “intella” (in one of the fields selected in the Options) that have “john” in their author metadata or email senders and receivers.
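The way a mixed query divides into field-specific and general terms can be sketched as follows. The `split_fields` helper is hypothetical and handles only simple whitespace-separated terms; it does not cover phrases, operators or escaped colons. The field names are the ones listed above.

```python
KNOWN_FIELDS = {
    "text", "title", "path", "summary", "agent", "from", "sender",
    "to", "cc", "bcc", "headers", "rawdata", "comment", "export",
}

def split_fields(query):
    """Split a simple query into (field, term) pairs; field is None if absent."""
    pairs = []
    for token in query.split():
        field, sep, term = token.partition(":")
        if sep and term and field in KNOWN_FIELDS:
            pairs.append((field, term))
        else:
            # No recognised field prefix: the term uses the selected options.
            pairs.append((None, token))
    return pairs

pairs = split_fields("intella agent:john")
```

For this query the sketch yields one term without a field (“intella”) and one term restricted to the agent field (“john”).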
5.2.9. Special characters
The following characters need to be escaped before they can be used in a query:
- && || ! ( ) { } [ ] ^ " ~ * ? : /
They can be escaped by prefixing them with a backslash (\) character.
Note that during indexing, some of these characters will be filtered out and will never make it into the index. The rules for handling specific characters depend on the context in which they occur. For instance, the punctuation characters like dots (‘.’) or dashes (‘-‘) are significant within numbers, email addresses or host names, while being ignored (i.e. interpreted as whitespaces) between regular words. In the latter case, escaping those characters will not make them searchable.
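When queries are built from user input or from a script, escaping can be automated. The Python sketch below is a hypothetical helper, not part of Intella. It assumes that the escape character is a backslash, as in the Lucene query syntax linked from the Regular Expressions section, and it also escapes “+” and the backslash itself, which is an assumption based on that same syntax.

```python
# Characters treated as special in this sketch. "&" and "|" are escaped
# individually, which also covers the "&&" and "||" operators.
# Assumption: "+" and "\" are included based on the Lucene query syntax.
SPECIAL_CHARS = set('+-&|!(){}[]^"~*?:\\/')

def escape_query_term(term):
    """Prefix every special character in the term with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in term)

escaped_path = escape_query_term("C:/temp/(draft)")
escaped_math = escape_query_term("1+1=2?")
```

As noted above, escaping only affects how the query is parsed. It does not make a character searchable if that character was filtered out during indexing.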
5.2.10. Regular Expressions
This release contains experimental support for searching with regular expressions. This may be extended, refined and documented in a future release. For now, please visit http://lucene.apache.org/core/4_3_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Regexp_Searches for more information.
Be aware that these regular expressions are evaluated on the terms index, not on the entire document text as a single string of characters! Your search expressions should therefore take the tokenization of the text into account.
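The difference between matching on terms and matching on the whole text can be illustrated with Python's `re` module. This is only an analogy: the regular expression dialect supported by the search engine differs from Python's, and splitting the text on whitespace is a simplified stand-in for the real tokenization.

```python
import re

text = "John goes home after work"
terms = text.lower().split()  # simplified tokenization (assumption)

# A pattern that spans several words matches the text as one string,
# but no single term matches it.
spanning = re.compile(r"john.*home")
matches_whole_text = spanning.search(text.lower()) is not None
matches_any_term = any(spanning.fullmatch(t) for t in terms)

# A pattern written with a single term in mind does find a term.
per_term = re.compile(r"jo.*n")
terms_hit = [t for t in terms if per_term.fullmatch(t)]
```

Here `matches_whole_text` is true but `matches_any_term` is false, which mirrors why an expression spanning multiple words finds nothing when it is evaluated against the terms index.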