Linear review of emails is often time-consuming and expensive. One reason is that emails may quote the text of previous emails in the thread, resulting in a lot of redundant text. Take for example these three emails:
Marked in red is the redundant text. The text of the first two emails is quoted in full in the last email. When a reviewer reads the last email, he or she has read everything there is to read in this thread. The reality is often more complex, e.g. because people respond to the same root email, remove part of the quoted text, forward it to new recipients, or even alter the quoted text to cover up certain facts. Therefore, it is not always as simple as reading the last email in the thread.
Intella helps with this type of review through the process of email threading. First, it identifies the emails that belong to the same thread. Within each thread, it links the replies and forwards to their parent emails, constructing a graph of how the conversation unfolded. All duplicates of an email are represented by the same node in this graph. Next, it compares the emails within the thread and determines the set of “inclusive” and “non-inclusive” emails. By default, an email is marked as inclusive. When Intella detects that one of the follow-ups of an email (a reply or a forward) contains all of that email's text and attachments, the earlier email is marked as non-inclusive, because reading the follow-up implies having read the earlier email as well. Reading all the inclusive emails and their attachments in a thread therefore implies having read everything there is to read in the thread. This can greatly reduce the time needed to review a large collection of emails.
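The inclusiveness rule described above can be illustrated with a minimal sketch. This is not Intella's actual implementation: the `Email` class, its fields, and the helper functions are hypothetical, and the sketch assumes that each email has already been split into a set of paragraphs and that only direct follow-ups are compared.

```python
from dataclasses import dataclass, field


@dataclass
class Email:
    """Hypothetical thread node; not Intella's internal data model."""
    email_id: str
    paragraphs: frozenset[str]
    attachments: frozenset[str] = frozenset()
    follow_ups: list["Email"] = field(default_factory=list)


def is_inclusive(email: Email) -> bool:
    """An email is non-inclusive if some direct follow-up contains
    all of its paragraphs and all of its attachments (assumption:
    only direct replies and forwards are checked)."""
    for follow_up in email.follow_ups:
        if (email.paragraphs <= follow_up.paragraphs
                and email.attachments <= follow_up.attachments):
            return False
    return True


def inclusive_ids(root: Email) -> list[str]:
    """Walk the thread graph and return the sorted ids of inclusive emails."""
    result = []
    stack = [root]
    while stack:
        email = stack.pop()
        if is_inclusive(email):
            result.append(email.email_id)
        stack.extend(email.follow_ups)
    return sorted(result)
```

In a thread where a second email quotes the first in full and a third quotes both, only the third is inclusive. A side reply that trims the quoted text is inclusive as well, because no other email contains its new text.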
Besides separating inclusive from non-inclusive emails, email threading enables several other functionalities:
Each email item that was processed by the Email Threading analysis is assigned the following properties:
Furthermore, the algorithm determines for each follow-up email whether it is a Reply, Reply All, or Forward. This status is derived from the sender and receiver information rather than from, for example, the Subject line. A loose but conceptually practical definition is:
If the set of participants of the response email is the same as that of the email it responds to (the previous email in the thread), it is a Reply All, unless the conversation is between only two people, in which case it is a Reply.
If the response email is sent to one or more people, and none of them was involved in the original email, it is a Forward.
In all other cases, it is a Reply.
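The loose definition above can be written down as a short sketch. The `Message` class and `classify_follow_up` function are hypothetical names introduced for illustration, and the sketch assumes that addresses have already been normalized so that they can be compared as plain strings. It mirrors the three rules as stated and is not Intella's actual algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical minimal email: one sender plus all To/Cc/Bcc recipients."""
    sender: str
    recipients: frozenset[str]

    @property
    def participants(self) -> frozenset[str]:
        # Everyone involved in the email: the sender and all recipients.
        return self.recipients | {self.sender}


def classify_follow_up(parent: Message, response: Message) -> str:
    """Classify a follow-up as 'Reply', 'Reply All', or 'Forward'."""
    # Rule 1: same participants means Reply All, unless only two people talk.
    if response.participants == parent.participants:
        return "Reply" if len(parent.participants) <= 2 else "Reply All"
    # Rule 2: sent only to people who were not involved in the original.
    if response.recipients and response.recipients.isdisjoint(parent.participants):
        return "Forward"
    # Rule 3: everything else is a Reply.
    return "Reply"
```

For example, if Alice writes to Bob and Carol, and Bob responds to Alice and Carol, the participant sets match and the response is a Reply All. If Bob instead sends the email only to Dave, who was not involved before, it is a Forward.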
Note: Performing email threading analysis is governed by the ‘Can perform email threading’ permission. Users who have not been granted this permission will not see the Email Threading action in the contextual menu.
As email threading is computationally expensive, it runs as a separate post-processing step that must be started explicitly. To start the Email Threading procedure, select one or more items in the Details view and choose “Email Threading…” in the right-click menu. This will open the dialog shown below:
Select “Discard existing email threading data” if you want to clear the Email Thread facet and all the data generated by previous runs of the Email Threading procedure.
Select “Analyze headers embedded in email body” if you want the algorithm to take the headers embedded in the email body into account. Such headers are typically placed above the quoted text and reference the original author and time of the quoted text, and sometimes other metadata. They can be used to link emails together when the SMTP or mail container-specific metadata is missing or incomplete. This option may produce better results but is computationally expensive. When speed is not of the essence, we recommend turning this feature on.
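To give an impression of what embedded headers look like and how they might be picked up, here is a minimal sketch. It is an illustration only and does not describe how Intella parses email bodies: the `extract_embedded_headers` function, the list of recognized field names, and the rule that a header block must contain a From line are all assumptions.

```python
import re

# Matches lines such as "From: Alice <alice@example.com>" or "> Sent: ...",
# optionally prefixed by quote markers. The field list is an assumption.
HEADER_LINE = re.compile(
    r"^\s*>*\s*(From|Sent|Date|To|Cc|Subject):\s*(.+?)\s*$",
    re.IGNORECASE,
)


def extract_embedded_headers(body: str) -> list[dict[str, str]]:
    """Return one dict per block of consecutive header-like lines.

    A block is kept only if it contains a From field (assumption)."""
    blocks = []
    current = {}
    for line in body.splitlines():
        match = HEADER_LINE.match(line)
        if match:
            current[match.group(1).lower()] = match.group(2)
        elif current:
            # A non-header line ends the current block.
            if "from" in current:
                blocks.append(current)
            current = {}
    if "from" in current:
        blocks.append(current)
    return blocks
```

The extracted author and time could then be matched against other emails in the case to find the likely parent of an email whose transport metadata is missing.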
Click the Run button to start the email threading process.
Once the process is done, the Email Thread facet will be populated and the email items that were part of the threading analysis will be augmented with the threading-related information.
In addition to the selected items, Intella automatically processes all of their duplicate items and parent items.
Note: The “Analyze paragraphs” indexing option is a prerequisite for determining the inclusiveness of emails. If this option was not used during indexing, all emails will be marked as Inclusive.