13. Redaction

Redaction is the process of concealing part of an item’s text, graphics and/or metadata to keep that content from unauthorized view. A typical use case is concealing legally privileged information, e.g. content covered by attorney-client privilege, in material that is produced for an opposing party in an e-discovery matter. Other scenarios include hiding person names, birth dates, social security numbers, credit card numbers, etc., either because of privacy laws or because they are not relevant to the matter at hand.

13.1. Workflow

When redacting an item, Intella Connect first creates a temporary PDF representation of the item and then lets the user mark the sensitive areas in it. This PDF and the added redactions are stored in the case. The original evidence item is not changed, nor is any information removed from the Intella case. At any time the redaction marks can be reviewed, edited and removed.

The redactions are only “burned in” when the item is exported to the final PDF or to a load file: all pages in the temporary PDF are converted to images in which the sensitive parts are literally blacked out. The result is a PDF that is guaranteed not to contain the sensitive information.
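The burn-in idea can be illustrated with a minimal sketch. It does not show Intella Connect's actual implementation: the `Rect` type, the `burn_in` function and the representation of a page as a grid of grayscale pixel values are all assumptions made for illustration. The point is that once a page is an image, painting the marked pixels black leaves nothing of the original content behind in that area.

```python
from dataclasses import dataclass

WHITE, BLACK = 255, 0


@dataclass(frozen=True)
class Rect:
    """Hypothetical redaction mark in pixel coordinates (right/bottom exclusive)."""
    left: int
    top: int
    right: int
    bottom: int


def burn_in(page, marks):
    """Return a copy of a grayscale page with every mark painted solid black.

    `page` is a list of rows of 0-255 pixel values. The input is left
    untouched, mirroring the fact that the original item is never changed.
    """
    out = [row[:] for row in page]
    height = len(out)
    width = len(out[0]) if out else 0
    for m in marks:
        # Clamp the mark to the page so marks that overlap the edge are safe.
        for y in range(max(0, m.top), min(height, m.bottom)):
            for x in range(max(0, m.left), min(width, m.right)):
                out[y][x] = BLACK
    return out


# A tiny 6x4 "page" with one stand-in sensitive pixel.
page = [[WHITE] * 6 for _ in range(4)]
page[1][2] = 100
redacted = burn_in(page, [Rect(left=1, top=1, right=4, bottom=3)])
```

After the call, the pixel values under the mark are gone from `redacted`, while `page` still holds the original data.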

Redaction affects the results of the regular PDF export and the PDFs and TIFFs that are created as part of a load file. For the sake of brevity, the remainder of this section will only refer to exported PDFs when both are meant.

13.2. Redacting an item

To redact an item, open it in the Previewer and click either:

  • the Redact action in the toolbar
  • the “Redaction” tab

Intella Connect renders the item using the preferred redaction profile. To use a different profile, select it from the profiles dropdown located above the rendered document on the right side.

Note: The profile can only be changed when no redaction marks are applied to the document.

More information related to Redaction Profiles can be found in the Redaction Profiles section.

The Redaction tab contains a PDF rendering of the item and offers various controls for adding and editing redactions. As the PDF is generated on demand, the tab may take some time to appear, depending on the type and complexity of the item. Once the tab appears, the item is ready to be redacted.

To redact a part of the content, simply select the rectangular area in the rendered item that needs to be hidden. The selected area will be covered with a black rectangle. In the editor the rectangle is semi-transparent, so that the reviewer can still see what content has been redacted without having to move the rectangle; in the final exported document it will be solid black. Repeat this step to conceal additional parts of the item. The redactions are stored automatically; no manual save action is needed.

Redaction editor

To move or resize a redaction mark, click on the rectangle. The rectangle will become selected and can then be moved or resized with the mouse. To remove a redaction, select it and click the Remove selected mark button or press the Delete key. To remove all redactions of this item, click the Clear redactions button.

When you close and reopen a redacted item, the Previewer will immediately show the Redaction tab again with all previously made redactions, as the PDF is cached. The PDF is only discarded when no redactions have been added. Redacted items can easily be found using the Redacted category in the Features facet.

13.3. Exporting

When exporting an item to PDF, Intella Connect will by default use the redacted version if there is one. More specifically, it will convert the temporary PDF into a final PDF that contains only images, and will burn the redactions into these images so that the sensitive content is concealed permanently.

Exported load files containing PDFs or TIFFs will undergo a similar process. The result of this last conversion step is a PDF that has no regular machine-processable text. To verify this, simply open the PDF in a PDF reader like Acrobat and try to select the text. Because all remaining information is in plain sight, this redaction method is very safe compared to removing the sensitive text from the source file: there is, for example, no hidden metadata that could still leak the sensitive information. The downside is that the PDFs can have a large file size, as all text is represented as images, and that they need to be OCRed to make the non-concealed text accessible again for text selection, keyword search, etc.

As the final PDF is derived from the temporary PDF, the PDF export settings entered in the Export dialog only affect the non-redacted items in the export set.

The Redaction tab in the Previewer also has an Export as PDF button, which exports the current item as a redacted PDF. This PDF will be the same as when it is exported as part of a collection of items to PDF, i.e. all pages will be converted to images with their redacted parts showing as black rectangles. This option is useful when only a few redacted documents are needed or to verify the redaction export.

13.4. Mass redaction

A common redaction method is to search for a company or organization name and to review and optionally redact the search hits. Intella Connect can assist with this process: when the Redaction tab is viewed while Intella Connect’s search interface shows one or more keyword queries, the keyword search hits will be highlighted in the Redaction tab and can be redacted with the click of a button. Note that this highlighting works best on single-term queries. It may not work reliably, or at all, for more advanced queries such as phrase searches, wildcard queries, etc. The currently used keyword(s) will be shown beneath the item content.

Use the arrow buttons to move from one keyword hit to another. Click the Redact hit button to redact the currently highlighted occurrence, or click the Redact all button to redact all occurrences in the current item. Before using the Redact all button, please read the Caveats subsection below.
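The hit-by-hit workflow can be modelled with a small sketch. This is not Intella Connect's code or API: the `HitRedactor` class and its method names are hypothetical, it works on plain text rather than on a rendered PDF, and it assumes whole-word, case-insensitive matching of a single term.

```python
import re
from dataclasses import dataclass, field


@dataclass
class HitRedactor:
    """Hypothetical model of stepping through keyword hits in one item's text."""
    text: str
    term: str
    redacted: set = field(default_factory=set)
    current: int = 0

    def __post_init__(self):
        # Whole-word, case-insensitive matching of a single term (assumption).
        pattern = re.compile(rf"\b{re.escape(self.term)}\b", re.IGNORECASE)
        self.hits = [m.span() for m in pattern.finditer(self.text)]

    def next_hit(self):
        """Move to the next hit, wrapping around (the arrow button)."""
        if self.hits:
            self.current = (self.current + 1) % len(self.hits)

    def redact_hit(self):
        """Redact only the currently highlighted occurrence."""
        if self.hits:
            self.redacted.add(self.hits[self.current])

    def redact_all(self):
        """Redact every occurrence found in this item."""
        self.redacted.update(self.hits)

    def render(self):
        """Return the text with each redacted span replaced by X characters."""
        chars = list(self.text)
        for start, end in self.redacted:
            chars[start:end] = "X" * (end - start)
        return "".join(chars)


r = HitRedactor("Acme paid Acme Ltd. on behalf of ACME.", "acme")
r.next_hit()       # step to the second hit
r.redact_hit()
one = r.render()   # "Acme paid XXXX Ltd. on behalf of ACME."
r.redact_all()
all_ = r.render()  # "XXXX paid XXXX Ltd. on behalf of XXXX."
```

Redacting hit by hit leaves room to review each occurrence first, whereas `redact_all` covers only what the matcher found.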

13.5. Redaction Profiles

When the Redact action or Redaction tab in the Previewer is clicked, the PDF that is generated consists of a limited set of content and metadata properties. For example, e-mails will show their most important headers (e-mail sender and recipients, subject and sent/received dates) on the first page, followed by the e-mail body. The full SMTP headers of the e-mail are printed on one or more separate pages, followed by the list and content of the e-mail’s attachments.

When this default set of content and metadata properties is not suitable for a specific case, or different settings are desired for different types of items or different audiences, the user can define one or more redaction profiles for the case. Such a profile defines the set of content and metadata properties to be used in the redacted PDF. Additional profiles can be defined, and the preferred profile selected, in the Profile setting, which can be accessed by clicking the Gear icon next to the Redact action.

Redaction tab

To add a redaction profile to the list, click the Add button in the Redaction tab. The window that opens allows the reviewer to enter a profile name and select which content and metadata properties should be used when this redaction profile is chosen. For a detailed description of the available properties see the PDF rendering options section.

Add redaction profile window
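Conceptually, a redaction profile is a named set of properties to render. The sketch below models that, together with the rule that a profile can only be changed while the document has no redaction marks. The class names and the property names ("headers", "body", and so on) are hypothetical and do not reflect Intella Connect's internal data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RedactionProfile:
    """Hypothetical profile: a name plus the properties to render in the PDF."""
    name: str
    properties: frozenset


@dataclass
class RedactionSession:
    """Hypothetical state of one item that is open in the Redaction tab."""
    profile: RedactionProfile
    marks: list = field(default_factory=list)

    def change_profile(self, new_profile):
        # Mirrors the documented rule: no profile change while marks exist.
        if self.marks:
            raise ValueError("clear the redaction marks before changing the profile")
        self.profile = new_profile


default = RedactionProfile(
    "Default", frozenset({"headers", "body", "full_headers", "attachments"})
)
body_only = RedactionProfile("Body only", frozenset({"headers", "body"}))

session = RedactionSession(profile=default)
session.marks.append((10, 20, 110, 40))  # one mark: left, top, right, bottom
try:
    session.change_profile(body_only)
    changed = True
except ValueError:
    changed = False  # rejected, because a mark is present

session.marks.clear()
session.change_profile(body_only)  # allowed once the marks are cleared
```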

13.6. Caveats

As the purpose of redaction is to conceal sensitive information, it is vital that the reviewer takes note of the following caveats on the redaction functionality.

First, there are a number of issues to be aware of when using keyword Hit Highlighting to control the redactions. When highlighting the search hits in a PDF, the highlighted area may not exactly cover the responsive text in the PDF. The redaction rectangle then needs to be manually moved and resized. Whether this happens depends on the fonts used in the PDF: PDFs that Intella Connect has generated using texts from its own databases (e.g. pages with e-mail bodies and headers) are fine, but the highlighting may be misplaced for text in existing evidence PDFs or in Word documents that are converted to PDF. We have no control over the font characteristics used in those documents and therefore cannot guarantee correct placement of the redaction rectangle.

Another important aspect is that Hit Highlighting may not find all occurrences of the text that is searched for. For example, words that are misspelled, use a spelling variation or are hyphenated may not be found. Text inside graphics will also not be found. Note that OCR software that is used to combat this can also introduce spelling errors.
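The following sketch illustrates why exact term matching can miss occurrences. It is not how Intella Connect searches: `find_hits` assumes whole-word, case-insensitive matching, and `near_misses` is a hypothetical helper that uses Python's `difflib` to flag similar-looking words for manual review. The 0.8 similarity cutoff is an arbitrary choice for this example.

```python
import re
from difflib import SequenceMatcher


def find_hits(text, term):
    """Exact whole-word, case-insensitive matches of a single term."""
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    return [m.group(0) for m in pattern.finditer(text)]


def near_misses(text, term, cutoff=0.8):
    """Words that resemble the term but are not exact matches (review candidates)."""
    term = term.lower()
    words = {w.lower() for w in re.findall(r"[^\W\d_]+", text)}
    return sorted(
        w for w in words
        if w != term and SequenceMatcher(None, w, term).ratio() >= cutoff
    )


sample = "Contoso signed. Con-\ntoso paid. Contso (sic) replied. CONTOSO agreed."
hits = find_hits(sample, "contoso")         # finds 2 of the 4 mentions
candidates = near_misses(sample, "contoso")  # flags the misspelling "contso"
```

In this sample the hyphenated occurrence is missed by both functions, which is why a manual read-through of the item remains necessary after using Redact all.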

Finally, tables and graphs may require extra attention.

Also note that a redacted PDF rendering is only associated with the specific item it was created for, not with any duplicates of that item; duplicates need to be redacted separately. Propagating redactions to duplicates may be introduced in a future version.
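Because redactions are not shared between duplicates, it can help to check which duplicates of a redacted item still lack redactions before exporting. The sketch below shows the idea under stated assumptions: items are available as raw bytes keyed by a made-up item ID, and duplicates are identified by an identical SHA-256 content hash. Neither reflects how Intella Connect stores items or detects duplicates.

```python
import hashlib
from collections import defaultdict


def unredacted_duplicates(items, redacted_ids):
    """Return IDs of items that duplicate a redacted item but are not redacted.

    `items` maps a hypothetical item ID to its content bytes;
    `redacted_ids` is the set of IDs that already have redactions.
    """
    by_hash = defaultdict(set)
    for item_id, content in items.items():
        by_hash[hashlib.sha256(content).hexdigest()].add(item_id)

    flagged = set()
    for ids in by_hash.values():
        if ids & redacted_ids:             # the group contains a redacted item
            flagged |= ids - redacted_ids  # its other members need review
    return sorted(flagged)


items = {"A": b"memo text", "B": b"memo text", "C": b"other text"}
to_review = unredacted_duplicates(items, {"A"})  # ["B"]
```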