Choosing eDiscovery software is tough, given all the options on the market. Our directory alone lists 68. While it allows you to filter by features, that’s not helpful if you don’t know what the features are!
Here’s a brief glossary of eDiscovery terms to help you narrow down your choices and make the best decision for your firm or legal department.
Courts can impose significant sanctions if you don’t take proper steps to preserve electronic information when you’re a party to litigation. But due diligence can provide you with a ‘safe harbor’ from sanctions. Protect yourself by implementing reasonable procedures to preserve relevant electronic data, which means operating your automated systems in ‘good faith.’
The first thing a good auto-classification system does is separate the wheat from the chaff. The Federal Rules of Civil Procedure and numerous state mirror statutes stipulate that all parties to a federal lawsuit have a legal responsibility to preserve relevant electronic information. Figuring out what to keep and what to toss reduces litigation risk and increases regulatory compliance.
Not storing disposable content also makes storage and discovery much cheaper. A good auto-classification feature reliably identifies disposable content, or content that no longer has business or legal value.
Question for vendors: How does your system identify disposable content?
Auto-classification is one of those features that can mean a lot of different things. You need more information from your vendor than “it offers auto-classification.” At its most basic, auto-classification can just mean that the software assigns keywords to documents based on their contents.
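To make the "most basic" case concrete, here is a minimal sketch of keyword-based classification in Python. The categories and trigger terms in `RULES` are invented for illustration; no vendor's product is being described, and real systems are far more sophisticated.

```python
import re

# Hypothetical rules: category -> terms that trigger it (assumed for this example).
RULES = {
    "contract": {"agreement", "indemnify", "termination"},
    "hr": {"payroll", "benefits", "resignation"},
}

def classify(text: str, rules: dict[str, set[str]] = RULES) -> list[str]:
    """Return the sorted categories whose trigger terms appear in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(cat for cat, terms in rules.items() if words & terms)

print(classify("Please review the termination clause in the agreement."))
```

A document that matches no rule gets an empty list, which is exactly the kind of gap worth asking a vendor about.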
Higher-end eDiscovery software can be “taught” how to classify information by a subject matter expert for any given category of records.
Question for vendors: Does your software learn how to classify information?
Good eDiscovery software classifies based on statistically relevant sampling and quality control. This allows you to easily demonstrate a highly defensible, transparent approach to information governance. When you can defend your practices, you minimize your risk of regulatory fines and eDiscovery sanctions.
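As a rough illustration of sampling-based quality control, the sketch below draws a random sample of classified documents for human review and estimates accuracy from the reviewer's results. The function names are made up for this example, and the normal-approximation margin of error is one common textbook choice, not necessarily what any given vendor uses.

```python
import math
import random

def sample_for_qc(doc_ids, sample_size, seed=0):
    """Draw a reproducible simple random sample of document IDs."""
    ids = list(doc_ids)
    rng = random.Random(seed)  # fixed seed so the sample can be re-created later
    return rng.sample(ids, min(sample_size, len(ids)))

def accuracy_with_margin(correct, total, z=1.96):
    """Observed accuracy plus a normal-approximation margin of error (z=1.96 is about 95%)."""
    p = correct / total
    margin = z * math.sqrt(p * (1 - p) / total)
    return p, margin

# Hypothetical numbers: 10,000 classified documents, 400 sampled, 380 confirmed correct.
sample = sample_for_qc(range(10_000), 400)
accuracy, margin = accuracy_with_margin(correct=380, total=400)
print(f"accuracy {accuracy:.1%} +/- {margin:.1%}")
```

Recording the seed and sample size is what lets you show your work later if the process is challenged.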
Auto-classification generally produces more consistent results than having a team of lawyers classify documents by hand. It can also find and eliminate duplicates.
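Duplicate detection is often explained in terms of content hashing. Here is a minimal sketch, assuming documents are available as raw bytes keyed by file name (the `deduplicate` helper is hypothetical). It only catches exact byte-for-byte copies; near-duplicates need other techniques.

```python
import hashlib

def deduplicate(documents: dict[str, bytes]) -> tuple[dict[str, bytes], dict[str, str]]:
    """Keep the first document per SHA-256 digest; map each duplicate to the copy kept."""
    seen: dict[str, str] = {}        # digest -> name of the first document with that content
    unique: dict[str, bytes] = {}
    duplicates: dict[str, str] = {}  # duplicate name -> name of the retained copy
    for name, content in documents.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest in seen:
            duplicates[name] = seen[digest]
        else:
            seen[digest] = name
            unique[name] = content
    return unique, duplicates

docs = {"a.txt": b"hello", "b.txt": b"hello", "c.txt": b"world"}
unique, duplicates = deduplicate(docs)
print(sorted(unique), duplicates)
```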
According to eDiscovery expert George Socha of the Socha-Gelbmann Electronic Discovery Survey, there are four categories of electronic discovery formats in terms of production, review, and processing:
- True native: What most parties have in mind when asking for native production. Copies of the original documents with metadata intact in the format created by the authoring application, like DOC or XLS.
- Near native: A copy of a native file where the content and metadata are electronically accessible. Examples include a Word document that has been converted but retains the searchability and some metadata of the original file, and electronically converted, searchable PDFs.
- Near paper: TIFF or PDF files that must undergo Optical Character Recognition (OCR) before they can be searched or indexed. OCR can yield imperfect results in terms of search accuracy, and the results are generally inferior to electronically originated documents converted to PDF with text intact. Also, the text must be sent separately, usually as a TXT file, requiring that both the TXT file and TIFF image be reviewed and redacted separately.
- Actual paper: Documents that originated in paper form or digital files that have been printed to paper. Clearly, paper offers no searchability or other time-saving electronic review methods.
You want an eDiscovery solution that can save, analyze, and hand over information in its native format, that is, the format the data had when it was created or obtained by your organization.
You want your eDiscovery platform to retain metadata. This information includes date created/modified, author, etc. It also includes hidden material that does not appear when a document is printed out, such as hidden rows, cells, and formulas in Excel, or Track Changes, comments, and markup in Word.
Saving information in its native format helps you avoid making copies of the same data in different formats.
Again, not every vendor will be able to save and categorize all formats. A vendor might boast it offers native format, but not include Facebook messages, for instance.
Question for vendors: Which formats does your system work with?
eDiscovery software vendor NextPoint defines eDiscovery analytics as “any tool or set of tools used to provide data mining, event tracking, and reports to provide insight and analysis for document review.”
Analytics tools include functionality ranging from review metrics to advanced data mining applications. Case analytics can also refer to tools to review and analyze ESI.
Question for vendors: What metrics does your software track?
In addition to making your discovery process more efficient and reliable, good analytics can also make it more defensible and repeatable.
It’s all part of taking ‘reasonable steps’ to prevent inadvertent disclosure. From the explanatory note to Federal Rules of Evidence Rule 502 (regarding inadvertent disclosure of privileged material and work product):
Depending on the circumstances, a party that uses advanced analytical software applications and linguistic tools in screening for privilege and work product may be found to have taken ‘reasonable steps’ to prevent inadvertent disclosure. The implementation of an efficient system of records management before litigation may also be relevant.
Question for vendors: How does the analytics data you provide make the discovery process more efficient, reliable, defensible, and repeatable?
Some vendors offer graphical views of data in their eDiscovery applications, often in a dashboard, giving review teams easy access to progress and status reports.
Question for vendors: How is the case analytics information presented?
Discussion threads on content allow reviewers and other users to communicate easily about a particular piece of information.
OCR (Optical Character Recognition)
OCR scans text contained in an image and converts it into searchable text. This is often used for scanned paper documents.
This feature allows users to view a document’s progress through the eDiscovery process. It shows who viewed a document and when.
Basically, this feature refers to scanning paper documents and running OCR on them.
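To picture what "who viewed a document and when" looks like as data, here is a small sketch of an append-only log. The `AuditTrail` class and its fields are assumptions made for this example, not a description of any particular platform.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass
class AuditTrail:
    """Append-only record of who did what to which document, and when."""
    events: list = field(default_factory=list)  # (doc_id, user, action, timestamp) tuples

    def record(self, doc_id: str, user: str, action: str,
               when: Optional[datetime] = None) -> None:
        # Default to the current UTC time when no timestamp is supplied.
        when = when or datetime.now(timezone.utc)
        self.events.append((doc_id, user, action, when))

    def history(self, doc_id: str) -> list:
        """Return (user, action, timestamp) tuples for one document, oldest first."""
        rows = [(u, a, t) for d, u, a, t in self.events if d == doc_id]
        return sorted(rows, key=lambda row: row[2])

trail = AuditTrail()
trail.record("DOC-1", "alice", "viewed")
trail.record("DOC-1", "bob", "tagged")
for user, action, when in trail.history("DOC-1"):
    print(user, action, when.isoformat())
```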
This allows your eDiscovery platform to extract and store the metadata (like author, date, and topic) of your ESI (electronically stored information).
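For email, a common kind of ESI, the author, date, and subject live in the message headers. The sketch below uses Python's standard `email` package to pull them out. The sample message and the `extract_metadata` helper are invented for illustration, and other file types (Word, Excel, PDF) would need different tooling.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

def extract_metadata(raw_email: str) -> dict:
    """Pull author, date, and subject from an email message's headers."""
    msg = message_from_string(raw_email)
    date_header = msg["Date"]
    return {
        "author": msg["From"],
        "date": parsedate_to_datetime(date_header) if date_header else None,
        "subject": msg["Subject"],
    }

# Made-up sample message for the example.
raw = (
    "From: Alice Example <alice@example.com>\n"
    "Date: Mon, 01 Jan 2024 09:30:00 +0000\n"
    "Subject: Q4 contract renewal\n"
    "\n"
    "Body text here.\n"
)
meta = extract_metadata(raw)
print(meta["author"], meta["date"], meta["subject"])
```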
Some eDiscovery vendors will sort documents into groups based on related topics. Users can see an overview of the topic as well as similar concepts in documents. This helps users see common themes and patterns among electronic documents. It’s especially helpful for litigators.
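One simple way to picture topic grouping is to put documents that share enough vocabulary into the same cluster. The sketch below does that with Jaccard similarity over word sets and a greedy single pass. The stopword list, the 0.3 threshold, and the sample documents are all assumptions for this example; commercial tools typically use much richer concept models.

```python
import re

# Small illustrative stopword list (assumed, not exhaustive).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "for", "on", "is"}

def tokens(text: str) -> set[str]:
    """Lowercased words in the text, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def jaccard(a: set[str], b: set[str]) -> float:
    """Share of words the two sets have in common (0.0 when both are empty)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster(documents: dict[str, str], threshold: float = 0.3) -> list[list[str]]:
    """Greedy pass: a document joins the first cluster whose seed document is similar enough."""
    clusters = []  # list of (seed_tokens, member_ids)
    for doc_id, text in documents.items():
        doc_tokens = tokens(text)
        for seed_tokens, members in clusters:
            if jaccard(doc_tokens, seed_tokens) >= threshold:
                members.append(doc_id)
                break
        else:
            clusters.append((doc_tokens, [doc_id]))
    return [members for _, members in clusters]

docs = {
    "d1": "Merger agreement draft for Acme",
    "d2": "Revised merger agreement for Acme",
    "d3": "Payroll schedule and benefits summary",
}
print(cluster(docs))
```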
Storage limit refers to how much information you can store in your eDiscovery platform. You want to be sure you don’t end up having to pay more than you expected in order to store all your data.
Document indexing refers to associating records with keywords for a more efficient search and review process.
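A classic way to associate records with keywords is an inverted index, which maps each word to the documents that contain it. Here is a minimal sketch. The function names and sample records are hypothetical, and production indexes also handle things like stemming, phrases, and ranking.

```python
import re
from collections import defaultdict

def build_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercased word to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index

def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return IDs of documents that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

records = {
    "r1": "Lease agreement for the Boston office",
    "r2": "Boston payroll report",
    "r3": "Office lease renewal notice",
}
index = build_index(records)
print(sorted(search(index, "lease office")))
```

Because the lookup goes straight from keyword to documents, a search does not have to re-read every record each time.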
In some eDiscovery platforms you can comment on, highlight, and underline specific parts of a document.
Annotations can be made private either by password protection or with the ability to send out documents without annotations included, while retaining the annotations for internal use.
Review, Export, Upload Speed
Although it is not often discussed, eDiscovery and review are extremely time-intensive, and therefore expensive, endeavors. Speeding up the process can save millions.
Hopefully this brief glossary of eDiscovery terms will help you narrow down your choices and make the best electronic discovery software decision for your firm or legal department. What terms should we include in our next one? Let us know in the comments!