Electronic discovery. This was the use that kept coming to my mind as I tested dtSearch, an immensely fast and powerful desktop and enterprise search tool. Its marketing tagline is, “Instantly search terabytes of text.” I can’t vouch for terabytes, but after testing it on a single laptop, I can say it is the fastest desktop search tool I’ve ever used. More importantly, it has the broadest range of search options of any desktop search tool I’ve used, with more than two dozen indexed, unindexed, fielded and full-text search options.
Its power and versatility were the reasons I kept thinking of e-discovery as I tested dtSearch. The Dec. 1 amendments to the Federal Rules of Civil Procedure underscore the obligation of parties in litigation to provide electronic data in discovery. That means finding the data. And finding it requires a powerful and versatile search tool. Obviously, for large-scale e-discovery projects, lawyers would want to bring in professional consultants. But for e-discovery on a smaller scale, a tool such as dtSearch could prove invaluable.
The variety of search types is impressive:
- Phrase searching finds phrases such as: due process of law.
- Boolean operators and/or/not can join search words and phrases: due process of law and not (equal protection or civil rights).
- Proximity searching finds a word or phrase within a number of words of another word or phrase: apple pie w/38 peach cobbler.
- Directed proximity searching finds a word or phrase within a number of words before another word or phrase: apple pie pre/38 peach cobbler.
- Phonic searching finds words that sound alike, like Smythe in a search for Smith.
- Stemming finds variations on endings, such as applies, applied, applying in a search for apply.
- Numeric range searching finds any number between two numbers, such as between 6 and 36.
- Macros let you store frequently used terms and include them in a search request.
- Wildcard support allows ? to hold a single letter place, and * to hold multiple letter places: apple* and not appl?sauce.
- Fuzzy searching finds words even if they are misspelled. The user can tune this feature to greater or lesser degrees of “fuzziness.”
- Concept, synonym and thesaurus searching. For example, search for “fast” and find “quick” and “speedy.”
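To make two of these search types concrete, here is a minimal Python sketch of proximity searching (w/N), directed proximity searching (pre/N) and wildcard matching. It is an illustration only, not dtSearch's code or API: the function names are hypothetical, and distance is measured between the starting words of the two phrases, which is a simplifying assumption.

```python
import re
from fnmatch import fnmatchcase


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def phrase_positions(tokens, phrase):
    """Return the start index of every occurrence of the phrase."""
    words = tokenize(phrase)
    n = len(words)
    return [i for i in range(len(tokens) - n + 1) if tokens[i:i + n] == words]


def within(text, left, right, distance, ordered=False):
    """True if `left` starts within `distance` words of `right`.

    With ordered=True, `left` must also come before `right`,
    which mimics a pre/N style search.
    """
    tokens = tokenize(text)
    for a in phrase_positions(tokens, left):
        for b in phrase_positions(tokens, right):
            if ordered and a >= b:
                continue
            if abs(a - b) <= distance:
                return True
    return False


def wildcard_match(text, pattern):
    """Return the words matching a pattern where ? is one character
    and * is any run of characters."""
    return [t for t in tokenize(text) if fnmatchcase(t, pattern.lower())]
```

Under these assumptions, `within(text, "apple pie", "peach cobbler", 38)` corresponds to the w/38 example above, and passing `ordered=True` corresponds to the pre/38 example.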
dtSearch is capable of searching more than 50 types of files, including common law-office and business formats such as Adobe Acrobat, Eudora message files, HTML, JPEG, MHT archives, MIME messages, all Microsoft Office types, OpenOffice 2.x and 1.x files, TAR, TIFF, WordPerfect, WordStar, XBase, XML and ZIP.
On top of all that, dtSearch includes features specifically designed for corporate and forensic applications. These include e-mail filtering; automatic parsing of text segments in large data blocks, such as those recovered through an “undelete” process, from unallocated computer space, or from partially recovered file fragments; language recognition algorithms for detecting text in a variety of languages; a filtering algorithm for scanning recovered data blocks using multiple text encoding detection methods; and automatic recovery of text from corrupt forensically-recovered documents.
Installation of dtSearch is quick, but before you can use it, you must create a search index. A nice feature of dtSearch is the ability to create multiple indexes. You could index your entire hard drive, if you wish, or create separate indexes for different types of files or data, such as an Outlook-only index. Initially, I chose to create an index of everything in the My Documents folder and in Outlook. This took about 30 minutes, after which dtSearch reported that it had indexed 9,145 documents and 10.9 million words.
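For readers curious why an index has to be built first, the following Python sketch shows the general idea behind an inverted index: each word is mapped in advance to the documents that contain it, so a later search is a lookup and not a scan of every file. This is a generic textbook illustration with hypothetical function names, not a description of how dtSearch stores its indexes.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each word to the set of document names that contain it.

    `documents` is a dict of {name: text}.
    """
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            index[word].add(name)
    return index


def search_all(index, query):
    """Return the documents that contain every word in the query."""
    words = re.findall(r"[a-z0-9']+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

Building the index is the slow step because every document must be read once. Searching afterward only intersects small sets of names, which is consistent with the speed described below.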
Once the index is created, searching is every bit as fast as the company promises: virtually instantaneous. Results are displayed in a horizontal split-screen. The list of matching files appears in the top pane. As you click on each file name, the document appears in the bottom pane, in its native format, with the search terms highlighted. Search results can be sorted by relevance, date or hit count. As you search, you can enter Boolean connectors manually or select them with the click of a button. You can choose whether to use fuzzy, phonic and synonym searching. You can also filter results by file types and dates.
dtSearch comes packaged with two additional search tools: CD Wizard and dtSearch Web. The CD Wizard is a handy tool that creates searchable indexes for backup CDs and DVDs. The index is stored directly on the disc and autoruns whenever the disc is inserted in a computer. The Web tool converts searchable files into HTML so that they can be viewed using a Web browser.
For me, a minor inconvenience is that dtSearch does not continually update its indexes. You have to update the index manually by selecting “Update index.” This means, for example, that a search would not find e-mails you received subsequent to the last update. On the plus side, you can schedule dtSearch to perform these updates automatically. An update to my original index, run about a week later, took less than three minutes to finish.
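The gap between the 30-minute initial build and the three-minute update suggests that an update reprocesses only what has changed. I do not know how dtSearch decides this internally, so the Python sketch below is just one plausible approach: compare each file's modification time with the time recorded at the last update. The function names are hypothetical.

```python
import os


def snapshot(root):
    """Record the modification time of every file under `root`."""
    times = {}
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            times[path] = os.path.getmtime(path)
    return times


def files_needing_reindex(current, indexed):
    """Compare two snapshots of {path: modification time}.

    Returns (changed_or_new, removed): files to reprocess and
    files to drop from the index.
    """
    changed = sorted(p for p, m in current.items() if indexed.get(p) != m)
    removed = sorted(p for p in indexed if p not in current)
    return changed, removed
```

In this sketch, an update would call `snapshot` on the indexed folder, pass the result along with the snapshot saved at the previous update, and reprocess only the files returned. Rebuilding from scratch is the same as treating the saved snapshot as empty.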
For someone used to the simplicity of, say, Google Desktop Search, dtSearch is more daunting, with a less user-friendly interface. For example, the first time I attempted to update the index, I ended up recreating it from scratch. I could find nothing in the help manual to explain why. Only after tinkering a bit did I find that I had to uncheck an option that came checked by default.
Another big difference is price. While Google Desktop Search is free, dtSearch is $199 for the single-user, desktop version. The network version starts at $800 for five users. Before you buy, you can download a fully functional, 30-day evaluation version.
But the trade-off is a far more powerful and configurable search tool. As I said at the outset, dtSearch is more than a search tool: it is an application that lawyers can use for electronic discovery and management of electronic documents. For anyone interested in a powerful, versatile desktop search tool, dtSearch is worth considering.