Navigating the Tricky Terrain of Remote and Self-Collections

By Gavin W. Manes and Tom O'Connor
July 27, 2012

Although predictive coding has been the most prominent buzzword in e-discovery circles this year, remote collection of Electronically Stored Information (ESI) remains a hot topic. Like self-collection of data, with which it is often associated, remote collection has been viewed by IT staff as a way to save time and money. But legal professionals remain skeptical. Remote collections and self-collections can endanger the integrity of both the data and the case if not handled properly. E-discovery experts are justifiably suspicious of their validity, especially when no lawyer has overseen the process. In many cases, remote collection is likely to be indefensible in court unless certain guidelines are followed throughout the process.

What is Collection?

To define ESI collection, let's first look at the standard forensic collection process. A forensically sound data acquisition is conducted in a controlled environment by an experienced forensics practitioner. The process is not invasive to the original data and does not change any data before, during or after the acquisition. For instance, in a complete forensic copy of a hard drive, all information is copied in a bit-for-bit process, including deleted files, unallocated disk space, slack space and partition waste space.
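
One common way to check that a copy really is bit-for-bit identical is to compare cryptographic hashes of the source and the image. The article does not name a specific verification method, so the sketch below is an assumption about how that check might look. The function names are hypothetical, and in-memory byte streams stand in for a drive and its images.

```python
import hashlib
import io


def stream_digest(stream, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def image_matches_source(source, image):
    """True only when every byte of the image equals the source."""
    return stream_digest(source) == stream_digest(image)


# Hypothetical stand-ins for a drive: allocated data, empty space,
# and a deleted fragment that a file-level copy would not see.
source_bytes = b"allocated file data" + b"\x00" * 64 + b"deleted fragment"
good_image = bytes(source_bytes)          # faithful copy
bad_image = source_bytes[:-1] + b"X"      # one byte altered

print(image_matches_source(io.BytesIO(source_bytes), io.BytesIO(good_image)))  # True
print(image_matches_source(io.BytesIO(source_bytes), io.BytesIO(bad_image)))   # False
```

A single changed byte anywhere on the drive, including in deleted or unallocated areas, produces a different digest, which is why a matching hash is useful evidence that the acquisition changed nothing.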

Collection may be accomplished in several ways. The easiest method is to remove the original drive from its native environment and connect it to a computer that has hardware and software optimized to support the forensic process. The preferred method, however, is to use hardware write-blocking technology specifically designed for the forensics process. NIST (National Institute of Standards and Technology; www.nist.gov) provides a list of tested write-blocking technologies.

When hardware write-blocking or removal of the hard drive is not possible, the drive may be left in the computer and the computer booted using a modified version of an operating system which has been “neutered” to prevent it from changing any data on disk drives connected to the computer.

But all of these approaches require a technician to be physically present. When you factor in hourly fees and travel expenses, the cost of acquiring one drive in a forensically sound manner can easily exceed $1,000. Additionally, the often tight timelines imposed by the Federal Rules of Civil Procedure can make timely collection of relevant information a daunting challenge.

Remote Collection

But what if the collection could be done without a technician physically present? Enterprise IT professionals have collected some types of data remotely for many years. Smart routers not only control IP connections and monitor overall network traffic, but can also perform statistical analysis and even provide alarm notification systems. Remote measurement and data acquisition of information regarding humidity levels in the PC server room and temperatures in the production department have long been commonplace.

So could those same technologies be used to collect data in the e-discovery process?

Remote Collection through Network

Collecting data from remote offices and laptops/desktops can now be done through a local self-managed agent on the LAN or WAN. This allows IT staff to collect at the most convenient time, either with an index of the data to enable searching and culling or without an index for a quicker collection. Since it is performed over the network, this collection can include all ESI sources, such as encrypted laptops, thumb drives and even recycle bins.

Remote Collection through Hardware Device

Remote data collection can also be done by sending a small piece of hardware, similar to a thumb drive, to a location where it is used to gather data in a forensically sound manner. The device is plugged into the target computer, where it automatically finds data that has been deemed relevant and saves it either to the device or to an accompanying hard drive. This solution has been in use for several years. AccessData, Guidance Software and many others have hardware devices that can be purchased and configured to perform automated collection tasks.

Issues with Remote Collection

But does the pressure on IT organizations to lower budgets drive them to embrace solutions that are technologically sound but legally risky? Do these remote tools meet the current litigation hold standards imposed by the courts?

Current legal standards require a repeatable, defensible process, while IT standards are fluid and ever-changing. Thus, a typical IT procedure used to export data from a system may truncate or alter the data in a way that fails to comply with discovery rules, though it may have no impact on daily business activities. In the dynamic world of IT tasks, lack of training in legal requirements can be costly.

At the forefront of the legal standard is the need for active supervision by an attorney. It seems unlikely this will occur in most network-based collections. In cases where an attorney, especially outside counsel, does become actively involved, the question arises whether the time billed for that activity will leave any savings over physical collection by an onsite technician.

With regard to collection through portable devices, the problems inherent in self-collection grow larger. Most remote collection tools involve the pre-designation of specific data and do not perform a true bit-for-bit collection. In essence, these tools are set to collect items with a particular file extension or header, which means they do not collect everything. Therefore, self-collection tools would likely not include any deleted e-mails in slack or free space.

E-mail collection presents unique problems. It is likely that a self-collection device will gather everything without specifying account names, and it would be difficult to train a device to target one mail account. Webmail (such as Gmail or Hotmail) can be difficult to find even during a typical human-powered investigation, so the likelihood of locating that information using a self-collection tool is small.

Furthermore, the manufacturers of these devices have clearly assumed that the amount of information to be collected is fairly small, which may not be true in cases where graphics or e-mail files are the target. Irregular file types will be skipped since the search is based on extensions, and unknown extensions would not likely be located by an automated tool.
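
To make the limitation concrete, here is a minimal sketch of selection by extension or header. It is not any vendor's actual logic. The allowlists, function names and sample files are all hypothetical, and real tools presumably ship far longer lists.

```python
# Hypothetical allowlists chosen for illustration only.
TARGET_EXTENSIONS = {".doc", ".pdf", ".pst"}
TARGET_HEADERS = {
    b"%PDF": "PDF document",
    b"\xd0\xcf\x11\xe0": "OLE2 container (legacy Office file)",
}


def selected_by_extension(name):
    """True when the file name ends in one of the targeted extensions."""
    dot = name.rfind(".")
    return dot != -1 and name[dot:].lower() in TARGET_EXTENSIONS


def selected_by_header(content):
    """True when the file starts with one of the targeted byte signatures."""
    return any(content.startswith(magic) for magic in TARGET_HEADERS)


def collect(files):
    """Return the names of files that either rule would pick up."""
    return [
        name
        for name, content in files
        if selected_by_extension(name) or selected_by_header(content)
    ]


sample_files = [
    ("contract.pdf", b"%PDF-1.4 body"),          # known extension
    ("memo.doc", b"\xd0\xcf\x11\xe0 body"),      # known extension
    ("invoice.dat", b"%PDF-1.4 renamed"),        # renamed, caught by header
    ("design.xyz", b"\x01\x02 proprietary"),     # unknown on both counts
]

collected = collect(sample_files)
missed = [name for name, _ in sample_files if name not in collected]
print(collected)  # ['contract.pdf', 'memo.doc', 'invoice.dat']
print(missed)     # ['design.xyz']
```

The header check rescues a renamed file, but a file whose extension and signature are both off the list is never collected, and nothing here looks at deleted material in slack or free space.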

Since self-collection searches of this nature are most likely based on keywords or extensions, encrypted files would not necessarily be located. It is possible these tools could find files if they are single-encrypted, such as a password-protected Word document.

It is most likely that these collection devices operate based on the 80/20 rule: If they collect the most common extensions or headers, you are likely to get at least 80% of what you want. But for a full understanding of drive contents, it would be necessary to have a person skilled in the art of forensic investigation collect the information from the drive's slack and free space in order to make educated estimates about deleted materials.

Another important question: When something goes wrong, who will testify? Are the IT professionals ready to appear in court? Will the manufacturer of a remote collection agent or hardware device be called to the stand when key evidence has not been properly collected? Who pays for that testimony and who accepts the liability for the error? This is a similar issue to predictive coding: If no one really knows how it works, who will testify when something goes wrong?

Remote Assisted Collection

Remote assisted collection is a combination of the two solutions mentioned above that includes an element of supervision.

A vendor pre-loads a hard drive with imaging software and ships it to the customer, who plugs it into the computer to be imaged. The collection supervisor then sets up a remote connection to that computer and either talks the customer through the collection process while watching his or her actions, or takes over and performs the collection. Activity can also be recorded through the remote connection as an additional verification step.

This process also allows collection supervisors to be in the chain of custody even if they are not physically onsite. Multiple parties can watch this process or there can be multiple collection supervisors. This process works even if it is not a full forensic copy since there is a human decision about what to collect.
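
One way such a supervised session might be documented is a custody log in which each entry carries a hash of the entry before it, so that later edits are detectable. This is an illustrative assumption rather than a process the authors describe. The field names, actors and timestamps below are hypothetical.

```python
import hashlib
import json


def _entry_hash(body):
    """Hash an entry's fields in a stable (sorted-key) JSON form."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_event(log, actor, action, timestamp):
    """Add a custody event that is linked to the previous entry's hash."""
    body = {
        "actor": actor,
        "action": action,
        "timestamp": timestamp,
        "previous_hash": log[-1]["entry_hash"] if log else "",
    }
    log.append({**body, "entry_hash": _entry_hash(body)})
    return log


def log_is_intact(log):
    """Recompute every hash and link; False if any entry was altered."""
    previous = ""
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if entry["previous_hash"] != previous:
            return False
        if entry["entry_hash"] != _entry_hash(body):
            return False
        previous = entry["entry_hash"]
    return True


# Hypothetical session: names and times are placeholders.
custody_log = []
append_event(custody_log, "customer", "connected vendor drive", "T+00:00")
append_event(custody_log, "supervisor", "opened remote session", "T+00:05")
append_event(custody_log, "supervisor", "started imaging", "T+00:10")

print(log_is_intact(custody_log))            # True
custody_log[1]["actor"] = "someone else"     # simulate a later edit
print(log_is_intact(custody_log))            # False
```

A log like this does not replace attorney supervision or a recording of the session. It only gives whoever testifies a record that can be shown to be unaltered.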

Conclusion

Although it is possible to automate collection processes both on a network and with devices provided for self-collection, the main issue with forensic collection isn't necessarily the automation, but the proper handling of evidence from the beginning of the collection process all the way through production of that information to the appropriate parties. Current legal standards call for a repeatable, defensible process, which means the active involvement of attorneys in each step of the process. Fast and cost-effective collection methods also need to be fully defensible and legally sound.


Gavin W. Manes, Ph.D., is president and CEO of Avansic, a Tulsa-based company that provides ESI processing, e-discovery, and digital forensics services to law firms and companies. Manes has briefed the White House, Department of the Interior, the National Security Council, and the Pentagon on computer security and forensics issues. Tom O'Connor is director of professional services at Avansic. Based in New Orleans, O'Connor is best known for his work in e-discovery, which includes assisting firms and corporate counsel in matters of retention policies, litigation holds and document exchange protocols.
