Successful Data Migration

By David Hartmann and Scott Giordano
March 29, 2013

When corporate legal and IT departments deploy new enterprise software, migrating legacy data into the new system is usually one of the larger challenges. With e-discovery software, this challenge is exacerbated because matter information may be contained in legacy systems or in a collection of spreadsheets or other ad hoc tools. The challenge also presents unique risks: lost or altered electronically stored information (ESI) or audit trails can lead opposing counsel to question the integrity of the entire e-discovery process, with judicial sanctions looming. Put simply, implementation teams have to get it right the first time. It's easy to think of data migrations purely in terms of technical requirements. But like any complex project, they must be approached as a process, involving various stakeholders and a carefully defined sequence of activities.

The essence of any data migration project involves the following stages:

Gathering project requirements and assessing related parameters;
Defining and analyzing the source data;
Identifying and mapping the information;
Extracting and ingesting the data; and
Validating the results, gaining user acceptance and going live.

Project Requirements and Assessing Parameters

Gathering requirements for data migration initiatives involves assigning project sponsors and having them answer potentially difficult threshold questions in advance of deployment, including:

What are the overall goals of the migration?
What are the risks of going forward?
How quickly does the project need to be completed?
What information do we want to move?
Do we have the right people on the project team?

Having a common understanding of project goals is very important. It is easy to assume that data migrations involve the simple goal of transferring all data from one system to another. In fact, transferring certain ESI previously stored on the legacy system may not be necessary, or even desirable, depending on the nature of the systems involved. What's more, data migrations often serve as good opportunities for organizations to assess the data and determine what ESI, if any, can be safely discarded without threat of legal consequence. In this sense, some organizations leverage data migrations as an opportunity to assess and 'clean up' their data as part of defensible data deletion initiatives. It's important that such priorities be woven into the overall project goals and evaluated for success at the end of the project.

Ensuring the project has the right team members is also very important. Successful migrations require subject matter experts who can identify significant risks and requirements, such as the precise attributes of the data elements, knowledge about the organization's IT infrastructure, and what ESI is currently under legal hold or has been triggered for preservation for pending legal matters. For example, a database administrator (DBA) knows the myriad formats and attributes of the data elements of a given repository; that expertise is necessary for defining the protocols needed for successful integration with and migration to the new e-discovery system. See Figure 1, below.

[Figure 1]

Defining and Analyzing the Source Data

Once the project parameters have been defined and the project team assembled, the next step involves defining and analyzing the source data. Its structure must match the format required by the destination system. Even the smallest discrepancies can result in a flawed migration. Every data element will have a particular format and attributes. For example, when formatting a cell in an Excel spreadsheet, users have formatting options such as date, time, percentage and scientific. These options are necessary because IT systems expect information in a particular format and will not process it otherwise. Dates are a particularly good example of this consideration because they can be formatted in any number of ways: dd-mm-yyyy, dd-mm-yy, yyyy-mm-dd and many more.
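The date problem described above can be sketched in a few lines of Python. This is only an illustration, assuming the destination system expects yyyy-mm-dd and that the source data uses only the formats in the hypothetical KNOWN_FORMATS list. The function name normalize_date is likewise made up for the example.

```python
from datetime import datetime

# Hypothetical list of formats found in the source data, tried in order.
KNOWN_FORMATS = ("%Y-%m-%d", "%d-%m-%Y", "%d-%m-%y")


def normalize_date(raw: str) -> str:
    """Return the date as yyyy-mm-dd, or raise ValueError if no format matches."""
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # Try the next known format.
    raise ValueError(f"unrecognized date format: {raw!r}")
```

Parsing alone cannot tell whether a value such as 03-04-2013 is day-first or month-first. The convention used by each source has to be confirmed by someone who knows the data, such as the DBA, before a list like this is trusted.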

Adding a further wrinkle, source data may come from a variety of ad hoc and/or formal e-discovery systems, such as matter management systems, review point tools or Excel spreadsheets, or from enterprise-managed systems like Microsoft Access, SharePoint or Lotus Notes. Migrating data from Lotus Notes poses unique challenges of its own, as most Notes systems are developed in-house and contain data structures that are not consistent with industry standards. Conversely, an existing e-discovery point tool might contain more universal data structures but relate and organize the data very differently from the destination system.

Identifying and Mapping the Information

The next consideration is how source data will be 'mapped' to the new system. In the e-discovery context, matter, legal hold and custodian records will each have a unique configuration. Some tasks for mapping include:

Identifying what ESI from a particular source needs to be moved;
Understanding where that information will appear in the new system;
Defining what type of character coding, field types and formatting are involved; and
Identifying the unique identifiers.

The first task is potentially the most difficult because of the record volumes involved. For example, a legal department may have 2,000 active matters with an average of 10 holds and 100 custodians per matter, creating 200,000 record combinations requiring migration. Add to this the history of a given legal hold, which itself may contain scores of entries, and suddenly the potential volume has exploded. Referring back to the project goals, the decision whether to migrate legal hold histories or just the current active elements of a hold affects how a legal department can address a failure during a hold process. Not keeping the hold histories carries the risk that, if there is a failure with a given hold, it may be difficult or impossible to reconstruct precisely what went wrong, and potentially exculpatory information will not be available.
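How the volume estimate works out depends on how the records relate to one another, which the example above does not spell out. The following back-of-the-envelope sketch shows a few readings of the same figures. The number of history entries per hold is a hypothetical value chosen only to stand in for 'scores of entries'.

```python
matters = 2_000
holds_per_matter = 10
custodians_per_matter = 100

# One record per legal hold.
hold_records = matters * holds_per_matter

# One record per matter-custodian pair; this matches the 200,000 figure.
matter_custodian_records = matters * custodians_per_matter

# Upper bound if every custodian on a matter were attached to every hold on it.
hold_custodian_records = matters * holds_per_matter * custodians_per_matter

# Hypothetical: 40 history entries per hold, if hold histories are migrated.
history_entries_per_hold = 40
history_records = hold_records * history_entries_per_hold

print(hold_records, matter_custodian_records, hold_custodian_records, history_records)
```

Under these assumptions, migrating hold histories alone adds several times as many records as there are matter-custodian pairs. This is why the decision about histories belongs in the project goals.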

For e-discovery systems, the smallest details must be taken into account. For example, with legal holds, a source system Matter Name field may allow for 256 characters with special characters permitted, like the '#' symbol, in the name of the matter. The destination system might not allow for such symbols or as many characters in the corresponding field. The identification of unique identifiers within the data set is especially critical for mapping. Data, in its rawest form, is decentralized and seemingly random. Unique identifiers are the elements within data sets that link records together in a logical way. Using a legal hold example, unique identifiers are what allow a system to precisely recognize the connection between a particular matter, all the legal holds that fall under it and the implicated custodians. In short, it is impossible to successfully map data to a new environment without first understanding how it's connected within the legacy system.
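Both checks described above can be run before any data is moved. The sketch below is illustrative only. The destination limits (DEST_MAX_LEN and the allowed character set), the field names matter_id and hold_id, and both function names are assumptions, not properties of any particular e-discovery product.

```python
import re

# Hypothetical destination constraints for the Matter Name field.
DEST_MAX_LEN = 100
DEST_ALLOWED = re.compile(r"[A-Za-z0-9 .,'&()-]*")


def matter_name_problems(name: str) -> list[str]:
    """List the reasons a Matter Name would not fit the destination field."""
    problems = []
    if len(name) > DEST_MAX_LEN:
        problems.append(f"too long ({len(name)} > {DEST_MAX_LEN})")
    if not DEST_ALLOWED.fullmatch(name):
        problems.append("contains characters the destination does not allow")
    return problems


def orphaned_holds(matters: list[dict], holds: list[dict]) -> list[str]:
    """Return IDs of holds whose matter_id matches no matter record."""
    matter_ids = {m["matter_id"] for m in matters}
    return [h["hold_id"] for h in holds if h["matter_id"] not in matter_ids]
```

For instance, matter_name_problems("Smith v. Jones #42") reports the disallowed '#' character. A non-empty result from orphaned_holds means the unique identifiers do not link every hold back to a matter, so the relationships are not yet understood well enough to map.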

Extracting and Ingesting the Data

After the mapping strategy and configuration are complete, a test migration with sample data should be conducted. Data can be extracted from the source system and ingested into the destination system in a variety of ways. Two popular methods are eXtensible Markup Language (XML) and comma-separated values (CSV) files. In the former, all of the formatting information is included with the data values, so that an XML-enabled system receiving the data will 'know' everything about it, including where to place it. In the latter, every data element of a given record is copied into a text file in which each data value is separated by a 'delimiter,' a character such as a comma that marks the boundary between values. Each subsequent record is copied in the same way, one after another, with the same number of data elements in the same order but with that record's own values, until all records are extracted. A utility in the new system then reads each of these elements, using the delimiter as a guide, and copies them (i.e., ingests them) into their proper fields in the new matter, legal hold or custodian record.
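The CSV round trip described above can be sketched with Python's standard csv module. The sample records, the column names and the FIELD_MAP from source columns to destination fields are all hypothetical. A real migration would use the vendor's own extraction and ingestion utilities.

```python
import csv
import io

# Hypothetical source records and source-column to destination-field mapping.
SOURCE_RECORDS = [
    {"MatterName": "Smith v. Jones", "HoldID": "H-001", "Custodian": "A. Lee"},
    {"MatterName": "Acme Audit", "HoldID": "H-002", "Custodian": "B. Cruz"},
]
FIELD_MAP = {"MatterName": "matter_name", "HoldID": "hold_id", "Custodian": "custodian"}


def extract(records):
    """Write a header row, then one delimited line per record."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(FIELD_MAP))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def ingest(csv_text):
    """Read each line back and copy the values into destination field names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{FIELD_MAP[col]: value for col, value in row.items()} for row in reader]


migrated = ingest(extract(SOURCE_RECORDS))
```

Every record carries the same columns in the same order, and the ingestion step relies entirely on the delimiter and the mapping to decide where each value lands.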

The granular nature of the CSV method underscores the potential pitfalls of the ingestion process. For example, if the delimiter configured for the extraction file is a comma and the name of the matter has a comma in it, the Matter Name from the source system will be read as two fields instead of one. That means the portion of the Matter Name after the comma is not ingested properly and is likely mapped to whatever field comes after the Matter Name in the migration process. See Figure 2, below.
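The comma pitfall is easy to reproduce. In this sketch the record values are invented. The 'safer' variant assumes that the ingestion utility honors standard CSV quoting, which should be confirmed for the actual destination system.

```python
import csv
import io

# Hypothetical record: Matter Name, hold ID, custodian.
record = ["Smith, Jones & Co. v. Acme", "H-001", "A. Lee"]

# Naive extraction: join values with commas and apply no quoting.
naive_line = ",".join(record)
naive_fields = naive_line.split(",")  # Matter Name is split into two fields.

# Safer extraction: the csv module quotes any value containing the delimiter.
buffer = io.StringIO()
csv.writer(buffer).writerow(record)
quoted_line = buffer.getvalue()
quoted_fields = next(csv.reader(io.StringIO(quoted_line)))

print(len(naive_fields), len(quoted_fields))
```

In the naive case, ' Jones & Co. v. Acme' lands in the field meant for the hold ID and every later value shifts by one. If the destination utility does not support quoting, another option is a delimiter that cannot occur in the data.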

[Figure 2]

Validating the Results, Gaining User Acceptance and Going Live

Moving the system into production involves installing the new e-discovery software behind the corporate firewall on the end users' hardware, backing up the source data and then conducting the migration. During the test migration process, a utility program will check that the source and target data elements match and, if they don't, will note the mismatch (or any other problem) in an error log. The project team will review these logs and attempt to resolve the errors. The process can be very tedious and time-consuming because errors can stem from so many sources. Multiple test migrations may be necessary to resolve just one error.
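The kind of comparison such a utility performs can be sketched as follows. This is a simplified illustration, not any vendor's tool. The function name compare_records, the hold_id key and the log line format are all assumptions.

```python
def compare_records(source, target, key="hold_id"):
    """Return error-log lines for source records that are missing or differ in target."""
    errors = []
    target_by_key = {rec[key]: rec for rec in target}
    for src in source:
        tgt = target_by_key.get(src[key])
        if tgt is None:
            errors.append(f"{src[key]}: missing in target")
            continue
        # Compare every source field against the migrated record.
        for field, value in src.items():
            if tgt.get(field) != value:
                errors.append(
                    f"{src[key]}: field {field!r} source={value!r} target={tgt.get(field)!r}"
                )
    return errors
```

An empty list means every source record arrived with matching values. This sketch does not flag extra records that exist only in the target, so a full check would also compare record counts in both directions.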

Once the test migration is complete and the migration model is validated, the configuration will be deployed to intermediary hardware for User Acceptance Testing (UAT), where the end users will begin working with it as they would in a typical work day in order to determine whether it functions as planned. UAT can last anywhere from weeks to months, depending on a number of factors, including the complexity of the migration, the volume of information being transferred and the urgency with which the new system must be up and running. Once the business users (the legal department team) and the IT team formally accept the proposed new system, it will be moved off the intermediary hardware and into production. The migration should be monitored 'live' in order to catch any problems and, if necessary, stop the process.

The business users and IT team will conduct a final evaluation before the system is accepted to 'go live' by revisiting the success criteria set forth at the beginning of the project and determining whether they have been met with the integrity of the data intact. In addition, the project sponsors must accept the final result of the migration and be prepared to cease operational practices that would cause old systems to generate new data.

Conclusion

Data migration is a necessary evil of the e-discovery world. An organization's e-discovery requirements can quickly change and necessitate the acquisition of a new system. While the systems that support e-discovery processes may rapidly change, the underlying information within them cannot. The process of moving that information is inherently complex and fraught with risk. Those risks, however, can be significantly mitigated by following the steps outlined in this article.

David Hartmann is Director of Client Success at Exterro. With more than 15 years of experience in complex software implementations, Hartmann heads up the planning and execution of project charters and criteria processes. Scott Giordano is corporate technology counsel at Exterro. Giordano holds both Certified Information Systems Security Professional (CISSP) and Certified Information Privacy Professional (CIPP) certifications and serves as Exterro's subject matter expert on the intersection of law and technology.
