OpenSource Sanitize Can Speed Up KM

By Warren Jones
September 02, 2004

Just like many other knowledge organizations, we at Pillsbury Winthrop LLP have invested thousands of dollars in processes, training and technology to capture and leverage our partners' knowledge. Early on we realized that the ability to share personal knowledge and prior work would determine our place in a very competitive market. Knowledge Management had the potential to flush out communication and re-use issues, create a cohesive firm identity and ensure consistency in information quality.

With this in mind, two years ago we set out on a comprehensive KM strategy that would ultimately improve how we manage firm information. Identifying and implementing the tools and processes needed to manage electronic client, matter and expertise files took some time. Some of the tools were readily available, but others, including Sanitize by OpenSource, took significant time to either find or develop.

Once we began using our knowledge files to manage expertise, we found that most, if not all, documents created as part of a transaction include sensitive information pertinent to that transaction. This data can include deal details, such as price; company identifiers, such as name and address; and personal information, such as individuals' names, titles and phone numbers. Like the metadata of a shared file, this data can give away potentially sensitive information. Removing sensitive details manually is very time-consuming and can lead to errors and inadvertent exposure of information.

A solution to this issue was simply impossible to find. We turned to the OpenSource folks, who specialize in automatic understanding of legal text and whose products we use in other areas. They listened to our needs and developed Sanitize. Although no system can provide a 100% accurate review of all sensitive information in a document, Sanitize gives our lawyers a powerful yet easy-to-use tool for reviewing prior work and removing potentially compromising information that might otherwise go undetected.

How It Works

The OpenSource Sanitize module is fully integrated with MS-Word, the main drafting tool that our attorneys use. Just as with our metadata cleaning tool, a VBA agent (Word macro) automatically invokes Sanitize whenever a document is filed into our knowledge folders. Sanitize can also be launched via a toolbar button.

OpenSource Sanitize analyzes the text of the document and removes pre-defined data. In our case, for transaction agreements this includes prices, names of companies and people, addresses and telephone numbers. This information is substituted with dummy data of our choice. The whole process is done in the background, without the intervention of the attorney.
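To make the substitution step concrete, here is a minimal Python sketch of replacing sensitive details with dummy data. It is an illustration only, not how Sanitize itself works: the function name `substitute`, the two regular expressions, the placeholder labels and the sample sentence are all assumptions made for this example, and simple patterns like these would miss much of what a tool that analyzes legal text can catch.

```python
import re

# Hypothetical patterns for illustration; they cover only simple
# US-style dollar amounts and phone numbers.
PRICE = re.compile(r"\$\d+(?:,\d{3})*(?:\.\d{2})?")
PHONE = re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b")


def substitute(text, known_names):
    """Replace known party names, prices and phone numbers with dummy data.

    known_names maps each sensitive name to the dummy value chosen for it.
    """
    # Names are supplied by the caller (for example, the parties to the
    # matter) because they cannot be found reliably with a simple pattern.
    for name, dummy in known_names.items():
        text = text.replace(name, dummy)
    text = PRICE.sub("[PRICE]", text)
    text = PHONE.sub("[PHONE]", text)
    return text


sample = "Acme Corp. shall pay $1,250,000 to Jane Roe; call 415-555-0100."
names = {"Acme Corp.": "[COMPANY]", "Jane Roe": "[PERSON]"}
print(substitute(sample, names))
```

In this sketch the dummy values are bracketed labels, but any replacement text of the firm's choosing could be used instead.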

Before finalizing the document, Sanitize requires feedback from the user, in our case an attorney. Currently, the system is set up so that it may flag information that is not necessarily confidential (false positives) but should not miss information that may be sensitive (no false negatives).

The user can save his/her changes and return to the feedback screen later, or “finalize” the document. Finalizing the document applies all the required substitution changes and invokes the VBA agent, which then returns the document to the Pillsbury Winthrop network in Word format. The document is now safe to post in our Knowledge folders, and otherwise share across and outside the firm.
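The review-then-finalize flow can be sketched in a few lines of Python. This is a hypothetical model, not the product's actual design: the `Candidate` class, its fields and the `finalize` function are names invented for this example. Every flagged span is replaced by default, and the reviewer can only opt a span out, which mirrors the preference for false positives over false negatives.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A flagged span of text (hypothetical structure for illustration)."""
    start: int                   # offset of the first flagged character
    end: int                     # offset just past the last flagged character
    dummy: str                   # replacement text
    keep_original: bool = False  # reviewer sets True for a false positive


def finalize(text, candidates):
    """Apply every substitution the reviewer did not reject."""
    # Work from the end of the text so earlier offsets remain valid.
    for c in sorted(candidates, key=lambda c: c.start, reverse=True):
        if not c.keep_original:
            text = text[:c.start] + c.dummy + text[c.end:]
    return text


draft = "Acme Corp. will meet Buyer in May."
company = Candidate(0, len("Acme Corp."), "[COMPANY]")
month_start = draft.index("May")
month = Candidate(month_start, month_start + 3, "[PERSON]")

# The reviewer decides that "May" is a month, not a person's name.
month.keep_original = True
print(finalize(draft, [company, month]))
```

Because rejection is an explicit action, a span the reviewer never looks at is still replaced, so an oversight errs on the side of confidentiality.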

Conclusion

OpenSource Sanitize is a small but crucial component of our Knowledge Management system. It enables our attorneys to trust the system, knowing that it operates within the confidentiality standards we are committed to.

The product is young (it was created at our request just a few months ago) and still needs to be expanded to support additional document types. Nevertheless, I would recommend it to anyone who is trying to implement a Knowledge Management effort. As far as I know, this is the only working solution available, and it gets better and better with every release.



Warren Jones, Pillsbury Winthrop LLP
