To Compress or Not to Compress

By Dean Sappey
November 30, 2015

Prolific document generation is often the mark of a successful law firm. However, as a result, the volume of the firm's electronic storage dramatically increases every year. In addition to electronically created documents, paper documents are now regularly scanned and converted to electronic files, adding more storage bulk. Proper recordkeeping is critical to support clients and compliance regulations, but the document store can become unwieldy and very expensive for a firm to maintain.

Legal IT professionals are tasked with keeping expenses low while also ensuring the firm supports client and compliance requirements. Compression can significantly reduce document file sizes in a Document Management System (DMS), SharePoint, Windows file system, or any content repository, as an automated backend process.

Reducing document file size reduces storage costs and speeds up file transfers when downloading or sending these documents via e-mail. Compressing documents can provide law firms with considerable advantages in these areas, though it is not suitable in all circumstances.

Why Compress Documents?

If a law firm is considering compressing documents, it is likely that certain strategic business or compliance priorities have forced the issue. The most common reasons for compressing documents include: 1) Paperless office initiatives; 2) Reducing storage size and associated costs; 3) e-Mail attachment size limits and mailbox storage; and 4) Long-term file retention.

Paperless Office Initiatives

Many of today's firms have established aggressive paperless office initiatives where nearly all paper is scanned and stored electronically. As paper documents are scanned into the DMS, SharePoint or a Windows file system, the scanned files accumulate quickly, putting a strain on the document repository and available storage space. Compressing scanned documents can help reduce the burden on the DMS and the requisite storage costs that accompany it.

Reducing Storage Size and Costs

Compression can shrink files by up to 50%, which significantly reduces storage costs and allows firms to save on maintenance and overhead of legacy document storage.

Consider the example of an average law firm with 300 users that has 4.5 million PDF, MSG and image documents to store. This includes incoming paper mail that is scanned, internally generated files, e-mails sent and received by all types of desktop and mobile devices, and bulk document data imports. In this scenario, the volume of data adds up to 3.5 terabytes (roughly 3,500 gigabytes) on disk, stored on-site on file servers or off-site in the cloud. Depending on how many image-based files are included, the document repository can be compressed down to 40%-50% of its original size. Since the average cost to store one gigabyte is $5-$18 per year, compression can add up to tens of thousands of dollars in annual storage savings, and proportionally more for larger repositories.
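The arithmetic behind this example can be sketched in a few lines of Python. This is an illustration only: it takes the 3.5-terabyte figure as roughly 3,500 gigabytes (decimal units), uses the $5-$18 per gigabyte per year range and the 40%-50% compressed-size range quoted above, and the function name `annual_savings` is made up for this sketch.

```python
def annual_savings(repo_gb, compressed_fraction, cost_per_gb_year):
    """Yearly storage cost avoided when a repository shrinks.

    compressed_fraction is the size after compression as a share of the
    original (0.4 means the repository shrinks to 40% of its size).
    """
    saved_gb = repo_gb * (1.0 - compressed_fraction)
    return saved_gb * cost_per_gb_year


REPO_GB = 3_500  # assumption: 3.5 TB expressed in decimal gigabytes

# Least favourable case: shrink to 50% of original, $5 per GB per year.
low = annual_savings(REPO_GB, 0.50, 5)
# Most favourable case: shrink to 40% of original, $18 per GB per year.
high = annual_savings(REPO_GB, 0.40, 18)

print(f"Estimated annual savings: ${low:,.0f} to ${high:,.0f}")
```

Under these assumptions the estimate runs from about $8,750 to about $37,800 a year. The result scales linearly with repository size, so a store ten times larger would save ten times as much.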

Ease of Sending or Receiving e-Mails with Large Attachments

Large attachment files can sometimes bounce an e-mail back to the sender, causing frustration and delays. This is especially problematic for firms with satellite offices that have to send outsized files. If the firm's e-mail threshold does not allow large attachments in or out, lawyers often resort to sending from personal e-mail accounts or sending the attachments piecemeal across several e-mails. If the files are compressed before being sent or upon receipt, the sender's messages and attachments can more reliably reach their destination intact. In turn, the recipient can freely forward the message if needed.

Recordkeeping and Compliance

A fourth advantage of compression is that it helps law firms afford keeping files long-term, depending on retention time periods required by law. Obligated to stay compliant with record-keeping regulations, firms can pay a very high price for this retention over many years. However, if they compress documents, the cost can be considerably less.

How Compression Works

Although JPG and GIF files can be compressed, PDF files are the global standard format for distribution and electronic document storage. Converting to PDF ensures not only smaller file size, but also long-term preservation.

PDF is an ideal format for compression because the text and images in the file are stored as separate elements of the PDF. Once a document is converted to PDF, its embedded images can be compressed using standard JPEG, JPEG2000 and JBIG2 formats. Images or graphics can also be downsampled to a lower resolution, or converted from color to greyscale or monochrome (black and white). This process can significantly reduce the size of documents, by up to 50%, with more reduction potential in files with high-resolution, color-rich images.
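To see why resolution and color depth matter so much, it helps to estimate the raw pixel data behind a single scanned page. The sketch below is a rough illustration, not a model of any particular product: it computes uncompressed bitmap sizes only, before any JPEG, JPEG2000 or JBIG2 encoding is applied, so the sizes actually stored in a PDF will differ. The page size and scan settings are assumptions chosen for the example.

```python
def raw_image_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size of a scanned page image, in bytes."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# A US Letter page (8.5 x 11 inches) under three scan settings.
color_300 = raw_image_bytes(8.5, 11, 300, 24)  # 24-bit color, 300 dpi
gray_150 = raw_image_bytes(8.5, 11, 150, 8)    # 8-bit greyscale, 150 dpi
mono_300 = raw_image_bytes(8.5, 11, 300, 1)    # 1-bit black and white, 300 dpi

for label, size in [("color 300 dpi", color_300),
                    ("grey 150 dpi", gray_150),
                    ("mono 300 dpi", mono_300)]:
    print(f"{label:>14}: {size / 1_000_000:.1f} MB before encoding")
```

Halving the resolution cuts the pixel count to a quarter, and dropping from 24-bit color to 8-bit greyscale cuts the data per pixel to a third, which is why image-heavy files offer the most room for reduction.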

While text in a PDF can also be compressed, the gains are insignificant. Nearly all of the meaningful reduction comes from recompressing and downsampling the images in a PDF document.

Well-designed compression software can search through the documents in a DMS to find opportunities for size reduction based on certain criteria. Also, it can “interrogate” or look inside e-mail attachments and compress them.

Documents can be automatically marked for compression based on the percentage of image content in the overall document file size. For example, compression software can be set to look for documents where the image content makes up at least 50% of the overall document file size; the percentage threshold can be adjusted upwards or downwards. To do this, the compression software first analyzes or interrogates the PDF to determine the overall document file size and the image content size.
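The threshold test described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `image_share` and `should_compress` are hypothetical, and the two byte counts are assumed to come from whatever analysis step the compression software performs on the PDF.

```python
def image_share(total_bytes, image_bytes):
    """Fraction of the file size taken up by embedded images."""
    if total_bytes <= 0:
        return 0.0
    return image_bytes / total_bytes


def should_compress(total_bytes, image_bytes, threshold=0.50):
    """True when images make up at least `threshold` of the file."""
    return image_share(total_bytes, image_bytes) >= threshold


# Hypothetical sizes, as an analysis step might report them.
scanned_contract = should_compress(total_bytes=4_000_000, image_bytes=3_600_000)
text_memo = should_compress(total_bytes=200_000, image_bytes=20_000)
print(scanned_contract, text_memo)
```

Raising the `threshold` argument makes the selection stricter, so only the most image-heavy files are processed. Lowering it widens the net.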

Compression Can Be Automated or Hands-On

When using compression software, legal IT professionals can be actively involved and hand-pick documents to be compressed, or the process can be partially or completely automated. Compression software analyzes documents in a content repository based on a particular search query and compression thresholds specified by the IT administrator. It then processes the documents that meet the criteria and saves them back into the content repository, replacing the originals with smaller compressed files. A backlog service can also be configured to compress legacy documents in batches, by date ranges and file types.
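A backlog run of this kind amounts to filtering the repository by date range, file type and compression threshold. The sketch below shows one way that selection might look. The `DocRecord` class, its fields and the sample records are all hypothetical; a real DMS exposes its own query interface and metadata.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DocRecord:
    """Hypothetical repository entry; a real DMS exposes its own fields."""
    name: str
    file_type: str
    created: date
    image_share: float  # fraction of file size that is image data


def select_batch(records, start, end, file_types, min_image_share=0.50):
    """Return records in the date range and type list that meet the threshold."""
    wanted = {t.lower() for t in file_types}
    return [
        r for r in records
        if start <= r.created <= end
        and r.file_type.lower() in wanted
        and r.image_share >= min_image_share
    ]


records = [
    DocRecord("lease_scan.pdf", "pdf", date(2012, 3, 14), 0.92),
    DocRecord("memo.pdf", "pdf", date(2012, 6, 2), 0.05),
    DocRecord("exhibit.tif", "tif", date(2013, 1, 9), 1.00),
    DocRecord("closing_binder.pdf", "pdf", date(2015, 8, 21), 0.88),
]

batch = select_batch(records, date(2012, 1, 1), date(2013, 12, 31), ["PDF", "TIF"])
print([r.name for r in batch])
```

Here the text-only memo is skipped for falling below the threshold, and the 2015 binder is left for a later batch because it falls outside the date range.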

Another beneficial option is to run optical character recognition (OCR) on the PDFs prior to compression, so that the smaller documents profiled into the DMS are fully text-searchable. This allows the document text to be indexed and searched in the DMS. During this process, the administrator is able to review the document set before processing, and can review it again before the Save phase.

PDF Compression Is Permanent

Compression is a one-off permanent solution, unlike end users' “go-to” file reduction tools, such as WinZip or 7-Zip, whose archives can be unpacked to recover the original file. Once compressed, an image cannot be restored or expanded back to its original form. The compressed images in a PDF are permanently changed, and in some cases compression can lower the image quality. A best-practice compression workflow must therefore ensure that a backup of the original file is retained, should it be required for future reference. Even with a backup, it is not recommended to compress files for which image quality and integrity are critical to maintain in a pristine state.
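The difference between zip-style compression and image downsampling can be shown with a small experiment. The sketch below is illustrative only: it uses Python's `zlib` module (DEFLATE, the method commonly used in ZIP archives) as a stand-in for a zip tool, and a made-up row of pixel values in place of a real image.

```python
import zlib

# Stand-in for one row of greyscale pixel values (0-255).
pixels = bytes([10, 12, 200, 198, 55, 57, 90, 91] * 64)

# Zip-style (lossless) compression: decompressing returns every byte.
packed = zlib.compress(pixels)
restored = zlib.decompress(packed)
print("lossless round trip identical:", restored == pixels)

# Downsampling (lossy): keep every second sample, discard the rest.
downsampled = pixels[::2]

# Upsampling by repeating samples restores the length, not the data.
upsampled = bytes(v for v in downsampled for _ in range(2))
print("same length after upsampling:", len(upsampled) == len(pixels))
print("identical after upsampling:", upsampled == pixels)
```

The lossless round trip returns the original bytes exactly, while the downsampled row can only be stretched back to its old length with the discarded values guessed. This is why the original must be backed up if it may ever be needed.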

Best Practices for Regular Document Compression

The best way to ensure that a law firm's documents are regularly compressed is to automate the process as much as possible. When left to their own devices, end users tend to avoid compressing files as an extra step. The closest they get to compression is zipping or unzipping files when e-mailing documents, which is only a temporary compression process. Therefore, rather than asking end users to compress the documents themselves, the task can be delegated to an active monitoring system.

IT administrators can set parameters that decide whether a document should be compressed once it is profiled in the DMS. With this system, data is regularly assessed and legacy documents are processed. With active monitoring, the DMS is continuously checked for newly profiled documents. By shifting responsibility from the end user to a trusted automated solution, law firms can ensure their documents are compressed in a timely and efficient manner. This frees up lawyers and staff to attend to higher-level work.

e-Mail Attachments

In law firms today, e-mails and their attachments are increasingly stored in document management repositories. e-Mail attachments need to be examined and considered for compression just like any other document in the repository.

Conclusion

Legal IT professionals are responsible for bringing efficiency and streamlined workflows to the law firms they serve. Compression is a proven way to reduce the burden on the firm's DMS or document store. It liberates storage space while also relieving some budget pressure. Though it does require decision-making and some oversight, at least at the outset when settings are assigned within the automated solution, the benefits of compression can be well worthwhile, especially at firms where documents are often large and employees frequently attach large files to e-mails.


Dean Sappey is the president and co-founder of DocsCorp, which develops a range of document productivity and security tools for law firms. He can be reached at [email protected].
