Governments and companies have been scanning documents into electronic formats for years. This makes the information easier to store and use. The problem is that some of the information on the documents can be shared while other information needs to be kept private. The process of removing the sensitive information in a document is called redaction. Some organizations also refer to this process as "sanitizing."
We have all seen redaction on the news when the CIA releases documents with parts blacked out. As we move toward a paperless office, redaction is becoming common for many organizations. This is driven both by wider access to documents and by the need to tighten privacy policies that were acceptable before identity theft became a widespread problem.
One example is electronic health records. While it is beneficial for a medical office to share a patient's chart with the entire practice, a social security number should be shared with very few people. The problem is that in the past a social security number often served as the insurance number. That put social security numbers in many more places than needed. When paper files are converted to electronic ones, that sensitive information needs to be redacted.
Another example is public records. The information is public, and once it is on a website it is easy for anyone to access. The problem is that even forms like land deeds can contain information that should not be shared, like bank account numbers. This is another situation where the information must be redacted.
In both examples there is private information that should not be widely available, both for privacy reasons and to help prevent identity theft. This is not just good practice but required by federal privacy laws.
Redaction should not just cover text or images with another image. The sensitive information should be removed from the image completely. That prevents it from showing up in the document's revision history or being recovered by software that removes the covering.
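The difference between covering and removing can be sketched in a few lines of Python. This is only an illustration: the image is assumed to be a simple list of rows of grayscale values, and the function names cover_region and redact_region are hypothetical, not part of any real redaction product.

```python
def cover_region(image, box):
    """Cosmetic cover: the original pixels are kept and a box is layered on top."""
    return {"pixels": image, "overlays": [box]}


def redact_region(image, box, fill=0):
    """Removal: return a copy whose pixels inside the box are overwritten."""
    left, top, right, bottom = box  # right and bottom are exclusive
    redacted = [row[:] for row in image]
    for y in range(top, bottom):
        for x in range(left, right):
            redacted[y][x] = fill
    return redacted


image = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]
box = (1, 0, 3, 2)  # columns 1-2, rows 0-1

covered = cover_region(image, box)
removed = redact_region(image, box)

# The covered version still carries the sensitive pixel values underneath.
print(covered["pixels"][0][1])
# The redacted version no longer contains them anywhere.
print(removed)
```

In the covered version anyone who drops the overlay gets the original values back, while in the redacted version there is nothing left to recover.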
Most redaction takes place in two steps. The first one is done via software. The digital image is converted to text via Optical Character Recognition (OCR) software. The text is run through software designed to look for specific patterns. For example, social security numbers or phone numbers could be identified and removed.
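The pattern-matching part of this step can be sketched as follows. This is a minimal illustration, not how any particular redaction product works: it assumes the OCR text is already available as a string, and the patterns cover only common US-style formats for social security and phone numbers. The names PATTERNS and redact_text are made up for this example.

```python
import re

# Assumed formats only; real documents and OCR output vary far more than this.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(
        r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)"
    ),
}


def redact_text(text):
    """Replace every match with a placeholder and count what was removed."""
    counts = {}
    for label, pattern in PATTERNS.items():
        text, found = pattern.subn(f"[{label} REDACTED]", text)
        counts[label] = found
    return text, counts


sample = "Patient SSN 123-45-6789, call (555) 123-4567 or 555-987-6543."
clean, counts = redact_text(sample)
print(clean)
print(counts)
```

The counts give a reviewer a quick idea of what the software found. Numbers that OCR misread or that use a different format will slip past patterns like these.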
The second step involves a human reviewing the document. The level of access to the documents and the sensitivity of the information present will determine how thoroughly the documents should be reviewed. Removing social security numbers from documents on a public website, for example, leaves very little room for error.
Since redaction requires specialized software, it is usually done by a redaction service. A reputable company should be used to make sure that the job is not only done correctly but that the documents are secure during this process. Make sure you get references and review jobs they have done in the past before selecting a service for your job.