Viewer

Systematically stripping metadata from new documents

Hi,

 

We're facing a challenge: users produce documentation and send it to other users and customers, but the metadata on these files is sometimes inherited from the original file the author copied and pasted from, because users don't always start from the recommended blank templates.

 

This can be an issue for us because the inherited metadata sometimes contains sensitive information about a previous presentation.

 

I'd like to systematically strip inherited metadata from new documents before they are published, and possibly show a pop-up when the document is saved that prompts the user to fill in useful, relevant metadata fields for that particular document. Does anyone know of a tool to accomplish this?

 

Everything I've found so far requires manual user intervention, which is not ideal because it is bound to fail.

4 Replies
Advocate I

Re: Systematically stripping metadata from new documents


Pablo,

I've been associated with enterprises and researchers addressing this very problem for years, and have taken part directly in a corner of the problem. My experience has been that you will only succeed in your goal by taking a total systems view, with a solution that incorporates all three major system components: people, process, and tools.

Yes, there are tools (software) available to scrub metadata, and even "dirty words" based on a review list, that can operate at either the individual document file level or at the batch level. For commercial enterprise use, research the data loss prevention (DLP) marketplace. For individual documents, look at the capabilities built into Microsoft Office and Adobe Acrobat Pro, including available third-party add-in tools. However, tools are not enough... EVER.
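To make the batch-level idea concrete, here is a minimal sketch of what such a scrubber does under the hood for Office Open XML files (.docx, .pptx, .xlsx). It assumes the package stores its core properties in `docProps/core.xml` with the usual `dc:` and `cp:` prefixes. The names `SENSITIVE_TAGS`, `scrub_core_xml`, and `scrub_package` are hypothetical, the tag list is only a starting point, and the regex approach is a shortcut rather than a full XML rewrite. It does not touch `docProps/app.xml`, custom properties, comments, or tracked changes, so it is an illustration and not a replacement for a real DLP product.

```python
import io
import re
import zipfile

# Core-property elements to blank out (an assumed list; extend per your policy).
SENSITIVE_TAGS = ("dc:creator", "cp:lastModifiedBy", "dc:title", "dc:subject",
                  "cp:keywords", "dc:description", "cp:category")


def scrub_core_xml(xml: str) -> str:
    """Empty the text of each sensitive element while keeping the element."""
    for tag in SENSITIVE_TAGS:
        pattern = rf"(<{tag}(?:\s[^>]*)?>).*?(</{tag}>)"
        xml = re.sub(pattern, r"\1\2", xml, flags=re.DOTALL)
    return xml


def scrub_package(data: bytes) -> bytes:
    """Return a copy of an OOXML package with its core properties blanked."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            payload = src.read(item.filename)
            if item.filename == "docProps/core.xml":
                payload = scrub_core_xml(payload.decode("utf-8")).encode("utf-8")
            # Every other part of the package is copied through unchanged.
            dst.writestr(item, payload)
    return out.getvalue()
```

Run over a release folder, something like this could sit in the transmittal step of the process, which is exactly where it needs a person accountable for checking the result.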

You also need clearly defined processes for document production, review, approval, release, and transmittal that the people will willingly and consistently follow. Those processes must be buttressed by understandable, enforceable (and consistently enforced!) policies establishing the what and why of the how that the processes define.

Here is a reality: pop-up reminders do not work. Your people will routinely click through the pop-up dialog boxes without complying, or will hurriedly dummy-fill the metadata fields and then click through, again without true compliance. That is why you will only succeed if you use a workable mix of manual and semi-manual processes with integrated tools, staffed by workers whose primary job is to ensure metadata is scrubbed for DLP purposes. You absolutely CANNOT depend on the workers whose primary job is to produce the document content to take time for this activity.

 

Good luck!

 

P.S. This systems approach is why I encourage all CISSPs to investigate the systems engineering world of INCOSE and consider earning a CSEP certification as an added professional development step.

 

 

 

Dr. D. Cragin Shelton, CISSP
CraginS@gmail.com
https://CraginS.blogspot.com/
Viewer

Re: Systematically stripping metadata from new documents

Thank you very much for the detailed reply, Dr. Shelton. It seems that, as with most things, the real answer to this question is a bit more complicated, and there is no "silver bullet" solution. I'll keep working on this to see what we can implement in the short term while we establish a more comprehensive solution set, similar to what you've described.

Viewer II

Re: Systematically stripping metadata from new documents

You may find this CSO Online article relevant, for docs edited in the future.

 

This Microsoft article explains the use of the Document Inspector feature in Office files, but it only works on the file you have open. It seems more thorough than unchecking the box prior to printing to PDF.
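Because Document Inspector only looks at the open file, a small audit script can help spot-check many files at once. The following is a rough sketch, not anything Microsoft provides: it assumes Office Open XML files, which are ZIP packages that normally carry their properties in `docProps/core.xml` and `docProps/app.xml`, and it simply reports every non-empty property it finds. `PROPERTY_PARTS` and `list_properties` are hypothetical names.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Package parts that normally hold document properties (assumed locations).
PROPERTY_PARTS = ("docProps/core.xml", "docProps/app.xml")


def list_properties(data: bytes) -> dict:
    """Map 'part:property' to its text for every non-empty leaf property."""
    found = {}
    with zipfile.ZipFile(io.BytesIO(data)) as pkg:
        names = set(pkg.namelist())
        for part in PROPERTY_PARTS:
            if part not in names:
                continue
            root = ET.fromstring(pkg.read(part))
            for elem in root.iter():
                text = (elem.text or "").strip()
                # Only leaf elements with actual text are reported.
                if text and len(elem) == 0:
                    local = elem.tag.rsplit("}", 1)[-1]  # drop "{namespace}"
                    found[f"{part}:{local}"] = text
    return found
```

Calling `list_properties(open(path, "rb").read())` on each outgoing file gives a quick list of what metadata would leave the building, which is also one way to test whether inherited values survive a save.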

 

I believe the Office Admin Templates can be used to manage that option, as well as clearing the metadata on saves.  That may not clear the metadata of inherited data - but you can test for that.

 

I don't know of any commercial product that can be used to manage metadata in files, but it's quite possible they exist.

 

 

Advocate I

Re: Systematically stripping metadata from new documents

Hi All

 

Look up "InfoSphere® Metadata Asset Manager"

 

https://www.ibm.com/support/knowledgecenter/en/SSZJPZ_11.3.0/com.ibm.swg.im.iis.mmi.doc/topics/c_ove...

 

Welcome to the world of metadata.

 

Regards

 

Caute_cautim