
DrBardo

Can I disable 'Content Copying' in a PDF file saved using the Microsoft
PDF add-in for Word 2007?

Graham Mayor

No - and even in Acrobat which has that option, it will only stop copying
for about 10 minutes. If someone can see the document they can copy it.

--

Graham Mayor - Word MVP

My web site www.gmayor.com
Word MVP web site http://word.mvps.org


Don


Not necessarily so Graham.

It depends on which encryption method is used (40-bit or 128-bit), on
which security options are set, and on the number of pages (there are
some tools that will decrypt a 1-3 page 128-bit encrypted PDF; however,
these tools choke on more pages).

BTW, adding encryption requires the full version of Adobe Acrobat, not
the free Reader.
Even PDFs that have encryption set without a password cannot be decrypted
in the free Reader.
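
For readers wondering what the "security options" amount to inside the file: in the PDF specification's standard security handler, permissions such as content copying are stored as individual bits of an integer (the /P entry of the encryption dictionary), which viewers are expected to honor. The following is a minimal Python sketch of decoding that bit field, not anything posted in this thread. The function names and the sample value -3904 are chosen purely for illustration, and the sketch does not read or write real PDF files.

```python
# Permission bits of the /P entry in a PDF encryption dictionary.
# The PDF specification numbers bits from 1 (lowest order), so bit n is 1 << (n - 1).
PERMISSION_BITS = {
    "print": 3,
    "modify": 4,
    "copy_content": 5,
    "annotate": 6,
    "fill_forms": 9,
    "extract_for_accessibility": 10,
    "assemble": 11,
    "print_high_quality": 12,
}


def decode_permissions(p_value):
    """Map a signed 32-bit /P value to a {permission name: allowed} dict."""
    unsigned = p_value & 0xFFFFFFFF  # /P is stored as a signed 32-bit integer
    return {
        name: bool(unsigned & (1 << (bit - 1)))
        for name, bit in PERMISSION_BITS.items()
    }


def with_copying(p_value, allowed):
    """Return a new /P value with the content-copying bit set or cleared."""
    unsigned = p_value & 0xFFFFFFFF
    mask = 1 << (PERMISSION_BITS["copy_content"] - 1)
    unsigned = (unsigned | mask) if allowed else (unsigned & ~mask)
    # Convert back to the signed 32-bit form used in the file.
    return unsigned - (1 << 32) if unsigned & 0x80000000 else unsigned


if __name__ == "__main__":
    sample = -3904  # illustrative value with every permission bit cleared
    print(decode_permissions(sample))
    print(with_copying(sample, True))
```

The point of the sketch is only that "Content Copying" is a single flag in the document; how strictly it is enforced depends on the software that opens the file.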

Graham Mayor

It doesn't depend on anything of the kind. If you can see it you can copy
it. These methods only slow the user down. Worst case scenario - screen
capture the PDF one page at a time (SnagIt will do that easily) and run it
through OCR. The encryption won't help there!


Don


Your method is less than effective.
I tested it on a seven-page PDF that is 100% pure text, with no numerals
(which present real issues, especially fractions).
Seven full pages of screen captures amount to approximately 14 half-page
screen captures.

I saved the resulting JPGs at the 100% quality setting (minimal
compression), which most folks are not even aware of.

The resulting OCR was approximately 60% accurate, and that was with the
same spell checker that was used to create the initial RTF from which the
PDF was created. (My spell checker dictionary has been supplemented
extensively over a ten-year period on "my widgets".) Another person's
dictionary, not similarly focused, would give even poorer results.

99.99% of the population are simply not going to jump through this many
hoops for such ineffective results.
Hell! 90% of the population have moved their scanner to a back corner of
their desktop after attempting a solitary image and/or OCR.

One example of ineffective OCR is the TIF images, and their conversion to
text, made available by the Library of Congress American Memory
archives.
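
The thread does not say how figures such as "approximately 60% accurate" were arrived at. One simple way to put a comparable number on an OCR run is to compare the OCR output with the original text word by word. The sketch below does that with Python's standard `difflib` module; it is an assumption about how such a figure could be measured, not the method either poster used, and `word_accuracy` is a name made up for this example.

```python
from difflib import SequenceMatcher


def word_accuracy(reference, ocr_output):
    """Fraction of the reference words that the OCR output reproduces in order."""
    ref_words = reference.split()
    ocr_words = ocr_output.split()
    if not ref_words:
        # Nothing to recognize: perfect only if the OCR also produced nothing.
        return 1.0 if not ocr_words else 0.0
    matcher = SequenceMatcher(None, ref_words, ocr_words, autojunk=False)
    # Sum the lengths of all aligned runs of identical words.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref_words)


if __name__ == "__main__":
    original = "the quick brown fox"
    recognized = "the quick brovvn fox"  # one misread word
    print(word_accuracy(original, recognized))
```

Measured this way, one misread word in four gives 0.75. A character-level comparison would score the same text higher, so any two accuracy figures are only comparable if they were computed the same way.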

Graham Mayor

You don't need a scanner! SnagIt will output to a graphics file that any
half decent OCR software will access directly. And SnagIt will capture the
full page (the full document even), not simply half the screen.

And yes you are right that 99.9% of people will not want to jump through
hoops. It's the other .1% you should be worried about. I will repeat
(because you are deluding yourself if you think otherwise) that if you can
see it you can copy it.

Just for the hell of it I converted a four page PDF using this process and
the only (FineReader 8) OCR read errors were two superscripted date ordinals
and two misread words. It took less than 10 minutes to produce a Word
document that was close to the original, and with a bit more time it could
have been made indistinguishable. It sounds as though you need better OCR
software.


Don


Hey Graham,
You might try ABBYY FineReader 9.0, as you'll enjoy the added PDF
capability. You may then ditch your lame SnagIt procedure ;-)

With more than 10,000 articles scanned and OCR'd (not sure how many scans
that amounts to, perhaps three times as many or even more), and having
also used ABBYY FineReader 9.0 for a short and very extensive trial, I'm
able to relay to you that there is a minuscule difference in the manual
editing required (recognition) between OmniPage Pro 9.0 (which I've been
using for nearly five years, with the aforementioned extensively tailored
dictionary) and ABBYY FineReader 9.0.
Both consistently make/made the same OCR errors.
(As an aside: for approximately the past six months, I've been working on
three-column text on near-standard 8-1/2 x 11 pages contained within
80-year-old periodicals.)

BTW, the same basic software that came with the scanner has been used to
digitize more than 12,000 images as well.

Guess we'll just agree to disagree and move on.

Many thanks for your insights and extensive contribution to this forum and
others.

Graham Mayor

The 'lame SnagIt procedure' was merely a demonstration that protecting the
format was no guarantee that the material could not be copied. All OCR, as
you are no doubt aware, has limitations, but clearly it is not stopping you
from extracting data from those old documents. Similarly it would not stop
you from copying PDF - it merely lengthens the process.

I may get around to updating FineReader, but for the amount of OCR I
currently need to do, I can't justify the cost. I also still have a
corporate version of FineReader 6 installed, which reads very nearly as
well as version 8 and has the advantage of a very useful form-filling tool
that is missing from the later version.

