#1
Posted to microsoft.public.word.docmanagement
John Dalberg
How can a single document have a single style with different fonts and sizes?

I have tried different PDF to Word converters. I noticed that while the
output Word document shows different fonts and font sizes, when I click
anywhere in the document, the style shown is always the same. I thought
text with different fonts or sizes would belong to different styles.

How can I change the font size for all text that uses a certain font and
size? When I choose to select all instances of a style (in my case it seems
there's only one style), it selects the whole document. I can't selectively
choose certain paragraphs.
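
As a rough sketch of one way to do this outside Word's UI: a .docx file is a zip archive whose main part, word/document.xml, stores direct formatting on each run (w:rFonts for the font, w:sz for the size in half-points). The function below uses only Python's standard library to change the size of every run that directly specifies a given font and size. The name `resize_runs` is hypothetical, and the sketch assumes the converter wrote font and size as direct run formatting with the font name in the `w:ascii` attribute; it ignores sizes inherited from styles and the complex-script size `w:szCs`.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}
ET.register_namespace("w", W)  # keep the usual "w:" prefix when serializing


def resize_runs(document_xml, font, old_pt, new_pt):
    """Return (new_xml, count): runs directly set to `font` at `old_pt`
    points get their size changed to `new_pt` points."""
    root = ET.fromstring(document_xml)
    old_val = str(round(old_pt * 2))  # w:sz is stored in half-points
    new_val = str(round(new_pt * 2))
    changed = 0
    for run in root.iter(f"{{{W}}}r"):
        fonts = run.find("w:rPr/w:rFonts", NS)
        size = run.find("w:rPr/w:sz", NS)
        if fonts is None or size is None:
            continue  # formatting comes from a style, not the run itself
        if fonts.get(f"{{{W}}}ascii") == font and size.get(f"{{{W}}}val") == old_val:
            size.set(f"{{{W}}}val", new_val)
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed
```

To apply this to a real file you would read word/document.xml from a copy of the .docx with the `zipfile` module, pass its contents in, and write the result back into the archive. Inside Word itself, Find and Replace with font formatting set in both boxes (and no text) should achieve much the same result.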

A related question would be which pdf to Word converter can output Word
documents which have different styles?

John Dalberg
#2
Posted to microsoft.public.word.docmanagement
Graham Mayor

The whole point of PDF is that it is a graphical representation of the
document that is not intended to be edited. Any converter or OCR software
capable of handling the content and converting it to Word will be hit and
miss, and if you are hoping to get an exact facsimile of the original, you
are dreaming. The best plan is usually to extract just the text and rebuild
it from scratch.

For difficult-to-convert PDFs the best plan is to use a good quality OCR
package such as Finereader, or you could try PDF2Text.

Once you have the text loaded into Word, it behaves like any other text and
is amenable to Word's extensive formatting capability.

--

Graham Mayor - Word MVP

My web site www.gmayor.com
Word MVP web site http://word.mvps.org
#3
Posted to microsoft.public.word.docmanagement
John Dalberg

"Graham Mayor" wrote:
> The whole point of PDF is that it is a graphical representation of the
> document that is not intended to be edited. Any converter or OCR software
> capable of handling the content and converting it to Word will be hit and
> miss, and if you are hoping to get an exact facsimile of the original, you
> are dreaming. The best plan is usually to extract just the text and
> rebuild it from scratch.


When you use a PDF editor, you can edit the text, and it tells you which
font, font size, and other attributes were used for a piece of text. If an
editor can do that, why can't a converter create styles based on these
attributes?

This is a book in electronic form. It's a huge manual task to style all the
headers, paragraphs, code snippets, etc.

John Dalberg
#4
Posted to microsoft.public.word.docmanagement
Graham Mayor

Then you should have kept a copy of the document that the PDF was created
from to edit. There is no simple way to edit a PDF file. If you are very
fortunate, Acrobat Pro *may* allow you to save the PDF in Word document
format.

#5
Posted to microsoft.public.word.docmanagement
John Dalberg

"Graham Mayor" wrote:
> Then you should have kept a copy of the document that the PDF was created
> from to edit. There is no simple way to edit a PDF file. If you are very
> fortunate, Acrobat Pro *may* allow you to save the PDF in Word document
> format.


Why do you assume I created the PDF?
Check out Foxit PDF Editor.

I have Acrobat Pro and it exports to Word. However, it's still not smart
enough. It creates dozens of styles, some of which have no instances in the
document. I can see two consecutive paragraphs with the same font and size,
yet they have different styles. I don't understand what triggers Acrobat to
create different styles for what seems to be a single style in the PDF, yet
most of those styles end up with scattered instances.


John Dalberg


#6
Posted to microsoft.public.word.docmanagement
Graham Mayor

We are still going round in circles with this, but you are missing the
essential point that PDF is a *graphics* format, and if the original document
is not available for reference you are using what is essentially OCR to
recreate a document from the PDF, just as you might with a JPG or TIFF file.
OCR software, even at its best, is not capable of recreating the document
(any document) with 100% accuracy. In my opinion Finereader is the best
choice, but even that will not create the style structure of the original
document, and you will have a lot of work on your hands to create an editable
document.

As you apparently didn't create the PDF in the first place, can you obtain
the original document from whoever did - presumably not?

As for Acrobat's own abilities to recreate a PDF, you'll have to take that
up with Adobe. Word is not the issue here.
Had the PDF been created from a graphical representation of the document (as
some are to make them more difficult to recreate), Acrobat would not be able
to save the PDF as an editable document.

#7
Posted to microsoft.public.word.docmanagement
John Dalberg

"Graham Mayor" wrote:
> We are still going round in circles with this, but you are missing the
> essential point that PDF is a *graphics* format, and if the original
> document is not available for reference you are using what is essentially
> OCR to recreate a document from the PDF, just as you might with a JPG or
> TIFF file. OCR software, even at its best, is not capable of recreating
> the document (any document) with 100% accuracy. In my opinion Finereader
> is the best choice, but even that will not create the style structure of
> the original document, and you will have a lot of work on your hands to
> create an editable document.



It might be a graphics format, but a PDF file should contain enough
metadata to export the file with more intelligent styles. Explain to me how an
editor like Foxit Editor is able to open a PDF file, lets you select some
text, tells you what font was used plus other attributes, and lets you edit
the text. If an editor is able to do this, why can't a converter dump these
attributes and create Word styles out of them? When I say a Word style, all
I want the style to include is font and size so that I can select all
instances of text that use the same font and size. Surely the
converter should be able to lump all similar text into a single
style.
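
The grouping asked for here can be sketched with Python's standard library, assuming the converter stored font and size as direct run formatting in word/document.xml (w:rFonts and w:sz, the latter in half-points). This is only an illustration under those assumptions: the names `font_size_inventory`, `propose_style_names` and the `Converted1`-style labels are hypothetical, and the sketch only reports the groups; it does not write styles into the document.

```python
from collections import Counter
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def font_size_inventory(document_xml):
    """Count runs per (font name, size in points) as set directly on the run.
    Runs without direct formatting are counted under None values."""
    root = ET.fromstring(document_xml)
    counts = Counter()
    for run in root.iter(f"{{{W}}}r"):
        fonts = run.find("w:rPr/w:rFonts", NS)
        size = run.find("w:rPr/w:sz", NS)
        font = fonts.get(f"{{{W}}}ascii") if fonts is not None else None
        points = int(size.get(f"{{{W}}}val")) / 2 if size is not None else None
        counts[(font, points)] += 1
    return counts


def propose_style_names(counts):
    """Map each (font, size) group to a dummy style name, most common first."""
    return {
        key: f"Converted{i}"
        for i, (key, _) in enumerate(counts.most_common(), start=1)
    }
```

Each distinct (font, size) pair becomes one candidate dummy style, which is the "lump all similar text together" behaviour described above; what the original style names were does not matter for this purpose.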

I don't believe an OCR program will produce a better Word document than a
pdf converter.

I am not looking to extract the original structure. All I am
looking for is to be able to select all instances of a certain style. I
don't care if the style is a dummy style created by the converter, which
wasn't in the original document, as long as it defines something like a
font and size and all text having the same font and size points to
that style.

I am not sure you understand what my goal is. I don't care if the styles
produced by the converter do not resemble the styles of the original
document. If the original used Arial size 10 and the converter produces
font Zulu size 11, that's OK as long as I can select all text of that style.
It will take me a few seconds to select all that text and change it back to
Arial size 10. *BUT* the problem is that the converters *do not produce
different styles*. Accurate styles from the original (names, etc.) don't
matter.


> As you apparently didn't create the PDF in the first place, can you
> obtain the original document from whoever did - presumably not?


No.



> As for Acrobat's own abilities to recreate a PDF, you'll have to take
> that up with Adobe. Word is not the issue here.
> Had the PDF been created from a graphical representation of the document
> (as some are to make them more difficult to recreate), Acrobat would not
> be able to save the PDF as an editable document.


It wasn't created from a graphical representation. It's an eBook and the
author must have been using a word processor.

John Dalberg
#8
Posted to microsoft.public.word.docmanagement
Graham Mayor

Clever though Foxit is, I do not think it is reading metadata from the
file. In tests on PDF documents created on my own PC, the fonts reported
are not necessarily the fonts used. You would need to ask Foxit (or better
still Adobe, who own the format) what is possible. Until I receive convincing
evidence to the contrary I will stick with my original response.
