Talk:Portable Document Format/Archive 1

PostScript

  • "In addition the PostScript code is already interpreted, so it is faster to display on the screen."

Does this simply refer to the loops and conditionals having been expanded out into a number of repetitive blocks? If so, I think "partially evaluated" (or perhaps "compiled") would be more accurate than "already interpreted"; at least from my perspective (though I hear MIT tends to educate its students into agreeing with this), some kind of "interpretation" is indeed happening at the moment when the PDF program displays the graphics on screen.

-- Ryguasu
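To make the distinction above concrete, here is a minimal sketch in Python. It is not PostScript or PDF, and the operator names (`rect`, `fill`) and both function names are made up for illustration. It only shows the idea in the question: a loop can be expanded ahead of time into a flat list of drawing operations, yet a viewer still has to walk that list and dispatch on each operator, which is arguably still "interpretation".

```python
from typing import List, Tuple

# One drawing operation: an operator name plus its numeric operands.
# The operator names are hypothetical, chosen only for this sketch.
Op = Tuple[str, Tuple[float, ...]]


def expand_loop(count: int) -> List[Op]:
    """Run the 'program' once, ahead of time, and keep only the flat result.

    The loop stands in for control flow in a page-description program.
    After this step no loop remains, only repeated blocks of operations.
    """
    ops: List[Op] = []
    for i in range(count):
        ops.append(("rect", (10.0 * i, 0.0, 8.0, 8.0)))
        ops.append(("fill", ()))
    return ops


def render(ops: List[Op]) -> List[str]:
    """Walk the flat list and dispatch on each operator.

    There are no loops or conditionals left in the input, but the viewer
    still reads and acts on every operator in turn.
    """
    log: List[str] = []
    for name, args in ops:
        if name == "rect":
            x, y, w, h = args
            log.append(f"path {x:g} {y:g} {w:g} {h:g}")
        elif name == "fill":
            log.append("fill current path")
        else:
            raise ValueError(f"unknown operator: {name}")
    return log


flat_ops = expand_loop(3)
for line in render(flat_ops):
    print(line)
```

On this reading, "partially evaluated" describes the first function and "interpreted" describes the second.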

The article needs something about the range of end uses for PDFs, from screen reading to litho print, with widely varying resolutions. It should also say that while PDFs are in themselves resolution independent, embedded bitmaps within them are usually downsampled to minimise file size for a given end use.

-- Elliot100


I not infrequently hear people complain about being forced to read PDF online (because many U.S. governments like to publish in PDF only, apparently), and I generally sympathize; it is beautiful when printed, but fairly uncomfortable to read online. I don't know if this is worth mentioning in a Wikipedia article, however :) Kyk 06:13, 3 Jan 2004 (UTC)

html comparison

Does the comparison between PDF and HTML really make sense? It is true that they are both commonly available formats online, but one is a markup language (and behaviour when zooming is mostly a matter of the interpreter: not every browser always reduces line length, or can zoom at all) and the other a page description language (where layout is important, and changing it when zooming does not make much sense). Valhalla 15:24, 7 Jan 2004 (UTC)

Both are very common on the web, the same content can either be made available in the first way or in the second, or the user can be allowed a choice; so I think this is a meaningful comparison. It could be expanded. - Patrick 19:45, 7 Jan 2004 (UTC)
I strongly disagree with presenting a comparison between HTML and PDF. They really have nothing to do with each other. One is a file format; the other is a (deprecated) markup language. Comparing them is like saying, "While bicycles and automobiles are different technologies, both are commonly found on city streets." Joshf 21:57, 21 Jun 2005 (UTC)
Comparing bicycles with cars is also meaningful (perhaps a bit obvious). It helps in deciding which to choose.--Patrick 22:13, Jun 21, 2005 (UTC)
Meaningful, perhaps; but gratuitous at best, misleading at worst. PDF is not a web document standard; it's a page layout format geared toward document printing. If you're going to compare PDF to HTML, you may as well compare it to plain text, Rich Text, Microsoft Word, and Microsoft Excel formats. In fact, it would be far more meaningful to compare it to Word format, which serves roughly the same purpose.
Also, again, HTML is a deprecated standard. The current incarnation, XHTML, doesn't focus on layout or even text formatting at all. That work is done by stylesheets. XHTML done according to the recommended standards focuses entirely on content markup.
A much more suitable replacement for this paragraph would be a concise paragraph explaining why PDF documents being used for online forms, etc. is a misapplication of the format. Joshf 9 July 2005 14:17 (UTC)
I've rewritten the Comparison to HTML section to what I believe is a more general and relevant section. 69.221.225.252 01:10, 11 September 2005 (UTC)

Download link within body

I've put a second-paragraph link to the Adobe download page, which I think is helpful in this instance owing to the use of PDF files within Wikipedia. It is a departure from normal standards. The alternative would be to have a page on PDF files, with this prominent link, in the Wikipedia address space, and link to the Wikipedia page from PDF download links rather than to this page. But this seemed simpler and quite appropriate to me. Andrewa 20:16, 16 Jan 2004 (UTC)

Hate

People hate reading Application/Pdf onscreen:

Not Fit For Human Consumption (AlertBox)

?alabio 22:05, 31 Jan 2004 (UTC)

Is that a fact? Can you prove it? Who are "people". Personally, I find PDF's great for what they are intended for. This seems to be simply your personal opinion, and rightly, only belongs here, on the talk page. Graham 23:08, 31 Jan 2004 (UTC)
¿Do you call Jakob Nielsen a liar? ¡Tog thinks highly of him! In a study, Jakob Nielsen determined that reading Application/Pdf is about 300% less efficient onscreen than Application/Xhtml+Xml.
I am not putting down Application/Pdf. Application/Pdf is great for printing. Indeed, it is easier to read printed materials than onscreen materials. The problem is people so into desktop publishing that they put everything into Application/Pdf so that it will look perfect and then place the Application/Pdfs online. This is perfect for those printing; unfortunately, however, it is pure torture for the user who cannot afford toner, sitting a centimeter away from the screen trying to read the information onscreen.
?alabio 00:33, 1 Feb 2004 (UTC)
I'm not calling anyone a liar. I read the article - it's not definitive; it's just opinion. Graham 07:24, 1 Feb 2004 (UTC)
Professor Nielsen has many testimonials and observations. It is not definitive because he did not specifically test the usability of reading Application/Pdf onscreen; nonetheless, it is an authoritative opinion. Professor Nielsen is a usability expert.
?alabio 22:29, 1 Feb 2004 (UTC)
Talking of which, what's with all the dividing lines you keep putting in? (I've removed them) They are normally used to separate different topics of discussion. The indentation connects sections of the same discussion. Graham 01:40, 2 Feb 2004 (UTC)
Just my way of separating posts. If it will make you feel better, I shall use horizontal for separating threads.
66.124.224.115 03:12, 2 Feb 2004 (UTC)
I've inserted a "pilot" section about the disadvantages of PDF compared to (x)HTML, and tried to remain neutral and present facts as clearly as possible. If people don't like it, you can change/delete it. --TTD 15:14, 2004 Sep 10 (UTC) EDIT: and post why you don't like it here!
Where is this pilot? When I click on the blue text, I just wind up at the PDF article. I think things like the fact that PDF can crash systems, that it is unreadable if administrators don't set the right parameters, that it can get "lost" on a computer when one is trying to open it, and that it is not necessarily easy to read are all points that might be made here - to say nothing of its slow opening time. Kdammers 22:51, 25 September 2006 (UTC)
Articles change a lot in two years, so it has been rewritten many times. It is worth noting that none of the points you make are criticisms of PDF. Rather they seem to be criticisms of whatever PDF software you happen to be using. As such, they would be removed from this article, which is supposed to be about the file format, not software. Notinasnaid 07:39, 26 September 2006 (UTC)

In regards to searching, in this article, PDF can be searched with Adobe Acrobat Reader 6.0.

Cool, I added that. Even better would be if one search would cover PDF and other file types in a folder.--Patrick 14:08, 22 Jul 2004 (UTC)

There are many search tools which can search PDF and other files in a folder - dtSearch, Isys for example. But I don't think searching PDFs should be in this article at all.

Damn PDF FORMAT to Hell! --69.67.230.99 15:20, 30 July 2006 (UTC)

Envoy

The internal link "envoy" leads to a page about diplomacy. I think it is an error. 130.119.248.11 16:28, 13 Dec 2004 (UTC)

pdf

I am not an idiot, but I can't figure out pdf. Is there a manual someplace, or is it assumed that this information is already encoded in normal human genes? What is that little hand? How do I use this technology? Why is html in giant letters that don't fit the screen? Why can't they use a normal format? Arthur

Huh? PDF is a FORMAT, not a PROGRAM. Therefore it doesn't have a manual, but it does have a specification. What little hand are you talking about? You use this technology to transfer documents from one computer to another, or from a computer to a printer. RT(F)A. Where are the giant letters you are referring to? In the article? I don't see them. PDF is a normal format, unless I've misunderstood your question/rant. You may not be an idiot, but if you want smart answers you need to ask smart questions. Graham 03:45, 20 Dec 2004 (UTC)

Extracting and using information from PDF files

Is there a program that allows the identification, extraction and use of information e.g. for making a database, from PDF files ?

BBR

Tools listing

Personally, I don't like the tools listing in this section at all. There is no quality control or context; anyone can wander past and add their favourite. If it should be here at all, perhaps a separate PDF tools entry.

However, I reverted a deletion of

  • PDFlib - leading programming library for automatically generating PDF on the server (broad languages and platform support).

The deletion of this seems to reflect either a policy I missed, or a point of view that wasn't explained... Notinasnaid 12:52, 6 Jan 2005 (UTC)

PDF is an open standard?

The beginning of the article states that PDF is "[...] an open standard, and anyone may write applications that can read or write PDFs royalty free." But as I read the History of PDF, it makes no mention of this. The history kind of led me to think that Acrobat came up with this standard. Since I am not familiar with the history of PDF, I am refraining from editing the History portion of this article, but would appreciate it if someone would clear up that little portion.

Well, Acrobat is a product, but it is true indeed that Adobe Systems invented the PDF file, control the specification, and publish it. The only question is about whether PDF is an open standard, since it is true that anyone may write applications that can read or write PDFs royalty free (Adobe grant specific patent exemptions, with limitations; please read the reference carefully before implementing). The argument really hinges on "what is an open standard"? I don't know the right answer and I too was concerned when the text appeared. Some definitions I have encountered include:
  • An open standard is one that is available for free.
  • An open standard is one created by a committee with open membership.
  • An open standard is one that is published and available.
  • An open standard is one that allows implementation without royalties or other fees.
While the first of these is clearly wrong (since it excludes a great many important standards), all the other views are popular, and there are more besides. It seems to be one of those terms which people decide are a good thing, then seek to define to match their views. I'd suggest checking what Wikipedia has decided to define open standard as this week, and see if PDF matches. Notinasnaid 15:59, 7 Jan 2005 (UTC)

Well, I would suggest omitting the expression "open standard" altogether and replacing it with a sentence that explains what it is and what it is not, along the lines of: it can be used freely in other applications, but there are some legal limits, namely...

DavouD--81.240.122.249 22:57, 24 Mar 2005 (UTC)

I altered the sentence about open standard to: "PDF is also an open standard in the sense that anyone may create applications that read and write PDF files without having to pay royalties to Adobe Systems." BrandonCsSanders 15:13, 30 December 2005 (UTC)

“PDF is also an open standard”. He-he. Wide open.

“Amid threats of a lawsuit from Adobe, Microsoft acknowledged Friday that it would remove support for saving files in PDF from Office 2007, as well as dropping its own rival format XPS from the suite and Windows Vista.” (http://www.betanews.com/ 06.06.2006). Yuri7 09:57, 25 June 2006 (UTC)

Perhaps more study of the news stories will show this is nothing to do with whether the standard is open... Notinasnaid 10:14, 25 June 2006 (UTC)

Misleading statement in the article

I have noticed the following statement within the article: "With HTML the same can be achieved by using a raster graphics (or recently, SVG, a vector graphics standard) image to present text, but then the text can not be copied as such, nor can a subtext be searched within it." But this is not true of the SVG file format: it is a vector graphics format, and any text within it is searchable by definition; there are in fact means of doing so. Maybe we should correct the passage for the sake of better understanding? --Dennis Valeev 16:09, May 14, 2005 (UTC)
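As a rough illustration of the point about SVG, the sketch below searches the text of a small SVG document using only Python's standard library. The sample SVG and the function name `find_text` are made up for this example. It assumes the text is stored in `<text>` elements; text that has been converted to outline paths would not be found this way.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace prefix that ElementTree puts on every SVG element name.
SVG_NS = "{http://www.w3.org/2000/svg}"

# A hypothetical SVG file whose text is kept as real <text> elements.
SAMPLE_SVG = """<svg xmlns="http://www.w3.org/2000/svg" width="200" height="60">
  <rect width="200" height="60" fill="white"/>
  <text x="10" y="25">Portable Document Format</text>
  <text x="10" y="45">is a <tspan font-weight="bold">page description</tspan> format</text>
</svg>"""


def find_text(svg_source: str, needle: str) -> List[str]:
    """Return the content of every <text> element containing `needle`.

    The match is case-insensitive. itertext() joins the element's own text
    with the text of nested elements such as <tspan>.
    """
    root = ET.fromstring(svg_source)
    hits: List[str] = []
    for node in root.iter(SVG_NS + "text"):
        content = "".join(node.itertext())
        if needle.lower() in content.lower():
            hits.append(content)
    return hits


print(find_text(SAMPLE_SVG, "description"))
```

Because the characters survive in the file as character data, a browser or any other tool can in principle search and copy them in the same way.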

Accessibility vs Compliance with Accessibility Laws or Guidelines

I have found a lot of confusion regarding Adobe's use of the word "accessible" and the legal variations of "accessible" based on various government requirements. Ontario, Canada uses the WCAG 1.0 AA to evaluate a document's "accessibility." Since PDF is not a W3C technology, a PDF can never be "accessible" according to WCAG 1.0 AA. I would like to see something distinguishing "accessibility features" vs. "legally compliant."

Waaay too many links

the "External links" section is overflowing with links, most of which seem to simply promote various software programs. what do you think should be done about them?

boredzo () 20:56, 2005-08-09 (UTC)

  • Good idea, this has been troubling me too. I've seen several articles which contain lists, and no matter what the subject, people keep coming by to add their favourite, whether it's notable or not. The only way to avoid that is to have no examples at all. On the other hand, even with no examples, the adventurous will find a way to add a paragraph describing their favourite software. I think that for now they can be moved to a "List of PDF software" which I don't intend to read, but which gives people a target for editing. Retain only non-software links (e.g. file formats). Once this is done, the article badly needs a complete clean-up/refocus/rewrite because it has grown organically without enough direction. Notinasnaid 21:20, 9 August 2005 (UTC)
    • I just flamethrew most of the links, leaving only the Adobe software links and format information. — boredzo () 06:06, 2005-08-19 (UTC)

Criticisms

I removed the Criticisms section because it was poorly-written, not very NPOV, and dealt primarily with PDFs online. I'll cobble together the relevant points from that with the current superfluous "Comparison to HTML" section tonight into a "PDF Use Online" section. Joshf 15:01, 10 September 2005 (UTC)

Section added. Forgot to log in; whoops. Joshf 01:08, 11 September 2005 (UTC)


Some ideas: PDFs are not editable, or at least highlightable, without purchasing Adobe software. PDFs using fonts not available on one's computer will not show up correctly. Acrobat Reader does not have color-printer friendly print options. (It will unnecessarily waste color ink on full color pages.) It is difficult to copy text from more than two pages of them without unnecessary line-breaks. --68.255.233.108 20:18, 5 March 2006 (UTC) (Anonymous)

Note that the claim that PDFs using fonts not available on one's computer will not show up correctly is incorrect, unless the creator of the PDF forgot to embed fonts. Notinasnaid 08:33, 6 March 2006 (UTC)
These criticisms are either wrong (e.g. the fonts - the whole point of PDF is to avoid this), or else are criticisms of particular implementations (e.g. Adobe Reader). None of these things have anything to do with PDF as a format. Try using Mac OS X- it uses PDF throughout as its metafile format and you don't need any Adobe software. You also don't run into any of these problems. Graham 14:11, 6 March 2006 (UTC)
By the way, these "suggestions" have one flaw: it sounds like you thought of them yourself. Wikipedia isn't for reflecting your opinions, but the research of others. The link under Criticism in the article is a good example of what is needed: you don't have to agree or disagree, just note that a source, considered to be reputable, made these criticisms. Above all, if the opinion of the writer comes through in an article, then the writing is wrong. Notinasnaid 09:09, 9 March 2006 (UTC)
That's fine; I'm still not aware of all the Wikipedia rules or the actual inner-workings of the PDF format. That's why I just listed some ideas here as possible starting points. Here are some articles I found which mention some similar (uneditability) and some new criticisms. I realize that many of these have to do more with programs rather than the format itself. I also note that the following articles mostly praise the format. If I could find anything on the subject of PDF criticisms in credible sources, this was it. --Macrowiz 06:07, 10 March 2006 (UTC)
-"Now for the bad news. It would seem logical that, since .pdf files contain text, you could open a .pdf document with a word processor, but generally you cannot." source: van Horn, Royal. Adobe Acrobat Mysteries. Phi Delta Kappan 83 no3 186-7 N 2001
-"How can one prove conclusively that the PDF version of a native file was accurately converted from the original file and is an authentic copy of the original file for legal and regulatory audit purposes?" "Viewing documents online can be equally frustrating. 'Viewing an online PDF file involves several components: a Web browser, the Acrobat viewing plug-in for the browser, the Acrobat viewing program itself, ... and the server.'" source: Phillips, John T. Should PDF be used for archiving electronic records?. Information Management Journal v. 35 no1 (Jan. 2001) p. 60-3
-"For example, Acrobat's original read-only electronic renditions became-in the words of an Adobe Acrobat marketing executive a 'roach motel.' Documents got in but couldn't get out. Acrobat's WYSIWYG strategy made PDF files static in a world that increasingly values interaction. Further, in a wireless-Web world with many different screen sizes, WYSIWYG becomes less meaningful. Few screens have the dimensions of printed pages." source: Boeri, Robert J. "Infoinsider - Acrobat Scores Again" EContent. 26, no. 11, (2003): 40.
These are fair points if you are thinking of PDF as something to view documents on the web with. However, that's not the reason PDF was invented - like many things, it was invented for another reason altogether, then the internet came along and everyone decided to jump on that particular bandwagon. As an internet technology, PDF is sort of OK, but it's a bit of a square peg. However, when used for its original purpose - creating exact files for print reproduction - it's absolutely marvellous. Prior to PDF, printing stuff was a nightmare, with no guarantee of having the right fonts at both ends, no colour proofing, no guarantee that line layouts would be preserved correctly, etc. PDF fixes all of that - send a PDF to the printer, you get the results you intend. Even Adobe seem to have forgotten that that was all it was meant to be for in the first place, to solve the print nightmare that existed. Once you think of PDF as an exact snapshot of a printable page, all these other criticisms become moot. It's electronic paper, so just like paper you can't (generally) edit it. And as for not opening in a word processor, well, speak for yourselves, Windows users. PDF files open just fine in my Mac word processor. And in my eMail client. And in my web browser. And in any image editor. Again, don't blame the format for deficiencies in the implementation. Graham 06:23, 10 March 2006 (UTC)
This is much better, something to work on. It's an interesting question: what to say if you know that a reputable person is making a mistaken criticism. Do you have the right to challenge it? Probably only by finding a counter blast from someone of equal repute. Sources win over personal opinions hands down. However, the criticisms are usefully put in context: who said this, what were they writing about overall? I would worry about the first "It would seem logical that, since .pdf files contain text, you could open a .pdf document with a word processor, but generally you cannot." That doesn't sound like a criticism, but the admission of someone who found they had made a mistake. The quote may still be useful, but as a neutral one under a heading relating to editing/reuse of PDFs. Many people do make this mistake, but it's their assumption that is wrong. Notinasnaid 08:36, 10 March 2006 (UTC)
Actually, ditto the last one, "Acrobat's original read-only electronic renditions...". It's a reasonable quote but shows a mismatch of expectations. These could be placed in criticisms if the general concept is editorialised under the heading of "non-editability and non-reversibility"; however, this would have to be balanced by pointing out that this isn't what they are for. It is rather like the criticism that my car doesn't fly: I would like it to fly, but it isn't a fault. Notinasnaid 08:41, 10 March 2006 (UTC)