Talk:Diff

WikiProject Computing: this article is within the scope of WikiProject Computing, a collaborative effort to improve the coverage of computers, computing, and information technology on Wikipedia. It has been rated B-Class on the project's quality scale and Mid-importance on the project's importance scale.
 

Subsequence vs. substring

This article mentions the longest common substring algorithm, but shouldn't it be subsequence? 129.241.124.103 (talk) 21:07, 10 July 2013 (UTC)

Hmm, I'm rambling. It *is* subsequence, but for some reason I had the substring page up in another tab, and thought I clicked it from here. 129.241.124.103 (talk) 21:14, 10 July 2013 (UTC)
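For anyone else who mixes the two up, here is a minimal Python sketch of the distinction. It is only an illustration written for this thread, not code from any diff implementation, and both function names are made up. A subsequence may skip items, while a substring must be contiguous.

```python
def longest_common_subsequence(a, b):
    """Longest run of items found in both a and b in order; gaps are allowed."""
    # table[i][j] holds the LCS length of a[:i] and b[:j].
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            if x == y:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Walk back through the table to recover one longest subsequence.
    out = []
    i, j = len(a), len(b)
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]


def longest_common_substring(a, b):
    """Longest contiguous slice of a that also appears contiguously in b."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i, x in enumerate(a, 1):
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, 1):
            if x == y:
                # Extend the common run that ended just before both positions.
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return a[best_end - best_len:best_end]
```

With "axbxc" and "abc", the subsequence is a, b, c (length 3), but no common substring is longer than one character.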

Diff3

Another contributor and I are unsure whether some of the information on diff3 is accurate. Specifically, we do not fully agree (for lack of knowledge, rather than conflicting knowledge) on the evolution of diff3, its current relevance to users, and its technical relationship to the merge utility. Any contributions and corrections (with references, please!) are welcome. (However, having just looked at the GNU diff3 source, I have to say that it's such a simple-minded program that I doubt it's worthy of further analysis.)

Diff Feature Comparison

This feature comparison matrix in its current form looks really superficial. It's a good idea, but I don't think very many software articles on Wikipedia serve as running software reviews of different packages. Most of these diff applications are graphical interfaces anyway.


Diff on Wikipedia

Does anyone know what diff program Wikipedia uses for its page histories, which shows the specific location in a line where text begins to differ, or what such a program would be called? This seems to be slightly different from the UNIX program, which only shows differing lines but not where in the line they differ.

Wikipedia currently doesn't use diff at all; the history functionality is provided by MediaWiki, which is written in PHP. I imagine this includes the functionality showing results on a word-by-word basis. --132.198.104.111 17:38, 13 Jul 2004 (UTC)
So is there any piece of software that does this thing for files? -- 193.226.167.123 17:52, 20 March 2006 (UTC)
If any of my reverted edits were allowed to remain or if my merge proposal wasn't voted down, this discussion might not be necessary.
Wikipedia uses a file comparison utility that works when you click "diff." It might not be the Unix diff program though. File comparison utilities that might do what you want are listed in various sections at the bottom of this article, and on the bottom of the file comparison page. One of the links at Help:Contents might lead you to more information.
Usability is dead. -Barry- 18:41, 20 March 2006 (UTC)
Thanks! I just found something that looks somewhat promising (the external link at Help:Diff in the "Tracking changes" section). -- 193.226.167.123 19:16, 20 March 2006 (UTC)

In Unix land there's something called Wdiff. --65.19.87.53 01:35, 22 March 2006 (UTC)
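As a rough illustration of what a word-by-word comparison inside a line involves, here is a minimal sketch using Python's standard difflib module. It is not the MediaWiki code and not wdiff itself. The function name word_diff is made up, and the [-...-] and {+...+} markers are chosen only to resemble wdiff-style output.

```python
import difflib


def word_diff(old_line, new_line):
    """Mark removed words as [-...-] and added words as {+...+}."""
    old_words = old_line.split()
    new_words = new_line.split()
    # Compare lists of words instead of lists of lines.
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            pieces.extend(old_words[i1:i2])
            continue
        if i1 < i2:  # words present only in the old line
            pieces.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if j1 < j2:  # words present only in the new line
            pieces.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(pieces)


print(word_diff("the quick brown fox", "the slow brown fox"))
```

Splitting on whitespace discards the original spacing, so this is only suitable for display, not for producing something a patch tool could apply.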

This article should probably mention the use of diff pages on Wikipedia. They are a pretty accessible example of diff pages in broader (non-strictly-software) use. -Bonus Onus (talk) 15:54, 21 January 2009 (UTC)

Layman's Terms

Could someone explain this in layman's terms?

The diff algorithm is an implementation of the Longest-common subsequence problem, which is now linked to in Diff's "See also" section.
I've added an example for clarity. I don't say it's a good (or realistic) example, but it will do for now. Without an example it's not clear what a diff really is, or looks like. --Shinobu 18:14, 28 Mar 2005 (UTC)
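In case it helps make the connection to the longest common subsequence concrete, here is a small Python sketch (an illustration only, with a made-up function name, not how any particular diff program is written). Lines that belong to a longest common subsequence of the two files are kept, every other old line is reported as removed, and every other new line as added.

```python
def line_diff(old, new):
    """Return lines prefixed with '  ' (kept), '- ' (removed) or '+ ' (added)."""
    n, m = len(old), len(new)
    # lcs[i][j] holds the LCS length of old[i:] and new[j:].
    lcs = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if old[i] == new[j]:
                lcs[i][j] = lcs[i + 1][j + 1] + 1
            else:
                lcs[i][j] = max(lcs[i + 1][j], lcs[i][j + 1])
    out = []
    i = j = 0
    while i < n and j < m:
        if old[i] == new[j]:
            out.append("  " + old[i])  # part of the common subsequence
            i += 1
            j += 1
        elif lcs[i + 1][j] >= lcs[i][j + 1]:
            out.append("- " + old[i])  # dropping this old line loses nothing
            i += 1
        else:
            out.append("+ " + new[j])
            j += 1
    out.extend("- " + line for line in old[i:])
    out.extend("+ " + line for line in new[j:])
    return out


for row in line_diff(["a", "b", "c"], ["a", "c", "d"]):
    print(row)
```

The table costs time and memory proportional to the product of the two file lengths, which is fine for a toy but is one reason real tools use more refined algorithms.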


Binary v. Text Sentence in Intro

Apart from that, shouldn't "The first editions of the program were designed for line comparisons in text files. By the 1980s, support for binary files was necessary resulting in a shift in the application's design." be in the History section? Just a thought. --Shinobu 18:14, 28 Mar 2005 (UTC)

I agree it seems out of place, but I think it's necessary to somehow mention in the intro that diff handles text and binary data. The sentence is just being a little more specific. No worries.


Why I deleted Unix and the $-sign

I deleted Unix from the intro because there are diffs for every platform. The Unix heritage is discussed in the History section so I think that's okay. I removed the $-sign, because prompts look different on every system (even if one compares two Unixen). Apart from that, most texts illustrating a command to be entered don't display the prompt, because it might be confused with input that has to be entered by the user. Bye, Shinobu 18:12, 25 Apr 2005 (UTC)


Not enough said about patch(1)

I think the page should have a little more about patch(1) - not the whole content of patch's page, but a short introduction and a link to the page; elsewhere than in the GNU advertisements or "See also" section. I might write something when I have time.

I've added a sentence to the History section. It could probably use mentioning in the top section, I couldn't think of what. --132.198.104.164 17:49, 15 July 2005 (UTC)

Where to list DiffNote

DiffNote parses the output of the GNU diff command 'diff -y' and it's a free web application. I originally listed it under Free software implementations, but someone moved it to External links because it's not software and it's a web app. I think it is a free software implementation, as in an implementation of software, and the description of Free software implementations fits it. I don't want to revert it back to the Free software implementations section because it's my web app and some people might say I shouldn't even have posted it in the first place, but if there's agreement here that it belongs under Free software implementations, I'll move it back. -Barry- 19:56, 25 February 2006 (UTC)

I do not think it is an implementation as it is merely parsing diff's output, according to you. --maru (talk) contribs 20:24, 25 February 2006 (UTC)
I'm not sure whether parsing is the right word. DiffNote shows invisible characters as red dots, while 'GNU diff -y' renders them. If you use 'GNU diff -y' on a line that has a carriage return in the middle of the text, the CR will cause the text after the CR to show at the beginning of the line and the text that was at the beginning of the line in the file you were diffing won't be shown. In DiffNote, all text is shown and the CR is represented by a red dot (which will eventually be titled "ASCII 13" so you'll know exactly what it represents). The order of the lines in the output, and (essentially) the gutter markers are the same as in 'GNU diff -y' but other things, like the red dots, are different. -Barry- 20:41, 25 February 2006 (UTC)
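The idea of making a carriage return visible instead of letting the terminal act on it can be sketched in a few lines of Python. This is not DiffNote's code. The function name and the "<ASCII n>" placeholder text are assumptions chosen to echo the "ASCII 13" title mentioned above.

```python
def show_control_chars(line):
    """Replace each ASCII control character with a visible placeholder."""
    out = []
    for ch in line:
        code = ord(ch)
        if code < 32 or code == 127:
            # A carriage return (13) would otherwise move the cursor back
            # to the start of the line and hide the text before it.
            out.append("<ASCII %d>" % code)
        else:
            out.append(ch)
    return "".join(out)


print(show_control_chars("first half\rsecond half"))
```

A real tool would probably leave tabs alone or render them differently; this sketch flags every control character the same way.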

Patience diff

There seems to be increasing interest in the use of the patience diff algorithm, particularly for revision control. In particular, Codeville's precise merge algorithm (pcvd) employs it, and there has been interest in employing it for Bazaar-NG (see https://lists.ubuntu.com/archives/bazaar-ng/2005q4/005971.html, https://lists.ubuntu.com/archives/bazaar-ng/2006q2/010652.html). I've seen a few references (including Codeville's own documentation for pcvd at http://revctrl.org/PreciseCodevilleMerge) that point to the Patience sorting article.

I feel that this is something that wikipedia could treat better, but I'm not sure how to go about it.

Daf 19:59, 21 May 2006 (UTC)
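For readers wondering why those references point at patience sorting, here is a hedged Python sketch of the anchoring step as patience diff is commonly described: take the lines that occur exactly once in each file, then keep the longest run of them that appears in the same order in both, found with a patience-sorting pass. This is an illustration only, with made-up names, and is not taken from Codeville or Bazaar. A full patience diff would then recurse on the gaps between anchors.

```python
import bisect
from collections import Counter


def unique_common_anchors(old, new):
    """Return (old_index, new_index) pairs for the in-order unique common lines."""
    old_counts = Counter(old)
    new_counts = Counter(new)
    new_index = {line: j for j, line in enumerate(new) if new_counts[line] == 1}
    # Candidate matches, already sorted by position in the old file.
    pairs = [(i, new_index[line]) for i, line in enumerate(old)
             if old_counts[line] == 1 and line in new_index]

    tops = []      # tops[k]: smallest new-index that ends an increasing run of length k + 1
    top_pair = []  # index into pairs of the entry currently on top of pile k
    back = [None] * len(pairs)  # previous pair in the run ending at each pair
    for p, (_, j) in enumerate(pairs):
        k = bisect.bisect_left(tops, j)
        if k == len(tops):
            tops.append(j)
            top_pair.append(p)
        else:
            tops[k] = j
            top_pair[k] = p
        back[p] = top_pair[k - 1] if k > 0 else None

    # Follow the back pointers from the top of the last pile.
    anchors = []
    p = top_pair[-1] if top_pair else None
    while p is not None:
        anchors.append(pairs[p])
        p = back[p]
    return anchors[::-1]


print(unique_common_anchors(["a", "b", "c", "d"], ["c", "a", "b", "d"]))
```

In the example, "c" has moved to the front of the new file, so it is left out of the anchors and would show up as a removal plus an addition.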

Diff links

You're allowed to think the numerous links on the Diff page are "useful"[1], but that's not the measuring bar. The links I deleted are not relevant to the diff command. They're neat little Web sites that do a task similar to what Diff does, but not every generic Web file comparison tool can be expected to be listed. After reviewing all of them, I did allow one Web site of note that provides output similar to using the actual Diff command. A lot of them are just more of the same thing and pollute the article's "External Links".

If you don't know, Wikipedia is not a link repository. Suggestions on what to include in "External Links" can be read at Wikipedia:External links. Thanks to your revert I did notice that I deleted the Unix commands template. Thanks for that. --71.254.13.237 23:53, 30 June 2006 (UTC)

One of the links you deleted was to DiffNote, which I created. It's a wrapper for diff -y, with some extra features, so it's pretty closely related to diff. Several months ago, I researched the other diff tools that were listed in the article, and many use diff as their back end. I'd agree to use that as the standard for inclusion. Right now, I can only confirm that DiffNote uses diff for its back end, so I'd like DiffNote put back.
I'd also like to make it easy to find alternative file comparison tools, so I'd add a link to File comparison to the See also section and mention that other tools are listed there. I'd also add something like "similar tools listed at File comparison" at the bottom of the external links section, even though it's not an external link, since that's the section that would catch people's eye if they're looking for similar tools. File comparison would then be linked three times from Diff. I obviously like usability. That's why I created DiffNote. -Barry- 01:15, 1 July 2006 (UTC)
Mentioning that your revert was based on protecting your own Web site's listing would have been the better thing to do, for the sake of full disclosure. This conflict of interest of yours is obviously going to cloud what you think the "standard of inclusion" will be. This is one of the things Wikipedia advises against at Wikipedia:External links#Links to normally avoid. --71.254.13.237 22:45, 1 July 2006 (UTC)

I proposed a file comparison infobox here. -Barry- 03:28, 5 July 2006 (UTC)

Are you attempting to circumvent the removal of these external links on the Diff article by pursuing a different process at Wikipedia that will result in having these external links instead forced on every single file comparison tool page on Wikipedia? I'm sorry if that sounds like a rhetorical question, but I think you're acting quite boldly. --71.254.7.215 05:30, 5 July 2006 (UTC)

Links to free file comparison tools that use Diff as the back end are appropriate for this article, but I think my infobox is even better because it's easier to maintain and offers more information in a consistent, neat package. Your last two replies to me are argumentative and don't address the issue of whether my ideas will improve the article. -Barry- 06:11, 5 July 2006 (UTC)

I'm sure you think your infobox is better, but could you withdraw your proposal, since it is poor form, so that we can instead wait for comments on whether these links deserve to be shown on a single page before you start placing them on multiple pages on Wikipedia? I'm sorry if I seem argumentative. I am finding it difficult to deal with an editor who has inserted themselves into the question of whether external links should appear on a page when one of the links is to their own personal Web site, who wants to forgo the current discussion on whether or not to list these external links and has gone ahead with an infobox proposal that would include such links on numerous pages, a person who hasn't really acknowledged their conflict of interest, a person who ignores both precedent and suggestions provided by the Wikipedia project, and a person who copied private discussion on a user page to an article's talk page without permission. On the last, this is definitely a legal act and not against Wikipedia policy, but I would have submitted my comments to the discussion page if that's where they were directed. Would you be willing to withdraw your proposal and actually let time pass so that others can comment, something you were originally interested in? I would also accept deleting both our entries in this argument back up until your proposal on 5 July, so that people can avoid the noise we've contributed and have a better chance of engaging in the discussion. What do you think? --71.254.7.215 15:01, 5 July 2006 (UTC)

I was about to try a request for comments, then I noticed the third opinion page, then I decided to propose that the third opinion page be deleted rather than ask for a third opinion. I won't delete this discussion or go back to the old idea.
Got a response to my post on third opinion. I think I'll try RFC. Or I might just add the infobox tomorrow and leave it to you to dispute. -Barry- 19:42, 5 July 2006 (UTC)
And how would you see that as promoting civility and consensus? William Pietri 20:20, 5 July 2006 (UTC)
It doesn't, but I don't think it violates that either. The IP guy seems to mainly be questioning my motives. I don't think that should be the focus of a dispute. But if anyone here prefers the RFC or third opinion option, go ahead and try it. I tried RFC once and it didn't attract any comments. I've been through failed mediation too, and I'm currently in arbitration, so I'm familiar with the process. I prefer to avoid it, but I'll participate if someone wants to initiate it. -Barry- 20:37, 5 July 2006 (UTC)
I think saying that you may just ignore another user with reasonable questions is working against consensus. At least at first look, I agree with the numeric editor: the article is better without the array of links, and your personal interest in promoting your own work is indeed a reasonable concern. Per WP:VAIN and WP:AUTO, it's better to let others notice the significance of your work and add it. William Pietri 22:49, 5 July 2006 (UTC)
I started an RfC at Wikipedia:Requests_for_comment/Maths,_science,_and_technology#Telecommunications_and_digital_technology -Barry- 23:09, 5 July 2006 (UTC)
Great. Could you tell me about your reasoning in adding your link back without (or possibly in spite of) consensus? Thanks, William Pietri 04:15, 6 July 2006 (UTC)
The anonymous editor put back the link to a free, online interface to Diff, and that's basically what my link is. -Barry- 04:29, 6 July 2006 (UTC)

How is that justification for your action given that the link you added in particular is under debate here and at the center of an RfC (spawned without consensus as of late)? The link that I allegedly "put back" was mentioned in the original comment that started this whole discussion and even gave a rationale, "I did allow one Web site of note that provides output similar to using the actual Diff command." (I actually don't think it was "put back", it was kept.) Did you read my comment? If I did put it back on the second edit, I meant to keep it in my first attempt to pare down the external links. Regardless, nothing has changed that justifies adding your Web site.

This is at least the second example of both poor form and etiquette by user -Barry-. I don't really have time for this, and my willingness and repeated offers to negotiate and collaborate on this are met with comments like, "I might just add the infobox tomorrow and leave it to you to dispute." --71.254.7.215 05:02, 6 July 2006 (UTC)

Why don't you want my link while you allowed a link to a similar tool to remain? Just because it's my tool? -Barry- 05:05, 6 July 2006 (UTC)
I think you adding a link to your tool is inappropriate. As WP:VAIN and WP:AUTO suggest, it's best to let others note the notability of your efforts; we are unavoidably biased about our own projects, or we wouldn't do them. William Pietri 05:52, 6 July 2006 (UTC)
So let me hear you or IP guy discuss whether my link should be added. There's a similar link. Why not mine too? Disallowing something for no reason is bad faith. -Barry- 05:59, 6 July 2006 (UTC)
I'm not disallowing it. I'm saying you shouldn't add it back after it has been removed by another editor because of the potential violation of assorted policies and guidelines. What links belong here, if any, is an interesting issue, but it's a different issue. William Pietri 06:11, 6 July 2006 (UTC)
I just noticed the reasoning "I did allow one Web site of note that provides output similar to using the actual Diff command." That's something, so I'll delete my link. We still have to settle the infobox issue though. -Barry- 06:13, 6 July 2006 (UTC)

unified diff example

the current diff -u example does not show at all that diff -u includes context above and below differences; for a moment, I had to wonder whether the article was showing an old version of diff -u that did not do context; but it's the example that does not allow any context as the differences are directly at both begin and end. --Habbie 01:13, 7 August 2006 (UTC)

Yeah. We'll need a better example. I believe I'm the originator of the example - I at first used it as a stop-gap until something better was thought up, but eventually forgot about it and that's where we are now. Shinobu 06:50, 7 August 2006 (UTC)

The example has been modified to show the context in patches, and the article now includes a "context format" output. --69.165.73.238 14:46, 15 August 2006 (UTC)
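The point about context being invisible when the changes sit at the very start or end of the file can be checked with Python's standard difflib module, which produces unified output in the same style. This is a sketch for this discussion, not the article's example, and the file labels "original" and "new" are arbitrary.

```python
import difflib


def diff_with_context(old, new, context=3):
    """Return unified diff lines with the given number of context lines."""
    return list(difflib.unified_diff(old, new, fromfile="original",
                                     tofile="new", n=context))


old = ["line %d\n" % n for n in range(1, 11)]

# A change in the middle: three unchanged lines appear on each side of it.
middle = list(old)
middle[4] = "line 5 (changed)\n"
print("".join(diff_with_context(old, middle)))

# A change on the first line: there is nothing above it to show as context.
top = list(old)
top[0] = "line 1 (changed)\n"
print("".join(diff_with_context(old, top)))
```

An example whose differences touch both the first and last line therefore looks the same with or without context, which is what made the old example misleading.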

unified diff

I've majorly overhauled the 'unified diff' section... please give comments. AFAICT, the chunk size is redundant, as the patch program should be able to count the removed/added/contextual lines on its own. Anyone know otherwise? --=== Jez === 14:50, 7 September 2006 (UTC)

I assume by your second set of edits that chunk size is indeed required information:

If either chunk size doesn't correspond correctly with the number of actual such lines in the chunk, or any lines in the chunk are not preceded by +, -, or space, then the diff file is invalid.

I'm not so sure all the technical information on the unified diff format is required because some of your conclusions may not apply to all implementations of the patch utility. Plus, it's a little bit of original research, and may not be necessary in an encyclopedia. You could try contributing it to the GNU diffutils manual (assuming your notes are based on GNU patch). It also now duplicates the summary I derived from your contributions. I'll try to fix that later. --71.161.218.87 00:51, 12 September 2006 (UTC)

I've tried my best to remove duplication. The section on unidiff now looks much shorter and even reasonable when compared to the context format. It still needs work. I imagine the file hunk header information applies to the context format and could be moved to its section. --72.92.128.194 02:24, 13 September 2006 (UTC)

The paragraphs about range information in unified format were unnecessarily complicated, invoking a symbol R for the reader to mentally rewrite into "l,s or l, as the case may be". I've changed the example to the basic l,s case and left the optional omission for the text discussion, as well as cleaning up a confusing chiasmus and making smaller readability edits. --Thnidu (talk) 18:49, 10 June 2010 (UTC)
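To make the l,s range discussion concrete, here is a hedged Python sketch that parses a unified hunk header and checks the stated sizes against the hunk body, following the validity rule quoted earlier in this thread. The function names are made up. Treating an omitted size as 1 follows the unified format as documented for GNU diff, and other implementations are not covered.

```python
import re

# "@@ -l,s +l,s @@" where each ",s" may be omitted.
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunk_header(line):
    """Return (old_start, old_size, new_start, new_size)."""
    match = HUNK_HEADER.match(line)
    if match is None:
        raise ValueError("not a unified hunk header: %r" % line)
    old_start, old_size, new_start, new_size = match.groups()
    return (int(old_start), 1 if old_size is None else int(old_size),
            int(new_start), 1 if new_size is None else int(new_size))


def hunk_sizes_match(header, body_lines):
    """Check that the sizes in the header agree with the hunk body."""
    _, old_size, _, new_size = parse_hunk_header(header)
    old_seen = new_seen = 0
    for line in body_lines:
        if line.startswith("\\"):
            continue  # "\ No newline at end of file" is not a content line
        if line.startswith(" "):
            old_seen += 1  # context lines count on both sides
            new_seen += 1
        elif line.startswith("-"):
            old_seen += 1
        elif line.startswith("+"):
            new_seen += 1
        else:
            return False  # not preceded by +, - or space
    return old_seen == old_size and new_seen == new_size
```

So the sizes are redundant in the sense that they can be recomputed from the body, but a reader needs them to know where one hunk ends without looking ahead.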

Example error?

Is there an error on the diff examples? Specifically, the sentence "things can be added after it." never appears in the given orig/new examples.


Good eye. You may be correct. I've added it to the starting files. I don't have time now, but maybe we can verify the examples actually correspond verbatim with each other. (You can use diff to make sure.) --71.161.214.168 21:52, 19 November 2006 (UTC)

I ran diff on the files, and the diffs appear to have the correct line number information. --75.69.52.204 12:32, 20 November 2006 (UTC)

You diffed the diffs? — SheeEttin {T/C} 16:21, 25 July 2007 (UTC)

interdiff

I think this article should mention interdiff, which is a program similar to diff that shows differences between diff files. Apparently it's part of patchutils by Tim Waugh. —Preceding unsigned comment added by 207.41.202.13 (talk) 16:01, 7 December 2007 (UTC)

HTML Diff and Daisy Diff

Suggest to add after Daisy Diff HTML Diff http://www.aaronsw.com/2002/diff/ —Preceding unsigned comment added by 88.149.243.200 (talk) 20:01, 15 March 2008 (UTC)

There are lots of specialized Diff tools. I just added a bunch of citations to research papers on the subject. Obviously HTML and XML are popular mentions since Wikipedia is a Web encyclopedia, but I think we need to avoid the temptation of turning the page into a directory of common day differencing tools. Such mentions are helpful to Web browsing but not to an article on Diff.

I suggest we remove most of the "free comparison tools" list in the "See also" section. --Ashawley (talk) 19:43, 12 October 2009 (UTC)

Is there a diff for WindowsXP?

Is there a diff for WindowsXP? —Preceding unsigned comment added by 71.131.2.120 (talk) 04:04, 9 June 2008 (UTC)

Yes. See Comparison of file comparison tools. It also lists many other very similar tools that run on WindowsXP. —Preceding unsigned comment added by 68.0.124.33 (talk) 04:14, 26 May 2009 (UTC)

Definition

The definition in the first paragraph is a little long. This part seems to make it repetitive: "or the changes made to a current file by comparing it to a former version of the same file". It would be better to make the first sentence more concise by removing this, or by making it a separate sentence.

I feel like it doesn't belong just because it's already assumed that diff compares multiple versions of the same file. --Ashawley (talk) 18:44, 14 August 2008 (UTC)

I've changed it to, "It is typically used to show the changes between a file and a former version of the same file." I'll let the Wikipedia wordsmiths sort it out from here. --Ashawley (talk) 04:12, 13 May 2009 (UTC)

Error about context format chunk ranges?

Are you sure that "the chunk ranges specify the starting line number and the number of lines the change hunk applies to in the respective file"? By looking at the example's ('diff -c') range numbers I think that these numbers represent starting line number and ending line number (not the affected lines count). What do you think? —Preceding unsigned comment added by 62.212.196.191 (talk) 01:38, 12 May 2009 (UTC)

Looks like you're right. I've changed it to, "The chunk ranges specify the starting and ending line numbers in the respective file." If I had to guess, it was probably a copy and paste typo from the section on the "unified format", where it is the case. --Ashawley (talk) 03:57, 13 May 2009 (UTC)
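The difference between the two range conventions can be seen side by side with Python's standard difflib module, which emits both formats. This is only a sketch with made-up function names, and it uses difflib rather than GNU diff, although the range conventions shown are the same ones discussed here.

```python
import difflib


def context_to_unified(start, end):
    """Convert a non-empty context-format range (start, end) to (start, count)."""
    return start, end - start + 1


def first_old_ranges(old, new, context=3):
    """Return the old-file range text of the first hunk in each format."""
    unified_header = next(
        line for line in difflib.unified_diff(old, new, n=context)
        if line.startswith("@@"))
    context_header = next(
        line for line in difflib.context_diff(old, new, n=context)
        if line.endswith(" ****\n"))
    unified_old = unified_header.split()[1].lstrip("-")  # text after "@@ -"
    context_old = context_header.split()[1]              # text after "*** "
    return unified_old, context_old


old = ["line %d\n" % n for n in range(1, 11)]
new = list(old)
new[4] = "line 5 (changed)\n"
print(first_old_ranges(old, new))
```

For the same seven-line hunk, the unified header gives a start and a length while the context header gives a start and an end, which matches the correction made above.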

External links

I'm saddened to see so many links in the "External Links" section. Just because something has the name "diff" included in it or does file comparison of some sort does not mean it's relevant. Looking over the revision history, most of these edits were made by one-time anonymous editors, and not by long-time editors.

There are nearly 20 links, currently. Let's see what they are shall we?

Altova DiffDog
is a "unique diff / merge utility" with a "visual interface" for "files, directories, and database tables" to "compare and merge text or source code files, synchronize directories, and compare database tables" and also "provides advanced XML-aware differencing and editing capabilities". Interesting, and related, but doesn't have anything to do with the article. Removing.
GNU Diff utilities
looks like classic unix diff as implemented by the GNU project, with source code to boot. Good.
Kdiff3
"compares or merges two or three text input files or directories" with a "graphical color display". Interesting, and useful, and probably has source code, and maybe again this is how people will be working with Diffs some day (and become a mention in the article), but unfortunately, Wikipedia is not a crystal ball. And even if it's useful to some, Wikipedia is not a link farm. Kdiff3 could have an article of its own and be a "See also" mention from this article. But that's just not the case, since it doesn't right now. I'm not going to start notable 3 years ago, either[2]). Removing.
DiffUtils for Windows
Seems to be a duplicate since it's just a port of the GNU Diff utilities. Windows is a dominant operating system, but every operating system cannot be listed, and besides, why should Windows receive special privilege? Further, what if even more people port and package GNU diff utilities to Windows? List those too? Removed.
Algorithm-Diff, TextDiff, java-diff and the JavaScript diff algorithms [5]
all seem to be libraries written in various programming languages with the name "diff" in them, and perhaps the core diff algorithm implemented--though I haven't verified them. Their purpose is generic by design, but none of these describe an implementation of diff in toto. Removing.
DaisyDiff
is a "Java library" that compares "HTML files" and "highlights added and removed words and annotates changes to the styling". This is something that again isn't Diff, but is related--especially since HTML files are also text files--but not every domain that the diff concept is applied to can be listed--even the interesting ones. Again, it would be a valid "See also" entry. Removing.
WinMerge
is in the "See also", so I'm not sure why this is here since there is already a WinMerge article. Removed.
Adobe Flex Diff
is another library implementation so removing per above. Removed.
Meld
is the "Gnome GUI Diff tool". Doesn't belong for the same reason as "Kdiff3". Removed.
DeltaWalker
is a "file and directory comparison and synchronization" tool "for Linux". It "lets you compare, edit, and merge files and synchronize directories", and "does so visually". Doesn't belong for the same reason as "Kdiff3" and "Altova DiffDog". Removing.
JSBlend
is a "web-based file comparison and merge tool" written "mostly in Javascript" that "relies on a Python or PHP backend" that in turn "relies mostly on the GNU DiffUtils toolset". Sounds like it's really "mostly" GNU diffutils with a Web frontend. Removing.

If the "External Links" section looks paltry by consequence, don't worry, the "References" section of the article has relevant links to make up for it.

So that leaves only one link. I'm fine with not having anything, since the GNU diffutils link is not sacrosanct and the Diffutils manual is referenced in the article, already.

When you remove links that are irrelevant please refer to the Talk page in your edit. --Ashawley (talk) 05:20, 13 May 2009 (UTC)

File addition and deletion to the patch format

The patch format discussion is missing the fact that if the "+++" line in the header refers to "/dev/null" rather than the file being edited, the patch program will interpret the hunk as a file deletion (and will probably expect a single hunk deleting all lines from the file). If "---" refers to "/dev/null", the patch program will interpret that as file creation (and will probably expect a single hunk adding all lines to the file). I am not sure what the patch program will do if the "/dev/null" reference appears next to any other type of patch. -- Preceding unsigned comment added by Jozue (talk • contribs) 21:59, 30 August 2009 (UTC)

I also discovered that added and removed files can be denoted by making the timestamp inside the "---" or "+++" line equal to the start of the Epoch (1970-01-01 0:0:0 UTC). Also, these discoveries relate to the Unified Diff format (not the Context Diff format), although they might be relevant to the Context Diff format as well. Jozue (talk) 22:23, 30 August 2009 (UTC)
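The "/dev/null" convention described above could be read by a tool roughly as follows. This Python sketch only encodes the behaviour reported in this comment. It has not been checked against any particular patch implementation, the function name is made up, and the Epoch-timestamp variant is deliberately not handled.

```python
def classify_file_patch(old_header, new_header):
    """Guess whether a unified diff creates, deletes or modifies a file."""
    if not old_header.startswith("--- ") or not new_header.startswith("+++ "):
        raise ValueError("expected '--- ' and '+++ ' header lines")
    # The file name ends at the first tab; a timestamp may follow it.
    old_name = old_header[4:].split("\t")[0].strip()
    new_name = new_header[4:].split("\t")[0].strip()
    if old_name == "/dev/null" and new_name == "/dev/null":
        raise ValueError("both sides refer to /dev/null")
    if old_name == "/dev/null":
        return "create"
    if new_name == "/dev/null":
        return "delete"
    return "modify"


print(classify_file_patch("--- /dev/null", "+++ b/notes.txt"))
```

What a patch program does when "/dev/null" appears next to a hunk that is not a whole-file addition or deletion remains the open question raised above, and this sketch does not answer it.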

Categories

What is the justification for being in Category Formal Languages? It was added to Revision as of 15:48, 14 May 2008. I suggest deleting it. H.Marxen (talk) 20:16, 10 September 2009 (UTC)

Separate "References" section

Shouldn't there be a separate section that is separate from the footnotes? I made such an edit as [3], but it was reverted. --Ashawley (talk) 19:57, 12 October 2009 (UTC)

Project Xanadu reference

This surprised me. Transclusion is the inclusion of parts of documents into other documents. I don't see the relationship with diff, which doesn't do transclusion, but takes two text files and produces the list of differences. Besides, Xanadu was a system to hold documents, not a utility; Xanadu was vaporware while diff was highly practical. My proposal is to omit this paragraph entirely. Rp (talk) 08:38, 26 March 2010 (UTC)

Done. Rp (talk) 12:35, 26 August 2010 (UTC)

The patch program should be aware of this.

From the article: As a special case, unified diff expects to work with files that end in a newline. If either file does not, unified diff will emit the special line "\ No newline at end of file" after the modifications. The patch program should be aware of this.

Is the patch program aware of this (as it should be), or isn't it, in which case this is a problem that should be addressed? --221.6.44.4 (talk) 05:31, 2 June 2010 (UTC)
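To show what "being aware of this" could mean in practice, here is a hedged Python sketch of a reader that rebuilds the new side of a unified hunk and honours the marker line. It is an assumption about reasonable behaviour, not a description of what GNU patch or any other program actually does, and the function name is made up.

```python
def new_side_of_hunk(body_lines):
    """Rebuild the new file's lines from a unified hunk body.

    Each input line is expected to keep its trailing newline.
    """
    out = []
    previous_is_new_side = False
    for line in body_lines:
        if line.startswith("\\"):
            # "\ No newline at end of file" refers to the line just before it.
            if previous_is_new_side and out and out[-1].endswith("\n"):
                out[-1] = out[-1][:-1]
            continue
        if line.startswith((" ", "+")):
            out.append(line[1:])  # context and added lines exist in the new file
            previous_is_new_side = True
        else:
            previous_is_new_side = False  # "-" lines exist only in the old file
    return out
```

A reader that ignored the marker would silently add a trailing newline to a file that did not have one, which is presumably the problem the article's sentence is hinting at.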

be careful with colors

The Usage section contains a nice, colorized example of the normal diff output, including the words "the following normal diff output", but this is misleading in that normal diff output does of course not include those helpful colors! Need to figure out some nonintrusive way of mentioning that the colors have been added for illustrative purposes but are not a feature of normal diff output. (Though perhaps this is an inspiration for yet another feature...) —Steve Summit (talk) 16:45, 21 October 2013 (UTC)
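To underline that the colors are an add-on layered over the plain text, here is a small Python sketch that colorizes normal-format diff output after the fact with ANSI escape codes. The function name and the choice of red and green are assumptions made for this illustration, and nothing here is a feature of diff itself.

```python
RED = "\033[31m"
GREEN = "\033[32m"
RESET = "\033[0m"


def colorize_normal_diff(text):
    """Wrap '<' lines in red and '>' lines in green; leave the rest alone."""
    out = []
    for line in text.splitlines():
        if line.startswith("<"):
            out.append(RED + line + RESET)    # line from the first file
        elif line.startswith(">"):
            out.append(GREEN + line + RESET)  # line from the second file
        else:
            out.append(line)                  # change commands and "---"
    return "\n".join(out)


print(colorize_normal_diff("2c2\n< old text\n---\n> new text"))
```

The input to the function is exactly what the uncolored output looks like, which may be a useful way to phrase the caveat in the Usage section.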