Wikipedia talk:WikiProject Computer science

From Wikipedia, the free encyclopedia
Jump to: navigation, search
WikiProject Computer science (Rated Project-class)
WikiProject icon This page is within the scope of WikiProject Computer science, a collaborative effort to improve the coverage of Computer science related articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
 Project  This page does not require a rating on the project's quality scale.
 


Pseudocode Use[edit]

I am of the opinion that pseudocode examples are superior to actual-language code when it comes to demonstrating how an algorithm works for an encyclopedia.

  • Hopefully, the pseudocode is written so that people familiar with a wider range of languages are able to understand it.
  • Pseudocode avoids it being necessary to know one particular language.
  • Pseudocode is free from the quirks of individual languages. For example, a well-written program in C would free all of the memory that it allocates, which is an issue that is not important for demonstrating how an algorithm works. Also, C, for example, lets people say things like if (!thing) to test if thing is null, but for someone unfamiliar with C, this may be confusing. If thing is a FILE, for instance, it makes no sense to take its logical inversion.
  • Using pseudocode means that there does not need to be multiple implementations in numerous languages in an article. Having multiple implementations of the same thing does not add anything to the wiki.

I found some time ago that the page on binary search trees uses many code examples in C++ and Python. This is great for anyone who knows C++ and Python, but if someone unfamiliar with either language wanted to know how to, say, insert an item into a BST, it might be difficult for them to find that information on Wikipedia. To facilitate the transfer of knowledge, I suggest that there be an emphasis on the use of pseudocode rather than actual code.

I have rewritten the examples from the BST page in some form of pseudocode at User:Hwalter42/draft article on binary search tree. I have not at all tried integrating the result with the prose in the article, and do not want to replace the current BST article with this one. But I do want to feedback on the quality/style of the pseudocode, and, of course, its correctness. (I am an enthusiast, not an expert. Also, I am convinced that my delete function is wrong, but my brain is too turned off right now to figure out how to fix it.) More importantly, I also want to know if the use of pseudocode everywhere is something that people support or if the idea should be dropped now. Perhaps there is some substantial upside to using real languages that I am missing?

hwalter42 (talk) 22:40, 20 February 2015 (UTC)

I support this kind of thing in general. I'd suggest not using := vs. =, though; maybe := vs. == to help avoid ambiguity. ErikHaugen (talk | contribs) 23:08, 20 February 2015 (UTC)
The consensus has always been that pseudocode is preferred in articles on algorithms (see MOS:ALGO). Actual code is really only necessary in articles on specific programming languages or articles that discuss programming languages contructs and a very precise semantics is required. —Ruud 23:24, 20 February 2015 (UTC)
I agree pseudocode benefits articles by removing multiple implementations. However, there is no way of verifying psuedocode implementations are correct, either through unit testing or more formal methods (admittedly these standards aren't imposed on implementations in programming languages). We could use only pseudocode published in well-known papers, but those are notorious for technical errors - and what is pseudocode but a technical expression of a concept? Still, pseudocode copied from a published & cited paper is much, much more reliable and traceable than some implementation by a drive-by editor. Known errors with the pseudocode are often mentioned in subsequent papers, and so can be fixed. If it's just you implementing an algorithm in pseudocode rather than a programming language though, that seems worse than having a programming language implementation. Andrew Helwer (talk) 23:36, 20 February 2015 (UTC)
"seems worse"–It seems better to me, especially if the reader isn't familiar with the syntax of the language. I'm not sure I am understanding your point – most of Wikipedia's content is written by "drive-by editor[s]", and we seem to manage; why are code samples any different? Can you think of an example that would illustrate this concern? Also I'm not sure we can copy pseudocode from papers, most of the time, due to legal issues. ErikHaugen (talk | contribs) 00:04, 21 February 2015 (UTC)
Let's look at a notoriously complicated algorithm: Boyer-Moore string search. I have a bunch of unit tests for the article's Python implementation here, and still some bugs get through. A pseudocode implementation wouldn't have the tiniest chance of being bug-free unless it were copied from a published source (and even then, no guarantees) I don't know anything about the legal issues of copying pseudocode. I'm not sure I understand your second question. What do you mean by "this concern"? Andrew Helwer (talk) 00:41, 21 February 2015 (UTC)
If I may shout down from my ivory tower: pseudocode is perfectly amenable to a formal correctness proof—even much more so than a real implemenation—and such a presentation (pseudocode combined with a correctness proof) is the going standard in the scholarly literature. Assuming the bug in your code was not due to a basic failure in your understanding of the algorithm, it was likely caused (or managed to escape your attention) by the increase of complexity in the real implemenation as compared to pseudocode: pseudocode has a much greater chance of being correct and correctly understood than a real implemenation.
There's also the more pragmatic advantage that an n line psuedocode implementation will be read be a greater number of people than a 5n line implementation in programming language X, which would make it more likely that any mistakes get spotted and corrected.
(Your anecdote of course also nicely demonstrates why unit testing is fairly useless when it comes to demonstrating the correctness of your core algorithms and data structures: a limited number of hand-written testcases will almost surely miss some of the cornercases. With a formal proof, or at least some form of property-based testing, you are much more likely to cover those.) —Ruud 02:14, 21 February 2015 (UTC)
To be clear, I definitely don't believe unit tests are sufficient to ensure correctness. When you say formal correctness proof, do you mean the formal computer-checked way or the proofs-should-compel-belief non-formal way you see in papers? Because I've definitely read papers which have a proof of correctness but where the pseudocode is incorrect as exposed by a basic test. Example, a very useful and insightful algorithm, proved correct, but pseudocode that is just dead wrong. I feel I'm derailing this thread though. I concede you can get pseudocode correct enough that anyone writing code based on it will find & fix the bugs themselves. Plus, it's just plain easier to read than actual code. Andrew Helwer (talk) 03:02, 21 February 2015 (UTC)
Unit tests may not be sufficient, but they certainly help avoid many mistakes. I also believe that pseudocode is generally better than code for our purposes, but algorithms such as the ones discussed above that are too complex to have much hope of implementing correctly without testing may be an exception. —David Eppstein (talk) 03:26, 21 February 2015 (UTC)
Thank you for the feedback. I'm glad that the idea is supported, but I don't believe that I am the one who should be writing any pseudocode for most anything, as I don't see myself as qualified (again, I am not confident in the code I wrote for even BST's, which should be fairly simple). As sort of a side note, to address the :=, =, and == thing, I understand your point and am willing to change my own style; however, it shows an issue as to what style of pseudocode should be used. Preferably, it would be consistent across pages, but there is plenty of room for personal style considering as this is, by definition, not a strict language. I saw at the link given above that standardization has already been proposed but abandoned though. hwalter42 (talk) 03:40, 21 February 2015 (UTC)
Both forms, although informal proofs will of course make up the vast majority of correctness arguments you'll find. Nonetheless, they are often sufficently detailed that you should in principle be able to extract some more formal, or even computer-checkable proof, from them in for example Hoare logic. Any steps that require insight, such as comming up with loop invariants, should be derivable from the more informal proof.
Note that I'm not advocating including any formal correntess proofs in our articles, although I do think we can do a bit better on including informal/intuitive correctness arguments in our articles. Such arguments also help readers understand why a particular algorithm works, as merely opposed to how by giving the code. —Ruud 11:07, 21 February 2015 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── Pseudocode (in whatever flavor) is just another programming language – without compiler. Well-written C code should be the norm since it has been around since 1843 and likely will be around 2043. Besides, it is punishable by federal law in large parts of the world not to know it/being in the process of learning it for a student of computer science. As has been pointed out, pseudocode is an extra invitation to bugs. (There are enough of those in code that compiles.) Besides, not everybody knows pseudocode (whatever that is, definition please). YohanN7 (talk) 13:53, 7 April 2015 (UTC)

It is true that pseudocode lacks a compiler or any formal semantics. However, its expressive power does not suffer for it; English is not a formal language, after all. It is difficult to say whether there would be greater or fewer bugs in a C program compared to its pseudocode equivalent. For C we can check that it compiles and passes static analysis and unit tests, but bug-free C code is nearly unachievable (unless you're DJB or something). Let us consider: who is the audience for code in Wikipedia articles? I submit it is people who are trying to learn an algorithm. In my experience as a person trying to learn an algorithm, translating pseudocode to working implementation is where all the magic happens. Pseudocode correctness is just a nice-to-have, which could be accomplished through a high-level specification language such as PlusCal. Achieving consensus on use of a high-level specification language sounds like a lot of work, but if anyone is interested please start another topic and I will support it. Andrew Helwer (talk) 20:54, 7 April 2015 (UTC)
It really shouldn't matter what language an algorithm is written in unless it is being hidden say in object code nor provided. C and C++ are commonly known. C is close to a C++ subset. A few things are different. I have programed in many languages. Some are vary different. But on the other hand most algorithms are procedural.
I think we should try and be consistent what ever code is used.
Just for the record pseudo code has been used to describe directives and the inclusion of macros in some programing languages and assemblers. The "org" assembly directive is a pseudo instruction to the assembmer. Old terminology I know. But there are historical topics here.
I used the term pseudocode to describe instructions of a pseudo machine. A made up computer instruction set. It is part of a metacompiler. The compiler translates the source language into pseudocode, pseudo instructions. The pseudo instructions might be defined to best implement the language being compiled or generalized for translation to a wide range of real inctructions. pseudo instructions were much like an assembly macro that procedurally generated real machine code.
Things change. What I once called pseudo code is now something quite different. Another 10 years who knows. I just recently started editing. It is hard to explain, in writing, some concepts. I think linking terms to their topic is sometimes the best way to explain a subjects. The needed knowledge is found following those links. To bad we can not easly create flow charts for algorithms.
Oh wait we shouldn't ignore thar there is an international standard language for describing algorithms. It's called ALGOL. ALGOL has a publication form for communicating algorithms.
Steamerandy (talk) 07:57, 8 May 2015 (UTC)

Quote_notation: overly optimistic[edit]

The article on "Quote_notation", although sort of interesting, is overly optimistic on the usefulness of this notation for general computation with fractions. There is an obvious problem that the length of a quote notated fraction is linear in its denominator, often even close to it.

This optimism is already there in the original article.

For example, the suggestion is that subtraction of two quote notated number is "just subtract". Here a bad counterexample: To subtract 1/19 from 1/17 (giving 2/323), you compute 2941176470588235'3 - 894736842105263159'9, and after subtraction you get a number with a repeating part of 144 digits, ending in ...4334365325077'4

This is not easy by any stretch of the imagination. However, the notation is still an interesting thought experiment, so I would suggest not to remove it, but just make it a bit more realistic.

GA status for technical articles[edit]

What would a highly technical CS or mathematics article with Good Article status look like? Are there any examples to work off of?

Second question: Does the "useful to nearly all readers" requirement bar some potentially great technical articles from GA status? Or should the articles just be less technical, and leave the tricky details to specialty wikis? — Preceding unsigned comment added by Andrew Helwer (talkcontribs) 04:59, 24 May 2015 (UTC)

It's more difficult (especially to get readers willing to do the reviews) but iit does happen. For instance, Euclidean algorithm and Problem of Apollonius are featured articles, and Pseudoforest and Shapley–Folkman lemma are good articles. (These are more mathematical than computer science because that's the part of Wikipedia I'm more familiar with and can recall examples in, not because of any important differences that I know about in how CS vs math articles are treated.) —David Eppstein (talk) 06:37, 24 May 2015 (UTC)

Proposed deletion of Consultation (object-oriented programming)[edit]

Ambox warning yellow.svg

The article Consultation (object-oriented programming) has been proposed for deletion because of the following concern:

No reliable source cited for this decade-old article which appears to be a fringe POV fork

While all constructive contributions to Wikipedia are appreciated, content or articles may be deleted for any of several reasons.

You may prevent the proposed deletion by removing the {{proposed deletion/dated}} notice, but please explain why in your edit summary or on the article's talk page.

Please consider improving the article to address the issues raised. Removing {{proposed deletion/dated}} will stop the proposed deletion process, but other deletion processes exist. In particular, the speedy deletion process can result in deletion without discussion, and articles for deletion allows discussion to reach consensus for deletion. greenrd (talk) 10:56, 19 June 2015 (UTC)

Request for Comment: Split Virtual machine into separate pages for systems and process?[edit]

Interested editors are invited to comment at: Talk:Virtual machine: Split into separate pages for systems and process? on whether to split the Virtual machine page into separate pages for Systems virtual machine and Process virtual machine.

—Nils von Barth (nbarth) (talk) 17:51, 28 June 2015 (UTC)

Possibly incorrect diagram[edit]

Hello,

Browsing through https://en.wikipedia.org/wiki/Wavelet_Tree it looks like the diagram in the top right is incorrect. If I am not mistaken, the "C" needs to be where "D" is, "D" needs to be where "R" is, and "R" needs to be where "C" is.

Also, there doesn't seem to be a way to directly talk about issues on a page. As just someone who is noting that it may not be correct, I have no idea if this is the right way to go about this. — Preceding unsigned comment added by 141.219.235.253 (talk) 16:05, 1 July 2015 (UTC)

You can click the "new section" button when viewing the article's talk page to raise an issue. But for obscure articles, posting to this WikiProject is more likely to get you are timely response.
I didn't hear of wavelet trees before today, and didn't read the cited references, so I'm speculating a bit here: you suggest the the C, D and R nodes need to be rotated to make the tree more "balanced"? But this would not affect the shape of this particular tree depicted in the image?
Also, the image seems to have been created by Giuseppe Ottaviano and was copied almost varbatim from the one in his PODS paper on wavelet trees tries, which makes it rather unlikely the image is incorrect. —Ruud 16:51, 1 July 2015 (UTC)

Review of Heap's algorithm[edit]

I recently came across Heap's algorithm and discovered that our pseudocode did not correspond to Heap's description. The algorithm should use exactly one swap between each permutation. Our version has of course spread out to blogs and other places.

The error in our previous pseudocode was that it was performing a swap in the last iteration of the for loop, and you see if you expand the recursive calls, it leads to multiple swap calls in sequence.

I corrected the article (verified the algorithm by implementing it separately). A contributor had made a pretty illustration of the algorithm, however it needs to be updated now for the corrected version. The illustrator has requested we verify the algorithm before they attempt to illustrate it again. If you feel you can offer a second review of this, please help! Thanks sverdrup (talk) 15:18, 7 July 2015 (UTC)

AfC submission[edit]

Could anyone assess Draft:Ring Learning with Errors? Thank you, FoCuSandLeArN (talk) 17:56, 20 July 2015 (UTC)

NURBS - Knots v Control Points[edit]

See Comparison of Knots and Control Points — an expert review seems needed.

Hobart (talk) 19:42, 21 July 2015 (UTC)

Requested edit[edit]

Dear AI experts: I left a requested edit message on the Talk:Brendan Frey page, but then realized that I was the only one watching that page, so the request would likely not be seen. I have a conflict of interest, so I prefer not to edit the article myself. Frey is involved in deep learning and computational biology. —Anne Delong (talk) 19:10, 22 July 2015 (UTC)