Wikipedia talk:WikiProject Computer science

WikiProject Computer science (Rated Project-class)
This page is within the scope of WikiProject Computer science, a collaborative effort to improve the coverage of Computer science related articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
This page does not require a rating on the project's quality scale.

GitHub and its reign of terror over External Links sections

Probably the most common form of vandalism I see in CS articles is people adding links to their toy GitHub project where they implement the algorithm/concept/data structure in question. I've been removing them on sight. It's only a matter of time before someone gets defensive about that, so should we be proactive and codify removal of GitHub links into policy? Some reasons:

  • The repo owners are almost always the ones adding the links, which falls under WP:PROMOTION
  • Almost all examples are unmaintained/abandoned and don't have a proper license
  • A bunch of obtuse, unedited, and buggy source code contributes very little to the informative power of an article
  • If the code really does contribute to the article, it can be added to the article itself (I contributed some Python code here)

I'm still on the fence about encyclopedic source code in general, actually. I've had people point out bugs in that Python code over the years though, so I guess it's useful to some people. Anyway, thoughts about GitHub (and source code in general)? Andrew Helwer (talk) 01:12, 17 December 2014 (UTC)

Yes, in general a pseudocode* implementation in the article is sufficient and external links to real implementations don't add much value. However, I don't think this is specific to code hosted on GitHub and I think having a link to an "industrial strength" implementation (Stony Brook Algorithms Repository, Boost, LEDA, part of some other well-known and well-maintained library) is fine, and I wouldn't want to exclude those if they would ever happen to get hosted at GitHub.
* I don't think having a Python, C and Java implementation in the Boyer–Moore article is all that useful either. A single easy-to-read pseudocode implementation would suffice. The status quo has been to move the actual implementations to the Algorithms WikiBook or Rosetta Code. I'd conjecture that the vast majority of actual code on Wikipedia is incorrect, often because well-meaning editors make small changes over time without bothering to check if the code still actually runs. Pseudocode has the advantage that it can often be sourced to a research paper, is easier to check for correctness due to its high-level nature, and is immune to a large class of subtle bugs actual code can suffer from. Ruud 14:33, 17 December 2014 (UTC)
Having implemented many Wikipedia pseudocode examples for Rosetta Code, I find they can sometimes leave out steps that are "obvious" (to the writer), but they are generally good. --Paddy (talk) 06:43, 31 December 2014 (UTC)

I agree with most or all of the above: multiple implementations within an article in different languages aren't helpful and pseudocode is generally a better choice than a specific programming language; external links to personal projects on github or wherever else are generally not very helpful and should probably not be included (per WP:ELNO); industrial-strength projects should be linked, regardless of whether they are hosted on github. —David Eppstein (talk) 23:58, 17 December 2014 (UTC)

Just like Wikipedia, Rosetta Code examples are reviewed and maintained by its community. To a programmer, having source in their language can be a great aid to understanding a topic. Links to Rosetta Code are useful. --Paddy (talk) 06:43, 31 December 2014 (UTC)
  • Generally support the ideas and the observations above. Generally, when I see someone add a github link, I follow the link. If the project is recent, then I revert the added link on the grounds that the addition probably was advertising and the project hasn't been around long enough to have any significant review. Glrx (talk) 23:32, 20 December 2014 (UTC)
I would think we don't want to link to Boost even for some general algorithm, unless the subject of the article was about Boost specifically in some way or at least something that was known/popular because of Boost. I can't think of an example of the latter. Please remove these links if you see them: per WP:ELNO I think we want pseudocode at least in the article so we don't need a link to an example impl. ErikHaugen (talk | contribs) 23:51, 20 December 2014 (UTC)

Okay, so how about the following? An external link to an implementation is allowed if one of the following is satisfied:

  • The implementation itself satisfies WP:IMPORTANCE (it could arguably have its own page where it is WP:ELOFFICIAL, but no editor has yet created it)
  • The implementation is the product of a WP:IMPORTANCE organization or project (ex: projects under Apache or Boost)
  • The implementation is written by a WP:IMPORTANCE individual or person involved in a foundational paper on the subject

On the secondary topic of encyclopedic source code, it is to be discouraged in favor of pseudocode. I'm also trying to think of a different area of Wikipedia which might have dealt with similar issues - maybe fan covers or remixes of famous songs in the age of YouTube and SoundCloud? Andrew Helwer (talk) 19:00, 23 December 2014 (UTC)

Proposed deletion of Helper class

The article Helper class has been proposed for deletion because of the following concern:

"dictionary definition" which is not what Wikipedia is for (see WP:NOT) - but also no-one agrees on the definition of a helper class and there are no reliable sources!

While all constructive contributions to Wikipedia are appreciated, content or articles may be deleted for any of several reasons.

You may prevent the proposed deletion by removing the {{proposed deletion/dated}} notice, but please explain why in your edit summary or on the article's talk page.

Please consider improving the article to address the issues raised. Removing {{proposed deletion/dated}} will stop the proposed deletion process, but other deletion processes exist. In particular, the speedy deletion process can result in deletion without discussion, and articles for deletion allows discussion to reach consensus for deletion. greenrd (talk) 22:57, 3 January 2015 (UTC)

Question about OWL Article

How does one bring an article to the attention of a project? I noticed that this article, Web Ontology Language, isn't assigned to any projects. I think it would make sense for this project. Actually, another thing I was wondering about: is there any guideline as to when something is relevant to the Computer Science project vs. the Computing project? Seems like a lot of overlap. Finally, back to the OWL article: someone has slapped a lot of "non primary source" tags on it. That seems wrong to me. I'm going to look for other sources anyway to address that issue (my philosophy is you can never have too many good refs), but in general (see my comment on the OWL talk page) it seems to me that if I say "Version X of Fact++ uses the OWL standard", then quoting the manual or spec for Version X of Fact++ is a perfectly fine way to reference it. --MadScientistX11 (talk) 14:53, 9 January 2015 (UTC)

I added the Computing banner to the top of the article's talk page, which adds it to the Computing WikiProject. I chose Computing rather than Computer Science, as OWL seems more of an application than a theoretical computer science concept. The closely related article Ontology language also carries the Computing banner, so it is probably a good choice. Yes, I agree that the non-primary tag bombing in the lead is a bit much. While we prefer secondary sources, authoritative primary sources can be OK for verifying basic, uncontroversial facts, such as whether OWL2 is used in Pellet; ref 8 is pretty clear about OWL2 in Pellet. But I'm no expert--I could be missing some controversy about OWL2 inclusion. --Mark viking (talk) 00:13, 10 January 2015 (UTC)
Thanks for doing that. I added several references to books such as Programming the Semantic Web that are secondary sources and talk about reasoners such as Fact++ and Pellet so I felt it was justified to remove the tags. I left the primary source documents as well though, IMO in this case someone coming to that article should be directed toward them as good references and also as documents that would be a logical place to go to get more info. --MadScientistX11 (talk) 17:36, 14 January 2015 (UTC)

WikiProject X is live!

Hello everyone!

You may have received a message from me earlier asking you to comment on my WikiProject X proposal. The good news is that WikiProject X is now live! In our first phase, we are focusing on research. At this time, we are looking for people to share their experiences with WikiProjects: good, bad, or neutral. We are also looking for WikiProjects that may be interested in trying out new tools and layouts that will make participating easier and projects easier to maintain. If you or your WikiProject are interested, check us out! Note that this is an opt-in program; no WikiProject will be required to change anything against its wishes. Please let me know if you have any questions. Thank you!

Note: To receive additional notifications about WikiProject X on this talk page, please add this page to Wikipedia:WikiProject X/Newsletter. Otherwise, this will be the last notification sent about WikiProject X.

Harej (talk) 16:57, 14 January 2015 (UTC)

Request to check edit in Differential evolution

I added a pseudocode example to Differential evolution. I would appreciate it if anyone could check whether the edit looks good and follows Wikipedia's guidelines. - Esa-petri (talk) 19:13, 18 January 2015 (UTC)

Invitation to Participate in a WikiProject Study

Hello Wikipedians,

We’d like to invite you to participate in a study that aims to explore how WikiProject members coordinate activities of distributed group members to complete project goals. We are specifically seeking to talk to people who have been active in at least one WikiProject in their time in Wikipedia. Compensation will be provided to each participant in the form of a $10 Amazon gift card.

The purpose of this study is to better understand the coordination practices of Wikipedians active within WikiProjects, and to explore the potential for tool-mediated coordination to improve those practices. Interviews will be semi-structured, and should last between 45 and 60 minutes. If you decide to participate, we will schedule an appointment for the online chat session. During the appointment you will be asked some basic questions about your experience interacting in WikiProjects, how that process has worked for you in the past, and what ideas you might have to improve it in the future.

You must be over 18 years old, speak English, and you must currently be or have been at one time an active member of a WikiProject. The interview can be conducted over an audio chatting channel such as Skype or Google Hangouts, or via an instant messaging client. If you have questions about the research or are interested in participating, please contact Michael Gilbert at (206) 354-3741 or by email at

We cannot guarantee the confidentiality of information sent by email.

The link to the relevant research page is m:Research:Means_and_methods_of_coordination_in_WikiProjects

Ryzhou (talk) 03:46, 28 January 2015 (UTC)

Too Many Experts Spoil the Wiki

I have virtually given up on Wikipedia as a useful source of information with respect to math or science, and especially computer science. There was a time in the past when I would refer my children to WP for more information regarding physics, computing topics, etc. And in the past, they were able to learn something. Not so anymore. Recently (over the past few years), articles have been rewritten by so-called subject matter experts, seemingly without regard to the audience.

The vast majority of readers who want to learn about networks or, for example, graph theory have no formal education or background in the field. It would be nice if the hyper-technical terms were kept to a minimum, and examples were geared less toward the scientist and more toward an average reader who just wants to get a feel for the subject matter.

I am an experienced computer scientist, and I find the discussions regarding almost every single topic, whether it be number theory or architecture, confusing and frustrating to read. This should not be the case. I fear that most would-be contributors of late would rather see themselves appear "smart" on the page, rather than impart wisdom and accurate information. It's as though the "keep it simple" concept has been abandoned for the sake of ego.

There needs to be a movement from within WP to simplify ALL articles, and to ensure that readability and comprehensibility are enhanced for the average reader, which would probably be a 9th grade level reader (in the US). If this isn't done, and done soon, I fear that what once was good and useful will be lost forever.

Wikipedia is very good at biographical topics. That's pretty much all I use it for now. It should be more like an encyclopedia used to be... A place where anyone could go to learn something on just about any topic. To the extent that it fails at that goal, it will become increasingly irrelevant and unusable. Therefore, you would-be "expert" contributors need to ask yourselves if it's really important to cram in a "big word" where 2 or 3 smaller ones would suffice. Most of the tech articles now read about as well as a poorly developed college textbook. And that is in no way a Good Thing.

Good feedback. I also have trouble learning new algorithms from their wiki pages, but we must work within WP:NOTGUIDE where we inform rather than instruct. Many topics just cannot be succinctly explained without reference to established knowledge, which requires "hyper-technical" terms. Can you provide an example of where this becomes intellectual showboating? Can you give an example (or create one yourself) of a good computer science article? Andrew Helwer (talk) — Preceding undated comment added 21:01, 9 February 2015 (UTC)

While Wikipedia:About says it is a source for everyone...
  • Is this project the right scope for an overhaul of the entire Wikipedia?
  • What Flesch–Kincaid readability level are you proposing as the required level of reading comprehension?
  • How do you propose to track or identify the articles which are too high-level or specific?
  • Can a compromise be found by having other experts or teachers add summaries at the top without violating Wikipedia rules?

IamM1rv (talk) 17:16, 4 March 2015 (UTC)

Too few experts spoil the wiki (actually I just want to ask for an article on OptP to be created)

I actually came here to suggest an article on the OptP complexity class... but I couldn't help but notice the complaint above... and I want to complain about the opposite! I guess it depends who you ask. (talk) 14:37, 9 February 2015 (UTC)

Dealing with self-promotion

I am a French contributor and quite a beginner on Wikipedia.

The page seems biased to me. It has mostly been written by the software's creator, according to the page history.

I don't know how to deal with this and I don't have information about the subject, so I am not able to add content to the article. The search I made on the Internet was not fruitful.

This is why I pass the problem on... I would gladly receive information about what to do on those occasions. Eilean Liber (talk) 12:33, 20 February 2015 (UTC)

Good spot! You're right, this article is blatant promotion: written by the software's creator using marketing language. So, we have WP:COI and WP:PROMOTION. The proper course of action here is to set the WP:PRD tag on the article (which I have done). We should also look at reverting most of ThinkProductivity's contributions, since they are a WP:SPA. Andrew Helwer (talk) 21:38, 20 February 2015 (UTC)
Oh wow, this whole collaborative/project management software category is an enormous trash pit of spam. Luckily we have the List of collaborative software and Comparison of project management software pages to act as quarantine zones. I deleted a few spammy "competitor" sections from some borderline-spam articles. Andrew Helwer (talk) 21:54, 20 February 2015 (UTC)

Pseudocode Use

I am of the opinion that pseudocode examples are superior to actual-language code when it comes to demonstrating how an algorithm works for an encyclopedia.

  • Hopefully, the pseudocode is written so that people familiar with a wider range of languages are able to understand it.
  • Pseudocode avoids it being necessary to know one particular language.
  • Pseudocode is free from the quirks of individual languages. For example, a well-written program in C would free all of the memory that it allocates, which is an issue that is not important for demonstrating how an algorithm works. Also, C, for example, lets people say things like if (!thing) to test if thing is null, but for someone unfamiliar with C, this may be confusing. If thing is a FILE pointer, for instance, it makes no obvious sense to take its logical inversion.
  • Using pseudocode means that there does not need to be multiple implementations in numerous languages in an article. Having multiple implementations of the same thing does not add anything to the wiki.

I found some time ago that the page on binary search trees uses many code examples in C++ and Python. This is great for anyone who knows C++ and Python, but if someone unfamiliar with either language wanted to know how to, say, insert an item into a BST, it might be difficult for them to find that information on Wikipedia. To facilitate the transfer of knowledge, I suggest that there be an emphasis on the use of pseudocode rather than actual code.

I have rewritten the examples from the BST page in some form of pseudocode at User:Hwalter42/draft article on binary search tree. I have not at all tried integrating the result with the prose in the article, and do not want to replace the current BST article with this one. But I do want feedback on the quality/style of the pseudocode, and, of course, its correctness. (I am an enthusiast, not an expert. Also, I am convinced that my delete function is wrong, but my brain is too turned off right now to figure out how to fix it.) More importantly, I also want to know if the use of pseudocode everywhere is something that people support or if the idea should be dropped now. Perhaps there is some substantial upside to using real languages that I am missing?

hwalter42 (talk) 22:40, 20 February 2015 (UTC)
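
For comparison with the pseudocode-versus-real-code question raised above, here is a minimal sketch of what BST insertion might look like in a real language (Python). It is not the code from the BST article or from the draft; the names Node, insert and in_order are hypothetical, and duplicate keys are assumed to be ignored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical node type: one key plus optional left/right children.
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], key: int) -> Node:
    """Insert key into the tree rooted at root and return the root."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root
            node = node.left
        elif key > node.key:
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right
        else:
            # Assumption: duplicates leave the tree unchanged.
            return root


def in_order(root: Optional[Node]) -> List[int]:
    """Return the keys in ascending order (useful as a sanity check)."""
    if root is None:
        return []
    return in_order(root.left) + [root.key] + in_order(root.right)
```

Even in this small case, language details (dataclasses, type hints, None checks) sit alongside the algorithm itself, which is roughly the trade-off being discussed.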

I support this kind of thing in general. I'd suggest not using := vs. =, though; maybe := vs. == to help avoid ambiguity. ErikHaugen (talk | contribs) 23:08, 20 February 2015 (UTC)
The consensus has always been that pseudocode is preferred in articles on algorithms (see MOS:ALGO). Actual code is really only necessary in articles on specific programming languages or articles that discuss programming language constructs where very precise semantics are required. —Ruud 23:24, 20 February 2015 (UTC)
I agree pseudocode benefits articles by removing multiple implementations. However, there is no way of verifying that pseudocode implementations are correct, either through unit testing or more formal methods (admittedly these standards aren't imposed on implementations in programming languages). We could use only pseudocode published in well-known papers, but those are notorious for technical errors - and what is pseudocode but a technical expression of a concept? Still, pseudocode copied from a published & cited paper is much, much more reliable and traceable than some implementation by a drive-by editor. Known errors with the pseudocode are often mentioned in subsequent papers, and so can be fixed. If it's just you implementing an algorithm in pseudocode rather than a programming language though, that seems worse than having a programming language implementation. Andrew Helwer (talk) 23:36, 20 February 2015 (UTC)
"seems worse"–It seems better to me, especially if the reader isn't familiar with the syntax of the language. I'm not sure I am understanding your point – most of Wikipedia's content is written by "drive-by editor[s]", and we seem to manage; why are code samples any different? Can you think of an example that would illustrate this concern? Also I'm not sure we can copy pseudocode from papers, most of the time, due to legal issues. ErikHaugen (talk | contribs) 00:04, 21 February 2015 (UTC)
Let's look at a notoriously complicated algorithm: Boyer-Moore string search. I have a bunch of unit tests for the article's Python implementation here, and still some bugs get through. A pseudocode implementation wouldn't have the tiniest chance of being bug-free unless it were copied from a published source (and even then, no guarantees). I don't know anything about the legal issues of copying pseudocode. I'm not sure I understand your second question. What do you mean by "this concern"? Andrew Helwer (talk) 00:41, 21 February 2015 (UTC)
If I may shout down from my ivory tower: pseudocode is perfectly amenable to a formal correctness proof—even much more so than a real implementation—and such a presentation (pseudocode combined with a correctness proof) is the going standard in the scholarly literature. Assuming the bug in your code was not due to a basic failure in your understanding of the algorithm, it was likely caused (or managed to escape your attention) by the increase of complexity in the real implementation as compared to pseudocode: pseudocode has a much greater chance of being correct and correctly understood than a real implementation.
There's also the more pragmatic advantage that an n-line pseudocode implementation will be read by a greater number of people than a 5n-line implementation in programming language X, which would make it more likely that any mistakes get spotted and corrected.
(Your anecdote of course also nicely demonstrates why unit testing is fairly useless when it comes to demonstrating the correctness of your core algorithms and data structures: a limited number of hand-written test cases will almost surely miss some of the corner cases. With a formal proof, or at least some form of property-based testing, you are much more likely to cover those.) —Ruud 02:14, 21 February 2015 (UTC)
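
As an illustration of the property-based idea mentioned above, the following is a rough sketch that checks a string search against an obviously correct reference on many random inputs. It uses only the standard library rather than a dedicated property-testing tool, and it tests Boyer-Moore-Horspool (a simplified relative of Boyer-Moore), not the article's implementation; horspool_find, naive_find and check_against_reference are hypothetical names.

```python
import random


def naive_find(text, pattern):
    """Reference implementation: index of the first occurrence, or -1."""
    for i in range(len(text) - len(pattern) + 1):
        if text[i:i + len(pattern)] == pattern:
            return i
    return -1


def horspool_find(text, pattern):
    """Boyer-Moore-Horspool search: index of the first occurrence, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # Shift for each character: distance from its last occurrence
    # (ignoring the final pattern position) to the end of the pattern.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    pos = 0
    while pos + m <= n:
        if text[pos:pos + m] == pattern:
            return pos
        # Characters absent from the table allow a full-length shift.
        pos += shift.get(text[pos + m - 1], m)
    return -1


def check_against_reference(trials=2000, seed=0):
    """Return a failing (text, pattern) pair, or None if all trials agree."""
    rng = random.Random(seed)
    for _ in range(trials):
        # A two-letter alphabet makes repeats and near-matches common.
        text = "".join(rng.choice("ab") for _ in range(rng.randint(0, 12)))
        pattern = "".join(rng.choice("ab") for _ in range(rng.randint(0, 4)))
        if horspool_find(text, pattern) != naive_find(text, pattern):
            return (text, pattern)
    return None
```

The property being checked is simply "agrees with the naive search on every input"; agreement on random inputs raises confidence but, unlike a proof, does not establish correctness.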
To be clear, I definitely don't believe unit tests are sufficient to ensure correctness. When you say formal correctness proof, do you mean the formal computer-checked way or the proofs-should-compel-belief non-formal way you see in papers? Because I've definitely read papers which have a proof of correctness but where the pseudocode is incorrect as exposed by a basic test. Example, a very useful and insightful algorithm, proved correct, but pseudocode that is just dead wrong. I feel I'm derailing this thread though. I concede you can get pseudocode correct enough that anyone writing code based on it will find & fix the bugs themselves. Plus, it's just plain easier to read than actual code. Andrew Helwer (talk) 03:02, 21 February 2015 (UTC)
Unit tests may not be sufficient, but they certainly help avoid many mistakes. I also believe that pseudocode is generally better than code for our purposes, but algorithms such as the ones discussed above that are too complex to have much hope of implementing correctly without testing may be an exception. —David Eppstein (talk) 03:26, 21 February 2015 (UTC)
Thank you for the feedback. I'm glad that the idea is supported, but I don't believe that I am the one who should be writing pseudocode for most anything, as I don't see myself as qualified (again, I am not confident in the code I wrote for even BSTs, which should be fairly simple). As sort of a side note, to address the :=, =, and == thing, I understand your point and am willing to change my own style; however, it raises the issue of what style of pseudocode should be used. Preferably, it would be consistent across pages, but there is plenty of room for personal style considering that this is, by definition, not a strict language. I saw at the link given above that standardization has already been proposed but abandoned though. hwalter42 (talk) 03:40, 21 February 2015 (UTC)
Both forms, although informal proofs will of course make up the vast majority of correctness arguments you'll find. Nonetheless, they are often sufficiently detailed that you should in principle be able to extract some more formal, or even computer-checkable, proof from them in, for example, Hoare logic. Any steps that require insight, such as coming up with loop invariants, should be derivable from the more informal proof.
Note that I'm not advocating including any formal correctness proofs in our articles, although I do think we can do a bit better on including informal/intuitive correctness arguments in our articles. Such arguments also help readers understand why a particular algorithm works, as opposed to merely how, which is all that giving the code conveys. —Ruud 11:07, 21 February 2015 (UTC)
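
To make the loop-invariant point concrete, here is a small sketch (binary search, chosen only as a familiar example and not taken from any article under discussion) in which the invariant of an informal correctness argument is written down and checked at run time. The function name and the use of assert are illustrative assumptions; the checks are linear-time and would be removed in real code.

```python
def binary_search(items, target):
    """Return the index of the first occurrence of target in sorted items, or -1."""
    lo, hi = 0, len(items)
    while lo < hi:
        # Invariant: everything before lo is < target,
        # and everything from hi onward is >= target.
        assert all(x < target for x in items[:lo])
        assert all(x >= target for x in items[hi:])
        mid = (lo + hi) // 2
        if items[mid] < target:
            lo = mid + 1  # items[:mid + 1] are all < target
        else:
            hi = mid      # items[mid:] are all >= target
    # Here lo == hi, so by the invariant lo is the first index
    # whose element is >= target (if any such index exists).
    if lo < len(items) and items[lo] == target:
        return lo
    return -1
```

The comments carry the "why": each branch preserves the invariant, the interval shrinks on every iteration, and the invariant at exit pins down the answer.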

Pseudocode copyright concerns

Branching discussion from above to the legal issue of using pseudocode from papers and textbooks. Anyone have ideas on this? I've been using pseudocode from this paper on the CRDT article. Andrew Helwer (talk) 04:17, 21 February 2015 (UTC)

Remember that it's not the algorithm that can be copyrighted (patented maybe, but that's a different discussion), but only its expression. Various syntactic modifications you make to the pseudocode in order to integrate it with the article are thus likely to invalidate any copyrights. If this is not possible for some reason, the code is critical to the article, and you explicitly attribute the code to the paper, then this would likely be fair use. —Ruud 10:46, 21 February 2015 (UTC)

Is there a page here for PARTITION[ING] INTO TRIANGLES?

It's one of the classic NP-complete problems. (talk) 07:09, 4 March 2015 (UTC)

3-dimensional matching, maybe? —David Eppstein (talk) 07:46, 4 March 2015 (UTC)
Two NP-complete possibilities I know of: graph partitioning of suitable graphs into triangles is NP-complete, e.g. [1], and minimum-weight triangulations. --Mark viking (talk) 11:24, 4 March 2015 (UTC)

Request for feedback on Talk:Zachman Framework#Lead sentence

The lead sentence of Zachman Framework has recently been changed from

The Zachman Framework is an enterprise architecture framework...

to

The Zachman Framework is an enterprise ontology...

This change has been questioned at Talk:Zachman Framework#Lead sentence. I would be grateful if any of you could take a look and comment on this topic. -- Mdd (talk) 15:39, 13 March 2015 (UTC)

Done: resolved by Kku, thank you. -- Mdd (talk) 12:03, 25 March 2015 (UTC)

Invention of BASIC

There is a discussion concerning who developed the BASIC programming language at Talk:BASIC#Sister Keller. --Guy Macon (talk) 19:31, 14 March 2015 (UTC)

Soft goal - cite

"Non-functional requirements (or quality attributes, qualities, or more colloquially "-ilities") are global qualities of a software system, such as flexibility, maintainability, usability, and so forth. Such requirements are usually stated only informally; and they are often controversial (i.e. management wants a secure system but staff desires user-friendliness). They are also often difficult to validate." is not cited! — Preceding unsigned comment added by (talk) 12:35, 24 March 2015 (UTC)