Talk:Analysis of algorithms

From Wikipedia, the free encyclopedia
Jump to: navigation, search
WikiProject Computer science (Rated C-class, Top-importance)
WikiProject icon This article is within the scope of WikiProject Computer science, a collaborative effort to improve the coverage of Computer science related articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
C-Class article C  This article has been rated as C-Class on the project's quality scale.
 Top  This article has been rated as Top-importance on the project's importance scale.
 

Initial discussion[edit]

Why do you give the behaviour of binary sort as 1 + log2 N? Surely this is an implementation issue, and depends on exactly what you define as a step? Sjn28


Binary search, you mean. It does depend on what one defines as a step. Usually we want a step to be something which can be performed in time that is bounded above by a constant which depends only on the implementation. In this case each lookup at a particular position in the array counts as a step, and we assume that any such lookup can be done in a guaranteed amount of time. But if we were able to look at many entries in a single step, that would change the efficiency. Seb


Yes, I meant binary search. Sorry, slip of the brain. My point was that I don't think there's much point in saying any more than that the search is O(log N). Specifying something more specific like "3ceil(log_2(n)) + 2" or "2 log_2(n) - 1" is not terribly useful, because (a) you haven't defined what a step is; (b) it depends on exactly which datum you're looking for (you might be lucky); (c) other (unspecified) details of the implementation; but most importantly (d) it doesn't really matter what the constants are. Sjn28


I've updated the page. Tell me if you think it's better. Seb


Thanks! It's much clearer now. Sjn28

Outer loop[edit]

Is it correct that the outer loop executes (n + 1) times as in the statement 'The outer loop test in step 4 will execute ( n + 1 ) times'. I cannot see how this is the case as the loop appears to iterate from 1 to n, and if n were 1, the loop would execute only once. Bjwyse (talk) 16:57, 1 October 2008 (UTC)

Ok. Never mind. I see it includes the actual test for the loop condition so even if n were 0, the statement would execute once! Bjwyse (talk) 17:41, 1 October 2008 (UTC)

Footnote[edit]

What's the deal with the footnote "However, this is not the case with a quantum computer"? Aren't the commonly-understood models of quantum computers either non-deterministic Turing Machines (which are Turing Machines that (in deterministic time!) perform every possible state transition instead of just one state transition), parallel Von Neumann machines with an infinite number of processors, or black-box oracles that solve certain problems in constant time by magic? It doesn't seem familiar or meaningful to talk about some kind of computer that runs conventional algorithms, just not in deterministic time. 86.151.224.142 (talk) 09:02, 21 December 2008 (UTC)

removed sentence[edit]

I removed "Exact measures of efficiency are useful to the people who actually implement and use algorithms, because they are more precise and thus enable them to know how much time they can expect to spend in execution. To some people (e.g. game programmers), a hidden constant can make all the difference between success and failure." Of course implementers care about the exact time their program takes to run, but it would be hard to find any example where an exact complexity analysis is used as a tool. Almost all practical situations are much too complicated for that. McKay (talk) 00:14, 1 April 2009 (UTC)

Proposed merge of computation time[edit]

What's there to merge? A redirect seems more appropriate for now. The real beast of a WP:CFORK appears to be Algorithmic efficiency. Pcap ping 19:02, 22 August 2009 (UTC)

Right. I'll redirect. About Algorithmic efficiency: A beast indeed. A Balrog. --Robin (talk) 19:24, 22 August 2009 (UTC)
The last sentence in that article was not covered at computational complexity theory, and it should have been, so I copied it there. Pcap ping 20:17, 22 August 2009 (UTC)

What is [T1..T7].[edit]

This statement is unclear to me...

"A more elegant approach to analyzing this algorithm would be to declare that [T1..T7] are all equal to one unit of time greater than or equal to [T1..T7]."

Perhaps something more akin to the following is meant?...

A more elegant approach to analyzing this algorithm would be to declare each step executes in time k equal to one unit of time greater than or equal to Maximum of [T1..T7]. —Preceding unsigned comment added by 174.114.248.239 (talk) 22:05, 22 January 2011 (UTC)

I'll second this. What is trying to be said here? I'm not even sure that the comment above does much better. 69.66.249.35 (talk) 13:38, 13 April 2011 (UTC)

Why is this not the correct runtime for the for-loops[edit]

Outer will run n times. For each loop the inner will run 1,2,3,4...n times.

This means n*1 + n*2 + n*3 + n*4 times. Generally, n*(1+2+3+4+...+n) and we know the parentheses sum up to as Gauss said, sum of n numbers = n(n+1) / 2. Hence the running time is n*(n(n+1)/2 = n^3 + n^2 / 2 = O(n^3). — Preceding unsigned comment added by 129.11.106.156 (talkcontribs)

There is no outer multiplication by n; the outer loop's repetition has already been accounted for by virtue of the sum having 10 terms. The outer loop merely repeatedly adjusts the upper bound for the inner loop; it does not repeat each inner loop n times.
For it to be O(n3), the code would have to be:
for i = 1 to n
    for k = 1 to n # repeats n times
        for j = 1 to i # repeats i times
            print i * j
HTH, --Cybercobra (talk) 12:15, 8 March 2011 (UTC)

Ok thanks! Wouldn't be easier to argue in this way for O(n^2): The inner loop will run this many times, 1,2,3,...,n times. Which means 1+2+3+...+n = O(n) times.

Since it's wrapped in an outer loop that runs n times we get n * O(n) = O(n^2). —Preceding unsigned comment added by 129.11.249.250 (talk) 13:37, 8 March 2011 (UTC)

No, that's wrong. 1+2+3+...+n = (n*(n+1))/2 = (n2+n)/2 = O(n2), which is what the article already says. --Cybercobra (talk) 00:38, 9 March 2011 (UTC)

Maybe add[edit]

The discussion from Tarjan [1] on simplex vs. ellipsoid method asymptotic vs practical performance, which is a relevant example of this issue. Tijfo098 (talk) 16:54, 18 April 2011 (UTC)

What does "reasonable" in the preface part mean?[edit]

 However the efficiencies of any two "reasonable" implementations of a given algorithm are related by a constant multiplicative factor called a hidden constant.

This word's usage is confusing to non-professionals. Darth Ming (talk) 15:50, 1 May 2013 (UTC)

It is confusing to non-professionals and meaningless to professionals :-). Yet there is some purpose to the sentence. Consider for example an algorithm which is implemented by a good programmer in two different programming languages. The two programs won't run at exactly the same speed, but we can expect the speeds to be related (say, one of them is between 2 and 5 times as fast as the other for large problems). On the other hand, the programmer could add a lot of junk statements to the program to make it artificially run much slower — that would an example of an "unreasonable" implementation. McKay (talk) 07:52, 2 May 2013 (UTC)
I understand for professionals that word means nothing, but this is Wikipedia... I was just wondering how I can translate the sentence in Chinese without making it too lengthy. Thank you for the clarification anyway. :) Darth Ming (talk) 19:26, 2 May 2013 (UTC)

Section "Growth rate analysis of other resources"[edit]

Although it is definitely necessary to mention resources other than time, the section "Growth rate analysis of other resources" with its unrealistic pseudo-code seems to be essentially useless. McKay (talk) 02:15, 29 September 2015 (UTC)

lack of clarity in first sentence[edit]

In the first sentence, the phrase "the amount of resources (such as time and storage)" is not only problematic from a purely grammatical standpoint, its meaning is also unclear. Is the amount referred to measurable in some way? If we can perhaps specify this, then we can hopefully rephrase that sentence in a better way.

any help appreciated

--TyrS 11:26, 30 March 2017 (UTC)

p.s. are the resources mentioned above the same as System resources or Computational resources? --TyrS 11:38, 30 March 2017 (UTC)

The resources (time and storage) are mass nouns not count nouns, so "amount" is the correct word for them. It is grammatical; it means how much of each resource, not how many different resources (which is what your rephrasing, "number of resources") would mean. How do you think time might not be measurable?? Possibly it would be less confusing as "amounts of resources" but that would also be somewhat misleading as usually one analyzes only one type of resource at a time. —David Eppstein (talk) 15:07, 30 March 2017 (UTC)
What about "... of amount of time, storage, or other resources necessary ..."? This should at least eliminate the source of mass/count confusion. - Jochen Burghardt (talk) 19:09, 30 March 2017 (UTC)
Thanks for the suggestion Jochen. Yes, that would be better. Can we use an "and/or", as in "amount of time, storage and/or other necessary resources"? --TyrS 21:28, 30 March 2017 (UTC)

I'd also like to, if possible, make crystal clear that "them" (there are several potential referents in the sentence, and I think it might be possible to make the meaning clearer for non-computer scientists) at the end of the sentence ("In computer science, the analysis of algorithms is the determination of the amount of time, storage and/or other resources necessary to execute them.")

Would something a bit more like:

"...analysis of algorithms is a process whereby a measurement of the total resources necessary to execute a particular set of algorithms is determined"

work?

Also, is the "the" crucial to the expression? Do computer scientists say "an analysis of an algorithm"? Would you ever do an analysis of a single algorithm? ("Them" implies is that it's only multiple algorithms being analysed/costed at one time & I'd just like to confirm whether or not that's the intended meaning.) --TyrS 12:15, 31 March 2017 (UTC)

Isn't it obvious in the first sentence that "them" refers to "algorithms"? Nothing else mentioned before can be "executed". - Concerning "the": it does make sense to analyze a particular algorithm (as it does to multiply two particular numbers), but the article is about methods that work in general (as the multiplication method taught in school is applicable in general). Moreover, the article's title is given as a plural. - Finally, "analysis" is not "measurement". While the latter can only be performed for finitely many of all possible inputs, the former attempts to obtain a general formula how e.g. time consumption of a given algorithm depends on its given input. - All these issues should become clear by reading the article. If they don't, the text should be improved. - Jochen Burghardt (talk) 15:43, 1 April 2017 (UTC)
Thanks for the response Jochen. That "them" could possibly be trying to refer to "time, storage and/or other resources". I suspect that, with this minor wording issue, for people who are already very familiar with the subject matter, and who already know what the sentence is trying to get across, the lack of clarity is invisible. We need to write (at least our opening sentences & paragraphs) to be as understandable as possible to people who aren't necessarily familiar with the subject matter. Such people may not realize that computer science people don't (I assume) talk about "time, storage and/or other resources" being "executed". There's also the chance that 'analysis' could be being referred to, just with dodgy grammar (as "amount of ... resources" is, imo, poor grammar). Anyhow, even if it's known that it refers to algorithms, the sentence doesn't specify which algorithms. All algorithms? It's grammatically vague.
Would the following wording work instead?: "...analysis of algorithms is a process that determines the total resources (for example, time and/or storage) necessary to execute a given set of algorithms" --TyrS 17:21, 8 April 2017 (UTC)
I'd write "... a given algorithm". What about the "amount"? If you had read your above sentence in the article, wouldn't you have asked whether the analysis determines whether time is needed for one algorithm, while another one just needs storage? - Jochen Burghardt (talk) 07:06, 9 April 2017 (UTC)

You ask "If you had read your above sentence in the article, wouldn't you have asked whether the analysis determines whether time is needed for one algorithm, while another one just needs storage?" I don't think I would wonder about that (whether or not only one resource might be required by a particular algorithm) any more with this proposed/tweaked version of the sentence than with the previous version. Currently the unclarity that strikes me is around whether "them" refers to analyses, or to algorithms, or even (just looking at it from a grammatical perspective, though this one's pushing it a bit, but when there's already unclarity, it doesn't help) to time and/or storage.

Current: "In computer science, the analysis of algorithms is the determination of the amount of time, storage and/or other resources necessary to execute them."

Proposed: "In computer science, the analysis of algorithms is the determination of the amount of time, storage and/or other resources necessary to execute a given algorithm."

What do you think? --TyrS 18:23, 25 April 2017 (UTC)

Clarity in first paragraph[edit]

The binary search algorithm can lookup an entry in an ordered list of length n with at most log2n checks (blue), while the linear search algorithm (which ignores ordering) typically needs n/2 checks (red).

With the graph (top right) in the intro, is that an example of an analysis of an algorithm (or algorithms)? The caption currently is "Graphs of number of operations, N vs input size, n for common complexities, assuming a coefficient of 1", which doesn't explain (to non-experts) why it's there.

Also, how/why does the caption treat the graph as a plural? Is it another professional computer science/mathematician usage or a typo? Any help appreciated. --TyrS 02:53, 9 April 2017 (UTC)

Perhaps Wikipedia's content on the analysis of algorithms is not for you. You could try an algorithms textbook instead; they are typically more appropriate for people trying to learn a subject (per WP:NOTTEXTBOOK). The one I use for my classes is Algorithm Design by Goodrich and Tamassia. But I suspect you will need some other background material first, since you do not appear to understand that it is possible to combine the graphs of multiple functions into a single diagram. The caption could probably be worded more clearly, but the image compares the growth rates of different functions that frequently come up in the analysis of algorithms. It is important, in this subject, to know which of these functions grow more quickly, and which grow more slowly. —David Eppstein (talk) 04:27, 9 April 2017 (UTC)
Well David, I agree that the caption could and should be worded much more clearly. It sounds like you might be unaware of Wikipedia's guidelines on technical articles, i.e.WP:TECHNICAL and in particular on intro paragraphs of technical articles, WP:EXPLAINLEAD. Have a read of those & then get back to me.--TyrS 18:31, 25 April 2017 (UTC)
Concerning the image, it might be possible to illustrate binary search vs. linear search in a list, say, of names of all 45 US Presidents, ordered by name, vs. by time. This would explain the topic on an example which is familiar from everyday experience. (In earlier days, search in a phone directory was a common example, but it is obsolete today.) However, I haven't an idea how to depict binary search in an instructive way; @David Eppstein: maybe, you have? -
On second thought, the names shouldn't be familiar, such that there is a real need for linear search, which cannot be shortcut by historical knowledge.
The current growth rates image should be kept in any case. Its caption could name some functions, e.g. n and log n explicitly, and possibly even relate their results on an example input length, say 45; in this way, we could refer to the search picture when is realized. - Jochen Burghardt (talk) 06:49, 9 April 2017 (UTC)
There is an animation of binary search on the binary search article, but that's an illustration of an algorithm, not an illustration of its analysis. I'd point you to a picture of some analysis of quickselect on my blog (the analysis breaks down a two-index sum into pieces where it is easily summed and the picture shows the breakdown) but the blog is down while LiveJournal goes through its death throes and I doubt it would be an improvement over the current lead image for helping people who don't understand algorithm analysis to understand it. —David Eppstein (talk) 07:09, 9 April 2017 (UTC)
I had in mind an algorithm that virtually every reader has used some times (even if he never heard the word "algorithm"). The image would then just need to evoke his experience, its caption essentially saying that different methods for the same task can greatly differ in the amount of time effort, e.g. searching in an ordered list can be done fast, but if a wrong method (ignoring ordering) is used, it is much slower. Hopefully, we could also fit in "n checks" (or/and "n/2 checks average") and "log2n checks" somehow. Picture and caption would just be intended as a teaser, so the analyses of the speeds needn't be adressed there, they are explained in the article. - I think quickselect, and likewise Fibonacci computed with (e.g. File:Algo FcRec Exo Fibb 1 trace 5 faux.png) vs. without (File:FibbonacciRecurisive.png) caching which also came to my mind, are not sufficiently familiar to everybody for that approach. The binary search illustration is a good start, but it should be about some everyday list, not about a computer-programming array. I'll keep thinking about an image, and maybe come up with a suggestion in a few days. - Jochen Burghardt (talk) 12:28, 11 April 2017 (UTC)
Today, I made a first suggestion (see picture and caption). Please comment. (I just saw that chosing blue for binary search wasn't a good idea.) - Jochen Burghardt (talk) 12:41, 12 April 2017 (UTC)
Keeping in mind WP:TECHNICAL, and in particular WP:EXPLAINLEAD, how about a caption that, for non-experts, links the image (i.e. the article's the most prominent illustration of the subject) with, simply, the article's subject? There's clearly a tendency for discussion about this subject to get infinitely more convoluted, rather than less so. The article's first image should obviously show an analysis of an algorithm. I know it goes against the grain, but how about something like "graph showing the analysis of an algorithm [or of x number of algorithms, whichever the case may be]" (with possibly some particulars about the algorithm in question)? At the moment the caption just reads (again, obviously, to a non-expert, which is who we're supposed to be writing for, especially in our intro paragraphs) as not necessarily related to the subject. Of course more detail and complexity can still be included, just later on.--TyrS 18:41, 25 April 2017 (UTC)
The problem is that it is difficult to show an algorithm in an image, let alone its analysis. The latter is handcrafted activity often requiring a lot of creativity (comparable to a mathematical proof). As an example, section Binary_search_algorithm#Procedure shows the binary search algorithm, and Binary_search_algorithm#Performance shows its analysis. Although both sections probably could be improved, it should be obvious that it is difficult to shorten them in a way that they fit into an image and its caption. Nevertheless, I attempted to give an impression of the algorithm in my picture (by way of an example run), and to give (the result of) its analysis in the caption. Obviously this didn't become clear from my image and caption; probably the word "analysis" should appear there somewhere. What about the following caption?
«For looking up a given entry in a given ordered list, both the binary and the linear search algorithm (which ignores ordering) can be used. The analysis of the former and the latter algorithm shows that it takes at most log2n and n check steps, respectively, for a list of length n. In the depicted example list of length 42, searching for "Lobo, Haddock" takes 5 and 34 steps with binary (shown in blue) and linear (red) search, respectively.»
Would that be ok? - Jochen Burghardt (talk) 09:52, 27 April 2017 (UTC)