Talk:Program optimization

From Wikipedia, the free encyclopedia
Jump to: navigation, search
WikiProject Computer science (Rated B-class, High-importance)
WikiProject icon This article is within the scope of WikiProject Computer science, a collaborative effort to improve the coverage of Computer science related articles on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
B-Class article B  This article has been rated as B-Class on the project's quality scale.
 High  This article has been rated as High-importance on the project's importance scale.
 
WikiProject Software / Computing   
WikiProject icon This article is within the scope of WikiProject Software, a collaborative effort to improve the coverage of software on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
 ???  This article has not yet received a rating on the project's quality scale.
 ???  This article has not yet received a rating on the project's importance scale.
Taskforce icon
This article is supported by WikiProject Computing.
 

Assessment[edit]

I've done the assessment very quickly, the article still feels like having some issues. For instance, the "premature optimization" mantra has been added over and over without a global review, so the article needs a global restyling to fix this. --Blaisorblade (talk) 12:02, 7 January 2009 (UTC)

Redirect : Software Optimisation[edit]

'Software Optimization' redirects here. In the UK that would be spelt 'Software Optimisation', which currently does not redirect here. I don't know how to do it, but if we could have that redirect here too, that would be very handy, as I keep on landing on the search page, wondering what I've done wrong. 152.78.162.177 (talk) 14:39, 24 October 2011 (UTC)

Have submitted a request for this 81.174.243.233 (talk) 23:36, 17 November 2011 (UTC)
Submission has been accepted. This issue is now closed. 81.174.243.233 (talk) 01:32, 18 November 2011 (UTC)

Load Balancing[edit]

I know you may say load balancing is a performance tuning not optimization but one likely says "optimize the whole system by balancing load" so. -- Taku


About this addition

Usually, the optimizer does not modify actual code but produce code in the same time when compilers produces target code automatically. In other words, redundancy or any sort of inefficient code in source program remain unchanged with optimizers. To get an efficient code without optimization, one needs to optimize code manually. However, since most today's compilers have at least naive optimzers, manual optimization that can be done by optimizer is usually unnecessary.

I know this completely sounds awfully redundant for computer programmers. But what I found is that often people who have no knowledge about compliers assume optimizers actually covert code written by human to better, efficient code. And of course, we know what's not actually happening. At18, sorry but I think your revision misses this point. I will try once again. See and fix if needed. -- Taku 22:44 May 14, 2003 (UTC)

Hi, I just found your paragraph a bit confusing, and replaced it with a shorter version. You can of course try to expand it a bit to state your point better. At18 08:13 May 15, 2003 (UTC)

Too large[edit]

I think this article tries to discuss too many distinct topics at once, and should be split into different articles discussing compiler optimizations, manual optimization by humans, and performance tuning of large systems with things like load balancing. They can cross-link all they like, but this article isn't focused enough, in my opinion. Derrick Coetzee 15:39, 15 Nov 2003 (UTC)

Yes, I agree. There used to be a few optimization-related articles because they had overlapped discussion. I was thinking of spliting off but was unsure what organizations should be good. Yeah, I guess at least compiler optimization can be splited. -- Taku 15:51, Nov 15, 2003 (UTC)

Remember the other uses of optimization in computer science. I've written programs to optimize the efficiency of routing of bulk cargo ships and done outline design for a system to optimize the crude oil shipping system of the Kuwait National Oil Company and there's the ever-present travelling salesman optimization problem (which the other two are somewhat representative of). This article seems more about code and compiler optimization that the whole spectrum of optimization in computer science. I think that Derrick made a good point and that this article needs to cover the general concepts in a shallow overview, then link to the detailed articles for specific techniques in each field. JamesDay 12:34, 17 Nov 2003 (UTC)

While I agree that the article has issues, I'm sorry but I disagree with you, and I'll quote the article disambiguation to explain why. "For algorithms to solve optimization problems, see Optimization (mathematics)." The disambiguation can be refined, but you are really talking about optimization problems solved through computer programs, while this article is about optimizing performance of computer programs themselves. Renaming the article as performance optimization might be useful to reduce the confusion. I don't agree about a destructive split, a common article should be left to introduce the general topic, while maybe moving more material elsewhere. Uh, this discussion is quite old! --Blaisorblade (talk) 12:06, 7 January 2009 (UTC)
I agree with this. Any practical suggestion regarding the organization of this article? -- Taku 19:13, Nov 17, 2003 (UTC)
I think the article should be sensibly merged with Algorithmic efficiency but kept 'comprehensive' to retain all the relevant issues in one place, a point which has largely escaped notice because of disparate articles all slightly different with non neutal POV. I have tried to link a good many of the topics but a coherent and logical approach is really called for. I believe the existing algorithmic efficiency article is a good starting point with any current;y ommitted topics that appear in other articles merging into it. One editor commented out my additions about good choice of database and data definitions being very important at the design level because he claimed "most programs don't use databases" - the gross ignorance out there is quite unbelievable! —Preceding unsigned comment added by 81.129.233.212 (talk) 16:29, 17 September 2008 (UTC)

Premature Optimization[edit]

"Premature Optimization" gets redirected to this page - I don't think it should. Premature optimization is a subtle problem which isn't obvious to a novice. That's why Donald Knuth chose to highlight it. I'd like to see some rules of thumb or red flags which signal that an optimization is premature. Something like, when someone says "this will improve performance two-fold and we can easily manage the slight increased risk", or, as the project nears completion, there is a mad rush to implement a few more performance improvements before release.


In my opinion (but that's just mine !), the question of deciding whether an optimization is premature is essentially a matter of experience and personnal choices, and thus a whole list of what kinds of optimizations are premature is really difficult to establish, at least in an encyclopedia. Maybe a Wikibook on this topic would be interesting, or you may look at the bibliography of this article.
King Mike 15:03, 29 January 2006 (UTC)
All optimizations are premature unless you have _measured_ the system to be too slow. And you can't measure something that is not built yet. Lost Goblin 16:34, 29 January 2006 (UTC)
Exactly. Gritzko 04:15, 16 June 2007 (UTC)
I highly disagree with this comment. It may be the case that many programmers working on many kinds of programs (particularly in large project settings) are completely clueless as to where their machine resources are going, but this hardly applies to everything one might program. In particular, the lower level the code is and the more familiar with the underlying architecture the programmer is, the better its performance characteristics will be naturally understood. It's also unclear exactly what "profiling" means. I often find it much more useful to write specialized profilers that match the nature of the program than to simply rely on function call counts, but this is what people usually mean when they mention profiling. For code that intends to be high performance, function calls may break down entirely and you won't end up with information that is nearly as fine tuned as you require it to be.
I believe that the opposite of premature optimization (waiting too late to optimize) may be equally damaging, as the more mature a program's design is the less accommodating it will be to major underlying revision. It's important to design, program, and incrementally revise with particular performance goals in mind and simultaneously maintaining a clean and elegant design, rather than to just ignore performance and believe you can optimize it later. The practice of optimization is largely a matter of understanding resource usage to begin with, so in that sense truly "premature optimization" can hardly be considered optimization at all - it is usually just application of a few trivial transformations that the user was taught, and most likely shows that the user is incapable of effectively optimizing to begin with. A proficient optimizing programmer will have a deep understanding of both the underlying architecture (machine, libraries, OS) and the tools at hand (compiler). Even profiler results just give you a broad place to begin and capture some dynamic data (that you might be able to infer simply by your understanding of the problem); if you don't understand where resources are going then you won't be able to optimize it.
Interestingly, although premature optimizations (which I would consider to be pseudo-optimizations) can damage the clarity and functionality of code, true optimizations often enhance these aspects by making code simpler and more concise. Also interesting is the sheer number of optimizations Knuth himself has given in his books. I see the "root of all evil" mantra being preached so heavily without much further explanation (in a kind of fire and brimstone way where the phrasing and authorship itself carries so much weight that such explanation is deemed redundant) that I'm absolutely certain that it has perpetuated an attitude of lazy programming that has resulted in software with sub-par performance. Experience leads me to believe that often the people pushing this phrase are (much unlike Knuth) incapable of performing serious program optimization themselves, with or without profilers. Exophase (talk) 19:10, 8 February 2008 (UTC)
Your last paragraph is the important one. While somebody may quote Knuth to defend sloppy coding or horribly slow designs, that's IMHO misquoting, and I'd ignore that. The quote should be applied only to optimizations which hinder readability/maintainability. However, even getting a design right takes time, and sometimes optimizing a design does not optimize the correct problem.
Also, your reasoning seems to imply that only not-so-competent programmer can need profiling about performance. That's not true. Obviously a good programmer has some clues about where time is spent, but those are often wrong. Actually, I've heard the Knuth mantra only from very good programmers, like in the Linux development community, or in the Google V8 community. --Blaisorblade (talk) 00:12, 4 January 2009 (UTC)
I agree 100%81.129.233.212 (talk) 16:32, 17 September 2008

I agree with Exophase 100%. I've been in the Snes homebrew development scene for a long time and there certainly are a lot of arrogant pricks who use this phrase as an excuse for sloppy horrible code. The Snes always had a bad reputation for having a slow cpu because of this reason. They complain about how they optimize their finished code for so long and not seeing a significant speed increase. I always advise them to program their code consistantly efficient to begin with, so they don't have to worry about it later, but they never take my advise and incorrectly quote the phrase "premature optimizations are the root of all evil" and continue writing sloppy slow code thinking that they can fix all their problems at the end. (UTC)

Reference 'the art of programming'[edit]

I dont know exactly what parts of the article are inspired by the cited source 'the art of programming'. I know of one book entitled like that, but it has no real focus on optimization. Instead, since D.E. Knuth is cited several times in this article, the real source may be his book, 'the art of computer programming'. Am I wrong ?

King Mike

Yes check.svg Done Fixed by somebody else. --Blaisorblade (talk) 01:12, 4 January 2009 (UTC)

Mergefrom Loop optimization[edit]

Well, this is a classic, isn't it? (regardless if this article is too long or not). Ripper234 20:35, 2 April 2006 (UTC)

The loop optimization article looks fine on its own, and since we're trying to pare down this article I'd recommend against merging the two. CRGreathouse 23:38, 19 July 2006 (UTC)
I am against merging both articles, maybe you are looking for a more specific one, such as Compiler optimization. In fact I would propose to remove the Loop optimization article or to merge it with Compiler optimization (which seems to contain the former one).
Yes check.svg Done The mergefrom has been removed already by somebody else. --Blaisorblade (talk) 01:13, 4 January 2009 (UTC)

Contradiction[edit]

Improving about 20% of code is often responsible for 80% of the results contradicts with 90% of the time is spent in 10% of the code, and only 10% of the time in the remaining 90% of the code Sjord 17:59, 14 April 2006 (UTC)

Strictly, I don't think they are talking about the same thing -- one's about writing the code and the other about rewriting it. Still, I'd be happy to see both removed since they're essentially unverifiable. CRGreathouse 23:38, 19 July 2006 (UTC)
Since the statements are still there, even if they have been clarified, I'll answer by saying that this is a well-known rule of thumb among programmers. If anyone wants to challenge that, look for computer science literature about optimization or add [citation needed] to that. About the seeming contradiction, the rule is stated with various rates (80-20, 90-10), because those are not exact rates. I think, however, that the first statement has been incorrectly inferred from "80% of the time is spent in 20% of the code", or something like that. I say "incorrectly inferred" because one cannot copy-n-paste ratios here, for instance because the remaining source code should not be optimized at all. --Blaisorblade (talk) 22:35, 3 January 2009 (UTC)

Very high-level language[edit]

in the article it says that python is a "very high level language". I would say very high level languages are Domain-specific_programming_languages (DSLs). IMHO Python is just a high-level language and it's definitely no DSL!

--Riedel 21:57, 19 July 2006 (UTC)

Energy efficiency[edit]

This article should also address hardware and energy aspects of efficiency and optimization. But it does not currently address them at all, and has no links at all to any such info!-69.87.202.60 12:05, 17 May 2007 (UTC)

Yes it should. Please add them. Derek farn 12:31, 17 May 2007 (UTC)

Should link to "A Plea For Lean Software" be removed?[edit]

The link

refers to an article requiring paid subscription to read. WP:EL guidelines state: "6. Links to sites that require payment or registration to view the relevant content." should general be avoided. If the IEEE is hosting it, I expect it is a classic article. Nevertheless, it's not accessable without payment. Is it OK to remove the link? Regards, --Camacan 03:33, 26 July 2007 (UTC)

Could we link to something like: http://cr.yp.to/bib/1995/wirth.pdf instead? Also, the current link does atleast point to an abstract, that's not all that bad.... --DFRussia 07:57, 18 August 2007 (UTC)

System versus Program versus Code[edit]

It appears to me that this topic starts out correctly -- talks about system optimization -- then goes right down into talking about optimizing pieces of code. But these are two separate beasts. The comments about premature optimization (by Knuth and others) focuses on optimizing pieces of code. But note -- many of these quotes are very old -- 30 years or so. And they are applicable to that domain (code optimization) but not to systems optimization which holistically cover the entire infrastructure.

What do you think? SunSw0rd 17:58, 13 November 2007 (UTC)

interpreted languages[edit]

The section on interpreted languages is bullshit. Interpreters usually work in 3 phases: parsing the source into a series of tokens, parsing the tokens into a syntax tree, and then interpretation of the syntax tree. By far, the most computationally expensive is the last phase, because that's where the code is actually executed. Removing extraneous whitespace and comments only affects the first phase (the lexical analyzer), which is already extremely fast.

This section is really talking about size optimization over networks. This has nothing to do with interpreted languages. You see this for any form of data. For any data, whether it is source code, executables, or plain media, it is better to have smaller sizes so it takes less time to transfer over a network.

In conclusion, I suggest this section be removed, and possibly replaced with an obvious "data should be optimized for size to speed transfer over networks" statement.

--72.177.62.3 (talk) 22:58, 11 December 2007 (UTC)

Macros[edit]

The macros section seems inaccurate to me. First, the claim is made that C preprocessor macros are mainly limited to performing the same function as inline. I've seen this argument before, which has led people to largely condemn macros as inline is safer and more practical. In reality, C preprocessor macros are capable of many of the same things that Lisp-like macros or C++ templates are. This is because they do not merely expand to text limited to one pass but will also recursively expand all expansions. This allows macro names (or parametric parts of macro names) to be passed as macro parameters, and although the expansion is either very wide and specific or very general (as opposed to something mixed like what template specialization allows) it can be hierarchically arranged to minimalize the amount of redundant macros needed to allow for specialization. I've seen few people actually use CPP macros in this fashion but it can be done and it's quite powerful (and I regularly program with this technique).

Second, I don't understand what kind of claim is being made that "compiled code" is substituted in Lisp-like macro systems. Compiled code is neither modified nor inserted. Macro expansion is as much textual expansion as the CPP style is, it merely has more facilities for pattern matching, hygiene, self recursion, specialization, etc. It's certainly a much more powerful system however.

I'd appreciate it if some others who feel experienced with these topics could give their input. I feel that this entire section should be rewritten, although it might not really belong in this node to begin with.

Exophase (talk) 19:53, 8 February 2008 (UTC)

In my opinion, this section does not belong at all. Macros (whether C++ or Lisp like) offer some performance efficiencies at the cost of parsing and compilation time, but such efficiencies rarely apply to the concept of optimization being discussed in the main article. I suggest removing this section completely. 97.117.232.119 (talk) 03:55, 27 September 2008 (UTC) Faraz Babar.

I agree that macros are a much broader topic than optimization. Specifically, I think of macros as a "poor man's automatic programmer", and if used intelligently they can turn your source code into something that gets closer to a domain-specific-language. I suppose that will raise some people's fur. MikeDunlavey (talk) 22:06, 7 November 2008 (UTC)

Usage of C macros which cannot be replaced by inline is something I use regularly, but it does not matter for this article at all (even if you can use them for some optimizations, I don't see any notable pattern in that). I did rewrite the section - I knew both C macros and Scheme macros, and did some research on Lisp ones. I agree about Domain-Specific Languages through macros (never used them like that, but it's quite obvious that it's possible - after all, even libraries can be described as DSL). About "compiled code substituted" in Lisp systems, I hope I clarified the issue in the new text. --Blaisorblade (talk) 11:56, 7 January 2009 (UTC)

I agree with Faraz Babar that the section on macros is not directly relevant to optimisation, or at least certainly does not warrant such expansive coverage. It implies that macros are of themselves an optimisation technique, which they are not - it is possible to degrade performance with a macro in C through multiple evaluation of argument expressions. It is also incorrectly placed as a sub-section of "When to optimize" —Preceding unsigned comment added by 217.40.148.115 (talk) 10:05, 13 March 2009 (UTC)

I agree too. It should probably go, if there's consensus. For now, I've fixed it to not be a subsection of "When to optimize"; thanks for pointing it out. Shreevatsa (talk) 13:47, 13 March 2009 (UTC)
I also agree that macro-use is purely code substitution and therefore may optimise software development time but does not improve the performance which is what this article is about. — Preceding unsigned comment added by 141.228.106.150 (talk) 09:51, 11 October 2012 (UTC)

Dispute of sentence (about macros)[edit]

"In some procedural languages, such as C and C++, macros are implemented using textual substitution, and so their benefit is mostly limited to avoiding function-call overhead."

Anyone that's read the inlining article on wikipedia would know that removing the function-call overhead is the -least- effect that inlining code has. Constant folding and dead code removal have a far greater impact. I'm not game to change it myself, as I don't know anything about functional programming languages. Themania (talk) 10:19, 5 April 2008 (UTC)

Yes check.svg Done I've changed the paragraph, the old text didn't make a lot of sense - I guess that was written by a Lisp programmer with limited knowledge of C/C++. Some details may need fixing though. --Blaisorblade (talk) 10:02, 5 January 2009 (UTC)

Opening[edit]

The third paragraph of the opening:

Optimization can occur at a number of levels. At the highest level, the design may be optimized to make best use of the available resources. The implementation of this design will benefit from the use of efficient algorithms and the implementation of these algorithms will benefit from writing good quality code. Use of an optimizing compiler tends to ensure that the executable program is optimized. At the lowest level, it is possible to bypass the compiler completely and write assembly code directly. With modern optimizing compilers and the greater complexity of recent CPUs, it takes great skill to write code that is better than what the compiler can generate, and few projects need resort to this ultimate optimization step. Although a large amount of code written today is still compiled to run on the greatest percentage of machines. As a result, programmers and compilers don't always take advantage of the more efficient instructions provided by newer CPUs. Since optimization often relies on making use of special cases and performing complex trade-offs, a fully optimized program is usually more difficult for humans to comprehend hence tends to contain more faults than unoptimized versions; though this tends to hold mostly true for those not well-schooled in the programming language it was written in.

is a mishmash of several ideas; I tried to edit it and was unable to find a good way to handle it. Any suggestions?

I see the following points:

  • There are different levels of optimization
  • Hand-coding rarely beats compilers (clarification: not my claim)
  • Code is often compiled without machine-specific flags
  • Optimized code is more fragile than unoptimized code

Which of these need to be covered in the opening, and need they all be in this paragraph together?

CRGreathouse (t | c) 04:33, 11 February 2008 (UTC)

I think the second point in particular should not be in the opening and should have its own section altogether, because it deserves some elaboration. For one thing, I think what you said is exaggerated (and doesn't reflect what the original paragraph said) - for some platforms such as ARM (at least the versions widely commercially available right now) where GCC is used predominantly hand coding can beat it on a regular basis. "Rarely needed" can be verified by general consensus, but claiming that hand coded ASM can rarely beat a compiler should be backed by empirical evidence, if any exists.
Exophase (talk) 18:04, 11 February 2008 (UTC)

I totally agree with Exophase, hand coding can benefit from experience and good practise far more than a compiler can ever be capable of. A compiler can not do what a cleverly designed JIT 'compiler' can do for instance. I know, because most of my programs have been designed by me that way for 40 years plus, since I realized the advantages around 1968. My code was also exceedingly robust, far more so than most other compiled or otherwise and usually clearer, shorter and built on decision table lines showing each decision type and actions clearly. I still believe Assembler is the easiest language to write good code in, far more efficient than C and, properly documented is actually easier to maintain. It is a myth that Assembler cannot be produced economically and using macros, can be actually shorter than equivalent HLL stuff. See [[1]] for a branch table example where SWITCH/case style branch table is implemented in about 4 machine instructions with no comparison at all!81.129.233.212 (talk) 14:34, 17 September 2008 (UTC)

assembler optimisation requires "great skill"[edit]

"With modern optimizing compilers and the greater complexity of recent CPUs, it takes great skill to write code that is better than what the compiler can generate, and few projects need resort to this ultimate optimization step. Although a large amount of code written today is still compiled to run on the greatest percentage of machines. As a result, programmers and compilers don't always take advantage of the more efficient instructions provided by newer CPUs."

I have to disagree, in the case of streaming instructions and common assembler optimisations the only skills required are to be able to write assembler and to know of their existence. Its very easy to out do even the MS C++ compiler with some asm... my first foray into assembler programming was for just this purpose and was so easy that my horribly inefficient inline assembler was faster. Even simple stuff like the memcpy and memset packaged in the latest runtime libraries for MSVC++ can be easily optimised by using unrolled loops and the non-temporal copy instructions... Certainly nothing more skillful than just using the provided instructions as they were intended to be used.

Another classic example is that you have to specify the instruction set in the MSVC++ compiler, whereas it is easy (though time consuming) to write code that operates on all Intel CPUs utilising the SSE/SSE2 etc... instructions as appropriate.

I think its just that these technologies are still relatively new to the compiler and so the support is not yet full or powerful. I'm sure the compilers and libraries will catch up, but there will forever be new technologies available that will allow people with knowledge of them to outdo the compilers...

Of course this only applies to a small set of optimisations, its certainly not easy in all cases... but in many cases where you want to do it it is.

I find it disappointing that assembly programming is labelled as "something hard that you probably don't need", when really its easy and useful... particularly when you run out of optimisations to do... you may even find that optimising memcpy or memset will provide a signifcant boost (in a ~512x512 software triangle renderer used in Winamp AVS the speed gain from zeroing memory with assembler was about 3-4fps, from 27fps up to about 30fps).

I'm going to edit this statement to remove the "great skill" part, it seems out of place even once I ignore my personal feelings on the matter... "it is most often diffcult" is better imo... —Preceding unsigned comment added by 194.168.3.18 (talk) 15:28, 25 March 2008 (UTC)

oops... not logged in -- Jheriko (talk) —Preceding unsigned comment added by 194.168.3.18 (talk) 15:31, 25 March 2008 (UTC)

"You have to specify the instruction set in the MSVC++ compiler" doesn't count, obviously, as assembly programming. Then, you say you can outdo "even the MS C++ compiler"... I'm a GCC user, but the only really good optimizing compiler I know of is the Intel compiler ICC (but GCC, for some reason, used time ago to often outperform it on integer programming - I hope they fixed ICC). Finally, loop unrolling is not without disadvantages, so most users should just rely on libraries (and have good ones). I'm surprised compilers can't handle this. The current text is anyway OK, so let's mark this as Yes check.svg Done. --Blaisorblade (talk) 11:49, 7 January 2009 (UTC)

Abstract Syntax Trees?[edit]

I might have missed it in the "unbulleted lists" presented at the end of this article, but a discussion of AST-based optimizations seems to be conspicuous in its absence. There are other forms of "optimization" in computer science that don't necessarily center on code rewriting and rearrangement; caching comes to mind. SqlPac (talk) 03:01, 19 September 2008 (UTC)

Concurrent Programming[edit]

Use of multi-core processors, multiple threads and other concurrent processing techniques have been totally omitted from this article. (I added a reference on the list). But really in this age it is one of the key techniques to optimising software with regards to time. (Even on a single-core processor, using multiple threads will often be faster as the operating system is often able to slice up the processor time more optimally). — Preceding unsigned comment added by 141.228.106.150 (talk) 09:56, 11 October 2012 (UTC)

Trade-offs[edit]

I have a comment on the Trade-offs section. It don't doubt that there is an optimal trade-off curve between, for example, time and memory usage in an algorithm. However, the reader should be given to understand that very few programs (as opposed to theoretical algorithms) are actually on that curve. In my experience, many programs can be improved in both time and memory usage, and thus be brought closer to the curve. MikeDunlavey (talk) 15:50, 8 November 2008 (UTC)

Renamed this article to program optimization[edit]

This is what 99% of the article is discussing. Attempting to generalize this to XXX optimization in computer science, e.d. disk defragmentation, IP route optimization (google this, since we don't seem to have an article) or even various forms of hardware optimization, e.g. VLSI optimization is hopelessly broad. Pcap ping 18:11, 1 September 2009 (UTC)