Hacker News
Considering performance in the small is not “premature optimization” (joshbarczak.com)
120 points by mef on Jan 2, 2015 | hide | past | favorite | 68 comments


Physician, heal thyself.

It always seemed evident to me that Knuth's exhortation was about making sure that the thing you were optimizing was in fact a major contributor to run time -- and that your change was an improvement.

Telling people to avoid, for example, bounds checking because it might turn out to be a cycle soak later sounds like a good way to make your software worse, not better, in the hopes that a few instructions saved will make the difference. I once worked on a code base with three different half-right, hand-hacked versions of date formatting code. I replaced them with strftime(). Certainly the call was slower, but I was provably better off optimizing the timer routine that ran 50 times a second than worrying about hand-formatting dates into strings.


I think the principle of charity that dang has recently been trying to promote applies here. It's probably safe to assume that the author isn't suggesting that software be allowed to critically fail more often in the interests of micro-optimization.

The PDF he linked to for bounds-checking is a series of slides noting that conventional approaches to bounds-checking incur a 60% runtime performance hit, and that there is another approach that works as well but incurs only about a 23% hit.

Aside: rapidly improving energy efficiency has probably been one of the greatest technological developments of the last 20 years; it would be amazing if software development practices could be similarly adapted. Like, is it possible to have a demoscene approach to software development while still keeping portability and maintainability? Imagine the transformation in software we could see in the next 10 years if someone figured out how to do that.


Mobile battery life might be just the thing that pushes this over the critical point. The first party to create a smartphone with a four-week battery life because they made their (tightly integrated) system work on a 5 MHz CPU will make a lot of money.


Smart phones have poor battery life due to background data sync (radios are expensive to power up) and screen usage.

Turn airplane mode on, leave your screen off, and your phone will last a long time.


Airplane mode is not fair. After all, "dumb" phones last for many days even without airplane mode, as long as you don't overuse them. The fair test is this: how long will it last when used as if it were a dumb phone? Answer: pretty long, but much less than an actual dumb phone.


Dumb phones typically don't need to use their radio to do things like check mail though.


That's because they typically don't check mail, at least not in the background. That's fine. Turn off the mail check feature of the smart phone. Now will it last as long as the dumb one? Not quite. I would know, since I do have that feature turned off. As well as location tracking.


I had a professor who used to say, "Make it work and then make it work fast." The point being that you need to both understand and solve the problem before you can figure out how to make the solution faster. It is the reason that the concept of a prototype exists in every engineering discipline.

As an additional perspective, compare the solution of an engineer with 2 years of experience to that of an engineer with 5 years of experience. If the solutions are drastically different then interpretation of a rule such as "avoiding premature optimization" will be drastically different as well.

Like any overly simplified statement, it is actually highly subjective. The author of the article even calls out the specific context in which their interpretation of Donald Knuth's rule is being applied, "I’ve listed lots of relatively low-level things up there, but that’s just because it’s the level I work at." and as such their interpretation doesn't necessarily apply in other contexts.

Premature optimization is a problem if you're approaching it from a place of ignorance. If you're doing it mindfully based on experience and domain knowledge then it starts to make sense. But even under these conditions your best intentions can be wrong. I've been in plenty of situations where I thought I'd identified a code bottleneck only to have a far easier, cheaper, and better solution completely unrelated to code come to light.


> Make it work and then make it work fast.

The version I've always heard is:

Make it work, make it right, make it fast.

http://c2.com/cgi/wiki?MakeItWorkMakeItRightMakeItFast


Today I had to debug code with some database calls within 5 levels (at least) of for-each loops.

I stopped measuring at ~40,000 database round-trips.

"Engineering time costs more than CPU time" was the attitude, and for the original problem in its original specification, the solution was clearly OK.

But here we are, now needing to work out the original specifications, work out the current implementation (in case they differ anywhere), and work out whether it's worth re-writing it top-down or just fixing the worst of the loops.

And I'm not blaming whoever wrote this originally; it must have done its job to make it into the code base. But it really sucks to have to unpick it, because "database calls are free" is an assumption that unravels in a really messy way.


"...for the original problem in its original specification, the solution was clearly OK."

That statement leads me to believe this was technical debt that went unpaid and not a failing of design. Additionally, if your company is successful enough to spend the time required to pay down this technical debt, then in your position I would count myself lucky.

"...it must have done its job to make it into the code base...", exactly right! Every time I hit one of these situations I view it as a way for me to, "Try and leave this code a little better than you found it." I'm all about making things easier for the next guy who comes along.


I quickly learned that batching database and web service calls instead of making them inside a loop is generally not a premature optimization. It is one of the few performance considerations I consider while writing and reviewing code.
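A minimal sketch of the difference, in Java. The table name and helper names are made up for illustration, and real code would use a PreparedStatement with placeholders rather than string concatenation:

```java
import java.util.List;
import java.util.stream.Collectors;

public class BatchQueries {
    // N+1 pattern: one statement (and one database round-trip) per id.
    static String fetchOne(int id) {
        return "SELECT * FROM orders WHERE id = " + id;
    }

    // Batched: one statement and one round-trip for the whole list.
    static String fetchBatch(List<Integer> ids) {
        String inList = ids.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(", "));
        return "SELECT * FROM orders WHERE id IN (" + inList + ")";
    }

    public static void main(String[] args) {
        List<Integer> ids = List.of(3, 7, 42);
        for (int id : ids) {
            System.out.println(fetchOne(id)); // three round-trips
        }
        System.out.println(fetchBatch(ids)); // one round-trip
    }
}
```

The query text is nearly identical either way; what changes is the number of round-trips, which is where the ~40,000-call situations above come from.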


> now needing to work out the original specifications, work out the current implementation (in case they differ anywhere), and work out whether it's worth re-writing it top-down or just fixing the worst of the loops.

That always seems to be the problem: refactoring is only cheap if you can be sure that your test coverage catches it if you break something. Of course, the code bases that need heavy refactoring probably don't have it, because otherwise they would have been improved already...


"The website is temporarily unable to service your request as it exceeded resource limit. Please try again later."

Cache: https://webcache.googleusercontent.com/search?q=cache:zjuUa-...


This was a very nice essay, and I'm glad to have read it.

But it's impossible not to gloat at least a little bit about someone lecturing us about low level efficiencies from a site that can't manage to serve static blog content to a few hundred visitors a minute (or whatever it is that the top slot on HN drives these days).


The essay is about software you write. The author most likely didn't write the blog software nor the web server software.


No, he didn't write it. He had a much easier job: to pick software that does do it right.


He likely doesn't give a shit about that.

If my blog is unreadable for five minutes a month, I am perfectly happy and I'm not going to change anything.


That's fine, but it's exactly the opposite of the attitude expressed in the post.

"The ratio of bytes loaded to load time should be very close to the I/O throughput of the machine."

How do you think the bytes served for a few hundred requests a minute compares to the theoretical I/O throughput of the relevant machines?

And because no one can see my tongue in my cheek across the internet, I'll be explicit that I'm sure he is a user of his publishing environment, not its author, and so it's fair to view this as just another symptom of the problem he's discussing.


But it will be unreadable when most people want to read it. Sure, the 200 readers that normally read your blog in the course of the month won't be much disturbed. It's just the 20,000 readers that tried to reach it in a couple of minutes that you will lose.


Doesn't give a shit? Seriously?

I wouldn't dream of keeping a blog that couldn't handle a HN traffic spike. That's the whole point of having a blog.

Failing at this is like launching your startup's MVP to an audience of the first fifty people who were able to load your product launch page before it crashed under the load. In other words, a giant, easily-avoidable waste of effort.


> I wouldn't dream of keeping a blog that couldn't handle a HN traffic spike. That's the whole point of having a blog.

It seems possible that some people don't care that much about HN traffic spikes.


All too often, "premature optimization" is brought up as the antidote to carefully thinking about what you're implementing prior to actually opening up the IDE and madly starting to code tests.

Thinking is hard, and takes time, and we want to get the feature out now, immediately, and worry about performance later. If at all.

And, sad to say (for an engineer), it's not clear that from a "business" perspective it's wrong. Hard to argue when accumulating features seems to matter more than avoiding crappy software. We have a lot more software these days, to run all of these bright new pieces of hardware, and perhaps because I'm an old-timer, the general quality seems to have degraded significantly. But the novelty of the stuff certainly has exploded, and I'm continually delighted by the twists and features that folks are coming up with, while being saddened by crashes, slowdowns, the need for restarting, etc.


For some reason, the vast majority of developers take 'premature' to be a synonym for 'any'.

If CS majors spent a fraction of the time learning how to optimize the way EE/CEs do, we'd need a lot less magic from the latter.


A huge amount of CS is algorithms and data structures. My undergrad degree also required two semesters of learning how a computer works from a logic gate level upwards, culminating with writing a simple pipelined CPU in Verilog. After three years I had a pretty good understanding from both the high level (algorithms) and the low level (caches, etc).

CS programs vary, but basically all of them give you the tools needed to write efficient code. They generally don't teach you much about software "engineering", though, which remains a chaotic mess of a field.


Part of the problem, is that Dijkstra's complaint is often still quite relevant: "Software engineering, of course, presents itself as another worthy cause, but that is eyewash: if you carefully read its literature and analyse what its devotees actually do, you will discover that software engineering has accepted as its charter 'How to program if you cannot'."


Drawing a parallel between software optimization and hardware optimization is unsure footing at best. Trying to map real world and often vague requirements to their corresponding software implementation is outright difficult. The closest analogy I can come up with regarding hardware would be to implement a piece of hardware that can take a varying number of I/O each with a completely unique decoding and encoding scheme.


> a piece of hardware that can take a varying number of I/O each with a completely unique decoding and encoding scheme.

So, a CPU?


If that CPU is magical and can encode/decode audio, and video, and any other format without additional software written for it, then sure, "a CPU" as you so simply put it.


Yes, yes it is. Simply asserting the opposite doesn't suddenly invalidate an entire industry's decades' worth of experience.

One good point from the essay, though, is Knuth's example: a 12% speed increase for low effort is definitely worth doing. I agree.

A better way of putting things is:

Considering low effort, small performance improvements that don't affect other factors such as code readability or system maintainability is not "premature optimization".

If you are considering performance "in the small" and it affects maintainability, you are indeed prematurely optimizing.


"Premature" means "before measuring"


There are many optimizations that make sense to do "before measuring" because they're free and easy to do as a matter of habit. For example, the use of value types in C# is just as easy as classes--you just need to know how your runtime works and pick the correct allocation method.

It takes me literally no time whatsoever to know intuitively that, yes, this should be a struct rather than a class. But I've seen developers dismiss this as "premature optimization" rather than learning how their software works.


If you already know one method is faster than another, then it is post-measurement, even if you measured it in a previous project, and thus not premature.


You have made the considerable investment required to learn a language that allows making such a distinction, and when to use which. Other people have spent that time doing other things, and yet they are still sometimes able to produce software that is valuable to themselves or others.

One man's saw sharpening [1] is another man's yak shaving [2].

Personally, I like my saws very sharp, thank you very much, but I think it's important to also have sympathy for people with different backgrounds and priorities.

[1] http://c2.com/cgi/wiki?SharpenTheSaw [2] http://programmers.stackexchange.com/a/34788


Can you quantify the % performance gains you've gotten from this on a given project? Because from what you've said here, it seems entirely possible you're not actually writing appreciably more performant code than they are. This is why the measuring is usually how you know how your software works.


I think you missed the point of the parent comment. If there are two ways of doing the same thing, and one way is clearly more efficient, there's no reason not to do it the efficient way. Get in the habit, and you won't even waste brainpower thinking about it. 99% of the time it might not make a difference, but for that 1% you're ahead. You're also covered if a change somewhere else causes your code to be executed a lot more often than you expected it to be.


Agreed. Though, to answer your parent post: when helping friends with XNA or MonoGame projects, I've shaved entire milliseconds off of frame times (particularly notable when you have a 16ms budget) by removing needless reference types, either by unpacking and using primitives or by replacing those reference types with structs.

Or, when I was at TripAdvisor, we (not me originally, though I inherited it) greatly improved typeahead performance by dropping the built-in collections classes (which used boxed primitives, because lolbrokengenerics) in favor of Trove (or Colt, I can't remember) and the reified primitive types. If I write Java today, I reach for Trove as a matter of course just because it means I never write List<Integer>.

This stuff matters. Getting in the right habits is not hard and will pay off in the ten percent of cases where it matters.
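A small sketch of the boxed-vs-primitive gap described above; the class and variable names are mine, not anything from the TripAdvisor code:

```java
import java.util.ArrayList;
import java.util.List;

// Why List<Integer> costs more than int[]: each boxed element is a separate
// Integer object on the heap, so iteration chases pointers and allocation
// pressure goes up, even though both give the same answer.
public class BoxingCost {
    static long sumBoxed(List<Integer> xs) {
        long total = 0;
        for (Integer x : xs) {   // each element is unboxed here
            total += x;
        }
        return total;
    }

    static long sumPrimitive(int[] xs) {
        long total = 0;
        for (int x : xs) {       // contiguous ints, no unboxing
            total += x;
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        int[] primitive = new int[n];
        List<Integer> boxed = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            primitive[i] = i;
            boxed.add(i);        // autoboxing: allocates Integer objects
        }
        // Same result; the difference is memory layout and allocation,
        // which is exactly what primitive-collection libraries avoid.
        System.out.println(sumBoxed(boxed) == sumPrimitive(primitive));
    }
}
```

Primitive-collection libraries like Trove exist precisely to get the `int[]`-style layout behind a collections-style API.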


The problem is that when the software works well, it's hard to attribute that to all the experience of the developer, whose knowledge of which performance traps to avoid has helped the software. It is only when a performance problem shows up that people actually notice and measure things, since it's worth measuring at that point. But there is still value in the experience-based optimization steps developers take, based on their prior experience with similar software, without any measurement (with the caveat that it doesn't complicate the software or make it hard to understand).


I don't entirely agree with this.

There's a lot of low hanging fruit, especially in languages like C and C++, where a slight code modification can make a difference in run time, with zero impact on code readability. Here's one example:

Using:

    size_t len = strlen(myString);
    for (size_t i=0; i<len; ++i) { ... }
Over:

    for (size_t i=0; i<strlen(myString); ++i) { ... }
The second for loop is potentially O(n^2) because strlen needs to scan the entire string every time through the loop, where the first for loop computes strlen once. It's possible the compiler can optimize so that only one strlen is used in the second case, but it's not guaranteed. Probably it doesn't matter, but it will start to slow down once this function is called with longer strings and/or inside tight loops.

They're almost the same, so why ever use the slower one?

And C++ is even worse with destructors and copy and move constructors being called all over the place for people who aren't careful.


I would call using the first example good programming and not an optimization. Anytime a loop is involved it is simply good practice to limit calling anything multiple times unless absolutely necessary.

A second reason the first is better is that it clearly conveys to the next programmer that there is no intention for myString to change length during the running of the loop. If I came across the second example I would wonder if the length of myString was changing in the loop.


In most cases I expect the C compiler to hoist the strlen out of the loop (I don't know whether it will in practice, but the V8 compiler for Javascript will definitely make a similar harder optimization, so it doesn't seem unlikely).


I just tested this with GCC (gcc -S test.c --std=c99) and it doesn't optimize out the strlen inside the loop if the string is a variable (even if defined by const char *foo = "constant string";). If you're calling strlen on a constant string directly, it is indeed smart enough to optimize it out.


This is the correct behavior. It's not like they're immutable strings, or even that the variable's guaranteed to be pointed at the same place on every iteration of the loop. (You could say that a sufficiently smart static analyzer should be able to do that...but that static analyzer better ask about that thread you're running over there...)

Compilers have a limit to how smart they can be, and that totally makes being smart about your stuff all the more important.


As far as I understand, a string literal like the one in that example can be treated as immutable: modifying it is undefined behavior, even if you cast away the const qualifier first.


I don't think that's "zero impact on code readability" at all; the extra line and extra variable adds overhead when reading, all of which will add up over many maintenance cycles.

And I do think it's "zero impact on performance" in any real program; even if the compiler didn't optimize, I'll bet that particular loop wouldn't even show up in your profiler.


What makes you think it's zero impact on performance, necessarily? Any sort of parsing is normally CPU-bound if anything, and anytime you're lexing+parsing, you're iterating over characters in a string almost by definition.


and "before defining acceptance criteria"



It's incredible how there is always a relevant xkcd comic.


The problem I have found is not early optimization, but knowing what to optimize early and what can wait. Everything in a system can be optimized, but there obviously isn't the time to do this.

I work in Android, so the optimizations I look at first are bitmap loading, backgrounding tasks, and SQLite queries. Usually, this is where 90% of the performance benefit comes from.


Exactly. I've been working on a codebase where we're trying to make a rush-job project into something mature and reusable. One of the developers on my team was concerned about our obviously and grossly inefficient business-logic layer that applied business rules to a given record. I told him not to worry about it - lo and behold, we ran a profiler and the business layer was barely a blip on our performance problems, and our dynamic GUI stuff represented the lion's share of wasted time.


He is right.

He is right about premature optimizations and also right about craftsmanship.

Many performance problems stem from bad overall design decisions, bad abstractions and bad data structures (summarized: craftsmanship). I always wonder, how fast today's processors have become and how slow they get with today's software.

One of my first computers had 64k of memory and a processor so incredibly slow compared to today's processors that it is unbelievable. And still, so much good software was made with it. In today's programs, the resources of that computer would not suffice for the idle loop.

Also, the first Unix computer that I worked on had only 8MB of memory and ran X11, LaTeX and much other stuff!

We have come a long way down the road of abstractions, to where today's computers can hardly get by with 1GB of memory and a 1GHz processor (2 cores, please, please!) for basic usage.

The real art in computer science is knowing where to optimize -- to use your time most effectively.


Hmm, he complains about the awesome bar being coded inefficiently, but I'm on a several year old laptop and the awesome bar is pretty much instant at showing results for me when I type a key. So his evidence is not compelling in my experience. Reading his links, it sounds like his memory error checking tool is what causes slow down, not the usual Chrome code. Kind of bizarre he is pointing fingers at others when his stuff is the problem that needs to be worked on.

His whole argument boils down to:

> Considering performance in the small is not “premature optimization”, it is simply good engineering, and good craftsmanship

But that's the same reason German industry tends to compete so badly right now. They do a lot of hand trimmed and finished components that really could have been better designed for automation and require less custom craftsmanship. His argument seems to boil down to aesthetics.


I'm very happy to see this article and more people taking this mindset.

Many people don't even know the context or the original quote. And many times I've seen discussions about improving a piece of code shot down by a single incantation of this Knuth quote. It's sad.


Considering performance is a basic tenet of software architecture and engineering. Obsessing over it needlessly is the "premature" part. Considering the business case is the most important thing to remember.


Brilliant blog. Thank you for writing this. Optimization is important! You don't have to tune everything to the fastest possible speed. But remember, in web programming, you may be writing a function that is called 10 or 100 or 500 times per request. That little optimization will add up quickly.

Be smart when developing, and know where the easy optimizations and common pitfalls are in your language and toolset. Optimize as you can without sacrificing maintainability.
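One such easy win, sketched below in Java; the helper name, the call counts, and the per-request framing are all invented for illustration. A helper called once per rendered row can recompute the same value repeatedly, and a tiny cache removes the repeat work:

```java
import java.util.HashMap;
import java.util.Map;

public class PerRequestCache {
    static int computations = 0;

    // Imagine this hypothetical helper is called for every row on a page.
    static String slugify(String title) {
        computations++;
        return title.toLowerCase().replace(' ', '-');
    }

    public static void main(String[] args) {
        String[] rows = {"Hello World", "Hello World", "Other Post", "Hello World"};

        // Naive: recompute for every row.
        computations = 0;
        for (String r : rows) slugify(r);
        System.out.println("naive computations: " + computations);   // 4

        // Cached: compute once per distinct input within the request.
        computations = 0;
        Map<String, String> cache = new HashMap<>();
        for (String r : rows) cache.computeIfAbsent(r, PerRequestCache::slugify);
        System.out.println("cached computations: " + computations);  // 2
    }
}
```

Scoping the cache to the request keeps it trivially correct (no invalidation problem), which is why this kind of change rarely hurts maintainability.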


"As a developer, I am tired of my IDE slowing to a crawl when I try to compile multiple projects at a time. I am tired of being unable to trust the default behavior of the standard containers. I am tired of my debug builds being unusably slow by default."

Don't forget web browsers. Web browsers are horrible.


Nobody's arguing that speed isn't important. What's more important is to get things right.

And no, incorporating speed in the specification doesn't make any difference. Suppose the spec says, "page must load in 200ms." Fine, if you don't care what correct page loading means, you're perfectly served with a blank one.

What's at the root of such intellectual capitulation? Complacency? Absence of skills? "Correctness is hard, let's just randomly perturb settings instead while fiddling with a stopwatch. Correctness is hard, let's just conflate motion with progress."

Whence the shabby treatment of correctness like porn: I'll know it when I see it.


TLDR : sometimes you need to optimize.

I don't understand what's new.


It's not a question of optimize/don't optimize. There's a continuum, and the pendulum swings back and forth. The author makes a case that the prevailing industry attitude is too far on the "don't optimize" side.

And stop quoting Knuth to justify it.


This article is not that well optimized. It could have been much shorter. I think just the lesser-known Knuth quote would be enough.


Your small is another man's big. So, "don't prematurely optimize" could be said another way, "a different point of view is like losing 80 IQ points" or you could say it positively like Kay does http://en.wikiquote.org/wiki/Alan_Kay, but I think he's only positive because he drinks once in a while.


We changed the title to a representative sentence from the article. If anyone can suggest a better one, we'll change it again.


Please don't. As a reader, not seeing the same article under a different title when I return to the homepage is much more valuable than having the "right" title.


The article's title is both linkbait ("Stop Doing Bad Thing, Plus Famous Person") and misleading: people are mostly not misquoting Knuth, as the article itself points out ("Donald Knuth once wrote these unfortunate words"). To leave that title unchanged would violate both of the HN guidelines about titles.


Then please violate and/or change the guidelines.


Misleading and linkbait titles on the front page would make Hacker News much worse.


"Stop changing the story title", Donald Knuth



