I saw someone on Twitter complaining that this behavior can be confusing: without being aware of it, you can write drastically slower code. However, I must say that I find this approach quite pragmatic and neat. In most cases, string concatenation is not going to be a performance issue, and then you never need to care about these details at all. If you're in a performance-oriented setting you need to be aware of this, but as long as it's well understood and documented (which it unfortunately doesn't seem to be), it's quite easy to make it fast.
The big advantage is that there is just a single "string" type. Compare this with Java or Rust where you have separate types for mutable strings (StringBuilder in Java; String in Rust) and immutable strings (String in Java; str in Rust). This is nice both for newcomers (who don't have to learn all the details of how to optimize concatenation) and if you're just writing code which is never going to be in the hot path anyway.
I understand that in some settings (when you really care about performance) it's good to have an explicit difference between these, but I quite like the reduced complexity.
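For anyone who hasn't run into this, here's a minimal sketch of the two concatenation strategies being discussed (function names are mine, for illustration). CPython can often optimize `s += piece` in place when `s` has no other references, but that's an implementation detail; the portable advice is to collect pieces and join once:

```python
def concat_naive(pieces):
    # Worst case quadratic: each += may copy the entire string built so far.
    s = ""
    for p in pieces:
        s += p
    return s

def concat_join(pieces):
    # Linear: str.join computes the total length and allocates once.
    return "".join(pieces)

pieces = ["chunk"] * 1000
assert concat_naive(pieces) == concat_join(pieces)
```

On CPython the first version is usually fast anyway because of the in-place optimization, but on other implementations (or when the string has another reference) it can degrade to quadratic time.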
> Compare this with Java or Rust where you have separate types for mutable strings (StringBuilder in Java; String in Rust) and immutable strings (String in Java; str in Rust).
Not to detract from your point (which is broadly correct) but this isn't exactly right for Rust. Or at least isn't the right way to phrase it.
Rust has one string type, `str`, which can be referenced immutably (`&str`) or mutably (`&mut str`). What `str` can't do is manage memory, which means it can't grow a string. `String` is more correctly a `StringBuffer` (technically a `Vec<u8>`, albeit with additional checks). It can manage memory and therefore grow the size of the string, either by using spare capacity or by reallocating the memory. It is not really a string type in itself; it's just a memory buffer for holding a `str`.
This is all a long-winded way of saying Rust separates managing memory from operating on the memory. This isn't the same as a mutable/immutable distinction, although it is in some ways similar to a growable/un-growable one.
> In most cases, string concatenation is not going to be a performance issue, and then you never care about these details at all.
I would have thought if the performance of anything in a language like Python mattered, it'd be how quickly you can build up a text response, like a JSON document or rendered HTML?
In my implementation of Ruby I use immutable ropes for strings - so instead of mutable arrays of characters I have persistent trees of immutable arrays of characters. This has a massive impact (literally 10x) on real-world code like template rendering.
This is actually a good point. If you're never accessing inside a string directly, but only iterating over it, then it's often far better to build a tree/rope structure than to create a contiguous memory region.
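The rope idea can be sketched in a few lines of Python (a toy model, not how any real implementation does it; the class names are made up). Concatenation just builds a new tree node, so no character data is copied until you materialize the result once at the end:

```python
class Leaf:
    """An immutable chunk of character data."""
    def __init__(self, text):
        self.text = text

class Node:
    """An interior node: the concatenation of two ropes."""
    def __init__(self, left, right):
        self.left, self.right = left, right

def concat(a, b):
    # O(1): allocates one node, copies no characters.
    return Node(a, b)

def to_str(rope):
    # Materialize once by walking the tree left-to-right.
    out, stack = [], [rope]
    while stack:
        r = stack.pop()
        if isinstance(r, Leaf):
            out.append(r.text)
        else:
            stack.append(r.right)  # pushed first, so left pops first
            stack.append(r.left)
    return "".join(out)

rope = Leaf("")
for part in ["Hello", ", ", "world"]:
    rope = concat(rope, Leaf(part))
assert to_str(rope) == "Hello, world"
```

Building N pieces costs O(N) nodes instead of O(N²) copied bytes; real ropes additionally rebalance the tree and cache lengths to keep indexing fast.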
If you were really concerned about performance, you wouldn't use Python to begin with.
That said, there are many use cases where performance does not matter as long as it is not "too much", for different values of "too much" depending on context.
Assembly code making poor choices can be outperformed by Python code making smart algorithmic choices. The sentiment behind "don't use Python for performance critical code" isn't wrong, but there's nuance. Programmers should make informed choices about space and time complexity regardless of the language being used.
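To make that concrete, here's a rough sketch of the kind of algorithmic choice that dwarfs language-level speed (the functions are mine, just for illustration):

```python
def has_duplicate_quadratic(items):
    # O(n^2): the "poor choice" that no amount of hand-tuned
    # assembly will rescue for large n.
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False

def has_duplicate_linear(items):
    # O(n) expected time with a hash set: the "smart algorithmic choice".
    seen = set()
    for x in items:
        if x in seen:
            return True
        seen.add(x)
    return False

data = list(range(2000)) + [0]
assert has_duplicate_quadratic(data) == has_duplicate_linear(data) == True
```

Past a fairly small n, the set-based Python version beats the nested-loop version no matter what language the latter is written in.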
Your typical assembly programmer may be far more aware of their obligation to do so than your typical Python programmer, so in my mind it's more important that languages occupying Python's ecological niche behave predictably. It can be challenging to balance that need against other constraints like limiting the number of abstractions someone needs to master in order to be productive.
Sure, assembly is overkill for most tasks, but Python's performance is so poor that you can sometimes write a brute-force double loop in C++ and have it outperform anything in native Python.
Sometimes raw performance does save developer time, because you don't have to worry that much about the algorithm. :)
It may be confusing, but it applies to any dynamically-sized container. For mutable containers the impact is smaller and can often be lessened by increasing the capacity of the container before appending items, but often you don't know the target capacity in advance. You'll have to learn it, or use a language that doesn't have dynamically-sized containers, such as classical COBOL, FORTRAN or (to a lesser extent) Pascal.
So, this is a lesson you just have to learn at some time.
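The capacity trick translates to Python lists roughly like this (a sketch; Python has no explicit `reserve`, but preallocating a fixed-size list plays the same role when the final size is known):

```python
def build_by_append(n):
    # Lists over-allocate as they grow, so append is amortized O(1),
    # but the buffer is still reallocated several times along the way.
    out = []
    for i in range(n):
        out.append(i * i)
    return out

def build_preallocated(n):
    # One allocation up front, comparable to reserving capacity.
    out = [None] * n
    for i in range(n):
        out[i] = i * i
    return out

assert build_by_append(5) == build_preallocated(5) == [0, 1, 4, 9, 16]
```

The gap between the two is much smaller than the quadratic string case, precisely because appending to a mutable container is amortized constant time.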
It's crazy that even on the 'fast path' there are so many checks being run over and over. It provides an interesting contrast to the JavaScriptCore article from a couple of days ago: https://webkit.org/blog/10308/speculation-in-javascriptcore/