Disclaimer: I know perfectly well the semantics of prefix and postfix increment. So please don’t explain to me how they work.
Reading questions on stack overflow, I cannot help but notice that programmers get confused by the postfix increment operator over and over and over again. From this the following question arises: is there any use case where postfix increment provides a real benefit in terms of code quality?
Let me clarify my question with an example. Here is a super-terse implementation of strcpy:
while (*dst++ = *src++);
But that’s not exactly the most self-documenting code in my book (and it produces two annoying warnings on sane compilers). So what’s wrong with the following alternative?
while (*dst = *src)
{
++src;
++dst;
}
We can then get rid of the confusing assignment in the condition and get completely warning-free code:
while (*src != '\0')
{
*dst = *src;
++src;
++dst;
}
*dst = '\0';
(Yes I know, src and dst will have different ending values in these alternative solutions, but since strcpy immediately returns after the loop, it does not matter in this case.)
It seems the purpose of postfix increment is to make code as terse as possible. I simply fail to see how this is something we should strive for. If this was originally about performance, is it still relevant today?
16
While it did once have some performance implications, I think the real reason is for expressing your intent cleanly. The real question is whether something like while (*d++=*s++); expresses intent clearly or not. IMO, it does, and I find the alternatives you offer less clear, but that may (easily) be a result of having spent decades becoming accustomed to how things are done. Having learned C from K&R (because there were almost no other books on C at the time) probably helps too.
To an extent, it’s true that terseness was valued to a much greater degree in older code. Personally, I think this was largely a good thing — understanding a few lines of code is usually fairly trivial; what’s difficult is understanding large chunks of code. Tests and studies have shown repeatedly that fitting all the code on screen at once is a major factor in understanding the code. As screens expand, this seems to remain true, so keeping code (reasonably) terse remains valuable.
Of course it’s possible to go overboard, but I don’t think this is. Specifically, I think it’s going overboard when understanding a single line of code becomes extremely difficult or time consuming — specifically, when understanding fewer lines of code consumes more effort than understanding more lines. That’s frequent in Lisp and APL, but doesn’t seem (at least to me) to be the case here.
I’m less concerned about compiler warnings — it’s my experience that many compilers emit utterly ridiculous warnings on a fairly regular basis. While I certainly think people should understand their code (and any warnings it might produce), decent code that happens to trigger a warning in some compiler is not necessarily wrong. Admittedly, beginners don’t always know what they can safely ignore, but we don’t stay beginners forever, and don’t need to code like we are either.
13
It is, err, was, a hardware thing
Interesting that you would notice. Postfix increment is probably there for a number of perfectly good reasons, but like many things in C, its popularity can be traced to the origin of the language.
Although C was developed on a variety of early and underpowered machines, C and Unix first hit the relative big-time with the memory-managed models of the PDP-11. These were relatively important computers in their day and Unix was by far better — exponentially better — than the other 7 crummy operating systems available for the -11.
And, it so happens, that on the PDP-11, *--p and *p++ were implemented in hardware as addressing modes. (Also *p, but no other combination.) On those early machines, all less than 0.001 GHz, saving an instruction or two in a loop must almost have been a wait-a-second or wait-a-minute or go-out-for-lunch difference. This doesn’t speak to postincrement specifically, but a loop with pointer postincrement could have been a lot better than indexing back then.
As a consequence, these design patterns became C idioms, which became C mental macros.
It’s something like declaring variables right after a { … not since C89 was current has this been a requirement, but it’s now a code pattern.
Update: Obviously, the main reason *p++ is in the language is because it is exactly what one so often wants to do. The popularity of the code pattern was reinforced by popular hardware which came along and matched the already-existing pattern of a language designed slightly before the arrival of the PDP-11.
These days it makes no difference which pattern you use, or if you use indexing, and we usually program at a higher level anyway, but it must have mattered a lot on those 0.001 GHz machines, and using anything other than *--x or *x++ would have meant you didn’t “get” the PDP-11, and you might have people coming up to you and saying “did you know that…” 🙂 🙂
10
The prefix and postfix -- and ++ operators were introduced in the B language (C’s predecessor) by Ken Thompson, and no, they were not inspired by the PDP-11, which didn’t exist at the time.
Quoting from “The Development of the C Language” by Dennis Ritchie:
Thompson went a step further by inventing the ++ and -- operators, which increment or decrement; their prefix or postfix position determines whether the alteration occurs before or after noting the value of the operand. They were not in the earliest versions of B, but appeared along the way. People often guess that they were created to use the auto-increment and auto-decrement address modes provided by the DEC PDP-11 on which C and Unix first became popular. This is historically impossible, since there was no PDP-11 when B was developed. The PDP-7, however, did have a few ‘auto-increment’ memory cells, with the property that an indirect memory reference through them incremented the cell. This feature probably suggested such operators to Thompson; the generalization to make them both prefix and postfix was his own. Indeed, the auto-increment cells were not used directly in implementation of the operators, and a stronger motivation for the innovation was probably his observation that the translation of ++x was smaller than that of x=x+1.
5
The obvious reason for the postincrement operator to exist is so that you don’t have to write expressions like (++x,x-1) or (x+=1,x-1) all over the place, or uselessly separate trivial statements with easy-to-understand side effects out into multiple statements. Here are some examples:
while (*s) x+=*s++-'0';
if (x) *s++='.';
Whenever reading or writing a string in the forward direction, postincrement, and not preincrement, is almost always the natural operation. I’ve almost never encountered real-world uses for preincrement, and the only major use I’ve found for predecrement is converting numbers to strings (because human writing systems write numbers backwards).
Edit: Actually it wouldn’t be quite so ugly; instead of (++x,x-1) you could of course use ++x-1 or (x+=1)-1. Still, x++ is more readable.
5
The PDP-11 offered post-increment and pre-decrement operations in the instruction set. These weren’t instructions themselves; they were addressing modes, modifiers that allowed you to specify that you wanted the value of a register before it was incremented or after it was decremented.
Here is a word-copy step in the machine language:
mov (r1)+,(r2)+
which moved a word from one location to another, pointed at by registers, and the registers would be ready to do it again.
As C was built on and for the PDP-11, a lot of useful concepts found their way into C. C was intended to be a useful replacement for assembler language. Pre-increment and post-decrement were added for symmetry.
4
I doubt it was ever really necessary. As far as I know, it doesn’t compile into anything more compact on most platforms than using the pre-increment as you did, in the loop. It was just that at the time it was made, terseness of code was more important than clarity of code.
That isn’t to say that clarity of code isn’t (or wasn’t) important, but when you were typing in on a low baud modem (anything where you measure in baud is slow) where every keystroke had to make it to the mainframe and then get echoed back one byte at a time with a single bit used as parity check, you didn’t want to have to type much.
This is sort of like the & and | operators having lower precedence than ==. It was designed with the best intentions (a non-short-circuiting version of && and ||), but now it confuses programmers every day, and it probably won’t ever change.
Anyway, this is only an answer in that I think there is not a good answer to your question, but I’ll probably be proven wrong by a more guru coder than I.
–EDIT–
I should note that I find having both incredibly useful; I’m just pointing out that it won’t change whether anyone likes it or not.
5
I like the postfix operator when dealing with pointers (whether or not I’m dereferencing them) because p++ reads more naturally as “move to the next spot” than the equivalent p += 1, which surely must confuse beginners the first time they see it used with a pointer where sizeof(*p) != 1.
Are you sure you’re stating the confusion problem correctly? Is not the thing that confuses beginners the fact that the postfix ++ operator has higher precedence than the dereference * operator, so that *p++ parses as *(p++) and not (*p)++ as some might expect?
(You think that’s bad? See Reading C Declarations.)
Amongst the subtle elements of good programming are localisation and minimalism:
- putting variables in a minimal scope of use
- using const when write access isn’t required
- etc.
In the same spirit, x++ can be seen as a way of localising a reference to the current value of x while immediately indicating that you’ve, at least for now, finished with that value and want to move to the next (valid whether x is an int or a pointer or iterator). With a little imagination, you could compare this to letting the old value/position of x go “out of scope” immediately after it’s no longer needed, and moving to the new value. The precise point where that transition is possible is highlighted by the postfix ++. The implication that this is probably the final use of the “old” x can be valuable insight into the algorithm, assisting the programmer as they scan through the surrounding code. Of course, the postfix ++ may put the programmer on the lookout for uses of the new value, which may or may not be good depending on when that new value is actually needed, so it’s an aspect of the “art” or “craft” of programming to determine what’s more helpful in the circumstances.
While many beginners may be confused by a feature, it’s worth balancing that against the long-term benefits: beginners don’t stay beginners for long.
Wow, lots of answers not-quite-on-point (if I may be so bold), and please forgive me if I’m pointing out the obvious – particularly in light of your comment to not point out the semantics, but the obvious (from Stroustrup’s perspective, I suppose) doesn’t yet seem to have been posted! 🙂
Postfix x++ produces a temporary that’s passed ‘upwards’ in the expression, while the variable x is subsequently incremented. Prefix ++x does not produce a temporary object, but increments x and passes the result to the expression.
The value is convenience, for those who know the operator.
For example:
int x = 1;
foo(x++); // == foo(1)
// x == 2: true
x = 1;
foo(++x); // == foo(2)
// x == 2: true
Of course, the results of this example can be produced from other equivalent (and perhaps less oblique) code.
So why do we have the postfix operator? I guess because it’s an idiom that’s persisted, in spite of the confusion it obviously creates. It’s a legacy construct that once was perceived to have value, though I’m not sure that perceived value was so much for performance as for convenience. That convenience hasn’t been lost, but I think an appreciation of readability has increased, resulting in the questioning of the operator’s purpose.
Do we need the postfix operator anymore? Likely not. Its result is confusing and creates a barrier to understandability. Some good coders will certainly immediately know where to find it useful, often in places where it has a peculiarly “Perl-esque” beauty. The cost of that beauty is readability.
I agree that explicit code has benefits over terseness, in readability, understandability and maintainability. Good designers and managers want that.
However, there’s a certain beauty that some programmers recognize, one that encourages them to produce code purely for beautiful simplicity (such as the postfix operator), code they wouldn’t otherwise ever have thought of, or found intellectually stimulating. There’s a certain something to it that just makes it more beautiful, if not more desirable, in spite of its pithiness.
In other words, some people find while (*dst++ = *src++); to simply be a more beautiful solution, something that takes your breath away with its simplicity, just as much as if it were a brush to canvas. That you are forced to understand the language to appreciate the beauty only adds to its magnificence.
4
Let’s look at Kernighan & Ritchie’s original justification (original K&R, pages 42 and 43):

The unusual aspect is that ++ and -- may be used either as prefix or as postfix. (…) In a context where no value is wanted (…) choose prefix or postfix according to taste. But there are situations where one or the other is specifically called for.
The text continues with some examples that use increments within index expressions, with the explicit goal of writing “more compact” code. So the reason behind these operators is the convenience of more compact code.
The three examples given (squeeze(), getline() and strcat()) use only postfix within expressions using indexing. The authors compare the code with a longer version that doesn’t use embedded increments. This confirms that the focus is on compactness.

K&R highlight on page 102 the use of these operators in combination with pointer dereferencing (e.g. *--p and *p--). No further example is given, but again, they make clear that the benefit is compactness.
Prefix is, for example, very commonly used when decrementing an index or a pointer starting from the end.
int i=0;
while (i<10)
doit (i++); // calls function for 0 to 9
// i is 10 at this stage
while (--i >=0)
doit (i); // calls function from 9 to 0
0
I’m going to disagree with the premise that ++p, p++ is somehow difficult to read or unclear. One means “increment p and then read p”, the other means “read p and then increment p”. In both cases, the operator in the code is exactly where it is in the explanation of the code, so if you know what ++ means, you know what the resulting code means.

You can create obfuscated code using any syntax, but I don’t see a case here that p++/++p is inherently obfuscatory.
This is from the POV of an electrical engineer: there are many processors that have built-in post-increment and pre-decrement addressing modes for the purpose of maintaining a last-in-first-out (LIFO) stack.
it might be like:
float stack[4096];
int stack_pointer = 0;
...
#define push_stack(arg) stack[stack_pointer++] = arg
#define pop_stack(arg) arg = stack[--stack_pointer]
...
I dunno, but that’s the reason I would expect to see both prefix and postfix increment operators.
To underline Christophe’s point about compactness, I want to show y’all some code from V6. This is part of the original C compiler, from http://minnie.tuhs.org/cgi-bin/utree.pl?file=V6/usr/source/c/c00.c:
/*
* Look up the identifier in symbuf in the symbol table.
* If it hashes to the same spot as a keyword, try the keyword table
* first. An initial "." is ignored in the hash.
* Return is a ptr to the symbol table entry.
*/
lookup()
{
int ihash;
register struct hshtab *rp;
register char *sp, *np;
ihash = 0;
sp = symbuf;
if (*sp=='.')
sp++;
while (sp<symbuf+ncps)
ihash =+ *sp++;
rp = &hshtab[ihash%hshsiz];
if (rp->hflag&FKEYW)
if (findkw())
return(KEYW);
while (*(np = rp->name)) {
for (sp=symbuf; sp<symbuf+ncps;)
if (*np++ != *sp++)
goto no;
csym = rp;
return(NAME);
no:
if (++rp >= &hshtab[hshsiz])
rp = hshtab;
}
if(++hshused >= hshsiz) {
error("Symbol table overflow");
exit(1);
}
rp->hclass = 0;
rp->htype = 0;
rp->hoffset = 0;
rp->dimp = 0;
rp->hflag =| xdflg;
sp = symbuf;
for (np=rp->name; sp<symbuf+ncps;)
*np++ = *sp++;
csym = rp;
return(NAME);
}
This is code that was passed around in samizdat as an education in good style, and yet we would never write it like this today. Look at how many side effects have been packed into if and while conditions, and conversely, how both for loops don’t have an increment expression because that’s done in the loop body. Look at how blocks are only wrapped in braces when absolutely necessary.
Part of this was certainly manual micro-optimization of the sort that we expect the compiler to do for us today, but I would also put forward the hypothesis that when your entire interface to the computer is a single 80×25 glass tty, you want your code to be as dense as possible so you can see more of it at the same time.
(Use of global buffers for everything is probably a hangover from cutting one’s teeth on assembly language.)
2
Perhaps you should think about cosmetics too.
while( i++ )
arguably “looks better” than
while( i+=1 )
For some reason, the post-increment operator looks very appealing. It’s short, it’s sweet, and it increases everything by 1. Compare it with prefix:
while( ++i )
“looks backwards” doesn’t it? You never see a plus sign FIRST in math, except when someone’s being silly with specifying something’s positive (+0 or something like that). We’re used to seeing OBJECT + SOMETHING.
Is it something to do with grading? A, A+, A++? I can tell you I was pretty cheesed at first that the Python people left operator++ out of their language!
2
Counting backwards 9 to 0 inclusive:
for (unsigned int i=10; i-- > 0;) {
cout << i << " ";
}
Or the equivalent while loop.
1