Given a file containing these lines:
0191/320
07-00-40
07-04-36
07-01-16
00004738991
07-08-06
070070
Why does the command
sort -k1,1 myfile.txt
return the lines in the following order:
00004738991
0191/320
07-00-40
070070
07-01-16
07-04-36
07-08-06
I know about collation orders (my default is en_US.UTF-8), and I can get the order I’m expecting by prefixing the command with LC_ALL=C, but I don’t understand why “sort” seemingly ignores the hyphens when sorting. All the characters are basic ASCII, so there are no “odd” characters to throw the ordering off. To me this looks like a very subtle (and in my case costly) bug, but I’m sure it’s working as designed, and hopefully someone can explain why.
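For reference, here is a minimal sketch of how I get the order I was expecting. The temporary file and the variable name sorted_c are only for illustration and are not part of my real setup; the data is the same as above.

```shell
# Recreate the sample data in a throwaway file (the path is arbitrary).
tmpfile=$(mktemp)
printf '%s\n' '0191/320' '07-00-40' '07-04-36' '07-01-16' \
    '00004738991' '07-08-06' '070070' > "$tmpfile"

# Force the C locale so sort compares raw byte values:
# '-' is 0x2D and '0' is 0x30, so "07-..." sorts before "070070".
sorted_c=$(LC_ALL=C sort -k1,1 "$tmpfile")
printf '%s\n' "$sorted_c"

rm -f "$tmpfile"
```

With LC_ALL=C the four hyphenated lines come out together, ahead of 070070, which is the order I expected from a plain ASCII comparison.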