How do I print an integer in Assembly Level Programming without printf from the c library? (itoa, integer to decimal ASCII string)


Can anyone tell me the purely assembly code for displaying the value in a register in decimal format? Please don’t suggest using the printf hack and then compile with gcc.

Description:

Well, I did some research and some experimentation with NASM and figured out that I could use the printf function from the C library to print an integer. I did so by linking the object file with GCC, and everything works well enough.

However, what I want to achieve is to print the value stored in any register in the decimal form.

I also found that DOS interrupt 21h can display characters and strings when AH holds 2 or 9 respectively, with the data in DX.

Conclusion:

None of the examples I found showed how to display the content value of a register in decimal form without using the C library’s printf. Does anyone know how to do this in assembly?

9

You need to write a binary to decimal conversion routine, and then use the decimal digits to produce “digit characters” to print.

You have to assume that something, somewhere, will print a character on your output device of choice. Call this subroutine “printcharacter”; assume it takes a character code in EAX and preserves all the registers. (If you don’t have such a subroutine, you have an additional problem that should be the basis of a different question.)

If you have the binary code for a digit (e.g., a value from 0-9) in a register (say, EAX), you can convert that value to a character for the digit by adding the ASCII code for the “zero” character to the register. This is as simple as:

       add     eax, 0x30    ; convert digit in EAX to corresponding character digit

You can then call printcharacter to print the digit character code.

To output an arbitrary value, you need to pick off digits and print them.

Picking off digits fundamentally requires working with powers of ten. It is easiest to work with one power of ten, e.g., 10 itself. Imagine we have a divide-by-10 routine that took a value in EAX, and produced a quotient in EDX and a remainder in EAX. I leave it as an exercise for you to figure out how to implement such a routine.
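One possible answer to that exercise, written as a minimal C model of the routine's contract rather than as assembly. The names `divide_by_ten` and `struct divmod10` are made up for illustration. On x86 the same step is a single `div` by a register holding 10 with EDX zeroed first; note that `div` leaves the quotient in EAX and the remainder in EDX, the opposite of the convention assumed above, so a real `dividebyten` would swap the two.

```c
#include <stdint.h>

/* Hypothetical result pair for one divide-by-10 step. */
struct divmod10 {
    uint32_t quotient;   /* value / 10 */
    uint32_t remainder;  /* value % 10, always in 0..9 */
};

/* Models the "dividebyten" routine: one call yields the next
   (least significant) digit and the value still left to convert. */
static struct divmod10 divide_by_ten(uint32_t value) {
    struct divmod10 r;
    r.quotient = value / 10u;
    r.remainder = value % 10u;
    return r;
}
```

Calling it on 1234 gives quotient 123 and remainder 4; the remainder is the digit to print and the quotient feeds the next iteration.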

Then a simple routine with the right idea is to produce one digit for every digit the value might have. A 32-bit register holds unsigned values up to about 4 billion (4294967295), so you need at most 10 digits. So:

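         mov    eax, valuetoprint
         mov    ecx, 10        ;  digit count to produce
nextdigit:                     ;  ("loop" is an instruction mnemonic, so it can't be used as a label)
         call   dividebyten    ;  remainder (digit) in EAX, quotient in EDX
         add    eax, 0x30      ;  digit -> ASCII character
         call   printcharacter
         mov    eax, edx       ;  continue with the quotient
         dec    ecx
         jne    nextdigit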

This works… but prints the digits in reverse order. Oops! Well, we can take advantage of the pushdown stack to store digits produced, and then pop them off in reverse order:

         mov    eax, valuetoprint
         mov    ecx, 10        ;  digit count to generate
loop1:   call   dividebyten
         add    eax, 0x30
         push   eax
         mov    eax, edx
         dec    ecx
         jne    loop1
         mov    ecx, 10        ;  digit count to print
loop2:   pop    eax
         call   printcharacter
         dec    ecx
         jne    loop2

Left as an exercise to the reader: suppress leading zeros. Also, since we are writing digit characters to memory, instead of writing them to the stack we could write them to a buffer, and then print the buffer content. Also left as an exercise to the reader.
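A sketch of the first exercise, again as a C model and not the author's code. It keeps the fixed count of 10 digits from the loops above and simply skips leading '0' characters when copying the digits out. The function name `format_no_leading_zeros` and the 11-byte output buffer are assumptions for illustration; the local array stands in for the stack.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: writes the decimal digits of value into out
   (at least 11 bytes) without leading zeros, NUL-terminates the result,
   and returns the number of digits written. */
static size_t format_no_leading_zeros(uint32_t value, char *out) {
    char digits[10];                  /* plays the role of the stack */
    int i;
    for (i = 9; i >= 0; i--) {        /* always produce 10 digits, lowest first */
        digits[i] = (char)('0' + value % 10u);
        value /= 10u;
    }
    int first = 0;
    while (first < 9 && digits[first] == '0')
        first++;                      /* stop at index 9 so that 0 prints as "0" */
    size_t n = 0;
    while (first < 10)
        out[n++] = digits[first++];
    out[n] = '\0';
    return n;
}
```

With this, 54 comes out as "54" instead of "0000000054", and 0 still produces a single "0".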

3

You need to turn a binary integer into a string/array of ASCII decimal digits manually. ASCII digits are represented by 1-byte integers in the range '0' (0x30) to '9' (0x39). http://www.asciitable.com/

For power-of-2 bases like hex, see How to convert a binary integer number to a hex string? Converting between binary and a power-of-2 base allows many more optimizations and simplifications because each group of bits maps separately to a hex / octal digit.


Most operating systems / environments don’t have a system call that accepts integers and converts them to decimal for you. You have to do that yourself before sending the bytes to the OS, or copying them to video memory yourself, or drawing the corresponding font glyphs in video memory…

By far the most efficient way is to make a single system call that does the whole string at once, because a system call that writes 8 bytes is basically the same cost as writing 1 byte.

This means we need a buffer, but that doesn’t add to our complexity much at all. 2^32-1 is only 4294967295, which is only 10 decimal digits. Our buffer doesn’t need to be large, so we can just use the stack.

The usual algorithm produces digits LSD-first (Least Significant Digit first). Since printing order is MSD-first, we can just start at the end of the buffer and work backwards. For printing or copying elsewhere, just keep track of where it starts, and don’t bother about getting it to the start of a fixed buffer. No need to mess with push/pop to reverse anything, just produce it backwards in the first place.

char *itoa_end(unsigned long val, char *p_end) {
  const unsigned base = 10;
  char *p = p_end;
  do {
    *--p = (val % base) + '0';
    val /= base;
  } while(val);                  // runs at least once to print '0' for val=0.

  // write(1, p,  p_end-p);
  return p;  // let the caller know where the leading digit is
}

gcc/clang do an excellent job, using a magic constant multiplier instead of div to divide by 10 efficiently. (Godbolt compiler explorer for asm output).
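To make the "magic constant" idea concrete, here is a small sketch of division by 10 via a multiply and a shift. The constant 0xCCCCCCCD is ceil(2^35 / 10), which is what compilers typically emit for unsigned 32-bit division by 10; the function names are made up for this example.

```c
#include <stdint.h>

/* Divide a 32-bit unsigned value by 10 without a division instruction.
   The 64-bit product with ceil(2^35 / 10), shifted right by 35, equals
   value / 10 for every 32-bit input. */
static uint32_t div10_by_multiply(uint32_t value) {
    return (uint32_t)(((uint64_t)value * 0xCCCCCCCDu) >> 35);
}

/* The remainder follows from the quotient: value - 10 * (value / 10). */
static uint32_t mod10_by_multiply(uint32_t value) {
    return value - 10u * div10_by_multiply(value);
}
```

In assembly this becomes one widening multiply, one shift, and a multiply-subtract (often done with `lea`) for the remainder, which is much cheaper than `div` on most CPUs.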

This code-review Q&A has a nice efficient NASM version of that which accumulates the string into an 8-byte register instead of into memory, ready to store where you want the string to start without extra copying.


To handle signed integers:

Use this algorithm on the unsigned absolute value. (val = val<0 ? 0U-val : val;, i.e. xor-zero / sub / cmovs which keeps the original value around; Godbolt). If the original input was negative, stick a '-' in front at the end, when you’re done. So for example, -10 runs this with 10, producing 2 ASCII bytes. Then you store a '-' in front, as a third byte of the string.
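A sketch of that signed wrapper in C, reusing the backwards-filling loop from `itoa_end` above. The name `itoa_signed_end` is hypothetical, and the caller is assumed to provide room for 11 characters (10 digits plus the sign) below `p_end`.

```c
#include <stdint.h>

/* Hypothetical signed variant: p_end points one past the last byte of the
   caller's buffer. Returns a pointer to the first character ('-' or a digit).
   No terminator is written. */
static char *itoa_signed_end(int32_t val, char *p_end) {
    /* 0 - x in unsigned arithmetic is well defined even for INT32_MIN. */
    uint32_t mag = val < 0 ? 0u - (uint32_t)val : (uint32_t)val;
    char *p = p_end;
    do {
        *--p = (char)('0' + mag % 10u);
        mag /= 10u;
    } while (mag);
    if (val < 0)
        *--p = '-';               /* sign goes in front, after the digits are done */
    return p;
}
```

Doing the negation in unsigned arithmetic is what makes the most negative value work: its magnitude does not fit in a signed 32-bit integer, but it does fit in an unsigned one.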


Here’s a simple commented NASM version of that, using div (slow but shorter code) for 32-bit unsigned integers and a Linux write system call. It should be easy to port this to 32-bit-mode code just by changing the registers to ecx instead of rcx. But add rsp,24 will become add esp, 20 because push ecx is only 4 bytes, not 8. (You should also save/restore esi for the usual 32-bit calling conventions, unless you’re making this into a macro or internal-use-only function.)

The system-call part is specific to 64-bit Linux. Replace that with whatever is appropriate for your system, e.g. call the VDSO page for efficient system calls on 32-bit Linux, or use int 0x80 directly for inefficient system calls. See calling conventions for 32 and 64-bit system calls on Unix/Linux. Or see rkhb’s answer on another question for a 32-bit int 0x80 version that works the same way.

If you just need the string without printing it, rsi points to the first digit after leaving the loop. You can copy it from the tmp buffer to the start of wherever you actually need it. Or if you generated it into the final destination directly (e.g. pass a pointer arg), you can pad with leading zeros until you reach the front of the space you left for it. There’s no simple way to find out how many digits it’s going to be before you start unless you always pad with zeros up to a fixed width.

ALIGN 16
; void print_uint32(uint32_t edi)
; x86-64 System V calling convention.  Clobbers RSI, RCX, RDX, RAX.
; optimized for simplicity and compactness, not speed (DIV is slow)
global print_uint32
print_uint32:
    mov    eax, edi              ; function arg

    mov    ecx, 0xa              ; base 10
    push   rcx                   ; ASCII newline '\n' = 0xa = base
    mov    rsi, rsp
    sub    rsp, 16               ; not needed on 64-bit Linux, the red-zone is big enough.  Change the LEA below if you remove this.

;;; rsi is pointing at '\n' on the stack, with 16B of "allocated" space below that.
.toascii_digit:                ; do {
    xor    edx, edx
    div    ecx                   ; edx=remainder = low digit = 0..9.  eax/=10
                                 ;; DIV IS SLOW.  use a multiplicative inverse if performance is relevant.
    add    edx, '0'
    dec    rsi                 ; store digits in MSD-first printing order, working backwards from the end of the string
    mov    [rsi], dl

    test   eax,eax             ; } while(x);
    jnz  .toascii_digit
;;; rsi points to the first digit


    mov    eax, 1               ; __NR_write from /usr/include/asm/unistd_64.h
    mov    edi, 1               ; fd = STDOUT_FILENO
    ; pointer already in RSI    ; buf = last digit stored = most significant
    lea    edx, [rsp+16 + 1]    ; yes, it's safe to truncate pointers before subtracting to find length.
    sub    edx, esi             ; RDX = length = end-start, including the \n
    syscall                     ; write(1, string /*RSI*/,  digits + 1)

    add  rsp, 24                ; (in 32-bit: add esp,20) undo the push and the buffer reservation
    ret

Public domain. Feel free to copy/paste this into whatever you’re working on. If it breaks, you get to keep both pieces. (If performance matters, see the links below; you’ll want a multiplicative inverse instead of div.)

And here’s code to call it in a loop counting down to 0 (including 0). Putting it in the same file is convenient.

ALIGN 16
global _start
_start:
    mov    ebx, 100
.repeat:
    lea    edi, [rbx + 0]      ; put +whatever constant you want here.
    call   print_uint32
    dec    ebx
    jge   .repeat


    xor    edi, edi
    mov    eax, 231
    syscall                             ; sys_exit_group(0)

Assemble and link with

yasm -felf64 -Worphan-labels -gdwarf2 print-integer.asm &&
ld -o print-integer print-integer.o

./print-integer
100
99
...
1
0

Use strace to see that the only system calls this program makes are write() and exit_group(). (See also the gdb / debugging tips at the bottom of the x86 tag wiki, and the other links there.)
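Going back to the fixed-width case mentioned earlier (generating digits straight into the final destination and padding with leading zeros), here is a minimal C sketch. The name `utoa_zero_padded` and its error convention are assumptions made for this example, not part of the NASM code above.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: fills exactly width bytes at dst with the decimal
   digits of val, right-aligned and padded on the left with '0'.
   No terminator is written. Returns 0 on success, or -1 if val has more
   than width digits (dst then holds only the low digits). */
static int utoa_zero_padded(uint32_t val, char *dst, size_t width) {
    char *p = dst + width;
    while (p != dst) {
        *--p = (char)('0' + val % 10u);
        val /= 10u;               /* once val reaches 0, every further digit is '0' */
    }
    return val == 0 ? 0 : -1;
}
```

Because the loop count depends only on the width, the same division loop produces both the digits and the padding, and the string always starts at a known address.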


Related:

  • 32-bit version of this, using int 0x80 for the write system call at the end. Pretty much the same loop.
  • With printf – How to print a number in assembly NASM? has x86-64 and i386 answers.
  • NASM Assembly convert input to integer? is the other direction, string->int.
  • Printing an integer as a string with AT&T syntax, with Linux system calls instead of printf – AT&T version of the same thing (but for 64-bit integers). See that for more comments about performance, and a benchmark of div vs. compiler-generated code using mul.
  • Add 2 numbers and print the result using Assembly x86 32-bit version that’s very similar to this.
  • This code-review Q&A uses a multiplicative inverse like a compiler would. And it accumulates the string into an 8-byte register instead of into memory, ready to store where you want the string to start without extra copying.
  • How to convert a binary integer number to a hex string? – power-of-2 bases are special. Answer includes scalar loop (branchy and table-lookup) and SIMD (SSE2, SSSE3, AVX2, and AVX512 which is amazing for this.)

High-performance versions

  • Some optimized decimal itoa versions from Daniel Lemire’s blog: without AVX-512, and much faster with AVX-512 IFMA

  • With NEON SIMD on Apple M1

  • and some older articles: How to print integers really fast blog post comparing some strategies in C.
    Such as x % 100 to create more ILP (Instruction Level Parallelism), and either a lookup table or a simpler multiplicative inverse (that only has to work for a limited range, like in this answer) to break up the 0..99 remainder into 2 decimal digits.
    e.g. with (x * 103) >> 10 using one imul r,r,imm8 / shr r,10 as shown in another answer. Possibly somehow folding that in to the remainder calculation itself.

  • https://tia.mat.br/posts/2014/06/23/integer_to_string_conversion.html a similar article.
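As a rough illustration of the x % 100 idea from the list above, the following C sketch peels off two digits per step and splits each 0..99 remainder with the (x * 103) >> 10 trick. The function names are hypothetical, and this is a plain scalar sketch, not the code from any of the linked articles.

```c
#include <stdint.h>

/* Split a value in 0..99 into two ASCII digits without dividing.
   (x * 103) >> 10 equals x / 10 only for small x; this sketch relies on
   the caller guaranteeing x <= 99. */
static void two_digits(uint32_t x, char out[2]) {
    uint32_t tens = (x * 103u) >> 10;
    uint32_t ones = x - tens * 10u;
    out[0] = (char)('0' + tens);
    out[1] = (char)('0' + ones);
}

/* Hypothetical driver: writes the digits of val backwards from p_end
   (buffer needs 10 bytes) and returns a pointer to the first digit. */
static char *utoa_pairs_end(uint32_t val, char *p_end) {
    char *p = p_end;
    while (val >= 100u) {         /* two digits per division by 100 */
        p -= 2;
        two_digits(val % 100u, p);
        val /= 100u;
    }
    if (val >= 10u) {             /* one or two leading digits remain */
        p -= 2;
        two_digits(val, p);
    } else {
        *--p = (char)('0' + val);
    }
    return p;
}
```

The division by 100 sits on the loop's dependency chain, while splitting each remainder is independent work, which is where the extra instruction-level parallelism comes from.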

I suppose you want to print the value to stdout? If so, you have to use a system call, and system calls are OS dependent.

e.g. Linux:
Linux System Call Table

The hello world program in this Tutorial may give you some insights.

1

I can’t comment, so I’m posting a reply this way.
@Ira Baxter, perfect answer. I just want to add that you don’t need to divide 10 times, as you did by setting register cx to 10. Just divide the number in ax until “ax==0”:

loop1: call dividebyten
       ...
       cmp ax,0
       jnz loop1

You also have to keep track of how many digits the original number had:

       mov cx,0
loop1: call dividebyten
       inc cx

Anyway, Ira Baxter, you helped me; there are just a few ways to optimize the code 🙂

This is not only about optimization but also formatting. When you want to print the number 54, you want to print 54, not 0000000054 🙂
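Putting the two fragments above together, a C model of the idea might look like this: divide until the value reaches zero, count the digits pushed, then pop exactly that many. The name `format_counted` and the local array standing in for the stack are assumptions for illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: writes the digits of value into out (at least
   11 bytes), NUL-terminates, and returns the digit count. */
static size_t format_counted(uint32_t value, char *out) {
    char stack[10];               /* stands in for push/pop */
    size_t count = 0;             /* plays the role of cx */
    do {                          /* test at the bottom so 0 still yields one digit */
        stack[count++] = (char)('0' + value % 10u);   /* "push" */
        value /= 10u;
    } while (value != 0);
    size_t n = 0;
    while (count > 0)
        out[n++] = stack[--count];                    /* "pop" in reverse order */
    out[n] = '\0';
    return n;
}
```

Checking for zero after the division rather than before matters: it is what makes an input of 0 print "0" instead of nothing.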

The digits 1 to 9 are just 1 to 9; after that, there must be some conversion, which I don’t know either. Say you have 41H in AX (EAX) and you want to print 65, not ‘A’, without calling some service. I think you need to print the character representations of a 6 and a 5, whatever those might be. There must be a constant that can be added to get there. You need a modulus operator (however you do that in assembly) and a loop over all the digits.

Not sure, but that’s my guess.

1


How do I print an integer in Assembly Level Programming without printf from the c library? (itoa, integer to decimal ASCII string)

Can anyone tell me the purely assembly code for displaying the value in a register in decimal format? Please don’t suggest using the printf hack and then compile with gcc.

Description:

Well, I did some research and some experimentation with NASM and figured I could use the printf function from the c library to print an integer. I did so by compiling the object file with the GCC compiler and everything works fair enough.

However, what I want to achieve is to print the value stored in any register in the decimal form.

I did some research and figured the interrupt vector 021h for DOS command line can display strings and characters whilst either 2 or 9 is in the ah register and the data is in the dx.

Conclusion:

None of the examples I found showed how to display the content value of a register in decimal form without using the C library’s printf. Does anyone know how to do this in assembly?

9

You need to write a binary to decimal conversion routine, and then use the decimal digits to produce “digit characters” to print.

You have to assume that something, somewhere, will print a character on your output device of choice. Call this subroutine “print_character”; assumes it takes a character code in EAX and preserves all the registers.. (If you don’t have such a subroutine, you have an additional problem that should be the basis of a different question).

If you have the binary code for a digit (e.g., a value from 0-9) in a register (say, EAX), you can convert that value to a character for the digit by adding the ASCII code for the “zero” character to the register. This is as simple as:

       add     eax, 0x30    ; convert digit in EAX to corresponding character digit

You can then call print_character to print the digit character code.

To output an arbitrary value, you need to pick off digits and print them.

Picking off digits fundamentally requires working with powers of ten. It is easiest to work with one power of ten, e.g., 10 itself. Imagine we have a divide-by-10 routine that took a value in EAX, and produced a quotient in EDX and a remainder in EAX. I leave it as an exercise for you to figure out how to implement such a routine.

Then a simple routine with the right idea is to produce one digit for all digits the value might have. A 32 bit register stores values to 4 billion, so you might get 10 digits printed. So:

         mov    eax, valuetoprint
         mov    ecx, 10        ;  digit count to produce
loop:    call   dividebyten
         add    eax, 0x30
         call   printcharacter
         mov    eax, edx
         dec    ecx
         jne    loop

This works… but prints the digits in reverse order. Oops! Well, we can take advantage of the pushdown stack to store digits produced, and then pop them off in reverse order:

         mov    eax, valuetoprint
         mov    ecx, 10        ;  digit count to generate
loop1:   call   dividebyten
         add    eax, 0x30
         push   eax
         mov    eax, edx
         dec    ecx
         jne    loop1
         mov    ecx, 10        ;  digit count to print
loop2:   pop    eax
         call   printcharacter
         dec    ecx
         jne    loop2

Left as an exercise to the reader: suppress leading zeros. Also, since we are writing digit characters to memory, instead of writing them to the stack we could write them to a buffer, and then print the buffer content. Also left as an exercise to the reader.

3

You need to turn a binary integer into a string/array of ASCII decimal digits manually. ASCII digits are represented by 1-byte integers in the range '0' (0x30) to '9' (0x39). http://www.asciitable.com/

For power-of-2 bases like hex, see How to convert a binary integer number to a hex string? Converting between binary and a power-of-2 base allows many more optimizations and simplifications because each group of bits maps separately to a hex / octal digit.


Most operating systems / environments don’t have a system call that accepts integers and converts them to decimal for you. You have to do that yourself before sending the bytes to the OS, or copying them to video memory yourself, or drawing the corresponding font glyphs in video memory…

By far the most efficient way is to make a single system call that does the whole string at once, because a system call that writes 8 bytes is basically the same cost as writing 1 byte.

This means we need a buffer, but that doesn’t add to our complexity much at all. 2^32-1 is only 4294967295, which is only 10 decimal digits. Our buffer doesn’t need to be large, so we can just use the stack.

The usual algorithm produces digits LSD-first (Least Significant Digit first). Since printing order is MSD-first, we can just start at the end of the buffer and work backwards. For printing or copying elsewhere, just keep track of where it starts, and don’t bother about getting it to the start of a fixed buffer. No need to mess with push/pop to reverse anything, just produce it backwards in the first place.

char *itoa_end(unsigned long val, char *p_end) {
  const unsigned base = 10;
  char *p = p_end;
  do {
    *--p = (val % base) + '0';
    val /= base;
  } while(val);                  // runs at least once to print '0' for val=0.

  // write(1, p,  p_end-p);
  return p;  // let the caller know where the leading digit is
}

gcc/clang do an excellent job, using a magic constant multiplier instead of div to divide by 10 efficiently. (Godbolt compiler explorer for asm output).
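
As a rough illustration of what that compiler output does, the following C sketch divides a 32-bit unsigned value by 10 with a multiply and a shift. The constant 0xCCCCCCCD is ceil(2^35 / 10); the helper names `div10` and `mod10` are made up for this example.

```c
#include <stdint.h>

/* x / 10 for any 32-bit unsigned x, without a divide instruction:
   multiply by ceil(2^35 / 10) = 0xCCCCCCCD in 64 bits, then shift right by 35. */
uint32_t div10(uint32_t x) {
    return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}

/* x % 10 follows from the quotient with one more multiply and a subtract. */
uint32_t mod10(uint32_t x) {
    return x - div10(x) * 10u;
}
```

In asm this corresponds to a widening multiply followed by a shift of the high half, which is much cheaper than div on most x86 CPUs.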

This code-review Q&A has a nice efficient NASM version of that which accumulates the string into an 8-byte register instead of into memory, ready to store where you want the string to start without extra copying.


To handle signed integers:

Use this algorithm on the unsigned absolute value. (val = val<0 ? 0U-val : val;, i.e. xor-zero / sub / cmovs which keeps the original value around; Godbolt). If the original input was negative, stick a '-' in front at the end, when you’re done. So for example, -10 runs this with 10, producing 2 ASCII bytes. Then you store a '-' in front, as a third byte of the string.
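
A minimal C sketch of that signed handling, assuming a 32-bit input and the same write-backwards-from-the-end convention as `itoa_end` (the name `itoa_signed_end` is hypothetical):

```c
#include <stdint.h>

/* Hypothetical helper: formats val backwards from p_end and returns a pointer
   to the first character ('-' or the leading digit). The caller must leave at
   least 11 bytes before p_end (10 digits plus a sign). */
char *itoa_signed_end(int32_t val, char *p_end) {
    /* 0U - val is computed in unsigned arithmetic, so INT32_MIN is handled
       without signed overflow. */
    uint32_t mag = (val < 0) ? 0U - (uint32_t)val : (uint32_t)val;
    char *p = p_end;
    do {
        *--p = (char)('0' + mag % 10);
        mag /= 10;
    } while (mag);
    if (val < 0)
        *--p = '-';                 /* the sign goes in front, after the digits are done */
    return p;
}
```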


Here’s a simple commented NASM version of that, using div (slow but shorter code) for 32-bit unsigned integers and a Linux write system call. It should be easy to port this to 32-bit-mode code just by changing the registers to ecx instead of rcx. But add rsp,24 will become add esp, 20 because push ecx is only 4 bytes, not 8. (You should also save/restore esi for the usual 32-bit calling conventions, unless you’re making this into a macro or internal-use-only function.)

The system-call part is specific to 64-bit Linux. Replace that with whatever is appropriate for your system, e.g. call the VDSO page for efficient system calls on 32-bit Linux, or use int 0x80 directly for inefficient system calls. See calling conventions for 32 and 64-bit system calls on Unix/Linux. Or see rkhb’s answer on another question for a 32-bit int 0x80 version that works the same way.

If you just need the string without printing it, rsi points to the first digit after leaving the loop. You can copy it from the tmp buffer to the start of wherever you actually need it. Or if you generated it into the final destination directly (e.g. pass a pointer arg), you can pad with leading zeros until you reach the front of the space you left for it. There’s no simple way to find out how many digits it’s going to be before you start unless you always pad with zeros up to a fixed width.
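
The zero-padding option can be sketched in C like this, assuming the caller passes the destination and a fixed field width (the function name `utoa_padded` and its return convention are assumptions for this example):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes val right-aligned into dst[0..width-1],
   padding with '0' on the left. Returns 0 on success, or -1 if val needs
   more than width digits (dst contents are then unspecified). */
int utoa_padded(uint32_t val, char *dst, size_t width) {
    char *p = dst + width;
    do {
        if (p == dst)
            return -1;              /* ran out of room */
        *--p = (char)('0' + val % 10);
        val /= 10;
    } while (val);
    while (p > dst)
        *--p = '0';                 /* fill the space left in front */
    return 0;
}
```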

ALIGN 16
; void print_uint32(uint32_t edi)
; x86-64 System V calling convention.  Clobbers RSI, RCX, RDX, RAX.
; optimized for simplicity and compactness, not speed (DIV is slow)
global print_uint32
print_uint32:
    mov    eax, edi              ; function arg

    mov    ecx, 0xa              ; base 10
    push   rcx                   ; ASCII newline '\n' = 0xa = base
    mov    rsi, rsp
    sub    rsp, 16               ; not needed on 64-bit Linux, the red-zone is big enough.  Change the LEA below if you remove this.

;;; rsi is pointing at '\n' on the stack, with 16B of "allocated" space below that.
.toascii_digit:                ; do {
    xor    edx, edx
    div    ecx                   ; edx=remainder = low digit = 0..9.  eax/=10
                                 ;; DIV IS SLOW.  use a multiplicative inverse if performance is relevant.
    add    edx, '0'
    dec    rsi                 ; store digits in MSD-first printing order, working backwards from the end of the string
    mov    [rsi], dl

    test   eax,eax             ; } while(x);
    jnz  .toascii_digit
;;; rsi points to the first digit


    mov    eax, 1               ; __NR_write from /usr/include/asm/unistd_64.h
    mov    edi, 1               ; fd = STDOUT_FILENO
    ; pointer already in RSI    ; buf = last digit stored = most significant
    lea    edx, [rsp+16 + 1]    ; yes, it's safe to truncate pointers before subtracting to find length.
    sub    edx, esi             ; RDX = length = end-start, including the \n
    syscall                     ; write(1, string /*RSI*/,  digits + 1)

    add  rsp, 24                ; (in 32-bit: add esp,20) undo the push and the buffer reservation
    ret

Public domain. Feel free to copy/paste this into whatever you’re working on. If it breaks, you get to keep both pieces. (If performance matters, see the links below; you’ll want a multiplicative inverse instead of div.)

And here’s code to call it in a loop counting down to 0 (including 0). Putting it in the same file is convenient.

ALIGN 16
global _start
_start:
    mov    ebx, 100
.repeat:
    lea    edi, [rbx + 0]      ; put +whatever constant you want here.
    call   print_uint32
    dec    ebx
    jge   .repeat


    xor    edi, edi
    mov    eax, 231
    syscall                             ; sys_exit_group(0)

Assemble and link with

yasm -felf64 -Worphan-labels -gdwarf2 print-integer.asm &&
ld -o print-integer print-integer.o

./print-integer
100
99
...
1
0

Use strace to see that the only system calls this program makes are write() and exit(). (See also the gdb / debugging tips at the bottom of the x86 tag wiki, and the other links there.)


Related:

  • 32-bit version of this, using int 0x80 for the write system call at the end. Pretty much the same loop.
  • With printf – How to print a number in assembly NASM? has x86-64 and i386 answers.
  • NASM Assembly convert input to integer? is the other direction, string->int.
  • Printing an integer as a string with AT&T syntax, with Linux system calls instead of printf – AT&T version of the same thing (but for 64-bit integers). See that for more comments about performance, and a benchmark of div vs. compiler-generated code using mul.
  • Add 2 numbers and print the result using Assembly x86 32-bit version that’s very similar to this.
  • This code-review Q&A uses a multiplicative inverse like a compiler would. And it accumulates the string into an 8-byte register instead of into memory, ready to store where you want the string to start without extra copying.
  • How to convert a binary integer number to a hex string? – power-of-2 bases are special. Answer includes scalar loop (branchy and table-lookup) and SIMD (SSE2, SSSE3, AVX2, and AVX512 which is amazing for this.)

High-performance versions

  • Some optimized decimal itoa (integer-to-string) versions from Daniel Lemire’s blog: without AVX-512, and much faster with AVX-512 IFMA

  • With NEON SIMD on Apple M1

  • and some older articles: How to print integers really fast blog post comparing some strategies in C.
    Such as x % 100 to create more ILP (Instruction Level Parallelism), and either a lookup table or a simpler multiplicative inverse (that only has to work for a limited range, like in this answer) to break up the 0..99 remainder into 2 decimal digits.
    e.g. with (x * 103) >> 10 using one imul r,r,imm8 / shr r,10 as shown in another answer. Possibly somehow folding that in to the remainder calculation itself.

  • https://tia.mat.br/posts/2014/06/23/integer_to_string_conversion.html a similar article.
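
The x % 100 strategy with the (x * 103) >> 10 trick mentioned above can be sketched in C roughly as follows. This is only an illustration of the idea, not code from any of the linked articles, and the names `split2` and `utoa_pairs_end` are hypothetical.

```c
#include <stdint.h>

/* Split r (valid for 0..99 only) into two ASCII digits,
   using (r * 103) >> 10 in place of r / 10. */
static void split2(uint32_t r, char *out) {
    uint32_t tens = (r * 103u) >> 10;
    out[0] = (char)('0' + tens);
    out[1] = (char)('0' + (r - tens * 10u));
}

/* Hypothetical helper: writes digits backwards from p_end, two per division,
   and returns a pointer to the leading digit. Needs 10 bytes before p_end. */
char *utoa_pairs_end(uint32_t val, char *p_end) {
    char *p = p_end;
    while (val >= 100) {
        p -= 2;
        split2(val % 100, p);       /* one division yields two digits */
        val /= 100;
    }
    if (val >= 10) {
        p -= 2;
        split2(val, p);
    } else {
        *--p = (char)('0' + val);   /* single leading digit */
    }
    return p;
}
```

Dividing by 100 halves the number of steps in the serial chain of divisions, and the cheap split of each 0..99 remainder does not depend on the next division.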

I suppose you want to print the value to stdout? If so, you have to use a system call to do it. System calls are OS-dependent.

e.g. Linux:
Linux System Call Table

The hello world program in this Tutorial may give you some insights.

1

I can’t comment, so I’m posting a reply this way.
@Ira Baxter, perfect answer. I just want to add that you don’t need to divide 10 times (you set register cx to 10 for that). Just keep dividing the number in ax until ax == 0:

loop1: call dividebyten
       ...
       cmp ax,0
       jnz loop1

You also have to keep track of how many digits there were in the original number, so the print loop knows how many to pop:

       mov cx,0
loop1: call dividebyten
       inc cx

Anyway, Ira Baxter, you helped me; these are just a few ways to optimize the code 🙂

This is not only about optimization but also about formatting. When you want to print the number 54, you want to print 54, not 0000000054 🙂

The digits 1-9 are just 1-9; beyond that there must be some conversion, and I don’t know the details either. Say you have 41H in AX (EAX) and you want to print 65, not ‘A’, without calling some service. I think you need to print the character representation of a 6 and a 5, whatever that might be. There must be a constant that can be added to a digit to get there. You need a modulus operator (however you do that in assembly) and a loop over all the digits.

Not sure, but that’s my guess.

1
