Printing the first character of a string results in a segmentation fault

You have several bugs:

  • You push two 4-byte arguments onto the stack for printf. In SysV calling conventions, printf will leave them there, and so it is your responsibility to adjust the stack afterwards to “remove” them. Remember that ret will look for a return address at the top of the stack; as your code stands, what will be there is the character value from eax that you pushed. That’s not a valid address, so trying to return there causes a segfault. You can remove those arguments by popping twice, or more efficiently by simply adding 8 to esp, thus moving the stack pointer back to where it was.

  • Current versions of the i386 SysV ABI require the stack to be aligned to 16 bytes just before calling any function. Since call itself pushes a 4-byte return address, and every push instruction also moves esp down by 4 bytes, you can work out the adjustments needed for your calls to some_proc and to printf, and add or subtract from esp as appropriate. (Technically you could avoid aligning the stack before calling some_proc and just fix it up before printf, but this is too easy to screw up.) Some 32-bit libraries may be compiled in such a way that this requirement is not enforced, but 64-bit code definitely needs it, so it is a good habit to comply.

  • esi is a callee-saved register according to i386 SysV ABI calling conventions (memorize these!). If you want to modify it, you have to save the previous contents and restore them before returning (e.g. push esi at the top of the function and pop esi at the end). Or choose a caller-saved register such as ecx instead. However, as noted below, you don’t really need to use a register for the address of str_1 at all.

  • mov eax, [esi] is a 32-bit load because eax is a 32-bit register. So this will load eax with the 4 bytes from location str_1, which will result in it containing the value 0x65726874 (the bytes t h r e as a little-endian integer). This may not actually cause a problem since printf is supposed to convert its int argument back to unsigned char for printing, so you should only get the low byte 0x74 = 't', but it is still weird, and could break if your string was very short and adjacent to an unmapped page.

    Safer would be mov al, [esi] which just loads one byte into al, which is the low byte of eax, but whatever garbage is in the high 3 bytes will stay there. You could zero out eax beforehand with xor eax, eax, but you can also kill two birds with one stone with the movzx instruction, which zero-extends a smaller operand into a larger one: movzx eax, byte [esi].

    Of course, putting the address into esi first is redundant, since the address can be specified as an immediate: mov al, [str_1] or movzx eax, byte [str_1]. This then avoids the need to save/restore esi.

  • main is expected to return an exit code, and return values always go in eax. Your eax would contain your characters or maybe the return value from printf, depending where your push/pops end up. Any of those will be a weird nonzero exit code and your shell will think the program encountered an error. So zero out eax before returning from main, to indicate success.

  • argv_str is a strange name for a string that has nothing to do with argv.
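
To make the difference between the 32-bit and 8-bit loads concrete, here is a minimal sketch that prints eax in hex after each kind of load. The file name loads.asm, the label hex_fmt and the -no-pie flag are assumptions for illustration (the flag is only there for toolchains that build position-independent executables by default); the stack handling follows the points above.

```nasm
; Build (assumed toolchain): nasm -f elf32 loads.asm && gcc -m32 -no-pie -o loads loads.o
section .text
global  main
extern  printf

main:
    sub   esp, 4                ; esp is 12 mod 16 on entry; 4 here + 8 pushed = aligned at each call
    mov   eax, [str_1]          ; 32-bit load: the four bytes 't','h','r','e'
    push  eax
    push  hex_fmt               ; hex_fmt is a hypothetical label, not from the question
    call  printf                ; prints 65726874
    add   esp, 8                ; caller removes the two arguments

    movzx eax, byte [str_1]     ; 8-bit load, zero-extended to 32 bits
    push  eax
    push  hex_fmt
    call  printf                ; prints 00000074
    add   esp, 8

    xor   eax, eax              ; exit status 0
    add   esp, 4                ; undo the padding so ret sees the return address
    ret

section .data
    str_1    db `three`, 0
    hex_fmt  db `%08x\n`, 0
```

The first line of output shows the little-endian packing of the first four characters; the second shows only the byte for 't'.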

I would modify your program as follows:

; nasm -f elf32 test.asm && gcc -m32 -o test test.o
section .text
global  main
extern printf

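some_proc:
    sub esp, 4              ; padding: with the 8 bytes pushed below, esp is 16-byte aligned at the call to printf
    movzx eax, byte [str_1] ; first character of the string, zero-extended
    push eax                ; second argument: the character
    push argv_str           ; first argument: the format string
    call printf
    add esp, 12             ; remove the two arguments and the 4 bytes of padding
    ret

main:
    sub esp, 12             ; return address (4) + 12 = 16, so esp is aligned at the call
    call some_proc
    xor eax, eax            ; return 0 (success)
    add esp, 12             ; restore esp so ret finds the return address
    ret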

section  .data
    str_1        db `three`, 0
    argv_str     db `%c\n`, 0   ; printf needs a NUL-terminated format string
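
If you did want to keep the address in esi, a sketch of that variant might look like the following. The names print_first, fmt and esi_demo.asm are hypothetical, and -no-pie is again an assumption for toolchains that default to position-independent executables. Here push esi also serves as the 4 bytes of padding, and the arguments are removed with add esp, 8 before esi is popped.

```nasm
; Build (assumed toolchain): nasm -f elf32 esi_demo.asm && gcc -m32 -no-pie -o esi_demo esi_demo.o
section .text
global  main
extern  printf

print_first:                    ; hypothetical name for this variant
    push  esi                   ; callee-saved: keep the caller's value (esp now 8 mod 16)
    mov   esi, str_1            ; address of the string
    movzx eax, byte [esi]       ; first character, zero-extended
    push  eax                   ; printf argument 2: the character
    push  fmt                   ; printf argument 1: the format (esp now 0 mod 16)
    call  printf
    add   esp, 8                ; remove the two arguments
    pop   esi                   ; restore the caller's esi
    ret

main:
    sub   esp, 12               ; entry esp is 12 mod 16; now 0 mod 16
    call  print_first
    xor   eax, eax              ; exit status 0
    add   esp, 12
    ret

section .data
    str_1  db `three`, 0
    fmt    db `%c\n`, 0
```

The pops and the add must mirror the pushes in reverse order: popping esi before removing the arguments would load the format-string address into esi and leave ret pointing at the wrong slot.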
