Issue
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char** argv) {
    char *str5=malloc(10);
    *str5="xxxxx\0";
    printf("%s\n",str5);
    return 0;
}
compiles into the following assembly (using "gcc source.c"):
=> 0x555555555179 <main+4>: sub rsp,0x20
0x55555555517d <main+8>: mov DWORD PTR [rbp-0x14],edi
0x555555555180 <main+11>: mov QWORD PTR [rbp-0x20],rsi
0x555555555184 <main+15>: mov edi,0xa
0x555555555189 <main+20>: call 0x555555555040 <malloc@plt>
0x55555555518e <main+25>: mov QWORD PTR [rbp-0x8],rax
0x555555555192 <main+29>: lea rax,[rip+0xe6b] # 0x555555556004
0x555555555199 <main+36>: mov edx,eax
0x55555555519b <main+38>: mov rax,QWORD PTR [rbp-0x8]
0x55555555519f <main+42>: mov BYTE PTR [rax],dl
0x5555555551a1 <main+44>: mov rax,QWORD PTR [rbp-0x8]
0x5555555551a5 <main+48>: mov rdi,rax
0x5555555551a8 <main+51>: call 0x555555555030 <puts@plt>
0x5555555551ad <main+56>: mov eax,0x0
0x5555555551b2 <main+61>: leave
0x5555555551b3 <main+62>: ret
0x5555555551b4 <_fini>: sub rsp,0x8
0x5555555551b8 <_fini+4>: add rsp,0x8
0x5555555551bc <_fini+8>: ret
Edit: To clarify, I'm not asking how to correctly copy or assign strings; I know how to do that. I'm asking WHY this particular set of instructions gets compiled this way. The selected answer explains it well.
What actually happens:
So, at main+42, it stores dl (the lowest byte of the ADDRESS of my string constant "xxxxx\0") into the address rax points to, which is the buffer that my variable str5 points to. This ends up being just a garbage character, and the rest of the string constant is never copied. The compiler emits the following warning, which is probably pertinent, but I don't understand how:
source.c:6:14: warning: assignment to ‘char’ from ‘char *’ makes integer from pointer without a cast [-Wint-conversion]
What I'm expecting to happen: The string constant "xxxxx\0" is loaded into the address pointed to by str5. Then, the string pointed to by str5 is printed to the screen. Why does gcc compile this code this way?
Solution
In C, *str5="xxxxx\0"; does not copy the string.
This operation actually:
- takes the address of the string literal "xxxxx\0"
- converts this address to char (which is an integer type)
- assigns this integer value to the first character of the memory allocated by malloc
When you print it, the first character is that integer value interpreted as a character. The remaining bytes are not initialized, so printing the buffer as a string invokes undefined behaviour.
If you want to copy the string literal, you need strcpy(str5, "xxxxx"); (declared in <string.h>).
The string literal already ends with a null terminating character, so you do not need to put one there yourself.
So the compiler is right and it is generating the code for the program you wrote.
Answered By - gulpr Answer Checked By - Mildred Charles (WPSolving Admin)