Issue
I ran into some odd behavior while working through the 2018 version of MIT 6.828, whose labs run on QEMU with an emulated 80386 CPU:
What I want to do is initialize the receive path of the Intel 82540EM chip, also known as the E1000. I basically just write some bytes to the device's registers.
First I defined a structure with bit fields, since it corresponds to a register in hardware:
struct rx_addr_reg {
    // low 32 bit
    unsigned ral : 32; // 0 - 31
    // high 32 bit
    unsigned rah : 16; // 0 - 15
    unsigned as : 2;   // 16 - 17
    unsigned rs : 13;  // 18 - 30
    unsigned av : 1;   // 31
};
I decided to use it via C macros:
#define E1000_RA 0x05400 /* Receive Address - RW Array */
#define E1000_RAH_AV 0x80000000 /* Receive descriptor valid */
#define E1000_GET_REG(base,reg) \
{ ((void*)(base) + (reg)) }
#define E1000_SET_RECEIVE_ADDR_REG(addr,as,rs,av) (struct rx_addr_reg)\
{ (addr >> 16) & 0xffffffff, (addr) & 0xffff, \
(as) & 0x3, (rs) & 0x1fff, (av) & 0x1 }
Then in my .c file, I try to access and initialize the register:
// Receive Initialization
// Program the Receive Address Registers (RAL/RAH) with the desired Ethernet addresses
struct rx_addr_reg* rar = (struct rx_addr_reg*) E1000_GET_REG(e1000_va, E1000_RA);
*rar = E1000_SET_RECEIVE_ADDR_REG(0x120054525634, 0x0, 0x0, 0x1);
What I expected to see for rar in memory is something like:
Memory address: content
0x????????: 0x12005452 0x80005634
However, the actual result was:
Memory address: content
0x????????: 0x12005452 0x00000080
That is weird, so I stepped through the program in GDB:
+ target remote localhost:26000
The target architecture is assumed to be i8086
[f000:fff0] 0xffff0: ljmp $0xf000,$0xe05b
0x0000fff0 in ?? ()
+ symbol-file obj/kern/kernel
(gdb) br e1000.c:64
Breakpoint 1 at 0xf0107470: file kern/e1000.c, line 64.
(gdb) si
[f000:e05b] 0xfe05b: cmpl $0x0,%cs:0x6ac8
0x0000e05b in ?? ()
(gdb) c
Continuing.
The target architecture is assumed to be i386
=> 0xf0107470 <pci_e1000_attach+264>: movl $0x60200a,0x410(%eax)
Breakpoint 1, pci_e1000_attach (pcif=0xf012af10) at kern/e1000.c:64
64 *(uint32_t*)((char*)e1000_va + E1000_TIPG) |= 10 | 8 << 10 | 6 << 20;
(gdb) si
=> 0xf010747a <pci_e1000_attach+274>: movl $0x12005452,0x5400(%eax)
82 *rar = E1000_SET_RECEIVE_ADDR_REG(0x120054525634, 0x0, 0x0, 0x1); //0x525400123456 0x120054525634
(gdb)
=> 0xf0107484 <pci_e1000_attach+284>: movw $0x5634,0x5404(%eax)
0xf0107484 82 *rar = E1000_SET_RECEIVE_ADDR_REG(0x120054525634, 0x0, 0x0, 0x1); //0x525400123456 0x120054525634
(gdb)
=> 0xf010748d <pci_e1000_attach+293>: andb $0xfc,0x5406(%eax)
0xf010748d 82 *rar = E1000_SET_RECEIVE_ADDR_REG(0x120054525634, 0x0, 0x0, 0x1); //0x525400123456 0x120054525634
(gdb) x/2xw $eax + 0x5400
0xef809400: 0x12005452 0x00005634
(gdb) si
=> 0xf0107494 <pci_e1000_attach+300>: andw $0x8003,0x5406(%eax)
0xf0107494 82 *rar = E1000_SET_RECEIVE_ADDR_REG(0x120054525634, 0x0, 0x0, 0x1); //0x525400123456 0x120054525634
(gdb) x/2xw $eax + 0x5400
0xef809400: 0x12005452 0x00000034
(gdb) si
=> 0xf010749d <pci_e1000_attach+309>: orb $0x80,0x5407(%eax)
0xf010749d 82 *rar = E1000_SET_RECEIVE_ADDR_REG(0x120054525634, 0x0, 0x0, 0x1); //0x525400123456 0x120054525634
(gdb) x/2xw $eax + 0x5400
0xef809400: 0x12005452 0x00000000
(gdb) si
=> 0xf01074a4 <pci_e1000_attach+316>: movl $0x1,0xc(%esp)
86 cprintf("[RAH:RAL] [av]: [%x:%x] [%x]\n", rar->rah, rar->ral, rar->av);
(gdb) x/2xw $eax + 0x5400
0xef809400: 0x12005452 0x00000080
(gdb)
Below are the points that I cannot understand:
- The assembly code tries to AND the byte at 0x5406(%eax) with 0xfc, but it actually seems to clear the byte at 0x5405.
(gdb)
=> 0xf010748d <pci_e1000_attach+293>: andb $0xfc,0x5406(%eax)
0xf010748d 82 *rar = E1000_SET_RECEIVE_ADDR_REG(0x120054525634, 0x0, 0x0, 0x1); //0x525400123456 0x120054525634
(gdb) x/2xw $eax + 0x5400
0xef809400: 0x12005452 0x00005634
(gdb) si
=> 0xf0107494 <pci_e1000_attach+300>: andw $0x8003,0x5406(%eax)
0xf0107494 82 *rar = E1000_SET_RECEIVE_ADDR_REG(0x120054525634, 0x0, 0x0, 0x1); //0x525400123456 0x120054525634
(gdb) x/2xw $eax + 0x5400
0xef809400: 0x12005452 0x00000034
- Then something goes wrong with the ANDW: it seems to clear the byte at 0x5404(%eax):
(gdb) si
=> 0xf0107494 <pci_e1000_attach+300>: andw $0x8003,0x5406(%eax)
0xf0107494 82 *rar = E1000_SET_RECEIVE_ADDR_REG(0x120054525634, 0x0, 0x0, 0x1); //0x525400123456 0x120054525634
(gdb) x/2xw $eax + 0x5400
0xef809400: 0x12005452 0x00000034
(gdb) si
=> 0xf010749d <pci_e1000_attach+309>: orb $0x80,0x5407(%eax)
0xf010749d 82 *rar = E1000_SET_RECEIVE_ADDR_REG(0x120054525634, 0x0, 0x0, 0x1); //0x525400123456 0x120054525634
(gdb) x/2xw $eax + 0x5400
0xef809400: 0x12005452 0x00000000
- Finally it ORBs the byte at 0x5404(%eax), when it should OR the byte at 0x5407(%eax):
(gdb) si
=> 0xf010749d <pci_e1000_attach+309>: orb $0x80,0x5407(%eax)
0xf010749d 82 *rar = E1000_SET_RECEIVE_ADDR_REG(0x120054525634, 0x0, 0x0, 0x1); //0x525400123456 0x120054525634
(gdb) x/2xw $eax + 0x5400
0xef809400: 0x12005452 0x00000000
(gdb) si
=> 0xf01074a4 <pci_e1000_attach+316>: movl $0x1,0xc(%esp)
86 cprintf("[RAH:RAL] [av]: [%x:%x] [%x]\n", rar->rah, rar->ral, rar->av);
(gdb) x/2xw $eax + 0x5400
0xef809400: 0x12005452 0x00000080
- BTW, when I try to print the bytes at 0x5400(%eax), why does GDB refuse to do it and only show the content at 4-byte-aligned addresses?
(gdb) x/xw $eax+0x5404
0xef809404: 0x00000034
(gdb) x/xw $eax+0x5406
0xef809406: 0x00000034
(gdb) x/xb $eax+0x5406
0xef809406: 0x34
(gdb) x/xb $eax+0x5404
0xef809404: 0x34
One idea that might explain the problem, though I'm not sure: the structure I defined is 8 bytes long, and the system is running in 32-bit mode. So if the device does not accept writes to individual bit fields and only accepts whole 4-byte writes, the behavior would be reasonable.
I'd really appreciate your answer!
Solution
This hardware defines that its registers are 32 bits wide. That means you need to read and write them 32 bits at a time. Your C code doesn't do anything to ensure that happens; the compiler assumes that you're reading and writing plain old RAM when you operate on pointers to structs. For RAM it is fine to update sub-fields in a 32-bit value by reading and writing less than 32 bits at a time, and that is what the code the compiler generates is doing, with its byte and word operations. However this will not work correctly on a device register. (QEMU's implementation will ignore the byte and word access attempts; you can see this as well when you try to access the device via the gdbstub.)
So you can't just define a struct with bitfields that line up with the registers in the specification and expect writing to an individual bitfield to work correctly. If you want to update an individual field in a register you should read the whole 32 bit register, update the relevant part of the value, and then write the whole 32 bit value back again. (Often you want to update all the fields at once, in which case you can just do a write of the full new value without having to do a read first.)
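As a rough illustration of the read-modify-write pattern described above, here is a minimal sketch in C. The helper names (reg_read32, reg_write32, set_receive_addr, set_address_select) and the E1000_RAL_OFF / E1000_RAH_OFF / RAH_* constants are hypothetical and not part of the lab code; the field positions are taken from the struct in the question. It assumes the compiler turns an aligned volatile uint32_t access into a single 32-bit load or store, which GCC does in practice on x86 but which the C standard does not strictly promise.

```c
#include <stdint.h>

/* Register offsets: E1000_RA (0x05400) from the question is RAL; RAH follows it. */
#define E1000_RAL_OFF  0x5400u
#define E1000_RAH_OFF  0x5404u

/* RAH field layout, as given by the struct in the question (names are hypothetical). */
#define RAH_AS_SHIFT   16                      /* bits 16-17: address select */
#define RAH_AS_MASK    (0x3u << RAH_AS_SHIFT)
#define RAH_AV         0x80000000u             /* bit 31: address valid */

/* Hypothetical helpers: each call performs one whole 32-bit access. */
static inline uint32_t reg_read32(volatile void *base, uint32_t off)
{
    return *(volatile uint32_t *)((volatile char *)base + off);
}

static inline void reg_write32(volatile void *base, uint32_t off, uint32_t val)
{
    *(volatile uint32_t *)((volatile char *)base + off) = val;
}

/* Updating every field at once: build the full values, then do two 32-bit writes. */
static void set_receive_addr(volatile void *base, uint32_t ral, uint16_t rah_addr)
{
    reg_write32(base, E1000_RAL_OFF, ral);
    reg_write32(base, E1000_RAH_OFF, (uint32_t)rah_addr | RAH_AV);
}

/* Updating one field: read the whole register, change the field, write it all back. */
static void set_address_select(volatile void *base, uint32_t as)
{
    uint32_t rah = reg_read32(base, E1000_RAH_OFF);
    rah = (rah & ~RAH_AS_MASK) | ((as << RAH_AS_SHIFT) & RAH_AS_MASK);
    reg_write32(base, E1000_RAH_OFF, rah);
}
```

With the values from the question, set_receive_addr(e1000_va, 0x12005452, 0x5634) would store 0x12005452 and 0x80005634 using two 32-bit writes, instead of the byte and word operations seen in the GDB trace.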
You also want to make sure the compiler doesn't think this is just RAM and so it can happily reorder, merge or drop updates. Personally I like the Linux kernel's approach of defining functions for doing accesses that eventually boil down to asm loads and stores so that it's always 100% clear exactly what the generated code will be doing; there are other approaches too.
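To give an idea of what accessors that boil down to asm loads and stores can look like, here is a small sketch loosely modeled on that style. The names mmio_read32 and mmio_write32 are hypothetical, and this is not the Linux kernel's actual code. It assumes GCC or Clang extended inline assembly on x86; on other targets it falls back to a plain volatile access so the sketch still compiles.

```c
#include <stdint.h>

/* Hypothetical accessor: store exactly 32 bits to a device register. */
static inline void mmio_write32(volatile void *addr, uint32_t val)
{
#if defined(__i386__) || defined(__x86_64__)
    /* One movl store; the "memory" clobber keeps the compiler from
     * reordering or merging other memory accesses around it. */
    __asm__ volatile("movl %0, %1"
                     :
                     : "r"(val), "m"(*(volatile uint32_t *)addr)
                     : "memory");
#else
    /* Assumed fallback for non-x86 builds: a volatile 32-bit store. */
    *(volatile uint32_t *)addr = val;
#endif
}

/* Hypothetical accessor: load exactly 32 bits from a device register. */
static inline uint32_t mmio_read32(const volatile void *addr)
{
    uint32_t val;
#if defined(__i386__) || defined(__x86_64__)
    /* One movl load into a 32-bit register. */
    __asm__ volatile("movl %1, %0"
                     : "=r"(val)
                     : "m"(*(const volatile uint32_t *)addr)
                     : "memory");
#else
    /* Assumed fallback for non-x86 builds: a volatile 32-bit load. */
    val = *(const volatile uint32_t *)addr;
#endif
    return val;
}
```

Because the instruction is spelled out, the width of every device access is visible in the source and does not depend on how the compiler chooses to implement a bit-field assignment.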
Answered By - Peter Maydell