In Through the Out Door

In the previous episode of this reverse engineering effort, we finally found a good way to get hold of the real hardware register addresses, and we extracted the UART registers to begin with.

A minimal booting kernel needs two more things: interrupts, meaning information on how to drive the SoC’s interrupt controller, and timers, also a facility supplied by the SoC. We really only need one timer: the system timer, which supplies the Linux “ticks” that drive the scheduler. When you enable the system timer, you set it to fire at a fixed interval; firing means raising the interrupt you’ve linked it to.

The interrupt controller in the GK7101 is called the VIC (Vectored Interrupt Controller); there are two of them, each handling up to 32 interrupts. ARM provides a standard interrupt controller facility with its CPU cores, called the PrimeCell VIC. As it turns out, Goke’s VIC is almost, but not quite, compatible with it. This is turning into a theme; we saw the same oddity with its not-quite-16550 UART.

No matter, we have the SDK source code to figure out how to work it. First let’s get the registers from the SDK header, and run them through our HAL emulator:

/****************************************************/
/* Controller registers definitions                 */
/****************************************************/
#define VIC_IRQ_STA_OFFSET      0x30
#define VIC_FIQ_STA_OFFSET      0x34
#define VIC_RAW_STA_OFFSET      0x18
#define VIC_INT_SEL_OFFSET      0x0c
#define VIC_INTEN_OFFSET        0x10
#define VIC_INTEN_CLR_OFFSET    0x14
#define VIC_SOFTEN_OFFSET       0x1c
#define VIC_SOFTEN_CLR_OFFSET   0x20
#define VIC_PROTEN_OFFSET       0x24
#define VIC_SENSE_OFFSET        0x00
#define VIC_BOTHEDGE_OFFSET     0x08
#define VIC_EVENT_OFFSET        0x04
#define VIC_EDGE_CLR_OFFSET     0x38

The vendor driver thinks the VIC lives on the AHB at 0xf2000000, with the two VICs at offsets 0x8000 and 0x9000 respectively. So the VIC1 IRQ status register (offset 0x30) would be at 0xf2008030. Running these through the HAL for translation:

sumner: for reg in 30 34 18 0c 10 14 1c 20 24 00 08 04 38; do ./emu-hal halcode-fromdevice 0xf20080$reg; done
0xf2008030 -> 0xf0003000
0xf2008034 -> 0xf0003004
0xf2008018 -> 0xf0003008
0xf200800c -> 0xf000300c
0xf2008010 -> 0xf0003010
0xf2008014 -> 0xf0003014
0xf200801c -> 0xf0003018
0xf2008020 -> 0xf000301c
0xf2008024 -> 0xf0003020
0xf2008000 -> 0xf0003024
0xf2008008 -> 0xf0003028
0xf2008004 -> 0xf000302c
0xf2008038 -> 0xf0003038

Hey, look at how these are suddenly sorted! And how they suddenly match ARM’s PrimeCell VIC layout! It makes me wonder whether there’s an extra license charge for using ARM’s UART, VIC and so on.
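To make the reshuffle easier to see, here is a small sketch that pairs the SDK header names with the translated addresses from the listing above and sorts them by real offset. The tables are copied from the header and the emu-hal output; the constant and function names are mine, for illustration only.

```python
# Register offsets as the vendor SDK header names them (pre-HAL).
SDK_OFFSETS = {
    "VIC_IRQ_STA": 0x30, "VIC_FIQ_STA": 0x34, "VIC_RAW_STA": 0x18,
    "VIC_INT_SEL": 0x0C, "VIC_INTEN": 0x10, "VIC_INTEN_CLR": 0x14,
    "VIC_SOFTEN": 0x1C, "VIC_SOFTEN_CLR": 0x20, "VIC_PROTEN": 0x24,
    "VIC_SENSE": 0x00, "VIC_BOTHEDGE": 0x08, "VIC_EVENT": 0x04,
    "VIC_EDGE_CLR": 0x38,
}

# Translations copied from the emu-hal output above (VIC1 only).
HAL_MAP = {
    0xF2008030: 0xF0003000, 0xF2008034: 0xF0003004, 0xF2008018: 0xF0003008,
    0xF200800C: 0xF000300C, 0xF2008010: 0xF0003010, 0xF2008014: 0xF0003014,
    0xF200801C: 0xF0003018, 0xF2008020: 0xF000301C, 0xF2008024: 0xF0003020,
    0xF2008000: 0xF0003024, 0xF2008008: 0xF0003028, 0xF2008004: 0xF000302C,
    0xF2008038: 0xF0003038,
}

AHB_BASE = 0xF2000000     # where the vendor driver thinks the AHB lives
VIC1_OFFSET = 0x8000      # first VIC, per the vendor driver


def real_layout():
    """Return (real_offset, sdk_name) pairs sorted by real register offset."""
    rows = []
    for name, sdk_offset in SDK_OFFSETS.items():
        vendor_addr = AHB_BASE + VIC1_OFFSET + sdk_offset
        real_addr = HAL_MAP[vendor_addr]
        rows.append((real_addr & 0xFFF, name))   # offset within the VIC block
    return sorted(rows)


for offset, name in real_layout():
    print(f"0x{offset:02x}  {name}")
```

Sorted this way, the first nine registers run from IRQ status at 0x00 up to the protection register at 0x20 in steps of four.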

The timer subsystem officially lives at the APB base 0xf3000000 + timer offset (which is actually 0), and translates to post-MMU addresses like this:

sumner: for reg in 0c 00 04 08 14 18 1c 20 24 28 30 34 38; do ./emu-hal halcode-fromdevice 0xf30000$reg; done
0xf300000c -> 0xf100b030
0xf3000000 -> 0xf100b000
0xf3000004 -> 0xf100b008
0xf3000008 -> 0xf100b00c
0xf3000014 -> 0xf100b010
0xf3000018 -> 0xf100b018
0xf300001c -> 0xf100b01c
0xf3000020 -> 0xf100b020
0xf3000024 -> 0xf100b028
0xf3000028 -> 0xf100b02c
0xf3000030 -> 0xf100b004
0xf3000034 -> 0xf100b014
0xf3000038 -> 0xf100b024

Making a long story short: I implemented interrupt controller and timer drivers in a mainline kernel, and it didn’t work. Fortunately we’ve got that early_print() facility. By inserting those liberally, I figured out the kernel hung in init/calibrate.c:calibrate_delay_converge(). That’s the first kernel function to use the timer: it makes some precise measurements for delay loops and such, and does this by busy-looping waiting for jiffies to change. Jiffies is updated only by the system timer; if that never fires, you’ve got a hang.
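As a toy model of why a dead timer turns into a hang, here is the wait-for-a-tick pattern in Python. This is not the kernel's code: FakeTimer is a made-up stand-in for the system timer, and the max_spins limit exists only so the example terminates. The real loop has no such limit.

```python
def wait_for_tick(read_jiffies, max_spins):
    """Spin until the tick counter changes.

    Returns the number of spins it took, or None if the counter never moved.
    """
    start = read_jiffies()
    for spins in range(1, max_spins + 1):
        if read_jiffies() != start:
            return spins
    return None


class FakeTimer:
    """Hypothetical system timer: bumps jiffies on every `period`-th read."""

    def __init__(self, period):
        self.period = period      # 0 means the interrupt never fires
        self.reads = 0
        self.jiffies = 0

    def read(self):
        self.reads += 1
        if self.period and self.reads % self.period == 0:
            self.jiffies += 1     # what the timer interrupt handler would do
        return self.jiffies


working = wait_for_tick(FakeTimer(period=5).read, max_spins=1000)
broken = wait_for_tick(FakeTimer(period=0).read, max_spins=1000)
print(working, broken)
```

With a timer that ticks, the loop exits after a handful of spins; with one that never fires, only the artificial limit gets us out.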

The big question here is what’s broken: a failure in either the interrupt controller or the timer would break this function.

On the assumption I’d overlooked some hardware initialization somewhere, I dug deeper into the SDK source and decompiled vendor kernel… and found exactly that. There is some hardware setup in, of all places, the pre-kernel decompressor. From the SDK source:

gk_gpio_clrbitsl(GK_PA_GPIO0 + 0x3c, 0x00000001);
#ifdef CONFIG_PHY_USE_AO_MCLK
gk_rct_writel(GK_PA_RCT + 0x024, 0x00124021);
gk_rct_writel(GK_PA_RCT + 0x078, 0x00555555);
gk_rct_writel(GK_PA_RCT + 0x084, 0x00000004);
gk_rct_writel(GK_PA_RCT + 0x080, 0x00000001);
#endif

//misc clock configure
// SFLASH ioctrl
gk_rct_writel(GK_PA_RCT + 0x0198, 0x00000011);

// Sensor ioctrl
gk_rct_writel(GK_PA_RCT + 0x019C, 0x00000012);

The write calls in the CONFIG_PHY_USE_AO_MCLK block (which is defined) certainly look like they might be related to a system clock. Alas, adding them made no difference: still a hang at the calibration function.

This is rather hard to debug. I don’t have JTAG; the GK7101 may well have it, but it’s certainly not broken out on this camera board. The SoC is a BGA package, so there’s not going to be any probing pins for JTAG. The Linux kernel certainly supports this class of CPU, so interrupt vectors and handlers should just work here. I’ve gone over my code lots of times, and can’t find anything wrong.

On the assumption I must therefore still be driving either the interrupt controller or timer facility wrong, the way forward is clear: go over the running vendor kernel again until I find what else it’s doing. You can only stare at the same bits of code so many times though.

What I’d really like to do is get a log of all the hardware register accesses the vendor kernel does, so I can compare that to my code. But of course the Linux kernel doesn’t have anything like a register-level tracepoint; these are just memory reads and writes, after all.

But… the vendor kernel does have such a facility: that horrible HAL! It sits exactly where we need a trace facility: between drivers and their register-level access. What if we could get the HAL to not just do its read/write thing, but also log what it did, and which function asked it to?

We’d have to intercept calls to the HAL. That’s not too hard: we know where it lives, where its jumptable is, and we have access to it at U-Boot time. After all, U-Boot puts it in memory before the kernel is called. It turns out that the HAL jumptable – the hw_ops structure – is actually populated by the call to hal_init(). So we need to intercept the call to hal_init() by the vendor kernel’s decompressor as well. Why not do both at once: change hal_init() to call some code of ours after it’s populated the jumptable, so we can then:

  • Save a copy of that jumptable somewhere.
  • Populate the jumptable with our own functions, which log the call and then call the original function from the saved jumptable.
  • Boot the vendor kernel, and grab the log from memory.
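The first two steps can be modelled in a few lines of Python. This is only a sketch of the idea, not the actual wedge: the jumptable is a plain dict, the two lambdas stand in for the real HAL functions, and the caller address is passed in explicitly, where the machine code would take it from lr.

```python
def install_logging_wedge(hw_ops, log):
    """Swap every entry in hw_ops for a wrapper that logs, then forwards."""
    saved = dict(hw_ops)                      # step 1: keep the originals

    def make_wrapper(name, original):
        def wrapper(caller, *args):
            log.append((caller, name, args))  # record who asked for what
            return original(*args)            # then do the real work
        return wrapper

    for name, original in saved.items():      # step 2: repopulate the table
        hw_ops[name] = make_wrapper(name, original)
    return saved


# Hypothetical two-entry jumptable over a fake register space.
regs = {}
hw_ops = {
    "hw_readl": lambda addr: regs.get(addr, 0),
    "hw_writel": lambda value, addr: regs.__setitem__(addr, value),
}

log = []
install_logging_wedge(hw_ops, log)
hw_ops["hw_writel"](0xC1000A14, 0x0, 0xA000903C)
value = hw_ops["hw_readl"](0xC1000A08, 0xA000903C)

for caller, name, args in log:
    print(hex(caller), name, [hex(a) for a in args])
```

Drivers keep calling through the same table and never notice the detour; the only thing that changes is that every call leaves a record behind.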

Of course memory is rather a difficult facility at this level. It turns out that hal_init() is called twice:

  • Once by the decompressor, with arguments that make the HAL translate from physical addresses, since the MMU is off at decompression time.
  • A second time during proper kernel startup, this time with arguments that are virtual addresses.

That means our logging code will need to figure out whether it’s running with or without the MMU, and adjust its log memory pointer accordingly. This is normally done by reading a register of the system control coprocessor (CP15), but there’s an easier way: the decompressor runs from the 0xc1000000 block (where U-Boot put it), but it decompresses the kernel into the 0x80000000 block, from where the kernel then runs. So checking the high byte of the caller’s address will tell us where main memory lives. This is especially easy on ARM: the caller’s address is in the lr register. You don’t even have to go find it on the stack!

As to memory location regardless of MMU prefix: we’ll just drop it in the middle of the RAM range. All things being equal, a kernel boot shouldn’t use so much memory that it would reach the halfway point of its RAM, even on this board’s measly 16MB.
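In Python terms the decision looks roughly like this. It's a sketch under assumptions: the two RAM base addresses are my reading of the address blocks mentioned above, and the function and constant names are invented for the example.

```python
PHYS_RAM_BASE = 0xC0000000   # assumption: RAM as seen with the MMU off
VIRT_RAM_BASE = 0x80000000   # assumption: RAM as seen by the running kernel


def log_base_for_caller(lr, ram_size):
    """Pick the log buffer address from the caller's return address.

    A caller in the 0x80xxxxxx block is the kernel proper (MMU on);
    anything else is treated as the decompressor (MMU off).
    """
    high_byte = (lr >> 24) & 0xFF
    base = VIRT_RAM_BASE if high_byte == 0x80 else PHYS_RAM_BASE
    return base + ram_size // 2   # halfway into RAM


RAM_SIZE = 16 * 1024 * 1024   # the RAM size quoted above

print(hex(log_base_for_caller(0xC1000A08, RAM_SIZE)))   # decompressor caller
print(hex(log_base_for_caller(0x804EA6E0, RAM_SIZE)))   # kernel caller
```

Both results point at the same physical spot in RAM, just seen through a different address prefix.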

This is all low-level enough that it needs to be done in machine code. The code is here. You’ll have to excuse the crudity of the code: it’s the first time I’ve played with ARM machine code. Incidentally, this technique of inserting a jump right into some other function, and having that function end with running the instruction that was lost and then jumping back, is called a wedge. At least that’s what it was called in the Commodore 64 days; over in serious business PC land they called it a TSR (Terminate and Stay Resident). TLAs make everything better, don’t they?

From the U-Boot shell we can use the tftpboot command to load the compiled blob of our wedge at 0xc0016000, and call it with “go c0016000”. The wedge is now inserted at the end of hal_init(), where it will switch the jumptable around as explained above. Booting the vendor kernel will populate the log. I’ve made a custom root filesystem with the awesome buildroot, and made the vendor kernel load that off a microSD card. We can then use memtool to grab the log from memory. Piping it to gzip and scp’ing it over the network for analysis was the quickest way to get the log.

Each call to a HAL function saves four 32-bit items:

  • The caller’s address.
  • An integer denoting which function in the jumptable was called, i.e. a number from 0 to 17.
  • The first argument to the call: for read functions this is the address; for write functions it is the value to be written.
  • The second argument, used only by write calls: the address to write to.

Here’s a sample:

0000000: 080a 00c1 0400 0000 3c90 00a0 0000 0000  ........<.......
0000010: 140a 00c1 0700 0000 0000 0000 3c90 00a0  ............<...
0000020: 280a 00c1 0700 0000 2040 1200 2400 17a0  (....... @..$...
0000030: 3c0a 00c1 0700 0000 0000 0000 7800 17a0  <...........x...
[...]

The first entry shows an instruction at 0xc1000a08 called hw_ops->hw_readl(0xa000903c). The second one translates to hw_ops->hw_writel(0, 0xa000903c).

A little Python makes quick work of this:

0xc1000a08 hw_readl(0xa000903c)             
0xc1000a14 hw_writel(0x0, 0xa000903c)       
0xc1000a28 hw_writel(0x124020, 0xa0170024)  
0xc1000a3c hw_writel(0x0, 0xa0170078)       
0xc1000a50 hw_writel(0x4, 0xa0170084)       
0xc1000a64 hw_writel(0x1, 0xa0170080)       
0xc1000a78 hw_writel(0x11, 0xa0170198)      
0xc1000a90 hw_writel(0x112032, 0xa017008c)  
0xc1000868 hw_readl(0xa0005014)             
0xc1000884 hw_writel(0x55, 0xa0005004)      
0xc1000868 hw_readl(0xa0005014)             
0xc1000868 hw_readl(0xa0005014)             
0xc1000884 hw_writel(0x6e, 0xa0005004)      
0xc1000868 hw_readl(0xa0005014)             
0xc1000868 hw_readl(0xa0005014)             
0xc1000884 hw_writel(0x63, 0xa0005004)      

See those hw_writel() calls with 0xa0005004 as the address? That’s the UART input/output register, and it’s writing 0x55, 0x6e and 0x63: that’s ASCII for “Unc”, the start of “Uncompressing Linux…”.
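For reference, decoding the records takes very little code. The following is a sketch rather than the script that produced the listing above: it parses the four sample lines from the hexdump, and it only knows the two jumptable indices that show up in that sample (4 and 7). The names are mine.

```python
import struct

# Only the two indices seen in the sample are known; others print as funcN.
FUNC_NAMES = {4: "hw_readl", 7: "hw_writel"}

# The four sample records from the hexdump, without the ASCII column.
SAMPLE = """
0000000: 080a 00c1 0400 0000 3c90 00a0 0000 0000
0000010: 140a 00c1 0700 0000 0000 0000 3c90 00a0
0000020: 280a 00c1 0700 0000 2040 1200 2400 17a0
0000030: 3c0a 00c1 0700 0000 0000 0000 7800 17a0
"""


def parse_hexdump(text):
    """Turn 'offset: hex hex ...' lines back into raw bytes."""
    data = bytearray()
    for line in text.strip().splitlines():
        _, _, hexpart = line.partition(":")
        data += bytes.fromhex(hexpart.replace(" ", ""))
    return bytes(data)


def decode(data):
    """Decode 16-byte records of four little-endian 32-bit words."""
    lines = []
    for caller, func, arg1, arg2 in struct.iter_unpack("<4I", data):
        name = FUNC_NAMES.get(func, f"func{func}")
        if name == "hw_writel":
            call = f"{name}({arg1:#x}, {arg2:#x})"   # value, then address
        else:
            call = f"{name}({arg1:#x})"              # address only
        lines.append(f"{caller:#010x} {call}")
    return lines


decoded = decode(parse_hexdump(SAMPLE))
print("\n".join(decoded))

# The three UART writes from the listing, read as ASCII.
uart_text = bytes([0x55, 0x6E, 0x63]).decode("ascii")
print(uart_text)
```

Once the words are unpacked, spotting console output is just a matter of filtering for writes to the UART data register and joining the values.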

The caller address is interesting to see, but looking up which function that’s in is going to be a bit of work. Fortunately we have a shell on this running kernel, and Linux publishes its function symbol table in /proc/kallsyms – not for the decompressor, but for the kernel proper. If we feed those symbols into a Ghidra disassembly of the decompressed kernel, it can give all those functions names in its disassembly.

Ghidra also has a very, very elaborate API (it’s in Java, so overengineered to the gills). We can use that to feed the caller address to an API function that determines the function that address is in, and show the function name next to the caller.
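The same address-to-function lookup can be approximated without Ghidra, straight from the /proc/kallsyms text, using a sorted list and a binary search. To be clear, this is a different technique from the Ghidra API route described above, and the symbol start addresses in the excerpt are placeholders I made up so the example runs; only the function names and the caller addresses come from the log.

```python
import bisect

# Made-up excerpt in /proc/kallsyms format; start addresses are placeholders.
KALLSYMS = """\
80017b30 t gk7101_ack_irq
80017bc0 t gk7101_enable_irq
80017d00 t gk7101_irq_set_type
"""


def load_symbols(text):
    """Parse 'address type name' lines, keeping text (code) symbols only."""
    syms = []
    for line in text.splitlines():
        addr, kind, name = line.split()[:3]
        if kind in ("t", "T"):
            syms.append((int(addr, 16), name))
    syms.sort()
    return syms


def function_containing(syms, addr):
    """Return the name of the last symbol starting at or before addr."""
    starts = [start for start, _ in syms]
    i = bisect.bisect_right(starts, addr) - 1
    return syms[i][1] if i >= 0 else None


syms = load_symbols(KALLSYMS)
for caller in (0x80017B4C, 0x80017BE4, 0x80017E48):
    print(hex(caller), function_containing(syms, caller))
```

The obvious limitation is that kallsyms carries no function sizes, so an address past the end of the last function still gets attributed to it; a disassembler that knows function boundaries does not have that problem.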

Also, the log output shows the addresses the vendor drivers think they’re writing to, which are of course pre-HAL. Since we have our handy HAL emulator, we can enrich the log further by adding the real address each access ends up using, which is very handy if you’re writing drivers that don’t use this HAL nonsense. Here’s what the enriched log looks like, starting right after the decompressor handed over to the kernel proper:

0x804ea6e0 gk7101_map_io                  get_version(0xc0015ad0)           -> 0x0
0x804ea8d0 gk7101_init_irq                hw_writel(0x0, 0xf2008000)        -> 0xf0003024
0x804ea8e4 gk7101_init_irq                hw_writel(0x0, 0xf2008008)        -> 0xf0003028
0x804ea8f8 gk7101_init_irq                hw_writel(0x0, 0xf2008004)        -> 0xf000302c
0x804ea914 gk7101_init_irq                hw_writel(0x0, 0xf2009000)        -> 0xf0010024
0x804ea928 gk7101_init_irq                hw_writel(0x0, 0xf2009008)        -> 0xf0010028
0x804ea93c gk7101_init_irq                hw_writel(0x0, 0xf2009004)        -> 0xf001002c
0x804ea950 gk7101_init_irq                hw_writel(0x0, 0xf200800c)        -> 0xf000300c
0x804ea964 gk7101_init_irq                hw_writel(0x0, 0xf2008010)        -> 0xf0003010
0x804ea978 gk7101_init_irq                hw_writel(0xffffffff, 0xf2008014) -> 0xf0003014
0x804ea98c gk7101_init_irq                hw_writel(0xffffffff, 0xf2008038) -> 0xf0003038
0x804ea9a0 gk7101_init_irq                hw_writel(0x0, 0xf200900c)        -> 0xf001000c
0x804ea9b4 gk7101_init_irq                hw_writel(0x0, 0xf2009010)        -> 0xf0010010
0x804ea9c8 gk7101_init_irq                hw_writel(0xffffffff, 0xf2009014) -> 0xf0010014
0x804ea9dc gk7101_init_irq                hw_writel(0xffffffff, 0xf2009038) -> 0xf0010038
0x80017d44 gk7101_irq_set_type            hw_readl(0xf2008000)              -> 0xf0003024
0x80017d64 gk7101_irq_set_type            hw_readl(0xf2008008)              -> 0xf0003028
0x80017d80 gk7101_irq_set_type            hw_readl(0xf2008004)              -> 0xf000302c
0x80017e20 gk7101_irq_set_type            hw_writel(0x0, 0xf2008000)        -> 0xf0003024
0x80017e34 gk7101_irq_set_type            hw_writel(0x0, 0xf2008008)        -> 0xf0003028
0x80017e48 gk7101_irq_set_type            hw_writel(0x100000, 0xf2008004)   -> 0xf000302c
0x80017b4c gk7101_ack_irq                 hw_writel(0x100000, 0xf2008038)   -> 0xf0003038
0x80017be4 gk7101_enable_irq              hw_writel(0x100000, 0xf2008010)   -> 0xf0003010
0x80019fa0 get_apb_bus_freq_hz            hw_readl(0xf3170014)              -> 0xf1170000
0x80019fb4 get_apb_bus_freq_hz            hw_readl(0xf3170118)              -> 0xf1170118
0x800177ac gk7101_ce_timer_set_mode       hw_readl(0xf300000c)              -> 0xf100b030
0x800177b8 gk7101_ce_timer_set_mode       hw_writel(0x400, 0xf300000c)      -> 0xf100b030
0x800175e8 gk7101_timer_offset            hw_readl(0xf3000020)              -> 0xf100b020
0x8001764c gk7101_ce_timer_set_mode       hw_readl(0xf300000c)              -> 0xf100b030
0x80017658 gk7101_ce_timer_set_mode       hw_writel(0x400, 0xf300000c)      -> 0xf100b030
0x80019fa0 get_apb_bus_freq_hz            hw_readl(0xf3170014)              -> 0xf1170000
0x80019fb4 get_apb_bus_freq_hz            hw_readl(0xf3170118)              -> 0xf1170118
0x80017678 gk7101_ce_timer_set_mode       hw_writel(0xa8750, 0xf3000020)    -> 0xf100b020
0x8001768c gk7101_ce_timer_set_mode       hw_writel(0xa8750, 0xf3000038)    -> 0xf100b024
0x80019fa0 get_apb_bus_freq_hz            hw_readl(0xf3170014)              -> 0xf1170000
0x80019fb4 get_apb_bus_freq_hz            hw_readl(0xf3170118)              -> 0xf1170118
0x8001770c gk7101_ce_timer_set_mode       hw_writel(0x0, 0xf3000024)        -> 0xf100b028
0x80017720 gk7101_ce_timer_set_mode       hw_writel(0x0, 0xf3000028)        -> 0xf100b02c
0x80017734 gk7101_ce_timer_set_mode       hw_readl(0xf300000c)              -> 0xf100b030
0x80017740 gk7101_ce_timer_set_mode       hw_writel(0x411, 0xf300000c)      -> 0xf100b030
0x80017754 gk7101_ce_timer_set_mode       hw_readl(0xf300000c)              -> 0xf100b030
0x80017760 gk7101_ce_timer_set_mode       hw_writel(0x401, 0xf300000c)      -> 0xf100b030
0x80017774 gk7101_ce_timer_set_mode       hw_readl(0xf300000c)              -> 0xf100b030
0x80017780 gk7101_ce_timer_set_mode       hw_writel(0x501, 0xf300000c)      -> 0xf100b030
0x801dce78 serial_gk7101_set_termios      hw_writel(0x10, 0xf3005018)       -> 0xf100500c
0x801dced8 serial_gk7101_set_termios      hw_writel(0xd, 0xf3005004)        -> 0xf1005000
0x801dceec serial_gk7101_set_termios      hw_writel(0x0, 0xf3005000)        -> 0xf1005004
0x801dcf04 serial_gk7101_set_termios      hw_writel(0x3, 0xf3005018)        -> 0xf100500c
0x801dcf48 serial_gk7101_set_termios      hw_readl(0xf3005000)              -> 0xf1005004
0x801dcf54 serial_gk7101_set_termios      hw_writel(0x5, 0xf3005000)        -> 0xf1005004
0x801dc004 serial_gk7101_set_mctrl        hw_readl(0xf300500c)              -> 0xf1005010
0x801dc07c serial_gk7101_set_mctrl        hw_writel(0x4, 0xf300500c)        -> 0xf1005010
0x801dc1ec serial_gk7101_console_write    hw_readl(0xf3005000)              -> 0xf1005004
0x801dc204 serial_gk7101_console_write    hw_writel(0x5, 0xf3005000)        -> 0xf1005004
0x801dc158 serial_gk7101_console_putchar  hw_readl(0xf3005014)              -> 0xf1005014
0x801dc17c serial_gk7101_console_putchar  hw_writel(0x5b, 0xf3005004)       -> 0xf1005000
0x801dc158 serial_gk7101_console_putchar  hw_readl(0xf3005014)              -> 0xf1005014
0x801dc158 serial_gk7101_console_putchar  hw_readl(0xf3005014)              -> 0xf1005014
0x801dc17c serial_gk7101_console_putchar  hw_writel(0x20, 0xf3005004)       -> 0xf1005000
0x801dc158 serial_gk7101_console_putchar  hw_readl(0xf3005014)              -> 0xf1005014
0x801dc158 serial_gk7101_console_putchar  hw_readl(0xf3005014)              -> 0xf1005014

That first call to get_version() is used to print the “hal version = 20151223” line in the console log we saw earlier. After that comes a bunch of IRQ initialization, followed by timer initialization.

After that it’s some console initialization output; the putchar calls are the first real kernel console output. So that’s all as expected; as a matter of fact that matches exactly what my interrupt and timer drivers put into the registers. Yet the vendor kernel makes it through the calibration function. It’s not long before the log starts showing this:

0x8000845c gk7101_vic_handle_irq          hw_readl(0xf2008030)              -> 0
0x80017b4c gk7101_ack_irq                 hw_writel(0x100000, 0xf2008038)   -> 0
0x800175a0 gk7101_ce_timer_interrupt      hw_writel(0x100000, 0xf2008038)   -> 0

That’s a timer interrupt coming in, getting ack’ed, and getting handled. It just never happens in my kernel.

And so we’re stuck again!