This is the second post in our pre-main Zephyr series.
Last time, we explored the resources Zephyr uses to
bring up ARM Cortex-M CPUs after a reset. In particular, we examined the vector table and the reset
handler assembly code.
Today, we dig into the C runtime logic that eventually leads to calling the application’s main()
function. We’ll focus on how Zephyr uses preprocessor macros, linker script tricks, and GNU C’s
__attribute__ mechanism to efficiently register and invoke initialization functions at different
stages of the boot process.
The last instruction in the ARM Cortex-M reset handler is bl z_prep_c. Let’s walk through the
called function step by step.
1. [Config-specific] soc_prep_hook()
The function starts with an optional call to soc_prep_hook, gated on the CONFIG_SOC_PREP_HOOK
config option. When defined, this function performs very early pre-main initialization, e.g. clock
selection, TrustZone setup, etc.
2. Vector table relocation
Next, Zephyr ensures the vector table is located at the right address by calling
relocate_vector_table. This is done regardless of the configuration and architecture, and is
necessary to reset the VTOR register so that it points to Zephyr’s own table and future interrupts
and faults hit the correct handlers.
We can verify the address of the vector table in our output ELF file, confirming that the vector
table does start at 0x10000:
$ arm-none-eabi-objdump -t build/zephyr/zephyr.elf | grep _vector_table -w
00010000 g O rom_start 00000000 _vector_table
In build/zephyr/linker.cmd too, we can double-check that the firmware, and thus the vector table,
starts at the same address:
OUTPUT_FORMAT("elf32-littlearm")
_region_min_align = 32;
MEMORY
{
FLASH (rx) : ORIGIN = (0x0 + 0x10000), LENGTH = (0xe8000 - 0x0)
...
3. [Config-specific] Floating Point Unit initialization
Zephyr then enables the Floating Point Unit (FPU) on CPUs that have it (Cortex-M4F, M7F, M33F, M35PF). It is important to perform this setup early enough – i.e. before any FP-related instructions are run – to avoid triggering usage faults.
4. BSS zeroing
The next step is standard in any C runtime startup. z_bss_zero takes care of clearing the .bss
section, where all uninitialized global and static variables live. It is more or less equivalent to:
memset(&_bss_start, 0, &_bss_end - &_bss_start);
5. Data copy
Likewise, Zephyr performs another very common operation in C runtime startup logic: it copies the
initialized data section (.data) from flash to RAM, which roughly corresponds to:
for (dest = &_data_start; dest < &_data_end; ++dest)
*dest = *src++; // where src = image_load_addr
This step ensures that initialized global variables hold their initial values in RAM before any user code runs.
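Steps 4 and 5 can be tried out together on a host machine. The following is a minimal sketch, not Zephyr’s actual code: the fake_* arrays and the crt_init_ram function are hypothetical names that stand in for the linker-provided section boundaries and the startup routines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for linker-provided regions (not Zephyr symbols). */
static const uint8_t fake_flash_data[4] = { 0xde, 0xad, 0xbe, 0xef }; /* .data load image */
static uint8_t fake_ram_data[4];                                      /* .data run address */
static uint8_t fake_ram_bss[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };          /* "garbage" after reset */

/* Zero the BSS region, then copy initialized data from its load address. */
static void crt_init_ram(uint8_t *bss, size_t bss_len,
                         uint8_t *data, const uint8_t *load, size_t data_len)
{
    assert(bss != NULL && data != NULL && load != NULL);
    memset(bss, 0, bss_len);      /* step 4: uninitialized globals now read as 0 */
    memcpy(data, load, data_len); /* step 5: initialized globals get their values */
}
```

Calling crt_init_ram(fake_ram_bss, sizeof(fake_ram_bss), fake_ram_data, fake_flash_data, sizeof(fake_ram_data)) leaves the fake BSS zeroed and the fake .data region identical to its flash image.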
6. Interrupt controller initialization
Following this, Zephyr either calls the SoC’s custom interrupt controller init function
(z_soc_irq_init) when defined, or uses the default fallback z_arm_interrupt_init. Here’s the
definition from arch/arm/core/cortex_m/irq_init.c:
void z_arm_interrupt_init(void) {
    int irq = 0;
#if defined(CONFIG_MULTI_LEVEL_INTERRUPTS) && defined(CONFIG_2ND_LVL_ISR_TBL_OFFSET)
    for (; irq < CONFIG_2ND_LVL_ISR_TBL_OFFSET; irq++) {
#else
    for (; irq < CONFIG_NUM_IRQS; irq++) {
#endif
        NVIC_SetPriority((IRQn_Type)irq, _IRQ_PRIO_OFFSET);
    }
}
The function is basically a loop over all IRQs.
With a multi-level interrupt controller, only the first level is initialized here (indices up to
CONFIG_2ND_LVL_ISR_TBL_OFFSET); the remaining IRQs are SoC-specific. Regardless of the config,
Zephyr initializes all these IRQs using CMSIS’s NVIC_SetPriority, with all their priority levels
set to _IRQ_PRIO_OFFSET, ensuring they can be enabled safely and will run in a predictable order
unless overridden later.
7. [Config-specific] Cache init
Most ARM Cortex-M CPUs don’t have a data or instruction cache, so this step is skipped.
Instruction and/or data caches are optional features of cores such as the Cortex-M7 and M35P.
In this step, Zephyr calls the generic arch_cache_init function, which is empty by default on
Cortex-M families, even when a cache is present. Instead, cache control is left to the SoC
vendor’s implementation, as we’ll see in the next article.
8. [Config-specific] Trapping null pointer dereferences
When CONFIG_NULL_POINTER_EXCEPTION_DETECTION_DWT is set, Zephyr calls
z_arm_debug_enable_null_pointer_detection, which uses the ARM DWT (Data Watchpoint and Trace) unit
to trap null pointer dereferences.
For ARMv8-M Mainline, it uses two DWT comparators to monitor illegal (and not just NULL)
dereferences in the address range that starts at 0x0 and whose length is
CONFIG_CORTEX_M_NULL_POINTER_EXCEPTION_PAGE_SIZE, the size of the “unmapped” page (defaults to
0x400).
For the remaining cases (ARMv7-M processors, or processors using a backwards-compatible ARMv8-M
processor implementation supporting Main Extension), a comparator and a mask are used.
The result is the same, though: if any illegal read or write access is performed in the range
[0, CONFIG_CORTEX_M_NULL_POINTER_EXCEPTION_PAGE_SIZE), an exception occurs.
This feature is very useful and can help detect null pointer bugs that occur very early during execution.
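To make the comparator-and-mask variant more concrete, here is a simplified host-side model of how such a match can be evaluated. It is only a sketch under assumptions: dwt_like_match is a hypothetical helper, and it models the mask as a count of low address bits that are ignored during comparison, which is how the ARMv7-M DWT mask register is described. It does not touch any real DWT register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an address comparator with an ignore mask:
 * the low `mask_bits` bits of both values are dropped before comparing. */
static bool dwt_like_match(uint32_t addr, uint32_t comp, unsigned mask_bits)
{
    uint32_t ignore = (mask_bits >= 32) ? UINT32_MAX
                                        : ((UINT32_C(1) << mask_bits) - 1);

    return (addr & ~ignore) == (comp & ~ignore);
}
```

With a comparator value of 0 and 10 ignored bits, every address from 0x0 to 0x3ff matches, which corresponds to the default 0x400 page, while 0x400 and above do not.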
Kernel initialization
The last bit is a call to z_cstart. This is a non-SoC-specific function located in kernel/init.c
that marks the transition from platform startup to Zephyr RTOS initialization. Let’s dive into its
key elements.
First, z_sys_init_run_level is called to initialize base-level devices. This function is called
multiple times inside z_cstart and plays an important role, so let’s unpack it in detail.
The source code of the function is fairly easy to understand regardless of the context. It can be summarized as follows:
- The function accepts an enum parameter level.
- It defines a static list levels, whose elements point to entries of type: struct init_entry { int (*init_fn)(void); const struct device *dev; }
- Based on level, it walks the entries in the list that correspond to the requested level. It calls do_device_init(entry->dev) if the entry is a device driver (which is never the case for the EARLY initialization level), or entry->init_fn() otherwise.
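The summary above can be sketched as a small host-runnable program. This is an illustration under assumptions rather than the kernel’s code: struct device is reduced to a hypothetical one-field struct, fake_device_init stands in for do_device_init, and the level_bounds array plays the role of levels, with hand-written boundaries instead of linker symbols.

```c
#include <stddef.h>

struct device { const char *name; }; /* hypothetical, minimal stand-in */

struct init_entry {
    int (*init_fn)(void);
    const struct device *dev;
};

static int plain_inits;  /* how many plain init functions ran */
static int device_inits; /* how many device entries were initialized */

static int sample_init(void) { plain_inits++; return 0; }
static int fake_device_init(const struct device *dev) { (void)dev; device_inits++; return 0; }

static const struct device dummy_dev = { "dummy" };

/* All entries, already grouped by level as the linker would lay them out. */
static const struct init_entry init_table[] = {
    { .init_fn = sample_init, .dev = NULL },       /* LEVEL_A */
    { .init_fn = NULL,        .dev = &dummy_dev }, /* LEVEL_A */
    { .init_fn = sample_init, .dev = NULL },       /* LEVEL_B */
};

enum demo_level { LEVEL_A, LEVEL_B };

/* Start of each level, plus one final end marker. */
static const struct init_entry *const level_bounds[] = {
    init_table, init_table + 2, init_table + 3,
};

static void run_level(enum demo_level level)
{
    const struct init_entry *e;

    for (e = level_bounds[level]; e < level_bounds[level + 1]; e++) {
        if (e->dev != NULL) {
            (void)fake_device_init(e->dev); /* device entries take the device path */
        } else {
            (void)e->init_fn();             /* plain entries are called directly */
        }
    }
}
```

Each level is simply the half-open range between two consecutive boundaries, so running a level never needs to know how many entries were registered for it.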
At this stage, we can ask a few questions:
- How does the levels array work and why is it statically declared?
- How are entries defined?
Zephyr defines a set of initialization “levels”. You can see them as a multistage rocket where each level is initialized in order: early initialization, then pre-kernel, post-kernel, application, and eventually the final level.
Each level has a set of functions “attached” to it.
After its definition, a function named my_init_function is registered as an init function with
the following preprocessor macro call:
SYS_INIT(my_init_function, POST_KERNEL /* LEVEL */, 5 /* PRIO */);
which ultimately expands, after a few operations, to this very verbose definition:
static const __aligned(__alignof(struct init_entry)) struct init_entry
__attribute__((__section__(".z_init_POST_KERNEL_P_5_SUB_0_")))
__attribute__((__used__))
__attribute__((no_sanitize("address")))
__init_my_init_function = {
.init_fn = (my_init_function),
.dev = ((void *)0)
};
In this definition lie most answers to our previous questions. In particular, this is a very
insightful illustration of the power of the __attribute__ keyword. Let’s see what purpose each
attribute serves:
- __attribute__((no_sanitize("address"))) disables Address Sanitizer (ASan) to avoid false positives or runtime overhead for this very “low-level” variable. It is only relevant when -fsanitize=address is used in compiler flags.
- __attribute__((__used__)) tells the compiler not to discard the variable even if it looks unused. This matters all the more here because the variable is declared static, so the compiler would otherwise be free to optimize it away.
- __attribute__((__section__(".z_init_POST_KERNEL_P_5_SUB_0_"))) is the key attribute to help understand this definition. It instructs the compiler to place this variable in a custom linker section named .z_init_POST_KERNEL_P_5_SUB_0_, allowing the linker script, as we’ll see, to aggregate init entries per level and priority into a contiguous block. Without this, Zephyr wouldn’t find it easily at runtime.
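The same section-placement trick can be reproduced outside Zephyr. The sketch below assumes a GNU-style ELF toolchain (GCC or Clang with GNU ld or a compatible linker), which synthesizes __start_NAME and __stop_NAME symbols for any section whose name is a valid C identifier. The DEMO_INIT macro, the demo_init section, and the init_a/init_b functions are hypothetical names, and no custom linker script or priority sorting is involved here.

```c
#include <stddef.h>

struct init_entry {
    int (*init_fn)(void);
};

/* Hypothetical registration macro, loosely modeled on the expansion above. */
#define DEMO_INIT(fn)                                                    \
    static const struct init_entry demo_entry_##fn                       \
        __attribute__((__section__("demo_init"), __used__,               \
                       __aligned__(__alignof__(struct init_entry)))) = { \
            .init_fn = (fn),                                             \
        }

/* Section boundaries provided by the linker (GNU ld convention). */
extern const struct init_entry __start_demo_init[];
extern const struct init_entry __stop_demo_init[];

static int calls;
static int init_a(void) { calls += 1;  return 0; }
static int init_b(void) { calls += 10; return 0; }

DEMO_INIT(init_a);
DEMO_INIT(init_b);

/* Walk the aggregated section and call every registered function. */
static int run_demo_inits(void)
{
    int n = 0;
    const struct init_entry *e;

    for (e = __start_demo_init; e < __stop_demo_init; e++) {
        (void)e->init_fn();
        n++;
    }
    return n;
}
```

Nothing in run_demo_inits names init_a or init_b: the entries are discovered purely through their section, which is the property Zephyr relies on.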
The last point is exactly how everything fits together. If we inspect the linker script
(reminder: it’s build/zephyr/linker.cmd), we’ll see this:
initlevel :
{
__init_start = .;
__init_EARLY_start = .; KEEP(*(SORT(.z_init_EARLY_P_?_*))); KEEP(*(SORT(.z_init_EARLY_P_??_*))); KEEP(*(SORT(.z_init_EARLY_P_???_*)));
__init_PRE_KERNEL_1_start = .; KEEP(*(SORT(.z_init_PRE_KERNEL_1_P_?_*))); KEEP(*(SORT(.z_init_PRE_KERNEL_1_P_??_*))); KEEP(*(SORT(.z_init_PRE_KERNEL_1_P_???_*)));
__init_PRE_KERNEL_2_start = .; KEEP(*(SORT(.z_init_PRE_KERNEL_2_P_?_*))); KEEP(*(SORT(.z_init_PRE_KERNEL_2_P_??_*))); KEEP(*(SORT(.z_init_PRE_KERNEL_2_P_???_*)));
__init_POST_KERNEL_start = .; KEEP(*(SORT(.z_init_POST_KERNEL_P_?_*))); KEEP(*(SORT(.z_init_POST_KERNEL_P_??_*))); KEEP(*(SORT(.z_init_POST_KERNEL_P_???_*)));
__init_APPLICATION_start = .; KEEP(*(SORT(.z_init_APPLICATION_P_?_*))); KEEP(*(SORT(.z_init_APPLICATION_P_??_*))); KEEP(*(SORT(.z_init_APPLICATION_P_???_*)));
__init_SMP_start = .; KEEP(*(SORT(.z_init_SMP_P_?_*))); KEEP(*(SORT(.z_init_SMP_P_??_*))); KEEP(*(SORT(.z_init_SMP_P_???_*)));
__init_end = .;
} > FLASH
We can conclude that the entry for my_init_function, which was explicitly placed in
.z_init_POST_KERNEL_P_5_SUB_0_, will be matched by the pattern:
KEEP(*(SORT(.z_init_POST_KERNEL_P_?_*)))
because its priority has a single digit. It is thus inserted (sorted by name, i.e. by priority)
after the __init_POST_KERNEL_start symbol, inside the parent section initlevel.
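A plausible reason for having three patterns per level (?, ?? and ???) is that sorting by name is lexical, so a two-digit priority such as 10 would otherwise sort before 5. The host-side sketch below illustrates that effect with qsort. The comparators and the shortened "P_<prio>_SUB_0_" names are hypothetical, and the code only mimics the ordering, not the linker itself.

```c
#include <stdlib.h>
#include <string.h>

/* Plain lexical order, similar to sorting section names by name. */
static int by_name(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Number of priority digits in a name shaped like "P_<digits>_...". */
static size_t prio_width(const char *name)
{
    return strspn(name + 2, "0123456789");
}

/* Shorter priorities first, then lexical order within each width:
 * this mirrors matching ?, then ??, then ???. */
static int by_width_then_name(const void *a, const void *b)
{
    const char *x = *(const char *const *)a;
    const char *y = *(const char *const *)b;
    size_t wx = prio_width(x);
    size_t wy = prio_width(y);

    if (wx != wy) {
        return (wx < wy) ? -1 : 1;
    }
    return strcmp(x, y);
}
```

Sorting the names for priorities 10, 5 and 40 with by_name puts 10 first, whereas by_width_then_name yields the numeric order 5, 10, 40.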
One more thing: why is the levels variable statically defined inside the function? We can make a
few guesses:
- No other part of the code needs to access it. Keeping it local reduces its scope and avoids cluttering the global symbol table.
- No stack allocation is needed. Instead, the compiler can store it in the .rodata segment.
Another possible reason would be to let link-time optimization (LTO) drop the array when
z_sys_init_run_level is never called (due to preprocessor macros and Kconfig definitions), but
that situation never occurs.
We can now clearly see how the levels array ties directly into the linker script. The
__init_*_start symbols are linker-defined pointers to arrays of init_entry structs, each placed in
its corresponding .z_init_* section at compile time and sorted by the linker.
At runtime, z_sys_init_run_level walks through each section for a given level and calls the
associated initialization functions. This design allows Zephyr to cleanly organize startup logic
into very well-defined stages.
In the next post, we’ll look at how z_cstart finishes the remaining setup, configures threads and
finally hands control off to main().