From Rust to AVR assembly: Dissecting a minimal blinky program

13 May 2025 Last updated: 14 May 2025

15 mins read

Unless you prefer to define registers, addresses, and toggle bits manually, the simplest Rust “Blinky” program for an Arduino Uno board consists of the following:

# Cargo.toml
[dependencies]
panic-halt = "^1"
# as of May 2025, this crate is not yet published on crates.io
arduino-hal = { git = "https://github.com/Rahix/avr-hal", features = ["arduino-uno"] }

// src/main.rs
#![no_std]
#![no_main]

use panic_halt as _;

#[arduino_hal::entry]
fn main() -> ! {
    let dp = arduino_hal::Peripherals::take().unwrap();
    let pins = arduino_hal::pins!(dp);
    let mut led = pins.d13.into_output();

    loop {
        led.set_high();
        arduino_hal::delay_ms(1000);
        led.set_low();
        arduino_hal::delay_ms(1000);
    }
}

In this post, we’ll dig into the Rust code concepts behind this program, inspect the generated assembly, and examine the compiled binary’s memory layout.

Understanding the code

If you’re familiar with the previous snippet, you can skip to the next section. Otherwise, let’s break it down:

no_std

This line tells the Rust compiler not to link the standard library (std). Instead, we rely solely on the core library, which is a lightweight subset designed for resource-constrained environments. Since we’re writing code for an embedded device with limited memory, we exclude the standard library, which is designed for general-purpose systems. This means we won’t have access to features like heap allocation, collections, threads, etc.

no_main

This line tells the compiler that we don’t want to use the default entry point for our program. Instead, we will define our own entry point using the #[arduino_hal::entry] attribute.

panic_halt

The usual panic behavior in Rust is to unwind the stack, and the machinery for that is part of the standard library. Since we’re no_std, we need to either write our own handler using the #[panic_handler] attribute or use a crate that provides one. panic-halt is such a crate: it provides a simple panic handler that halts the program when a panic occurs.

main

#[arduino_hal::entry] expands the function into some low-level boilerplate code that sets up the real entry point of the program; the details are outside the scope of this post.

Our main function returns !, indicating it never returns — which is typical for embedded programs.

The first three lines correspond to the setup function in an equivalent Arduino sketch, that is:

void setup() {
    pinMode(13, OUTPUT);
}

First, we take ownership of the peripherals on our board by calling Peripherals::take(). This returns a singleton that exposes all the accessible peripherals of this particular board (ADC, GPIOs, EEPROM, SPI, etc.). The singleton pattern leverages Rust’s ownership model to ensure the peripherals can only be taken once. Doing this early in the program is good practice, as it prevents conflicts with other parts of the code that may try to access the same peripherals.

Then, we use the pins! macro to get access to the pins of the microcontroller. This macro generates code that maps the pins to their corresponding registers and bit positions.

Depending on the hardware architecture, each pin can only be configured in a given set of modes. In our case, we configure pin 13 (mapped to the builtin LED on Arduino Uno boards) as an output.

loop

Unsurprisingly, at the end of our main function, we have a loop that toggles the LED with a 1s delay.

Inspecting the memory layout

Now that we understand the code, let’s inspect the compiled binary. The project was configured to strip symbols and to optimize for size (see the strip and opt-level fields in the Cargo.toml file and the corresponding documentation).

Before digging deeper, let’s see how much memory our program uses.

$ avr-size target/avr-none/debug/blinky.elf
   text	   data	    bss	    dec	    hex	filename
    304	      0	      1	    305	    131	target/avr-none/debug/blinky.elf

Our compiled code is only 304 bytes long (text section), which is pretty efficient compared to the same program written as an Arduino sketch compiled with arduino-cli:

$ arduino-cli compile -b arduino:avr:uno
...
$ avr-size blink.ino.elf
   text	   data	    bss	    dec	    hex	filename
    924	      0	      9	    933	    3a5	blink.ino.elf

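Although the sketch was built with the default configuration (which uses -Os to optimize for size), its footprint is still roughly 3 times larger than the Rust version. The build also passes -g to produce debugging information, but that information lives in separate debug sections that avr-size doesn’t count as text, so it is unlikely to explain the gap. A more likely explanation is the Arduino core runtime that gets linked into every sketch (C runtime initialization, the timer setup and interrupt handler behind delay() and millis(), and the pin lookup tables used by pinMode() and digitalWrite()), but I didn’t investigate further. If you happen to have the answer, please let me know.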

One noteworthy detail in the Rust program’s size breakdown is that the bss segment (used to store statically allocated variables) contains 1 byte of data, but our program doesn’t declare any static variables.

Looking further revealed the only static variable in our binary. Remember the .take() call on peripherals? How does it work exactly?

Under the hood, avr-device keeps a global static boolean flag that records whether the peripherals have already been taken. That’s the only runtime memory overhead in our program. We’ll see the full implementation of take() in the “Program logic” section.

pub(crate) static mut DEVICE_PERIPHERALS: bool = false;
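
To make the role of this flag more concrete, here is a minimal host-side sketch of the take-once pattern. It is not avr-device’s code: the TakeOnce type and the empty Peripherals struct are hypothetical stand-ins, and an AtomicBool replaces the static mut flag guarded by a critical section.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical guard that hands out its token at most once.
pub struct TakeOnce {
    taken: AtomicBool,
}

/// Hypothetical stand-in for the generated `Peripherals` struct.
#[derive(Debug, PartialEq)]
pub struct Peripherals;

impl TakeOnce {
    pub const fn new() -> Self {
        TakeOnce { taken: AtomicBool::new(false) }
    }

    /// Returns `Some` on the first call and `None` on every later call.
    pub fn take(&self) -> Option<Peripherals> {
        // `swap` sets the flag and returns its previous value in one atomic
        // step, which plays the role of the critical section on the AVR.
        if self.taken.swap(true, Ordering::AcqRel) {
            None
        } else {
            Some(Peripherals)
        }
    }
}

// One flag for the whole program, like DEVICE_PERIPHERALS.
static DEVICE: TakeOnce = TakeOnce::new();

fn main() {
    let first = DEVICE.take();
    let second = DEVICE.take();
    println!("first: {:?}, second: {:?}", first, second);
}
```

Whatever the implementation, the cost stays the same: a single byte of state, which is the byte that shows up in bss.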

Going one layer deeper

Let’s now focus on the text segment of our program, the real size of the compiled code. In our case, it is 304 bytes long.

To understand how things work, we’ll need to look at the assembly code generated by the compiler. For that, we can use avr-objdump to disassemble our binary:

# for convenience, we'll store the output in a file
$ avr-objdump -d target/avr-none/debug/blinky.elf > blinky.S

I tried the -S option to get the source code interleaved with the assembly instructions, but it didn’t yield any interesting results. I would be grateful if you could enlighten me.

We can first verify that the disassembled code has the same size as the text segment.

$ tail -n 5 blinky.S
 122:	0e 94 93 00 	call	0x126	;  0x126
 126:	0e 94 95 00 	call	0x12a	;  0x12a
 12a:	ff cf       	rjmp	.-2      	;  0x12a
 12c:	f8 94       	cli
 12e:	ff cf       	rjmp	.-2      	;  0x12e

According to avr-size, the text segment is 304 (0x130) bytes long. In the disassembled code, the last instruction is at address 0x12e and is 2 bytes long, so the code ends at 0x130, which gives the same result.
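
The check boils down to one addition. As a small sketch (text_size is a hypothetical helper, and it assumes the code starts at address 0, as in our listing):

```rust
/// Size in bytes of a code section that starts at address 0, given the
/// address and the length of its last instruction.
fn text_size(last_addr: u32, last_len: u32) -> u32 {
    last_addr + last_len
}

fn main() {
    // Last line of the listing: `12e: ff cf  rjmp .-2`, a 2-byte instruction.
    let size = text_size(0x12e, 2);
    println!("text size: {} bytes", size);
}
```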

In the rest of this post, we’ll try to break down the assembly code and understand which instructions the compiler generates, and why.

Startup

In embedded systems, the first piece of software to execute after a system reset is called the reset handler. Typically, it is in charge of setting up configuration data (e.g. initializing stack pointers) before calling user code.

With the current example, this is pretty straightforward. The datasheet lists 26 vectors in the interrupt vector table, the first one being the reset vector. Accordingly, the first 26 instructions of the disassembled code are jumps, one per vector, each 4 bytes long. The first one (the reset vector) jumps to 0x68, the start of our program, which sits right after the table (26 × 4 = 104 = 0x68). The next 25 jump to the interrupt handlers. However, they all point to the same address, 0x8c, which holds a single instruction: jmp 0x0. This makes sense: since our program doesn’t define any interrupt handler, every vector is left pointing at this default handler, which jumps back to address 0x0. In other words, if any interrupt were to fire, it would restart the program.

00000000 <.text>:
   0:	jmp	0x68	;  0x68
   4:	jmp	0x8c	;  0x8c
   8:	jmp	0x8c	;  0x8c
   c:	jmp	0x8c	;  0x8c
  10:	jmp	0x8c	;  0x8c
  14:	jmp	0x8c	;  0x8c
  ...
  64:	jmp	0x8c	;  0x8c
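
The layout of this table is regular enough to compute by hand. The sketch below does so under the assumptions visible in the listing: 26 slots, each holding one 4-byte jmp. The constant and function names are made up for illustration.

```rust
/// Number of interrupt vectors on the ATmega328P, as listed in the datasheet.
const VECTOR_COUNT: u16 = 26;
/// Each slot of the table holds one 4-byte `jmp` instruction.
const VECTOR_SIZE: u16 = 4;

/// Byte address of vector slot `index` (0 is the reset vector),
/// or `None` if the index is outside the table.
fn vector_address(index: u16) -> Option<u16> {
    if index < VECTOR_COUNT {
        Some(index * VECTOR_SIZE)
    } else {
        None
    }
}

/// Address of the first byte after the table.
fn table_end() -> u16 {
    VECTOR_COUNT * VECTOR_SIZE
}

fn main() {
    for index in [0u16, 1, 25] {
        if let Some(addr) = vector_address(index) {
            println!("vector {:2} -> {:#04x}", index, addr);
        }
    }
    println!("first address after the table: {:#04x}", table_end());
}
```

The last slot lands at 0x64 and the table ends at 0x68, which matches the first and last lines of the listing.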

For reference, defining an interrupt handler using avr-device would look like this:

// the attribute takes the target chip as an argument
#[avr_device::interrupt(atmega328p)]
fn USART_RX() {
    // ...
}

Initialization

The reset handler spans from 0x68 until 0x72.

  68:	eor	r1, r1
  6a:	out	0x3f, r1
  6c:	ldi	r28, 0xFF
  6e:	ldi	r29, 0x08
  70:	out	0x3e, r29
  72:	out	0x3d, r28

The first two instructions clear r1 by XORing it with itself, then write it to SREG (the AVR status register, at I/O address 0x3f), which clears SREG. The remaining instructions initialize the stack pointer registers: SPH (0x3e) receives 0x08 and SPL (0x3d) receives 0xFF, which sets the stack pointer to 0x08FF, i.e. the top of the ATmega328P’s 2 KB of SRAM (0x0100–0x08FF).
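
As a quick sanity check, the arithmetic can be sketched as follows. The helper names are hypothetical; the SRAM start address and size are the ones quoted above.

```rust
/// First SRAM address on the ATmega328P.
const SRAM_START: u16 = 0x0100;
/// SRAM size in bytes (2 KB).
const SRAM_SIZE: u16 = 2048;

/// Combine the high and low bytes of the stack pointer into one 16-bit value.
fn stack_pointer(high: u8, low: u8) -> u16 {
    (u16::from(high) << 8) | u16::from(low)
}

/// Last valid SRAM address, where the stack starts before growing downwards.
fn ram_end() -> u16 {
    SRAM_START + SRAM_SIZE - 1
}

fn main() {
    // Values loaded by the two `ldi` instructions: r29 = 0x08, r28 = 0xFF.
    let sp = stack_pointer(0x08, 0xFF);
    println!("SP = {:#06x}, end of SRAM = {:#06x}", sp, ram_end());
}
```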

Program logic

  90:	call	0x94
  94:	in	r24, 0x3f
  96:	cli
  98:	lds	r25, 0x0100
  9c:	cpi	r25, 0x01
  9e:	brne	.+2      
  a0:	rjmp	.+118    

This part corresponds to let dp = arduino_hal::Peripherals::take().unwrap();. Below is the generated source code for take:

pub fn take() -> Option<Self> {
    critical_section::with(|_| {
        if unsafe { DEVICE_PERIPHERALS } {
            return None;
        }
        Some(unsafe { Peripherals::steal() })
    })
}

Without going into too much detail, critical_section::with will save SREG in a temporary register, then disable interrupts (cli). Afterwards, DEVICE_PERIPHERALS is checked to see if the peripherals have already been taken. This reveals that DEVICE_PERIPHERALS is stored at 0x0100, that is at the start of SRAM addressable space.

Finally, if the flag is already set (r25 equals 0x01), take() returns None and the call to unwrap aborts the program: brne is not taken, and the rjmp with a relative offset of +118 jumps to the panic path at 0x118 (avr-objdump counts the offset from the end of the instruction, so 0xa0 + 2 + 118). Otherwise, brne .+2 skips the rjmp and execution continues at 0xa2.
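
Relative offsets are easy to misread, so here is a small sketch of how they resolve. It assumes avr-objdump’s notation, where `.+N` is counted from the end of the 2-byte branch instruction; branch_target is a hypothetical helper. The same arithmetic applies to any other relative jump in the listing.

```rust
/// Target address of a 2-byte relative branch located at `addr`, where
/// `offset` is the signed byte offset printed by avr-objdump after the dot.
fn branch_target(addr: u16, offset: i16) -> u16 {
    // The offset is relative to the address of the next instruction.
    addr.wrapping_add(2).wrapping_add(offset as u16)
}

fn main() {
    // `9e: brne .+2` skips exactly one 2-byte instruction.
    println!("brne at 0x9e  -> {:#x}", branch_target(0x9e, 2));
    // `12a: rjmp .-2` jumps to itself, i.e. an infinite loop.
    println!("rjmp at 0x12a -> {:#x}", branch_target(0x12a, -2));
}
```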

  a2:	ldi	r25, 0x01
  a4:	sts	0x0100, r25
  a8:	out	0x3f, r24

When the program continues, the next instructions simply take care of setting DEVICE_PERIPHERALS, and restoring the value of SREG.

  aa:	cbi	0x05, 5
  ac:	sbi	0x04, 5

What comes next is more interesting.

cbi (Clear Bit in I/O Register) and sbi (Set Bit in I/O Register) respectively clear bit 5 in PORTB (Port B Data Register, I/O address 0x05) and set bit 5 in DDRB (Port B Data Direction Register, I/O address 0x04). Since we use the builtin LED, we know it’s mapped to pin 5 on port B (PB5).

As a result, the two instructions perform the following:

Turn off PB5: the pin state is reset before its mode (input/output) changes, to avoid glitches on the I/O pin.
Set PB5 mode to output.
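
In terms of plain bit manipulation, the two steps above can be modeled on a host machine as follows. This is only a sketch: the registers are ordinary u8 values rather than memory-mapped I/O, and the sbi and cbi functions are stand-ins named after the instructions.

```rust
/// Model of `sbi`: return `reg` with bit number `bit` (0..=7) set.
fn sbi(reg: u8, bit: u8) -> u8 {
    reg | (1u8 << bit)
}

/// Model of `cbi`: return `reg` with bit number `bit` (0..=7) cleared.
fn cbi(reg: u8, bit: u8) -> u8 {
    reg & !(1u8 << bit)
}

/// The builtin LED of the Arduino Uno is wired to bit 5 of port B.
const PB5: u8 = 5;

fn main() {
    // Pretend PB5 happened to be high before the setup code runs.
    let portb = cbi(0b0010_0000, PB5);
    // All pins start as inputs; PB5 is then switched to output.
    let ddrb = sbi(0b0000_0000, PB5);
    println!("PORTB = {:#010b}, DDRB = {:#010b}", portb, ddrb);
}
```

On the real chip, each of these is a single instruction that touches only the addressed bit, so no read-modify-write sequence appears in the disassembly.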

Starting at address 0xae, we enter the infinite loop: turn the LED on, wait 1s, turn it off, wait another second.

We already know how to identify the instructions setting or clearing bit 5 in PORTB register. These are sbi 0x05, 5 and cbi 0x05, 5.

Apart from that, there are a few observations we can make with regards to the delay_ms calls:

The delay consists of a busy-loop of 21 instructions.
Although we’re calling a dedicated function, the generated assembly gets duplicated. This is likely due to inlining and link-time optimization (LTO) being enabled. This behavior — and how to control or inspect inlining and LTO effects in embedded Rust — could be an interesting topic for a future post.

Bonus: Coming Full Circle with `led.toggle()`

Ironically, this whole journey started with the toggle() method. It was the first version of the code I wrote — elegant, compact, and exactly what you’d expect for blinking an LED:

loop {
    // led.set_high();
    // arduino_hal::delay_ms(1000);
    // led.set_low();
    // arduino_hal::delay_ms(1000);

    led.toggle();
    arduino_hal::delay_ms(1000);
}

But when I dug into the disassembly to understand what was really happening under the hood, things got murky. The output wasn’t as obvious as I’d hoped. I couldn’t clearly trace how toggle() translated to actual AVR assembly code, especially compared to the more straightforward set_high() and set_low(), which map cleanly to sbi and cbi instructions.

So, I took a step back and rewrote the loop using set_high() and set_low(), a more verbose but transparent approach. That led to the post you have read so far.

But near the end of writing this post, curiosity brought me back to toggle(). That’s when I noticed something remarkable in the datasheet:

“Writing a logic one to a bit in the PINx register will result in a toggle in the corresponding bit in the data register.”

I later also came across a curious thread on the same topic, created in 2017 on the Arduino forum.

That’s the key! The avr-hal-generic crate writes directly to the PINx register, so the toggle compiles down to a single I/O write: typically sbi PINx, n or, as in our disassembly, an out with the mask preloaded in a register. No need for read-modify-write. No critical section. Just one atomic operation, since the ATmega328P supports this atomic toggle.

Here’s the relevant snippet from avr-hal-generic:

// https://github.com/Rahix/avr-hal/blob/7583be5f230cb238f5b518033bcfe78116e063d7/avr-hal-generic/src/port.rs#L651-L670
#[inline]
unsafe fn out_toggle(&mut self) {
    match self.port {
        $(DynamicPort::[<PORT $name>] => {
            if $chip_supports_atomic_toggle {
                (*<$port>::ptr()).[<pin $name:lower>].write(|w| {
                    w.bits(self.mask)
                })
            } else {
                // ...
            }
        },)+
    }
}

Looking at the disassembled code, we can indeed identify the following instructions:

ldi r17, 0x20
...
out 0x03, r17

PINB is written with 0x20 (0b00100000, i.e. only bit 5 set), confirming the expected behavior.
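
The datasheet behavior quoted earlier can be modeled in a few lines. This is a host-side sketch only: the Port struct and its write_pin method are hypothetical, and a real write goes to a memory-mapped register rather than a struct field.

```rust
/// Hypothetical model of the output latch of one AVR port.
struct Port {
    /// Current value of the PORTx data register.
    port: u8,
}

impl Port {
    /// Model of a write to PINx: every 1 bit in `mask` flips the matching
    /// bit of PORTx, and every 0 bit leaves it untouched.
    fn write_pin(&mut self, mask: u8) {
        self.port ^= mask;
    }
}

fn main() {
    let mut port_b = Port { port: 0 };
    let mask = 1u8 << 5; // 0x20, the value loaded into r17

    port_b.write_pin(mask); // LED on
    println!("after first write:  {:#04x}", port_b.port);
    port_b.write_pin(mask); // LED off
    println!("after second write: {:#04x}", port_b.port);
}
```

Because the mask never changes, it can be loaded once outside the loop, which would explain why only the out instruction repeats on each iteration.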

So, in the end, led.toggle() isn’t just syntactic sugar — it’s the most efficient way to blink an LED on the ATmega328P.

One line of Rust. One machine instruction. And a full-circle moment of discovery that made the detour completely worth it.

Some takeaways

Writing embedded Rust may seem intimidating at first, especially when the abstractions hide low-level implementation details that are often buried in the datasheet.

But that’s also where it gets fun. By digging into a simple blinky program, we’ve explored how Rust compiles down to AVR assembly, how memory is laid out, how initialization works behind the scenes, and how even a single one-byte static variable can play a crucial role in ensuring safe access to hardware.

Taking the detour through set_high() and set_low() only to return to toggle() and uncover how it leverages a lesser-known AVR feature was really both stimulating and a great eye-opener to what happens under the hood.

References

ATmega328p datasheet: https://ww1.microchip.com/downloads/en/DeviceDoc/Atmel-7810-Automotive-Microcontrollers-ATmega328P_Datasheet.pdf
AVR instruction set manual: https://ww1.microchip.com/downloads/en/devicedoc/AVR-Instruction-Set-Manual-DS40002198A.pdf
Arduino HAL crate: https://github.com/Rahix/avr-hal
AVR microcontrollers peripheral access crate: https://github.com/Rahix/avr-device
Full disassembled code (set_high/set_low version): https://gist.github.com/n-eq/bfdd7a2d1cb0c7b567ca9650983f85c8