CodingLad

How to Convert Load and Store Instructions to Machine Code in ARM

This post explains how to convert ARM load and store instructions into machine code. ARM assembly is a low-level language that maps directly onto the ARM architecture, and load and store instructions are the part of it that moves data between registers and memory. Knowing how they are encoded is useful for anyone working with ARM processors, especially in embedded systems or performance-critical applications. Converting one of these instructions into machine code comes down to three things: the 32-bit instruction format, the meaning of each bit field, and the way each addressing mode is encoded.

Instruction Format Overview

ARM load/store instructions follow this general bit-field layout (32-bit):

| Field | Bits | Description |
|---|---|---|
| cond | 31–28 | Condition: determines whether the instruction executes. |
| op | 27–26 | Opcode bits: identify the instruction type (01 for load/store). |
| I | 25 | Immediate bit: 0 if the offset is a 12-bit immediate, 1 if it is a (possibly shifted) register. This is the opposite of the I bit in data-processing instructions. |
| p | 24 | Pre/Post indexing: 1 for offset or pre-indexed addressing, 0 for post-indexed addressing. |
| u | 23 | Up/Down bit: 1 to add the offset to the base, 0 to subtract it. |
| b | 22 | Byte/Word: 1 for byte access (LDRB/STRB), 0 for word access. |
| w | 21 | Write-back: 1 if the base register is updated in pre-indexed addressing (the ! suffix). |
| L | 20 | Load/Store: 1 for load (LDR), 0 for store (STR). |
| Rn | 19–16 | Base register. |
| Rd | 15–12 | Destination (for LDR) or source (for STR) register. |
| offset | 11–0 | The offset: either a 12-bit unsigned immediate or a register offset (possibly with shift). |

Summary of the fields:

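| 31–28 | 27–26 | 25 | 24 | 23 | 22 | 21 | 20 | 19–16 | 15–12 | 11–0 |
|---|---|---|---|---|---|---|---|---|---|---|
| cond | op | I | p | u | b | w | L | Rn | Rd | offset |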
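The layout above can be turned into a small packing routine. The following is a minimal sketch rather than a full assembler: the function name pack_load_store and its parameter names are chosen here for illustration, and it performs no validation beyond masking each value to its field width.

```python
def pack_load_store(cond, i, p, u, b, w, l, rn, rd, offset):
    """Pack load/store fields into a 32-bit word, following the bit layout above."""
    op = 0b01  # bits 27-26: load/store instruction class
    return (
        (cond & 0xF) << 28
        | op << 26
        | (i & 1) << 25
        | (p & 1) << 24
        | (u & 1) << 23
        | (b & 1) << 22
        | (w & 1) << 21
        | (l & 1) << 20
        | (rn & 0xF) << 16
        | (rd & 0xF) << 12
        | (offset & 0xFFF)
    )


# LDR R2, [R0, #4]: always execute, immediate offset (I = 0), pre-indexed,
# add the offset, word access, no write-back, load.
word = pack_load_store(cond=0b1110, i=0, p=1, u=1, b=0, w=0, l=1,
                       rn=0, rd=2, offset=4)
print(f"{word:#010x}")
```

Each field is shifted to its starting bit and the results are OR-ed together, which is all that encoding amounts to once the field values are known.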

Addressing Modes in ARM Load/Store Instructions

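| Addressing Mode | Assembly Mnemonic | Effective Address | Final Value in R1 |
|---|---|---|---|
| Offset (pre-indexed, base unchanged) | LDR R0, [R1, #x] | R1 + x | R1 |
| Pre-indexed with write-back (base updated) | LDR R0, [R1, #x]! | R1 + x | R1 + x |
| Post-indexed (base always updated) | LDR R0, [R1], #x | R1 | R1 + x |

Post-indexed addressing always writes the new address back to the base register, so it takes no ! suffix; LDR R0, [R1], #x! is not valid syntax.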
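The difference between the addressing modes is easiest to see by computing both results side by side. This is a minimal sketch that models only the address arithmetic; the function name access and the mode labels "offset", "pre", and "post" are made up for this illustration.

```python
def access(base, offset, mode):
    """Return (effective_address, final_base) for one load/store.

    mode is a label used only in this sketch:
      "offset" -> LDR R0, [R1, #x]
      "pre"    -> LDR R0, [R1, #x]!
      "post"   -> LDR R0, [R1], #x
    """
    if mode == "offset":
        return base + offset, base           # address uses offset, base untouched
    if mode == "pre":
        return base + offset, base + offset  # address uses offset, base updated
    if mode == "post":
        return base, base + offset           # address is the old base, base updated
    raise ValueError(f"unknown mode: {mode}")


for mode in ("offset", "pre", "post"):
    address, final_r1 = access(0x1000, 4, mode)
    print(f"{mode:6s} address={address:#x} final R1={final_r1:#x}")
```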

Example 1: Encoding LDR R2, [R0, #4]

This instruction loads a word from memory into R2. The address is calculated as the sum of R0 and an immediate offset of 4.

Breakdown:

  • cond (31–28): 1110 (Always execute)
  • op (27–26): 01 (the opcode value that distinguishes load/store from data-processing instructions)
  • I (25): 0 (Immediate offset; for load/store instructions, I = 1 selects a register offset)
  • p (24): 1 (Pre-indexed addressing)
  • u (23): 1 (Positive offset, since #4 is positive)
  • b (22): 0 (Word access)
  • w (21): 0 (No write-back since no exclamation mark is used)
  • L (20): 1 (Load instruction)
  • Rn (19–16): Base register R0 → 0000
  • Rd (15–12): Destination register R2 → 0010
  • offset (11–0): Immediate 4 → 000000000100

| 31–28 | 27–26 | 25 | 24 | 23 | 22 | 21 | 20 | 19–16 | 15–12 | 11–0 |
|---|---|---|---|---|---|---|---|---|---|---|
| cond | op | I | p | u | b | w | L | Rn (R0) | Rd (R2) | offset |
| 1110 | 01 | 0 | 1 | 1 | 0 | 0 | 1 | 0000 | 0010 | 000000000100 |

Full 32-bit Binary:

1110 0101 1001 0000 0010 0000 0000 0100

This corresponds to the hexadecimal value:
0xE5902004
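One way to check an encoding is to run it backwards and confirm that each field comes out as expected. The decoder below is a minimal sketch; the name decode_load_store is hypothetical, and the input 0xE5902004 is the word a standard ARM assembler produces for LDR R2, [R0, #4] (I = 0 for an immediate offset).

```python
def decode_load_store(word):
    """Split a 32-bit load/store word into its bit fields."""
    return {
        "cond": (word >> 28) & 0xF,
        "op": (word >> 26) & 0x3,
        "I": (word >> 25) & 1,
        "p": (word >> 24) & 1,
        "u": (word >> 23) & 1,
        "b": (word >> 22) & 1,
        "w": (word >> 21) & 1,
        "L": (word >> 20) & 1,
        "Rn": (word >> 16) & 0xF,
        "Rd": (word >> 12) & 0xF,
        "offset": word & 0xFFF,
    }


fields = decode_load_store(0xE5902004)
for name, value in fields.items():
    print(f"{name:6s} = {value:b}")
```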

Example 2: Encoding LDRB R2, [R0, #4]!

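The LDRB instruction loads a byte from memory. Compared to Example 1, the main differences are:

  1. The b (22) bit is set to 1 (byte access instead of word access).
  2. The w (21) bit is set to 1 (write-back enabled due to !).
  3. Because of the write-back, R0 is updated to R0 + 4, the address the byte is loaded from.

All other fields are unchanged, including I = 0 for the immediate offset.

| 31–28 | 27–26 | 25 | 24 | 23 | 22 | 21 | 20 | 19–16 | 15–12 | 11–0 |
|---|---|---|---|---|---|---|---|---|---|---|
| cond | op | I | p | u | b | w | L | Rn (R0) | Rd (R2) | offset |
| 1110 | 01 | 0 | 1 | 1 | 1 | 1 | 1 | 0000 | 0010 | 000000000100 |

This corresponds to the hexadecimal value 0xE5F02004.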
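Since only two bits differ from Example 1, the new word can also be derived by setting those bits in the old one. A short sketch, assuming 0xE5902004 as the encoding of LDR R2, [R0, #4]; the constant names are illustrative.

```python
B_BIT = 1 << 22  # byte access
W_BIT = 1 << 21  # write-back

ldr_word = 0xE5902004                 # LDR  R2, [R0, #4]
ldrb_wb_word = ldr_word | B_BIT | W_BIT  # LDRB R2, [R0, #4]!

print(f"{ldrb_wb_word:#010x}")
```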

Example 3: STRB R2, [R0, #4]!

For a store byte instruction with write-back, the fields change as follows:

  • l (20): 0 (Store operation)
  • b (22): 1 (Byte access)
  • w (21): 1 (Write-back, indicated by the exclamation mark)

Other fields:

  • Rn: R0 → 0000
  • Rd: R2 → 0010
  • offset: 4 → 000000000100
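
The offset is an immediate, so I = 0; p and u stay at 1 as in the previous examples.

| 31–28 | 27–26 | 25 | 24 | 23 | 22 | 21 | 20 | 19–16 | 15–12 | 11–0 |
|---|---|---|---|---|---|---|---|---|---|---|
| cond | op | I | p | u | b | w | L | Rn (R0) | Rd (R2) | offset |
| 1110 | 01 | 0 | 1 | 1 | 1 | 1 | 0 | 0000 | 0010 | 000000000100 |

This corresponds to the hexadecimal value 0xE5E02004.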
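The mnemonic itself decides two of the bits: LDR or STR selects L, and the B suffix selects b. The sketch below makes that mapping explicit. The table MNEMONIC_BITS and the function encode_imm_pre_indexed are hypothetical names, and the function only covers non-negative immediate offsets with offset or pre-indexed addressing.

```python
# (L, b) for each mnemonic: L = 1 for loads, b = 1 for byte access.
MNEMONIC_BITS = {
    "LDR": (1, 0),
    "LDRB": (1, 1),
    "STR": (0, 0),
    "STRB": (0, 1),
}


def encode_imm_pre_indexed(mnemonic, rd, rn, imm, writeback=False):
    """Encode e.g. STRB Rd, [Rn, #imm]! for 0 <= imm <= 4095."""
    if not 0 <= imm <= 0xFFF:
        raise ValueError("this sketch only handles offsets 0..4095")
    l, b = MNEMONIC_BITS[mnemonic.upper()]
    w = 1 if writeback else 0
    base = 0xE5800000  # cond = 1110, op = 01, I = 0, p = 1, u = 1
    return base | b << 22 | w << 21 | l << 20 | rn << 16 | rd << 12 | imm


print(f"{encode_imm_pre_indexed('STRB', 2, 0, 4, writeback=True):#010x}")
```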

Example 4: LDRB R2, [R0, #-4]

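When the offset is negative (here, -4), the u bit (bit 23) is set to 0 to indicate subtraction from the base register. The offset field still holds the magnitude 4 as an unsigned value; it is not stored in two's complement.

  • u (23): 0 for a negative offset.
  • b (22): 1 (Byte access)
  • w (21): 0 (No write-back)
  • I (25): 0 (Immediate offset)

| 31–28 | 27–26 | 25 | 24 | 23 | 22 | 21 | 20 | 19–16 | 15–12 | 11–0 |
|---|---|---|---|---|---|---|---|---|---|---|
| cond | op | I | p | u | b | w | L | Rn (R0) | Rd (R2) | offset |
| 1110 | 01 | 0 | 1 | 0 | 1 | 0 | 1 | 0000 | 0010 | 000000000100 |

This corresponds to the hexadecimal value 0xE5502004.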
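Handling the sign therefore means splitting the offset into a direction bit and a magnitude. A minimal sketch, with split_offset as an illustrative name; it treats 0 as a positive offset and rejects anything that does not fit in 12 bits.

```python
def split_offset(imm):
    """Return (u, magnitude) for a signed immediate offset."""
    if not -4095 <= imm <= 4095:
        raise ValueError("immediate offset must fit in 12 bits")
    u = 1 if imm >= 0 else 0  # 1 = add to base, 0 = subtract from base
    return u, abs(imm)


# LDRB R2, [R0, #-4]: cond = 1110, op = 01, I = 0, p = 1, b = 1, w = 0, L = 1,
# Rn = R0, Rd = R2. The u bit and the offset are filled in from split_offset.
u, magnitude = split_offset(-4)
word = 0xE5502000 | u << 23 | magnitude
print(f"{word:#010x}")
```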

Example 5: LDR R2, [R0, R1, LSL #3]!

This example shows using a register offset with a shift. Here, the offset is not an immediate value but a shifted register.

Breakdown of the Offset Field:

For register-offset addressing, the I bit (25) is set to 1 and the 12-bit offset field no longer holds an immediate value. Instead, it is divided into a shift field and a register number:

  • Shift amount (bits 11–7): a 5-bit immediate. For #3, that is 00011.
  • Shift type (bits 6–5): 00 for LSL, 01 for LSR, 10 for ASR, 11 for ROR. For LSL, this is 00.
  • Bit 4: always 0 here. Unlike data-processing instructions, load/store instructions cannot take the shift amount from a register, so the shift amount is always an immediate.
  • Rm (bits 3–0): the offset register. R1 → 0001.

The remaining fields are encoded as in the earlier examples: Rn is the base register R0 → 0000, Rd is the destination register R2 → 0010, and the exclamation mark (!) sets w (write-back) to 1.
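The 12-bit register-offset field can be assembled from its three parts as follows. This is a minimal sketch: pack_register_offset and SHIFT_TYPES are names invented for illustration, and the special encodings that reuse a shift amount of 0 (such as LSR #32, ASR #32, and RRX) are deliberately rejected rather than handled.

```python
SHIFT_TYPES = {"LSL": 0b00, "LSR": 0b01, "ASR": 0b10, "ROR": 0b11}


def pack_register_offset(rm, shift="LSL", amount=0):
    """Build bits 11-0 for a register offset with an immediate shift."""
    if not 0 <= rm <= 15:
        raise ValueError("Rm must be R0..R15")
    if not 0 <= amount <= 31:
        raise ValueError("shift amount must be 0..31")
    if amount == 0 and shift != "LSL":
        # Amount 0 with LSR/ASR/ROR has a special meaning; not covered here.
        raise ValueError("shift amount 0 is only supported for LSL in this sketch")
    # Bit 4 stays 0: the shift amount is always an immediate for load/store.
    return amount << 7 | SHIFT_TYPES[shift] << 5 | rm


offset = pack_register_offset(rm=1, shift="LSL", amount=3)  # R1, LSL #3
print(f"{offset:012b}")
```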

Putting it all together, with write-back (w = 1) and a load (L = 1), the fields become:

  • cond (31–28): 1110 (Always execute)
  • op (27–26): 01 (Load/store)
  • I (25): 1 (The offset is a shifted register rather than an immediate)
  • p (24): 1 for pre-indexing
  • u (23): 1 (Positive offset)
  • b (22): 0 (Word access, not byte)
  • w (21): 1 (Write-back due to the exclamation mark)
  • L (20): 1 (Load instruction)
  • Rn (19–16): R0 → 0000
  • Rd (15–12): R2 → 0010
  • Offset (11–0): Encodes the register offset with shift:
    • Shift amount (bits 11–7): 00011 (for #3)
    • Shift type (bits 6–5): 00 (LSL)
    • Bit 4: 0 (immediate shift amount)
    • Rm (bits 3–0): R1 → 0001

More details on shifting operations: Instruction Encoding with Shift Operations

LDR R2, [R0, R1, LSL #3]!

| 31–28 | 27–26 | 25 | 24 | 23 | 22 | 21 | 20 | 19–16 | 15–12 | 11–0 |
|---|---|---|---|---|---|---|---|---|---|---|
| cond | op | I | p | u | b | w | L | Rn (R0) | Rd (R2) | offset |
| 1110 | 01 | 1 | 1 | 1 | 0 | 1 | 1 | 0000 | 0010 | 000110000001 |

Full 32-bit binary: 1110 0111 1011 0000 0010 0001 1000 0001, which corresponds to the hexadecimal value 0xE7B02181.
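When checking an encoding by hand, it helps to print the word grouped by field instead of in nibbles. The helper below is a small sketch; group_bits and FIELD_WIDTHS are illustrative names, and 0xE7B02181 is the encoding of LDR R2, [R0, R1, LSL #3]! with I = 1 and Rm = R1.

```python
# Field names and widths, from bit 31 down to bit 0.
FIELD_WIDTHS = [
    ("cond", 4), ("op", 2), ("I", 1), ("p", 1), ("u", 1), ("b", 1),
    ("w", 1), ("L", 1), ("Rn", 4), ("Rd", 4), ("offset", 12),
]


def group_bits(word):
    """Return the 32-bit word as a binary string split at field boundaries."""
    bits = f"{word:032b}"
    groups = []
    position = 0
    for _name, width in FIELD_WIDTHS:
        groups.append(bits[position:position + width])
        position += width
    return " ".join(groups)


print(group_bits(0xE7B02181))
```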

Conclusion

In this post, we’ve covered the ARM load/store instruction format and demonstrated how different instructions are encoded:

  • LDR R2, [R0, #4] uses an immediate positive offset (I = 0, u = 1) without write-back.
  • LDRB R2, [R0, #4]! changes the byte/word bit (b) to 1 and enables write-back (w = 1).
  • STRB R2, [R0, #4]! keeps b and w set and clears the load/store bit (L = 0).
  • LDRB R2, [R0, #-4] uses a negative offset by setting u to 0 while the offset field holds the magnitude 4.
  • LDR R2, [R0, R1, LSL #3]! uses a shifted register offset (I = 1), packing the shift amount, shift type, and Rm into the offset field, with write-back enabled.

Understanding these bit fields and how they are arranged in the 32-bit machine code can help you better grasp how ARM processors access memory and manage data transfers. Experiment with these examples to get a deeper insight into ARM’s low-level operations. 🚀
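To experiment with the five examples, the rules from this post can be combined into a very small assembler. This is a sketch under stated assumptions, not a complete tool: the assemble function and its regular expression are made up for this illustration, it accepts only the offset and pre-indexed syntax used above, always uses the "always execute" condition, treats register offsets as positive, and does no range checking.

```python
import re

SHIFTS = {"LSL": 0b00, "LSR": 0b01, "ASR": 0b10, "ROR": 0b11}

# Matches e.g. "LDRB R2, [R0, #-4]" or "LDR R2, [R0, R1, LSL #3]!".
PATTERN = re.compile(
    r"^(LDR|STR)(B?)\s+R(\d+),\s*\[R(\d+),\s*"
    r"(?:#(-?\d+)|R(\d+)(?:,\s*(LSL|LSR|ASR|ROR)\s+#(\d+))?)\](!?)$",
    re.IGNORECASE,
)


def assemble(text):
    """Encode one pre-indexed LDR/STR/LDRB/STRB instruction as a 32-bit word."""
    match = PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unsupported syntax: {text!r}")
    op, byte, rd, rn, imm, rm, shift, amount, bang = match.groups()
    l = 1 if op.upper() == "LDR" else 0
    b = 1 if byte else 0
    w = 1 if bang else 0
    if imm is not None:
        # Immediate offset: I = 0, sign goes to u, magnitude to bits 11-0.
        value = int(imm)
        i, u, offset = 0, int(value >= 0), abs(value)
    else:
        # Register offset: I = 1, bits 11-0 hold shift amount, type, and Rm.
        i, u = 1, 1
        shift_type = SHIFTS[(shift or "LSL").upper()]
        offset = int(amount or 0) << 7 | shift_type << 5 | int(rm)
    return (0b1110 << 28 | 0b01 << 26 | i << 25 | 1 << 24 | u << 23
            | b << 22 | w << 21 | l << 20
            | int(rn) << 16 | int(rd) << 12 | offset)


for line in ("LDR R2, [R0, #4]", "LDRB R2, [R0, #4]!", "STRB R2, [R0, #4]!",
             "LDRB R2, [R0, #-4]", "LDR R2, [R0, R1, LSL #3]!"):
    print(f"{line:28s} {assemble(line):#010x}")
```

Comparing the printed words against the output of a real assembler or disassembler is a good way to confirm each field.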