Adding two numbers in macOS x86-64 Assembly - Part 2
This is the second part of the series “My first x86-64 assembly in macOS”. Please read the first part to see assembly language programming from the perspective of a total beginner. I thought adding two numbers would be a good next step in learning assembly, since it covers declaring variables (kind of), accessing them, adding them, and printing the result. This post follows the same spirit of exploring findings and observations as a total beginner. If any information here seems wrong, please email me with an explanation so I can understand and correct it. We explored most of the concepts in the first part of the series, but I will briefly recap them here too.
I am using an Intel-based Mac running Catalina. When ARM-based Macs come out, this series may need to be read differently. I am using AT&T syntax throughout the series.
Important Note:
The following assembly code may not be optimised. It calculates the sum of two numbers purely as a way to learn the following concepts:
- Declare and initialise variables in assembly.
- Access variables in assembly.
- Perform addition on variables in assembly.
- Print the sum of variables in assembly.
Let’s start
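.data
a: .long 4 ## first operand (32-bit)
b: .long 6 ## second operand (32-bit)
sum: .long 0 ## result, initially 0
str: .asciz "Sum: %d\n" ## NUL-terminated format string for printf
.section __TEXT,__text
.globl _main ## -- Begin function main
_main: ## @main
pushq %rbp ## save the caller's base pointer
movq %rsp, %rbp ## start our own stack frame
subq $32, %rsp ## reserve 32 bytes for locals
movl sum(%rip), %esi ## load the value of sum
movl %esi, -4(%rbp) ## keep it in the frame
movl %edi, -8(%rbp) ## store %edi (argc on entry); not used later
movq %rsi, -16(%rbp) ## store %rsi; not used later
movl a(%rip), %esi ## load a
movl %esi, -20(%rbp)
movl b(%rip), %esi ## load b
movl %esi, -24(%rbp)
movl -20(%rbp), %eax ## %eax = a
addl -24(%rbp), %eax ## %eax = a + b
movl %eax, -28(%rbp) ## save the result
movl -28(%rbp), %esi ## second printf argument: the sum
leaq str(%rip), %rdi ## first printf argument: address of str
xorl %eax, %eax ## variadic call: %al = number of vector registers used (0)
callq _printf
xorl %eax, %eax ## return 0 from main
addq $32, %rsp
popq %rbp
retq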
We start by declaring three variables: a, b and sum. We declare them under the .data directive.
.data
This is equivalent to .section __DATA, __data
The compiler places all non-const initialized data (even initialized to zero) in this section.
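Each .long reserves a 32-bit value. The str label holds a NUL-terminated format string (.asciz) that we later pass to printf. In C the call would be:
printf("Sum: %d\n", sum);
and the output looks like:
Sum: <sum value>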
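For readers who think in C first, the .data block above corresponds roughly to four global definitions. This is only a sketch: it assumes int is 32 bits wide (matching .long on this platform), and the helper name print_sum is made up for illustration and is not part of the original program.

```c
#include <stdio.h>

/* Rough C counterparts of the labels in the .data section.
   Assumption: int is 32 bits wide, like .long on x86-64 macOS. */
int a = 4;
int b = 6;
int sum = 0;

/* Non-const on purpose: the assembly keeps str in the writable data
   section. .asciz appends the trailing NUL byte, just as C does. */
char str[] = "Sum: %d\n";

/* Hypothetical helper: prints the current value of sum and returns
   the number of characters printf reports as written. */
int print_sum(void) {
    return printf(str, sum);
}
```

Calling print_sum() before any addition has happened prints the initial value of sum, which is 0.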
Next we set up our stack frame with the first three instructions of _main (pushq, movq and subq).

After setting up the stack and base pointer, we start loading our variables into registers. My first attempt was to access sum directly, but it gave the following error:
error: 32-bit absolute addressing is not supported in 64-bit mode
movl sum, %esi
From the error I assumed that I could not refer to sum by its absolute address. I found out I should access it with an offset instead. The next question was: offset from which register? I checked the register documentation again (many things bounced off my head), but the instruction pointer register caught my attention.
%rip is the instruction pointer. It holds the address of the next instruction to execute. What I have found so far is that the variables we declared in the .data section can be addressed relative to this register: the assembler and linker work out the distance from the instruction to the variable and store it as an offset. We write this as sum(%rip), which Intel syntax would write as [rip + sum]. Please note that the [] brackets are Intel syntax and are not accepted in AT&T mode, at least on my Mac, because I stumbled on this error:
error: brackets expression not supported on this target
movl [%rip + sum], %esi
So the instruction that worked for me to access the variable is:
movl sum(%rip), %esi
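The instruction above reads the 32-bit value stored at sum and copies it into %esi. It copies the value, not the address. Why %esi? From the perspective of a noob, it was simply the first compatible operand I found in the documentation; any other 32-bit general-purpose register would work for the load as well.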
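To make the sum(%rip) form less mysterious, here is a small C model of how a RIP-relative operand is resolved: the CPU adds a signed 32-bit displacement to the address of the next instruction and loads from there. Everything in it is invented for illustration (the byte array standing in for memory, the offsets 64, 68 and 72, and the names load_rip_relative and demo_load). It sketches the addressing arithmetic only, not how the assembler or linker is implemented.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Model of a RIP-relative 32-bit load.
   memory   : hypothetical flat image holding both code and data
   next_rip : offset of the instruction that follows the movl
   disp     : signed displacement stored inside the instruction */
int32_t load_rip_relative(const uint8_t *memory, size_t next_rip, int32_t disp) {
    int32_t value;
    /* effective address = address of next instruction + displacement */
    memcpy(&value, memory + next_rip + disp, sizeof value);
    return value;
}

/* Hypothetical layout mirroring the .data order:
   a at offset 64, b at offset 68, sum at offset 72. */
int32_t demo_load(size_t next_rip, size_t target) {
    uint8_t memory[128] = {0};
    const int32_t values[3] = {4, 6, 0};   /* a, b, sum */
    memcpy(memory + 64, values, sizeof values);

    /* The displacement is the distance from the next instruction
       to the variable; it is negative if the data comes first. */
    int32_t disp = (int32_t)((ptrdiff_t)target - (ptrdiff_t)next_rip);
    return load_rip_relative(memory, next_rip, disp);
}
```

In this model, demo_load(16, 68) returns 6, the value of b, and the same variable is still found when the instruction sits after the data, as in demo_load(100, 68), because only the distance changes.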
Now we have copied the value of sum (initially 0) into %esi. Next we store it in our stack frame, at an offset from the base pointer register. We reserve 4 bytes in the stack frame for this long (32-bit) value:
movl %esi, -4(%rbp)
We are moving %esi, which contains the value of sum, to the slot 4 bytes below the base pointer. We will use this slot later to hold the result of the addition, since it is now available in our stack frame. I did wonder why I couldn’t simply move variable a straight from its %rip-relative location to the stack, instead of moving it to the source index register (%esi) first. Why the longer route? Because x86 mov instructions accept at most one memory operand, so a memory-to-memory move is rejected:
error: invalid operand for instruction
movl a(%rip), -8(%rbp)
Next we similarly move our a and b variables to the stack frame:
movl a(%rip), %esi
movl %esi, -8(%rbp)
movl b(%rip), %esi
movl %esi, -12(%rbp)
Since the first 4 bytes were reserved for sum, we reserve the next two 4-byte slots for a and b, at 8 and 12 bytes below the base pointer respectively. (The full listing at the top stores a and b at -20 and -24 instead; the idea is the same.)
All our variables are now in the stack frame, so we can do the addition. The add instruction works with any general-purpose register, but for our addition we use %eax, the accumulator register, which is conventionally used for arithmetic and also carries a function's return value. The documentation wasn’t much clearer than that for me.
We move variable a from the stack frame into %eax. Remember that a was stored 8 bytes below the base pointer:
movl -8(%rbp), %eax
Now we use the add instruction, addl -12(%rbp), %eax, which reads variable b from 12 bytes below the base pointer and adds it to the current value of %eax. The result is left in %eax itself, not in memory that %eax points to. Now we move the result from %eax to the stack (4 bytes below the base pointer):
movl %eax, -4(%rbp)
If we remember, the slot 4 bytes below the base pointer is where we stored sum, so we are storing the result back into sum on the stack. Then we move the sum value back into %esi, because printf expects its second argument there:
movl -4(%rbp), %esi
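The store, load, add and store-back sequence from the walkthrough can be traced in C by treating the stack frame as a byte array and the registers as plain variables. This is a sketch under assumptions: the 32-byte frame matches subq $32, %rsp, the slots are the walkthrough's (4, 8 and 12 bytes below the base pointer), and the helper names store32, load32 and run_walkthrough are hypothetical.

```c
#include <stdint.h>
#include <string.h>

enum { FRAME_SIZE = 32 };   /* matches subq $32, %rsp */

/* rbp marks the top of the frame; negative offsets reach back into it,
   the same way -4(%rbp) does in the assembly. */
static void store32(uint8_t *rbp, int offset, int32_t value) {
    memcpy(rbp + offset, &value, sizeof value);
}

static int32_t load32(const uint8_t *rbp, int offset) {
    int32_t value;
    memcpy(&value, rbp + offset, sizeof value);
    return value;
}

int32_t run_walkthrough(int32_t a, int32_t b) {
    uint8_t frame[FRAME_SIZE] = {0};
    uint8_t *rbp = frame + FRAME_SIZE;   /* the frame grows downward from rbp */
    int32_t eax, esi;

    esi = 0;  store32(rbp, -4, esi);     /* sum, initially 0 */
    esi = a;  store32(rbp, -8, esi);     /* a goes through a register first */
    esi = b;  store32(rbp, -12, esi);    /* so does b */

    eax = load32(rbp, -8);               /* eax = a */
    eax += load32(rbp, -12);             /* eax = a + b, result stays in eax */
    store32(rbp, -4, eax);               /* write the result into the sum slot */

    esi = load32(rbp, -4);               /* reload sum as the printf argument */
    return esi;
}
```

With the values from this post, run_walkthrough(4, 6) returns 10, which is the number printf ends up printing.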
Now we simply call printf to print the sum. The leaq instruction loads the address of the format string str into %rdi, the first argument register:
leaq str(%rip), %rdi
callq _printf
Compile, link and execute
as sum.s -o sum.o
ld -arch x86_64 /usr/lib/libc.dylib sum.o -o sumas
./sumas