Adding two numbers in macOS x86-64 Assembly - Part 2 – Code and Wilderness: Adventures in Programming and Nature

Oct 27, 2020 - Coding

Adding two numbers in macOS x86-64 Assembly - Part 2

This is the second part of the series “My first x86-64 assembly in macOS”. Please read the first part to see assembly language programming from the perspective of a total beginner. I thought a simple addition of two numbers could be the next small step in learning assembly: it covers declaring variables (kinda), accessing them, adding them and printing the result. I hope it gives some more insight into this assembly language adventure. This post follows the same spirit of exploring findings and observations through the eyes of a total beginner. If any information here seems wrong, please email me with an explanation so I can understand and rectify it. We explored most of the concepts in the first part of the series, but I will briefly recap them here too.

I am using an Intel-based Mac running Catalina. When ARM-based Macs come out, this series may need to be interpreted separately. I am using AT&T syntax throughout the series.

Important Note:

The following assembly code may not be optimised. It calculates the sum of two numbers purely to learn the following concepts:

Declare and Initialise variables in assembly.
Access variables in assembly.
Perform addition on variables in assembly.
Print the sum of variables in assembly.

Let’s start


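.data
	a: .long 4                      ## first operand (32-bit)
	b: .long 6                      ## second operand (32-bit)
	sum: .long 0                    ## result, initialised to 0
	str: .asciz	"Sum: %d\n"         ## NUL-terminated printf format string
.section	__TEXT,__text
	.globl	_main                   ## -- Begin function main
_main:                                  ## @main
	pushq 	%rbp                    ## save the caller's base pointer
	movq  	%rsp, %rbp              ## start our own stack frame
	subq  	$32, %rsp               ## reserve 32 bytes for locals
	movl  	sum(%rip), %esi         ## load the value of sum
	movl  	%esi, -4(%rbp)          ## stack slot for sum
	movl  	a(%rip), %esi           ## load the value of a
	movl  	%esi, -8(%rbp)          ## stack slot for a
	movl  	b(%rip), %esi           ## load the value of b
	movl  	%esi, -12(%rbp)         ## stack slot for b
	movl  	-8(%rbp), %eax          ## %eax = a
	addl  	-12(%rbp), %eax         ## %eax = a + b
	movl  	%eax, -4(%rbp)          ## store the result in the sum slot
	movl  	-4(%rbp), %esi          ## second printf argument: the sum
	leaq  	str(%rip), %rdi         ## first printf argument: format string
	movb  	$0, %al                 ## variadic call: no vector registers used
	callq 	_printf
	xorl  	%eax, %eax              ## return 0 from main
	addq  	$32, %rsp               ## release the locals
	popq  	%rbp                    ## restore the caller's base pointer
	retq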

We start by declaring three variables: a, b and sum. We declare them under the .data directive. Each .long reserves 4 bytes (32 bits), and .asciz stores a NUL-terminated string.

.data
This is equivalent to .section __DATA, __data
The compiler places all non-const initialized data (even initialized to zero) in this section.

By non-const, I assume that any mutable variable should be declared in this section.

To print the sum I will use the C function printf with a format string to display the output:


printf("Sum: %d\n", sum);
Sum: <sum value>
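For comparison, here is a rough C sketch of what the assembly program does. It is only my approximation, not output from a compiler: the helper names add_globals and print_sum are made up for illustration, and the local variables stand in for the stack slots the assembly uses.

```c
#include <stdio.h>

/* Mutable, initialised globals: the C counterpart of the .data entries. */
int a = 4;
int b = 6;
int sum = 0;

/* Hypothetical helper: copy the globals into locals (stack slots),
   add them, keep the result in the local copy of sum and return it. */
int add_globals(void) {
    int local_sum = sum;   /* copy of sum, initially 0 */
    int local_a = a;       /* copy of a */
    int local_b = b;       /* copy of b */
    local_sum = local_a + local_b;
    return local_sum;
}

/* Hypothetical helper: print the result with the same format string.
   Returns whatever printf returns (the number of characters written). */
int print_sum(void) {
    return printf("Sum: %d\n", add_globals());
}
```

Calling print_sum() from a main function prints the same line as the assembly version.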

First we set up our stack frame: we save the old base pointer, point %rbp at the current top of the stack, and reserve 32 bytes for our local values.

After setting up our stack and base pointer we start loading our variables into registers. My first attempt was to access sum directly, but it gave the following error:


error: 32-bit absolute addressing is not supported in 64-bit mode
        movl    sum, %esi

From the error I assumed that I could not refer to sum by its absolute address. I found out that I should access it through an offset instead. The next question was: an offset from which register? I checked the documentation on registers again (many things bounced off my head) and the instruction pointer register caught my attention. %rip holds the address of the next instruction to be executed. What I have found so far is that our variables in the .data section can be reached relative to this register: when we write sum(%rip), the assembler works out the distance from the next instruction to sum and encodes it as a displacement, so the address is correct wherever the program is loaded. In Intel syntax the same operand would be written [rip + sum]. Please note that the bracket form is not accepted when assembling AT&T syntax, at least on my Mac, because I stumbled on this error:


error: brackets expression not supported on this target
        movl    [%rip + sum], %esi
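To make the idea of an offset from %rip concrete, here is a small C sketch of the arithmetic behind an operand like sum(%rip): the processor takes the address of the next instruction and adds a signed 32-bit displacement. The function name is hypothetical, and any addresses you feed it are for illustration only.

```c
#include <stdint.h>

/* Effective address of a RIP-relative operand:
   address of the next instruction + sign-extended 32-bit displacement.
   A negative displacement reaches data located before the instruction. */
uint64_t rip_relative_address(uint64_t next_instruction, int32_t displacement) {
    return next_instruction + (uint64_t)(int64_t)displacement;
}
```

With the invented values 0x100000f80 for the next instruction and 0x1080 for the displacement, the function returns 0x100002000.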

So the instruction that worked for me to access the variable is:


movl  	sum(%rip), %esi

The instruction above reads the 4-byte value stored at sum and copies it into %esi. Why %esi??? Well, from the perspective of a noob, it was the first register I found in the documentation that looked like a compatible operand. As far as I can tell now, any 32-bit general-purpose register, such as %eax or %ecx, would work here too.

According to the documentation, the source index register can point to data locations in the data segment, so it was my first choice to try.

Now we have copied the value of sum (initially 0) into %esi. Next we store it in our stack frame, at an offset from the base pointer. We use 4 bytes of the stack frame for our long (32-bit) value.


movl  	%esi, -4(%rbp)

We are storing %esi, which contains the value of sum, 4 bytes below the base pointer. We will use this slot later for our addition, since the value is now available in our stack frame.

I did wonder why I couldn’t simply move variable a from its %rip-relative location directly to the stack, instead of first moving it to a register. Why the longer steps? I tried it and got:


error: invalid operand for instruction
        movl    a(%rip), -8(%rbp)

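The reason is that an x86 mov instruction cannot have two memory operands, so the value has to pass through a register. Next we similarly move our a and b variables to the stack frame: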


movl  	a(%rip), %esi
movl  	%esi, -8(%rbp)
movl  	b(%rip), %esi
movl  	%esi, -12(%rbp)

Since the first 4 bytes are used for sum, we use the next two 4-byte slots, at offsets of 8 and 12 bytes below the base pointer, for a and b respectively.
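The slot layout can be modelled in C as a byte array with a pointer to its upper end playing the role of %rbp. This is only a sketch under that assumption: store32, load32 and slot_after_setup are hypothetical helpers, and a real stack frame is managed by the processor rather than by an array.

```c
#include <stdint.h>
#include <string.h>

enum { FRAME_SIZE = 32 };   /* same size as subq $32, %rsp */

/* Write a 32-bit value at a (negative) offset from the modelled %rbp. */
void store32(uint8_t *rbp, int offset, int32_t value) {
    memcpy(rbp + offset, &value, sizeof value);
}

/* Read a 32-bit value back from a (negative) offset from the modelled %rbp. */
int32_t load32(const uint8_t *rbp, int offset) {
    int32_t value;
    memcpy(&value, rbp + offset, sizeof value);
    return value;
}

/* Fill the three slots as in the walkthrough, then return the one at `offset`. */
int32_t slot_after_setup(int offset) {
    uint8_t frame[FRAME_SIZE] = {0};
    uint8_t *rbp = frame + FRAME_SIZE;   /* the slots lie below this address */
    store32(rbp, -4, 0);    /* sum */
    store32(rbp, -8, 4);    /* a   */
    store32(rbp, -12, 6);   /* b   */
    return load32(rbp, offset);
}
```

Each slot is 4 bytes wide, which is why the offsets step by 4 and the three values never overlap.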

All our variables are now in the stack frame, so we can do the simple addition. The add instruction needs a register or memory location to hold the running result. For our addition we use %eax, the primary accumulator, which is widely used in input/output and most arithmetic operations; other general-purpose registers would also work. The documentation wasn’t much clearer than that for me.

We move our variable a from the stack frame to %eax. Remember that a was stored 8 bytes below the base pointer.


movl  	-8(%rbp), %eax

Now we use the add instruction, addl -12(%rbp), %eax, which reads variable b from 12 bytes below the base pointer and adds it to the current value of %eax. The result is left in %eax itself, replacing the old value.

Now we store the result held in %eax to the stack (4 bytes below the base pointer):
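As a rough C model of that step, assuming the AT&T operand order of source first and destination second, the add can be pictured as below. The function names are hypothetical, and the model ignores the processor flags that a real add also updates.

```c
#include <stdint.h>

/* Model of `addl src, dst`: the destination receives dst + src.
   The 32-bit result wraps around modulo 2^32. */
uint32_t addl(uint32_t src, uint32_t dst) {
    return dst + src;
}

/* The two instructions from the walkthrough, with plain variables
   standing in for the stack slots and for %eax. */
uint32_t add_a_and_b(void) {
    uint32_t slot_a = 4;       /* value held at -8(%rbp)  */
    uint32_t slot_b = 6;       /* value held at -12(%rbp) */
    uint32_t eax = slot_a;     /* movl -8(%rbp), %eax     */
    eax = addl(slot_b, eax);   /* addl -12(%rbp), %eax    */
    return eax;
}
```

The value returned by add_a_and_b is what %eax holds after the add, which is the value the next instruction stores into the sum slot.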


movl  	%eax, -4(%rbp)

If we remember, 4 bytes below the base pointer is where we stored our sum value, so basically we are storing the result back to sum on the stack. Then we load the sum value into %esi, which carries the second argument to printf:


movl  	-4(%rbp), %esi

Now we load the address of the format string into %rdi, which carries the first argument, and call printf to print the sum:


leaq  	str(%rip), %rdi
callq  	_printf

Compile, link and execute


as sum.s -o sum.o
ld -arch x86_64 /usr/lib/libc.dylib sum.o -o sumas
./sumas
