99_Appendices/D_Nasm.md
75 additions & 47 deletions
@@ -5,11 +5,11 @@
5
5
There are some cases where writing assembly code is preferred or needed for certain operations (e.g. interrupt handling).
6
6
7
7
Nasm has a macro processor that supports conditional assembly, multi-level file inclusion, etc.
8
-
A macro start with the '%' symbol.
8
+
A macro starts with the '%' symbol.
9
9
10
-
There are two types of macros: _single line_ (defined with `%define`) and _multiline_ wrapped around `%macro` and `%endmacro`. In this paragraph we will explain the multi-line macros.
10
+
There are two types of macros: _single line_ (defined with `%define`) and _multiline_ wrapped around `%macro` and `%endmacro`. In this paragraph we will explain the multi-line macros.
11
11
12
-
A multi-line macro is defined as follows:
12
+
A multi-line macro is defined as follows:
13
13
14
14
```nasm
15
15
%macro my_first_macro 1
@@ -19,7 +19,7 @@ A multi-line macro is defined as follows:
19
19
%endmacro
20
20
```
21
21
22
-
A macro can be accessed from C if needed, in this case we need to add a global label to it, for example the macro above will become:
22
+
The code generated by a macro can be accessed from C if needed. In this case we need to add a global label to it; for example, the macro above becomes:
23
23
24
24
```nasm
25
25
%macro my_first_macro 1
@@ -31,22 +31,22 @@ my_first_macro_label_%1:
31
31
%endmacro
32
32
```
33
33
34
-
In the code above we can see few new things:
34
+
In the code above we can see a few new things:
35
35
36
36
* First, we said that the label `my_first_macro_label_%1` has to be set as global; this is pretty straightforward to understand.
37
-
* the `%1` in the label definition, let us create different label using the first parameter passed in the macro.
37
+
* The `%1` in the label definition lets us create a different label for each value of the first parameter passed to the macro.
38
38
39
-
So if now we add a new line with the following code:
39
+
So if now we add a new line with the following code:
40
40
41
41
```nasm
42
42
my_first_macro 42
43
43
```
44
44
45
-
It creates the global label: `my_first_macro_label_42`, and since it is global it will be visible also from our C code (of course if the files are linked)
45
+
It creates the global label `my_first_macro_label_42`, and since it is global it will also be visible from our C code (provided the files are linked together).
46
46
47
-
Basically defining a macro with nasm is similar to use C define statement, these special "instruction" are evaluated by nasm preprocessor, and transformed at compile time.
47
+
Defining a macro with nasm is similar to using the C `#define` directive: these special "instructions" are evaluated by the nasm preprocessor and expanded at assembly time.
48
48
49
-
So for example *my_first_macro 42* is transformed in the following statement:
49
+
So for example *my_first_macro 42* is transformed into the following code:
50
50
51
51
```nasm
52
52
my_first_macro_label_42:
@@ -57,16 +57,16 @@ my_first_macro_label_42:
57
57
58
58
## Declaring Variables
59
59
60
-
In Nasm if we want to declare a "variable" initialized we can use the following directives:
60
+
In Nasm, if we want to declare an initialized "variable", we can use the following directives:
61
61
62
-
| Directive | Description |
62
+
| Directive | Description |
63
63
|-----------|-----------------------------------|
64
64
| DB | Allocate a byte |
65
65
| DW | Allocate 2 bytes (a word) |
66
66
| DD | Allocate 4 bytes (a double word) |
67
67
| DQ | Allocate 8 bytes (a quad word) |
68
68
69
-
These directive are intended to be used for initialized variables. The syntax is:
69
+
These directives are intended to be used for initialized variables. The syntax is:
70
70
71
71
```nasm
72
72
single_byte_var:
@@ -79,24 +79,24 @@ quad_var:
79
79
dq 133.463 ; Example with a real number
80
80
```
81
81
82
-
But what if we want to declare a string? Well in this case we can use a different syntax for db:
82
+
But what if we want to declare a string? Well in this case we can use a different syntax for db:
83
83
84
84
```nasm
85
85
string_var:
86
86
db "Hello", 10
87
87
```
88
88
What does it mean? We are simply declaring a variable (`string_var`) that starts at 'H' and fills the consecutive bytes with the next letters. But what about the last number? It is just an extra byte that represents the newline character. So what we are really storing is the string _"Hello\\n"_
89
89
90
-
Now what we have seen so far is valid for a variable that can be initialized with a value, but what if we don't know the value yet, but we want just to "label" it with a variable name? Well is pretty simple, we have equivalent directives for reserving memory:
90
+
What we have seen so far is valid for a variable that can be initialized with a value, but what if we don't know the value yet and just want to "label" it with a variable name? Well, it is pretty simple: we have equivalent directives for reserving memory:
91
91
92
-
| Directive | Description |
92
+
| Directive | Description |
93
93
|-------------|---------------------------------|
94
94
| RESB | Reserve a byte |
95
95
| RESW | Reserve 2 bytes (a word) |
96
96
| RESD | Reserve 4 bytes (a double word) |
97
97
| RESQ | Reserve 8 bytes (a quad word) |
98
98
99
-
The syntax is similar as the previous examples:
99
+
The syntax is similar to the previous examples:
100
100
101
101
```nasm
102
102
single_byte_var:
@@ -109,27 +109,27 @@ quad_var:
109
109
resq 4
110
110
```
111
111
112
-
One moment! What are those number after the directives? Well it's pretty simple, they indicate how many bytes/word/dword/qword we want to allocate. In the example above:
112
+
One moment! What are those numbers after the directives? Well, it's pretty simple: they indicate how many bytes/words/dwords/qwords we want to reserve. In the example above:
113
113
* `resb 1` reserves one byte
114
114
* `resw 2` reserves 2 words; each word is 2 bytes, so 4 bytes in total
115
115
* `resd 3` reserves 3 dwords; a dword is 4 bytes, so 12 bytes are reserved in total
116
-
*`resq 4` Is reserving... well you should know it now...
116
+
* `resq 4` reserves... well, you should know it by now (4 qwords, 32 bytes in total)
117
117
118
118
## Calling C from Nasm
119
119
120
-
In the asm code, if in 64bit mode, a call to *cld* is required before calling an external C function.
120
+
In the asm code, if in 64-bit mode, the *cld* instruction should be executed before calling an external C function, because the System V ABI expects the direction flag to be clear on function entry.
121
121
122
-
So for example if we want to call the following function from C:
122
+
So for example if we want to call the following C function from asm:
123
123
124
124
```C
125
125
void my_c_function(unsigned int my_value){
126
126
printf("My shiny function called from nasm worth: %u\n", my_value);
127
127
}
128
128
```
129
129
130
-
First thing is to let the compiler know that we want to reference an external function using `extern`, and then just before calling the function, add the instruction cld.
130
+
The first thing to do is let the assembler know that we want to reference an external function using `extern`; then, just before calling the function, add the `cld` instruction.
131
131
132
-
Here an example:
132
+
Here is an example:
133
133
134
134
```nasm
135
135
[extern my_c_function]
@@ -143,58 +143,56 @@ call my_c_function
143
143
144
144
As mentioned in the multiboot chapter, argument passing from asm to C in 64-bit mode is a little bit different from 32-bit mode: the first parameter of a C function is taken from `rdi` (followed by `rsi`, `rdx`, `rcx`, `r8`, `r9`, then the stack), so `mov rdi, 42` sets the value of the *my_value* parameter to 42.
145
145
146
-
The output of the printf will be then:
146
+
The output of the printf will then be:
147
147
148
148
```
149
149
My shiny function called from nasm worth: 42
150
150
```
151
151
152
152
## About Sizes
153
153
154
-
Variable sizes are always important while coding, but while coding in asm they are even more important to understand how they works in assembly, and since there is no real type you can't rely on the variable type.
154
+
Variable sizes are always important while coding, but they matter even more in assembly: since there are no real types, we can't rely on a variable's type to tell us its size.
155
155
156
-
The important things to know when dealing with assembly code:
156
+
The important things to know when dealing with assembly code:
157
157
158
-
* when moving from memory to register, using the wrong register size will cause wrong value being loaded into the registry. Example:
158
+
* When moving from memory to a register, using the wrong register size will cause the wrong value to be loaded into the register. Example:
159
159
160
160
```nasm
161
161
mov rax, [memory_location_label]
162
162
```
163
-
is different from:
163
+
is different from:
164
164
165
165
```nasm
166
166
mov eax, [memory_location_label]
167
167
```
168
168
169
-
And it could potentially lead to two different values in the register. That because the size of rax is 8 bytes, while eax is only 4 bytes, so if we do a move from memory to register in the first case, the processor is going to read 8 memory locations, while in the second case only 4, and of course there can be differences (unless we are lucky enough and the extra 4 bytes are all 0s).
169
+
And it could potentially lead to two different values in the register. That is because the size of rax is 8 bytes, while eax is only 4 bytes: in the first case the processor reads 8 bytes from memory, while in the second case it reads only 4, and of course the results can differ (unless we are lucky enough and the extra 4 bytes are all 0s).
170
170
171
171
This can be easy to overlook if we mostly do register-to-memory, value-to-register or value-to-memory moves, where the size is "implicit".
172
172
173
-
_Authors Note_: Probably it can be a trivial issue, but it took me couple of hours to figure it out!
174
-
175
173
## If Statement
176
174
177
-
Below an example showing a possible solution to a complex if statement. Let's assume that we have the following `if` statement in C and we want to translate in assembly:
175
+
Below is an example showing a possible solution to a complex if statement. Let's assume that we have the following `if` statement in C and we want to translate it into assembly:
178
176
179
177
```C
180
178
if ( var1==SOME_VALUE && var2 == SOME_VALUE2){
181
179
//do something
182
180
}
183
181
```
184
182
185
-
In asm we can do something like the following:
183
+
In asm we can do something like the following:
186
184
187
185
```asm
188
186
cmp [var1], SOME_VALUE
189
-
jne else_label
187
+
jne .else_label
190
188
cmp [var2], SOME_VALUE2
191
189
jne .else_label
192
190
;here code if both conditions are true
193
191
.else_label:
194
192
;the else part
195
193
```
196
194
197
-
And in a similar way we can have a if statement with a logic OR:
195
+
And in a similar way we can have an if statement with a logical OR:
198
196
199
197
```C
200
198
if (var1 == SOME_VALUE || var2 == SOME_VALUE){
@@ -209,33 +207,38 @@ cmp [var1], SOME_VALUE
209
207
je .true_branch
210
208
cmp [var2], SOME_VALUE
211
209
je .true_branch
210
+
jmp .else_label
212
211
.true_branch:
213
212
jne .else_label
214
213
```
215
214
216
-
## Switch Statement
215
+
## Switch Statement
217
216
218
217
The usual switch statement in C:
219
218
```C
220
219
switch(variable){
221
-
case X:
220
+
case SOME_VALUE:
222
221
//do something
223
222
break;
224
-
case Y:
223
+
case SOME_VALUE2:
224
+
//do something
225
+
break;
226
+
case SOME_VALUE3:
225
227
//do something
226
228
break;
227
229
}
228
230
```
229
231
230
-
can be rendered as:
231
-
232
+
can be rendered as:
233
+
232
234
```asm
233
235
cmp [var1], SOME_VALUE
234
236
je .value1_case
235
237
cmp [var1], SOME_VALUE2
236
238
je .value2_case
237
239
cmp [var1], SOME_VALUE3
238
240
je .value3_case
241
+
jmp .item_not_needed
239
242
.value1_case:
240
243
;do stuff for value1
241
244
jmp .item_not_needed
@@ -245,14 +248,39 @@ je .value3_case
245
248
.value3_case:
246
249
;do stuff for value3
247
250
.item_not_needed:
248
-
;rst of the code
251
+
;rest of the code
249
252
```
250
253
254
+
## Loop
255
+
256
+
Another typical scenario is a loop. For example, imagine we have the following while loop in C:
257
+
258
+
```c
259
+
unsigned int counter = 0;
260
+
while (counter < SOME_VALUE) {
261
+
//do something
262
+
counter++;
263
+
}
264
+
```
265
+
266
+
Again in assembly we can use the `jmp` family of instructions:
267
+
268
+
```asm
269
+
mov ecx, 0 ; Loop counter
270
+
.loop_cycle:
271
+
; do something
272
+
inc ecx
273
+
cmp ecx, SOME_VALUE
274
+
jne .loop_cycle
275
+
```
276
+
277
+
The `inc` instruction increases the value contained in the `ecx` register by one.
278
+
251
279
## Data Structures
252
280
253
281
Every language supports accessing data as a raw array of bytes; C provides an abstraction over this in the form of structs. NASM also provides an abstraction over raw bytes that is similar to how C does it.
254
282
255
-
This guide will just introduce quickly how to define a basic struct, for more information and use cases is better to check the netwide assembler official documentation (see the useful links section)
283
+
This section will just quickly introduce how to define a basic struct; for more information and use cases it is better to check the Netwide Assembler official documentation (see the useful links appendix).
256
284
257
285
Let's for example assume we have the following C struct:
258
286
@@ -263,8 +291,8 @@ struct task {
263
291
};
264
292
```
265
293
266
-
How nasm render a struct is basically declaring a list of offset labels, in this way we can use them to access the field starting from the struct memory location (*Authors note: yeah it is a trick...*)
267
-
To create a struct in nasm we use the `struc` and `endstruc` keywords, and the fields are defined between them.
294
+
Nasm renders a struct by declaring a list of _offset labels_, so that we can use them to access the fields starting from the struct's memory location (*Author's note: yeah, it is a trick...*)
295
+
To create a struct in nasm we use the `struc` and `endstruc` keywords, and the fields are defined between them.
268
296
The example above can be rendered in the following way:
269
297
270
298
```asm
@@ -274,7 +302,7 @@ struc task
274
302
endstruc
275
303
```
276
304
277
-
What this code is doing is creating three symbols: id as 0 representing the offset from the beginning of a task structure and name as 4 (still the offset) and the task symbol that is 0 too. This notation has a drawback, it defines the labels as global constants, so you can't have another struct or label declared with same name, to solve this problem you can use the following notation:
305
+
What this code does is create three symbols: `id` as 0, representing the offset from the beginning of a task structure, `name` as 4 (still an offset), and the `task` symbol, which is 0 too. This notation has a drawback: it defines the labels as global constants, so you can't have another struct or label declared with the same name. To solve this problem you can use the following notation:
278
306
279
307
```asm
280
308
struc task
@@ -291,8 +319,8 @@ Now if we have a memory location or register that contains our structure, for ex
291
319
mov ebx, dword [rax + task.id]
292
320
```
293
321
294
-
This is how to access a struct, besically we add the label representing an offset to its base address.
295
-
What if we want to create an instance of it? Well in this case we can use the macros `istruc` and `iend`, and using `at` to access the fields. For example if we want create an instance of task with the values 1 for the id field and "hello123" for the name field, we can use the following syntax:
322
+
This is how to access a struct: we add the label representing an offset to the struct's base address.
323
+
What if we want to create an instance of it? Well, in this case we can use the macros `istruc` and `iend`, and use `at` to access the fields. For example, if we want to create an instance of task with the values 1 for the id field and "hello123" for the name field, we can use the following syntax: