Computer System

Assembly Language Fundamentals

Attention is All You Need 2021. 4. 6. 11:51

Assembly Programming example

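- Main PROC / INVOKE ExitProcess,0 / Main ENDP is the basic skeleton that every program must contain.

- Executing INVOKE ExitProcess,0 calls the Windows ExitProcess function, which hands control back to the OS and ends the program with exit code 0.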

 

Assembly Programming example 2

 

- The program is divided into a .data section, where variables are declared, and a .code section, where the instructions go.

 

 

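- In assembly language, the radix of a numeric literal is given by a suffix after the digits: d (decimal), h (hexadecimal), b (binary), or q (octal). A hexadecimal literal that begins with a letter needs a leading 0, as in 0A5h.

- If no suffix is given, the default is decimal.

- Numeric literals can be combined with operators such as +, -, *, /, and MOD. The assembler evaluates these constant expressions at assembly time.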
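The suffix rule can be sketched in a few lines of Python. This is only a model of how the literals are read, not how any assembler is implemented; `parse_literal` and `RADIX_BY_SUFFIX` are hypothetical names introduced for illustration.

```python
# Hypothetical helper: converts a MASM-style integer literal to a Python int.
# Assumption: only the suffixes d, h, b, q (and o for octal) are handled.
RADIX_BY_SUFFIX = {"h": 16, "d": 10, "b": 2, "q": 8, "o": 8}


def parse_literal(text):
    text = text.strip().lower()
    suffix = text[-1]
    if len(text) > 1 and suffix in RADIX_BY_SUFFIX:
        # The last character names the radix; the rest are the digits.
        return int(text[:-1], RADIX_BY_SUFFIX[suffix])
    # No suffix: the default radix is decimal.
    return int(text, 10)


# The same value, 26, written in four radixes.
values = [parse_literal(s) for s in ("26", "26d", "1Ah", "11010b", "32q")]
print(values)  # [26, 26, 26, 26, 26]
```

Because the suffix is checked first, a literal such as `1Bh` is read as hexadecimal even though it contains the letter `b`.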

 

- Identifiers such as variable names must not be reserved words: instruction mnemonics (MOV, ADD, SUB), register names, and so on.

- The first character of an identifier must be a letter, _, $, @, or ?. It cannot be a digit.

 

 

Assembly Code 1
Assembly Code 2

- An assembly instruction is divided into 4 basic parts.

- [label:] [mnemonic] [operands] [;comment]

- A label is an identifier that acts as a place marker for instructions or data. A data label is called a 'Variable', while a code label is called a 'Label'.

- Instruction mnemonics include 'MOV', 'ADD', 'SUB', 'MUL', and 'JMP'.

- An instruction is executed by the processor at runtime.

- Operands are what the instruction is applied to. An operand can be a register (such as eax), a memory operand, or a constant expression.

- Comments are short descriptions of the program and start with a semicolon.
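To make the four-part layout concrete, here is a rough Python sketch that splits one source line into its parts. `split_instruction` is a hypothetical helper, and it makes simplifying assumptions: it does not handle semicolons or colons inside string literals or segment overrides.

```python
def split_instruction(line):
    """Split '[label:] [mnemonic] [operands] [;comment]' into its parts."""
    # Everything after the first semicolon is the comment.
    code, _, comment = line.partition(";")

    # A code label, if present, ends with a colon.
    label = None
    head, sep, rest = code.partition(":")
    if sep:
        label, code = head.strip(), rest

    # The first word is the mnemonic; the remainder holds the operands.
    parts = code.split(None, 1)
    mnemonic = parts[0] if parts else None
    operands = [op.strip() for op in parts[1].split(",")] if len(parts) > 1 else []

    return {
        "label": label,
        "mnemonic": mnemonic,
        "operands": operands,
        "comment": comment.strip() or None,
    }


parsed = split_instruction("L1: mov eax, 5 ; load five")
print(parsed)
# {'label': 'L1', 'mnemonic': 'mov', 'operands': ['eax', '5'], 'comment': 'load five'}
```

All four parts are optional, so a line that holds only a comment yields `None` for the label and mnemonic and an empty operand list.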

 

 

Assembly Programming example 3
Output by call to DumpRegs

- 'CALL DumpRegs' is a procedure call that displays the contents of the registers.

- The first two rows show the values of the 32-bit registers. After that come EIP (extended instruction pointer) and EFL (extended flags register), along with the CF, SF, ZF, and OF flags.

 

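- INCLUDE Irvine32.inc copies definitions and setup information from Irvine32.inc into the program.

- The included setup contains the '.stack 4096' directive, which reserves 4096 bytes of stack space for the program. This happens when the program is assembled, not while it runs.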

 

 

 

Program Template

- Overall, the basic template for an assembly program is the one shown in the image above.

 

 

 

Two-Staged Process

- The assembler translates the source file into an object file, and the linker then combines the object file with the link library to produce an executable file.

- Finally, the OS loader loads the executable file into memory and runs it.

 

 

Intrinsic data types

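- The intrinsic data types include 8-, 16-, 32-, and 64-bit integers, in unsigned and signed forms (for example BYTE/SBYTE, WORD/SWORD, DWORD/SDWORD, QWORD).

- As mentioned above, a data definition statement in the .data area sets aside storage in memory for a variable.

- [name] directive initializer [,initializer]

- At least one initializer is required. "?" is also allowed and leaves the variable uninitialized.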

 

- The 'offset' of a variable is its distance in bytes from the beginning of the segment that contains it.

- Every variable name is automatically associated with its offset.

Assembly code 3
Assembly code 4

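- Instead of 'BYTE', the older 'db' directive can be used to allocate 1 byte of data.

- A single data definition can include multiple initializers.

- When a string is used as an initializer, it is good practice to end it with a 0 byte (a null-terminated string).

- The BYTE directive is well suited to allocating strings of any length, because each character occupies one byte.

- The $ operator returns the current location counter, that is, the offset of the current statement.

- The DUP operator is useful for allocating space for a string or an array, because it repeats an initializer a given number of times.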

 

- Instead of 'WORD', 'dw' can be used to allocate 2 bytes of data.

- Instead of 'DWORD', 'dd' can be used to allocate 4 bytes of data.
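The effect of these data definitions on memory can be modeled in Python. The sketch below is an illustration under stated assumptions, not MASM itself: `BYTE`, `WORD`, `DWORD`, and `DUP` are hypothetical Python functions that return the bytes the corresponding directive would reserve, and the variable names are made up.

```python
import struct


def BYTE(*initializers):
    out = b""
    for item in initializers:
        if isinstance(item, str):
            # A string initializer contributes one byte per character.
            out += item.encode("ascii")
        else:
            out += struct.pack("<B", item & 0xFF)
    return out


def WORD(*initializers):
    # Each initializer occupies 2 bytes.
    return b"".join(struct.pack("<H", v & 0xFFFF) for v in initializers)


def DWORD(*initializers):
    # Each initializer occupies 4 bytes.
    return b"".join(struct.pack("<I", v & 0xFFFFFFFF) for v in initializers)


def DUP(count, chunk):
    # Repeat an initializer 'count' times.
    return chunk * count


greeting = BYTE("Hi", 0)     # like: greeting BYTE "Hi",0  (null-terminated)
zeros = DUP(4, BYTE(0))      # like: zeros BYTE 4 DUP(0)
pair = WORD(1, 2)            # like: pair WORD 1,2
single = DWORD(1)            # like: single DWORD 1

print(len(greeting), len(zeros), len(pair), len(single))  # 3 4 4 4
```

The lengths show how much storage each definition sets aside: three bytes for the two-character string plus its terminator, and 2 or 4 bytes per WORD or DWORD initializer.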

 

 

Data allocation

- The bytes of a multi-byte value are stored in reverse order (little-endian order): the least significant byte is placed at the lowest address.

- For example, the value 12345678h is stored in memory as 78 56 34 12.
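The byte order can be checked with a short Python snippet. It uses only the built-in `int.to_bytes` and `int.from_bytes`; the variable names are illustrative.

```python
value = 0x12345678

# Lay the 32-bit value out the way a little-endian machine stores it.
stored = value.to_bytes(4, byteorder="little")
layout = " ".join(f"{b:02X}" for b in stored)
print(layout)  # 78 56 34 12

# Reading the same four bytes back as little-endian recovers the value.
restored = int.from_bytes(stored, byteorder="little")
print(hex(restored))  # 0x12345678
```

Interpreting the same four bytes as big-endian would give 78563412h instead, which is why the byte order matters whenever raw memory is inspected.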

 

 

Assembly Programming example 4

- Instead of working only with registers, we can also store the values in variables and get the same result.

 

 

 

Symbolic Constant : Equal-sign directive
Symbolic Constant : EQU directive
Symbolic Constant : TEXTEQU directive

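- The 'equal-sign directive (=)' is known as a redefinable equate: it gives a name to an integer expression, and the name can be redefined later.

- When the program is assembled, each occurrence of the name is replaced by its value, as in the image above.

- The '$ operator' is used to calculate the sizes of arrays and strings, by subtracting the offset of the array from $ immediately after the array is defined.

- The 'EQU directive' is similar to the equal-sign directive, but a symbol defined with EQU cannot be redefined later in the program.

- The 'TEXTEQU directive' is similar to EQU and is used to create a text macro.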
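The size calculation with the location counter can be modeled in Python. This is a sketch under assumptions: `DataSegment`, `define`, and `here` are hypothetical names, with `here` playing the role of `$`.

```python
class DataSegment:
    """Toy model of a data segment that tracks the location counter."""

    def __init__(self):
        self.data = bytearray()
        self.offsets = {}

    @property
    def here(self):
        # Plays the role of $: the offset of the next byte to be allocated.
        return len(self.data)

    def define(self, name, initial_bytes):
        # A variable name is associated with the offset where its data starts.
        self.offsets[name] = self.here
        self.data += initial_bytes


seg = DataSegment()

seg.define("list", bytes([10, 20, 30, 40]))
list_size = seg.here - seg.offsets["list"]   # like: ListSize = ($ - list)

seg.define("msg", b"Hello\x00")
msg_size = seg.here - seg.offsets["msg"]     # like: MsgSize = ($ - msg)

print(list_size, msg_size, seg.offsets["msg"])  # 4 6 4
```

The subtraction only gives the right size when it is taken directly after the array. If another variable were defined in between, its bytes would be counted as well.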

 

 

 

- For 64-bit programming, we should use 64-bit registers and variables.

- For example, we use QWORD instead of DWORD, and RAX instead of EAX.

 

 

 

- A direct memory operand is a named reference to a location in memory: we define a variable, and its name stands for the offset of its data.

- For the MOV instruction, both operands must be the same size.

- If the sizes do not match, the assembler reports an error.

- There is no direct memory-to-memory move, so instead of "MOV var2, var1" we first move var1 into eax and then move eax into var2.

 

- "MOVZX rxx,r/mxx" (move with zero-extend) copies a smaller source operand into a larger destination register, so the two operands may have different sizes.

- In this case, the remaining upper bits of the destination are filled with 0s.

- "MOVSX rxx,r/mxx" (move with sign-extend) also copies a smaller source operand into the lower part of a larger destination register.

- In this case, the remaining upper bits are filled with copies of the source's sign bit: 1s if the source is negative, 0s otherwise.
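The difference between zero extension and sign extension is easy to see on concrete bit patterns. The Python functions below are a sketch that models the two behaviors; `movzx` and `movsx` here are hypothetical helpers named after the instructions, taking the source and destination widths in bits as explicit arguments.

```python
def movzx(value, src_bits, dst_bits):
    """Zero-extend: keep the source bits, upper destination bits become 0."""
    if dst_bits <= src_bits:
        raise ValueError("destination must be wider than source")
    return value & ((1 << src_bits) - 1)


def movsx(value, src_bits, dst_bits):
    """Sign-extend: upper destination bits become copies of the sign bit."""
    if dst_bits <= src_bits:
        raise ValueError("destination must be wider than source")
    src_mask = (1 << src_bits) - 1
    dst_mask = (1 << dst_bits) - 1
    value &= src_mask
    if value >> (src_bits - 1):
        # Sign bit is 1: set every destination bit above the source width.
        value |= dst_mask ^ src_mask
    return value


print(hex(movzx(0x8F, 8, 16)))  # 0x8f
print(hex(movsx(0x8F, 8, 16)))  # 0xff8f
print(hex(movsx(0x7F, 8, 16)))  # 0x7f
```

The byte 8Fh is negative when read as a signed value, so sign extension fills the upper byte with 1s, while 7Fh is positive and is extended with 0s.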

 

- "XCHG" is an instruction that exchanges the contents of its two operands, which can be registers or memory.

- Like MOV, XCHG cannot exchange memory with memory directly, so one of the values must first be moved into a register.

 

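- A variable name by itself refers only to the first element, so the second and third elements of an array cannot be reached by name alone.

- Therefore we add a constant offset to the name inside the "[ ]" operator, as in "MOV al, [arrayA+1]". The offset is counted in bytes, so for elements larger than one byte it must be scaled by the element size.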
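Here is a rough Python model of direct-offset addressing. The names are assumptions for illustration: `arrayA` follows the example in the text, `arrayW` is a made-up WORD array, and `allocate` and `load` are hypothetical helpers.

```python
import struct

memory = bytearray()


def allocate(initial_bytes):
    """Append data to memory and return the offset where it starts."""
    offset = len(memory)
    memory.extend(initial_bytes)
    return offset


def load(offset, size):
    """Read 'size' bytes at 'offset' as a little-endian unsigned integer."""
    return int.from_bytes(memory[offset:offset + size], "little")


arrayA = allocate(bytes([0x10, 0x20, 0x30]))                   # BYTE 10h,20h,30h
arrayW = allocate(struct.pack("<3H", 0x1000, 0x2000, 0x3000))  # WORD 1000h,2000h,3000h

al = load(arrayA + 1, 1)      # like: MOV al, [arrayA+1]
ax = load(arrayW + 2, 2)      # like: MOV ax, [arrayW+2]
wrong = load(arrayW + 1, 2)   # offset not scaled by the element size

print(hex(al), hex(ax), hex(wrong))  # 0x20 0x2000 0x10
```

The last read shows what goes wrong when the offset is not a multiple of the element size: it straddles two elements and returns a value that is in neither of them.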

 

 

 

- "ADD dest,src" adds src to dest and stores the result in dest.

- "SUB dest,src" subtracts src from dest and stores the result in dest.

- "NEG reg/mem" reverses the sign of a number by converting it to its 2's complement.
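The three operations can be sketched for 8-bit operands in Python. The width of 8 bits is an assumption made to keep the numbers small, and `add8`, `sub8`, `neg8`, and `as_signed8` are hypothetical helper names.

```python
MASK8 = 0xFF


def add8(dest, src):
    # dest = dest + src, truncated to 8 bits.
    return (dest + src) & MASK8


def sub8(dest, src):
    # dest = dest - src, truncated to 8 bits.
    return (dest - src) & MASK8


def neg8(value):
    # Two's complement: invert every bit, then add 1.
    return ((value ^ MASK8) + 1) & MASK8


def as_signed8(value):
    # Interpret an 8-bit pattern as a signed number.
    return value - 256 if value & 0x80 else value


print(hex(neg8(5)), as_signed8(neg8(5)))        # 0xfb -5
print(hex(sub8(3, 5)), as_signed8(sub8(3, 5)))  # 0xfe -2
print(hex(add8(0xFF, 1)))                       # 0x0
```

Note that `neg8(0x80)` returns 0x80 again: -128 has no positive counterpart in 8 bits.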

 

 

- The Zero flag is set when the result stored in dest is zero.

- The Sign flag is set when the arithmetic result is negative (its MSB is 1).

- The Carry flag is the unsigned arithmetic flag, while the Overflow flag is the signed arithmetic flag.

- Note that the Carry flag is not changed by "INC" or "DEC".

- The Carry flag is set to 1 when ADD produces an unsigned result too large for the destination (a carry out of the MSB), or when SUB needs a borrow into the MSB because src is larger than dest as an unsigned value.

- The Overflow flag is set to 1 when ADD or SUB produces a signed result that is too large or too small to fit in the destination.

- For example, with 8-bit operands, adding 1 to 127 or subtracting 1 from -128 results in overflow.

- To summarize for addition, overflow occurs in the cases (+)+(+)=(-) and (-)+(-)=(+).
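The flag rules can be checked with a small Python model. It assumes 8-bit operands, and `add8_flags` and `sub8_flags` are hypothetical helpers that return the result together with the four flags. The overflow test compares sign bits, which is one common way to express the rule.

```python
def _flags(result, cf, of):
    return {"ZF": result == 0, "SF": bool(result & 0x80), "CF": cf, "OF": of}


def add8_flags(dest, src):
    dest, src = dest & 0xFF, src & 0xFF
    total = dest + src
    result = total & 0xFF
    cf = total > 0xFF  # carry out of the MSB
    # Overflow: both operands have the same sign and the result's sign differs.
    of = bool((dest ^ result) & (src ^ result) & 0x80)
    return result, _flags(result, cf, of)


def sub8_flags(dest, src):
    dest, src = dest & 0xFF, src & 0xFF
    result = (dest - src) & 0xFF
    cf = src > dest  # a borrow was needed
    # Overflow: operands have different signs and the result's sign
    # differs from the sign of dest.
    of = bool((dest ^ src) & (dest ^ result) & 0x80)
    return result, _flags(result, cf, of)


print(add8_flags(127, 1))
# (128, {'ZF': False, 'SF': True, 'CF': False, 'OF': True})
print(sub8_flags(0x80, 1))   # 0x80 is -128 as a signed byte
# (127, {'ZF': False, 'SF': False, 'CF': False, 'OF': True})
print(add8_flags(0xFF, 1))
# (0, {'ZF': True, 'SF': False, 'CF': True, 'OF': False})
```

The last case shows that the two flags are independent: FFh + 1 wraps around as an unsigned sum (Carry set), but as a signed sum it is -1 + 1 = 0, which is in range (Overflow clear).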

 

 

 

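- There are several data-related operators and directives.

- OFFSET returns the distance in bytes of a variable from the beginning of its segment.

- PTR is an operator that overrides a variable's default size when the variable is used as an operand.

- TYPE returns the size in bytes of a single element.

- LENGTHOF returns the number of elements in an array.

- SIZEOF returns the total number of bytes, that is, LENGTHOF multiplied by TYPE.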
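The relationship between these operators can be sketched in Python. Everything here is a model under assumptions: `Variable`, the upper-case functions, `read_ptr`, and the variable names are hypothetical, and the data segment is assumed to start at offset 0.

```python
from dataclasses import dataclass


@dataclass
class Variable:
    offset: int        # what OFFSET would return
    element_size: int  # bytes per element
    count: int         # number of elements


def TYPE(v):
    return v.element_size


def LENGTHOF(v):
    return v.count


def SIZEOF(v):
    return LENGTHOF(v) * TYPE(v)


# Three arrays laid out one after another in a data segment.
byteList = Variable(offset=0, element_size=1, count=5)    # BYTE, 5 elements
wordList = Variable(offset=5, element_size=2, count=3)    # WORD, 3 elements
dwordList = Variable(offset=11, element_size=4, count=4)  # DWORD, 4 elements

print(SIZEOF(byteList), SIZEOF(wordList), SIZEOF(dwordList))  # 5 6 16


def read_ptr(memory, offset, size):
    """Model a size override: read 'size' bytes regardless of the declared type."""
    return int.from_bytes(memory[offset:offset + size], "little")


# A DWORD holding 12345678h, read back with two different size overrides.
my_double = (0x12345678).to_bytes(4, "little")
print(hex(read_ptr(my_double, 0, 2)))  # 0x5678  (like WORD PTR)
print(hex(read_ptr(my_double, 0, 1)))  # 0x78    (like BYTE PTR)
```

Because of little-endian storage, overriding the size of a DWORD to a WORD or a BYTE reads its low-order part.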

 

- The LABEL directive inserts a label and gives it a size attribute without allocating any storage, so the label becomes a second name, with a different size, for the data that follows it.