Assembly Language Fundamentals
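- Main PROC / INVOKE ExitProcess,0 / Main ENDP is the basic code skeleton that every program must contain.
- Executing INVOKE ExitProcess,0 jumps to a specific routine inside the OS, which ends the program.
- A program is divided into a .data segment, where variables are declared, and a .code segment, where instructions are written.
- In assembly language, the radix of a numeric literal is given by a suffix after the digits: d (decimal), h (hexadecimal), b (binary), or q (octal).
- If no suffix is given, the default is decimal.
- Operators that can be applied between numeric literals include +, -, *, /, and MOD.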
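- As a minimal sketch of the literal forms above, assuming MASM with the Irvine32 library, the following program defines the same value in several radixes; the variable names are hypothetical and chosen only for illustration:

  ```asm
  INCLUDE Irvine32.inc

  .data
  decVal   BYTE 26            ; decimal, default radix (no suffix)
  decVal2  BYTE 26d           ; decimal, explicit suffix
  hexVal   BYTE 1Ah           ; hexadecimal, same value as 26
  hexLead  BYTE 0A3h          ; a hex literal that starts with a letter needs a leading 0
  binVal   BYTE 11010b        ; binary, same value as 26
  octVal   BYTE 32q           ; octal, same value as 26
  product  BYTE (2 + 3) * 4   ; constant expression, evaluated by the assembler to 20
  remain   BYTE 12 MOD 5      ; MOD gives the remainder, here 2

  .code
  main PROC
      INVOKE ExitProcess,0
  main ENDP
  END main
  ```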
- Identifiers such as variable names must not be reserved words: instruction mnemonics (MOV, ADD, SUB), register names, and so on.
- The first character of an identifier must be a letter, _, $, or @; it cannot be a digit.
- An assembly instruction is divided into four basic parts:
- [label:] mnemonic [operands] [;comment]
- A label is an identifier that acts as a place marker for instructions or data. A data label is called a 'variable', while a code label is simply called a 'label' and ends with a colon.
- Instruction mnemonics include 'MOV', 'ADD', 'SUB', 'MUL', and 'JMP'.
- An instruction is translated into machine code by the assembler and executed by the processor at runtime.
- Operands are what the instruction is applied to; an operand can be a register (such as eax), memory data, or a constant expression.
- Comments are short descriptions of the program and start with a semicolon.
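- The four parts of an instruction can be seen in a short sketch like the one below, assuming MASM with the Irvine32 library; the label name 'done' is hypothetical:

  ```asm
  INCLUDE Irvine32.inc

  .code
  main PROC
          mov  eax, 5        ; mnemonic, two operands (register, constant), comment
          jmp  done          ; one operand: the code label to jump to
          add  eax, 1        ; skipped because of the jump above
  done:   sub  eax, 2        ; 'done:' is a code label; EAX = 3 after this line
          INVOKE ExitProcess,0
  main ENDP
  END main
  ```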
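- 'CALL DumpRegs' is a procedure call that displays the contents of the registers and flags.
- The first two rows show the values of the 32-bit general-purpose registers; the next row shows EIP (extended instruction pointer) and EFL (extended flags register) along with the CF, SF, ZF, and OF flags.
- INCLUDE Irvine32.inc copies definitions and setup information from Irvine32.inc into the program.
- It also reserves stack space by supplying the '.stack 4096' directive, which sets aside 4096 bytes for the runtime stack.
- Overall, the basic template for assembly programming is: INCLUDE Irvine32.inc, a .data segment for variables, and a .code segment holding the Main PROC / INVOKE ExitProcess,0 / Main ENDP skeleton, closed by an END directive that names the entry procedure.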
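- A minimal sketch of that template, assuming MASM with the Irvine32 library installed; the arithmetic is only a placeholder so that DumpRegs has something to show:

  ```asm
  INCLUDE Irvine32.inc          ; definitions and setup information

  .data
  ; variables are declared here

  .code
  main PROC
      mov  eax, 10000h          ; EAX = 10000h
      add  eax, 40000h          ; EAX = 50000h
      sub  eax, 20000h          ; EAX = 30000h
      call DumpRegs             ; display the registers and flags
      INVOKE ExitProcess,0      ; return control to the OS
  main ENDP
  END main
  ```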
- The assembler translates the source file into an object file; the linker then combines the object file with the link library to produce an executable file.
- Finally, the OS loader loads the executable file into memory and runs it.
- Integer data types come in 8-, 16-, 32-, and 64-bit sizes, each either unsigned or signed.
- As mentioned above, a data definition statement in the .data segment sets aside storage in memory for a variable.
- [name] directive initializer [,initializer]...
- At least one initializer is required; "?" is allowed and leaves the variable uninitialized.
- The 'offset' of a variable is its distance in bytes from the beginning of the segment that contains it.
- The assembler automatically associates each variable name with its offset.
- Instead of 'BYTE', the legacy directive 'db' can be used to allocate 1 byte of data.
- Multiple initializers, separated by commas, can be given in one data definition.
- When a string is used as an initializer, it is good practice to put a 0 after it (a null-terminated string).
- The BYTE directive is also well suited to allocating strings of any length.
- The $ operator returns the current location counter, that is, the offset of the current statement.
- The DUP operator is useful for allocating space for a string or array, for example BYTE 20 DUP(0).
- Instead of 'WORD', 'dw' can be used to allocate 2 bytes of data.
- Instead of 'DWORD', 'dd' can be used to allocate 4 bytes of data.
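- The data definition forms above can be sketched as follows, assuming MASM with the Irvine32 library; all variable names are hypothetical:

  ```asm
  INCLUDE Irvine32.inc

  .data
  value1   BYTE  'A'                ; one byte with a character initializer
  value2   db    ?                  ; db is the legacy form of BYTE; ? leaves it uninitialized
  list     BYTE  10, 20, 30, 40     ; multiple initializers: four consecutive bytes
  greeting BYTE  "Hello", 0         ; null-terminated string
  buffer   BYTE  20 DUP(0)          ; 20 bytes, all zero
  pattern  BYTE  4 DUP("AB")        ; 8 bytes: "ABABABAB"
  word1    WORD  65535              ; largest unsigned 16-bit value
  word2    dw    1000h              ; dw is the legacy form of WORD
  dword1   DWORD 12345678h          ; 4 bytes
  dword2   dd    5 DUP(?)           ; five uninitialized doublewords (dd = DWORD)

  .code
  main PROC
      INVOKE ExitProcess,0
  main ENDP
  END main
  ```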
- Multi-byte integers are stored in reverse byte order (little-endian order): the least significant byte is stored at the lowest address.
- For example, the value 12345678h is stored as 78/56/34/12 in memory.
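- A small sketch of little-endian storage, assuming MASM with the Irvine32 library; it reads single bytes of a doubleword with BYTE PTR, and the variable name 'val' is hypothetical:

  ```asm
  INCLUDE Irvine32.inc

  .data
  val DWORD 12345678h               ; stored in memory as 78h 56h 34h 12h

  .code
  main PROC
      mov  al, BYTE PTR val         ; AL = 78h: the lowest address holds the least significant byte
      mov  bl, BYTE PTR [val+3]     ; BL = 12h: the highest address holds the most significant byte
      INVOKE ExitProcess,0
  main ENDP
  END main
  ```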
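- Instead of using only registers, we can also use variables in memory to get the same functionality.
- The 'equal-sign directive (=)' associates a name with an integer expression and is known as a redefinable equate.
- When the program is assembled, each occurrence of the name is replaced by the value of the expression; for example, after COUNT = 500, the statement 'mov eax, COUNT' is assembled as 'mov eax, 500'.
- The '$ operator' is used to calculate the sizes of arrays and strings: subtracting the array's offset from $ immediately after the array gives its size in bytes.
- The 'EQU directive' is similar to the equal-sign directive, but a symbol defined with it cannot be redefined later in the program.
- The 'TEXTEQU directive' is similar to EQU and is used to create a text macro.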
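- The symbolic constant directives above can be sketched together as follows, assuming MASM with the Irvine32 library; every symbol name here is hypothetical:

  ```asm
  INCLUDE Irvine32.inc

  COUNT = 500                          ; redefinable equate
  matrixSize EQU 10 * 10               ; EQU: fixed at 100 for the rest of the program
  rowSize = 5
  setupAL TEXTEQU <mov al, rowSize>    ; text macro

  .data
  list     BYTE 10, 20, 30, 40
  ListSize = ($ - list)                ; 4: bytes between list and the current location

  .code
  main PROC
      mov  eax, COUNT                  ; assembled as: mov eax, 500
      COUNT = 100                      ; redefinition is allowed with =
      mov  ebx, COUNT                  ; assembled as: mov ebx, 100
      mov  ecx, ListSize               ; assembled as: mov ecx, 4
      mov  edx, matrixSize             ; assembled as: mov edx, 100
      setupAL                          ; expands to: mov al, rowSize
      INVOKE ExitProcess,0
  main ENDP
  END main
  ```
- Note that ListSize must be defined immediately after list; otherwise $ would also count whatever data follows the array.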
- For 64-bit programming, we should use 64-bit registers and variables.
- For example, we use QWORD instead of DWORD, and RAX instead of EAX.
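- A minimal 64-bit sketch, assuming the 64-bit Microsoft assembler, where INCLUDE Irvine32.inc and INVOKE are not used and ExitProcess is called directly; the variable name 'sum' is hypothetical:

  ```asm
  ExitProcess PROTO             ; declare the external OS routine

  .data
  sum QWORD 0                   ; 64-bit variable instead of DWORD

  .code
  main PROC
      mov  rax, 5               ; RAX instead of EAX
      add  rax, 6               ; RAX = 11
      mov  sum, rax             ; both operands are 64 bits wide
      mov  ecx, 0               ; exit code is passed in ECX
      call ExitProcess
  main ENDP
  END
  ```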
- A direct memory operand is a variable name: variables are references to offsets in memory, so naming one tells the instruction which memory location to read or write.
- The MOV instruction requires both operands to be the same size.
- If the sizes do not match, the assembler reports an error.
- There is no direct memory-to-memory move, so instead of "MOV var2, var1" we first move var1 into a register such as eax and then move that register into var2.
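- The MOV rules above can be sketched as follows, assuming MASM with the Irvine32 library; var1, var2, and wVal are hypothetical variables, and the invalid forms are left as comments so the program still assembles:

  ```asm
  INCLUDE Irvine32.inc

  .data
  var1 DWORD 12345678h
  var2 DWORD ?
  wVal WORD  1000h

  .code
  main PROC
      ; mov var2, var1          ; error: memory-to-memory move is not allowed
      ; mov eax, wVal           ; error: 32-bit destination, 16-bit source
      mov  eax, var1            ; step 1: memory to register
      mov  var2, eax            ; step 2: register to memory, var2 = 12345678h
      mov  ax, wVal             ; sizes match: both operands are 16 bits
      INVOKE ExitProcess,0
  main ENDP
  END main
  ```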
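- "MOVZX rxx,r/mxx" is the move-with-zero-extend instruction, which copies a smaller source operand into a larger destination register, so the two operands have different sizes.
- In this case, the remaining upper bits of the destination are filled with 0s.
- "MOVSX rxx,r/mxx" is the move-with-sign-extend instruction, which also copies a smaller source operand into a larger destination register.
- In this case, the remaining upper bits are filled with copies of the source's sign bit: 1s if the source is negative, 0s otherwise.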
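- The difference between zero extension and sign extension can be sketched as follows, assuming MASM with the Irvine32 library; 'byteVal' is a hypothetical variable:

  ```asm
  INCLUDE Irvine32.inc

  .data
  byteVal BYTE 10001111b        ; 8Fh, sign bit is 1

  .code
  main PROC
      movzx eax, byteVal        ; EAX = 0000008Fh: upper 24 bits filled with 0s
      movsx ebx, byteVal        ; EBX = FFFFFF8Fh: upper 24 bits copy the sign bit (1)
      mov   cl, 7Fh             ; sign bit is 0
      movsx edx, cl             ; EDX = 0000007Fh: upper 24 bits copy the sign bit (0)
      INVOKE ExitProcess,0
  main ENDP
  END main
  ```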
- "XCHG" is the instruction for exchanging the contents of two operands; the allowed forms are "XCHG reg,reg", "XCHG reg,mem", and "XCHG mem,reg".
- As with MOV, two memory operands cannot be exchanged directly, so one of them must first be moved into a register.
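- Swapping two memory operands through a register can be sketched as follows, assuming MASM with the Irvine32 library; val1 and val2 are hypothetical variables:

  ```asm
  INCLUDE Irvine32.inc

  .data
  val1 WORD 1000h
  val2 WORD 2000h

  .code
  main PROC
      mov  ax, val1             ; AX = 1000h
      xchg ax, val2             ; AX = 2000h, val2 = 1000h
      mov  val1, ax             ; val1 = 2000h: the two variables are now swapped
      INVOKE ExitProcess,0
  main ENDP
  END main
  ```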
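- An array name by itself refers only to the first element, so the second and third elements cannot be reached by name alone.
- Therefore we add a constant offset to the name, using the "[ ]" operator as in "MOV al, [arrayA+1]".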
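- Direct-offset access can be sketched as follows, assuming MASM with the Irvine32 library; arrayD is a hypothetical second array added to show that the offset counts bytes, not elements:

  ```asm
  INCLUDE Irvine32.inc

  .data
  arrayA BYTE  10h, 20h, 30h
  arrayD DWORD 1, 2, 3

  .code
  main PROC
      mov  al, arrayA           ; AL = 10h: the name alone is the first element
      mov  al, [arrayA+1]       ; AL = 20h: second element
      mov  al, [arrayA+2]       ; AL = 30h: third element
      mov  eax, [arrayD+4]      ; EAX = 2: each DWORD element is 4 bytes apart
      INVOKE ExitProcess,0
  main ENDP
  END main
  ```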
- "ADD dest,src" adds src to dest and stores the sum in dest.
- "SUB dest,src" subtracts src from dest and stores the difference in dest.
- "NEG reg/mem" reverses the sign of a number by converting it to its 2's complement.
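- The three arithmetic instructions above can be sketched as follows, assuming MASM with the Irvine32 library; var1 and var2 are hypothetical variables:

  ```asm
  INCLUDE Irvine32.inc

  .data
  var1 DWORD 10000h
  var2 DWORD 20000h

  .code
  main PROC
      mov  eax, var1            ; EAX = 00010000h
      add  eax, var2            ; EAX = 00030000h (dest = dest + src)
      sub  eax, var1            ; EAX = 00020000h (dest = dest - src)
      mov  ebx, 5
      neg  ebx                  ; EBX = FFFFFFFBh, the 2's complement form of -5
      INVOKE ExitProcess,0
  main ENDP
  END main
  ```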
- The Zero flag is set when the result stored in dest is zero.
- The Sign flag is set when the arithmetic result is negative (its MSB is 1).
- The Carry flag is an unsigned arithmetic flag, while the Overflow flag is a signed arithmetic flag.
- Note that the Carry flag is not changed by "INC" or "DEC".
- The Carry flag is set to 1 when ADD produces an unsigned result too large for the destination, or when SUB needs a borrow into the MSB (a larger unsigned value is subtracted from a smaller one).
- The Overflow flag is set to 1 when ADD or SUB produces a signed result that is outside the range the destination can hold.
- For example, with 8-bit operands, adding 1 to 127 or subtracting 1 from -128 results in overflow.
- To summarize for addition, overflow occurs in the cases (+)+(+)=(-) and (-)+(-)=(+).
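- The flag rules above can be traced with 8-bit operands in a sketch like the following, assuming MASM with the Irvine32 library; the comments give the flag values after each arithmetic instruction (MOV does not change the flags):

  ```asm
  INCLUDE Irvine32.inc

  .code
  main PROC
      mov  al, 0FFh
      add  al, 1                ; AL = 00h: CF = 1 (unsigned 255 + 1 too large), ZF = 1, OF = 0
      mov  al, 0
      sub  al, 1                ; AL = FFh: CF = 1 (borrow), SF = 1, OF = 0
      mov  al, 127
      add  al, 1                ; AL = 80h: OF = 1 ((+)+(+)=(-)), SF = 1, CF = 0
      mov  al, -128
      sub  al, 1                ; AL = 7Fh: OF = 1 (signed result below -128), SF = 0, CF = 0
      mov  al, 0FFh
      inc  al                   ; AL = 00h: ZF = 1, but CF keeps its previous value (0)
      INVOKE ExitProcess,0
  main ENDP
  END main
  ```
- Inserting 'call DumpRegs' after any of the arithmetic instructions would display the flags at that point.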
- There are several data-related operators and directives.
- OFFSET returns the distance in bytes of a variable from the beginning of its segment.
- PTR is an operator that overrides a variable's declared size for a single operand.
- TYPE returns the size in bytes of a single element of a variable.
- LENGTHOF returns the number of elements in an array.
- SIZEOF returns the total number of bytes an array occupies, which equals LENGTHOF multiplied by TYPE.
- The LABEL directive inserts a label and gives it a size attribute without allocating any storage.
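- The operators and the LABEL directive can be sketched together as follows, assuming MASM with the Irvine32 library; all variable names are hypothetical:

  ```asm
  INCLUDE Irvine32.inc

  .data
  dArray   DWORD 1, 2, 3, 4
  myDouble DWORD 12345678h
  val16    LABEL WORD           ; 16-bit name for the same location as val32, no storage
  val32    DWORD 12345678h

  .code
  main PROC
      mov  esi, OFFSET dArray       ; ESI = address of dArray
      mov  eax, TYPE dArray         ; EAX = 4: bytes per element
      mov  ebx, LENGTHOF dArray     ; EBX = 4: number of elements
      mov  ecx, SIZEOF dArray       ; ECX = 16: LENGTHOF * TYPE
      mov  dx, WORD PTR myDouble    ; DX = 5678h: low word of the doubleword
      mov  di, val16                ; DI = 5678h: same bytes as val32, read as a WORD
      INVOKE ExitProcess,0
  main ENDP
  END main
  ```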