Assembly: The Registers and Segments (MASM/TASM)

Posted March 3, 2006 by William_Wilson in Computer programming

This tech recipe lists the names and types of the registers and segments available on a 32-bit processor. (The names convert to 16-bit by dropping the E prefix, e.g. AX, and to 64-bit by replacing it with R, e.g. RAX.)


Since there are no other tech recipes on Assembly Language, it would be irrational to start anywhere but the basics of hardware in assembly programming.

I myself do not use a Pentium processor, but since the assembler for which I am writing is specific to Pentium, some names may be off.

When using MASM or TASM, these naming conventions work on any processor of the corresponding size (e.g., 32-bit or 64-bit).

The Registers:
Visual Representation:

------------------------------------
|1111111111111111|22222222|33333333|
------------------------------------

This applies to EAX, EBX, ECX and EDX:
*EAX will be the example.
-The entire register (32 bits), containing the 1’s, 2’s and 3’s, is EAX.
-The section filled with 1’s is the upper or left 16 bits of EAX, which has no name of its own and cannot be accessed directly.
-The other half, filled with 2’s and 3’s, is the lower or right 16 bits of EAX, which can be used as the 16-bit register AX.
-The section of 2’s is the upper or left 8 bits of AX, which can be used as the 8-bit register AH (register A, higher bits H).
-The section of 3’s is the lower or right 8 bits of AX, which can be used as the 8-bit register AL (register A, lower bits L).
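
The nesting described above can be sketched with bit masks. This is a minimal Python model rather than assembly, and the helper names (upper16, ax, ah, al) are made up for illustration; on real hardware the processor exposes these views directly through the register names.

```python
# Model a 32-bit EAX value as a plain integer and carve out its named parts.
# The function names are illustrative only, not assembler syntax.

def upper16(eax):
    """Upper 16 bits of EAX (the 1's). No register name exists for this
    part, so real code reaches it by shifting or rotating EAX."""
    return (eax >> 16) & 0xFFFF

def ax(eax):
    """Lower 16 bits of EAX (the 2's and 3's)."""
    return eax & 0xFFFF

def ah(eax):
    """Upper 8 bits of AX (the 2's)."""
    return (eax >> 8) & 0xFF

def al(eax):
    """Lower 8 bits of AX (the 3's)."""
    return eax & 0xFF

eax = 0x12345678
print(hex(upper16(eax)), hex(ax(eax)), hex(ah(eax)), hex(al(eax)))
# prints: 0x1234 0x5678 0x56 0x78
```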

To show a number spanning multiple registers, the : notation is used, with the upper half written first. (e.g., EAX:EBX means the upper 32 bits are in EAX and the lower 32 bits are in EBX, which makes a 64-bit number.)
This is mostly used when multiplying and dividing, because the dividend is twice as wide as the divisor: a 16-bit divisor divides the 32-bit number in DX:AX, and a 32-bit divisor divides the 64-bit number in EDX:EAX.
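
As a rough illustration of the pair notation, the Python sketch below joins two 32-bit halves into one 64-bit value and divides it, roughly the way the x86 DIV instruction treats EDX:EAX when given a 32-bit divisor. The function names are invented for this example, and the sketch assumes unsigned values only.

```python
MASK32 = 0xFFFFFFFF

def combine(high, low):
    """Form the 64-bit value high:low from two 32-bit halves."""
    return ((high & MASK32) << 32) | (low & MASK32)

def div32(edx, eax, divisor):
    """Divide the 64-bit value EDX:EAX by a 32-bit divisor.

    Returns (quotient, remainder), which DIV leaves in EAX and EDX.
    Raises when the divisor is zero or the quotient needs more than
    32 bits, the two cases where the real instruction faults.
    """
    divisor &= MASK32
    if divisor == 0:
        raise ZeroDivisionError("divide error: divisor is zero")
    quotient, remainder = divmod(combine(edx, eax), divisor)
    if quotient > MASK32:
        raise OverflowError("divide error: quotient does not fit in 32 bits")
    return quotient, remainder

print(div32(0, 10, 3))   # prints: (3, 1)
```

Note that the upper half must be set up before dividing; leaving stale data in it is a common cause of unexpected divide errors.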

Register Type and Use
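General Use Registers (described above):
(E)AX: Accumulator Register (arithmetic and return values)
(E)BX: Base Register (addresses and offsets)
(E)CX: Counter Register (loop, shift and repeat counts)
(E)DX: Data Register (I/O port addresses, upper half of multiply and divide operands)
Outside of these roles, all four can hold any value.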

Pointers Used During Code Execution and Jumps:
(E)IP: Instruction Pointer (points to the next instruction as an offset into the Code Segment)
(E)BP: Base Pointer (usually set to a copy of (E)SP at the start of a procedure so that parameters and local variables can be reached at fixed offsets)
(E)SP: Stack Pointer (points to the ‘end’ of the stack, i.e. the most recently pushed item)

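I used the phrase “end of the stack” since, in assembly, the stack builds in a downward order: each push moves (E)SP to a lower address. Therefore, a stack overflow is the result of placing more data on the stack than is allocated, which leaves (E)SP pointing beyond the beginning of the stack segment. In turn, a stack underflow is the result of removing more data than was placed on the stack, which leaves (E)SP pointing beyond its starting position.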
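
To make the direction of growth concrete, here is a small Python model of a downward-growing stack. The class, its 16-byte size, and the 4-byte slot width are assumptions chosen for the example; a real program gets its stack from the assembler's stack directive and the operating system.

```python
class Stack:
    """Toy model of a stack segment that grows toward lower offsets."""

    def __init__(self, size=16, slot=4):
        self.size = size      # bytes reserved for the stack
        self.slot = slot      # bytes moved per push or pop
        self.sp = size        # empty stack: SP sits just past the highest offset
        self.memory = {}      # offset -> value, standing in for real memory

    def push(self, value):
        if self.sp - self.slot < 0:
            raise OverflowError("stack overflow: SP would pass offset 0")
        self.sp -= self.slot              # SP moves down first
        self.memory[self.sp] = value      # then the value is stored at the new top

    def pop(self):
        if self.sp >= self.size:
            raise IndexError("stack underflow: nothing left to pop")
        value = self.memory.pop(self.sp)  # read the value at the top
        self.sp += self.slot              # SP moves back up
        return value

s = Stack()
s.push(0xAA)
s.push(0xBB)
print(s.sp)        # prints: 8
print(s.pop())     # prints: 187
print(s.sp)        # prints: 12
```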

Segments:
CS: code segment (selects the segment holding the program’s instructions; (E)IP is an offset into it)
SS: stack segment (selects the segment holding the stack; (E)SP is an offset into it)
DS: data segment (the default segment for data elements)
ES: extra segment (an additional data segment; the default destination for string instructions)
FS: extra segment (an additional data segment, reached with a segment override)
GS: extra segment (an additional data segment, reached with a segment override)

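There are six segment registers, so at most six segments (one code, one stack and up to four data) can be addressed at a single time. A program may define more segments than this, but a segment register must be reloaded before a different one can be used.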
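
How a segment register and an offset combine into an address depends on the processor mode. The sketch below assumes 16-bit real mode, where the address is the segment value times 16 plus the offset; in 32-bit protected mode the segment register instead holds a selector into a descriptor table, and this arithmetic does not apply. The function name is hypothetical.

```python
def physical_address(segment, offset):
    """Real-mode address for segment:offset (both treated as 16-bit values)."""
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)

# CS:IP = 1234h:0010h
print(hex(physical_address(0x1234, 0x0010)))   # prints: 0x12350

# Different segment:offset pairs can name the same byte.
print(physical_address(0x1000, 0x0100) == physical_address(0x1010, 0x0000))
# prints: True
```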

Questions/Comments: [email protected]
-William. § (marvin_gohan)
