My Avatar

Shilong ZHAO

x86 History and Memory Models

2016-05-19 00:00:00 +0200

In case you have any questions or suggestions, you can leave comments HERE . Thanks!

8080

Released in 1974, Intel 8080 is a 8-bit CPU, which means it processes 8 bits of information at one time. However, it has 16 address lines coming out, which means it can access or 64K bytes.

The operating system most used with 8080 processor was CP/M-80 (CP/M short for Control Program for Microcomputers).

8086, 8088, 80186

Intel 8086 (1978), 8088 (8086 variant, 1979), 80186(1982) are called 16-bit because their internal registers are 16 bits in size. These processors had 20 address lines, numbered A0 to A19; with these, the processor can access  bytes or 1 megabyte.

However, to make 8086 backward compatible with CP/M-80 softwares run on 8080, the 8086 was set up in a way such that a program could take a portion of 64K byte segment within the megabyte of memory and run entirely inside it as if it runs in a 8080 system.

The design was achieved by using segment registers, they are

All segment registers are 16 bits in size, regardless of the CPU type. This is also true for 32-bit CPUs.

A segment in 8086 may start at any memory address that is a multiple of 16 (paragraph) and have a fixed size of 64K bytes. Note that two segments could overlap with each other, since there is no memory protection mechanism yet.

Since the internal address registers of these processors only had 16 bits. To access a 20-bit address space, an external memory reference was made up of a 16-bit Offset address added to a 16-bit Segment number, shifted 4 bits so as to produce a 20-bit physical address. The resulting address is equal to Segment * 16 + Offset. 

80286

The intel 80286 had a 24-bit address bus and was able to address up to 16 MB of RAM, compared to 1 MB for its predecessor.

The 286 was the first of the x86 CPU family to support protected mode. All the x86 CPUs before 80286 worked in real mode1.

A20 Line: A historical bug is now a routine

As we said earlier, the 80286’s predecessor, 8086, 8088, 80186 processors had only 20 address lines, numbered A0 to A19 and can access 1 megabyte.

When IBM designed the IBM PC AT machine, it decided to use the new higher-performance Intel 80286 microprocessor, since it could address up to 16 megabytes of system memory in protected mode.

The 80286 CPU was supposed to emulate an 8086’s behavior in real mode, which is its mode at system start up, so it could run operating systems and programs that were not written for protected mode (again for compatibility reason).

However, the 80286 had a bug: it failed to force the A20 line to zero in real mode.

For its predecessors, the segment:offset address will be rounded to the begining of the memory address when the value segment*16+offset is larger than 0xFFFFF (1M-1), so that the combination F800:8000 will be wrapped around to address 0x00000h, the memory’s first byte. But in 80286, for example, would no longer point to the physical address 0x00000000 but the correct address 0x00100000.

Enabling the Gate-A20 line is one of the first steps a protected mode x86 operating system does in the boot up process, often before control has been passed onto the kernel from the bootstrap (in the case of Linux, for example)2.

Intel no longer supports the A20 gate starting with Haswell. Page 271 of the Intel System Programmers Manual Vol. 3A from June 2013 #cite_note-states: “The functionality of A20M# is used primarily by older operating systems and not used by modern operating systems. On newer Intel 64 processors, A20M# may be absent.”3

Protected Mode

As we remembered in the real mode, each logical address points directly into physical memory location, every logical address consists of two 16 bit parts: The segment part of the logical address contains the base address of a segment with a granularity of 16 bytes.

the physical address can be calculated as physical_address : = segment_part × 16 + offset (if the address line A20 is enabled), respectively (segment_part × 16 + offset) mod 2^20 (if A20 is off).Every segment has a size of  bytes.

In protected mode, the segment_part is replaced by a 16-bit segment selector, in which the 13 upper bits (bit 3 to bit 15) contain the index of an entry inside a descriptor table. The next bit (bit 2), Table Indicator (TI) specifies whether the operation is used with the GDT or the LDT. The lowest two bits (bit 1 and bit 0) of the selector, Request Privilege Level (PRL) are combined to define the privilege of the request, where the values of 0 and 3 represent the highest and the lowest priority, respectively. This means that the byte offset of descriptors in the descriptor table is the same as the 16-bit selector, provided the lower three bits are zeroed.

Segment Selector

In protected mode, the segmentation registers aforementioned, i.e. CS, DS, SS are now holding the segment selectors instead of the segment addresses in the real mode.

The Global Descriptor Table(GDT) is an array of Segment Descriptors. It is a data structure used by Intel x86-family processors starting with the 80286 in order to define the characteristics of the various memory areas used during program execution, including the base address, the size and access privileges like executability and writability. 

The first segment descriptor of the GDT is always set to 0. This ensures that logical addresses with a null Segment Selector will be considered invalid, thus causing a processor exception.

A segment descriptor is 8 bytes long. Suppose the GDT table is at 0x00020000 (the value stored in the gdtr register) and the index specified by the Segment Selector is 2, the address of the corresponding Segment Descriptor is 0x00020000 + (2 × 8), or 0x00020010. After the Segment Descriptor is located, the segment base address and limit can be retrieved into corresponding registers.

Segment Descriptor

To enter protected mode, the GDT must first be created with a minimum of three entries: a null descriptor, a code segment descriptor and data segment descriptor. In an IBM-compatible machine, the A20 line(21st address line) also must be enabled to allow the use of all the address lines so that the CPU can access beyond 1 megabyte of memory (Only the first 20 are allowed to be used after power-up, to guarantee compatibility with older software written for the Intel 8088-based IBM PC and PC/XT models). After performing those two steps, the PE bit must be set in the CR0 register and a far jump must be made to clear the pre-fetch input queue.

BIOS uses Real Mode, since there is no GDT nor LDT yet in the memory at startup, and the code that initializes the GDT, LDT and paging tables must run in read mode.

An example of simple boot loader4, you will see the parts about GDT, A20, etc:

80386

When 80386(1986-2007) and later Intel CPUs came out, they can access 4GB of memory (16-bit segment register, 32-bit index register and 32-bit instruction pointer), but again, to maintain backward compatibility with 8086 and 8088, they are able to limit themselves to what the older chips could perform.

The 80386 added a 32-bit architecture and a paging translation unit, which made it much easier to implement operating systems that used virtual memory. It also offered support for register debugging5.

For 386 and its successors, there are control registers that changes or controls the general behavior of a CPU.

The CR0 register is 32 bits long on the 386 and higher processors. On x86-64 processors in long mode, it (and the other control registers) is 64 bits long. CR0 has various control flags that modify the basic operation of the processor. The two bits related with segmentation and paging are bit 0 PE and bit 31 PG6.

When Protected Mode Enable (PE) is set to 1, system is in protected mode; otherwise system is in real mode. This is usually done in the boot loader before kernel has been loaded.

If PG = 1, enable paging and use CR3 register else paging will be disabled. This is performed after kernel has been loaded.

Typically, the upper 20 bits of CR3 become the page directory base register(PDBR), which stores the physical address of the first page directory entry.

Regular Paging in Hardware

The physical address of the Page Directory in use is stored in the control register CR3.

Since the 80386, the Intel processors paging unit handles pages of size 4KB.

The 32 bits linear address is divided into 3 parts:

PDE

 

PTE

 

Memory Mapping

Extended Paging

Starting with Pentium, Intel added to x86 4MB pages instead of 4KB.

80486

Just as in the 80386, a simple flat 4 GB memory model could be implemented by setting all “segment selector” registers to a neutral value in protected mode, or setting (the same) “segment registers” to zero in real mode, and using only the 32-bit “offset registers” (x86-terminology for general CPU registers used as address registers) as a linear 32-bit virtual address bypassing the segmentation logic. Virtual addresses were then normally mapped onto physical addresses by the paging system except when it was disabled. (Real mode had no virtual addresses.) Just as with the 80386, circumventing memory segmentation could substantially improve performance in some operating systems and applications.

On a typical PC motherboard, either four matched 30-pin (8-bit) SIMMs or one 72-pin (32-bit) SIMM per bank were required to fit the 486’s 32-bit data bus. The address bus used 30-bits (A31..A2) complemented by four byte-select pins (instead of A0,A1) to allow for any 8/16/32-bit selection. This meant that the limit of directly addressable physical memory was 4 gigabytes as well,(230 32-bitwords = 232 8-bit words)7.

Recap

x86 Modes

x86 Processor has

Segmentation

Segmentation cannot be disabled, but can be a no-op (aka flat mode); a segment has

Actually Linux does not use segmentation (only paging is used), that means, the data segment, text segment are the same memory area, starts at 0x0, length equal to memory size.

 

  1. https://en.wikipedia.org/wiki/X86_memory_segmentation 

  2. https://www.win.tue.nl/~aeb/linux/kbd/A20.html 

  3. https://en.wikipedia.org/wiki/A20_line 

  4. https://pdos.csail.mit.edu/6.828/2014/labs/lab1/ 

  5. https://en.wikipedia.org/wiki/Intel_80386 

  6. https://en.wikipedia.org/wiki/Control_register#CR0 

  7. https://en.wikipedia.org/wiki/Intel_80486