Intel Architecture Software Developer’s Manual · 2005. 11. 21. · Intel Architecture Software Developer’s Manual Volume 3: System Programming NOTE: The Intel Architecture Software - [PDF Document] (2024)


Intel Architecture Software Developer’s Manual

Volume 3: System Programming

NOTE: The Intel Architecture Software Developer’s Manual consists of three volumes: Basic Architecture, Order Number 243190; Instruction Set Reference, Order Number 243191; and the System Programming Guide, Order Number 243192.

Please refer to all three volumes when evaluating your design needs.

1999


Information in this document is provided in connection with Intel products. No license, express or implied, by estoppel or otherwise, to any intellectual property rights is granted by this document. Except as provided in Intel’s Terms and Conditions of Sale for such products, Intel assumes no liability whatsoever, and Intel disclaims any express or implied warranty, relating to sale and/or use of Intel products including liability or warranties relating to fitness for a particular purpose, merchantability, or infringement of any patent, copyright or other intellectual property right. Intel products are not intended for use in medical, life saving, or life sustaining applications.

Intel may make changes to specifications and product descriptions at any time, without notice.

Designers must not rely on the absence or characteristics of any features or instructions marked “reserved” or “undefined.” Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them.

Intel’s Intel Architecture processors (e.g., Pentium®, Pentium® II, Pentium® III, and Pentium® Pro processors) may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.

Copies of documents which have an ordering number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or by visiting Intel's literature center at http://www.intel.com.

COPYRIGHT © INTEL CORPORATION 1999

*THIRD-PARTY BRANDS AND NAMES ARE THE PROPERTY OF THEIR RESPECTIVE OWNERS.


TABLE OF CONTENTS

CHAPTER 1
ABOUT THIS MANUAL
1.1. P6 FAMILY PROCESSOR TERMINOLOGY
1.2. OVERVIEW OF THE INTEL ARCHITECTURE SOFTWARE DEVELOPER’S MANUAL, VOLUME 3: SYSTEM PROGRAMMING GUIDE
1.3. OVERVIEW OF THE INTEL ARCHITECTURE SOFTWARE DEVELOPER’S MANUAL, VOLUME 1: BASIC ARCHITECTURE
1.4. OVERVIEW OF THE INTEL ARCHITECTURE SOFTWARE DEVELOPER’S MANUAL, VOLUME 2: INSTRUCTION SET REFERENCE
1.5. NOTATIONAL CONVENTIONS
1.5.1. Bit and Byte Order
1.5.2. Reserved Bits and Software Compatibility
1.5.3. Instruction Operands
1.5.4. Hexadecimal and Binary Numbers
1.5.5. Segmented Addressing
1.5.6. Exceptions
1.6. RELATED LITERATURE

CHAPTER 2
SYSTEM ARCHITECTURE OVERVIEW
2.1. OVERVIEW OF THE SYSTEM-LEVEL ARCHITECTURE
2.1.1. Global and Local Descriptor Tables
2.1.2. System Segments, Segment Descriptors, and Gates
2.1.3. Task-State Segments and Task Gates
2.1.4. Interrupt and Exception Handling
2.1.5. Memory Management
2.1.6. System Registers
2.1.7. Other System Resources
2.2. MODES OF OPERATION
2.3. SYSTEM FLAGS AND FIELDS IN THE EFLAGS REGISTER
2.4. MEMORY-MANAGEMENT REGISTERS
2.4.1. Global Descriptor Table Register (GDTR)
2.4.2. Local Descriptor Table Register (LDTR)
2.4.3. IDTR Interrupt Descriptor Table Register
2.4.4. Task Register (TR)
2.5. CONTROL REGISTERS
2.5.1. CPUID Qualification of Control Register Flags
2.6. SYSTEM INSTRUCTION SUMMARY
2.6.1. Loading and Storing System Registers
2.6.2. Verifying of Access Privileges
2.6.3. Loading and Storing Debug Registers
2.6.4. Invalidating Caches and TLBs
2.6.5. Controlling the Processor
2.6.6. Reading Performance-Monitoring and Time-Stamp Counters
2.6.7. Reading and Writing Model-Specific Registers
2.6.8. Loading and Storing the Streaming SIMD Extensions Control/Status Word


CHAPTER 3
PROTECTED-MODE MEMORY MANAGEMENT
3.1. MEMORY MANAGEMENT OVERVIEW
3.2. USING SEGMENTS
3.2.1. Basic Flat Model
3.2.2. Protected Flat Model
3.2.3. Multisegment Model
3.2.4. Paging and Segmentation
3.3. PHYSICAL ADDRESS SPACE
3.4. LOGICAL AND LINEAR ADDRESSES
3.4.1. Segment Selectors
3.4.2. Segment Registers
3.4.3. Segment Descriptors
3.4.3.1. Code- and Data-Segment Descriptor Types
3.5. SYSTEM DESCRIPTOR TYPES
3.5.1. Segment Descriptor Tables
3.6. PAGING (VIRTUAL MEMORY)
3.6.1. Paging Options
3.6.2. Page Tables and Directories
3.6.2.1. Linear Address Translation (4-KByte Pages)
3.6.2.2. Linear Address Translation (4-MByte Pages)
3.6.2.3. Mixing 4-KByte and 4-MByte Pages
3.6.3. Base Address of the Page Directory
3.6.4. Page-Directory and Page-Table Entries
3.6.5. Not Present Page-Directory and Page-Table Entries
3.7. TRANSLATION LOOKASIDE BUFFERS (TLBS)
3.8. PHYSICAL ADDRESS EXTENSION
3.8.1. Linear Address Translation With Extended Addressing Enabled (4-KByte Pages)
3.8.2. Linear Address Translation With Extended Addressing Enabled (2-MByte or 4-MByte Pages)
3.8.3. Accessing the Full Extended Physical Address Space With the Extended Page-Table Structure
3.8.4. Page-Directory and Page-Table Entries With Extended Addressing Enabled
3.9. 36-BIT PAGE SIZE EXTENSION (PSE)
3.9.1. Description of the 36-bit PSE Feature
3.9.2. Fault Detection
3.10. MAPPING SEGMENTS TO PAGES

CHAPTER 4
PROTECTION
4.1. ENABLING AND DISABLING SEGMENT AND PAGE PROTECTION
4.2. FIELDS AND FLAGS USED FOR SEGMENT-LEVEL AND PAGE-LEVEL PROTECTION
4.3. LIMIT CHECKING
4.4. TYPE CHECKING
4.4.1. Null Segment Selector Checking
4.5. PRIVILEGE LEVELS
4.6. PRIVILEGE LEVEL CHECKING WHEN ACCESSING DATA SEGMENTS
4.6.1. Accessing Data in Code Segments
4.7. PRIVILEGE LEVEL CHECKING WHEN LOADING THE SS REGISTER


4.8. PRIVILEGE LEVEL CHECKING WHEN TRANSFERRING PROGRAM CONTROL BETWEEN CODE SEGMENTS
4.8.1. Direct Calls or Jumps to Code Segments
4.8.1.1. Accessing Nonconforming Code Segments
4.8.1.2. Accessing Conforming Code Segments
4.8.2. Gate Descriptors
4.8.3. Call Gates
4.8.4. Accessing a Code Segment Through a Call Gate
4.8.5. Stack Switching
4.8.6. Returning from a Called Procedure
4.9. PRIVILEGED INSTRUCTIONS
4.10. POINTER VALIDATION
4.10.1. Checking Access Rights (LAR Instruction)
4.10.2. Checking Read/Write Rights (VERR and VERW Instructions)
4.10.3. Checking That the Pointer Offset Is Within Limits (LSL Instruction)
4.10.4. Checking Caller Access Privileges (ARPL Instruction)
4.10.5. Checking Alignment
4.11. PAGE-LEVEL PROTECTION
4.11.1. Page-Protection Flags
4.11.2. Restricting Addressable Domain
4.11.3. Page Type
4.11.4. Combining Protection of Both Levels of Page Tables
4.11.5. Overrides to Page Protection
4.12. COMBINING PAGE AND SEGMENT PROTECTION

CHAPTER 5
INTERRUPT AND EXCEPTION HANDLING
5.1. INTERRUPT AND EXCEPTION OVERVIEW
5.1.1. Sources of Interrupts
5.1.1.1. External Interrupts
5.1.1.2. Maskable Hardware Interrupts
5.1.1.3. Software-Generated Interrupts
5.1.2. Sources of Exceptions
5.1.2.1. Program-Error Exceptions
5.1.2.2. Software-Generated Exceptions
5.1.2.3. Machine-Check Exceptions
5.2. EXCEPTION AND INTERRUPT VECTORS
5.3. EXCEPTION CLASSIFICATIONS
5.4. PROGRAM OR TASK RESTART
5.5. NONMASKABLE INTERRUPT (NMI)
5.5.1. Handling Multiple NMIs
5.6. ENABLING AND DISABLING INTERRUPTS
5.6.1. Masking Maskable Hardware Interrupts
5.6.2. Masking Instruction Breakpoints
5.6.3. Masking Exceptions and Interrupts When Switching Stacks
5.7. PRIORITY AMONG SIMULTANEOUS EXCEPTIONS AND INTERRUPTS
5.8. INTERRUPT DESCRIPTOR TABLE (IDT)
5.9. IDT DESCRIPTORS
5.10. EXCEPTION AND INTERRUPT HANDLING
5.10.1. Exception- or Interrupt-Handler Procedures
5.10.1.1. Protection of Exception- and Interrupt-Handler Procedures
5.10.1.2. Flag Usage By Exception- or Interrupt-Handler Procedure


5.10.2. Interrupt Tasks
5.11. ERROR CODE
5.12. EXCEPTION AND INTERRUPT REFERENCE

CHAPTER 6
TASK MANAGEMENT
6.1. TASK MANAGEMENT OVERVIEW
6.1.1. Task Structure
6.1.2. Task State
6.1.3. Executing a Task
6.2. TASK MANAGEMENT DATA STRUCTURES
6.2.1. Task-State Segment (TSS)
6.2.2. TSS Descriptor
6.2.3. Task Register
6.2.4. Task-Gate Descriptor
6.3. TASK SWITCHING
6.4. TASK LINKING
6.4.1. Use of Busy Flag To Prevent Recursive Task Switching
6.4.2. Modifying Task Linkages
6.5. TASK ADDRESS SPACE
6.5.1. Mapping Tasks to the Linear and Physical Address Spaces
6.5.2. Task Logical Address Space
6.6. 16-BIT TASK-STATE SEGMENT (TSS)

CHAPTER 7
MULTIPLE-PROCESSOR MANAGEMENT
7.1. LOCKED ATOMIC OPERATIONS
7.1.1. Guaranteed Atomic Operations
7.1.2. Bus Locking
7.1.2.1. Automatic Locking
7.1.2.2. Software Controlled Bus Locking
7.1.3. Handling Self- and Cross-Modifying Code
7.1.4. Effects of a LOCK Operation on Internal Processor Caches
7.2. MEMORY ORDERING
7.2.1. Memory Ordering in the Pentium® and Intel486™ Processors
7.2.2. Memory Ordering in the P6 Family Processors
7.2.3. Out of Order Stores From String Operations in P6 Family Processors
7.2.4. Strengthening or Weakening the Memory Ordering Model
7.3. PROPAGATION OF PAGE TABLE ENTRY CHANGES TO MULTIPLE PROCESSORS
7.4. SERIALIZING INSTRUCTIONS
7.5. ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC)
7.5.1. Presence of APIC
7.5.2. Enabling or Disabling the Local APIC
7.5.3. APIC Bus
7.5.4. Valid Interrupts
7.5.5. Interrupt Sources
7.5.6. Bus Arbitration Overview
7.5.7. The Local APIC Block Diagram
7.5.8. Relocation of the APIC Registers Base Address
7.5.9. Interrupt Destination and APIC ID
7.5.9.1. Physical Destination Mode


7.5.9.2. Logical Destination Mode
7.5.9.3. Flat Model
7.5.9.4. Cluster Model
7.5.9.5. Arbitration Priority
7.5.10. Interrupt Distribution Mechanisms
7.5.11. Local Vector Table
7.5.12. Interprocessor and Self-Interrupts
7.5.13. Interrupt Acceptance
7.5.13.1. Interrupt Acceptance Decision Flow Chart
7.5.13.2. Task Priority Register
7.5.13.3. Processor Priority Register (PPR)
7.5.13.4. Arbitration Priority Register (APR)
7.5.13.5. Spurious Interrupt
7.5.13.6. End-Of-Interrupt (EOI)
7.5.14. Local APIC State
7.5.14.1. Spurious-Interrupt Vector Register
7.5.14.2. Local APIC Initialization
7.5.14.3. Local APIC State After Power-Up Reset
7.5.14.4. Local APIC State After an INIT Reset
7.5.14.5. Local APIC State After INIT-Deassert Message
7.5.15. Local APIC Version Register
7.5.16. APIC Bus Arbitration Mechanism and Protocol
7.5.16.1. Bus Message Formats
7.5.16.2. APIC Bus Status Cycles
7.5.17. Error Handling
7.5.18. Timer
7.5.19. Software Visible Differences Between the Local APIC and the 82489DX
7.5.20. Performance Related Differences between the Local APIC and the 82489DX
7.5.21. New Features Incorporated in the Pentium® and P6 Family Processors Local APIC
7.6. DUAL-PROCESSOR (DP) INITIALIZATION PROTOCOL
7.7. MULTIPLE-PROCESSOR (MP) INITIALIZATION PROTOCOL
7.7.1. MP Initialization Protocol Requirements and Restrictions
7.7.2. MP Protocol Nomenclature
7.7.3. Error Detection During the MP Initialization Protocol
7.7.4. Error Handling During the MP Initialization Protocol
7.7.5. MP Initialization Protocol Algorithm

CHAPTER 8
PROCESSOR MANAGEMENT AND INITIALIZATION
8.1. INITIALIZATION OVERVIEW
8.1.1. Processor State After Reset
8.1.2. Processor Built-In Self-Test (BIST)
8.1.3. Model and Stepping Information
8.1.4. First Instruction Executed
8.2. FPU INITIALIZATION
8.2.1. Configuring the FPU Environment
8.2.2. Setting the Processor for FPU Software Emulation
8.3. CACHE ENABLING
8.4. MODEL-SPECIFIC REGISTERS (MSRS)
8.5. MEMORY TYPE RANGE REGISTERS (MTRRS)
8.6. SOFTWARE INITIALIZATION FOR REAL-ADDRESS MODE OPERATION


8.6.1. Real-Address Mode IDT
8.6.2. NMI Interrupt Handling
8.7. SOFTWARE INITIALIZATION FOR PROTECTED-MODE OPERATION
8.7.1. Protected-Mode System Data Structures
8.7.2. Initializing Protected-Mode Exceptions and Interrupts
8.7.3. Initializing Paging
8.7.4. Initializing Multitasking
8.8. MODE SWITCHING
8.8.1. Switching to Protected Mode
8.8.2. Switching Back to Real-Address Mode
8.9. INITIALIZATION AND MODE SWITCHING EXAMPLE
8.9.1. Assembler Usage
8.9.2. STARTUP.ASM Listing
8.9.3. MAIN.ASM Source Code
8.9.4. Supporting Files
8.10. P6 FAMILY MICROCODE UPDATE FEATURE
8.10.1. Microcode Update
8.10.2. Microcode Update Loader
8.10.2.1. Update Loading Procedure
8.10.2.2. Hard Resets in Update Loading
8.10.2.3. Update in a Multiprocessor System
8.10.2.4. Update Loader Enhancements
8.10.3. Update Signature and Verification
8.10.3.1. Determining the Signature
8.10.3.2. Authenticating the Update
8.10.4. P6 Family Processor Microcode Update Specifications
8.10.4.1. Responsibilities of the BIOS
8.10.4.2. Responsibilities of the Calling Program
8.10.4.3. Microcode Update Functions
8.10.4.4. INT 15h-based Interface
8.10.4.5. Return Codes

CHAPTER 9
MEMORY CACHE CONTROL
9.1. INTERNAL CACHES, TLBS, AND BUFFERS
9.2. CACHING TERMINOLOGY
9.3. METHODS OF CACHING AVAILABLE
9.3.1. Buffering of Write Combining Memory Locations
9.3.2. Choosing a Memory Type
9.4. CACHE CONTROL PROTOCOL
9.5. CACHE CONTROL
9.5.1. Precedence of Cache Controls (P6 Family Processor)
9.5.2. Preventing Caching
9.6. CACHE MANAGEMENT INSTRUCTIONS
9.7. SELF-MODIFYING CODE
9.8. IMPLICIT CACHING (P6 FAMILY PROCESSORS)
9.9. EXPLICIT CACHING
9.10. INVALIDATING THE TRANSLATION LOOKASIDE BUFFERS (TLBS)
9.11. WRITE BUFFER
9.12. MEMORY TYPE RANGE REGISTERS (MTRRS)
9.12.1. MTRR Feature Identification
9.12.2. Setting Memory Ranges with MTRRs
9.12.2.1. MTRRdefType Register
9.12.2.2. Fixed Range MTRRs
9.12.2.3. Variable Range MTRRs
9.12.3. Example Base and Mask Calculations
9.12.4. Range Size and Alignment Requirement
9.12.4.1. MTRR Precedences
9.12.5. MTRR Initialization
9.12.6. Remapping Memory Types
9.12.7. MTRR Maintenance Programming Interface
9.12.7.1. MemTypeGet() Function
9.12.7.2. MemTypeSet() Function
9.12.8. Multiple-Processor Considerations
9.12.9. Large Page Size Considerations
9.13. PAGE ATTRIBUTE TABLE (PAT)
9.13.1. Background
9.13.2. Detecting Support for the PAT Feature
9.13.3. Technical Description of the PAT
9.13.4. Accessing the PAT
9.13.5. Programming the PAT

CHAPTER 10
MMX™ TECHNOLOGY SYSTEM PROGRAMMING
10.1. EMULATION OF THE MMX™ INSTRUCTION SET
10.2. THE MMX™ STATE AND MMX™ REGISTER ALIASING
10.2.1. Effect of MMX™ and Floating-Point Instructions on the FPU Tag Word
10.3. SAVING AND RESTORING THE MMX™ STATE AND REGISTERS
10.4. DESIGNING OPERATING SYSTEM TASK AND CONTEXT SWITCHING FACILITIES
10.4.1. Using the TS Flag in Control Register CR0 to Control MMX™/FPU State Saving
10.5. EXCEPTIONS THAT CAN OCCUR WHEN EXECUTING MMX™ INSTRUCTIONS
10.5.1. Effect of MMX™ Instructions on Pending Floating-Point Exceptions
10.6. DEBUGGING

CHAPTER 11
STREAMING SIMD EXTENSIONS SYSTEM PROGRAMMING
11.1. EMULATION OF THE STREAMING SIMD EXTENSIONS
11.2. MMX™ STATE AND STREAMING SIMD EXTENSIONS
11.3. NEW PENTIUM® III PROCESSOR REGISTERS
11.3.1. SIMD Floating-point Registers
11.3.2. SIMD Floating-point Control/Status Registers
11.3.2.1. Rounding Control Field
11.3.2.2. Flush-to-Zero
11.4. ENABLING STREAMING SIMD EXTENSIONS SUPPORT
11.4.1. Enabling Streaming SIMD Extensions Support
11.4.2. Device Not Available (DNA) Exceptions
11.4.3. FXSAVE/FXRSTOR as a Replacement for FSAVE/FRSTOR
11.4.4. Numeric Error flag and IGNNE#
11.5. SAVING AND RESTORING THE STREAMING SIMD EXTENSIONS STATE
11.6. DESIGNING OPERATING SYSTEM TASK AND CONTEXT SWITCHING FACILITIES
11.6.1. Using the TS Flag in Control Register CR0 to Control SIMD Floating-Point State Saving
11.7. EXCEPTIONS THAT CAN OCCUR WHEN EXECUTING STREAMING SIMD EXTENSIONS INSTRUCTIONS
11.7.1. SIMD Floating-point Non-Numeric Exceptions
11.7.2. SIMD Floating-point Numeric Exceptions
11.7.2.1. Exception Priority
11.7.2.2. Automatic Masked Exception Handling
11.7.2.3. Software Exception Handling - Unmasked Exceptions
11.7.2.4. Interaction with x87 numeric exceptions
11.7.3. SIMD Floating-point Numeric Exception Conditions and Masked/Unmasked Responses
11.7.3.1. Invalid Operation Exception (#IA)
11.7.3.2. Division-By-Zero Exception (#Z)
11.7.3.3. Denormal Operand Exception (#D)
11.7.3.4. Numeric Overflow Exception (#O)
11.7.3.5. Numeric Underflow Exception (#U)
11.7.3.6. Inexact Result (Precision) Exception (#P)
11.7.4. Effect of Streaming SIMD Extensions Instructions on Pending Floating-Point Exceptions
11.8. DEBUGGING

CHAPTER 12
SYSTEM MANAGEMENT MODE (SMM)
12.1. SYSTEM MANAGEMENT MODE OVERVIEW
12.2. SYSTEM MANAGEMENT INTERRUPT (SMI)
12.3. SWITCHING BETWEEN SMM AND THE OTHER PROCESSOR OPERATING MODES
12.3.1. Entering SMM
12.3.1.1. Exiting From SMM
12.4. SMRAM
12.4.1. SMRAM State Save Map
12.4.2. SMRAM Caching
12.5. SMI HANDLER EXECUTION ENVIRONMENT
12.6. EXCEPTIONS AND INTERRUPTS WITHIN SMM
12.7. NMI HANDLING WHILE IN SMM
12.8. SAVING THE FPU STATE WHILE IN SMM
12.9. SMM REVISION IDENTIFIER
12.10. AUTO HALT RESTART
12.10.1. Executing the HLT Instruction in SMM
12.11. SMBASE RELOCATION
12.11.1. Relocating SMRAM to an Address Above 1 MByte
12.12. I/O INSTRUCTION RESTART
12.12.1. Back-to-Back SMI Interrupts When I/O Instruction Restart Is Being Used
12.13. SMM MULTIPLE-PROCESSOR CONSIDERATIONS

CHAPTER 13
MACHINE-CHECK ARCHITECTURE
13.1. MACHINE-CHECK EXCEPTIONS AND ARCHITECTURE
13.2. COMPATIBILITY WITH PENTIUM® PROCESSOR
13.3. MACHINE-CHECK MSRS
13.3.1. Machine-Check Global Control MSRs
13.3.1.1. MCG_CAP MSR
13.3.1.2. MCG_STATUS MSR
13.3.1.3. MCG_CTL MSR
13.3.2. Error-Reporting Register Banks
13.3.2.1. MCi_CTL MSR
13.3.2.2. MCi_STATUS MSR
13.3.2.3. MCi_ADDR MSR
13.3.2.4. MCi_MISC MSR
13.3.3. Mapping of the Pentium® Processor Machine-Check Errors to the P6 Family Machine-Check Architecture
13.4. MACHINE-CHECK AVAILABILITY
13.5. MACHINE-CHECK INITIALIZATION
13.6. INTERPRETING THE MCA ERROR CODES
13.6.1. Simple Error Codes
13.6.2. Compound Error Codes
13.6.3. Interpreting the Machine-Check Error Codes for External Bus Errors
13.7. GUIDELINES FOR WRITING MACHINE-CHECK SOFTWARE
13.7.1. Machine-Check Exception Handler
13.7.2. Pentium® Processor Machine-Check Exception Handling
13.7.3. Logging Correctable Machine-Check Errors

CHAPTER 14
CODE OPTIMIZATION
14.1. CODE OPTIMIZATION GUIDELINES
14.1.1. General Code Optimization Guidelines
14.1.2. Guidelines for Optimizing MMX™ Code
14.1.3. Guidelines for Optimizing Floating-Point Code
14.1.4. Guidelines for Optimizing SIMD Floating-point Code
14.2. BRANCH PREDICTION OPTIMIZATION
14.2.1. Branch Prediction Rules
14.2.2. Optimizing Branch Predictions in Code
14.2.3. Eliminating and Reducing the Number of Branches
14.3. REDUCING PARTIAL REGISTER STALLS ON P6 FAMILY PROCESSORS
14.4. ALIGNMENT RULES AND GUIDELINES
14.4.1. Alignment Penalties
14.4.2. Code Alignment
14.4.3. Data Alignment
14.4.3.1. Alignment of Data Structures and Arrays Greater Than 32 Bytes
14.4.3.2. Alignment of Data in Memory and on the Stack
14.5. INSTRUCTION SCHEDULING OVERVIEW
14.5.1. Instruction Pairing Guidelines
14.5.1.1. General Pairing Rules
14.5.1.2. Integer Pairing Rules
14.5.1.3. MMX™ Instruction Pairing Guidelines
14.5.2. Pipelining Guidelines
14.5.2.1. MMX™ Instruction Pipelining Guidelines
14.5.2.2. Floating-Point Pipelining Guidelines
14.5.3. Scheduling Rules for P6 Family Processors
14.6. ACCESSING MEMORY
14.6.1. Using MMX™ Instructions That Access Memory
14.6.2. Partial Memory Accesses With MMX™ Instructions
14.6.3. Write Allocation Effects
14.7. ADDRESSING MODES AND REGISTER USAGE
14.8. INSTRUCTION LENGTH
14.9. PREFIXED OPCODES
14.10. INTEGER INSTRUCTION SELECTION AND OPTIMIZATIONS

CHAPTER 15
DEBUGGING AND PERFORMANCE MONITORING
15.1. OVERVIEW OF THE DEBUGGING SUPPORT FACILITIES
15.2. DEBUG REGISTERS
15.2.1. Debug Address Registers (DR0-DR3)
15.2.2. Debug Registers DR4 and DR5
15.2.3. Debug Status Register (DR6)
15.2.4. Debug Control Register (DR7)
15.2.5. Breakpoint Field Recognition
15.3. DEBUG EXCEPTIONS
15.3.1. Debug Exception (#DB)—Interrupt Vector 1
15.3.1.1. Instruction-Breakpoint Exception Condition
15.3.1.2. Data Memory and I/O Breakpoint Exception Conditions
15.3.1.3. General-Detect Exception Condition
15.3.1.4. Single-Step Exception Condition
15.3.1.5. Task-Switch Exception Condition
15.3.2. Breakpoint Exception (#BP)—Interrupt Vector 3
15.4. LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING
15.4.1. DebugCtlMSR Register
15.4.2. Last Branch and Last Exception MSRs
15.4.3. Monitoring Branches, Exceptions, and Interrupts
15.4.4. Single-Stepping on Branches, Exceptions, and Interrupts
15.4.5. Initializing Last Branch or Last Exception/Interrupt Recording
15.5. TIME-STAMP COUNTER
15.6. PERFORMANCE-MONITORING COUNTERS
15.6.1. P6 Family Processor Performance-Monitoring Counters
15.6.1.1. PerfEvtSel0 and PerfEvtSel1 MSRs
15.6.1.2. PerfCtr0 and PerfCtr1 MSRs
15.6.1.3. Starting and Stopping the Performance-Monitoring Counters
15.6.1.4. Event and Time-Stamp Monitoring Software
15.6.2. Monitoring Counter Overflow
15.6.3. Pentium® Processor Performance-Monitoring Counters
15.6.3.1. Control and Event Select Register (CESR)
15.6.3.2. Use of the Performance-Monitoring Pins
15.6.3.3. Events Counted

CHAPTER 16
8086 EMULATION
16.1. REAL-ADDRESS MODE
16.1.1. Address Translation in Real-Address Mode
16.1.2. Registers Supported in Real-Address Mode
16.1.3. Instructions Supported in Real-Address Mode
16.1.4. Interrupt and Exception Handling
16.2. VIRTUAL-8086 MODE
16.2.1. Enabling Virtual-8086 Mode
16.2.2. Structure of a Virtual-8086 Task
16.2.3. Paging of Virtual-8086 Tasks
16.2.4. Protection within a Virtual-8086 Task
16.2.5. Entering Virtual-8086 Mode
16.2.6. Leaving Virtual-8086 Mode
16.2.7. Sensitive Instructions
16.2.8. Virtual-8086 Mode I/O
16.2.8.1. I/O-Port-Mapped I/O
16.2.8.2. Memory-Mapped I/O
16.2.8.3. Special I/O Buffers
16.3. INTERRUPT AND EXCEPTION HANDLING IN VIRTUAL-8086 MODE
16.3.1. Class 1—Hardware Interrupt and Exception Handling in Virtual-8086 Mode
16.3.1.1. Handling an Interrupt or Exception Through a Protected-Mode Trap or Interrupt Gate
16.3.1.2. Handling an Interrupt or Exception With an 8086 Program Interrupt or Exception Handler
16.3.1.3. Handling an Interrupt or Exception Through a Task Gate
16.3.2. Class 2—Maskable Hardware Interrupt Handling in Virtual-8086 Mode Using the Virtual Interrupt Mechanism
16.3.3. Class 3—Software Interrupt Handling in Virtual-8086 Mode
16.3.3.1. Method 1: Software Interrupt Handling
16.3.3.2. Methods 2 and 3: Software Interrupt Handling
16.3.3.3. Method 4: Software Interrupt Handling
16.3.3.4. Method 5: Software Interrupt Handling
16.3.3.5. Method 6: Software Interrupt Handling
16.4. PROTECTED-MODE VIRTUAL INTERRUPTS

CHAPTER 17
MIXING 16-BIT AND 32-BIT CODE
17.1. DEFINING 16-BIT AND 32-BIT PROGRAM MODULES
17.2. MIXING 16-BIT AND 32-BIT OPERATIONS WITHIN A CODE SEGMENT
17.3. SHARING DATA AMONG MIXED-SIZE CODE SEGMENTS
17.4. TRANSFERRING CONTROL AMONG MIXED-SIZE CODE SEGMENTS
17.4.1. Code-Segment Pointer Size
17.4.2. Stack Management for Control Transfer
17.4.2.1. Controlling the Operand-Size Attribute For a Call
17.4.2.2. Passing Parameters With a Gate
17.4.3. Interrupt Control Transfers
17.4.4. Parameter Translation
17.4.5. Writing Interface Procedures

CHAPTER 18
INTEL ARCHITECTURE COMPATIBILITY
18.1. INTEL ARCHITECTURE FAMILIES AND CATEGORIES
18.2. RESERVED BITS
18.3. ENABLING NEW FUNCTIONS AND MODES
18.4. DETECTING THE PRESENCE OF NEW FEATURES THROUGH SOFTWARE
18.5. MMX™ TECHNOLOGY
18.6. STREAMING SIMD EXTENSIONS
18.7. NEW INSTRUCTIONS IN THE PENTIUM® AND LATER INTEL ARCHITECTURE PROCESSORS
18.7.1. Instructions Added Prior to the Pentium® Processor
18.8. OBSOLETE INSTRUCTIONS
18.9. UNDEFINED OPCODES
18.10. NEW FLAGS IN THE EFLAGS REGISTER
18.10.1. Using EFLAGS Flags to Distinguish Between 32-Bit Intel Architecture Processors
18.11. STACK OPERATIONS
18.11.1. PUSH SP
18.11.2. EFLAGS Pushed on the Stack
18.12. FPU
18.12.1. Control Register CR0 Flags
18.12.2. FPU Status Word
18.12.2.1. Condition Code Flags (C0 through C3)
18.12.2.2. Stack Fault Flag
18.12.3. FPU Control Word
18.12.4. FPU Tag Word
18.12.5. Data Types
18.12.5.1. NaNs
18.12.5.2. Pseudo-zero, Pseudo-NaN, Pseudo-infinity, and Unnormal Formats
18.12.6. Floating-Point Exceptions
18.12.6.1. Denormal Operand Exception (#D)
18.12.6.2. Numeric Overflow Exception (#O)
18.12.6.3. Numeric Underflow Exception (#U)
18.12.6.4. Exception Precedence
18.12.6.5. CS and EIP For FPU Exceptions
18.12.6.6. FPU Error Signals
18.12.6.7. Assertion of the FERR# Pin
18.12.6.8. Invalid Operation Exception On Denormals
18.12.6.9. Alignment Check Exceptions (#AC)
18.12.6.10. Segment Not Present Exception During FLDENV
18.12.6.11. Device Not Available Exception (#NM)
18.12.6.12. Coprocessor Segment Overrun Exception
18.12.6.13. General Protection Exception (#GP)
18.12.6.14. Floating-Point Error Exception (#MF)
18.12.7. Changes to Floating-Point Instructions
18.12.7.1. FDIV, FPREM, and FSQRT Instructions
18.12.7.2. FSCALE Instruction
18.12.7.3. FPREM1 Instruction
18.12.7.4. FPREM Instruction
18.12.7.5. FUCOM, FUCOMP, and FUCOMPP Instructions
18.12.7.6. FPTAN Instruction
18.12.7.7. Stack Overflow
18.12.7.8. FSIN, FCOS, and FSINCOS Instructions
18.12.7.9. FPATAN Instruction
18.12.7.10. F2XM1 Instruction
18.12.7.11. FLD Instruction
18.12.7.12. FXTRACT Instruction
18.12.7.13. Load Constant Instructions
18.12.7.14. FSETPM Instruction
18.12.7.15. FXAM Instruction
18.12.7.16. FSAVE and FSTENV Instructions
18.12.8. Transcendental Instructions
18.12.9. Obsolete Instructions
18.12.10. WAIT/FWAIT Prefix Differences
18.12.11. Operands Split Across Segments and/or Pages


18.12.12. FPU Instruction Synchronization
18.13. SERIALIZING INSTRUCTIONS
18.14. FPU AND MATH COPROCESSOR INITIALIZATION
18.14.1. Intel 387 and Intel 287 Math Coprocessor Initialization
18.14.2. Intel486™ SX Processor and Intel 487 SX Math Coprocessor Initialization
18.15. CONTROL REGISTERS
18.16. MEMORY MANAGEMENT FACILITIES
18.16.1. New Memory Management Control Flags
18.16.1.1. Physical Memory Addressing Extension
18.16.1.2. Global Pages
18.16.1.3. Larger Page Sizes
18.16.2. CD and NW Cache Control Flags
18.16.3. Descriptor Types and Contents
18.16.4. Changes in Segment Descriptor Loads
18.17. DEBUG FACILITIES
18.17.1. Differences in Debug Register DR6
18.17.2. Differences in Debug Register DR7
18.17.3. Debug Registers DR4 and DR5
18.17.4. Recognition of Breakpoints
18.18. TEST REGISTERS
18.19. EXCEPTIONS AND/OR EXCEPTION CONDITIONS
18.19.1. Machine-Check Architecture
18.19.2. Priority of Exceptions
18.20. INTERRUPTS
18.20.1. Interrupt Propagation Delay
18.20.2. NMI Interrupts
18.20.3. IDT Limit
18.21. TASK SWITCHING AND TSS
18.21.1. P6 Family and Pentium® Processor TSS
18.21.2. TSS Selector Writes
18.21.3. Order of Reads/Writes to the TSS
18.21.4. Using a 16-Bit TSS with 32-Bit Constructs
18.21.5. Differences in I/O Map Base Addresses
18.22. CACHE MANAGEMENT
18.22.1. Self-Modifying Code with Cache Enabled
18.23. PAGING
18.23.1. Large Pages
18.23.2. PCD and PWT Flags
18.23.3. Enabling and Disabling Paging
18.24. STACK OPERATIONS
18.24.1. Selector Pushes and Pops
18.24.2. Error Code Pushes
18.24.3. Fault Handling Effects on the Stack
18.24.4. Interlevel RET/IRET From a 16-Bit Interrupt or Call Gate
18.25. MIXING 16- AND 32-BIT SEGMENTS
18.26. SEGMENT AND ADDRESS WRAPAROUND
18.26.1. Segment Wraparound
18.27. WRITE BUFFERS AND MEMORY ORDERING
18.28. BUS LOCKING
18.29. BUS HOLD
18.30. TWO WAYS TO RUN INTEL 286 PROCESSOR TASKS
18.31. MODEL-SPECIFIC EXTENSIONS TO THE INTEL ARCHITECTURE


18.31.1. Model-Specific Registers
18.31.2. RDMSR and WRMSR Instructions
18.31.3. Memory Type Range Registers
18.31.4. Machine-Check Exception and Architecture
18.31.5. Performance-Monitoring Counters

APPENDIX A
PERFORMANCE-MONITORING EVENTS
A.1. P6 FAMILY PROCESSOR PERFORMANCE-MONITORING EVENTS
A.2. PENTIUM® PROCESSOR PERFORMANCE-MONITORING EVENTS

APPENDIX B
MODEL-SPECIFIC REGISTERS

APPENDIX C
DUAL-PROCESSOR (DP) BOOTUP SEQUENCE EXAMPLE (SPECIFIC TO PENTIUM® PROCESSORS)
C.1. PRIMARY PROCESSOR’S SEQUENCE OF EVENTS
C.2. SECONDARY PROCESSOR’S SEQUENCE OF EVENTS FOLLOWING RECEIPT OF START-UP IPI

APPENDIX D
MULTIPLE-PROCESSOR (MP) BOOTUP SEQUENCE EXAMPLE (SPECIFIC TO P6 FAMILY PROCESSORS)
D.1. BSP’S SEQUENCE OF EVENTS
D.2. AP’S SEQUENCE OF EVENTS FOLLOWING RECEIPT OF START-UP IPI

APPENDIX E
PROGRAMMING THE LINT0 AND LINT1 INPUTS
E.1. CONSTANTS
E.2. LINT[0:1] PINS PROGRAMMING PROCEDURE


TABLE OF FIGURES

Figure 1-1. Bit and Byte Order
Figure 2-1. System-Level Registers and Data Structures
Figure 2-2. Transitions Among the Processor’s Operating Modes
Figure 2-3. System Flags in the EFLAGS Register
Figure 2-4. Memory Management Registers
Figure 2-5. Control Registers
Figure 3-1. Segmentation and Paging
Figure 3-2. Flat Model
Figure 3-3. Protected Flat Model
Figure 3-4. Multisegment Model
Figure 3-5. Logical Address to Linear Address Translation
Figure 3-6. Segment Selector
Figure 3-7. Segment Registers
Figure 3-8. Segment Descriptor
Figure 3-9. Segment Descriptor When Segment-Present Flag Is Clear
Figure 3-10. Global and Local Descriptor Tables
Figure 3-11. Pseudo-Descriptor Format
Figure 3-12. Linear Address Translation (4-KByte Pages)
Figure 3-13. Linear Address Translation (4-MByte Pages)
Figure 3-14. Format of Page-Directory and Page-Table Entries for 4-KByte Pages and 32-Bit Physical Addresses
Figure 3-15. Format of Page-Directory Entries for 4-MByte Pages and 32-Bit Addresses
Figure 3-16. Format of a Page-Table or Page-Directory Entry for a Not-Present Page
Figure 3-17. Register CR3 Format When the Physical Address Extension is Enabled
Figure 3-18. Linear Address Translation With Extended Physical Addressing Enabled (4-KByte Pages)
Figure 3-19. Linear Address Translation With Extended Physical Addressing Enabled (2-MByte or 4-MByte Pages)
Figure 3-20. Format of Page-Directory-Pointer-Table, Page-Directory, and Page-Table Entries for 4-KByte Pages and 36-Bit Extended Physical Addresses
Figure 3-21. Format of Page-Directory-Pointer-Table and Page-Directory Entries for 2- or 4-MByte Pages and 36-Bit Extended Physical Addresses
Figure 3-22. PDE Format Differences between 36-bit and 32-bit addressing
Figure 3-23. Memory Management Convention That Assigns a Page Table to Each Segment
Figure 4-1. Descriptor Fields Used for Protection
Figure 4-2. Protection Rings
Figure 4-3. Privilege Check for Data Access
Figure 4-4. Examples of Accessing Data Segments From Various Privilege Levels
Figure 4-5. Privilege Check for Control Transfer Without Using a Gate
Figure 4-6. Examples of Accessing Conforming and Nonconforming Code Segments From Various Privilege Levels
Figure 4-7. Call-Gate Descriptor
Figure 4-8. Call-Gate Mechanism
Figure 4-9. Privilege Check for Control Transfer with Call Gate
Figure 4-10. Example of Accessing Call Gates At Various Privilege Levels
Figure 4-11. Stack Switching During an Interprivilege-Level Call
Figure 4-12. Use of RPL to Weaken Privilege Level of Called Procedure
Figure 5-1. Relationship of the IDTR and IDT


Figure 5-2. IDT Gate Descriptors
Figure 5-3. Interrupt Procedure Call
Figure 5-4. Stack Usage on Transfers to Interrupt and Exception-Handling Routines
Figure 5-5. Interrupt Task Switch
Figure 5-6. Error Code
Figure 5-7. Page-Fault Error Code
Figure 6-1. Structure of a Task
Figure 6-2. 32-Bit Task-State Segment (TSS)
Figure 6-3. TSS Descriptor
Figure 6-4. Task Register
Figure 6-5. Task-Gate Descriptor
Figure 6-6. Task Gates Referencing the Same Task
Figure 6-7. Nested Tasks
Figure 6-8. Overlapping Linear-to-Physical Mappings
Figure 6-9. 16-Bit TSS Format
Figure 7-1. Example of Write Ordering in Multiple-Processor Systems
Figure 7-2. I/O APIC and Local APICs in Multiple-Processor Systems
Figure 7-3. Local APIC Structure
Figure 7-4. APIC_BASE_MSR
Figure 7-5. Local APIC ID Register
Figure 7-6. Logical Destination Register (LDR)
Figure 7-7. Destination Format Register (DFR)
Figure 7-8. Local Vector Table (LVT)
Figure 7-9. Interrupt Command Register (ICR)
Figure 7-10. IRR, ISR and TMR Registers
Figure 7-11. Interrupt Acceptance Flow Chart for the Local APIC
Figure 7-12. Task Priority Register (TPR)
Figure 7-13. EOI Register
Figure 7-14. Spurious-Interrupt Vector Register (SVR)
Figure 7-15. Local APIC Version Register
Figure 7-16. Error Status Register (ESR)
Figure 7-17. Divide Configuration Register
Figure 7-18. Initial Count and Current Count Registers
Figure 7-19. SMP System
Figure 8-1. Contents of CR0 Register after Reset
Figure 8-2. Processor Type and Signature in the EDX Register after Reset
Figure 8-3. Processor State After Reset
Figure 8-4. Constructing Temporary GDT and Switching to Protected Mode (Lines 162-172 of List File)
Figure 8-5. Moving the GDT, IDT and TSS from ROM to RAM (Lines 196-261 of List File)
Figure 8-6. Task Switching (Lines 282-296 of List File)
Figure 8-7. Integrating Processor Specific Updates
Figure 8-8. Format of the Microcode Update Data Block
Figure 8-9. Write Operation Flow Chart
Figure 9-1. Intel Architecture Caches
Figure 9-2. Cache-Control Mechanisms Available in the Intel Architecture Processors
Figure 9-3. Mapping Physical Memory With MTRRs
Figure 9-4. MTRRcap Register
Figure 9-5. MTRRdefType Register
Figure 9-6. MTRRphysBasen and MTRRphysMaskn Variable-Range Register Pair
Figure 9-7. Page Attribute Table Model Specific Register


Figure 9-8. Page Attribute Table Index Scheme for Paging Hierarchy
Figure 10-1. Mapping of MMX™ Registers to Floating-Point Registers
Figure 10-2. Example of MMX™/FPU State Saving During an Operating System-Controlled Task Switch
Figure 10-3. Mapping of MMX™ Registers to Floating-Point (FP) Registers
Figure 11-1. Streaming SIMD Extensions Control/Status Register Format
Figure 11-2. Example of SIMD Floating-Point State Saving During an Operating System-Controlled Task Switch
Figure 12-1. SMRAM Usage
Figure 12-2. SMM Revision Identifier
Figure 12-3. Auto HALT Restart Field
Figure 12-4. SMBASE Relocation Field
Figure 12-5. I/O Instruction Restart Field
Figure 13-1. Machine-Check MSRs
Figure 13-2. MCG_CAP Register
Figure 13-3. MCG_STATUS Register
Figure 13-4. MCi_CTL Register
Figure 13-5. MCi_STATUS Register
Figure 13-6. Machine-Check Bank Address Register
Figure 14-1. Stack and Memory Layout of Static Variables
Figure 14-2. Pipeline Example of AGI Stall
Figure 15-1. Debug Registers
Figure 15-2. DebugCtlMSR Register
Figure 15-3. PerfEvtSel0 and PerfEvtSel1 MSRs
Figure 15-4. CESR MSR (Pentium® Processor Only)
Figure 16-1. Real-Address Mode Address Translation
Figure 16-2. Interrupt Vector Table in Real-Address Mode
Figure 16-3. Entering and Leaving Virtual-8086 Mode
Figure 16-4. Privilege Level 0 Stack After Interrupt or Exception in Virtual-8086 Mode
Figure 16-5. Software Interrupt Redirection Bit Map in TSS
Figure 17-1. Stack after Far 16- and 32-Bit Calls
Figure 18-1. I/O Map Base Address Differences


TABLE OF TABLES

Table 2-1. Action Taken for Combinations of EM, MP, TS, CR4.OSFXSR, and CPUID.XMM
Table 2-2. Summary of System Instructions
Table 3-1. Code- and Data-Segment Types
Table 3-2. System-Segment and Gate-Descriptor Types
Table 3-3. Page Sizes and Physical Address Sizes
Table 3-4. Paging Modes and Physical Address Size
Table 4-1. Privilege Check Rules for Call Gates
Table 4-2. Combined Page-Directory and Page-Table Protection
Table 5-1. Protected-Mode Exceptions and Interrupts
Table 5-2. SIMD Floating-Point Exceptions Priority
Table 5-3. Priority Among Simultaneous Exceptions and Interrupts
Table 5-4. Interrupt and Exception Classes
Table 5-5. Conditions for Generating a Double Fault
Table 5-6. Invalid TSS Conditions
Table 5-7. Alignment Requirements by Data Type
Table 6-1. Exception Conditions Checked During a Task Switch
Table 6-2. Effect of a Task Switch on Busy Flag, NT Flag, Previous Task Link Field, and TS Flag
Table 7-1. Local APIC Register Address Map
Table 7-2. Valid Combinations for the APIC Interrupt Command Register
Table 7-3. EOI Message (14 Cycles)
Table 7-4. Short Message (21 Cycles)
Table 7-5. Nonfocused Lowest Priority Message (34 Cycles)
Table 7-6. APIC Bus Status Cycles Interpretation
Table 7-7. Types of Boot Phase IPIs
Table 7-8. Boot Phase IPI Message Format
Table 8-1. 32-Bit Intel Architecture Processor States Following Power-up, Reset, or INIT
Table 8-2. Recommended Settings of EM and MP Flags on Intel Architecture Processors
Table 8-3. Software Emulation Settings of EM, MP, and NE Flags
Table 8-4. Main Initialization Steps in STARTUP.ASM Source Listing
Table 8-5. Relationship Between BLD Item and ASM Source File
Table 8-6. P6 Family Processor MSR Register Components
Table 8-7. Microcode Update Encoding Format
Table 8-8. Microcode Update Functions
Table 8-9. Parameters for the Presence Test
Table 8-10. Parameters for the Write Update Data Function
Table 8-11. Parameters for the Control Update Sub-function
Table 8-12. Mnemonic Values
Table 8-13. Parameters for the Read Microcode Update Data Function
Table 8-14. Return Code Definitions
Table 9-1. Characteristics of the Caches, TLBs, and Write Buffer in Intel Architecture Processors
Table 9-2. Methods of Caching Available in P6 Family, Pentium®, and Intel486™ Processors
Table 9-3. MESI Cache Line States
Table 9-4. Cache Operating Modes


Table 9-5. Effective Memory Type Depending on MTRR, PCD, and PWT Settings
Table 9-6. MTRR Memory Types and Their Properties
Table 9-7. Address Mapping for Fixed-Range MTRRs
Table 9-8. PAT Indexing and Values After Reset
Table 9-9. Effective Memory Type Depending on MTRRs and PAT
Table 9-10. PAT Memory Types and Their Properties
Table 10-1. Effects of MMX™ Instructions on FPU State
Table 10-2. Effect of the MMX™ and Floating-Point Instructions on the FPU Tag Word
Table 11-1. SIMD Floating-point Register Set
Table 11-2. Rounding Control Field (RC)
Table 11-3. Rounding of Positive Numbers Greater than the Maximum Positive Finite Value
Table 11-4. Rounding of Negative Numbers Smaller than the Maximum Negative Finite Value
Table 11-5. CPUID Bits for Streaming SIMD Extensions Support
Table 11-6. CR4 Bits for Streaming SIMD Extensions Support
Table 11-7. Streaming SIMD Extensions Faults
Table 11-8. Invalid Arithmetic Operations and the Masked Responses to Them
Table 11-9. Masked Responses to Numeric Overflow
Table 12-1. SMRAM State Save Map
Table 12-2. Processor Register Initialization in SMM
Table 12-3. Auto HALT Restart Flag Values
Table 12-4. I/O Instruction Restart Field Values
Table 13-1. Simple Error Codes
Table 13-2. General Forms of Compound Error Codes
Table 13-3. Encoding for TT (Transaction Type) Sub-Field
Table 13-4. Level Encoding for LL (Memory Hierarchy Level) Sub-Field
Table 13-5. Encoding of Request (RRRR) Sub-Field
Table 13-6. Encodings of PP, T, and II Sub-Fields
Table 13-7. Encoding of the MCi_STATUS Register for External Bus Errors
Table 14-1. Small and Large General-Purpose Register Pairs
Table 14-2. Pairable Integer Instructions
Table 15-1. Breakpointing Examples
Table 15-2. Debug Exception Conditions
Table 16-1. Real-Address Mode Exceptions and Interrupts
Table 16-2. Software Interrupt Handling Methods While in Virtual-8086 Mode
Table 17-1. Characteristics of 16-Bit and 32-Bit Program Modules
Table 18-1. New Instructions in the Pentium® and Later Intel Architecture Processors
Table 18-1. Recommended Values of the FP Related Bits for Intel486™ SX Microprocessor/Intel 487 SX Math Coprocessor System
Table 18-2. EM and MP Flag Interpretation
Table A-1. Events That Can Be Counted with the P6 Family Performance-Monitoring Counters
Table A-2. Events That Can Be Counted with the Pentium® Processor Performance-Monitoring Counters
Table B-1. Model-Specific Registers (MSRs)


CHAPTER 1
ABOUT THIS MANUAL

The Intel Architecture Software Developer’s Manual, Volume 3: System Programming Guide (Order Number 243192) is part of a three-volume set that describes the architecture and programming environment of all Intel Architecture processors. The other two volumes in this set are:

• The Intel Architecture Software Developer’s Manual, Volume 1: Basic Architecture (Order Number 243190).

• The Intel Architecture Software Developer’s Manual, Volume 2: Instruction Set Reference (Order Number 243191).

The Intel Architecture Software Developer’s Manual, Volume 1, describes the basic architecture and programming environment of an Intel Architecture processor; the Intel Architecture Software Developer’s Manual, Volume 2, describes the instruction set of the processor and the opcode structure. These two volumes are aimed at application programmers who are writing programs to run under existing operating systems or executives. The Intel Architecture Software Developer’s Manual, Volume 3, describes the operating-system support environment of an Intel Architecture processor, including memory management, protection, task management, interrupt and exception handling, and system management mode. It also provides Intel Architecture processor compatibility information. This volume is aimed at operating-system and BIOS designers and programmers.

1.1. P6 FAMILY PROCESSOR TERMINOLOGY

This manual includes information pertaining primarily to the 32-bit Intel Architecture processors, which include the Intel386™, Intel486™, and Pentium® processors, and the P6 family processors. The P6 family processors are those Intel Architecture processors based on the P6 family microarchitecture. This family includes the Pentium® Pro, Pentium® II, and Pentium® III processors, and any future processors based on the P6 family microarchitecture.

1.2. OVERVIEW OF THE INTEL ARCHITECTURE SOFTWARE DEVELOPER’S MANUAL, VOLUME 3: SYSTEM PROGRAMMING GUIDE

The contents of this manual are as follows:

Chapter 1 — About This Manual. Gives an overview of all three volumes of the Intel Architecture Software Developer’s Manual. It also describes the notational conventions in these manuals and lists related Intel manuals and documentation of interest to programmers and hardware designers.


Chapter 2 — System Architecture Overview. Describes the modes of operation of an Intel Architecture processor and the mechanisms provided in the Intel Architecture to support operating systems and executives, including the system-oriented registers and data structures and the system-oriented instructions. The steps necessary for switching between real-address and protected modes are also identified.

Chapter 3 — Protected-Mode Memory Management. Describes the data structures, registers, and instructions that support segmentation and paging and explains how they can be used to implement a “flat” (unsegmented) memory model or a segmented memory model.

Chapter 4 — Protection. Describes the support for page and segment protection provided in the Intel Architecture. This chapter also explains the implementation of privilege rules, stack switching, pointer validation, and user and supervisor modes.

Chapter 5 — Interrupt and Exception Handling. Describes the basic interrupt mechanisms defined in the Intel Architecture, shows how interrupts and exceptions relate to protection, and describes how the architecture handles each exception type. Reference information for each Intel Architecture exception is given at the end of this chapter.

Chapter 6 — Task Management. Describes the mechanisms the Intel Architecture provides to support multitasking and inter-task protection.

Chapter 7 — Multiple-Processor Management. Describes the instructions and flags that support multiple processors with shared memory, memory ordering, and the advanced programmable interrupt controller (APIC).

Chapter 8 — Processor Management and Initialization. Defines the state of an Intel Architecture processor and its floating-point and SIMD floating-point units after reset initialization. This chapter also explains how to set up an Intel Architecture processor for real-address mode operation and protected-mode operation, and how to switch between modes.

Chapter 9 — Memory Cache Control. Describes the general concept of caching and the caching mechanisms supported by the Intel Architecture. This chapter also describes the memory type range registers (MTRRs) and how they can be used to map memory types of physical memory. MTRRs were introduced into the Intel Architecture with the Pentium® Pro processor. It also presents information on using the new cache control and memory streaming instructions introduced with the Pentium® III processor.

Chapter 10 — MMX™ Technology System Programming. Describes those aspects of the Intel MMX™ technology that must be handled and considered at the system programming level, including task switching, exception handling, and compatibility with existing system environments. The MMX™ technology was introduced into the Intel Architecture with the Pentium® processor.

Chapter 11 — Streaming SIMD Extensions System Programming. Describes those aspects of Streaming SIMD Extensions that must be handled and considered at the system programming level, including task switching, exception handling, and compatibility with existing system environments. Streaming SIMD Extensions were introduced into the Intel Architecture with the Pentium® III processor.

Chapter 12 — System Management Mode (SMM). Describes the Intel Architecture’s system management mode (SMM), which can be used to implement power management functions.


Chapter 13 — Machine-Check Architecture. Describes the machine-check architecture, which was introduced into the Intel Architecture with the Pentium® processor.

Chapter 14 — Code Optimization. Discusses general optimization techniques for programming an Intel Architecture processor.

Chapter 15 — Debugging and Performance Monitoring. Describes the debugging registers and other debug mechanisms provided in the Intel Architecture. This chapter also describes the time-stamp counter and the performance-monitoring counters.

Chapter 16 — 8086 Emulation. Describes the real-address and virtual-8086 modes of the Intel Architecture.

Chapter 17 — Mixing 16-Bit and 32-Bit Code. Describes how to mix 16-bit and 32-bit code modules within the same program or task.

Chapter 18 — Intel Architecture Compatibility. Describes the programming differences between the Intel 286, Intel386™, Intel486™, Pentium®, and P6 family processors. The differences among the 32-bit Intel Architecture processors (the Intel386™, Intel486™, Pentium®, and P6 family processors) are described throughout the three volumes of the Intel Architecture Software Developer’s Manual, as relevant to particular features of the architecture. This chapter provides a collection of all the relevant compatibility information for all Intel Architecture processors and also describes the basic differences with respect to the 16-bit Intel Architecture processors (the Intel 8086 and Intel 286 processors).

Appendix A — Performance-Monitoring Events. Lists the events that can be counted with the performance-monitoring counters and the codes used to select these events. Both Pentium® processor and P6 family processor events are described.

Appendix B — Model-Specific Registers (MSRs). Lists the MSRs available in the Pentium® and P6 family processors and their functions.

Appendix C — Dual-Processor (DP) Bootup Sequence Example (Specific to Pentium® Processors). Gives an example of how to use the DP protocol to boot two Pentium® processors (a primary processor and a secondary processor) in a DP system and initialize their APICs.

Appendix D — Multiple-Processor (MP) Bootup Sequence Example (Specific to P6 Family Processors). Gives an example of how to use the MP protocol to boot two P6 family processors in an MP system and initialize their APICs.

Appendix E — Programming the LINT0 and LINT1 Inputs. Gives an example of how to program the LINT0 and LINT1 pins for specific interrupt vectors.

1.3. OVERVIEW OF THE INTEL ARCHITECTURE SOFTWARE DEVELOPER’S MANUAL, VOLUME 1: BASIC ARCHITECTURE

The contents of the Intel Architecture Software Developer’s Manual, Volume 1 are as follows:

Chapter 1 — About This Manual. Gives an overview of all three volumes of the Intel Architecture Software Developer’s Manual. It also describes the notational conventions in these manuals and lists related Intel manuals and documentation of interest to programmers and hardware designers.

Chapter 2 — Introduction to the Intel Architecture. Introduces the Intel Architecture and the families of Intel processors that are based on this architecture. It also gives an overview of the common features found in these processors and a brief history of the Intel Architecture.

Chapter 3 — Basic Execution Environment. Introduces the models of memory organization and describes the register set used by applications.

Chapter 4 — Procedure Calls, Interrupts, and Exceptions. Describes the procedure stack and the mechanisms provided for making procedure calls and for servicing interrupts and exceptions.

Chapter 5 — Data Types and Addressing Modes. Describes the data types and addressing modes recognized by the processor.

Chapter 6 — Instruction Set Summary. Gives an overview of all the Intel Architecture instructions except those executed by the processor’s floating-point unit. The instructions are presented in functionally related groups.

Chapter 7 — Floating-Point Unit. Describes the Intel Architecture floating-point unit, including the floating-point registers and data types; gives an overview of the floating-point instruction set; and describes the processor’s floating-point exception conditions.

Chapter 8 — Programming with the Intel MMX™ Technology. Describes the Intel MMX™ technology, including MMX™ registers and data types, and gives an overview of the MMX™ instruction set.

Chapter 9 — Programming with the Streaming SIMD Extensions. Describes the Intel Streaming SIMD Extensions, including the registers and data types.

Chapter 10 — Input/Output. Describes the processor’s I/O architecture, including I/O port addressing, the I/O instructions, and the I/O protection mechanism.

Chapter 11 — Processor Identification and Feature Determination. Describes how to determine the CPU type and the features that are available in the processor.

Appendix A — EFLAGS Cross-Reference. Summarizes how the Intel Architecture instructions affect the flags in the EFLAGS register.

Appendix B — EFLAGS Condition Codes. Summarizes how the conditional jump, move, and byte set on condition code instructions use the condition code flags (OF, CF, ZF, SF, and PF) in the EFLAGS register.

Appendix C — Floating-Point Exceptions Summary. Summarizes the exceptions that can be raised by floating-point instructions.

Appendix D — SIMD Floating-Point Exceptions Summary. Provides the Streaming SIMD Extensions mnemonics, and the exceptions that each instruction can cause.

Appendix E — Guidelines for Writing FPU Exception Handlers. Describes how to design and write MS-DOS* compatible exception handling facilities for FPU and SIMD floating-point exceptions, including both software and hardware requirements and assembly-language code examples. This appendix also describes general techniques for writing robust FPU exception handlers.

Appendix F — Guidelines for Writing SIMD-FP Exception Handlers. Provides guidelines for the Streaming SIMD Extensions instructions that can generate numeric (floating-point) exceptions, and gives an overview of the necessary support for handling such exceptions.

1.4. OVERVIEW OF THE INTEL ARCHITECTURE SOFTWARE DEVELOPER’S MANUAL, VOLUME 2: INSTRUCTION SET REFERENCE

The contents of the Intel Architecture Software Developer’s Manual, Volume 2, are as follows:

Chapter 1 — About This Manual. Gives an overview of all three volumes of the Intel Architecture Software Developer’s Manual. It also describes the notational conventions in these manuals and lists related Intel manuals and documentation of interest to programmers and hardware designers.

Chapter 2 — Instruction Format. Describes the machine-level instruction format used for all Intel Architecture instructions and gives the allowable encodings of prefixes, the operand-identifier byte (ModR/M byte), the addressing-mode specifier byte (SIB byte), and the displacement and immediate bytes.

Chapter 3 — Instruction Set Reference. Describes each of the Intel Architecture instructions in detail, including an algorithmic description of operations, the effect on flags, the effect of operand- and address-size attributes, and the exceptions that may be generated. The instructions are arranged in alphabetical order. The FPU, MMX™ Technology instructions, and Streaming SIMD Extensions are included in this chapter.

Appendix A — Opcode Map. Gives an opcode map for the Intel Architecture instruction set.

Appendix B — Instruction Formats and Encodings. Gives the binary encoding of each form of each Intel Architecture instruction.

Appendix C — Compiler Intrinsics and Functional Equivalents. Gives the Intel C/C++ compiler intrinsics and functional equivalents for the MMX™ Technology instructions and Streaming SIMD Extensions.

1.5. NOTATIONAL CONVENTIONS

This manual uses special notation for data-structure formats, for symbolic representation of instructions, and for hexadecimal numbers. A review of this notation makes the manual easier to read.


1.5.1. Bit and Byte Order

In illustrations of data structures in memory, smaller addresses appear toward the bottom of the figure; addresses increase toward the top. Bit positions are numbered from right to left. The numerical value of a set bit is equal to two raised to the power of the bit position. Intel Architecture processors are “little endian” machines; this means the bytes of a word are numbered starting from the least significant byte. Figure 1-1 illustrates these conventions.
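The two conventions above can be checked with a short Python sketch. The function names `bit_value` and `memory_layout` are illustrative only and do not appear in the manual; the sketch assumes a 32-bit value and uses the standard `struct` module to produce the little-endian byte order.

```python
import struct


def bit_value(position: int) -> int:
    """Numerical value of a set bit: two raised to the power of its position."""
    return 1 << position


def memory_layout(doubleword: int) -> bytes:
    """Bytes of a 32-bit value in increasing address order (little endian).

    Byte 0, the least significant byte, is stored at the lowest address.
    """
    return struct.pack("<I", doubleword)
```

For the value 12345678H, `memory_layout` returns the bytes 78H, 56H, 34H, 12H in that order, so the least significant byte occupies the lowest address.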

1.5.2. Reserved Bits and Software Compatibility

In many register and memory layout descriptions, certain bits are marked as reserved. When bits are marked as reserved, it is essential for compatibility with future processors that software treat these bits as having a future, though unknown, effect. The behavior of reserved bits should be regarded as not only undefined, but unpredictable. Software should follow these guidelines in dealing with reserved bits:

• Do not depend on the states of any reserved bits when testing the values of registers which contain such bits. Mask out the reserved bits before testing.

• Do not depend on the states of any reserved bits when storing to memory or to a register.

• Do not depend on the ability to retain information written into any reserved bits.

• When loading a register, always load the reserved bits with the values indicated in the documentation, if any, or reload them with values previously read from the same register.

NOTE

Avoid any software dependence upon the state of reserved bits in Intel Architecture registers. Depending upon the values of reserved register bits will make software dependent upon the unspecified manner in which the processor handles these bits. Programs that depend upon reserved values risk incompatibility with future processors.
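The following Python sketch illustrates two of the guidelines above: masking reserved bits before testing, and reloading reserved bits with the values previously read. The register layout is hypothetical (only bits 0 through 3 are treated as defined), and `DEFINED_MASK`, `flag_is_set`, and `value_to_load` are names invented for this illustration.

```python
# Hypothetical 32-bit register: bits 0-3 are defined, all other bits are reserved.
DEFINED_MASK = 0x0000000F


def flag_is_set(register_value: int, flag_bit: int) -> bool:
    """Test a flag only after masking out the reserved bits."""
    return bool(register_value & DEFINED_MASK & (1 << flag_bit))


def value_to_load(previously_read: int, new_defined_bits: int) -> int:
    """Combine new defined bits with the reserved bits exactly as they were read."""
    reserved_bits = previously_read & ~DEFINED_MASK & 0xFFFFFFFF
    return reserved_bits | (new_defined_bits & DEFINED_MASK)
```

With this approach the software never interprets a reserved bit and never changes one, so it does not depend on how a given processor happens to handle those bits.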

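Figure 1-1. Bit and Byte Order

[Figure: a data structure drawn as rows of four bytes. Bit offsets 31 through 0 run from left to right across each row (Byte 3, Byte 2, Byte 1, Byte 0). Byte offsets 0 through 28 label the rows, with the lowest address at the bottom and the highest address at the top.]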


1.5.3. Instruction Operands

When instructions are represented symbolically, a subset of the Intel Architecture assembly language is used. In this subset, an instruction has the following format:

label: mnemonic argument1, argument2, argument3

where:

• A label is an identifier which is followed by a colon.

• A mnemonic is a reserved name for a class of instruction opcodes which have the same function.

• The operands argument1, argument2, and argument3 are optional. There may be from zero to three operands, depending on the opcode. When present, they take the form of either literals or identifiers for data items. Operand identifiers are either reserved names of registers or are assumed to be assigned to data items declared in another part of the program (which may not be shown in the example).

When two operands are present in an arithmetic or logical instruction, the right operand is the source and the left operand is the destination.

For example:

LOADREG: MOV EAX, SUBTOTAL

In this example, LOADREG is a label, MOV is the mnemonic identifier of an opcode, EAX is the destination operand, and SUBTOTAL is the source operand. Some assembly languages put the source and destination in reverse order.

1.5.4. Hexadecimal and Binary Numbers

Base 16 (hexadecimal) numbers are represented by a string of hexadecimal digits followed by the character H (for example, F82EH). A hexadecimal digit is a character from the following set: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F.

Base 2 (binary) numbers are represented by a string of 1s and 0s, sometimes followed by the character B (for example, 1010B). The “B” designation is only used in situations where confusion as to the type of number might arise.
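As a small illustration of this notation, the Python sketch below converts a number written in the manual's style into an integer. The function name `parse_manual_number` is hypothetical, and the handling of a string with no suffix is an assumption: the sketch treats it as decimal, although the manual may also write binary values without the B designation.

```python
def parse_manual_number(text: str) -> int:
    """Convert a number in the manual's notation to an integer.

    A trailing H marks hexadecimal and a trailing B marks binary.
    Assumption: a string with neither suffix is read as decimal.
    """
    text = text.strip().upper()
    if text.endswith("H"):
        return int(text[:-1], 16)
    if text.endswith("B"):
        return int(text[:-1], 2)
    return int(text, 10)
```

The H suffix is checked first because B is itself a hexadecimal digit; a value such as 0BH must be read as hexadecimal eleven.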

1.5.5. Segmented Addressing

The processor uses byte addressing. This means memory is organized and accessed as a sequence of bytes. Whether one or more bytes are being accessed, a byte address is used to locate the byte or bytes of memory. The range of memory that can be addressed is called an address space.

The processor also supports segmented addressing. This is a form of addressing where a program may have many independent address spaces, called segments. For example, a program can keep its code (instructions) and stack in separate segments. Code addresses would always refer to the code space, and stack addresses would always refer to the stack space. The following notation is used to specify a byte address within a segment:

Segment-register:Byte-address

For example, the following segment address identifies the byte at address FF79H in the segment pointed to by the DS register:

DS:FF79H

The following segment address identifies an instruction address in the code segment. The CS register points to the code segment and the EIP register contains the address of the instruction.

CS:EIP

1.5.6. Exceptions

An exception is an event that typically occurs when an instruction causes an error. For example, an attempt to divide by zero generates an exception. However, some exceptions, such as breakpoints, occur under other conditions. Some types of exceptions may provide error codes. An error code reports additional information about the error. An example of the notation used to show an exception and error code is shown below.

#PF(fault code)

This example refers to a page-fault exception under conditions where an error code naming a type of fault is reported. Under some conditions, exceptions which produce error codes may not be able to report an accurate code. In this case, the error code is zero, as shown below for a general-protection exception.

#GP(0)

Refer to Chapter 5, Interrupt and Exception Handling, for a list of exception mnemonics and their descriptions.


1.6. RELATED LITERATURE

The following books contain additional material related to Intel processors:

• Intel Pentium® II Processor Specification Update, Order Number 243337-010.

• Intel Pentium® Pro Processor Specification Update, Order Number 242689-031.

• Intel Pentium® Processor Specification Update, Order Number 242480.

• AP-485, Intel Processor Identification and the CPUID Instruction, Order Number 241618-006.

• AP-578, Software and Hardware Considerations for FPU Exception Handlers for Intel Architecture Processors, Order Number 243291.

• Pentium® Pro Processor Data Book, Order Number 242690.

• Pentium® Pro BIOS Writer’s Guide, http://www.intel.com/procs/ppro/info/index.htm.

• Pentium® Processor Data Book, Order Number 241428.

• 82496 Cache Controller and 82491 Cache SRAM Data Book For Use With the Pentium® Processor, Order Number 241429.

• Intel486™ Microprocessor Data Book, Order Number 240440.

• Intel486™ SX CPU/Intel487™ SX Math Coprocessor Data Book, Order Number 240950.

• Intel486™ DX2 Microprocessor Data Book, Order Number 241245.

• Intel486™ Microprocessor Product Brief Book, Order Number 240459.

• Intel386™ Processor Hardware Reference Manual, Order Number 231732.

• Intel386™ Processor System Software Writer's Guide, Order Number 231499.

• Intel386™ High-Performance 32-Bit CHMOS Microprocessor with Integrated Memory Management, Order Number 231630.

• 376 Embedded Processor Programmer’s Reference Manual, Order Number 240314.

• 80387 DX User’s Manual Programmer’s Reference, Order Number 231917.

• 376 High-Performance 32-Bit Embedded Processor, Order Number 240182.

• Intel386™ SX Microprocessor, Order Number 240187.

• Intel Architecture Optimization Manual, Order Number 242816-002.


CHAPTER 2
SYSTEM ARCHITECTURE OVERVIEW

The 32-bit members of the Intel Architecture family of processors provide extensive support for operating-system and system-development software. This support is part of the processor’s system-level architecture and includes features to assist in the following operations:

• Memory management

• Protection of software modules

• Multitasking

• Exception and interrupt handling

• Multiprocessing

• Cache management

• Hardware resource and power management

• Debugging and performance monitoring

This chapter provides a brief overview of the processor’s system-level architecture; a detailed description of each part of this architecture is given in the following chapters. This chapter also describes the system registers that are used to set up and control the processor at the system level and gives a brief overview of the processor’s system-level (operating system) instructions.

Many of the system-level architectural features of the processor are used only by system programmers. Application programmers may need to read this chapter, and the following chapters which describe the use of these features, in order to understand the hardware facilities used by system programmers to create a reliable and secure environment for application programs.

NOTE

This overview and most of the subsequent chapters of this book focus on the “native” or protected-mode operation of the 32-bit Intel Architecture processors. As described in Chapter 8, Processor Management and Initialization, all Intel Architecture processors enter real-address mode following a power-up or reset. Software must then initiate a switch from real-address mode to protected mode.

2.1. OVERVIEW OF THE SYSTEM-LEVEL ARCHITECTURE

The Intel Architecture’s system architecture consists of a set of registers, data structures, and instructions designed to support basic system-level operations such as memory management, interrupt and exception handling, task management, and control of multiple processors (multiprocessing). Figure 2-1 provides a generalized summary of the system registers and data structures.


Figure 2-1. System-Level Registers and Data Structures

[Figure: the system registers (EFLAGS, control registers CR0 through CR4, MXCSR¹, the task register, GDTR, LDTR, and IDTR) and the data structures they locate: the global descriptor table (GDT), local descriptor table (LDT), and interrupt descriptor table (IDT), with their segment, TSS, and LDT descriptors and their interrupt, trap, task, and call gates; the task-state segments (TSS) and the code, data, and stack segments of each task, interrupt handler, exception handler, and protected procedure; and the page directory and page table that translate a linear address (Dir, Table, Offset) into a physical address. CR3 holds a physical address.]

This page mapping example is for 4-KByte pages and the normal 32-bit physical address size.

1. MXCSR is a new control/status register in the Pentium® III processor.


2.1.1. Global and Local Descriptor Tables

When operating in protected mode, all memory accesses pass through either the global descriptor table (GDT) or the (optional) local descriptor table (LDT), shown in Figure 2-1. These tables contain entries called segment descriptors. A segment descriptor provides the base address of a segment and access rights, type, and usage information. Each segment descriptor has a segment selector associated with it. The segment selector provides an index into the GDT or LDT (to its associated segment descriptor), a global/local flag (that determines whether the segment selector points to the GDT or the LDT), and access rights information.
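The three parts of a segment selector can be illustrated with a short Python sketch. The bit positions used here are not given in this overview; they follow the standard 16-bit Intel Architecture selector layout (index in bits 15 through 3, the global/local flag in bit 2, and the requested privilege level in bits 1 and 0). The names `Selector` and `decode_selector` are illustrative only.

```python
from typing import NamedTuple


class Selector(NamedTuple):
    index: int  # entry number within the GDT or LDT
    ti: int     # global/local flag (table indicator): 0 = GDT, 1 = LDT
    rpl: int    # requested privilege level, 0 (most privileged) to 3


def decode_selector(raw: int) -> Selector:
    """Split a 16-bit segment selector into its three fields."""
    return Selector(index=(raw >> 3) & 0x1FFF, ti=(raw >> 2) & 1, rpl=raw & 3)
```

For example, the selector 0008H decodes to entry 1 of the GDT with a requested privilege level of 0.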

To access a byte in a segment, both a segment selector and an offset must be supplied. The segment selector provides access to the segment descriptor for the segment (in the GDT or LDT). From the segment descriptor, the processor obtains the base address of the segment in the linear address space. The offset then provides the location of the byte relative to the base address. This mechanism can be used to access any valid code, data, or stack segment in the GDT or LDT, provided the segment is accessible from the current privilege level (CPL) at which the processor is operating. (The CPL is defined as the protection level of the currently executing code segment.)
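The lookup described above can be modeled in a few lines of Python. This is a simplified sketch, not the processor's full algorithm: the descriptor tables are plain dictionaries keyed by entry index, segment limit and type checks are omitted, and the privilege test shown is the rule for data segments (the numerically larger of the CPL and the selector's requested privilege level must not exceed the descriptor privilege level). The names `SegmentDescriptor` and `linear_address` are invented for this illustration.

```python
from dataclasses import dataclass


@dataclass
class SegmentDescriptor:
    base: int  # linear address at which the segment starts
    dpl: int   # descriptor privilege level, 0 (most privileged) to 3


def linear_address(selector: int, offset: int, gdt: dict, ldt: dict, cpl: int) -> int:
    """Translate selector:offset to a 32-bit linear address (simplified model)."""
    table = ldt if (selector >> 2) & 1 else gdt  # global/local flag picks the table
    descriptor = table[selector >> 3]            # index selects the descriptor
    rpl = selector & 3
    if max(cpl, rpl) > descriptor.dpl:           # data-segment privilege rule
        raise PermissionError("segment is not accessible at this privilege level")
    return (descriptor.base + offset) & 0xFFFFFFFF
```

With a GDT whose entry 1 has a base of 00100000H and a privilege level of 3, the address 000BH:0020H resolves to linear address 00100020H for code running at CPL 3.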

In Figure 2-1 the solid arrows indicate a linear address, the dashed lines indicate a segment selector, and the dotted arrows indicate a physical address. For simplicity, many of the segment selectors are shown as direct pointers to a segment. However, the actual path from a segment selector to its associated segment is always through the GDT or LDT.

The linear address of the base of the GDT is contained in the GDT register (GDTR); the linear address of the LDT is contained in the LDT register (LDTR).

2.1.2. System Segments, Segment Descriptors, and Gates

Besides the code, data, and stack segments that make up the execution environment of a program or procedure, the system architecture also defines two system segments: the task-state segment (TSS) and the LDT. (The GDT is not considered a segment because it is not accessed by means of a segment selector and segment descriptor.) Each of these segment types has a segment descriptor defined for it.

The system architecture also defines a set of special descriptors called gates (the call gate, inter-rupt gate, trap gate, and task gate) that provide protected gateways to system procedures andhandlers that operate at different privilege levels than application programs and procedures.For example, a CALL to a call gate provides access to a procedure in a code segment that is atthe same or numerically lower privilege level (more privileged) than the current code segment.To access a procedure through a call gate, the calling procedure1 must supply the selector of thecall gate. The processor than performs an access rights check on the call gate, comparing theCPL with the privilege level of the call gate and the destination code segment pointed to by thecall gate. If access to the destination code segment is allowed, the processor gets the segmentselector for the destination code segment and an offset into that code segment from the call gate.

1. The word “procedure” is commonly used in this document as a general term for a logical unit or block ofcode (such as a program, procedure, function, or routine). The term is not restricted to the definition of aprocedure in the Intel Architecture assembly language.

If the call requires a change in privilege level, the processor also switches to the stack for that privilege level. (The segment selector for the new stack is obtained from the TSS for the currently running task.) Gates also facilitate transitions between 16-bit and 32-bit code segments, and vice versa.

2.1.3. Task-State Segments and Task Gates

The TSS (refer to Figure 2-1) defines the state of the execution environment for a task. It includes the state of the general-purpose registers, the segment registers, the EFLAGS register, the EIP register, and segment selectors and stack pointers for three stack segments (one stack each for privilege levels 0, 1, and 2). It also includes the segment selector for the LDT associated with the task and the page-table base address.

All program execution in protected mode happens within the context of a task, called the current task. The segment selector for the TSS for the current task is stored in the task register. The simplest method of switching to a task is to make a call or jump to the task. Here, the segment selector for the TSS of the new task is given in the CALL or JMP instruction. In switching tasks, the processor performs the following actions:

1. Stores the state of the current task in the current TSS.

2. Loads the task register with the segment selector for the new task.

3. Accesses the new TSS through a segment descriptor in the GDT.

4. Loads the state of the new task from the new TSS into the general-purpose registers, the segment registers, the LDTR, control register CR3 (page-table base address), the EFLAGS register, and the EIP register.

5. Begins execution of the new task.

A task can also be accessed through a task gate. A task gate is similar to a call gate, except that it provides access (through a segment selector) to a TSS rather than a code segment.
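The five task-switch steps listed above can be mirrored in a toy C model. This is only a sketch under simplifying assumptions: the structure names and fields are hypothetical, the TSS is reduced to a plain register snapshot, and the selector index is used directly as an array index instead of going through a real TSS descriptor in the GDT. No protection checks are modeled.

```c
#include <stdint.h>

#define NUM_TASKS 4  /* size of the toy TSS table (an assumption) */

/* Simplified register snapshot standing in for a TSS. */
typedef struct {
    uint32_t eax, ebx, ecx, edx, esi, edi, ebp, esp;
    uint32_t eflags, eip, cr3;
    uint16_t cs, ds, es, fs, gs, ss, ldtr;
} task_state;

/* Toy processor: the live register state plus the task register. */
typedef struct {
    task_state live;
    uint16_t   tr;   /* segment selector of the current task's TSS */
} toy_cpu;

/* Switch to the task whose TSS selector is new_sel.
   Returns 0 on success, -1 if either selector is outside the toy table. */
int task_switch(toy_cpu *cpu, task_state tss_table[], uint16_t new_sel)
{
    unsigned cur = cpu->tr >> 3;   /* descriptor index of current TSS */
    unsigned nxt = new_sel >> 3;   /* descriptor index of new TSS     */

    if (cur >= NUM_TASKS || nxt >= NUM_TASKS)
        return -1;

    tss_table[cur] = cpu->live;    /* step 1: save state in current TSS   */
    cpu->tr = new_sel;             /* step 2: load the task register      */
    cpu->live = tss_table[nxt];    /* steps 3-4: find new TSS, load state */
    return 0;                      /* step 5: resume at cpu->live.eip     */
}
```

In this model, "beginning execution" is simply the caller continuing from the restored EIP value; a real processor also validates the TSS descriptor and updates its busy flag, which the sketch leaves out.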

2.1.4. Interrupt and Exception Handling

External interrupts, software interrupts, and exceptions are handled through the interrupt descriptor table (IDT) (refer to Figure 2-1). The IDT contains a collection of gate descriptors, which provide access to interrupt and exception handlers. Like the GDT, the IDT is not a segment. The linear address of the base of the IDT is contained in the IDT register (IDTR).

The gate descriptors in the IDT can be of the interrupt-, trap-, or task-gate type. To access an interrupt or exception handler, the processor must first receive an interrupt vector (interrupt number) from internal hardware, an external interrupt controller, or from software by means of an INT, INTO, INT 3, or BOUND instruction. The interrupt vector provides an index into the IDT to a gate descriptor. If the selected gate descriptor is an interrupt gate or a trap gate, the associated handler procedure is accessed in a manner very similar to calling a procedure through a call gate. If the descriptor is a task gate, the handler is accessed through a task switch.
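How an interrupt vector indexes the IDT can be sketched as follows. The sketch assumes 8-byte gate descriptors and treats the IDTR limit as the offset of the last valid byte in the table; both details come from the protected-mode descriptor-table format rather than from this section. The type and function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Visible contents of the IDTR: linear base address and table limit. */
typedef struct {
    uint32_t base;
    uint16_t limit;  /* assumed: offset of the last valid byte */
} idtr_t;

/* Compute the linear address of the gate descriptor for a vector.
   Returns false when the 8-byte descriptor would extend past the limit. */
bool idt_gate_address(idtr_t idtr, uint8_t vector, uint32_t *gate_addr)
{
    uint32_t offset = (uint32_t)vector * 8u;  /* 8 bytes per gate */

    if (offset + 7u > idtr.limit)
        return false;

    *gate_addr = idtr.base + offset;
    return true;
}
```

With a base of 2000H and a limit of 07FFH (256 gates), vector 14 resolves to linear address 2070H.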

2.1.5. Memory Management

The system architecture supports either direct physical addressing of memory or virtual memory (through paging). When physical addressing is used, a linear address is treated as a physical address. When paging is used, all the code, data, stack, and system segments and the GDT and IDT can be paged, with only the most recently accessed pages being held in physical memory.

The location of pages (or page frames as they are sometimes called in the Intel Architecture) in physical memory is contained in two types of system data structures (a page directory and a set of page tables), both of which reside in physical memory (refer to Figure 2-1). An entry in a page directory contains the physical address of the base of a page table, access rights, and memory management information. An entry in a page table contains the physical address of a page frame, access rights, and memory management information. The base physical address of the page directory is contained in control register CR3.

To use this paging mechanism, a linear address is broken into three parts, providing separate offsets into the page directory, the page table, and the page frame.

A system can have a single page directory or several. For example, each task can have its own page directory.
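The three-part split of a linear address can be illustrated with a short C sketch. It assumes the common case of 4-KByte pages with 32-bit page-table entries, where the address divides into a 10-bit page-directory index, a 10-bit page-table index, and a 12-bit offset into the page frame; other page sizes and the physical address extension use different splits. The names are hypothetical.

```c
#include <stdint.h>

/* The three fields of a 32-bit linear address under 4-KByte paging
   (10/10/12-bit split assumed). */
typedef struct {
    uint32_t dir_index;    /* bits 31..22: entry in the page directory */
    uint32_t table_index;  /* bits 21..12: entry in the page table     */
    uint32_t page_offset;  /* bits 11..0:  byte within the page frame  */
} linear_parts;

linear_parts split_linear(uint32_t linear)
{
    linear_parts p;
    p.dir_index   = linear >> 22;
    p.table_index = (linear >> 12) & 0x3FFu;
    p.page_offset = linear & 0xFFFu;
    return p;
}
```

For linear address 00403025H this gives page-directory entry 1, page-table entry 3, and byte offset 025H within the page frame.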

2.1.6. System Registers

To assist in initializing the processor and controlling system operations, the system architecture provides system flags in the EFLAGS register and several system registers:

• The system flags and IOPL field in the EFLAGS register control task and mode switching, interrupt handling, instruction tracing, and access rights. Refer to Section 2.3., “System Flags and Fields in the EFLAGS Register” for a description of these flags.

• The control registers (CR0, CR2, CR3, and CR4) contain a variety of flags and data fields for controlling system-level operations. With the introduction of the Pentium® III processor, CR4 now contains bits indicating operating-system support for Pentium® III processor specific capabilities. Refer to Section 2.5., “Control Registers” for a description of these flags.

• The debug registers (not shown in Figure 2-1) allow the setting of breakpoints for use in debugging programs and systems software. Refer to Chapter 15, Debugging and Performance Monitoring, for a description of these registers.

• The GDTR, LDTR, and IDTR registers contain the linear addresses and sizes (limits) of their respective tables. Refer to Section 2.4., “Memory-Management Registers” for a description of these registers.

• The task register contains the linear address and size of the TSS for the current task. Refer to Section 2.4., “Memory-Management Registers” for a description of this register.

• Model-specific registers (not shown in Figure 2-1).

The model-specific registers (MSRs) are a group of registers available primarily to operating-system or executive procedures (that is, code running at privilege level 0). These registers control items such as the debug extensions, the performance-monitoring counters, the machine-check architecture, and the memory type ranges (MTRRs). The number and functions of these registers vary among the different members of the Intel Architecture processor families. Refer to Section 8.4., “Model-Specific Registers (MSRs)” in Chapter 8, Processor Management and Initialization for more information about the MSRs and to Appendix B, Model-Specific Registers for a complete list of the MSRs.

Most systems restrict access to all system registers (other than the EFLAGS register) by application programs. Systems can be designed, however, where all programs and procedures run at the most privileged level (privilege level 0), in which case application programs are allowed to modify the system registers.

2.1.7. Other System Resources

Besides the system registers and data structures described in the previous sections, the system architecture provides the following additional resources:

• Operating system instructions (refer to Section 2.6., “System Instruction Summary”).

• Performance-monitoring counters (not shown in Figure 2-1).

• Internal caches and buffers (not shown in Figure 2-1).

The performance-monitoring counters are event counters that can be programmed to count processor events such as the number of instructions decoded, the number of interrupts received, or the number of cache loads. Refer to Section 15.6., “Performance-Monitoring Counters”, in Chapter 15, Debugging and Performance Monitoring, for more information about these counters.

The processor provides several internal caches and buffers. The caches are used to store both data and instructions. The buffers are used to store things like decoded addresses to system and application segments and write operations waiting to be performed. Refer to Chapter 9, Memory Cache Control, for a detailed discussion of the processor’s caches and buffers.

2.2. MODES OF OPERATION

The Intel Architecture supports three operating modes and one quasi-operating mode:

• Protected mode. This is the native operating mode of the processor. In this mode all instructions and architectural features are available, providing the highest performance and capability. This is the recommended mode for all new applications and operating systems.

• Real-address mode. This operating mode provides the programming environment of the Intel 8086 processor, with a few extensions (such as the ability to switch to protected or system management mode).

• System management mode (SMM). The system management mode (SMM) is a standard architectural feature in all Intel Architecture processors, beginning with the Intel386™ SL processor. This mode provides an operating system or executive with a transparent mechanism for implementing power management and OEM differentiation features. SMM is entered through activation of an external system interrupt pin (SMI#), which generates a system management interrupt (SMI). In SMM, the processor switches to a separate address space while saving the context of the currently running program or task. SMM-specific code may then be executed transparently. Upon returning from SMM, the processor is placed back into its state prior to the SMI.

• Virtual-8086 mode. In protected mode, the processor supports a quasi-operating mode known as virtual-8086 mode. This mode allows the processor to execute 8086 software in a protected, multitasking environment.

Figure 2-2 shows how the processor moves among these operating modes.

The processor is placed in real-address mode following power-up or a reset. Thereafter, the PE flag in control register CR0 controls whether the processor is operating in real-address or protected mode (refer to Section 2.5., “Control Registers”). Refer to Section 8.8., “Mode Switching” in Chapter 8, Processor Management and Initialization for detailed information on switching between real-address mode and protected mode.

The VM flag in the EFLAGS register determines whether the processor is operating in protected mode or virtual-8086 mode. Transitions between protected mode and virtual-8086 mode are generally carried out as part of a task switch or a return from an interrupt or exception handler (refer to Section 16.2.5., “Entering Virtual-8086 Mode” in Chapter 16, 8086 Emulation).

The processor switches to SMM whenever it receives an SMI while the processor is in real-address, protected, or virtual-8086 mode. Upon execution of the RSM instruction, the processor always returns to the mode it was in when the SMI occurred.
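The way the PE and VM flags select among real-address, protected, and virtual-8086 modes can be expressed as a small C function. This is a sketch only: the enum and function names are hypothetical, SMM is not modeled because it is entered by an SMI rather than by a flag setting, and the bit positions used (PE is bit 0 of CR0, VM is bit 17 of EFLAGS) are those given elsewhere in this chapter.

```c
#include <stdint.h>

typedef enum { MODE_REAL, MODE_PROTECTED, MODE_V86 } cpu_mode;

#define CR0_PE     (1u << 0)   /* protection enable, bit 0 of CR0 */
#define EFLAGS_VM  (1u << 17)  /* virtual-8086 mode, bit 17       */

/* Derive the operating mode from CR0 and EFLAGS. The VM flag is only
   meaningful when the PE flag is set. */
cpu_mode current_mode(uint32_t cr0, uint32_t eflags)
{
    if (!(cr0 & CR0_PE))
        return MODE_REAL;
    return (eflags & EFLAGS_VM) ? MODE_V86 : MODE_PROTECTED;
}
```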

Figure 2-2. Transitions Among the Processor’s Operating Modes

[Figure: state diagram. A reset places the processor in real-address mode. PE=1 moves it from real-address mode to protected mode; PE=0 or a reset moves it back. VM=1 moves it from protected mode to virtual-8086 mode; VM=0 moves it back. SMI# moves it from any of these three modes to system management mode; RSM returns it to the mode it left.]

2.3. SYSTEM FLAGS AND FIELDS IN THE EFLAGS REGISTER

The system flags and IOPL field of the EFLAGS register control I/O, maskable hardware interrupts, debugging, task switching, and the virtual-8086 mode (refer to Figure 2-3). Only privileged code (typically operating system or executive code) should be allowed to modify these bits.

The functions of the system flags and IOPL are as follows:

TF Trap (bit 8). Set to enable single-step mode for debugging; clear to disable single-step mode. In single-step mode, the processor generates a debug exception after each instruction, which allows the execution state of a program to be inspected after each instruction. If an application program sets the TF flag using a POPF, POPFD, or IRET instruction, a debug exception is generated after the instruction that follows the POPF, POPFD, or IRET instruction.

IF Interrupt enable (bit 9). Controls the response of the processor to maskable hardware interrupt requests (refer to Section 5.1.1.2., “Maskable Hardware Interrupts” in Chapter 5, Interrupt and Exception Handling). Set to respond to maskable hardware interrupts; clear to inhibit maskable hardware interrupts. The IF flag does not affect the generation of exceptions or nonmaskable interrupts (NMI interrupts). The CPL, IOPL, and the state of the VME flag in control register CR4 determine whether the IF flag can be modified by the CLI, STI, POPF, POPFD, and IRET instructions.

IOPL I/O privilege level field (bits 12 and 13). Indicates the I/O privilege level (IOPL) of the currently running program or task. The CPL of the currently running program or task must be less than or equal to the IOPL to access the I/O address space. This field can only be modified by the POPF and IRET instructions when operating at a CPL of 0. Refer to Chapter 10, Input/Output, of the Intel Architecture Software Developer’s Manual, Volume 1, for more information on the relationship of the IOPL to I/O operations.

The IOPL is also one of the mechanisms that controls the modification of the IF flag and the handling of interrupts in virtual-8086 mode when the virtual mode extensions are in effect (the VME flag in control register CR4 is set).

Figure 2-3. System Flags in the EFLAGS Register

[Figure: EFLAGS bit layout. System flags and fields: ID, Identification Flag (bit 21); VIP, Virtual Interrupt Pending (bit 20); VIF, Virtual Interrupt Flag (bit 19); AC, Alignment Check (bit 18); VM, Virtual-8086 Mode (bit 17); RF, Resume Flag (bit 16); NT, Nested Task Flag (bit 14); IOPL, I/O Privilege Level (bits 13 and 12); IF, Interrupt Enable Flag (bit 9); TF, Trap Flag (bit 8). The remaining bits hold the status and control flags (OF, DF, SF, ZF, AF, PF, CF) or are reserved.]

NT Nested task (bit 14). Controls the chaining of interrupted and called tasks. The processor sets this flag on calls to a task initiated with a CALL instruction, an interrupt, or an exception. It examines and modifies this flag on returns from a task initiated with the IRET instruction. The flag can be explicitly set or cleared with the POPF/POPFD instructions; however, changing the state of this flag can generate unexpected exceptions in application programs. Refer to Section 6.4., “Task Linking” in Chapter 6, Task Management for more information on nested tasks.

RF Resume (bit 16). Controls the processor’s response to instruction-breakpoint conditions. When set, this flag temporarily disables debug exceptions (#DB) from being generated for instruction breakpoints; other exception conditions, however, can still cause an exception to be generated. When clear, instruction breakpoints will generate debug exceptions.

The primary function of the RF flag is to allow the restarting of an instruction following a debug exception that was caused by an instruction breakpoint condition. Here, debugger software must set this flag in the EFLAGS image on the stack just prior to returning to the interrupted program with the IRETD instruction, to prevent the instruction breakpoint from causing another debug exception. The processor then automatically clears this flag after the instruction returned to has been successfully executed, enabling instruction breakpoint faults again.

Refer to Section 15.3.1.1., “Instruction-Breakpoint Exception Condition”, in Chapter 15, Debugging and Performance Monitoring, for more information on the use of this flag.

VM Virtual-8086 mode (bit 17). Set to enable virtual-8086 mode; clear to return to protected mode. Refer to Section 16.2.1., “Enabling Virtual-8086 Mode” in Chapter 16, 8086 Emulation for a detailed description of the use of this flag to switch to virtual-8086 mode.

AC Alignment check (bit 18). Set this flag and the AM flag in the CR0 register to enable alignment checking of memory references; clear the AC flag and/or the AM flag to disable alignment checking. An alignment-check exception is generated when a reference is made to an unaligned operand, such as a word at an odd byte address or a doubleword at an address which is not an integral multiple of four. Alignment-check exceptions are generated only in user mode (privilege level 3). Memory references that default to privilege level 0, such as segment descriptor loads, do not generate this exception even when caused by instructions executed in user mode.

The alignment-check exception can be used to check alignment of data. This is useful when exchanging data with other processors, which require all data to be aligned. The alignment-check exception can also be used by interpreters to flag some pointers as special by misaligning the pointer. This eliminates the overhead of checking each pointer; the special pointer is handled only when it is used.

VIF Virtual Interrupt (bit 19). Contains a virtual image of the IF flag. This flag is used in conjunction with the VIP flag. The processor only recognizes the VIF flag when either the VME flag or the PVI flag in control register CR4 is set and the IOPL is less than 3. (The VME flag enables the virtual-8086 mode extensions; the PVI flag enables the protected-mode virtual interrupts.) Refer to Section 16.3.3.5., “Method 6: Software Interrupt Handling” and Section 16.4., “Protected-Mode Virtual Interrupts” in Chapter 16, 8086 Emulation for detailed information about the use of this flag.

VIP Virtual interrupt pending (bit 20). Set by software to indicate that an interrupt is pending; cleared to indicate that no interrupt is pending. This flag is used in conjunction with the VIF flag. The processor reads this flag but never modifies it. The processor only recognizes the VIP flag when either the VME flag or the PVI flag in control register CR4 is set and the IOPL is less than 3. (The VME flag enables the virtual-8086 mode extensions; the PVI flag enables the protected-mode virtual interrupts.) Refer to Section 16.3.3.5., “Method 6: Software Interrupt Handling” and Section 16.4., “Protected-Mode Virtual Interrupts” in Chapter 16, 8086 Emulation for detailed information about the use of this flag.

ID Identification (bit 21). The ability of a program or procedure to set or clear this flag indicates support for the CPUID instruction.
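The bit positions listed above translate directly into C masks, and two of the rules in the flag descriptions can be written as predicates. The following is a sketch with hypothetical names. The I/O check models only the CPL and IOPL comparison stated above and ignores any other mechanism that may grant I/O access; the alignment check assumes the operand size is a nonzero power of two and that natural alignment is what is being tested.

```c
#include <stdint.h>
#include <stdbool.h>

/* System flags and fields in EFLAGS, at the bit positions given above. */
#define EFLAGS_TF          (1u << 8)
#define EFLAGS_IF          (1u << 9)
#define EFLAGS_IOPL_SHIFT  12
#define EFLAGS_IOPL_MASK   (3u << EFLAGS_IOPL_SHIFT)
#define EFLAGS_NT          (1u << 14)
#define EFLAGS_RF          (1u << 16)
#define EFLAGS_VM          (1u << 17)
#define EFLAGS_AC          (1u << 18)
#define EFLAGS_VIF         (1u << 19)
#define EFLAGS_VIP         (1u << 20)
#define EFLAGS_ID          (1u << 21)

#define CR0_AM             (1u << 18)  /* alignment mask, bit 18 of CR0 */

/* Extract the 2-bit I/O privilege level. */
unsigned eflags_iopl(uint32_t eflags)
{
    return (eflags & EFLAGS_IOPL_MASK) >> EFLAGS_IOPL_SHIFT;
}

/* CPL must be numerically less than or equal to IOPL. */
bool io_allowed(uint32_t eflags, unsigned cpl)
{
    return cpl <= eflags_iopl(eflags);
}

/* True when a reference of `size` bytes at `addr` would raise an
   alignment-check exception: AC and AM set, CPL 3, operand unaligned.
   `size` is assumed to be a nonzero power of two. */
bool alignment_check_faults(uint32_t eflags, uint32_t cr0, unsigned cpl,
                            uint32_t addr, uint32_t size)
{
    bool enabled = (eflags & EFLAGS_AC) && (cr0 & CR0_AM) && cpl == 3;
    return enabled && (addr % size) != 0;
}
```

For example, with AC and AM both set, a doubleword reference at address 1002H from privilege level 3 is reported as faulting, while the same reference from privilege level 0 is not.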

2.4. MEMORY-MANAGEMENT REGISTERS

The processor provides four memory-management registers (GDTR, LDTR, IDTR, and TR) that specify the locations of the data structures which control segmented memory management (refer to Figure 2-4). Special instructions are provided for loading and storing these registers.

2.4.1. Global Descriptor Table Register (GDTR)

Figure 2-4. Memory Management Registers

[Figure: register layouts. System table registers GDTR and IDTR: 32-bit linear base address (bits 47 through 16) and 16-bit table limit (bits 15 through 0). System segment registers, the task register and LDTR: a 16-bit segment selector, plus segment descriptor registers (automatically loaded) holding the 32-bit linear base address, segment limit, and attributes.]

The GDTR register holds the 32-bit base address and 16-bit table limit for the GDT. The base address specifies the linear address of byte 0 of the GDT; the table limit specifies the number of bytes in the table. The LGDT and SGDT instructions load and store the GDTR register, respectively. On power up or reset of the processor, the base address is set to the default value of 0 and the limit is set to FFFFH. A new base address must be loaded into the GDTR as part of the processor initialization process for protected-mode operation. Refer to Section 3.5.1., “Segment Descriptor Tables” in Chapter 3, Protected-Mode Memory Management for more information on the base address and limit fields.
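The 48-bit GDTR image (and likewise the IDTR image) can be sketched as a 6-byte memory operand. The layout below follows Figure 2-4, with the 16-bit limit in the low-order two bytes and the 32-bit base in the four bytes above it, stored little-endian; the type and function names are hypothetical. The sketch builds and parses the image only and does not execute LGDT or SGDT.

```c
#include <stdint.h>

/* Base and limit as held in GDTR or IDTR. */
typedef struct {
    uint16_t limit;
    uint32_t base;
} table_reg;

/* Build the 6-byte image: limit in bytes 0-1, base in bytes 2-5. */
void pack_table_reg(table_reg r, uint8_t out[6])
{
    out[0] = (uint8_t)(r.limit & 0xFFu);
    out[1] = (uint8_t)(r.limit >> 8);
    out[2] = (uint8_t)(r.base & 0xFFu);
    out[3] = (uint8_t)((r.base >> 8) & 0xFFu);
    out[4] = (uint8_t)((r.base >> 16) & 0xFFu);
    out[5] = (uint8_t)(r.base >> 24);
}

/* Parse a 6-byte image back into base and limit. */
table_reg unpack_table_reg(const uint8_t in[6])
{
    table_reg r;
    r.limit = (uint16_t)(in[0] | ((uint16_t)in[1] << 8));
    r.base  = (uint32_t)in[2]
            | ((uint32_t)in[3] << 8)
            | ((uint32_t)in[4] << 16)
            | ((uint32_t)in[5] << 24);
    return r;
}
```

Writing the bytes out explicitly avoids depending on compiler structure padding, which would otherwise insert two bytes between a 16-bit and a 32-bit member on most compilers.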

2.4.2. Local Descriptor Table Register (LDTR)

The LDTR register holds the 16-bit segment selector, 32-bit base address, 16-bit segment limit, and descriptor attributes for the LDT. The base address specifies the linear address of byte 0 of the LDT segment; the segment limit specifies the number of bytes in the segment. Refer to Section 3.5.1., “Segment Descriptor Tables” in Chapter 3, Protected-Mode Memory Management for more information on the base address and limit fields.

The LLDT and SLDT instructions load and store the segment selector part of the LDTR register, respectively. The segment that contains the LDT must have a segment descriptor in the GDT. When the LLDT instruction loads a segment selector in the LDTR, the base address, limit, and descriptor attributes from the LDT descriptor are automatically loaded into the LDTR.

When a task switch occurs, the LDTR is automatically loaded with the segment selector and descriptor for the LDT for the new task. The contents of the LDTR are not automatically saved prior to writing the new LDT information into the register.

On power up or reset of the processor, the segment selector and base address are set to the default value of 0 and the limit is set to FFFFH.
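The automatic loading of base, limit, and attributes from a descriptor can be illustrated by decoding the 8 descriptor bytes in C. The byte layout used here is the standard segment-descriptor format (described in Chapter 3, not in this section), so it is an assumption as far as this section is concerned: limit bits 15-0 in bytes 0-1, base bits 23-0 in bytes 2-4, the access byte in byte 5, limit bits 19-16 in the low nibble of byte 6, and base bits 31-24 in byte 7. The names are hypothetical and granularity scaling of the limit is not applied.

```c
#include <stdint.h>

/* The fields copied into the hidden part of LDTR or the task register. */
typedef struct {
    uint32_t base;    /* 32-bit linear base address          */
    uint32_t limit;   /* raw limit field from the descriptor */
    uint8_t  access;  /* access-rights (attribute) byte      */
} descriptor_fields;

/* Decode an 8-byte segment descriptor stored in memory order. */
descriptor_fields decode_descriptor(const uint8_t d[8])
{
    descriptor_fields f;
    f.limit  = (uint32_t)d[0]
             | ((uint32_t)d[1] << 8)
             | ((uint32_t)(d[6] & 0x0Fu) << 16);
    f.base   = (uint32_t)d[2]
             | ((uint32_t)d[3] << 8)
             | ((uint32_t)d[4] << 16)
             | ((uint32_t)d[7] << 24);
    f.access = d[5];
    return f;
}
```

The same decoding applies to a TSS descriptor loaded by LTR, since both are system-segment descriptors held in the GDT.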

2.4.3. Interrupt Descriptor Table Register (IDTR)

The IDTR register holds the 32-bit base address and 16-bit table limit for the IDT. The base address specifies the linear address of byte 0 of the IDT; the table limit specifies the number of bytes in the table. The LIDT and SIDT instructions load and store the IDTR register, respectively. On power up or reset of the processor, the base address is set to the default value of 0 and the limit is set to FFFFH. The base address and limit in the register can then be changed as part of the processor initialization process. Refer to Section 5.8., “Interrupt Descriptor Table (IDT)” in Chapter 5, Interrupt and Exception Handling for more information on the base address and limit fields.

2.4.4. Task Register (TR)

The task register holds the 16-bit segment selector, 32-bit base address, 16-bit segment limit, and descriptor attributes for the TSS of the current task. It references a TSS descriptor in the GDT. The base address specifies the linear address of byte 0 of the TSS; the segment limit specifies the number of bytes in the TSS. (Refer to Section 6.2.3., “Task Register” in Chapter 6, Task Management for more information about the task register.)

The LTR and STR instructions load and store the segment selector part of the task register, respectively. When the LTR instruction loads a segment selector in the task register, the base address, limit, and descriptor attributes from the TSS descriptor are automatically loaded into the task register. On power up or reset of the processor, the base address is set to the default value of 0 and the limit is set to FFFFH.

When a task switch occurs, the task register is automatically loaded with the segment selector and descriptor for the TSS for the new task. The contents of the task register are not automatically saved prior to writing the new TSS information into the register.

2.5. CONTROL REGISTERS

The control registers (CR0, CR1, CR2, CR3, and CR4) determine the operating mode of the processor and the characteristics of the currently executing task (refer to Figure 2-5).

Figure 2-5. Control Registers

[Figure: bit layouts of the control registers.
CR0: PG (bit 31), CD (30), NW (29), AM (18), WP (16), NE (5), ET (4), TS (3), EM (2), MP (1), PE (0); other bits reserved.
CR1: reserved.
CR2: page-fault linear address (bits 31 through 0).
CR3 (PDBR): page-directory base (bits 31 through 12), PCD (4), PWT (3); other bits reserved.
CR4: OSXMMEXCPT (bit 10), OSFXSR (9), PCE (8), PGE (7), MCE (6), PAE (5), PSE (4), DE (3), TSD (2), PVI (1), VME (0); bits 31 through 11 reserved (set to 0).]

The control registers:

• CR0: Contains system control flags that control operating mode and states of the processor.

• CR1: Reserved.

• CR2: Contains the page-fault linear address (the linear address that caused a page fault).

• CR3: Contains the physical address of the base of the page directory and two flags (PCD and PWT). This register is also known as the page-directory base register (PDBR). Only the 20 most-significant bits of the page-directory base address are specified; the lower 12 bits of the address are assumed to be 0. The page directory must thus be aligned to a page (4-KByte) boundary. The PCD and PWT flags control caching of the page directory in the processor’s internal data caches (they do not control TLB caching of page-directory information).

When using the physical address extension, the CR3 register contains the base address of the page-directory-pointer table (refer to Section 3.8., “Physical Address Extension” in Chapter 3, Protected-Mode Memory Management).

• CR4: Contains a group of flags that enable several architectural extensions, as well as indicating the level of OS support for the Streaming SIMD Extensions.
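The CR3 layout described in the list above (a 20-bit page-directory base in bits 31 through 12, with the PCD and PWT flags in bits 4 and 3) can be sketched in C as follows. The names are hypothetical, and the sketch covers only the layout without the physical address extension.

```c
#include <stdint.h>
#include <stdbool.h>

#define CR3_PWT       (1u << 3)     /* page-level write-through */
#define CR3_PCD       (1u << 4)     /* page-level cache disable */
#define CR3_PDB_MASK  0xFFFFF000u   /* bits 31..12: page-directory base */

/* Physical address of the page directory held in a CR3 value. */
uint32_t cr3_page_directory_base(uint32_t cr3)
{
    return cr3 & CR3_PDB_MASK;
}

/* Compose a CR3 value. Fails if the page directory is not aligned to a
   4-KByte boundary, since the lower 12 address bits are assumed to be 0. */
bool make_cr3(uint32_t pd_phys, bool pcd, bool pwt, uint32_t *cr3)
{
    if (pd_phys & ~CR3_PDB_MASK)
        return false;
    *cr3 = pd_phys | (pcd ? CR3_PCD : 0u) | (pwt ? CR3_PWT : 0u);
    return true;
}
```

For a page directory at physical address 00123000H with PCD set and PWT clear, the composed value is 00123010H.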

In protected mode, the move-to-or-from-control-registers forms of the MOV instruction allow the control registers to be read (at privilege level 0 only) or loaded (at privilege level 0 only). These restrictions mean that application programs (running at privilege levels 1, 2, or 3) are prevented from reading or loading the control registers.

A program running at privilege level 1, 2, or 3 should not attempt to read or write the control registers. An attempt to read or write these registers will result in a general-protection exception (#GP(0)). The functions of the flags in the control registers are as follows:

PG Paging (bit 31 of CR0). Enables paging when set; disables paging when clear. When paging is disabled, all linear addresses are treated as physical addresses. The PG flag has no effect if the PE flag (bit 0 of register CR0) is not also set; in fact, setting the PG flag when the PE flag is clear causes a general-protection exception (#GP) to be generated. Refer to Section 3.6., “Paging (Virtual Memory)” in Chapter 3, Protected-Mode Memory Management for a detailed description of the processor’s paging mechanism.

CD Cache Disable (bit 30 of CR0). When the CD and NW flags are clear, caching of memory locations for the whole of physical memory in the processor’s internal (and external) caches is enabled. When the CD flag is set, caching is restricted as described in Table 9-4, in Chapter 9, Memory Cache Control. To prevent the processor from accessing and updating its caches, the CD flag must be set and the caches must be invalidated so that no cache hits can occur (refer to Section 9.5.2., “Preventing Caching”, in Chapter 9, Memory Cache Control). Refer to Section 9.5., “Cache Control”, Chapter 9, Memory Cache Control, for a detailed description of the additional restrictions that can be placed on the caching of selected pages or regions of memory.

NW Not Write-through (bit 29 of CR0). When the NW and CD flags are clear, write-back (for Pentium® and P6 family processors) or write-through (for Intel486™ processors) is enabled for writes that hit the cache and invalidation cycles are enabled. Refer to Table 9-4, in Chapter 9, Memory Cache Control, for detailed information about the effect of the NW flag on caching for other settings of the CD and NW flags.

AM Alignment Mask (bit 18 of CR0). Enables automatic alignment checking when set; disables alignment checking when clear. Alignment checking is performed only when the AM flag is set, the AC flag in the EFLAGS register is set, the CPL is 3, and the processor is operating in either protected or virtual-8086 mode.

WP Write Protect (bit 16 of CR0). Inhibits supervisor-level procedures from writing into user-level read-only pages when set; allows supervisor-level procedures to write into user-level read-only pages when clear. This flag facilitates implementation of the copy-on-write method of creating a new process (forking) used by operating systems such as UNIX*.

NE Numeric Error (bit 5 of CR0). Enables the native (internal) mechanism for reporting FPU errors when set; enables the PC-style FPU error reporting mechanism when clear. When the NE flag is clear and the IGNNE# input is asserted, FPU errors are ignored. When the NE flag is clear and the IGNNE# input is deasserted, an unmasked FPU error causes the processor to assert the FERR# pin to generate an external interrupt and to stop instruction execution immediately before executing the next waiting floating-point instruction or WAIT/FWAIT instruction. The FERR# pin is intended to drive an input to an external interrupt controller (the FERR# pin emulates the ERROR# pin of the Intel 287 and Intel 387 DX math coprocessors). The NE flag, IGNNE# pin, and FERR# pin are used with external logic to implement PC-style error reporting. (Refer to “Software Exception Handling” in Chapter 7, and Appendix D in the Intel Architecture Software Developer’s Manual, Volume 1, for more information about FPU error reporting and for detailed information on when the FERR# pin is asserted, which is implementation dependent.)

ET Extension Type (bit 4 of CR0). Reserved in the P6 family and Pentium® processors. (In the P6 family processors, this flag is hardcoded to 1.) In the Intel386™ and Intel486™ processors, this flag indicates support of Intel 387 DX math coprocessor instructions when set.

TS Task Switched (bit 3 of CR0). Allows the saving of FPU context on a task switch to be delayed until the FPU is actually accessed by the new task. The processor sets this flag on every task switch and tests it when interpreting floating-point arithmetic instructions.

• If the TS flag is set, a device-not-available exception (#NM) is raised prior to the execution of a floating-point instruction.

• If the TS flag and the MP flag (also in the CR0 register) are both set, an #NM exception is raised prior to the execution of a floating-point instruction or a WAIT/FWAIT instruction.

Table 2-1 shows the actions taken for floating-point, WAIT/FWAIT, MMX™, and Streaming SIMD Extensions instructions based on the settings of the TS, EM, and MP flags.

The processor does not automatically save the context of the FPU on a task switch. Instead it sets the TS flag, which causes the processor to raise an #NM exception whenever it encounters a floating-point instruction in the instruction stream for the new task. The fault handler for the #NM exception can then be used to clear the TS flag (with the CLTS instruction) and save the context of the FPU. If the task never encounters a floating-point instruction, the FPU context is never saved.

Table 2-1. Action Taken for Combinations of EM, MP, TS, CR4.OSFXSR, and CPUID.XMM

CR0 Flags    CR4      CPUID   Instruction Type
EM  MP  TS   OSFXSR   XMM     Floating-Point   WAIT/FWAIT       MMX™ Technology   Streaming SIMD Extensions
0   0   0    -        -       Execute          Execute          Execute           -
0   0   1    -        -       #NM Exception    Execute          #NM Exception     -
0   1   0    -        -       Execute          Execute          Execute           -
0   1   1    -        -       #NM Exception    #NM Exception    #NM Exception     -
1   0   0    -        -       #NM Exception    Execute          #UD Exception     -
1   0   1    -        -       #NM Exception    Execute          #UD Exception     -
1   1   0    -        -       #NM Exception    Execute          #UD Exception     -
1   1   1    -        -       #NM Exception    #NM Exception    #UD Exception     -
1   -   -    -        -       -                -                -                 #UD Interrupt 6
0   -   1    1        1       -                -                -                 #NM Interrupt 7
-   -   -    0        -       -                -                -                 #UD Interrupt 6
-   -   -    -        0       -                -                -                 #UD Interrupt 6

EM Emulation (bit 2 of CR0). Indicates that the processor does not have an internal or external FPU when set; indicates an FPU is present when clear. When the EM flag is set, execution of a floating-point instruction generates a device-not-available exception (#NM). This flag must be set when the processor does not have an internal FPU or is not connected to a math coprocessor. If the processor does have an internal FPU, setting this flag would force all floating-point instructions to be handled by software emulation. Table 8-2 in Chapter 8, Processor Management and Initialization shows the recommended setting of this flag, depending on the Intel Architecture processor and FPU or math coprocessor present in the system. Table 2-1 shows the interaction of the EM, MP, and TS flags.

Note that the EM flag also affects the execution of the MMX™ instructions (refer to Table 2-1). When this flag is set, execution of an MMX™ instruction causes an invalid opcode exception (#UD) to be generated. Thus, if an Intel Architecture processor incorporates MMX™ technology, the EM flag must be set to 0 to enable execution of MMX™ instructions.

Similarly for the Streaming SIMD Extensions, when this flag is set, execution of a Streaming SIMD Extensions instruction causes an invalid opcode exception (#UD) to be generated. Thus, if an Intel Architecture processor incorporates Streaming SIMD Extensions, the EM flag must be set to 0 to enable execution of Streaming SIMD Extensions. The exceptions to this are the PREFETCH and SFENCE instructions, which are not affected by the EM flag.
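The CR0 rows of Table 2-1 can be summarized as a small decision function. The Python sketch below is only an illustration of the table: the function name `cr0_action` and the string labels are hypothetical, and the CR4.OSFXSR and CPUID.XMM rows for Streaming SIMD Extensions are not modeled.

```python
def cr0_action(em: int, mp: int, ts: int, kind: str) -> str:
    """Return the action listed in the CR0 rows of Table 2-1.

    kind is "fp" (floating-point), "wait" (WAIT/FWAIT), or "mmx".
    The names and return strings are illustrative, not an Intel API.
    """
    if kind == "fp":
        # Floating-point instructions fault if either EM or TS is set.
        return "#NM" if (em or ts) else "execute"
    if kind == "wait":
        # WAIT/FWAIT faults only when both MP and TS are set.
        return "#NM" if (mp and ts) else "execute"
    if kind == "mmx":
        # EM set makes MMX opcodes invalid; otherwise TS alone decides.
        if em:
            return "#UD"
        return "#NM" if ts else "execute"
    raise ValueError(f"unknown instruction kind: {kind}")
```

Because the function encodes each column independently, it makes visible that the MP flag matters only for WAIT/FWAIT.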

MP Monitor Coprocessor (bit 1 of CR0). Controls the interaction of the WAIT (or FWAIT) instruction with the TS flag (bit 3 of CR0). If the MP flag is set, a WAIT instruction generates a device-not-available exception (#NM) if the TS flag is set. If the MP flag is clear, the WAIT instruction ignores the setting of the TS flag. Table 8-2 in Chapter 8, Processor Management and Initialization shows the recommended setting of this flag, depending on the Intel Architecture processor and FPU or math coprocessor present in the system. Table 2-1 shows the interaction of the MP, EM, and TS flags.

PE Protection Enable (bit 0 of CR0). Enables protected mode when set; enables real-address mode when clear. This flag does not enable paging directly. It only enables segment-level protection. To enable paging, both the PE and PG flags must be set. Refer to Section 8.8., “Mode Switching” in Chapter 8, Processor Management and Initialization for information on using the PE flag to switch between real and protected mode.

PCD Page-level Cache Disable (bit 4 of CR3). Controls caching of the current page directory. When the PCD flag is set, caching of the page-directory is prevented; when the flag is clear, the page-directory can be cached. This flag affects only the processor’s internal caches (both L1 and L2, when present). The processor ignores this flag if paging is not used (the PG flag in register CR0 is clear) or the CD (cache disable) flag in CR0 is set. Refer to Chapter 9, Memory Cache Control, for more information about the use of this flag. Refer to Section 3.6.4., “Page-Directory and Page-Table Entries” in Chapter 3, Protected-Mode Memory Management for a description of a companion PCD flag in the page-directory and page-table entries.

PWT Page-level Writes Transparent (bit 3 of CR3). Controls the write-through or write-back caching policy of the current page directory. When the PWT flag is set, write-through caching is enabled; when the flag is clear, write-back caching is enabled. This flag affects only the internal caches (both L1 and L2, when present). The processor ignores this flag if paging is not used (the PG flag in register CR0 is clear) or the CD (cache disable) flag in CR0 is set. Refer to Section 9.5., “Cache Control”, in Chapter 9, Memory Cache Control, for more information about the use of this flag. Refer to Section 3.6.4., “Page-Directory and Page-Table Entries” in Chapter 3, Protected-Mode Memory Management for a description of a companion PWT flag in the page-directory and page-table entries.

VME Virtual-8086 Mode Extensions (bit 0 of CR4). Enables interrupt- and exception-handling extensions in virtual-8086 mode when set; disables the extensions when clear. Use of the virtual mode extensions can improve the performance of virtual-8086 applications by eliminating the overhead of calling the virtual-8086 monitor to handle interrupts and exceptions that occur while executing an 8086 program and, instead, redirecting the interrupts and exceptions back to the 8086 program’s handlers. It also provides hardware support for a virtual interrupt flag (VIF) to improve reliability of running 8086 programs in multitasking and multiple-processor environments. Refer to Section 16.3., “Interrupt and Exception Handling in Virtual-8086 Mode” in Chapter 16, 8086 Emulation for detailed information about the use of this feature.

PVI Protected-Mode Virtual Interrupts (bit 1 of CR4). Enables hardware support for a virtual interrupt flag (VIF) in protected mode when set; disables the VIF flag in protected mode when clear. Refer to Section 16.4., “Protected-Mode Virtual Interrupts” in Chapter 16, 8086 Emulation for detailed information about the use of this feature.

TSD Time Stamp Disable (bit 2 of CR4). Restricts the execution of the RDTSC instruction to procedures running at privilege level 0 when set; allows the RDTSC instruction to be executed at any privilege level when clear.

DE Debugging Extensions (bit 3 of CR4). References to debug registers DR4 and DR5 cause an undefined opcode (#UD) exception to be generated when set; when clear, the processor aliases references to registers DR4 and DR5 for compatibility with software written to run on earlier Intel Architecture processors. Refer to Section 15.2.2., “Debug Registers DR4 and DR5”, in Chapter 15, Debugging and Performance Monitoring, for more information on the function of this flag.

PSE Page Size Extensions (bit 4 of CR4). Enables 4-MByte pages when set; restricts pages to 4 KBytes when clear. Refer to Section 3.6.1., “Paging Options” in Chapter 3, Protected-Mode Memory Management for more information about the use of this flag.

PAE Physical Address Extension (bit 5 of CR4). Enables the paging mechanism to reference 36-bit physical addresses when set; restricts physical addresses to 32 bits when clear. Refer to Section 3.8., “Physical Address Extension” in Chapter 3, Protected-Mode Memory Management for more information about the physical address extension.

MCE Machine-Check Enable (bit 6 of CR4). Enables the machine-check exception when set; disables the machine-check exception when clear. Refer to Chapter 13, Machine-Check Architecture, for more information about the machine-check exception and machine-check architecture.

PGE Page Global Enable (bit 7 of CR4). (Introduced in the P6 family processors.) Enables the global page feature when set; disables the global page feature when clear. The global page feature allows frequently used or shared pages to be marked as global to all users (done with the global flag, bit 8, in a page-directory or page-table entry). Global pages are not flushed from the translation-lookaside buffer (TLB) on a task switch or a write to register CR3. In addition, the bit must not be enabled before paging is enabled via CR0.PG. Program correctness may be affected by reversing this sequence, and processor performance will be impacted. Refer to Section 3.7., “Translation Lookaside Buffers (TLBs)” in Chapter 3, Protected-Mode Memory Management for more information on the use of this bit.

PCE Performance-Monitoring Counter Enable (bit 8 of CR4). Enables execution of the RDPMC instruction for programs or procedures running at any protection level when set; the RDPMC instruction can be executed only at protection level 0 when clear.

OSFXSR Operating System FXSAVE/FXRSTOR Support (bit 9 of CR4). The operating system will set this bit if both the CPU and the OS support the use of FXSAVE/FXRSTOR during context switches.

OSXMMEXCPT Operating System Unmasked Exception Support (bit 10 of CR4). The operating system will set this bit if it provides support for unmasked SIMD floating-point exceptions.
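The bit positions given in the flag descriptions above lend themselves to a simple lookup table. The following Python sketch decodes a raw register value into flag names; the dictionaries cover only the CR0 and CR4 flags described in this part of the section, and the helper name `set_flags` is hypothetical.

```python
# Bit positions as stated in the flag descriptions above.
CR0_FLAGS = {"PE": 0, "MP": 1, "EM": 2, "TS": 3, "ET": 4, "NE": 5}
CR4_FLAGS = {
    "VME": 0, "PVI": 1, "TSD": 2, "DE": 3, "PSE": 4, "PAE": 5,
    "MCE": 6, "PGE": 7, "PCE": 8, "OSFXSR": 9, "OSXMMEXCPT": 10,
}


def set_flags(value: int, layout: dict) -> list:
    """Return the names of the flags in `layout` that are set in `value`."""
    return [name for name, bit in layout.items() if (value >> bit) & 1]
```

For example, a CR4 value of 0x610 decodes to PSE, OSFXSR, and OSXMMEXCPT.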

2.5.1. CPUID Qualification of Control Register Flags

The VME, PVI, TSD, DE, PSE, PAE, MCE, PGE, PCE, OSFXSR, and OSXMMEXCPT flags in control register CR4 are model specific. All of these flags (except PCE) can be qualified with the CPUID instruction to determine if they are implemented on the processor before they are used.
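As a rough illustration of this qualification step, the sketch below maps each CR4 flag to the CPUID feature bit that reports it. The bit positions are an assumption taken from the CPUID instruction's feature-flag listing (EDX returned for EAX = 1), not from this section, and should be checked against the CPUID reference; the function name is hypothetical.

```python
# CR4 flag -> bit in the CPUID feature flags (EDX for EAX = 1).
# Assumed from the CPUID reference: VME=1, DE=2, PSE=3, TSC=4, PAE=6,
# MCE=7, PGE=13, FXSR=24, XMM=25. PCE is omitted, as the text notes.
CPUID_BIT_FOR_CR4_FLAG = {
    "VME": 1, "PVI": 1, "DE": 2, "PSE": 3, "TSD": 4, "PAE": 6,
    "MCE": 7, "PGE": 13, "OSFXSR": 24, "OSXMMEXCPT": 25,
}


def usable_cr4_flags(cpuid_edx: int) -> list:
    """List the CR4 flags whose qualifying CPUID feature bit is set."""
    return sorted(
        flag for flag, bit in CPUID_BIT_FOR_CR4_FLAG.items()
        if (cpuid_edx >> bit) & 1
    )
```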

2.6. SYSTEM INSTRUCTION SUMMARY

The system instructions handle system-level functions such as loading system registers, managing the cache, managing interrupts, or setting up the debug registers. Many of these instructions can be executed only by operating-system or executive procedures (that is, procedures running at privilege level 0). Others can be executed at any privilege level and are thus available to application programs. Table 2-2 lists the system instructions and indicates whether they are available and useful for application programs. These instructions are described in detail in Chapter 3, Instruction Set Reference, of the Intel Architecture Software Developer’s Manual, Volume 2.


Table 2-2. Summary of System Instructions

Instruction      Description                             Useful to       Protected from
                                                         Application?    Application?
LLDT             Load LDT Register                       No              Yes
SLDT             Store LDT Register                      No              No
LGDT             Load GDT Register                       No              Yes
SGDT             Store GDT Register                      No              No
LTR              Load Task Register                      No              Yes
STR              Store Task Register                     No              No
LIDT             Load IDT Register                       No              Yes
SIDT             Store IDT Register                      No              No
MOV CRn          Load and store control registers        Yes             Yes (load only)
SMSW             Store MSW                               Yes             No
LMSW             Load MSW                                No              Yes
CLTS             Clear TS flag in CR0                    No              Yes
ARPL             Adjust RPL                              Yes (1)         No
LAR              Load Access Rights                      Yes             No
LSL              Load Segment Limit                      Yes             No
VERR             Verify for Reading                      Yes             No
VERW             Verify for Writing                      Yes             No
MOV DRn          Load and store debug registers          No              Yes
INVD             Invalidate cache, no writeback          No              Yes
WBINVD           Invalidate cache, with writeback        No              Yes
INVLPG           Invalidate TLB entry                    No              Yes
HLT              Halt Processor                          No              Yes
LOCK (Prefix)    Bus Lock                                Yes             No
RSM              Return from system management mode      No              Yes
RDMSR (3)        Read Model-Specific Registers           No              Yes
WRMSR (3)        Write Model-Specific Registers          No              Yes
RDPMC (4)        Read Performance-Monitoring Counter     Yes             Yes (2)
RDTSC (3)        Read Time-Stamp Counter                 Yes             Yes (2)
LDMXCSR (5)      Load MXCSR Register                     Yes             No
STMXCSR (5)      Store MXCSR Register                    Yes             No

NOTES:

1. Useful to application programs running at a CPL of 1 or 2.

2. The TSD and PCE flags in control register CR4 control access to these instructions by application programs running at a CPL of 3.

3. These instructions were introduced into the Intel Architecture with the Pentium® processor.

4. This instruction was introduced into the Intel Architecture with the Pentium® Pro processor and the Pentium® processor with MMX™ technology.

5. These instructions were introduced into the Intel Architecture with the Pentium® III processor.


2.6.1. Loading and Storing System Registers

The GDTR, LDTR, IDTR, and TR registers each have a load and store instruction for loading data into and storing data from the register:

LGDT (Load GDTR Register) Loads the GDT base address and limit from memory into the GDTR register.

SGDT (Store GDTR Register) Stores the GDT base address and limit from the GDTR register into memory.

LIDT (Load IDTR Register) Loads the IDT base address and limit from memory into the IDTR register.

SIDT (Store IDTR Register) Stores the IDT base address and limit from the IDTR register into memory.

LLDT (Load LDT Register) Loads the LDT segment selector and segment descriptor from memory into the LDTR. (The segment selector operand can also be located in a general-purpose register.)

SLDT (Store LDT Register) Stores the LDT segment selector from the LDTR register into memory or a general-purpose register.

LTR (Load Task Register) Loads the segment selector and segment descriptor for a TSS from memory into the task register. (The segment selector operand can also be located in a general-purpose register.)

STR (Store Task Register) Stores the segment selector for the current task TSS from the task register into memory or a general-purpose register.
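To make the "base address and limit" operand of LGDT, SGDT, LIDT, and SIDT concrete, the Python sketch below packs and unpacks the 6-byte memory image these instructions use. It assumes a 32-bit operand size, with the 16-bit limit in the low two bytes followed by the 32-bit base address; the function names are hypothetical.

```python
import struct


def pack_pseudo_descriptor(base: int, limit: int) -> bytes:
    """Build the 6-byte memory operand: 16-bit limit, then 32-bit base."""
    return struct.pack("<HI", limit, base)


def unpack_pseudo_descriptor(raw: bytes) -> tuple:
    """Return (base, limit) from a 6-byte image such as SGDT would store."""
    limit, base = struct.unpack("<HI", raw)
    return base, limit
```

A GDT with three 8-byte descriptors at address 0x00100000 would, under these assumptions, be described by a base of 0x00100000 and a limit of 0x17 (the table size in bytes minus 1).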

The LMSW (load machine status word) and SMSW (store machine status word) instructions operate on bits 0 through 15 of control register CR0. These instructions are provided for compatibility with the 16-bit Intel 286 processor. Programs written to run on 32-bit Intel Architecture processors should not use these instructions. Instead, they should access the control register CR0 using the MOV instruction.

The CLTS (clear TS flag in CR0) instruction is provided for use in handling a device-not-available exception (#NM) that occurs when the processor attempts to execute a floating-point instruction when the TS flag is set. This instruction allows the TS flag to be cleared after the FPU context has been saved, preventing further #NM exceptions. Refer to Section 2.5., “Control Registers” for more information about the TS flag.

The control registers (CR0, CR1, CR2, CR3, and CR4) are loaded with the MOV instruction. This instruction can load a control register from a general-purpose register or store the contents of the control register in a general-purpose register.

2.6.2. Verifying of Access Privileges

The processor provides several instructions for examining segment selectors and segment descriptors to determine if access to their associated segments is allowed. These instructions duplicate some of the automatic access rights and type checking done by the processor, thus allowing operating-system or executive software to prevent exceptions from being generated.

The ARPL (adjust RPL) instruction adjusts the RPL (requestor privilege level) of a segment selector to match that of the program or procedure that supplied the segment selector. Refer to Section 4.10.4., “Checking Caller Access Privileges (ARPL Instruction)” in Chapter 4, Protection for a detailed explanation of the function and use of this instruction.

The LAR (load access rights) instruction verifies the accessibility of a specified segment and loads the access rights information from the segment’s segment descriptor into a general-purpose register. Software can then examine the access rights to determine if the segment type is compatible with its intended use. Refer to Section 4.10.1., “Checking Access Rights (LAR Instruction)” in Chapter 4, Protection for a detailed explanation of the function and use of this instruction.

The LSL (load segment limit) instruction verifies the accessibility of a specified segment and loads the segment limit from the segment’s segment descriptor into a general-purpose register. Software can then compare the segment limit with an offset into the segment to determine whether the offset lies within the segment. Refer to Section 4.10.3., “Checking That the Pointer Offset Is Within Limits (LSL Instruction)” in Chapter 4, Protection for a detailed explanation of the function and use of this instruction.

The VERR (verify for reading) and VERW (verify for writing) instructions verify if a selected segment is readable or writable, respectively, at the CPL. Refer to Section 4.10.2., “Checking Read/Write Rights (VERR and VERW Instructions)” in Chapter 4, Protection for a detailed explanation of the function and use of these instructions.
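The RPL adjustment performed by ARPL can be modeled in a few lines. The Python sketch below assumes the usual selector layout (RPL in bits 0 and 1) and the behavior described in the ARPL instruction reference: the destination RPL is raised to the source RPL only when it is numerically lower, and a flag reports whether a change was made. The function name and return convention are hypothetical.

```python
def arpl(dest_selector: int, src_selector: int) -> tuple:
    """Model ARPL: return (adjusted destination selector, changed flag)."""
    dest_rpl = dest_selector & 0x3
    src_rpl = src_selector & 0x3
    if dest_rpl < src_rpl:
        # Replace the destination's RPL field with the less privileged
        # (numerically larger) RPL of the source selector.
        adjusted = (dest_selector & 0xFFFC) | src_rpl
        return adjusted, True
    return dest_selector, False
```

For instance, a selector of 0x0008 (RPL 0) passed in by a caller whose code segment selector is 0x0023 (RPL 3) becomes 0x000B.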

2.6.3. Loading and Storing Debug Registers

The internal debugging facilities in the processor are controlled by a set of 8 debug registers (DR0 through DR7). The MOV instruction allows setup data to be loaded into and stored from these registers.

2.6.4. Invalidating Caches and TLBs

The processor provides several instructions for use in explicitly invalidating its caches and TLB entries. The INVD (invalidate cache with no writeback) instruction invalidates all data and instruction entries in the internal caches and TLBs and sends a signal to the external caches indicating that they should be invalidated also.

The WBINVD (invalidate cache with writeback) instruction performs the same function as the INVD instruction, except that it writes back any modified lines in its internal caches to memory before it invalidates the caches. After invalidating the internal caches, it signals the external caches to write back modified data and invalidate their contents.

The INVLPG (invalidate TLB entry) instruction invalidates (flushes) the TLB entry for a specified page.


2.6.5. Controlling the Processor

The HLT (halt processor) instruction stops the processor until an enabled interrupt (such as NMI or SMI, which are normally enabled), the BINIT# signal, the INIT# signal, or the RESET# signal is received. The processor generates a special bus cycle to indicate that the halt mode has been entered. Hardware may respond to this signal in a number of ways. An indicator light on the front panel may be turned on. An NMI interrupt for recording diagnostic information may be generated. Reset initialization may be invoked. (Note that the BINIT# pin was introduced with the Pentium® Pro processor.)

The LOCK prefix invokes a locked (atomic) read-modify-write operation when modifying a memory operand. This mechanism is used to allow reliable communications between processors in multiprocessor systems. In the Pentium® and earlier Intel Architecture processors, the LOCK prefix causes the processor to assert the LOCK# signal during the instruction, which always causes an explicit bus lock to occur. In the P6 family processors, the locking operation is handled with either a cache lock or bus lock. If a memory access is cacheable and affects only a single cache line, a cache lock is invoked and the system bus and the actual memory location in system memory are not locked during the operation. Here, other P6 family processors on the bus write back any modified data and invalidate their caches as necessary to maintain system memory coherency. If the memory access is not cacheable and/or it crosses a cache line boundary, the processor’s LOCK# signal is asserted and the processor does not respond to requests for bus control during the locked operation.
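The cache-lock versus bus-lock rule described for the P6 family processors can be sketched as a simple predicate. The Python example below is illustrative only: the function name is hypothetical, and the default 32-byte cache line size is an assumption about the P6 family that is not stated in this section.

```python
def lock_kind(address: int, size: int, cacheable: bool,
              line_size: int = 32) -> str:
    """Classify a locked access as a cache lock or a bus lock.

    line_size defaults to 32 bytes (assumed, not taken from this section).
    """
    first_line = address // line_size
    last_line = (address + size - 1) // line_size
    # A cache lock requires a cacheable access confined to one cache line.
    if cacheable and first_line == last_line:
        return "cache lock"
    return "bus lock"
```

Under these assumptions, a 4-byte locked access at 0x101E crosses a line boundary and falls back to a bus lock, which is one reason locked operands are usually kept naturally aligned.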

The RSM (return from SMM) instruction restores the processor (from a context dump) to the state it was in prior to a system management mode (SMM) interrupt.

2.6.6. Reading Performance-Monitoring and Time-Stamp Counters

The RDPMC (read performance-monitoring counter) and RDTSC (read time-stamp counter) instructions allow an application program to read the processor’s performance-monitoring and time-stamp counters, respectively.

The P6 family processors have two 40-bit performance counters that record either the occurrence of events or the duration of events. The events that can be monitored include the number of instructions decoded, number of interrupts received, or number of cache loads. Each counter can be set up to monitor a different event, using the system instruction WRMSR to set up values in the model-specific registers PerfEvtSel0 and PerfEvtSel1. The RDPMC instruction loads the current count in counter 0 or 1 into the EDX:EAX registers.

The time-stamp counter is a model-specific 64-bit counter that is reset to zero each time the processor is reset. If not reset, the counter will increment ~6.3 x 10^15 times per year when the processor is operating at a clock rate of 200 MHz. At this clock frequency, it would take over 2000 years for the counter to wrap around. The RDTSC instruction loads the current count of the time-stamp counter into the EDX:EAX registers.
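The wrap-around estimate for the time-stamp counter follows from simple arithmetic. The Python sketch below reproduces it, assuming a 365-day year and one increment per clock cycle at 200 MHz; the variable names are illustrative.

```python
CLOCK_HZ = 200_000_000                 # 200 MHz, one increment per clock
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 365-day year (an assumption)

# Number of counter increments in one year of continuous operation.
increments_per_year = CLOCK_HZ * SECONDS_PER_YEAR

# Years until a 64-bit counter starting at zero wraps around.
years_to_wrap = 2**64 / increments_per_year
```

Here `increments_per_year` evaluates to 6,307,200,000,000,000 (about 6.3 x 10^15), and `years_to_wrap` is a little over 2,900, consistent with the "over 2000 years" figure.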


Refer to Section 15.5., “Time-Stamp Counter”, and Section 15.6., “Performance-Monitoring Counters”, in Chapter 15, Debugging and Performance Monitoring, for more information about the performance monitoring and time-stamp counters.

The RDTSC instruction was introduced into the Intel Architecture with the Pentium® processor. The RDPMC instruction was introduced into the Intel Architecture with the Pentium® Pro processor and the Pentium® processor with MMX™ technology. Earlier Pentium® processors have two performance-monitoring counters, but they can be read only with the RDMSR instruction, and only at privilege level 0.

2.6.7. Reading and Writing Model-Specific Registers

The RDMSR (read model-specific register) and WRMSR (write model-specific register) instructions allow the processor’s 64-bit model-specific registers (MSRs) to be read and written to, respectively. The MSR to be read or written to is specified by the value in the ECX register. The RDMSR instruction reads the value from the specified MSR into the EDX:EAX registers; the WRMSR instruction writes the value in the EDX:EAX registers into the specified MSR. Refer to Section 8.4., “Model-Specific Registers (MSRs)” in Chapter 8, Processor Management and Initialization for more information about the MSRs.

The RDMSR and WRMSR instructions were introduced into the Intel Architecture with the Pentium® processor.
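RDMSR, RDTSC, and RDPMC all return a 64-bit quantity split across the EDX:EAX register pair, with EDX holding the high-order 32 bits. A minimal Python sketch of combining and splitting such a value follows; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def combine_edx_eax(edx: int, eax: int) -> int:
    """Form the 64-bit value held in EDX:EAX (EDX is the high half)."""
    return ((edx & MASK32) << 32) | (eax & MASK32)


def split_edx_eax(value: int) -> tuple:
    """Return (edx, eax) for a 64-bit value, as WRMSR would expect them."""
    return (value >> 32) & MASK32, value & MASK32
```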

2.6.8. Loading and Storing the Streaming SIMD Extensions Control/Status Word

The LDMXCSR (load Streaming SIMD Extensions control/status word from memory) and STMXCSR (store Streaming SIMD Extensions control/status word to memory) instructions allow the Pentium® III processor’s 32-bit control/status word to be loaded from and stored to memory, respectively. The MXCSR control/status register is used to enable masked/unmasked exception handling, to set rounding modes, to set flush-to-zero mode, and to view exception status flags. Refer to the Intel Architecture Software Developer’s Manual, Volume 2, for a complete description of the LDMXCSR and STMXCSR instructions.
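The fields named above can be pulled out of a raw MXCSR value with shifts and masks. The Python sketch below assumes the MXCSR layout given in the instruction reference (exception flags in bits 0 through 5, exception masks in bits 7 through 12, rounding control in bits 13 and 14, flush-to-zero in bit 15); that layout is not stated in this section, and the function and key names are hypothetical.

```python
ROUNDING_MODES = ["nearest", "down", "up", "toward zero"]


def decode_mxcsr(value: int) -> dict:
    """Split a raw MXCSR value into its main fields (layout assumed)."""
    return {
        "exception_flags": value & 0x3F,         # bits 0-5
        "exception_masks": (value >> 7) & 0x3F,  # bits 7-12
        "rounding": ROUNDING_MODES[(value >> 13) & 0x3],
        "flush_to_zero": bool((value >> 15) & 1),
    }
```

A value of 0x1F80, for example, decodes to all six exceptions masked, round to nearest, and flush-to-zero off.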


CHAPTER 3
PROTECTED-MODE MEMORY MANAGEMENT

This chapter describes the Intel Architecture’s protected-mode memory management facilities, including the physical memory requirements, the segmentation mechanism, and the paging mechanism. Refer to Chapter 4, Protection for a description of the processor’s protection mechanism. Refer to Chapter 16, 8086 Emulation for a description of memory addressing protection in real-address and virtual-8086 modes.

3.1. MEMORY MANAGEMENT OVERVIEW

The memory management facilities of the Intel Architecture are divided into two parts: segmentation and paging. Segmentation provides a mechanism of isolating individual code, data, and stack modules so that multiple programs (or tasks) can run on the same processor without interfering with one another. Paging provides a mechanism for implementing a conventional demand-paged, virtual-memory system where sections of a program’s execution environment are mapped into physical memory as needed. Paging can also be used to provide isolation between multiple tasks. When operating in protected mode, some form of segmentation must be used. There is no mode bit to disable segmentation. The use of paging, however, is optional.

These two mechanisms (segmentation and paging) can be configured to support simple single-program (or single-task) systems, multitasking systems, or multiple-processor systems that use shared memory.

As shown in Figure 3-1, segmentation provides a mechanism for dividing the processor’s addressable memory space (called the linear address space) into smaller protected address spaces called segments. Segments can be used to hold the code, data, and stack for a program or to hold system data structures (such as a TSS or LDT). If more than one program (or task) is running on a processor, each program can be assigned its own set of segments. The processor then enforces the boundaries between these segments and ensures that one program does not interfere with the execution of another program by writing into the other program’s segments. The segmentation mechanism also allows typing of segments so that the operations that may be performed on a particular type of segment can be restricted.

All of the segments within a system are contained in the processor’s linear address space. To locate a byte in a particular segment, a logical address (sometimes called a far pointer) must be provided. A logical address consists of a segment selector and an offset. The segment selector is a unique identifier for a segment. Among other things it provides an offset into a descriptor table (such as the global descriptor table, GDT) to a data structure called a segment descriptor. Each segment has a segment descriptor, which specifies the size of the segment, the access rights and privilege level for the segment, the segment type, and the location of the first byte of the segment in the linear address space (called the base address of the segment). The offset part of the logical address is added to the base address for the segment to locate a byte within the segment. The base address plus the offset thus forms a linear address in the processor’s linear address space.
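The logical-to-linear translation just described amounts to a table lookup followed by an addition. The Python sketch below models a descriptor table as a dictionary from descriptor index to a (base, limit) pair; this representation, the function name, and the use of `ValueError` in place of a processor exception are all simplifications for illustration. It assumes the selector carries the descriptor index in bits 3 through 15.

```python
def linear_address(descriptor_table: dict, selector: int, offset: int) -> int:
    """Translate a logical address (selector:offset) to a linear address."""
    index = selector >> 3  # bits 3-15 select the descriptor
    base, limit = descriptor_table[index]
    if offset > limit:
        # The processor would raise an exception here; a Python error
        # stands in for it in this sketch.
        raise ValueError("offset beyond segment limit")
    return (base + offset) & 0xFFFFFFFF
```

With a descriptor at index 1 describing a segment based at 0x10000 with a limit of 0xFFFF, the logical address 0x0008:0x1234 maps to linear address 0x11234.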

If paging is not used, the linear address space of the processor is mapped directly into the physical address space of the processor. The physical address space is defined as the range of addresses that the processor can generate on its address bus.

Because multitasking computing systems commonly define a linear address space much larger than it is economically feasible to contain all at once in physical memory, some method of “virtualizing” the linear address space is needed. This virtualization of the linear address space is handled through the processor’s paging mechanism.

Paging supports a “virtual memory” environment where a large linear address space is simulated with a small amount of physical memory (RAM and ROM) and some disk storage. When using paging, each segment is divided into pages (ordinarily 4 KBytes each in size), which are stored either in physical memory or on the disk. The operating system or executive maintains a page directory and a set of page tables to keep track of the pages. When a program (or task) attempts to access an address location in the linear address space, the processor uses the page directory and page tables to translate the linear address into a physical address and then performs the requested operation (read or write) on the memory location. If the page being accessed is not currently in physical memory, the processor interrupts execution of the program (by generating a page-fault exception). The operating system or executive then reads the page into physical memory from the disk and continues executing the program.

Figure 3-1. Segmentation and Paging
[Figure not reproduced. Segmentation: a logical address (or far pointer) consists of a segment selector and an offset; the selector locates a segment descriptor in the global descriptor table (GDT), which supplies the segment base address in the linear address space. Paging: the linear address is divided into Dir, Table, and Offset fields, which select a page-directory entry, a page-table entry, and a byte within a page in the physical address space.]

When paging is implemented properly in the operating-system or executive, the swapping of pages between physical memory and the disk is transparent to the correct execution of a program. Even programs written for 16-bit Intel Architecture processors can be paged (transparently) when they are run in virtual-8086 mode.
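The two-level lookup through the page directory and page tables can be sketched as follows. This Python example assumes 4-KByte pages and the 10/10/12-bit split of the linear address into Dir, Table, and Offset fields shown in Figure 3-1; nested dictionaries stand in for the in-memory tables, and the names `split_linear`, `translate`, and `PageFault` are hypothetical.

```python
class PageFault(Exception):
    """Stands in for the page-fault exception in this sketch."""


def split_linear(addr: int) -> tuple:
    """Split a 32-bit linear address into (dir, table, offset) fields."""
    return (addr >> 22) & 0x3FF, (addr >> 12) & 0x3FF, addr & 0xFFF


def translate(page_directory: dict, addr: int) -> int:
    """Walk {dir: {table: page_base}} and return the physical address."""
    dir_index, table_index, offset = split_linear(addr)
    page_table = page_directory.get(dir_index)
    if page_table is None or table_index not in page_table:
        # Page not present: the operating system would load it and retry.
        raise PageFault(hex(addr))
    return page_table[table_index] + offset
```

An operating system's fault handler would, in this model, add the missing entry to the dictionaries and let the access be retried, which is what makes the swapping transparent to the program.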

3.2. USING SEGMENTS

The segmentation mechanism supported by the Intel Architecture can be used to implement a wide variety of system designs. These designs range from flat models that make only minimal use of segmentation to protect programs to multisegmented models that employ segmentation to create a robust operating environment in which multiple programs and tasks can be executed reliably.

The following sections give several examples of how segmentation can be employed in a system to improve memory management performance and reliability.

3.2.1. Basic Flat Model

The simplest memory model for a system is the basic “flat model,” in which the operating system and application programs have access to a continuous, unsegmented address space. To the greatest extent possible, this basic flat model hides the segmentation mechanism of the architecture from both the system designer and the application programmer.

To implement a basic flat memory model with the Intel Architecture, at least two segment descriptors must be created, one for referencing a code segment and one for referencing a data segment (refer to Figure 3-2). Both of these segments, however, are mapped to the entire linear address space: that is, both segment descriptors have the same base address value of 0 and the same segment limit of 4 GBytes. By setting the segment limit to 4 GBytes, the segmentation mechanism is kept from generating exceptions for out of limit memory references, even if no physical memory resides at a particular address. ROM (EPROM) is generally located at the top of the physical address space, because the processor begins execution at FFFF_FFF0H. RAM (DRAM) is placed at the bottom of the address space because the initial base address for the DS data segment after reset initialization is 0.

Figure 3-2. Flat Model
[Figure not reproduced. The segment registers CS, SS, DS, ES, FS, and GS all reference code- and data-segment descriptors (access, limit, base address) that map code, data, and stack onto the whole linear address space (or physical memory), from 0 to FFFFFFFFH, including regions marked “Not Present.”]

3.2.2. Protected Flat Model

The protected flat model is similar to the basic flat model, except the segment limits are set to include only the range of addresses for which physical memory actually exists (refer to Figure 3-3). A general-protection exception (#GP) is then generated on any attempt to access nonexistent memory. This model provides a minimum level of hardware protection against some kinds of program bugs.

More complexity can be added to this protected flat model to provide more protection. For example, for the paging mechanism to provide isolation between user and supervisor code and data, four segments need to be defined: code and data segments at privilege level 3 for the user, and code and data segments at privilege level 0 for the supervisor. Usually these segments all overlay each other and start at address 0 in the linear address space. This flat segmentation model along with a simple paging structure can protect the operating system from applications, and by adding a separate paging structure for each task or process, it can also protect applications from each other. Similar designs are used by several popular multitasking operating systems.

Figure 3-3. Protected Flat Model
[Figure not reproduced. CS references one segment descriptor and ES, SS, DS, FS, and GS reference another (each with access, limit, and base address). The limits cover the code, data and stack, and memory I/O regions of the linear address space (or physical memory) and exclude the region marked “Not Present.”]

3.2.3. Multisegment Model

A multisegment model (such as the one shown in Figure 3-4) uses the full capabilities of the segmentation mechanism to provide hardware-enforced protection of code, data structures, and programs and tasks. Here, each program (or task) is given its own table of segment descriptors and its own segments. The segments can be completely private to their assigned programs or shared among programs. Access to all segments and to the execution environments of individual programs running on the system is controlled by hardware.

Figure 3-4. Multisegment Model
[Figure not reproduced. Each of the segment registers CS, SS, DS, ES, FS, and GS references its own segment descriptor (access, limit, base address), and further descriptors exist that are not currently loaded. Each descriptor maps a separate stack, code, or data segment in the linear address space (or physical memory).]


Access checks can be used to protect not only against referencing an address outside the limit of a segment, but also against performing disallowed operations in certain segments. For example, since code segments are designated as read-only segments, hardware can be used to prevent writes into code segments. The access rights information created for segments can also be used to set up protection rings or levels. Protection levels can be used to protect operating-system procedures from unauthorized access by application programs.

3.2.4. Paging and Segmentation

Paging can be used with any of the segmentation models described in Figures 3-2, 3-3, and 3-4. The processor’s paging mechanism divides the linear address space (into which segments are mapped) into pages (as shown in Figure 3-1). These linear-address-space pages are then mapped to pages in the physical address space. The paging mechanism offers several page-level protection facilities that can be used with or instead of the segment-protection facilities. For example, it lets read-write protection be enforced on a page-by-page basis. The paging mechanism also provides two-level user-supervisor protection that can also be specified on a page-by-page basis.

3.3. PHYSICAL ADDRESS SPACE

In protected mode, the Intel Architecture provides a normal physical address space of 4 GBytes (2^32 bytes). This is the address space that the processor can address on its address bus. This address space is flat (unsegmented), with addresses ranging continuously from 0 to FFFFFFFFH. This physical address space can be mapped to read-write memory, read-only memory, and memory mapped I/O. The memory mapping facilities described in this chapter can be used to divide this physical memory up into segments and/or pages.

(Introduced in the Pentium® Pro processor.) The Intel Architecture also supports an extension of the physical address space to 2^36 bytes (64 GBytes), with a maximum physical address of FFFFFFFFFH. This extension is invoked with the physical address extension (PAE) flag, located in bit 5 of control register CR4. (Refer to Section 3.8., “Physical Address Extension” for more information about extended physical addressing.)

3.4. LOGICAL AND LINEAR ADDRESSES

At the system-architecture level in protected mode, the processor uses two stages of address translation to arrive at a physical address: logical-address translation and linear address space paging.

Even with the minimum use of segments, every byte in the processor’s address space is accessed with a logical address. A logical address consists of a 16-bit segment selector and a 32-bit offset (refer to Figure 3-5). The segment selector identifies the segment the byte is located in and the offset specifies the location of the byte in the segment relative to the base address of the segment.

The processor translates every logical address into a linear address. A linear address is a 32-bit address in the processor’s linear address space. Like the physical address space, the linear address space is a flat (unsegmented), 2^32-byte address space, with addresses ranging from 0 to FFFFFFFFH. The linear address space contains all the segments and system tables defined for a system.

To translate a logical address into a linear address, the processor does the following:

1. Uses the offset in the segment selector to locate the segment descriptor for the segment in the GDT or LDT and reads it into the processor. (This step is needed only when a new segment selector is loaded into a segment register.)

2. Examines the segment descriptor to check the access rights and range of the segment to ensure that the segment is accessible and that the offset is within the limits of the segment.

3. Adds the base address of the segment from the segment descriptor to the offset to form a linear address.
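The three steps above can be sketched in C. This is a simplified model, not the processor's actual implementation: the `seg_desc` structure and the `logical_to_linear` function are hypothetical names, the descriptor table is modeled as an array that has already been chosen by the selector's TI flag, only expand-up segments are handled, and access-rights checks are reduced to a single "present" test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of the descriptor fields used below. */
typedef struct {
    uint32_t base;       /* segment base address in the linear space */
    uint32_t max_offset; /* highest valid offset (expand-up segment) */
    bool     present;    /* stand-in for the access-rights checks    */
} seg_desc;

/* Returns false where the processor would raise an exception. */
bool logical_to_linear(const seg_desc *table, uint32_t entries,
                       uint16_t selector, uint32_t offset,
                       uint32_t *linear)
{
    assert(table != NULL && linear != NULL);

    /* Step 1: the index field (bits 3..15) locates the descriptor. */
    uint32_t index = (uint32_t)selector >> 3;
    if (index >= entries)
        return false;
    const seg_desc *d = &table[index];

    /* Step 2: check accessibility and that the offset is in range. */
    if (!d->present || offset > d->max_offset)
        return false;

    /* Step 3: base address plus offset forms the linear address. */
    *linear = d->base + offset;
    return true;
}
```

With a table whose entry 1 has base 00100000H and a highest valid offset of FFFFH, selector 0008H with offset 1234H would translate to linear address 00101234H, while offset 10000H would be rejected.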

If paging is not used, the processor maps the linear address directly to a physical address (that is, the linear address goes out on the processor’s address bus). If the linear address space is paged, a second level of address translation is used to translate the linear address into a physical address. Page translation is described in Section 3.6., “Paging (Virtual Memory)”.

Figure 3-5. Logical Address to Linear Address Translation
[Figure not reproduced. The logical address consists of a segment selector (bits 15 through 0) and an offset (bits 31 through 0). The selector picks a segment descriptor from a descriptor table; the base address from that descriptor is added to the offset to form the linear address (bits 31 through 0).]

3.4.1. Segment Selectors

A segment selector is a 16-bit identifier for a segment (refer to Figure 3-6). It does not point directly to the segment, but instead points to the segment descriptor that defines the segment. A segment selector contains the following items:

Index (Bits 3 through 15). Selects one of 8192 descriptors in the GDT or LDT. The processor multiplies the index value by 8 (the number of bytes in a segment descriptor) and adds the result to the base address of the GDT or LDT (from the GDTR or LDTR register, respectively).

TI (table indicator) flag (Bit 2). Specifies the descriptor table to use: clearing this flag selects the GDT; setting this flag selects the current LDT.

Requested Privilege Level (RPL) (Bits 0 and 1). Specifies the privilege level of the selector. The privilege level can range from 0 to 3, with 0 being the most privileged level. Refer to Section 4.5., “Privilege Levels” in Chapter 4, Protection for a description of the relationship of the RPL to the CPL of the executing program (or task) and the descriptor privilege level (DPL) of the descriptor the segment selector points to.
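The selector layout described above can be illustrated with a few bit operations. The following is a minimal sketch; the type `selector_fields` and the function names are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint16_t index; /* bits 3..15: descriptor number, 0..8191 */
    uint8_t  ti;    /* bit 2: 0 = GDT, 1 = current LDT        */
    uint8_t  rpl;   /* bits 0..1: requested privilege level   */
} selector_fields;

/* Split a 16-bit selector into its three fields. */
selector_fields selector_decode(uint16_t sel)
{
    selector_fields f;
    f.index = (uint16_t)(sel >> 3);
    f.ti    = (uint8_t)((sel >> 2) & 1u);
    f.rpl   = (uint8_t)(sel & 3u);
    return f;
}

/* Build a selector from its fields. */
uint16_t selector_encode(uint16_t index, uint8_t ti, uint8_t rpl)
{
    assert(index < 8192 && ti <= 1 && rpl <= 3);
    return (uint16_t)((index << 3) | (ti << 2) | rpl);
}

/* Byte offset of the selected descriptor from the table base:
   the index multiplied by 8, the size of one descriptor. */
uint32_t selector_descriptor_offset(uint16_t sel)
{
    return ((uint32_t)sel >> 3) * 8u;
}
```

For example, selector 001BH decodes to index 3, TI = 0 (GDT), and RPL = 3, and its descriptor lies 24 bytes from the base address held in the GDTR register.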

The first entry of the GDT is not used by the processor. A segment selector that points to this entry of the GDT (that is, a segment selector with an index of 0 and the TI flag set to 0) is used as a “null segment selector.” The processor does not generate an exception when a segment register (other than the CS or SS registers) is loaded with a null selector. It does, however, generate an exception when a segment register holding a null selector is used to access memory. A null selector can be used to initialize unused segment registers. Loading the CS or SS register with a null segment selector causes a general-protection exception (#GP) to be generated.

Segment selectors are visible to application programs as part of a pointer variable, but the values of selectors are usually assigned or modified by link editors or linking loaders, not application programs.

Figure 3-6. Segment Selector
[Figure not reproduced. Bits 15 through 3: Index. Bit 2: TI, the table indicator (0 = GDT, 1 = LDT). Bits 1 and 0: RPL, the requested privilege level.]

3.4.2. Segment Registers

To reduce address translation time and coding complexity, the processor provides registers for holding up to 6 segment selectors (refer to Figure 3-7). Each of these segment registers supports a specific kind of memory reference (code, stack, or data). For virtually any kind of program execution to take place, at least the code-segment (CS), data-segment (DS), and stack-segment (SS) registers must be loaded with valid segment selectors. The processor also provides three additional data-segment registers (ES, FS, and GS), which can be used to make additional data segments available to the currently executing program (or task).

For a program to access a segment, the segment selector for the segment must have been loaded in one of the segment registers. So, although a system can define thousands of segments, only 6 can be available for immediate use. Other segments can be made available by loading their segment selectors into these registers during program execution.

Every segment register has a “visible” part and a “hidden” part. (The hidden part is sometimes referred to as a “descriptor cache” or a “shadow register.”) When a segment selector is loaded into the visible part of a segment register, the processor also loads the hidden part of the segment register with the base address, segment limit, and access control information from the segment descriptor pointed to by the segment selector. The information cached in the segment register (visible and hidden) allows the processor to translate addresses without taking extra bus cycles to read the base address and limit from the segment descriptor. In systems in which multiple processors have access to the same descriptor tables, it is the responsibility of software to reload the segment registers when the descriptor tables are modified. If this is not done, an old segment descriptor cached in a segment register might be used after its memory-resident version has been modified.

Two kinds of load instructions are provided for loading the segment registers:

1. Direct load instructions such as the MOV, POP, LDS, LES, LSS, LGS, and LFS instructions. These instructions explicitly reference the segment registers.

2. Implied load instructions such as the far pointer versions of the CALL, JMP, and RET instructions and the IRET, INT n, INTO and INT3 instructions. These instructions change the contents of the CS register (and sometimes other segment registers) as an incidental part of their operation.

The MOV instruction can also be used to store the visible part of a segment register in a general-purpose register.

Figure 3-7. Segment Registers
[Figure not reproduced. Each of the six segment registers (CS, SS, DS, ES, FS, and GS) has a visible part, which holds the segment selector, and a hidden part, which holds the base address, limit, and access information.]

3.4.3. Segment Descriptors

A segment descriptor is a data structure in a GDT or LDT that provides the processor with the size and location of a segment, as well as access control and status information. Segment descriptors are typically created by compilers, linkers, loaders, or the operating system or executive, but not application programs. Figure 3-8 illustrates the general descriptor format for all types of segment descriptors.

The flags and fields in a segment descriptor are as follows:

Segment limit field. Specifies the size of the segment. The processor puts together the two segment limit fields to form a 20-bit value. The processor interprets the segment limit in one of two ways, depending on the setting of the G (granularity) flag:

• If the granularity flag is clear, the segment size can range from 1 byte to 1 MByte, in byte increments.

• If the granularity flag is set, the segment size can range from 4 KBytes to 4 GBytes, in 4-KByte increments.

The processor uses the segment limit in two different ways, depending on whether the segment is an expand-up or an expand-down segment. Refer to Section 3.4.3.1., “Code- and Data-Segment Descriptor Types” for more information about segment types. For expand-up segments, the offset in a logical address can range from 0 to the segment limit. Offsets greater than the segment limit generate general-protection exceptions (#GP). For expand-down segments, the segment limit has the reverse function; the offset can range from the segment limit plus 1 to FFFFFFFFH or FFFFH, depending on the setting of the B flag. Offsets less than or equal to the segment limit generate general-protection exceptions. Decreasing the value in the segment limit field for an expand-down segment allocates new memory at the bottom of the segment's address space, rather than at the top. Intel Architecture stacks always grow downwards, making this mechanism convenient for expandable stacks.


Base address fields. Defines the location of byte 0 of the segment within the 4-GByte linear address space. The processor puts together the three base address fields to form a single 32-bit value. Segment base addresses should be aligned to 16-byte boundaries. Although 16-byte alignment is not required, this alignment allows programs to maximize performance by aligning code and data on 16-byte boundaries.

Type field. Indicates the segment or gate type and specifies the kinds of access that can be made to the segment and the direction of growth. The interpretation of this field depends on whether the descriptor type flag specifies an application (code or data) descriptor or a system descriptor. The encoding of the type field is different for code, data, and system descriptors (refer to Figure 4-1 in Chapter 4, Protection). Refer to Section 3.4.3.1., “Code- and Data-Segment Descriptor Types” for a description of how this field is used to specify code and data-segment types.

S (descriptor type) flag. Specifies whether the segment descriptor is for a system segment (S flag is clear) or a code or data segment (S flag is set).

Figure 3-8. Segment Descriptor

Second doubleword (byte offset 4):
Bits 31:24   Base 31:24
Bit 23       G
Bit 22       D/B
Bit 21       0
Bit 20       AVL
Bits 19:16   Seg. Limit 19:16
Bit 15       P
Bits 14:13   DPL
Bit 12       S
Bits 11:8    Type
Bits 7:0     Base 23:16

First doubleword (byte offset 0):
Bits 31:16   Base Address 15:00
Bits 15:0    Segment Limit 15:00

AVL: Available for use by system software
BASE: Segment base address
D/B: Default operation size (0 = 16-bit segment; 1 = 32-bit segment)
DPL: Descriptor privilege level
G: Granularity
LIMIT: Segment limit
P: Segment present
S: Descriptor type (0 = system; 1 = code or data)
TYPE: Segment type


DPL (descriptor privilege level) field. Specifies the privilege level of the segment. The privilege level can range from 0 to 3, with 0 being the most privileged level. The DPL is used to control access to the segment. Refer to Section 4.5., “Privilege Levels” in Chapter 4, Protection for a description of the relationship of the DPL to the CPL of the executing code segment and the RPL of a segment selector.

P (segment-present) flag. Indicates whether the segment is present in memory (set) or not present (clear). If this flag is clear, the processor generates a segment-not-present exception (#NP) when a segment selector that points to the segment descriptor is loaded into a segment register. Memory management software can use this flag to control which segments are actually loaded into physical memory at a given time. It offers a control in addition to paging for managing virtual memory.

Figure 3-9 shows the format of a segment descriptor when the segment-present flag is clear. When this flag is clear, the operating system or executive is free to use the locations marked “Available” to store its own data, such as information regarding the whereabouts of the missing segment.

D/B (default operation size/default stack pointer size and/or upper bound) flag. Performs different functions depending on whether the segment descriptor is an executable code segment, an expand-down data segment, or a stack segment. (This flag should always be set to 1 for 32-bit code and data segments and to 0 for 16-bit code and data segments.)

• Executable code segment. The flag is called the D flag and it indicates the default length for effective addresses and operands referenced by instructions in the segment. If the flag is set, 32-bit addresses and 32-bit or 8-bit operands are assumed; if it is clear, 16-bit addresses and 16-bit or 8-bit operands are assumed. The instruction prefix 66H can be used to select an operand size other than the default, and the prefix 67H can be used to select an address size other than the default.

• Stack segment (data segment pointed to by the SS register). The flag is called the B (big) flag and it specifies the size of the stack pointer used for implicit stack operations (such as pushes, pops, and calls). If the flag is set, a 32-bit stack pointer is used, which is stored in the 32-bit ESP register; if the flag is clear, a 16-bit stack pointer is used, which is stored in the 16-bit SP register. If the stack segment is set up to be an expand-down data segment (described in the next paragraph), the B flag also specifies the upper bound of the stack segment.

• Expand-down data segment. The flag is called the B flag and it specifies the upper bound of the segment. If the flag is set, the upper bound is FFFFFFFFH (4 GBytes); if the flag is clear, the upper bound is FFFFH (64 KBytes).


G (granularity) flag. Determines the scaling of the segment limit field. When the granularity flag is clear, the segment limit is interpreted in byte units; when the flag is set, the segment limit is interpreted in 4-KByte units. (This flag does not affect the granularity of the base address; it is always byte granular.) When the granularity flag is set, the twelve least significant bits of an offset are not tested when checking the offset against the segment limit. For example, when the granularity flag is set, a limit of 0 results in valid offsets from 0 to 4095.

Available and reserved bits. Bit 20 of the second doubleword of the segment descriptor is available for use by system software; bit 21 is reserved and should always be set to 0.
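The field positions described above can be made concrete by packing a descriptor into its two doublewords. The following is a sketch only; the `descriptor` type, the `DESC_*` macros, and the function names are hypothetical and not part of any Intel-supplied interface. The bit positions follow the descriptor format in Figure 3-8.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: the two doublewords of one descriptor. */
typedef struct { uint32_t lo, hi; } descriptor;

/* Single-bit flags in the second doubleword. */
#define DESC_G   (1u << 23) /* granularity            */
#define DESC_DB  (1u << 22) /* default size / big     */
#define DESC_AVL (1u << 20) /* available to software  */
#define DESC_P   (1u << 15) /* segment present        */
#define DESC_S   (1u << 12) /* 1 = code or data       */

descriptor descriptor_pack(uint32_t base, uint32_t limit20,
                           uint32_t type, uint32_t dpl, uint32_t flags)
{
    assert(limit20 <= 0xFFFFFu && type <= 0xFu && dpl <= 3u);
    descriptor d;
    d.lo = (base << 16) | (limit20 & 0xFFFFu);
    d.hi = (base & 0xFF000000u)        /* base 31:24  */
         | ((base >> 16) & 0xFFu)      /* base 23:16  */
         | (limit20 & 0xF0000u)        /* limit 19:16 */
         | (type << 8) | (dpl << 13)
         | (flags & (DESC_G | DESC_DB | DESC_AVL | DESC_P | DESC_S));
    return d;
}

/* Reassemble the 32-bit base from its three fields. */
uint32_t descriptor_base(descriptor d)
{
    return (d.hi & 0xFF000000u) | ((d.hi & 0xFFu) << 16) | (d.lo >> 16);
}

/* Highest valid offset of an expand-up segment: the 20-bit limit,
   scaled to 4-KByte units (low 12 bits all ones) when G is set. */
uint32_t descriptor_effective_limit(descriptor d)
{
    uint32_t limit20 = (d.hi & 0xF0000u) | (d.lo & 0xFFFFu);
    return (d.hi & DESC_G) ? ((limit20 << 12) | 0xFFFu) : limit20;
}
```

As a usage example, a flat-model code segment (base 0, limit FFFFFH, type 10, DPL 0, with the G, D/B, P, and S flags set) packs to a first doubleword of 0000FFFFH and a second doubleword of 00CF9A00H, and its effective limit is FFFFFFFFH.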

3.4.3.1. CODE- AND DATA-SEGMENT DESCRIPTOR TYPES

When the S (descriptor type) flag in a segment descriptor is set, the descriptor is for either a code or a data segment. The highest order bit of the type field (bit 11 of the second doubleword of the segment descriptor) then determines whether the descriptor is for a data segment (clear) or a code segment (set).

For data segments, the three low-order bits of the type field (bits 8, 9, and 10) are interpreted as accessed (A), write-enable (W), and expansion-direction (E). Refer to Table 3-1 for a description of the encoding of the bits in the type field for code and data segments. Data segments can be read-only or read/write segments, depending on the setting of the write-enable bit.

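Figure 3-9. Segment Descriptor When Segment-Present Flag Is Clear

Second doubleword (byte offset 4):
Bits 31:16   Available
Bit 15       0 (P flag clear)
Bits 14:13   DPL
Bit 12       S
Bits 11:8    Type
Bits 7:0     Available

First doubleword (byte offset 0):
Bits 31:0    Available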


Stack segments are data segments which must be read/write segments. Loading the SS register with a segment selector for a nonwritable data segment generates a general-protection exception (#GP). If the size of a stack segment needs to be changed dynamically, the stack segment can be an expand-down data segment (expansion-direction flag set). Here, dynamically changing the segment limit causes stack space to be added to the bottom of the stack. If the size of a stack segment is intended to remain static, the stack segment may be either an expand-up or expand-down type.
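The difference between expand-up and expand-down limit checking can be sketched as follows. This is an illustrative model under stated assumptions, not the processor's implementation: the function name `offset_valid` is hypothetical, the check covers a single byte (a multi-byte access would also need its last byte checked), and the lowest valid offset of an expand-down segment is taken to be the effective limit plus 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* limit20:     20-bit value from the segment limit fields
   g:           G (granularity) flag
   expand_down: E bit of a data-segment type field
   big:         B flag (upper bound of an expand-down segment) */
bool offset_valid(uint32_t offset, uint32_t limit20,
                  bool g, bool expand_down, bool big)
{
    assert(limit20 <= 0xFFFFFu);

    /* Scale the limit: with G set, the low 12 offset bits are not tested. */
    uint32_t limit = g ? ((limit20 << 12) | 0xFFFu) : limit20;

    if (!expand_down)
        return offset <= limit;               /* 0 .. limit */

    uint32_t upper = big ? 0xFFFFFFFFu : 0xFFFFu;
    return offset > limit && offset <= upper; /* limit + 1 .. upper */
}
```

Under this model, lowering the limit of an expand-down stack segment from EFFFH to DFFFH makes offsets E000H through EFFFH valid, which adds 4 KBytes at the bottom of the stack without moving its top.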

The accessed bit indicates whether the segment has been accessed since the last time the operating system or executive cleared the bit. The processor sets this bit whenever it loads a segment selector for the segment into a segment register. The bit remains set until explicitly cleared. This bit can be used both for virtual memory management and for debugging.

For code segments, the three low-order bits of the type field are interpreted as accessed (A), read enable (R), and conforming (C). Code segments can be execute-only or execute/read, depending on the setting of the read-enable bit. An execute/read segment might be used when constants or other static data have been placed with instruction code in a ROM. Here, data can be read from the code segment either by using an instruction with a CS override prefix or by loading a segment selector for the code segment in a data-segment register (the DS, ES, FS, or GS registers). In protected mode, code segments are not writable.

Code segments can be either conforming or nonconforming. A transfer of execution into a more-privileged conforming segment allows execution to continue at the current privilege level. A transfer into a nonconforming segment at a different privilege level results in a general-protection exception (#GP), unless a call gate or task gate is used (refer to Section 4.8.1., “Direct Calls or Jumps to Code Segments” in Chapter 4, Protection for more information on conforming and nonconforming code segments). System utilities that do not access protected facilities and handlers for some types of exceptions (such as, divide error or overflow) may be loaded in conforming code segments. Utilities that need to be protected from less privileged programs and procedures should be placed in nonconforming code segments.

Table 3-1. Code- and Data-Segment Types

Type Field

Decimal  11  10(E)  9(W)  8(A)  Descriptor Type  Description
0        0   0      0     0     Data             Read-Only
1        0   0      0     1     Data             Read-Only, accessed
2        0   0      1     0     Data             Read/Write
3        0   0      1     1     Data             Read/Write, accessed
4        0   1      0     0     Data             Read-Only, expand-down
5        0   1      0     1     Data             Read-Only, expand-down, accessed
6        0   1      1     0     Data             Read/Write, expand-down
7        0   1      1     1     Data             Read/Write, expand-down, accessed

Decimal  11  10(C)  9(R)  8(A)  Descriptor Type  Description
8        1   0      0     0     Code             Execute-Only
9        1   0      0     1     Code             Execute-Only, accessed
10       1   0      1     0     Code             Execute/Read
11       1   0      1     1     Code             Execute/Read, accessed
12       1   1      0     0     Code             Execute-Only, conforming
13       1   1      0     1     Code             Execute-Only, conforming, accessed
14       1   1      1     0     Code             Execute/Read-Only, conforming
15       1   1      1     1     Code             Execute/Read-Only, conforming, accessed
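The type-field encoding for code and data segments (Table 3-1) can be decoded with a handful of bit tests. The sketch below is illustrative only; the `seg_type` structure and `seg_type_decode` function are hypothetical names, and the 4-bit argument is assumed to hold bits 11 through 8 of the second doubleword of a descriptor whose S flag is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool is_code;     /* bit 11: 0 = data segment, 1 = code segment */
    bool accessed;    /* bit 8 (A)                                  */
    bool writable;    /* data segments only: bit 9 (W)              */
    bool expand_down; /* data segments only: bit 10 (E)             */
    bool readable;    /* code segments only: bit 9 (R)              */
    bool conforming;  /* code segments only: bit 10 (C)             */
} seg_type;

seg_type seg_type_decode(uint8_t type)
{
    assert(type <= 0xF);
    bool bit10 = (type >> 2) & 1u;
    bool bit9  = (type >> 1) & 1u;

    seg_type t = {0};
    t.is_code  = (type >> 3) & 1u;
    t.accessed = type & 1u;
    if (t.is_code) {
        t.readable   = bit9;   /* bits 10 and 9 mean C and R */
        t.conforming = bit10;
    } else {
        t.writable    = bit9;  /* bits 10 and 9 mean E and W */
        t.expand_down = bit10;
    }
    return t;
}
```

For instance, type 6 decodes as a read/write, expand-down data segment, and type 11 as an execute/read code segment that has been accessed.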

NOTE

Execution cannot be transferred by a call or a jump to a less-privileged (numerically higher privilege level) code segment, regardless of whether the target segment is a conforming or nonconforming code segment. Attempting such an execution transfer will result in a general-protection exception.

All data segments are nonconforming, meaning that they cannot be accessed by less privileged programs or procedures (code executing at numerically higher privilege levels). Unlike code segments, however, data segments can be accessed by more privileged programs or procedures (code executing at numerically lower privilege levels) without using a special access gate.

The processor may update the Type field when a segment is accessed, even if the access is a read cycle. If the descriptor tables have been put in ROM, it may be necessary for hardware to prevent the ROM from being enabled onto the data bus during a write cycle. It also may be necessary to return the READY# signal to the processor when a write cycle to ROM occurs, otherwise the cycle will not terminate. These features of the hardware design are necessary for using ROM-based descriptor tables with the Intel386™ DX processor, which always sets the Accessed bit when a segment descriptor is loaded. The P6 family, Pentium®, and Intel486™ processors, however, only set the accessed bit if it is not already set. Writes to descriptor tables in ROM can be avoided by setting the accessed bits in every descriptor.
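One way to apply the last point is to set the accessed bits in a descriptor-table image before it is programmed into ROM. The following is a sketch of such a build-time step, with assumptions stated: the function name `preset_accessed_bits` is hypothetical, each descriptor is modeled as a pair of doublewords (first doubleword at index 0), and only code and data descriptors (S flag set) are touched, because bit 8 belongs to the type encoding of system descriptors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DESC_S_FLAG   (1u << 12) /* second doubleword: 1 = code or data */
#define DESC_ACCESSED (1u << 8)  /* second doubleword: A bit of type    */

/* Set the accessed bit in every code- or data-segment descriptor of a
   table image, so that loading those descriptors later needs no write. */
void preset_accessed_bits(uint32_t (*table)[2], size_t count)
{
    assert(table != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (table[i][1] & DESC_S_FLAG)
            table[i][1] |= DESC_ACCESSED;
    }
}
```

Applied to an image holding a null descriptor, a flat code segment (second doubleword 00CF9A00H), and an LDT descriptor, this would change only the code segment, to 00CF9B00H.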

3.5. SYSTEM DESCRIPTOR TYPES

When the S (descriptor type) flag in a segment descriptor is clear, the descriptor type is a system descriptor. The processor recognizes the following types of system descriptors:

• Local descriptor-table (LDT) segment descriptor.

• Task-state segment (TSS) descriptor.

• Call-gate descriptor.

• Interrupt-gate descriptor.

• Trap-gate descriptor.

• Task-gate descriptor.

These descriptor types fall into two categories: system-segment descriptors and gate descriptors. System-segment descriptors point to system segments (LDT and TSS segments). Gate descriptors are in themselves “gates,” which hold pointers to procedure entry points in code segments (call, interrupt, and trap gates) or which hold segment selectors for TSS’s (task gates). Table 3-2 shows the encoding of the type field for system-segment descriptors and gate descriptors.

For more information on the system-segment descriptors, refer to Section 3.5.1., “Segment Descriptor Tables”, and Section 6.2.2., “TSS Descriptor” in Chapter 6, Task Management. For more information on the gate descriptors, refer to Section 4.8.2., “Gate Descriptors” in Chapter 4, Protection; Section 5.9., “IDT Descriptors” in Chapter 5, Interrupt and Exception Handling; and Section 6.2.4., “Task-Gate Descriptor” in Chapter 6, Task Management.

3.5.1. Segment Descriptor Tables

A segment descriptor table is an array of segment descriptors (refer to Figure 3-10). A descriptor table is variable in length and can contain up to 8192 (2^13) 8-byte descriptors. There are two kinds of descriptor tables:

• The global descriptor table (GDT)

• The local descriptor tables (LDT)

Table 3-2. System-Segment and Gate-Descriptor Types

Type Field

Decimal 11 10 9 8 Description

0 0 0 0 0 Reserved

1 0 0 0 1 16-Bit TSS (Available)

2 0 0 1 0 LDT

3 0 0 1 1 16-Bit TSS (Busy)

4 0 1 0 0 16-Bit Call Gate

5 0 1 0 1 Task Gate

6 0 1 1 0 16-Bit Interrupt Gate

7 0 1 1 1 16-Bit Trap Gate

8 1 0 0 0 Reserved

9 1 0 0 1 32-Bit TSS (Available)

10 1 0 1 0 Reserved

11 1 0 1 1 32-Bit TSS (Busy)

12 1 1 0 0 32-Bit Call Gate

13 1 1 0 1 Reserved

14 1 1 1 0 32-Bit Interrupt Gate

15 1 1 1 1 32-Bit Trap Gate


Each system must have one GDT defined, which may be used for all programs and tasks in the system. Optionally, one or more LDTs can be defined. For example, an LDT can be defined for each separate task being run, or some or all tasks can share the same LDT.

The GDT is not a segment itself; instead, it is a data structure in the linear address space. The base linear address and limit of the GDT must be loaded into the GDTR register (refer to Section 2.4., “Memory-Management Registers” in Chapter 2, System Architecture Overview). The base addresses of the GDT should be aligned on an eight-byte boundary to yield the best processor performance. The limit value for the GDT is expressed in bytes. As with segments, the limit value is added to the base address to get the address of the last valid byte. A limit value of 0 results in exactly one valid byte. Because segment descriptors are always 8 bytes long, the GDT limit should always be one less than an integral multiple of eight (that is, 8N - 1).
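The 8N - 1 rule for the table limit is simple arithmetic, sketched below. The function names `table_limit` and `table_entries` are hypothetical and used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Limit value for a descriptor table holding n 8-byte descriptors.
   The limit is the offset of the last valid byte, hence 8N - 1. */
uint16_t table_limit(uint32_t n)
{
    assert(n >= 1 && n <= 8192);
    return (uint16_t)(n * 8u - 1u);
}

/* Number of complete descriptors covered by a given limit value. */
uint32_t table_entries(uint16_t limit)
{
    return ((uint32_t)limit + 1u) / 8u;
}
```

A GDT with three descriptors (for example the unused first entry, one code segment, and one data segment) would therefore use a limit of 23, and the largest table of 8192 descriptors a limit of FFFFH.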

The first descriptor in the GDT is not used by the processor. A segment selector to this “null descriptor” does not generate an exception when loaded into a data-segment register (DS, ES, FS, or GS), but it always generates a general-protection exception (#GP) when an attempt is

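Figure 3-10. Global and Local Descriptor Tables

[Figure: a segment selector with TI = 0 selects a descriptor in the global descriptor table (GDT), whose base address and limit are held in the GDTR register; a segment selector with TI = 1 selects a descriptor in the local descriptor table (LDT), whose segment selector, base address, and limit are held in the LDTR register. Descriptors lie at byte offsets 0, 8, 16, 24, 32, 40, 48, 56, and so on within each table. The first descriptor in the GDT is not used.]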


made to access memory using the descriptor. By initializing the segment registers with this segment selector, accidental reference to unused segment registers can be guaranteed to generate an exception.

The LDT is located in a system segment of the LDT type. The GDT must contain a segment descriptor for the LDT segment. If the system supports multiple LDTs, each must have a separate segment selector and segment descriptor in the GDT. The segment descriptor for an LDT can be located anywhere in the GDT. Refer to Section 3.5., “System Descriptor Types” for information on the LDT segment-descriptor type.

An LDT is accessed with its segment selector. To eliminate address translations when accessing the LDT, the segment selector, base linear address, limit, and access rights of the LDT are stored in the LDTR register (refer to Section 2.4., “Memory-Management Registers” in Chapter 2, System Architecture Overview).

When the GDTR register is stored (using the SGDT instruction), a 48-bit “pseudo-descriptor” is stored in memory (refer to Figure 3-11). To avoid alignment check faults in user mode (privilege level 3), the pseudo-descriptor should be located at an odd word address (that is, address MOD 4 is equal to 2). This causes the processor to store an aligned word, followed by an aligned doubleword. User-mode programs normally do not store pseudo-descriptors, but the possibility of generating an alignment check fault can be avoided by aligning pseudo-descriptors in this way. The same alignment should be used when storing the IDTR register using the SIDT instruction. When storing the LDTR or task register (using the SLDT or STR instruction, respectively), the pseudo-descriptor should be located at a doubleword address (that is, address MOD 4 is equal to 0).
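The alignment rule for pseudo-descriptors can be checked with simple address arithmetic. The following C fragment is a minimal sketch, assuming the 6-byte layout of Figure 3-11 (a 16-bit limit followed by a 32-bit base address, stored least-significant byte first). The function names are hypothetical, and the image is handled as a raw byte buffer to avoid compiler-specific structure packing.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if addr is an "odd word address" (addr MOD 4 == 2), so that the
   16-bit limit is word aligned and the 32-bit base address that follows
   it is doubleword aligned. */
int pseudo_desc_addr_ok(uintptr_t addr) {
    return (addr % 4u) == 2u;
}

/* Limit field: bits 0..15 of the 48-bit image (bytes 0 and 1). */
uint16_t pseudo_desc_limit(const uint8_t *p) {
    assert(p != 0);
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Base address field: bits 16..47 of the 48-bit image (bytes 2 to 5). */
uint32_t pseudo_desc_base(const uint8_t *p) {
    assert(p != 0);
    return (uint32_t)p[2] | ((uint32_t)p[3] << 8) |
           ((uint32_t)p[4] << 16) | ((uint32_t)p[5] << 24);
}
```

With the image placed at an address such as 1002H, the limit occupies 1002H through 1003H and the base address begins at 1004H, which is doubleword aligned.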

3.6. PAGING (VIRTUAL MEMORY)

When operating in protected mode, the Intel Architecture permits the linear address space to be mapped directly into a large physical memory (for example, 4 GBytes of RAM) or indirectly (using paging) into a smaller physical memory and disk storage. This latter method of mapping the linear address space is commonly referred to as virtual memory or demand-paged virtual memory.

When paging is used, the processor divides the linear address space into fixed-size pages (generally 4 KBytes in length) that can be mapped into physical memory and/or disk storage. When a program (or task) references a logical address in memory, the processor translates the address into a linear address and then uses its paging mechanism to translate the linear address into a corresponding physical address. If the page containing the linear address is not currently in physical memory, the processor generates a page-fault exception (#PF). The exception handler for the page-fault exception typically directs the operating system or executive to load the page from disk storage into physical memory (perhaps writing a different page from physical memory

Figure 3-11. Pseudo-Descriptor Format

[Figure: a 48-bit value with the Limit in bits 0 through 15 and the Base Address in bits 16 through 47.]


out to disk in the process). When the page has been loaded in physical memory, a return from the exception handler causes the instruction that generated the exception to be restarted. The information that the processor uses to map linear addresses into the physical address space and to generate page-fault exceptions (when necessary) is contained in page directories and page tables stored in memory.

Paging is different from segmentation through its use of fixed-size pages. Unlike segments, which usually are the same size as the code or data structures they hold, pages have a fixed size. If segmentation is the only form of address translation used, a data structure present in physical memory will have all of its parts in memory. If paging is used, a data structure can be partly in memory and partly in disk storage.

To minimize the number of bus cycles required for address translation, the most recently accessed page-directory and page-table entries are cached in the processor in devices called translation lookaside buffers (TLBs). The TLBs satisfy most requests for reading the current page directory and page tables without requiring a bus cycle. Extra bus cycles occur only when the TLBs do not contain a page-table entry, which typically happens when a page has not been accessed for a long time. Refer to Section 3.7., “Translation Lookaside Buffers (TLBs)” for more information on the TLBs.

3.6.1. Paging Options

Paging is controlled by three flags in the processor’s control registers:

• PG (paging) flag, bit 31 of CR0 (available in all Intel Architecture processors beginning with the Intel386™ processor).

• PSE (page size extensions) flag, bit 4 of CR4 (introduced in the Pentium® and Pentium® Pro processors).

• PAE (physical address extension) flag, bit 5 of CR4 (introduced in the Pentium® Pro processors).

The PG flag enables the page-translation mechanism. The operating system or executive usually sets this flag during processor initialization. The PG flag must be set if the processor’s page-translation mechanism is to be used to implement a demand-paged virtual memory system or if the operating system is designed to run more than one program (or task) in virtual-8086 mode.

The PSE flag enables large page sizes: 4-MByte pages or 2-MByte pages (when the PAE flag is set). When the PSE flag is clear, the more common page length of 4 KBytes is used. Refer to Section 3.6.2.2., “Linear Address Translation (4-MByte Pages)” and Section 3.8.2., “Linear Address Translation With Extended Addressing Enabled (2-MByte or 4-MByte Pages)” for more information about the use of the PSE flag.

The PAE flag enables 36-bit physical addresses. This physical address extension can only be used when paging is enabled. It relies on page directories and page tables to reference physical addresses above FFFFFFFFH. Refer to Section 3.8., “Physical Address Extension” for more information about the physical address extension.


3.6.2. Page Tables and Directories

The information that the processor uses to translate linear addresses into physical addresses (when paging is enabled) is contained in four data structures:

• Page directory—An array of 32-bit page-directory entries (PDEs) contained in a 4-KByte page. Up to 1024 page-directory entries can be held in a page directory.

• Page table—An array of 32-bit page-table entries (PTEs) contained in a 4-KByte page. Up to 1024 page-table entries can be held in a page table. (Page tables are not used for 2-MByte or 4-MByte pages. These page sizes are mapped directly from one or more page-directory entries.)

• Page—A 4-KByte, 2-MByte, or 4-MByte flat address space.

• Page-Directory-Pointer Table—An array of four 64-bit entries, each of which points to a page directory. This data structure is only used when the physical address extension is enabled (refer to Section 3.8., “Physical Address Extension”).

These tables provide access to either 4-KByte or 4-MByte pages when normal 32-bit physical addressing is being used and to either 4-KByte, 2-MByte, or 4-MByte pages when extended (36-bit) physical addressing is being used. Table 3-3 shows the page size and physical address size obtained from various settings of the paging control flags. Each page-directory entry contains a PS (page size) flag that specifies whether the entry points to a page table whose entries in turn point to 4-KByte pages (PS set to 0) or whether the page-directory entry points directly to a 4-MByte or 2-MByte page (PSE or PAE set to 1 and PS set to 1).

3.6.2.1. LINEAR ADDRESS TRANSLATION (4-KBYTE PAGES)

Figure 3-12 shows the page directory and page-table hierarchy when mapping linear addresses to 4-KByte pages. The entries in the page directory point to page tables, and the entries in a page table point to pages in physical memory. This paging method can be used to address up to 2^20 pages, which spans a linear address space of 2^32 bytes (4 GBytes).

Table 3-3. Page Sizes and Physical Address Sizes

PG Flag,   PAE Flag,   PSE Flag,   PS Flag,                Physical
CR0        CR4         CR4         PDE        Page Size    Address Size
0          X           X           X          —            Paging Disabled
1          0           0           X          4 KBytes     32 Bits
1          0           1           0          4 KBytes     32 Bits
1          0           1           1          4 MBytes     32 Bits
1          1           X           0          4 KBytes     36 Bits
1          1           X           1          2 MBytes     36 Bits
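The rows of Table 3-3 reduce to a short decision procedure. The following C fragment is a sketch only: the function page_size and the self-check check_table_3_3 are hypothetical names, the flag positions are those given in Section 3.6.1. and in the PS flag description, and returning 0 for "paging disabled" is a convention chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

#define CR0_PG  (1u << 31)  /* paging flag, bit 31 of CR0 */
#define CR4_PSE (1u << 4)   /* page size extensions flag, bit 4 of CR4 */
#define CR4_PAE (1u << 5)   /* physical address extension flag, bit 5 of CR4 */
#define PDE_PS  (1u << 7)   /* page size flag, bit 7 of a page-directory entry */

/* Page size in bytes for the given control flags and the low doubleword
   of a page-directory entry, following Table 3-3. Returns 0 when paging
   is disabled (a convention of this sketch). */
uint32_t page_size(uint32_t cr0, uint32_t cr4, uint32_t pde_low) {
    if (!(cr0 & CR0_PG))
        return 0u;
    if (cr4 & CR4_PAE)                       /* PSE is a don't-care here */
        return (pde_low & PDE_PS) ? (2u << 20) : (4u << 10);
    if ((cr4 & CR4_PSE) && (pde_low & PDE_PS))
        return 4u << 20;
    return 4u << 10;                         /* PS is ignored if PSE is clear */
}

/* Walk the six rows of Table 3-3 (X entries are tried as 1). */
void check_table_3_3(void) {
    assert(page_size(0u, CR4_PAE | CR4_PSE, PDE_PS) == 0u);
    assert(page_size(CR0_PG, 0u, PDE_PS) == 4096u);
    assert(page_size(CR0_PG, CR4_PSE, 0u) == 4096u);
    assert(page_size(CR0_PG, CR4_PSE, PDE_PS) == 4194304u);
    assert(page_size(CR0_PG, CR4_PAE | CR4_PSE, 0u) == 4096u);
    assert(page_size(CR0_PG, CR4_PAE | CR4_PSE, PDE_PS) == 2097152u);
}
```

The physical address size follows from the PAE flag alone: 36 bits when it is set and 32 bits when it is clear.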


To select the various table entries, the linear address is divided into three sections:

• Page-directory entry—Bits 22 through 31 provide an offset to an entry in the page directory. The selected entry provides the base physical address of a page table.

• Page-table entry—Bits 12 through 21 of the linear address provide an offset to an entry in the selected page table. This entry provides the base physical address of a page in physical memory.

• Page offset—Bits 0 through 11 provide an offset to a physical address in the page.
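The three-way division of the linear address described above amounts to two shifts and two masks. The following C fragment is a minimal sketch; the type linear_4k and the functions linear_4k_split and linear_4k_join are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* The three sections of a linear address when 4-KByte pages are used. */
typedef struct {
    uint32_t dir;     /* bits 31..22: index of the page-directory entry */
    uint32_t table;   /* bits 21..12: index of the page-table entry */
    uint32_t offset;  /* bits 11..0: byte offset within the 4-KByte page */
} linear_4k;

/* Split a 32-bit linear address into its 10-, 10-, and 12-bit fields. */
linear_4k linear_4k_split(uint32_t linear) {
    linear_4k f;
    f.dir    = linear >> 22;
    f.table  = (linear >> 12) & 0x3FFu;
    f.offset = linear & 0xFFFu;
    return f;
}

/* Inverse of linear_4k_split; each field must fit its bit width. */
uint32_t linear_4k_join(linear_4k f) {
    assert(f.dir < 1024u && f.table < 1024u && f.offset < 4096u);
    return (f.dir << 22) | (f.table << 12) | f.offset;
}
```

For example, linear address 00403025H selects page-directory entry 1, page-table entry 3, and offset 025H within the page.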

Memory management software has the option of using one page directory for all programs and tasks, one page directory for each task, or some combination of the two.

3.6.2.2. LINEAR ADDRESS TRANSLATION (4-MBYTE PAGES)

Figure 3-13 shows how a page directory can be used to map linear addresses to 4-MByte pages. The entries in the page directory point to 4-MByte pages in physical memory. This paging method can be used to map up to 1024 pages into a 4-GByte linear address space.

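Figure 3-12. Linear Address Translation (4-KByte Pages)

[Figure: the linear address is divided into Directory (bits 31 through 22, 10 bits), Table (bits 21 through 12, 10 bits), and Offset (bits 11 through 0, 12 bits) fields. CR3 (PDBR) locates the page directory; the directory entry locates the page table; the page-table entry locates the 4-KByte page; the offset selects the physical address within the page. 1024 PDE ∗ 1024 PTE = 2^20 pages. *32 bits aligned onto a 4-KByte boundary.]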


The 4-MByte page size is selected by setting the PSE flag in control register CR4 and setting the page size (PS) flag in a page-directory entry (refer to Figure 3-14). With these flags set, the linear address is divided into two sections:

• Page directory entry—Bits 22 through 31 provide an offset to an entry in the page directory. The selected entry provides the base physical address of a 4-MByte page.

• Page offset—Bits 0 through 21 provide an offset to a physical address in the page.
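With only two sections in the linear address, forming the physical address of a byte in a 4-MByte page is a single mask-and-combine step. The following C fragment is a sketch under stated assumptions: the names dir_index_4m and translate_4m are hypothetical, the entry is assumed to be present with its PS flag set, and only bits 22 through 31 of the entry are taken as the page base address.

```c
#include <assert.h>
#include <stdint.h>

#define PDE_P         (1u << 0)    /* present flag, bit 0 */
#define PDE_PS        (1u << 7)    /* page size flag, bit 7 */
#define PDE_4M_BASE   0xFFC00000u  /* bits 31..22: base of the 4-MByte page */
#define LINEAR_4M_OFF 0x003FFFFFu  /* bits 21..0: offset within the page */

/* Index of the page-directory entry selected by a linear address. */
uint32_t dir_index_4m(uint32_t linear) {
    return linear >> 22;
}

/* Physical address for a linear address mapped by a 4-MByte
   page-directory entry. */
uint32_t translate_4m(uint32_t pde, uint32_t linear) {
    assert((pde & PDE_P) && (pde & PDE_PS));
    return (pde & PDE_4M_BASE) | (linear & LINEAR_4M_OFF);
}
```

For example, if page-directory entry 1 holds 00C00083H, linear address 00412345H translates to physical address 00C12345H.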

NOTE

(For the Pentium® processor only.) When enabling or disabling large page sizes, the TLBs must be invalidated (flushed) after the PSE flag in control register CR4 has been set or cleared. Otherwise, incorrect page translation might occur due to the processor using outdated page translation information stored in the TLBs. Refer to Section 9.10., “Invalidating the Translation Lookaside Buffers (TLBs)”, in Chapter 9, Memory Cache Control, for information on how to invalidate the TLBs.

3.6.2.3. MIXING 4-KBYTE AND 4-MBYTE PAGES

When the PSE flag in CR4 is set, both 4-MByte pages and page tables for 4-KByte pages can be accessed from the same page directory. If the PSE flag is clear, only page tables for 4-KByte pages can be accessed (regardless of the setting of the PS flag in a page-directory entry).

A typical example of mixing 4-KByte and 4-MByte pages is to place the operating system or executive’s kernel in a large page to reduce TLB misses and thus improve overall system performance. The processor maintains 4-MByte page entries and 4-KByte page entries in separate

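Figure 3-13. Linear Address Translation (4-MByte Pages)

[Figure: the linear address is divided into Directory (bits 31 through 22, 10 bits) and Offset (bits 21 through 0, 22 bits) fields. CR3 (PDBR) locates the page directory; the directory entry locates the 4-MByte page; the offset selects the physical address within the page. 1024 PDE = 1024 pages. *32 bits aligned onto a 4-KByte boundary.]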


TLBs. Placing often-used code such as the kernel in a large page therefore frees up 4-KByte-page TLB entries for application programs and tasks.

3.6.3. Base Address of the Page Directory

The physical address of the current page directory is stored in the CR3 register (also called the page directory base register or PDBR). (Refer to Figure 2-5 and Section 2.5., “Control Registers” in Chapter 2, System Architecture Overview for more information on the PDBR.) If paging is to be used, the PDBR must be loaded as part of the processor initialization process (prior to enabling paging). The PDBR can then be changed either explicitly by loading a new value in CR3 with a MOV instruction or implicitly as part of a task switch. (Refer to Section 6.2.1., “Task-State Segment (TSS)” in Chapter 6, Task Management for a description of how the contents of the CR3 register are set for a task.)

There is no present flag in the PDBR for the page directory. The page directory may be not-present (paged out of physical memory) while its associated task is suspended, but the operating system must ensure that the page directory indicated by the PDBR image in a task's TSS is present in physical memory before the task is dispatched. The page directory must also remain in memory as long as the task is active.

3.6.4. Page-Directory and Page-Table Entries

Figure 3-14 shows the format for the page-directory and page-table entries when 4-KByte pages and 32-bit physical addresses are being used. Figure 3-15 shows the format for the page-directory entries when 4-MByte pages and 32-bit physical addresses are being used. Refer to Section 3.8., “Physical Address Extension” for the format of page-directory and page-table entries when the physical address extension is being used.


Figure 3-14. Format of Page-Directory and Page-Table Entries for 4-KByte Pages and 32-Bit Physical Addresses

Page-Directory Entry (4-KByte Page Table)
  Bits 31-12   Page-Table Base Address
  Bits 11-9    Avail. (available for system programmer’s use)
  Bit 8        G, global page (ignored)
  Bit 7        PS, page size (0 indicates 4 KBytes)
  Bit 6        0, reserved (set to 0)
  Bit 5        A, accessed
  Bit 4        PCD, cache disabled
  Bit 3        PWT, write-through
  Bit 2        U/S, user/supervisor
  Bit 1        R/W, read/write
  Bit 0        P, present

Page-Table Entry (4-KByte Page)
  Bits 31-12   Page Base Address
  Bits 11-9    Avail. (available for system programmer’s use)
  Bit 8        G, global page
  Bit 7        0, reserved (set to 0)
  Bit 6        D, dirty
  Bit 5        A, accessed
  Bit 4        PCD, cache disabled
  Bit 3        PWT, write-through
  Bit 2        U/S, user/supervisor
  Bit 1        R/W, read/write
  Bit 0        P, present


The functions of the flags and fields in the entries in Figures 3-14 and 3-15 are as follows:

Page base address, bits 12 through 31
(Page-table entries for 4-KByte pages.) Specifies the physical address of the first byte of a 4-KByte page. The bits in this field are interpreted as the 20 most-significant bits of the physical address, which forces pages to be aligned on 4-KByte boundaries.

(Page-directory entries for 4-KByte page tables.) Specifies the physical address of the first byte of a page table. The bits in this field are interpreted as the 20 most-significant bits of the physical address, which forces page tables to be aligned on 4-KByte boundaries.

(Page-directory entries for 4-MByte pages.) Specifies the physical address of the first byte of a 4-MByte page. Only bits 22 through 31 of this field are used (and bits 12 through 21 are reserved and must be set to 0, for Intel Architecture processors through the Pentium® II processor). The base address bits are interpreted as the 10 most-significant bits of the physical address, which forces 4-MByte pages to be aligned on 4-MByte boundaries.

Present (P) flag, bit 0
Indicates whether the page or page table being pointed to by the entry is currently loaded in physical memory. When the flag is set, the page is in physical memory and address translation is carried out. When the flag is clear, the page is not in memory and, if the processor attempts to access the page, it generates a page-fault exception (#PF).

The processor does not set or clear this flag; it is up to the operating system or executive to maintain the state of the flag.

Figure 3-15. Format of Page-Directory Entries for 4-MByte Pages and 32-Bit Addresses

Page-Directory Entry (4-MByte Page)
  Bits 31-22   Page Base Address
  Bits 21-12   Reserved
  Bits 11-9    Avail. (available for system programmer’s use)
  Bit 8        G, global page
  Bit 7        PS, page size (1 indicates 4 MBytes)
  Bit 6        D, dirty
  Bit 5        A, accessed
  Bit 4        PCD, cache disabled
  Bit 3        PWT, write-through
  Bit 2        U/S, user/supervisor
  Bit 1        R/W, read/write
  Bit 0        P, present


The bit must be set to 1 whenever extended physical addressing mode is enabled.

If the processor generates a page-fault exception, the operating system must carry out the following operations in the order below:

1. Copy the page from disk storage into physical memory, if needed.

2. Load the page address into the page-table or page-directory entry and set its present flag. Other bits, such as the dirty and accessed flags, may also be set at this time.

3. Invalidate the current page-table entry in the TLB (refer to Section 3.7., “Translation Lookaside Buffers (TLBs)” for a discussion of TLBs and how to invalidate them).

4. Return from the page-fault handler to restart the interrupted program or task.

Read/write (R/W) flag, bit 1
Specifies the read-write privileges for a page or group of pages (in the case of a page-directory entry that points to a page table). When this flag is clear, the page is read only; when the flag is set, the page can be read and written into. This flag interacts with the U/S flag and the WP flag in register CR0. Refer to Section 4.11., “Page-Level Protection” and Table 4-2 in Chapter 4, Protection for a detailed discussion of the use of these flags.

User/supervisor (U/S) flag, bit 2
Specifies the user-supervisor privileges for a page or group of pages (in the case of a page-directory entry that points to a page table). When this flag is clear, the page is assigned the supervisor privilege level; when the flag is set, the page is assigned the user privilege level. This flag interacts with the R/W flag and the WP flag in register CR0. Refer to Section 4.11., “Page-Level Protection” and Table 4-2 in Chapter 4, Protection for a detailed discussion of the use of these flags.

Page-level write-through (PWT) flag, bit 3
Controls the write-through or write-back caching policy of individual pages or page tables. When the PWT flag is set, write-through caching is enabled for the associated page or page table; when the flag is clear, write-back caching is enabled for the associated page or page table. The processor ignores this flag if the CD (cache disable) flag in CR0 is set. Refer to Section 9.5., “Cache Control”, in Chapter 9, Memory Cache Control, for more information about the use of this flag. Refer to Section 2.5., “Control Registers” in Chapter 2, System Architecture Overview for a description of a companion PWT flag in control register CR3.

Page-level cache disable (PCD) flag, bit 4
Controls the caching of individual pages or page tables. When the PCD flag is set, caching of the associated page or page table is prevented; when the flag is clear, the page or page table can be cached. This flag permits caching to be


disabled for pages that contain memory-mapped I/O ports or that do not provide a performance benefit when cached. The processor ignores this flag (assumes it is set) if the CD (cache disable) flag in CR0 is set. Refer to Chapter 9, Memory Cache Control, for more information about the use of this flag. Refer to Section 2.5. in Chapter 2, System Architecture Overview for a description of a companion PCD flag in control register CR3.

Accessed (A) flag, bit 5
Indicates whether a page or page table has been accessed (read from or written to) when set. Memory management software typically clears this flag when a page or page table is initially loaded into physical memory. The processor then sets this flag the first time a page or page table is accessed. This flag is a “sticky” flag, meaning that once set, the processor does not implicitly clear it. Only software can clear this flag. The accessed and dirty flags are provided for use by memory management software to manage the transfer of pages and page tables into and out of physical memory.

Dirty (D) flag, bit 6
Indicates whether a page has been written to when set. (This flag is not used in page-directory entries that point to page tables.) Memory management software typically clears this flag when a page is initially loaded into physical memory. The processor then sets this flag the first time a page is accessed for a write operation. This flag is “sticky,” meaning that once set, the processor does not implicitly clear it. Only software can clear this flag. The dirty and accessed flags are provided for use by memory management software to manage the transfer of pages and page tables into and out of physical memory.

Page size (PS) flag, bit 7
Determines the page size. This flag is only used in page-directory entries. When this flag is clear, the page size is 4 KBytes and the page-directory entry points to a page table. When the flag is set, the page size is 4 MBytes for normal 32-bit addressing (and 2 MBytes if extended physical addressing is enabled) and the page-directory entry points to a page. If the page-directory entry points to a page table, all the pages associated with that page table will be 4-KByte pages.

Global (G) flag, bit 8
(Introduced in the Pentium® Pro processor.) Indicates a global page when set. When a page is marked global and the page global enable (PGE) flag in register CR4 is set, the page-table or page-directory entry for the page is not invalidated in the TLB when register CR3 is loaded or a task switch occurs. This flag is provided to prevent frequently used pages (such as pages that contain kernel or other operating system or executive code) from being flushed from the TLB. Only software can set or clear this flag. For page-directory entries that point to page tables, this flag is ignored and the global characteristics of a page are set in the page-table entries. Refer to Section 3.7., “Translation Lookaside Buffers (TLBs)” for more information about the use of this flag. (This bit is reserved in Pentium® and earlier Intel Architecture processors.)


Reserved and available-to-software bits
In a page-table entry, bit 7 is reserved and should be set to 0; in a page-directory entry that points to a page table, bit 6 is reserved and should be set to 0. For a page-directory entry for a 4-MByte page, bits 12 through 21 are reserved and must be set to 0, for Intel Architecture processors through the Pentium® II processor. For both types of entries, bits 9, 10, and 11 are available for use by software. (When the present bit is clear, bits 1 through 31 are available to software; refer to Figure 3-16.) When the PSE and PAE flags in control register CR4 are set, the processor generates a page fault if reserved bits are not set to 0.
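The flag and field positions described in this section map naturally onto bit masks. The following C fragment is a minimal sketch for 32-bit entries that reference 4-KByte pages or page tables; the macro names and the helpers make_pte and pte_page_base are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_P     (1u << 0)    /* present */
#define PTE_RW    (1u << 1)    /* read/write */
#define PTE_US    (1u << 2)    /* user/supervisor */
#define PTE_PWT   (1u << 3)    /* page-level write-through */
#define PTE_PCD   (1u << 4)    /* page-level cache disable */
#define PTE_A     (1u << 5)    /* accessed */
#define PTE_D     (1u << 6)    /* dirty (page-table entries) */
#define PDE_PS    (1u << 7)    /* page size (page-directory entries only) */
#define PTE_G     (1u << 8)    /* global */
#define PTE_AVAIL (7u << 9)    /* bits 9..11, available to software */
#define PTE_BASE  0xFFFFF000u  /* bits 12..31, page base address */

/* Build a page-table entry from a 4-KByte-aligned physical address and a
   set of flag bits taken from the low 12 bits. */
uint32_t make_pte(uint32_t page_phys, uint32_t flags) {
    assert((page_phys & ~PTE_BASE) == 0u);  /* must be 4-KByte aligned */
    assert((flags & PTE_BASE) == 0u);       /* flags stay in bits 0..11 */
    return page_phys | flags;
}

/* Physical address of the first byte of the page or page table. */
uint32_t pte_page_base(uint32_t pte) {
    return pte & PTE_BASE;
}
```

A present, writable, user-level page at physical address 00123000H would thus be described by the entry 00123007H. Software would normally leave the accessed and dirty bits clear when creating such an entry and let the processor set them.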

3.6.5. Not Present Page-Directory and Page-Table Entries

When the present flag is clear for a page-table or page-directory entry, the operating system or executive may use the rest of the entry for storage of information such as the location of the page in the disk storage system (refer to Figure 3-16).
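One way an operating system might use a not-present entry is to keep a 31-bit value of its own choosing in bits 1 through 31 while leaving bit 0 clear. The following C fragment is a sketch only; the function names are hypothetical, and what the 31-bit value means (for example, a disk block number) is entirely up to the operating system.

```c
#include <assert.h>
#include <stdint.h>

/* Build a not-present entry: bit 0 (present) is 0 and bits 1..31 carry a
   31-bit value defined by the operating system. */
uint32_t make_not_present(uint32_t info) {
    assert(info <= 0x7FFFFFFFu);  /* must fit in bits 1..31 */
    return info << 1;
}

/* Nonzero if the present flag (bit 0) of the entry is set. */
int entry_present(uint32_t entry) {
    return (int)(entry & 1u);
}

/* Recover the operating-system value from a not-present entry. */
uint32_t not_present_info(uint32_t entry) {
    assert(!entry_present(entry));
    return entry >> 1;
}
```

Because the processor checks only the present flag of such an entry before raising a page fault, the page-fault handler can read the stored value back from the entry to locate the page.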

3.7. TRANSLATION LOOKASIDE BUFFERS (TLBS)

The processor stores the most recently used page-directory and page-table entries in on-chip caches called translation lookaside buffers or TLBs. The P6 family and Pentium® processors have separate TLBs for the data and instruction caches. Also, the P6 family processors maintain separate TLBs for 4-KByte and 4-MByte page sizes. The CPUID instruction can be used to determine the sizes of the TLBs provided in the P6 family and Pentium® processors.

Most paging is performed using the contents of the TLBs. Bus cycles to the page directory and page tables in memory are performed only when the TLBs do not contain the translation information for a requested page.

The TLBs are inaccessible to application programs and tasks (privilege level greater than 0); that is, they cannot invalidate TLBs. Only operating system or executive procedures running at privilege level of 0 can invalidate TLBs or selected TLB entries. Whenever a page-directory or page-table entry is changed (including when the present flag is set to zero), the operating system must immediately invalidate the corresponding entry in the TLB so that it can be updated the next time the entry is referenced. However, if the physical address extension (PAE) feature is enabled to use 36-bit addressing, a new table is added to the paging hierarchy. This new table is called the page directory pointer table (as described in Section 3.8., “Physical Address Extension”). If an entry is changed in this table (to point to another page directory), the TLBs must then be flushed by writing to CR3.

Figure 3-16. Format of a Page-Table or Page-Directory Entry for a Not-Present Page

[Figure: bit 0 (the present flag) is 0; bits 1 through 31 are available to the operating system or executive.]


All (nonglobal) TLBs are automatically invalidated any time the CR3 register is loaded (unless the G flag for a page or page-table entry is set, as described later in this section). The CR3 register can be loaded in either of two ways:

• Explicitly, using the MOV instruction, for example:

MOV CR3, EAX

where the EAX register contains an appropriate page-directory base address.

• Implicitly by executing a task switch, which automatically changes the contents of the CR3 register.

The INVLPG instruction is provided to invalidate a specific page-table entry in the TLB. Normally, this instruction invalidates only an individual TLB entry; however, in some cases, it may invalidate more than the selected entry and may even invalidate all of the TLBs. This instruction ignores the setting of the G flag in a page-directory or page-table entry (refer to the following paragraph).

(Introduced in the Pentium® Pro processor.) The page global enable (PGE) flag in register CR4 and the global (G) flag of a page-directory or page-table entry (bit 8) can be used to prevent frequently used pages from being automatically invalidated in the TLBs on a task switch or a load of register CR3. (Refer to Section 3.6.4., “Page-Directory and Page-Table Entries” for information about the global flag.) When the processor loads a page-directory or page-table entry for a global page into a TLB, the entry will remain in the TLB indefinitely. The only way to deterministically invalidate global page entries is to clear the PGE flag and then invalidate the TLBs or to use the INVLPG instruction to invalidate individual page-directory or page-table entries in the TLBs.

For additional information about invalidation of the TLBs, refer to Section 9.10., “Invalidating the Translation Lookaside Buffers (TLBs)”, in Chapter 9, Memory Cache Control.

3.8. PHYSICAL ADDRESS EXTENSION

The physical address extension (PAE) flag in register CR4 enables an extension of physical addresses from 32 bits to 36 bits. (This feature was introduced into the Intel Architecture in the Pentium® Pro processors.) Here, the processor provides 4 additional address line pins to accommodate the additional address bits. This option can only be used when paging is enabled (that is, when both the PG flag in register CR0 and the PAE flag in register CR4 are set).

When the physical address extension is enabled, the processor allows several sizes of pages: 4-KByte, 2-MByte, or 4-MByte. As with 32-bit addressing, these page sizes can be addressed within the same set of paging tables (that is, a page-directory entry can point to either a 2-MByte or 4-MByte page or a page table that in turn points to 4-KByte pages). To support the 36-bit physical addresses, the following changes are made to the paging data structures:

• The paging table entries are increased to 64 bits to accommodate 36-bit base physical addresses. Each 4-KByte page directory and page table can thus have up to 512 entries.


• A new table, called the page-directory-pointer table, is added to the linear-address translation hierarchy. This table has 4 entries of 64-bits each, and it lies above the page directory in the hierarchy. With the physical address extension mechanism enabled, the processor supports up to 4 page directories.

• The 20-bit page-directory base address field in register CR3 (PDBR) is replaced with a 27-bit page-directory-pointer-table base address field (refer to Figure 3-17). (In this case, register CR3 is called the PDPTR.) This field provides the 27 most-significant bits of the physical address of the first byte of the page-directory-pointer table, which forces the table to be located on a 32-byte boundary.

• Linear address translation is changed to allow mapping 32-bit linear addresses into the larger physical address space.
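The 27-bit base address field and the resulting 32-byte alignment can be illustrated with a mask. The following C fragment is a sketch under stated assumptions: the names make_cr3_pae and cr3_pdpt_base are hypothetical, and the positions used for the PWT and PCD flags in CR3 (bits 3 and 4) are taken from the general CR3 layout described in Chapter 2 rather than from this section.

```c
#include <assert.h>
#include <stdint.h>

#define CR3_PWT       (1u << 3)    /* assumed position of PWT in CR3 */
#define CR3_PCD       (1u << 4)    /* assumed position of PCD in CR3 */
#define CR3_PDPT_BASE 0xFFFFFFE0u  /* bits 31..5: 27-bit table base field */

/* Compose a CR3 image from the physical address of a page-directory-
   pointer table and optional PWT/PCD flags. */
uint32_t make_cr3_pae(uint32_t pdpt_phys, uint32_t flags) {
    assert((pdpt_phys & ~CR3_PDPT_BASE) == 0u);        /* 32-byte aligned */
    assert((flags & ~(CR3_PWT | CR3_PCD)) == 0u);      /* only PWT and PCD */
    return pdpt_phys | flags;
}

/* Physical address of the first byte of the page-directory-pointer table. */
uint32_t cr3_pdpt_base(uint32_t cr3) {
    return cr3 & CR3_PDPT_BASE;
}
```

Because the base field is held in the 32-bit CR3 register, the page-directory-pointer table itself must reside in the first 4 GBytes of physical memory.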

3.8.1. Linear Address Translation With Extended Addressing Enabled (4-KByte Pages)

Figure 3-18 shows the page-directory-pointer, page-directory, and page-table hierarchy when mapping linear addresses to 4-KByte pages with extended physical addressing enabled. This paging method can be used to address up to 2^20 pages, which spans a linear address space of 2^32 bytes (4 GBytes).

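Figure 3-17. Register CR3 Format When the Physical Address Extension is Enabled

[Figure: 32-bit layout of register CR3 (bits 31 through 0), showing the Page-Directory-Pointer-Table Base Address field, the PCD and PWT flags, and the remaining low-order bits set to 0.]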

To select the various table entries, the linear address is divided into four sections:

• Page-directory-pointer-table entry—Bits 30 and 31 provide an offset to one of the 4 entries in the page-directory-pointer table. The selected entry provides the base physical address of a page directory.

• Page-directory entry—Bits 21 through 29 provide an offset to an entry in the selected page directory. The selected entry provides the base physical address of a page table.

• Page-table entry—Bits 12 through 20 provide an offset to an entry in the selected page table. This entry provides the base physical address of a page in physical memory.

• Page offset—Bits 0 through 11 provide an offset to a physical address in the page.
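The field boundaries listed above can be sketched as a small decoder. This is an illustration only; the structure and function names are hypothetical and do not come from the manual.

```c
#include <stdint.h>

/* Hypothetical container for the fields of a 32-bit linear address
 * when PAE is enabled and the address maps to a 4-KByte page. */
struct pae_4k_index {
    uint32_t pdpt_index; /* bits 31:30, selects 1 of 4 PDPT entries */
    uint32_t pd_index;   /* bits 29:21, selects 1 of 512 PDEs */
    uint32_t pt_index;   /* bits 20:12, selects 1 of 512 PTEs */
    uint32_t offset;     /* bits 11:0, byte offset within the page */
};

static inline struct pae_4k_index pae_split_4k(uint32_t linear)
{
    struct pae_4k_index ix;
    ix.pdpt_index = (linear >> 30) & 0x3u;
    ix.pd_index = (linear >> 21) & 0x1FFu;
    ix.pt_index = (linear >> 12) & 0x1FFu;
    ix.offset = linear & 0xFFFu;
    return ix;
}
```

Because every entry is 8 bytes wide, each index would be multiplied by 8 to form the byte offset of the entry within its table.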

Figure 3-18. Linear Address Translation With Extended Physical Addressing Enabled (4-KByte Pages)

[Figure: the linear address is split into Directory Pointer (bits 31-30, 2 bits), Directory (bits 29-21, 9 bits), Table (bits 20-12, 9 bits), and Offset (bits 11-0, 12 bits) fields. CR3 (PDBR) supplies 32 bits, aligned onto a 32-byte boundary, that locate the page-directory-pointer table. The directory-pointer entry selects a page directory, the directory entry selects a page table, and the page-table entry selects a 4-KByte page. 4 PDPTE * 512 PDE * 512 PTE = 2^20 pages.]

3.8.2. Linear Address Translation With Extended Addressing Enabled (2-MByte or 4-MByte Pages)

Figure 3-19 shows how a page-directory-pointer table and page directories can be used to map linear addresses to 2-MByte or 4-MByte pages. This paging method can be used to map up to 2048 pages (4 page-directory-pointer-table entries times 512 page-directory entries) into a 4-GByte linear address space.

The 2-MByte or 4-MByte page size is selected by setting the PSE flag in control register CR4 and setting the page size (PS) flag in a page-directory entry (refer to Figure 3-14). With these flags set, the linear address is divided into three sections:

• Page-directory-pointer-table entry—Bits 30 and 31 provide an offset to an entry in the page-directory-pointer table. The selected entry provides the base physical address of a page directory.

• Page-directory entry—Bits 21 through 29 provide an offset to an entry in the page directory. The selected entry provides the base physical address of a 2-MByte or 4-MByte page.

• Page offset—Bits 0 through 20 provide an offset to a physical address in the page.
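As a rough illustration of the three-way split above, the following sketch extracts each field for a large-page mapping. All names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helpers for a linear address that maps to a large page
 * when the physical address extension is enabled. */
static inline uint32_t pae_large_pdpt_index(uint32_t linear)
{
    return (linear >> 30) & 0x3u; /* bits 31:30 */
}

static inline uint32_t pae_large_pd_index(uint32_t linear)
{
    return (linear >> 21) & 0x1FFu; /* bits 29:21 */
}

static inline uint32_t pae_large_offset(uint32_t linear)
{
    return linear & 0x1FFFFFu; /* bits 20:0, a 21-bit offset */
}

/* 4 page-directory-pointer-table entries times 512 page-directory entries. */
enum { PAE_LARGE_PAGE_COUNT = 4 * 512 };
```

The 21-bit offset covers 2 MBytes, so 2048 such pages span the 4-GByte linear address space.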

3.8.3. Accessing the Full Extended Physical Address Space With the Extended Page-Table Structure

The page-table structure described in the previous two sections allows up to 4 GBytes of the 64-GByte extended physical address space to be addressed at one time. Additional 4-GByte sections of physical memory can be addressed in either of two ways:

• Change the pointer in register CR3 to point to another page-directory-pointer table, which in turn points to another set of page directories and page tables.

• Change entries in the page-directory-pointer table to point to other page directories, which in turn point to other sets of page tables.

3.8.4. Page-Directory and Page-Table Entries With Extended Addressing Enabled

Figure 3-20 shows the format for the page-directory-pointer-table, page-directory, and page-table entries when 4-KByte pages and 36-bit extended physical addresses are being used. Figure 3-21 shows the format for the page-directory-pointer-table and page-directory entries when 2-MByte or 4-MByte pages and 36-bit extended physical addresses are being used. The functions of the flags in these entries are the same as described in Section 3.6.4., “Page-Directory and Page-Table Entries”. The major differences in these entries are as follows:

• A page-directory-pointer-table entry is added.

• The size of the entries is increased from 32 bits to 64 bits.

• The maximum number of entries in a page directory or page table is 512.

• The base physical address field in each entry is extended to 24 bits.

Figure 3-19. Linear Address Translation With Extended Physical Addressing Enabled (2-MByte or 4-MByte Pages)

[Figure: the linear address is split into Directory Pointer (bits 31-30, 2 bits), Directory (bits 29-21, 9 bits), and Offset (bits 20-0, 21 bits) fields. CR3 (PDBR) supplies 32 bits, aligned onto a 32-byte boundary, that locate the page-directory-pointer table. The directory-pointer entry selects a page directory, and the directory entry selects a 2-MByte or 4-MByte page. 4 PDPTE * 512 PDE = 2048 pages.]

The base physical address in an entry specifies the following, depending on the type of entry:

• Page-directory-pointer-table entry—the physical address of the first byte of a 4-KByte page directory.

• Page-directory entry—the physical address of the first byte of a 4-KByte page table or a 2-MByte page.

• Page-table entry—the physical address of the first byte of a 4-KByte page.

For all table entries (except for page-directory entries that point to 2-MByte or 4-MByte pages), the bits in the page base address are interpreted as the 24 most-significant bits of a 36-bit physical address, which forces page tables and pages to be aligned on 4-KByte boundaries. When a page-directory entry points to a 2-MByte or 4-MByte page, the base address is interpreted as the 15 most-significant bits of a 36-bit physical address, which forces pages to be aligned on 2-MByte or 4-MByte boundaries.
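The two interpretations of the base address field can be sketched as masks over a 64-bit entry. The names below are hypothetical, and the sketch assumes the 24-bit field occupies bits 35:12 and the 15-bit field occupies bits 35:21 of the entry.

```c
#include <stdint.h>

/* Hypothetical masks over a 64-bit paging-structure entry. */
#define PAE_BASE_4K_MASK 0x0000000FFFFFF000ull /* bits 35:12, 24 bits */
#define PAE_BASE_2M_MASK 0x0000000FFFE00000ull /* bits 35:21, 15 bits */

/* Physical address reached through an entry that points to a 4-KByte page. */
static inline uint64_t pae_phys_4k(uint64_t entry, uint32_t linear)
{
    return (entry & PAE_BASE_4K_MASK) | (linear & 0xFFFu);
}

/* Physical address reached through a page-directory entry that points
 * directly to a large page (21-bit page offset). */
static inline uint64_t pae_phys_large(uint64_t pde, uint32_t linear)
{
    return (pde & PAE_BASE_2M_MASK) | (linear & 0x1FFFFFu);
}
```

Masking rather than shifting keeps the low flag bits of the entry out of the resulting address, which is what enforces the alignment described above.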

Figure 3-20. Format of Page-Directory-Pointer-Table, Page-Directory, and Page-Table Entries for 4-KByte Pages and 36-Bit Extended Physical Addresses

[Figure: bit layouts of the three 64-bit entries.

Page-Directory-Pointer-Table Entry: bits 63-36 Reserved (set to 0); bits 35-32 Base Addr.; bits 31-12 Page-Directory Base Address; bits 11-9 Avail.; bits 8-5 Reserved; bit 4 PCD; bit 3 PWT; bits 2-1 Res.; bit 0 set to 1.

Page-Directory Entry (4-KByte Page Table): bits 63-36 Reserved (set to 0); bits 35-32 Base Addr.; bits 31-12 Page-Table Base Address; bits 11-9 Avail.; bits 8-6 set to 0; bit 5 A; bit 4 PCD; bit 3 PWT; bit 2 U/S; bit 1 R/W; bit 0 P.

Page-Table Entry (4-KByte Page): bits 63-36 Reserved (set to 0); bits 35-32 Base Addr.; bits 31-12 Page Base Address; bits 11-9 Avail.; bit 8 G; bit 7 set to 0; bit 6 D; bit 5 A; bit 4 PCD; bit 3 PWT; bit 2 U/S; bit 1 R/W; bit 0 P.]

The present (P) flag (bit 0) in all page-directory-pointer-table entries must be set to 1 anytime extended physical addressing mode is enabled; that is, whenever the PAE flag (bit 5 in register CR4) and the PG flag (bit 31 in register CR0) are set. If the P flag is not set in all 4 page-directory-pointer-table entries in the page-directory-pointer table when extended physical addressing is enabled, a general-protection exception (#GP) is generated.

The page size (PS) flag (bit 7) in a page-directory entry determines if the entry points to a page table or a 2-MByte or 4-MByte page. When this flag is clear, the entry points to a page table; when the flag is set, the entry points to a 2-MByte or 4-MByte page. This flag allows 4-KByte, 2-MByte, or 4-MByte pages to be mixed within one set of paging tables.

Accessed (A) and dirty (D) flags (bits 5 and 6) are provided for table entries that point to pages.

Bits 9, 10, and 11 in all the table entries for the physical address extension are available for use by software. (When the present flag is clear, bits 1 through 63 are available to software.) All bits in Figure 3-14 that are marked reserved or 0 should be set to 0 by software and not accessed by software. When the PSE and/or PAE flags in control register CR4 are set, the processor generates a page fault (#PF) if reserved bits in page-directory and page-table entries are not set to 0, and it generates a general-protection exception (#GP) if reserved bits in a page-directory-pointer-table entry are not set to 0.
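The flag positions named in the preceding paragraphs can be collected into a few masks. This is a minimal sketch with hypothetical names, not a complete description of the entry format.

```c
#include <stdint.h>

/* Hypothetical flag masks for 64-bit entries with PAE enabled. */
#define PAE_P (1ull << 0)          /* present */
#define PAE_A (1ull << 5)          /* accessed */
#define PAE_D (1ull << 6)          /* dirty */
#define PAE_PS (1ull << 7)         /* page size, page-directory entries only */
#define PAE_AVAIL_MASK (7ull << 9) /* bits 11:9, available to software */

/* Nonzero when a present page-directory entry maps a large page directly. */
static inline int pae_pde_maps_large_page(uint64_t pde)
{
    return (pde & PAE_P) && (pde & PAE_PS);
}

/* Store a 3-bit software-defined value in the available bits. */
static inline uint64_t pae_set_avail(uint64_t entry, unsigned value)
{
    return (entry & ~PAE_AVAIL_MASK) | (((uint64_t)value & 7u) << 9);
}

static inline unsigned pae_get_avail(uint64_t entry)
{
    return (unsigned)((entry >> 9) & 7u);
}
```

An operating system might use the three available bits for its own bookkeeping, for example to tag a page as pinned; that use is an illustration and is not defined by the architecture.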

Figure 3-21. Format of Page-Directory-Pointer-Table and Page-Directory Entries for 2- or 4-MByte Pages and 36-Bit Extended Physical Addresses

[Figure: bit layouts of the two 64-bit entries.

Page-Directory-Pointer-Table Entry: bits 63-36 Reserved (set to 0); bits 35-32 Base Addr.; bits 31-12 Page Directory Base Address; bits 11-9 Avail.; bits 8-5 Reserved; bit 4 PCD; bit 3 PWT; bits 2-1 Res.; bit 0 set to 1.

Page-Directory Entry (2- or 4-MByte Pages): bits 63-36 Reserved (set to 0); bits 35-32 Base Addr.; bits 31-21 Page Base Address; bits 20-12 Reserved (set to 0); bits 11-9 Avail.; bit 8 G; bit 7 set to 1; bit 6 D; bit 5 A; bit 4 PCD; bit 3 PWT; bit 2 U/S; bit 1 R/W; bit 0 P.]

3.9. 36-BIT PAGE SIZE EXTENSION (PSE)

The 36-bit PSE extends 36-bit physical address support to 4-MByte pages while maintaining a 4-byte page-directory entry. This approach provides a simple mechanism for operating system vendors to address physical memory above 4 GBytes without requiring major design changes, but has practical limitations with respect to demand paging.

The P6 family of processors’ physical address extension (PAE) feature provides generic access to a 36-bit physical address space. However, it requires expansion of the page-directory and page-table entries to an 8-byte format (64 bit), and the addition of a page-directory-pointer table, resulting in another level of indirection to address translation.

For P6-family processors that support the 36-bit PSE feature, the virtual memory architecture is extended to support 4-MByte page size granularity in combination with 36-bit physical addressing. Note that some P6-family processors do not support this feature. For information about determining a processor’s feature support, refer to the following documents:

• AP-485, Intel Processor Identification and the CPUID Instruction

• Addendum—Intel Architecture Software Developer’s Manual, Volume 1: Basic Architecture

For information about the virtual memory architecture features of P6-family processors, refer to Chapter 3 of the Intel Architecture Software Developer’s Manual, Volume 3: System Programming Guide.

3.9.1. Description of the 36-bit PSE Feature

The 36-bit PSE feature (PSE-36) is detected by an operating system through the CPUID instruction. Specifically, the operating system executes the CPUID instruction with the value 1 in the EAX register and then determines support for the feature by inspecting bit 17 of the EDX register return value (see Addendum—Intel Architecture Software Developer’s Manual, Volume 1: Basic Architecture). If the PSE-36 feature is supported, an operating system is permitted to utilize the feature, as well as use certain formerly reserved bits. To use the 36-bit PSE feature, the PSE flag (bit 4 of CR4) must be set by the operating system. Note that a separate control bit in CR4 does not exist to regulate the use of 36-bit 4-MByte pages, because this feature becomes the default behavior for 4-MByte pages on processors that support it.
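The detection step can be sketched as a pure function over the EDX value returned by CPUID with EAX = 1. How that value is obtained (inline assembly or a compiler-specific helper) is deliberately left out, and all names below are hypothetical. The PAE test reflects the requirement, stated with Table 3-4, that PAE be clear when PSE-36 is used.

```c
#include <stdint.h>

/* Hypothetical names for the bits discussed in the text. */
#define CPUID_1_EDX_PSE36 (1u << 17) /* PSE-36 feature flag in EDX */
#define CR4_PSE (1u << 4)            /* page size extensions */
#define CR4_PAE (1u << 5)            /* physical address extension */

/* cpuid1_edx is the EDX value returned by CPUID with EAX = 1. */
static inline int cpu_has_pse36(uint32_t cpuid1_edx)
{
    return (cpuid1_edx & CPUID_1_EDX_PSE36) != 0;
}

/* PSE-36 mappings apply when the feature exists, CR4.PSE is set,
 * and CR4.PAE is clear. */
static inline int pse36_in_effect(uint32_t cpuid1_edx, uint32_t cr4)
{
    return cpu_has_pse36(cpuid1_edx) && (cr4 & CR4_PSE) != 0 &&
           (cr4 & CR4_PAE) == 0;
}
```

Keeping the check free of privileged instructions makes it easy to exercise with recorded register values.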

Table 3-4 shows the page size and physical address size obtained from various settings of the page-control flags for the P6-family processors that support the 36-bit PSE feature. The change to this table resulting from the 36-bit PSE feature (shaded in gray in the original table) is the row in which the PG, PSE, and PS flags are set and the PAE flag is clear.

To use the 36-bit PSE feature, the PAE feature must be cleared (as indicated in Table 3-4). However, the 36-bit PSE in no way affects the PAE feature. Existing operating systems and software that use the PAE will continue to have compatible functionality and features with P6-family processors that support 36-bit PSE. Specifically, the Page-Directory Entry (PDE) format when PAE is enabled for 2-MByte or 4-MByte pages is exactly as depicted in Figure 3-21 of the Intel Architecture Software Developer’s Manual, Volume 3: System Programming Guide.

No matter which 36-bit addressing feature is used (PAE or 36-bit PSE), the linear address space of the processor remains at 32 bits. Applications must partition the address space of their workloads across multiple operating system processes to take advantage of the additional physical memory provided in the system.

The 36-bit PSE feature extends the PDE format of the Intel Architecture for 4-MByte pages and 32-bit addresses by utilizing bits 16-13 (formerly reserved bits that were required to be zero) to extend the physical address without requiring an 8-byte page-directory entry. Therefore, with the 36-bit PSE feature, a page directory can contain up to 1024 entries, each pointing to a 4-MByte page that can exist anywhere in the 36-bit physical address space of the processor.

Figure 3-22 shows the difference between PDE formats for 4-MByte pages on P6-family processors that support the 36-bit PSE feature compared to P6-family processors that do not support the 36-bit PSE feature (i.e., 32-bit addressing).

Table 3-4. Paging Modes and Physical Address Size

PG Flag     PAE Flag    PSE Flag    PS Flag         Page    Physical
(in CR0)    (in CR4)    (in CR4)    (in the PDE)    Size    Address Size
0           X           X           X               —       Paging Disabled
1           0           0           X               4 KB    32 bits
1           0           1           0               4 KB    32 bits
1           0           1           1               4 MB    36 bits
1           1           X           0               4 KB    36 bits
1           1           X           1               2 MB    36 bits
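The decision logic of Table 3-4 can be sketched as a small lookup function, with large pages being 4 MBytes when PAE is clear and 2 MBytes when PAE is set. The structure and function names are hypothetical, and a result of zero stands for "paging disabled".

```c
#include <stdint.h>

/* Hypothetical result type: page size in KBytes and physical address
 * width in bits. Both fields are 0 when paging is disabled. */
struct paging_mode {
    unsigned page_size_kb;
    unsigned phys_addr_bits;
};

static inline struct paging_mode paging_mode_for(int pg, int pae, int pse, int ps)
{
    struct paging_mode m = {0, 0};
    if (!pg)
        return m; /* PG = 0: paging disabled */
    if (pae) {    /* PAE = 1: PSE is a don't-care */
        m.page_size_kb = ps ? 2048u : 4u;
        m.phys_addr_bits = 36u;
        return m;
    }
    if (pse && ps) { /* 36-bit PSE row */
        m.page_size_kb = 4096u;
        m.phys_addr_bits = 36u;
        return m;
    }
    m.page_size_kb = 4u; /* ordinary 4-KByte paging */
    m.phys_addr_bits = 32u;
    return m;
}
```

The sketch covers processors that support the 36-bit PSE feature; on other processors the PSE = 1, PS = 1 case would yield 32-bit physical addresses.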

Notes for Figure 3-22:

1. PA-2 = Bits 35-32 of the base physical address for the 4-MByte page (correspond to bits 16-13)

2. PA-1 = Bits 31-22 of the base physical address for the 4-MByte page

3. PAT = Bit 12 used as the Most Significant Bit of the index into Page Attribute Table (PAT); see Section 10.2.

4. PS = Bit 7 is the Page Size Bit—indicates 4-MByte page (must be set to 1)

5. Reserved = Bits 21-17 are reserved for future expansion

6. No change in format or meaning of bits 11-8 and 6-0; refer to Figure 3-15 for details.

The PSE-36 feature is transparent to existing operating systems that utilize 4-MByte pages, because unused bits in PA-2 are currently enforced as zero by Intel processors. The feature requires 4-MByte pages aligned on a 4-MByte boundary and 4 MBytes of physically contiguous memory. Therefore, the ten bits of PA-1 are sufficient to specify the base physical address of any 4-MByte page below 4 GBytes. An operating system can easily support addresses greater than 4 GBytes simply by providing the upper 4 bits of the physical address in PA-2 when creating a PDE for a 4-MByte page.

Figure 3-23 shows the linear address mapping to 4-MByte pages when the 36-bit PSE is enabled. The base physical address of the 4-MByte page is contained in the PDE. PA-2 (bits 13-16) is used to provide the upper four bits (bits 32-35) of the 36-bit physical address. PA-1 (bits 22-31) continues to provide the next ten bits (bits 22-31) of the physical address for the 4-MByte page. The offset into the page is provided by the lower 22 bits of the linear address. This scheme eliminates the second level of indirection caused by the use of 4-KByte page tables.
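The mapping just described can be sketched in a few lines: PA-2 supplies physical address bits 35-32, PA-1 supplies bits 31-22, and the linear address supplies the 22-bit offset. The function names are hypothetical, and pse36_make_pde() assumes the caller passes a 4-MByte-aligned base.

```c
#include <stdint.h>

/* Base physical address of the 4-MByte page described by a PSE-36 PDE. */
static inline uint64_t pse36_page_base(uint32_t pde)
{
    uint64_t pa2 = (pde >> 13) & 0xFu; /* PDE bits 16:13 */
    uint64_t pa1 = pde & 0xFFC00000u;  /* PDE bits 31:22 */
    return (pa2 << 32) | pa1;
}

/* Full 36-bit physical address for a linear address mapped by the PDE. */
static inline uint64_t pse36_translate(uint32_t pde, uint32_t linear)
{
    return pse36_page_base(pde) | (linear & 0x003FFFFFu);
}

/* Build a PDE from a 4-MByte-aligned base below 2^36. PS (bit 7) is
 * forced to 1; low_flags carries bits 11-8 and 6-0 unchanged. */
static inline uint32_t pse36_make_pde(uint64_t page_base, uint32_t low_flags)
{
    uint32_t pa1 = (uint32_t)(page_base & 0xFFC00000u);
    uint32_t pa2 = (uint32_t)((page_base >> 32) & 0xFu) << 13;
    return pa1 | pa2 | (1u << 7) | (low_flags & 0xF7Fu);
}
```

For a page below 4 GBytes the PA-2 field comes out as zero, which matches the format that operating systems without PSE-36 support already write.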

Page Directory Entry format for processors that support 36-bit addressing for 4-MByte pages:

Bits 31-22    Bits 21-17    Bits 16-13    Bit 12    Bits 11-8    Bit 7    Bits 6-0
PA-1          Reserved      PA-2          PAT       (flags)      PS=1     (flags)

Page Directory Entry format for processors that support 32-bit addressing for 4-MByte pages:

Bits 31-22           Bits 21-12    Bits 11-8    Bit 7    Bits 6-0
Base Page Address    Reserved      (flags)      PS=1     (flags)

Figure 3-22. PDE Format Differences between 36-bit and 32-bit addressing

3.9.2. Fault Detection

There are several conditions that can cause P6-family processors that support this feature to generate a page fault (#PF). These conditions are related to the use of, or switching between, various memory management features:

• If the PSE feature is enabled, a nonzero value in any of the remaining reserved bits (17-21) of a 4-MByte PDE causes a page fault, with the reserved bit (bit 3) set in the error code.

• If the PAE feature is enabled and set to use 2-MByte or 4-MByte pages (that is, 8-byte page-directory table entries are being used), a nonzero value in any of the reserved bits 13-20 causes a page fault, with the reserved bit (bit 3) set in the error code. Note that bit 12 is now being used to support the Page Attribute Table feature (refer to Section 9.13., “Page Attribute Table (PAT)”).
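The two reserved-bit conditions above can be sketched as mask tests. Only the reserved-bit contribution to the error code is modelled, other fault causes are ignored, and all names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical names. */
#define PDE_P (1u << 0)
#define PDE_PS (1u << 7)
#define PSE36_PDE_RESERVED_MASK 0x003E0000u     /* bits 21:17 */
#define PAE_LARGE_PDE_RESERVED_MASK 0x001FE000u /* bits 20:13 */
#define PF_ERR_RSVD (1u << 3) /* reserved-bit flag in the error code */

/* Error-code contribution for a 4-byte PDE that maps a 4-MByte page. */
static inline uint32_t pse36_pde_reserved_fault(uint32_t pde)
{
    if ((pde & PDE_P) == 0 || (pde & PDE_PS) == 0)
        return 0; /* not a present large-page PDE; not modelled here */
    return (pde & PSE36_PDE_RESERVED_MASK) ? PF_ERR_RSVD : 0;
}

/* Error-code contribution for an 8-byte PDE that maps a large page. */
static inline uint32_t pae_large_pde_reserved_fault(uint64_t pde)
{
    if ((pde & PDE_P) == 0 || (pde & PDE_PS) == 0)
        return 0;
    return (pde & PAE_LARGE_PDE_RESERVED_MASK) ? PF_ERR_RSVD : 0;
}
```

A page-table builder could run such a check on each entry it writes, catching a bad base address before the processor reports it as a page fault.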

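Figure 3-23. Page Size Extension Linear to Physical Translation

[Figure: the Directory Index (bits 31-22 of the linear address) selects a PDE in the page directory located by CR3. The PDE fields PA-1 (bits 31-22), Reserved (bits 21-17), PA-2 (bits 16-13), PAT (bit 12), and PS=1 (bit 7) give the page frame address of a 4-MByte page, and bits 21-0 of the linear address give the offset within that page.]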

3.10. MAPPING SEGMENTS TO PAGES

The segmentation and paging mechanisms provided in the Intel Architecture support a wide variety of approaches to memory management. When segmentation and paging are combined, segments can be mapped to pages in several ways. To implement a flat (unsegmented) addressing environment, for example, all the code, data, and stack modules can be mapped to one or more large segments (up to 4 GBytes) that share the same range of linear addresses (refer to Figure 3-2). Here, segments are essentially invisible to applications and the operating system or executive. If paging is used, the paging mechanism can map a single linear address space (contained in a single segment) into virtual memory. Or, each program (or task) can have its own large linear address space (contained in its own segment), which is mapped into virtual memory through its own page directory and set of page tables.

Segments can be smaller than the size of a page. If one of these segments is placed in a page which is not shared with another segment, the extra memory is wasted. For example, a small data structure, such as a 1-byte semaphore, occupies 4 KBytes if it is placed in a page by itself. If many semaphores are used, it is more efficient to pack them into a single page.

The Intel Architecture does not enforce correspondence between the boundaries of pages and segments. A page can contain the end of one segment and the beginning of another. Likewise, a segment can contain the end of one page and the beginning of another.

Memory-management software may be simpler and more efficient if it enforces some alignment between page and segment boundaries. For example, if a segment which can fit in one page is placed in two pages, there may be twice as much paging overhead to support access to that segment.

One approach to combining paging and segmentation that simplifies memory-management software is to give each segment its own page table, as shown in Figure 3-24. This convention gives the segment a single entry in the page directory that provides the access control information for paging the entire segment.

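Figure 3-24. Memory Management Convention That Assigns a Page Table to Each Segment

[Figure: each segment descriptor in the LDT corresponds to one page-directory entry (PDE) in the page directory. Each PDE points to its own page table, whose page-table entries (PTEs) point to page frames.]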

CHAPTER 4
PROTECTION

In protected mode, the Intel Architecture provides a protection mechanism that operates at both the segment level and the page level. This protection mechanism provides the ability to limit access to certain segments or pages based on privilege levels (four privilege levels for segments and two privilege levels for pages). For example, critical operating-system code and data can be protected by placing them in more privileged segments than those that contain application code. The processor’s protection mechanism will then prevent application code from accessing the operating-system code and data in any but a controlled, defined manner.

Segment and page protection can be used at all stages of software development to assist in localizing and detecting design problems and bugs. It can also be incorporated into end-products to offer added robustness to operating systems, utilities software, and applications software.

When the protection mechanism is used, each memory reference is checked to verify that it satisfies various protection checks. All checks are made before the memory cycle is started; any violation results in an exception. Because checks are performed in parallel with address translation, there is no performance penalty. The protection checks that are performed fall into the following categories:

• Limit checks.

• Type checks.

• Privilege level checks.

• Restriction of addressable domain.

• Restriction of procedure entry-points.

• Restriction of instruction set.

All protection violations result in an exception being generated. Refer to Chapter 5, Interrupt and Exception Handling for an explanation of the exception mechanism. This chapter describes the protection mechanism and the violations which lead to exceptions.

The following sections describe the protection mechanism available in protected mode. Refer to Chapter 16, 8086 Emulation for information on protection in real-address and virtual-8086 mode.

4.1. ENABLING AND DISABLING SEGMENT AND PAGE PROTECTION

Setting the PE flag in register CR0 causes the processor to switch to protected mode, which in turn enables the segment-protection mechanism. Once in protected mode, there is no control bit for turning the protection mechanism on or off. The part of the segment-protection mechanism that is based on privilege levels can essentially be disabled while still in protected mode by assigning a privilege level of 0 (most privileged) to all segment selectors and segment descriptors. This action disables the privilege level protection barriers between segments, but other protection checks such as limit checking and type checking are still carried out.

Page-level protection is automatically enabled when paging is enabled (by setting the PG flag in register CR0). Here again there is no mode bit for turning off page-level protection once paging is enabled. However, page-level protection can be disabled by performing the following operations:

• Clear the WP flag in control register CR0.

• Set the read/write (R/W) and user/supervisor (U/S) flags for each page-directory and page-table entry.

This action makes each page a writable, user page, which in effect disables page-level protection.
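The two operations above can be sketched over in-memory copies of CR0 and a 32-bit paging table. The names are hypothetical, and the position of the WP flag (bit 16 of CR0) is taken from the usual CR0 layout rather than from this section. The sketch skips entries that are not present, on the assumption that their remaining bits belong to software.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names. */
#define CR0_WP (1u << 16) /* assumed position of the WP flag */
#define PG_P (1u << 0)    /* present */
#define PG_RW (1u << 1)   /* read/write */
#define PG_US (1u << 2)   /* user/supervisor */

/* New CR0 value with write protection of supervisor accesses turned off. */
static inline uint32_t cr0_clear_wp(uint32_t cr0)
{
    return cr0 & ~CR0_WP;
}

/* Mark every present entry of a page directory or page table as a
 * writable, user-accessible mapping. */
static inline void paging_open_entries(uint32_t *table, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (table[i] & PG_P)
            table[i] |= PG_RW | PG_US;
    }
}
```

On real hardware the modified entries would also need to be made visible to the processor by invalidating the affected TLB entries; that step is outside this sketch.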

4.2. FIELDS AND FLAGS USED FOR SEGMENT-LEVEL AND PAGE-LEVEL PROTECTION

The processor’s protection mechanism uses the following fields and flags in the system data structures to control access to segments and pages:

• Descriptor type (S) flag—(Bit 12 in the second doubleword of a segment descriptor.) Determines if the segment descriptor is for a system segment or a code or data segment.

• Type field—(Bits 8 through 11 in the second doubleword of a segment descriptor.) Determines the type of code, data, or system segment.

• Limit field—(Bits 0 through 15 of the first doubleword and bits 16 through 19 of the second doubleword of a segment descriptor.) Determines the size of the segment, along with the G flag and E flag (for data segments).

• G flag—(Bit 23 in the second doubleword of a segment descriptor.) Determines the size of the segment, along with the limit field and E flag (for data segments).

• E flag—(Bit 10 in the second doubleword of a data-segment descriptor.) Determines the size of the segment, along with the limit field and G flag.

• Descriptor privilege level (DPL) field—(Bits 13 and 14 in the second doubleword of a segment descriptor.) Determines the privilege level of the segment.

• Requested privilege level (RPL) field. (Bits 0 and 1 of any segment selector.) Specifies the requested privilege level of a segment selector.

• Current privilege level (CPL) field. (Bits 0 and 1 of the CS segment register.) Indicates the privilege level of the currently executing program or procedure. The term current privilege level (CPL) refers to the setting of this field.

• User/supervisor (U/S) flag. (Bit 2 of a page-directory or page-table entry.) Determines the type of page: user or supervisor.

• Read/write (R/W) flag. (Bit 1 of a page-directory or page-table entry.) Determines the type of access allowed to a page: read only or read-write.

Figure 4-1 shows the location of the various fields and flags in the data, code, and system-segment descriptors; Figure 3-6 in Chapter 3, Protected-Mode Memory Management shows the location of the RPL (or CPL) field in a segment selector (or the CS register); and Figure 3-14 in Chapter 3, Protected-Mode Memory Management shows the location of the U/S and R/W flags in the page-directory and page-table entries.
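The bit positions given in the list above can be sketched as a decoder over the two doublewords of a segment descriptor. The structure and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical container for the protection-related descriptor fields. */
struct seg_prot_fields {
    uint32_t limit; /* 20-bit limit field */
    unsigned type;  /* bits 11:8 of the second doubleword */
    unsigned s;     /* bit 12: 1 = code or data, 0 = system */
    unsigned dpl;   /* bits 14:13 */
    unsigned g;     /* bit 23: granularity */
    unsigned e;     /* bit 10: expansion direction (data segments only) */
};

/* dw0 is the first doubleword of the descriptor, dw1 the second. */
static inline struct seg_prot_fields seg_decode(uint32_t dw0, uint32_t dw1)
{
    struct seg_prot_fields f;
    f.limit = (dw0 & 0x0000FFFFu) | (dw1 & 0x000F0000u);
    f.type = (dw1 >> 8) & 0xFu;
    f.s = (dw1 >> 12) & 0x1u;
    f.dpl = (dw1 >> 13) & 0x3u;
    f.g = (dw1 >> 23) & 0x1u;
    f.e = (dw1 >> 10) & 0x1u;
    return f;
}

/* RPL of any selector, or the CPL when applied to the CS register value. */
static inline unsigned selector_rpl(uint16_t selector)
{
    return selector & 0x3u;
}
```

The upper four limit bits already sit at bit positions 19:16 in the second doubleword, so they can be combined with the lower sixteen bits without shifting.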

Many different styles of protection schemes can be implemented with these fields and flags. When the operating system creates a descriptor, it places values in these fields and flags in keeping with the particular protection style chosen for an operating system or executive. Application programs do not generally access or modify these fields and flags.

The following sections describe how the processor uses these fields and flags to perform the various categories of checks described in the introduction to this chapter.

Figure 4-1. Descriptor Fields Used for Protection

[Figure: bit layouts of the data-segment, code-segment, and system-segment descriptors. The first doubleword holds Base Address 15:00 (bits 31-16) and Segment Limit 15:00 (bits 15-0). The second doubleword holds Base 31:24 (bits 31-24), G (bit 23), B or D (bit 22, data or code segments), AVL (bit 20), Limit 19:16 (bits 19-16), P (bit 15), DPL (bits 14-13), the descriptor type flag (bit 12), Type (bits 11-8), and Base 23:16 (bits 7-0). The type field of a data segment contains the E, W, and A flags; the type field of a code segment contains the C, R, and A flags.]

A       Accessed
AVL     Available to Sys. Programmers
B       Big
C       Conforming
D       Default
DPL     Descriptor Privilege Level
E       Expansion Direction
G       Granularity
LIMIT   Segment Limit
P       Present
R       Readable
W       Writable

4.3. LIMIT CHECKING

The limit field of a segment descriptor prevents programs or procedures from addressing memory locations outside the segment. The effective value of the limit depends on the setting of the G (granularity) flag (refer to Figure 4-1). For data segments, the limit also depends on the E (expansion direction) flag and the B (default stack pointer size and/or upper bound) flag. The E flag is one of the bits in the type field when the segment descriptor is for a data-segment type.

When the G flag is clear (byte granularity), the effective limit is the value of the 20-bit limit field in the segment descriptor. Here, the limit ranges from 0 to FFFFFH (1 MByte). When the G flag is set (4-KByte page granularity), the processor scales the value in the limit field by a factor of 2^12 (4 KBytes). In this case, the effective limit ranges from FFFH (4 KBytes) to FFFFFFFFH (4 GBytes). Note that when scaling is used (G flag is set), the lower 12 bits of a segment offset (address) are not checked against the limit; for example, note that if the segment limit is 0, offsets 0 through FFFH are still valid.
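The scaling rule can be written as a one-line computation. In this sketch (the function name is hypothetical), a set G flag shifts the 20-bit field left by 12 and fills the low 12 bits with ones, which reproduces the stated range of FFFH to FFFFFFFFH.

```c
#include <stdint.h>

/* Effective limit from the 20-bit limit field and the G flag. */
static inline uint32_t seg_effective_limit(uint32_t limit_field, int g_flag)
{
    limit_field &= 0xFFFFFu; /* only 20 bits are architecturally present */
    if (g_flag)
        return (limit_field << 12) | 0xFFFu; /* 4-KByte granularity */
    return limit_field;                      /* byte granularity */
}
```

With a limit field of 0 and G set, the result is FFFH, matching the remark that offsets 0 through FFFH remain valid.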

For all types of segments except expand-down data segments, the effective limit is the last address that is allowed to be accessed in the segment, which is one less than the size, in bytes, of the segment. The processor causes a general-protection exception any time an attempt is made to access the following addresses in a segment:

• A byte at an offset greater than the effective limit

• A word at an offset greater than the (effective-limit – 1)

• A doubleword at an offset greater than the (effective-limit – 3)

• A quadword at an offset greater than the (effective-limit – 7)

For expand-down data segments, the segment limit has the same function but is interpreted differently. Here, the effective limit specifies the last address that is not allowed to be accessed within the segment; the range of valid offsets is from (effective-limit + 1) to FFFFFFFFH if the B flag is set and from (effective-limit + 1) to FFFFH if the B flag is clear. An expand-down segment has maximum size when the segment limit is 0.
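The limit rules for both segment directions can be sketched as a single predicate. The function name and parameters are hypothetical. The per-size rules given for bytes, words, doublewords, and quadwords are folded into one test on the last byte touched, and 64-bit arithmetic is used so that offset + size cannot wrap around.

```c
#include <stdint.h>

/* Returns nonzero when an access of `size` bytes at `offset` passes the
 * limit check. eff_limit is the effective limit after G-flag scaling. */
static inline int seg_limit_check(uint32_t offset, uint32_t size,
                                  uint32_t eff_limit, int expand_down,
                                  int b_flag)
{
    if (size == 0)
        return 0; /* treated as invalid in this sketch */

    uint64_t last = (uint64_t)offset + size - 1; /* last byte touched */

    if (!expand_down)
        return last <= eff_limit;

    /* Expand-down: valid offsets run from (eff_limit + 1) up to the
     * upper bound selected by the B flag. */
    uint64_t upper = b_flag ? 0xFFFFFFFFull : 0xFFFFull;
    return offset > eff_limit && last <= upper;
}
```

A failed check corresponds to the general-protection exception described in the text; the sketch only reports the outcome and does not model exception delivery.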

Limit checking catches programming errors such as runaway code, runaway subscriptinvalid pointer calculations. These errors are detected when they occur, so identificationcause is easier. Without limit checking, these errors could overwrite code or data in asegment.

In addition to checking segment limits, the processor also checks descriptor table limits. The GDTR and IDTR registers contain 16-bit limit values that the processor uses to prevent programs from selecting a segment descriptor outside the respective descriptor tables. The LDTR and task registers contain 32-bit segment limit values (read from the segment descriptors for the current LDT and TSS, respectively). The processor uses these segment limits to prevent accesses beyond the bounds of the current LDT and TSS. Refer to Section 3.5.1., “Segment Descriptor Tables” in Chapter 3, Protected-Mode Memory Management for more information on the GDT and LDT limit fields; refer to Section 5.8., “Interrupt Descriptor Table (IDT)” in Chapter 5, Interrupt and Exception Handling for more information on the IDT limit field; and refer to Section 6.2.3., “Task Register” in Chapter 6, Task Management for more information on the TSS segment limit field.


4.4. TYPE CHECKING

Segment descriptors contain type information in two places:

• The S (descriptor type) flag.

• The type field.

The processor uses this information to detect programming errors that result in an attempt to use a segment or gate in an incorrect or unintended manner.

The S flag indicates whether a descriptor is a system type or a code or data type. The type field provides 4 additional bits for use in defining various types of code, data, and system descriptors. Table 3-1 in Chapter 3, Protected-Mode Memory Management shows the encoding of the type field for code and data descriptors; Table 3-2 in Chapter 3, Protected-Mode Memory Management shows the encoding of the field for system descriptors.

The processor examines type information at various times while operating on segment selectors and segment descriptors. The following list gives examples of typical operations where type checking is performed. This list is not exhaustive.

• When a segment selector is loaded into a segment register. Certain segment registers can contain only certain descriptor types, for example:

— The CS register can be loaded only with a selector for a code segment.

— Segment selectors for code segments that are not readable or for system segments cannot be loaded into data-segment registers (DS, ES, FS, and GS).

— Only segment selectors of writable data segments can be loaded into the SS register.

• When a segment selector is loaded into the LDTR or task register.

— The LDTR can only be loaded with a selector for an LDT.

— The task register can only be loaded with a segment selector for a TSS.

• When instructions access segments whose descriptors are already loaded into segment registers. Certain segments can be used by instructions only in certain predefined ways, for example:

— No instruction may write into an executable segment.

— No instruction may write into a data segment if it is not writable.

— No instruction may read an executable segment unless the readable flag is set.

• When an instruction operand contains a segment selector. Certain instructions can access segments or gates of only a particular type, for example:

— A far CALL or far JMP instruction can only access a segment descriptor for a conforming code segment, nonconforming code segment, call gate, task gate, or TSS.

— The LLDT instruction must reference a segment descriptor for an LDT.

— The LTR instruction must reference a segment descriptor for a TSS.


— The LAR instruction must reference a segment or gate descriptor for an LDT, TSS, call gate, task gate, code segment, or data segment.

— The LSL instruction must reference a segment descriptor for an LDT, TSS, code segment, or data segment.

— IDT entries must be interrupt, trap, or task gates.

• During certain internal operations. For example:

— On a far call or far jump (executed with a far CALL or far JMP instruction), the processor determines the type of control transfer to be carried out (call or jump to another code segment, a call or jump through a gate, or a task switch) by checking the type field in the segment (or gate) descriptor pointed to by the segment (or gate) selector given as an operand in the CALL or JMP instruction. If the descriptor type is for a code segment or call gate, a call or jump to another code segment is indicated; if the descriptor type is for a TSS or task gate, a task switch is indicated.

— On a call or jump through a call gate (or on an interrupt- or exception-handler call through a trap or interrupt gate), the processor automatically checks that the segment descriptor being pointed to by the gate is for a code segment.

— On a call or jump to a new task through a task gate (or on an interrupt- or exception-handler call to a new task through a task gate), the processor automatically checks that the segment descriptor being pointed to by the task gate is for a TSS.

— On a call or jump to a new task by a direct reference to a TSS, the processor automatically checks that the segment descriptor being pointed to by the CALL or JMP instruction is for a TSS.

— On return from a nested task (initiated by an IRET instruction), the processor checks that the previous task link field in the current TSS points to a TSS.
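The segment-register rules from the first item of the list can be sketched as predicates on the S flag and the type field. This is an illustration only: the function names are hypothetical, and the bit assignments (bit 3 distinguishes code from data, bit 1 is write-enable for data segments and read-enable for code segments) are assumed from the type-field encoding in Table 3-1, which is not reproduced in this section.

```c
#include <stdbool.h>

/* Assumed type-field bits for code and data descriptors (S flag set). */
enum {
    TYPE_CODE     = 0x8,  /* set: code segment; clear: data segment */
    TYPE_WRITABLE = 0x2,  /* data segments: write-enable */
    TYPE_READABLE = 0x2   /* code segments: read-enable */
};

/* CS accepts only code segments. */
bool can_load_cs(bool s_flag, unsigned type)
{
    return s_flag && (type & TYPE_CODE) != 0;
}

/* SS accepts only writable data segments. */
bool can_load_ss(bool s_flag, unsigned type)
{
    return s_flag && (type & TYPE_CODE) == 0 && (type & TYPE_WRITABLE) != 0;
}

/* DS, ES, FS, and GS reject system segments and unreadable code segments. */
bool can_load_data_reg(bool s_flag, unsigned type)
{
    if (!s_flag)
        return false;                        /* system descriptor */
    if (type & TYPE_CODE)
        return (type & TYPE_READABLE) != 0;  /* code must be readable */
    return true;                             /* any data segment */
}
```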

4.4.1. Null Segment Selector Checking

Attempting to load a null segment selector (refer to Section 3.4.1. in Chapter 3, Protected-Mode Memory Management) into the CS or SS segment register generates a general-protection exception (#GP). A null segment selector can be loaded into the DS, ES, FS, or GS register, but any attempt to access a segment through one of these registers when it is loaded with a null segment selector results in a #GP exception being generated. Loading unused data-segment registers with a null segment selector is a useful method of detecting accesses to unused segment registers and/or preventing unwanted accesses to data segments.
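A minimal sketch of the null-selector rule follows. It assumes the selector layout described in Chapter 3 (bits 0 and 1 hold the RPL, bit 2 is the table indicator, bits 3 through 15 are the index), so that a null selector is any value with index 0 and the table indicator clear. The function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: a null selector has index 0 and TI = 0; the RPL bits
 * do not matter, so the values 0 through 3 are all null selectors. */
bool is_null_selector(uint16_t selector)
{
    return (selector & 0xFFFCu) == 0;
}

/* Loading a null selector faults (#GP) only for CS and SS. */
bool null_load_faults(uint16_t selector, bool target_is_cs_or_ss)
{
    return is_null_selector(selector) && target_is_cs_or_ss;
}

/* Any memory access through a register that holds a null selector faults. */
bool access_faults(uint16_t loaded_selector)
{
    return is_null_selector(loaded_selector);
}
```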


4.5. PRIVILEGE LEVELS

The processor’s segment-protection mechanism recognizes 4 privilege levels, numbered from 0 to 3. The greater numbers mean lesser privileges. Figure 4-2 shows how these levels of privilege can be interpreted as rings of protection. The center (reserved for the most privileged code, data, and stacks) is used for the segments containing the critical software, usually the kernel of an operating system. Outer rings are used for less critical software. (Systems that use only 2 of the 4 possible privilege levels should use levels 0 and 3.)

The processor uses privilege levels to prevent a program or task operating at a lesser privilege level from accessing a segment with a greater privilege, except under controlled situations. When the processor detects a privilege level violation, it generates a general-protection exception (#GP).

Figure 4-2. Protection Rings [concentric rings: Level 0 (operating system kernel) at the center, Levels 1 and 2 (operating system services) around it, and Level 3 (applications) outermost]

To carry out privilege-level checks between code segments and data segments, the processor recognizes the following three types of privilege levels:

• Current privilege level (CPL). The CPL is the privilege level of the currently executing program or task. It is stored in bits 0 and 1 of the CS and SS segment registers. Normally, the CPL is equal to the privilege level of the code segment from which instructions are being fetched. The processor changes the CPL when program control is transferred to a code segment with a different privilege level. The CPL is treated slightly differently when accessing conforming code segments. Conforming code segments can be accessed from any privilege level that is equal to or numerically greater (less privileged) than the DPL of the conforming code segment. Also, the CPL is not changed when the processor accesses a conforming code segment that has a different privilege level than the CPL.

• Descriptor privilege level (DPL). The DPL is the privilege level of a segment or gate. It is stored in the DPL field of the segment or gate descriptor for the segment or gate. When the currently executing code segment attempts to access a segment or gate, the DPL of the segment or gate is compared to the CPL and RPL of the segment or gate selector (as described later in this section). The DPL is interpreted differently, depending on the type of segment or gate being accessed:

— Data segment. The DPL indicates the numerically highest privilege level that a program or task can have to be allowed to access the segment. For example, if the DPL of a data segment is 1, only programs running at a CPL of 0 or 1 can access the segment.

— Nonconforming code segment (without using a call gate). The DPL indicates the privilege level that a program or task must be at to access the segment. For example, if the DPL of a nonconforming code segment is 0, only programs running at a CPL of 0 can access the segment.

— Call gate. The DPL indicates the numerically highest privilege level that the currently executing program or task can be at and still be able to access the call gate. (This is the same access rule as for a data segment.)

— Conforming code segment and nonconforming code segment accessed through a call gate. The DPL indicates the numerically lowest privilege level that a program or task can have to be allowed to access the segment. For example, if the DPL of a conforming code segment is 2, programs running at a CPL of 0 or 1 cannot access the segment.

— TSS. The DPL indicates the numerically highest privilege level that the currently executing program or task can be at and still be able to access the TSS. (This is the same access rule as for a data segment.)

• Requested privilege level (RPL). The RPL is an override privilege level that is assigned to segment selectors. It is stored in bits 0 and 1 of the segment selector. The processor checks the RPL along with the CPL to determine if access to a segment is allowed. Even if the program or task requesting access to a segment has sufficient privilege to access the segment, access is denied if the RPL is not of sufficient privilege level. That is, if the RPL of a segment selector is numerically greater than the CPL, the RPL overrides the CPL, and vice versa. The RPL can be used to ensure that privileged code does not access a segment on behalf of an application program unless the program itself has access privileges for that segment. Refer to Section 4.10.4., “Checking Caller Access Privileges (ARPL Instruction)” for a detailed description of the purpose and typical use of the RPL.
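The way the RPL is carried in a selector, and the way it overrides the CPL, can be illustrated with a short sketch. The helper names `selector_rpl` and `weaker_of_cpl_rpl` are hypothetical; only the bit positions (bits 0 and 1) come from the text above.

```c
#include <stdint.h>

/* The RPL occupies bits 0 and 1 of a segment selector. */
unsigned selector_rpl(uint16_t selector)
{
    return selector & 0x3u;
}

/* The numerically larger (less privileged) of the CPL and the RPL.
 * Whichever of the two is weaker determines whether access is allowed. */
unsigned weaker_of_cpl_rpl(unsigned cpl, uint16_t selector)
{
    unsigned rpl = selector_rpl(selector);
    return rpl > cpl ? rpl : cpl;
}
```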

Privilege levels are checked when the segment selector of a segment descriptor is loaded into a segment register. The checks used for data access differ from those used for transfers of program control among code segments; therefore, the two kinds of accesses are considered separately in the following sections.

4.6. PRIVILEGE LEVEL CHECKING WHEN ACCESSING DATA SEGMENTS

To access operands in a data segment, the segment selector for the data segment must be loaded into the data-segment registers (DS, ES, FS, or GS) or into the stack-segment register (SS). (Segment registers can be loaded with the MOV, POP, LDS, LES, LFS, LGS, and LSS instructions.) Before the processor loads a segment selector into a segment register, it performs a privilege check (refer to Figure 4-3) by comparing the privilege levels of the currently running program or task (the CPL), the RPL of the segment selector, and the DPL of the segment’s segment descriptor. The processor loads the segment selector into the segment register if the DPL is numerically greater than or equal to both the CPL and the RPL. Otherwise, a general-protection fault is generated and the segment register is not loaded.

Figure 4-4 shows four procedures (located in code segments A, B, C, and D), each running at different privilege levels and each attempting to access the same data segment.

• The procedure in code segment A is able to access data segment E using segment selector E1, because the CPL of code segment A and the RPL of segment selector E1 are equal to the DPL of data segment E.

• The procedure in code segment B is able to access data segment E using segment selector E2, because the CPL of code segment B and the RPL of segment selector E2 are both numerically lower than (more privileged than) the DPL of data segment E. A code segment B procedure can also access data segment E using segment selector E1.

• The procedure in code segment C is not able to access data segment E using segment selector E3 (dotted line), because the CPL of code segment C and the RPL of segment selector E3 are both numerically greater than (less privileged than) the DPL of data segment E. Even if a code segment C procedure were to use segment selector E1 or E2, such that the RPL would be acceptable, it still could not access data segment E because its CPL is not privileged enough.

• The procedure in code segment D should be able to access data segment E because code segment D’s CPL is numerically less than the DPL of data segment E. However, the RPL of segment selector E3 (which the code segment D procedure is using to access data segment E) is numerically greater than the DPL of data segment E, so access is not allowed. If the code segment D procedure were to use segment selector E1 or E2 to access the data segment, access would be allowed.

Figure 4-3. Privilege Check for Data Access [the CPL (from the CS register), the RPL (from the segment selector for the data segment), and the DPL (from the data-segment descriptor) feed the privilege check]
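The data-segment rule and the four cases above can be checked with a one-line predicate. The sketch below is illustrative; the function name `data_segment_load_ok` is hypothetical, and the privilege values in the usage note are those given for Figure 4-4 (data segment E has a DPL of 2; selectors E1, E2, and E3 have RPLs of 2, 1, and 3).

```c
#include <stdbool.h>

/* A data-segment selector is loaded only if the DPL is numerically
 * greater than or equal to both the CPL and the RPL. */
bool data_segment_load_ok(unsigned cpl, unsigned rpl, unsigned dpl)
{
    return dpl >= cpl && dpl >= rpl;
}
```

With these values, code segment A (CPL 2) using E1 and code segment B (CPL 1) using E1 or E2 pass the check; code segment C (CPL 3) fails with any selector; code segment D (CPL 0) fails with E3 but passes with E1 or E2.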

As demonstrated in the previous examples, the addressable domain of a program or task varies as its CPL changes. When the CPL is 0, data segments at all privilege levels are accessible; when the CPL is 1, only data segments at privilege levels 1 through 3 are accessible; when the CPL is 3, only data segments at privilege level 3 are accessible.

The RPL of a segment selector can always override the addressable domain of a program or task. When properly used, RPLs can prevent problems caused by accidental (or intentional) use of segment selectors for privileged data segments by less privileged programs or procedures.

It is important to note that the RPL of a segment selector for a data segment is under software control. For example, an application program running at a CPL of 3 can set the RPL for a data-segment selector to 0. With the RPL set to 0, only the CPL checks, not the RPL checks, will provide protection against deliberate, direct attempts to violate privilege-level security for the data segment. To prevent these types of privilege-level-check violations, a program or procedure can check access privileges whenever it receives a data-segment selector from another procedure (refer to Section 4.10.4., “Checking Caller Access Privileges (ARPL Instruction)”).

Figure 4-4. Examples of Accessing Data Segments From Various Privilege Levels [data segment E has DPL=2; code segment C runs at CPL=3 (lowest privilege), code segment A at CPL=2, code segment B at CPL=1, and code segment D at CPL=0 (highest privilege); segment selector E1 has RPL=2, E2 has RPL=1, and E3 has RPL=3]


4.6.1. Accessing Data in Code Segments

In some instances it may be desirable to access data structures that are contained in a code segment. The following methods of accessing data in code segments are possible:

• Load a data-segment register with a segment selector for a nonconforming, readable, code segment.

• Load a data-segment register with a segment selector for a conforming, readable, code segment.

• Use a code-segment override prefix (CS) to read a readable, code segment whose selector is already loaded in the CS register.

The same rules for accessing data segments apply to method 1. Method 2 is always valid because the privilege level of a conforming code segment is effectively the same as the CPL, regardless of its DPL. Method 3 is always valid because the DPL of the code segment selected by the CS register is the same as the CPL.
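Methods 1 and 2 differ only in whether the privilege comparison is applied. A minimal sketch, using a hypothetical function name and covering only the load of a data-segment register (method 3 involves no selector load and is not modeled):

```c
#include <stdbool.h>

/* Can a code-segment selector be loaded into DS, ES, FS, or GS for reading? */
bool code_segment_load_for_read_ok(bool readable, bool conforming,
                                   unsigned cpl, unsigned rpl, unsigned dpl)
{
    if (!readable)
        return false;                 /* unreadable code is never loadable */
    if (conforming)
        return true;                  /* method 2: always valid */
    return dpl >= cpl && dpl >= rpl;  /* method 1: data-segment rule */
}
```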

4.7. PRIVILEGE LEVEL CHECKING WHEN LOADING THE SS REGISTER

Privilege level checking also occurs when the SS register is loaded with the segment selector for a stack segment. Here all privilege levels related to the stack segment must match the CPL; that is, the CPL, the RPL of the stack-segment selector, and the DPL of the stack-segment descriptor must be the same. If the RPL and DPL are not equal to the CPL, a general-protection exception (#GP) is generated.
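For comparison with the data-segment rule, the stack-segment rule is a strict equality. A small sketch with a hypothetical function name:

```c
#include <stdbool.h>

/* SS can be loaded only when the CPL, the RPL of the stack-segment
 * selector, and the DPL of the stack-segment descriptor are all equal. */
bool ss_load_ok(unsigned cpl, unsigned rpl, unsigned dpl)
{
    return rpl == cpl && dpl == cpl;
}
```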

4.8. PRIVILEGE LEVEL CHECKING WHEN TRANSFERRING PROGRAM CONTROL BETWEEN CODE SEGMENTS

To transfer program control from one code segment to another, the segment selector for the destination code segment must be loaded into the code-segment register (CS). As part of this loading process, the processor examines the segment descriptor for the destination code segment and performs various limit, type, and privilege checks. If these checks are successful, the CS register is loaded, program control is transferred to the new code segment, and program execution begins at the instruction pointed to by the EIP register.

Program control transfers are carried out with the JMP, CALL, RET, INT n, and IRET instructions, as well as by the exception and interrupt mechanisms. Exceptions, interrupts, and the IRET instruction are special cases discussed in Chapter 5, Interrupt and Exception Handling. This chapter discusses only the JMP, CALL, and RET instructions.

A JMP or CALL instruction can reference another code segment in any of four ways:

• The target operand contains the segment selector for the target code segment.

• The target operand points to a call-gate descriptor, which contains the segment selector for the target code segment.


• The target operand points to a TSS, which contains the segment selector for the target code segment.

• The target operand points to a task gate, which points to a TSS, which in turn contains the segment selector for the target code segment.

The following sections describe the first two types of references. Refer to Section 6.3., “Task Switching” in Chapter 6, Task Management for information on transferring program control through a task gate and/or TSS.

4.8.1. Direct Calls or Jumps to Code Segments

The near forms of the JMP, CALL, and RET instructions transfer program control within the current code segment, so privilege-level checks are not performed. The far forms of the JMP, CALL, and RET instructions transfer control to other code segments, so the processor does perform privilege-level checks.

When transferring program control to another code segment without going through a call gate, the processor examines four kinds of privilege level and type information (refer to Figure 4-5):

• The CPL. (Here, the CPL is the privilege level of the calling code segment; that is, the code segment that contains the procedure that is making the call or jump.)

• The DPL of the segment descriptor for the destination code segment that contains the called procedure.

• The RPL of the segment selector of the destination code segment.

• The conforming (C) flag in the segment descriptor for the destination code segment, which determines whether the segment is a conforming (C flag is set) or nonconforming (C flag is clear) code segment. (Refer to Section 3.4.3.1., “Code- and Data-Segment Descriptor Types” in Chapter 3, Protected-Mode Memory Management for more information about this flag.)

Figure 4-5. Privilege Check for Control Transfer Without Using a Gate [the CPL (from the CS register), the RPL (from the segment selector for the code segment), and the DPL and C flag (from the destination code-segment descriptor) feed the privilege check]

The rules that the processor uses to check the CPL, RPL, and DPL depend on the setting of the C flag, as described in the following sections.

4.8.1.1. ACCESSING NONCONFORMING CODE SEGMENTS

When accessing nonconforming code segments, the CPL of the calling procedure must be equal to the DPL of the destination code segment; otherwise, the processor generates a general-protection exception (#GP).

For example, in Figure 4-6, code segment C is a nonconforming code segment. Therefore, a procedure in code segment A can call a procedure in code segment C (using segment selector C1), because they are at the same privilege level (the CPL of code segment A is equal to the DPL of code segment C). However, a procedure in code segment B cannot call a procedure in code segment C (using segment selector C2 or C1), because the two code segments are at different privilege levels.

Figure 4-6. Examples of Accessing Conforming and Nonconforming Code Segments From Various Privilege Levels [code segment A runs at CPL=2 and code segment B at CPL=3; code segment C is a nonconforming code segment with DPL=2, reached through segment selectors C1 (RPL=2) and C2 (RPL=3); code segment D is a conforming code segment, reached through segment selectors D1 (RPL=2) and D2 (RPL=3)]


The RPL of the segment selector that points to a nonconforming code segment has a limited effect on the privilege check. The RPL must be numerically less than or equal to the CPL of the calling procedure for a successful control transfer to occur. So, in the example in Figure 4-6, the RPLs of segment selectors C1 and C2 could legally be set to 0, 1, or 2, but not to 3.

When the segment selector of a nonconforming code segment is loaded into the CS register, the privilege level field is not changed; that is, it remains at the CPL (which is the privilege level of the calling procedure). This is true, even if the RPL of the segment selector is different from the CPL.

4.8.1.2. ACCESSING CONFORMING CODE SEGMENTS

When accessing conforming code segments, the CPL of the calling procedure may be numerically equal to or greater than (less privileged than) the DPL of the destination code segment; the processor generates a general-protection exception (#GP) only if the CPL is less than the DPL. (The segment selector RPL for the destination code segment is not checked if the segment is a conforming code segment.)

In the example in Figure 4-6, code segment D is a conforming code segment. Therefore, calling procedures in both code segment A and B can access code segment D (using either segment selector D1 or D2, respectively), because they both have CPLs that are greater than or equal to the DPL of the conforming code segment. For conforming code segments, the DPL represents the numerically lowest privilege level that a calling procedure may be at to successfully make a call to the code segment.

(Note that segment selectors D1 and D2 are identical except for their respective RPLs. But since RPLs are not checked when accessing conforming code segments, the two segment selectors are essentially interchangeable.)

When program control is transferred to a conforming code segment, the CPL does not change, even if the DPL of the destination code segment is less than the CPL. This situation is the only one where the CPL may be different from the DPL of the current code segment. Also, since the CPL does not change, no stack switch occurs.

Conforming segments are used for code modules such as math libraries and exception handlers, which support applications but do not require access to protected system facilities. These modules are part of the operating system or executive software, but they can be executed at numerically higher privilege levels (less privileged levels). Keeping the CPL at the level of a calling code segment when switching to a conforming code segment prevents an application program from accessing nonconforming code segments while at the privilege level (DPL) of a conforming code segment and thus prevents it from accessing more privileged data.

Most code segments are nonconforming. For these segments, program control can be transferred only to code segments at the same level of privilege, unless the transfer is carried out through a call gate, as described in the following sections.
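The two sets of rules for direct far transfers (Sections 4.8.1.1. and 4.8.1.2.) can be summarized in one predicate. This is a sketch with a hypothetical function name; it models only the privilege comparison, not the limit and type checks, and in both cases a successful transfer leaves the CPL unchanged.

```c
#include <stdbool.h>

/* Privilege check for a far JMP or CALL made directly to a code segment
 * (no call gate). */
bool direct_transfer_ok(bool conforming, unsigned cpl, unsigned rpl,
                        unsigned dpl)
{
    if (conforming)
        return cpl >= dpl;            /* the RPL is not checked */
    return cpl == dpl && rpl <= cpl;  /* nonconforming */
}
```

For the nonconforming code segment C of Figure 4-6 (DPL 2), a caller at CPL 2 succeeds with an RPL of 0, 1, or 2, while a caller at CPL 3 fails with any selector.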


4.8.2. Gate Descriptors

To provide controlled access to code segments with different privilege levels, the processor provides a special set of descriptors called gate descriptors. There are four kinds of gate descriptors:

• Call gates

• Trap gates

• Interrupt gates

• Task gates

Task gates are used for task switching and are discussed in Chapter 6, Task Management. Trap and interrupt gates are special kinds of call gates used for calling exception and interrupt handlers. They are described in Chapter 5, Interrupt and Exception Handling. This chapter is concerned only with call gates.

4.8.3. Call Gates

Call gates facilitate controlled transfers of program control between different privilege levels. They are typically used only in operating systems or executives that use the privilege-level protection mechanism. Call gates are also useful for transferring program control between 16-bit and 32-bit code segments, as described in Section 17.4., “Transferring Control Among Mixed-Size Code Segments” in Chapter 17, Mixing 16-Bit and 32-Bit Code.

Figure 4-7 shows the format of a call-gate descriptor. A call-gate descriptor may reside in the GDT or in an LDT, but not in the interrupt descriptor table (IDT). It performs six functions:

• It specifies the code segment to be accessed.

• It defines an entry point for a procedure in the specified code segment.

• It specifies the privilege level required for a caller trying to access the procedure.

• If a stack switch occurs, it specifies the number of optional parameters to be copied between stacks.

• It defines the size of values to be pushed onto the target stack: 16-bit gates force 16-bit pushes and 32-bit gates force 32-bit pushes.

• It specifies whether the call-gate descriptor is valid.


The segment selector field in a call gate specifies the code segment to be accessed. The offset field specifies the entry point in the code segment. This entry point is generally to the first instruction of a specific procedure. The DPL field indicates the privilege level of the call gate, which in turn is the privilege level required to access the selected procedure through the gate. The P flag indicates whether the call-gate descriptor is valid. (The presence of the code segment to which the gate points is indicated by the P flag in the code segment’s descriptor.) The parameter count field indicates the number of parameters to copy from the calling procedure’s stack to the new stack if a stack switch occurs (refer to Section 4.8.5., “Stack Switching”). The parameter count specifies the number of words for 16-bit call gates and doublewords for 32-bit call gates.
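The fields described above can be pulled out of the two doublewords of a 32-bit call-gate descriptor as follows. This is a sketch that assumes the bit positions of Figure 4-7 (low doubleword: segment selector in bits 31:16 and offset 15:00 in bits 15:0; high doubleword: offset 31:16 in bits 31:16, P in bit 15, DPL in bits 14:13, type in bits 11:8, parameter count in bits 4:0). The struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint16_t selector;     /* code segment to be accessed */
    uint32_t offset;       /* entry point in that segment */
    unsigned param_count;  /* parameters copied on a stack switch */
    unsigned type;         /* 4-bit type field */
    unsigned dpl;          /* privilege level of the gate */
    bool     present;      /* P flag: gate descriptor is valid */
} call_gate_t;

/* Decode a call-gate descriptor from its low (offset 0) and high
 * (offset 4) doublewords. */
call_gate_t decode_call_gate(uint32_t low, uint32_t high)
{
    call_gate_t g;
    g.selector    = (uint16_t)(low >> 16);
    g.offset      = (high & 0xFFFF0000u) | (low & 0x0000FFFFu);
    g.param_count = high & 0x1Fu;
    g.type        = (high >> 8) & 0xFu;
    g.dpl         = (high >> 13) & 0x3u;
    g.present     = ((high >> 15) & 1u) != 0;
    return g;
}
```

For example, the pair (00081234H, 0040EC03H) decodes to selector 0008H, offset 00401234H, a present gate with DPL 3, type 1100B, and a parameter count of 3.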

Note that the P flag in a gate descriptor is normally always set to 1. If it is set to 0, a not-present (#NP) exception is generated when a program attempts to access the descriptor. The operating system can use the P flag for special purposes. For example, it could be used to track the number of times the gate is used. Here, the P flag is initially set to 0, causing a trap to the not-present exception handler. The exception handler then increments a counter and sets the P flag to 1, so that on returning from the handler, the gate descriptor will be valid.

4.8.4. Accessing a Code Segment Through a Call Gate

To access a call gate, a far pointer to the gate is provided as a target operand in a CALL or JMP instruction. The segment selector from this pointer identifies the call gate (refer to Figure 4-8); the offset from the pointer is required, but not used or checked by the processor. (The offset can be set to any value.)

When the processor has accessed the call gate, it uses the segment selector from the call gate to locate the segment descriptor for the destination code segment. (This segment descriptor can be in the GDT or the LDT.) It then combines the base address from the code-segment descriptor with the offset from the call gate to form the linear address of the procedure entry point in the code segment.

Figure 4-7. Call-Gate Descriptor

Doubleword at offset 4:
  Bits 31:16  Offset in Segment 31:16
  Bit 15      P (Gate Valid)
  Bits 14:13  DPL (Descriptor Privilege Level)
  Bit 12      0
  Bits 11:8   Type (1100)
  Bits 7:5    0 0 0
  Bits 4:0    Param. Count

Doubleword at offset 0:
  Bits 31:16  Segment Selector
  Bits 15:0   Offset in Segment 15:00

As shown in Figure 4-9, four different privilege levels are used to check the validity of a program control transfer through a call gate:

• The CPL (current privilege level).

• The RPL (requestor's privilege level) of the call gate’s selector.

• The DPL (descriptor privilege level) of the call gate descriptor.

• The DPL of the segment descriptor of the destination code segment.

The C flag (conforming) in the segment descriptor for the destination code segment is also checked.

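Figure 4-8. Call-Gate Mechanism [the segment selector of the far pointer selects the call-gate descriptor in a descriptor table (the offset of the far pointer is required but not used by the processor); the segment selector in the call gate selects the code-segment descriptor; the base from the code-segment descriptor plus the offset from the call gate gives the procedure entry point]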


The privilege checking rules are different depending on whether the control transfer was initiated with a CALL or a JMP instruction, as shown in Table 4-1.

The DPL field of the call-gate descriptor specifies the numerically highest privilege level from which a calling procedure can access the call gate; that is, to access a call gate, the CPL of a calling procedure must be equal to or less than the DPL of the call gate. For example, in Figure 4-12, call gate A has a DPL of 3. So calling procedures at all CPLs (0 through 3) can access this call gate, which includes calling procedures in code segments A, B, and C. Call gate B has a DPL of 2, so only calling procedures at a CPL of 0, 1, or 2 can access call gate B, which includes calling procedures in code segments B and C. The dotted line shows that a calling procedure in code segment A cannot access call gate B.

Figure 4-9. Privilege Check for Control Transfer with Call Gate

Table 4-1. Privilege Check Rules for Call Gates

Instruction    Privilege Check Rules

CALL           CPL ≤ call gate DPL; RPL ≤ call gate DPL
               Destination conforming code segment DPL ≤ CPL
               Destination nonconforming code segment DPL ≤ CPL

JMP            CPL ≤ call gate DPL; RPL ≤ call gate DPL
               Destination conforming code segment DPL ≤ CPL
               Destination nonconforming code segment DPL = CPL

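[Figure 4-9 labels: the CPL (from the CS register), the RPL (from the call-gate selector), the DPL of the call gate (descriptor), and the DPL of the destination code-segment descriptor are the inputs to the privilege check.]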


The RPL of the segment selector to a call gate must satisfy the same test as the CPL of the calling procedure; that is, the RPL must be less than or equal to the DPL of the call gate. In the example in Figure 4-10, a calling procedure in code segment C can access call gate B using gate selector B2 or B1, but it could not use gate selector B3 to access call gate B.

If the privilege checks between the calling procedure and call gate are successful, the processor then checks the DPL of the code-segment descriptor against the CPL of the calling procedure. Here, the privilege check rules vary between CALL and JMP instructions. Only CALL instructions can use call gates to transfer program control to more privileged (numerically lower privilege level) nonconforming code segments; that is, to nonconforming code segments with a DPL less than the CPL. A JMP instruction can use a call gate only to transfer program control to a nonconforming code segment with a DPL equal to the CPL. CALL and JMP instructions can both transfer program control to a more privileged conforming code segment; that is, to a conforming code segment with a DPL less than or equal to the CPL.

If a call is made to a more privileged (numerically lower privilege level) nonconforming destination code segment, the CPL is lowered to the DPL of the destination code segment and a stack switch occurs (refer to Section 4.8.5., “Stack Switching”). If a call or jump is made to a more privileged conforming destination code segment, the CPL is not changed and no stack switch occurs.
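The rules in Table 4-1 can be expressed as a short C sketch. This is only an illustrative model of the checks described above, not a description of how the processor implements them; the type and function names are hypothetical.

```c
#include <stdbool.h>

/* Illustrative names; none of these identifiers come from the manual. */
enum transfer_kind { TRANSFER_CALL, TRANSFER_JMP };

/* Returns true if a CALL or JMP through a call gate passes the
   privilege checks of Table 4-1. Privilege levels run from 0 (most
   privileged) to 3 (least privileged). */
bool call_gate_checks_pass(enum transfer_kind kind,
                           unsigned cpl, unsigned gate_selector_rpl,
                           unsigned gate_dpl,
                           unsigned dest_dpl, bool dest_conforming)
{
    /* CPL and RPL must both be numerically <= the call gate's DPL. */
    if (cpl > gate_dpl || gate_selector_rpl > gate_dpl)
        return false;

    /* Conforming destinations, and nonconforming destinations reached
       with CALL, may be equally or more privileged than the caller. */
    if (dest_conforming || kind == TRANSFER_CALL)
        return dest_dpl <= cpl;

    /* JMP to a nonconforming destination: same privilege level only. */
    return dest_dpl == cpl;
}

/* CPL in effect after a transfer that passed the checks. Only a
   nonconforming destination changes the CPL (to its DPL); when that
   lowers the CPL, a stack switch also occurs. */
unsigned cpl_after_transfer(unsigned cpl, unsigned dest_dpl,
                            bool dest_conforming)
{
    return dest_conforming ? cpl : dest_dpl;
}
```

With the values from Figure 4-10, a caller in code segment C (CPL = 1) passes the gate check for call gate B (DPL = 2) when it uses gate selector B1 (RPL = 2) and fails it when it uses gate selector B3 (RPL = 3).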

Figure 4-10. Example of Accessing Call Gates At Various Privilege Levels

[Figure labels: privilege levels 3 (lowest privilege) through 0 (highest privilege); Code Segment A (CPL=3), Code Segment B (CPL=2), Code Segment C (CPL=1); Gate Selector A (RPL=3), Gate Selector B1 (RPL=2), Gate Selector B2 (RPL=1), Gate Selector B3 (RPL=3); Call Gate A (DPL=3), Call Gate B (DPL=2); destination Code Segments D and E (both DPL=0), one a nonconforming code segment (stack switch occurs) and one a conforming code segment (no stack switch occurs).]


Call gates allow a single code segment to have procedures that can be accessed at different privilege levels. For example, an operating system located in a code segment may have some services which are intended to be used by both the operating system and application software (such as procedures for handling character I/O). Call gates for these procedures can be set up that allow access at all privilege levels (0 through 3). More privileged call gates (with DPLs of 0 or 1) can then be set up for other operating system services that are intended to be used only by the operating system (such as procedures that initialize device drivers).

4.8.5. Stack Switching

Whenever a call gate is used to transfer program control to a more privileged nonconforming code segment (that is, when the DPL of the nonconforming destination code segment is less than the CPL), the processor automatically switches to the stack for the destination code segment’s privilege level. This stack switching is carried out to prevent more privileged procedures from crashing due to insufficient stack space. It also prevents less privileged procedures from interfering (by accident or intent) with more privileged procedures through a shared stack.

Each task must define up to 4 stacks: one for applications code (running at privilege level 3) and one for each of the privilege levels 2, 1, and 0 that are used. (If only two privilege levels are used [3 and 0], then only two stacks must be defined.) Each of these stacks is located in a separate segment and is identified with a segment selector and an offset into the stack segment (a stack pointer).

The segment selector and stack pointer for the privilege level 3 stack are located in the SS and ESP registers, respectively, when privilege-level-3 code is being executed and are automatically stored on the called procedure’s stack when a stack switch occurs.

Pointers to the privilege level 0, 1, and 2 stacks are stored in the TSS for the currently running task (refer to Figure 6-2 in Chapter 6, Task Management). Each of these pointers consists of a segment selector and a stack pointer (loaded into the ESP register). These initial pointers are strictly read-only values. The processor does not change them while the task is running. They are used only to create new stacks when calls are made to more privileged levels (numerically lower privilege levels). These stacks are disposed of when a return is made from the called procedure. The next time the procedure is called, a new stack is created using the initial stack pointer. (The TSS does not specify a stack for privilege level 3 because the processor does not allow a transfer of program control from a procedure running at a CPL of 0, 1, or 2 to a procedure running at a CPL of 3, except on a return.)

The operating system is responsible for creating stacks and stack-segment descriptors for all the privilege levels to be used and for loading initial pointers for these stacks into the TSS. Each stack must be read/write accessible (as specified in the type field of its segment descriptor) and must contain enough space (as specified in the limit field) to hold the following items:

• The contents of the SS, ESP, CS, and EIP registers for the calling procedure.

• The parameters and temporary variables required by the called procedure.

• The EFLAGS register and error code, when implicit calls are made to an exception or interrupt handler.


The stack needs enough space to contain many frames of these items, because procedures often call other procedures, and an operating system may support nesting of multiple interrupts. Each stack should be large enough to allow for the worst case nesting scenario at its privilege level.

(If the operating system does not use the processor’s multitasking mechanism, it still must create at least one TSS for this stack-related purpose.)

When a procedure call through a call gate results in a change in privilege level, the processor performs the following steps to switch stacks and begin execution of the called procedure at a new privilege level:

1. Uses the DPL of the destination code segment (the new CPL) to select a pointer to the new stack (segment selector and stack pointer) from the TSS.

2. Reads the segment selector and stack pointer for the stack to be switched to from the current TSS. Any limit violations detected while reading the stack-segment selector, stack pointer, or stack-segment descriptor cause an invalid TSS (#TS) exception to be generated.

3. Checks the stack-segment descriptor for the proper privileges and type and generates an invalid TSS (#TS) exception if violations are detected.

4. Temporarily saves the current values of the SS and ESP registers.

5. Loads the segment selector and stack pointer for the new stack in the SS and ESP registers.

6. Pushes the temporarily saved values for the SS and ESP registers (for the calling procedure) onto the new stack (refer to Figure 4-11).

7. Copies the number of parameters specified in the parameter count field of the call gate from the calling procedure’s stack to the new stack. If the count is 0, no parameters are copied.

8. Pushes the return instruction pointer (the current contents of the CS and EIP registers) onto the new stack.

9. Loads the segment selector for the new code segment and the new instruction pointer from the call gate into the CS and EIP registers, respectively, and begins execution of the called procedure.

Refer to the description of the CALL instruction in Chapter 3, Instruction Set Reference, in the Intel Architecture Software Developer’s Manual, Volume 2, for a detailed description of the privilege level checks and other protection checks that the processor performs on a far call through a call gate.
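The stack-switch steps listed above can be traced with a small C model. This is a simplified sketch under stated assumptions, not the processor's implementation: memory is a flat array of 32-bit slots, ESP is an index into that array rather than a byte address, the limit, type, and privilege checks of steps 2 and 3 are omitted, and all structure and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical model types; not taken from the manual. */
struct stack_ptr  { uint16_t ss; uint32_t esp; };
struct tss_stacks { struct stack_ptr level[3]; };  /* privilege levels 0, 1, 2 */
struct cpu_state  { uint16_t cs, ss; uint32_t eip, esp; unsigned cpl; };

/* Push one 32-bit slot; the stack grows toward lower indices. */
static void push32(uint32_t *mem, uint32_t *esp, uint32_t value)
{
    *esp -= 1;
    mem[*esp] = value;
}

/* Models an interprivilege-level call through a call gate.
   new_cpl must be 0, 1, or 2. */
void call_gate_stack_switch(struct cpu_state *cpu, uint32_t *mem,
                            const struct tss_stacks *tss,
                            unsigned new_cpl, unsigned param_count,
                            uint16_t new_cs, uint32_t new_eip)
{
    /* Steps 1-2: select the stack pointer for the new CPL from the TSS.
       Step 3 (descriptor checks, #TS on failure) is not modeled. */
    struct stack_ptr new_stack = tss->level[new_cpl];

    /* Step 4: temporarily save the calling procedure's SS and ESP. */
    uint16_t old_ss  = cpu->ss;
    uint32_t old_esp = cpu->esp;

    /* Step 5: switch to the new stack. */
    cpu->ss  = new_stack.ss;
    cpu->esp = new_stack.esp;

    /* Step 6: push the calling procedure's SS and ESP. */
    push32(mem, &cpu->esp, old_ss);
    push32(mem, &cpu->esp, old_esp);

    /* Step 7: copy the parameters, highest slot first, so they keep
       the same order on the new stack. */
    for (unsigned i = param_count; i > 0; i--)
        push32(mem, &cpu->esp, mem[old_esp + i - 1]);

    /* Step 8: push the return instruction pointer (CS, then EIP). */
    push32(mem, &cpu->esp, cpu->cs);
    push32(mem, &cpu->esp, cpu->eip);

    /* Step 9: load CS and EIP from the call gate; the CPL is now the
       DPL of the destination code segment. */
    cpu->cs  = new_cs;
    cpu->eip = new_eip;
    cpu->cpl = new_cpl;
}
```

After the call, the new stack holds the calling SS, calling ESP, the copied parameters, the calling CS, and the calling EIP, in that order from higher to lower slots, with ESP referring to the calling EIP.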


The parameter count field in a call gate specifies the number of data items (up to 31) that the processor should copy from the calling procedure’s stack to the stack of the called procedure. If more than 31 data items need to be passed to the called procedure, one of the parameters can be a pointer to a data structure, or the saved contents of the SS and ESP registers may be used to access parameters in the old stack space. The size of the data items passed to the called procedure depends on the call gate size, as described in Section 4.8.3., “Call Gates”.

4.8.6. Returning from a Called Procedure

The RET instruction can be used to perform a near return, a far return at the same privilege level, and a far return to a different privilege level. This instruction is intended to execute returns from procedures that were called with a CALL instruction. It does not support returns from a JMP instruction, because the JMP instruction does not save a return instruction pointer on the stack.

A near return only transfers program control within the current code segment; therefore, the processor performs only a limit check. When the processor pops the return instruction pointer from the stack into the EIP register, it checks that the pointer does not exceed the limit of the current code segment.

On a far return at the same privilege level, the processor pops both a segment selector for the code segment being returned to and a return instruction pointer from the stack. Under normal conditions, these pointers should be valid, because they were pushed on the stack by the CALL instruction. However, the processor performs privilege checks to detect situations where the current procedure might have altered the pointer or failed to maintain the stack properly.

Figure 4-11. Stack Switching During an Interprivilege-Level Call

[Figure: the calling procedure’s stack holds Parameter 1, Parameter 2, and Parameter 3, with ESP at Parameter 3. The called procedure’s stack holds, in order, Calling SS, Calling ESP, Parameter 1, Parameter 2, Parameter 3, Calling CS, and Calling EIP, with ESP at Calling EIP.]


A far return that requires a privilege-level change is only allowed when returning to a less privileged level (that is, the DPL of the return code segment is numerically greater than the CPL). The processor uses the RPL field from the CS register value saved for the calling procedure (refer to Figure 4-11) to determine if a return to a numerically higher privilege level is required. If the RPL is numerically greater (less privileged) than the CPL, a return across privilege levels occurs.

The processor performs the following steps when performing a far return to a calling procedure (refer to Figures 4-2 and 4-4 in the Intel Architecture Software Developer’s Manual, Volume 1, for an illustration of the stack contents prior to and after a return):

1. Checks the RPL field of the saved CS register value to determine if a privilege level change is required on the return.

2. Loads the CS and EIP registers with the values on the called procedure’s stack. (Type and privilege level checks are performed on the code-segment descriptor and RPL of the code-segment selector.)

3. (If the RET instruction includes a parameter count operand and the return requires a privilege level change.) Adds the parameter count (in bytes obtained from the RET instruction) to the current ESP register value (after popping the CS and EIP values), to step past the parameters on the called procedure’s stack. The resulting value in the ESP register points to the saved SS and ESP values for the calling procedure’s stack. (Note that the byte count in the RET instruction must be chosen to match the parameter count in the call gate that the calling procedure referenced when it made the original call multiplied by the size of the parameters.)

4. (If the return requires a privilege level change.) Loads the SS and ESP registers with the saved SS and ESP values and switches back to the calling procedure’s stack. The SS and ESP values for the called procedure’s stack are discarded. Any limit violations detected while loading the stack-segment selector or stack pointer cause a general-protection exception (#GP) to be generated. The new stack-segment descriptor is also checked for type and privilege violations.

5. (If the RET instruction includes a parameter count operand.) Adds the parameter count (in bytes obtained from the RET instruction) to the current ESP register value, to step past the parameters on the calling procedure’s stack. The resulting ESP value is not checked against the limit of the stack segment. If the ESP value is beyond the limit, that fact is not recognized until the next stack operation.

6. (If the return requires a privilege level change.) Checks the contents of the DS, ES, FS, and GS segment registers. If any of these registers refer to segments whose DPL is less than the new CPL (excluding conforming code segments), the segment register is loaded with a null segment selector.

Refer to the description of the RET instruction in Chapter 3, Instruction Set Reference, of the Intel Architecture Software Developer’s Manual, Volume 2, for a detailed description of the privilege level checks and other protection checks that the processor performs on a far return.
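Steps 1 and 6 of the far-return sequence lend themselves to a brief C sketch. It is illustrative only: the `seg_reg` structure is a hypothetical stand-in for a segment register and the descriptor information cached with it, and the remaining steps (stack handling and descriptor checks) are not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* The RPL occupies the low two bits of a segment selector. */
unsigned selector_rpl(uint16_t selector)
{
    return selector & 3u;
}

/* Step 1: a return crosses privilege levels when the RPL of the saved
   CS value is numerically greater than the current CPL. */
bool far_return_changes_privilege(uint16_t saved_cs, unsigned cpl)
{
    return selector_rpl(saved_cs) > cpl;
}

/* Hypothetical view of DS, ES, FS, or GS. */
struct seg_reg {
    uint16_t selector;
    unsigned dpl;              /* DPL of the segment it refers to */
    bool     conforming_code;  /* true for a conforming code segment */
};

/* Step 6: after a return to new_cpl, load a null selector into every
   register that refers to a segment with DPL < new CPL, except
   conforming code segments. Returns the number of registers cleared. */
unsigned clear_inaccessible_segments(struct seg_reg regs[4], unsigned new_cpl)
{
    unsigned cleared = 0;
    for (int i = 0; i < 4; i++) {
        bool is_null = (regs[i].selector & 0xFFFCu) == 0;
        if (!is_null && !regs[i].conforming_code && regs[i].dpl < new_cpl) {
            regs[i].selector = 0;
            cleared++;
        }
    }
    return cleared;
}
```

The clearing step keeps a less privileged procedure from inheriting segment registers that still refer to more privileged data after the return.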


4.9. PRIVILEGED INSTRUCTIONS

Some of the system instructions (called “privileged instructions”) are protected from use by application programs. The privileged instructions control system functions (such as the loading of system registers). They can be executed only when the CPL is 0 (most privileged). If one of these instructions is executed when the CPL is not 0, a general-protection exception (#GP) is generated. The following system instructions are privileged instructions:

• LGDT—Load GDT register.

• LLDT—Load LDT register.

• LTR—Load task register.

• LIDT—Load IDT register.

• MOV (control registers)—Load and store control registers.

• LMSW—Load machine status word.

• CLTS—Clear task-switched flag in register CR0.

• MOV (debug registers)—Load and store debug registers.

• INVD—Invalidate cache, without writeback.

• WBINVD—Invalidate cache, with writeback.

• INVLPG—Invalidate TLB entry.

• HLT—Halt processor.

• RDMSR—Read Model-Specific Registers.

• WRMSR—Write Model-Specific Registers.

• RDPMC—Read Performance-Monitoring Counter.

• RDTSC—Read Time-Stamp Counter.

Some of the privileged instructions are available only in the more recent families of Intel Architecture processors (refer to Section 18.7., “New Instructions In the Pentium® and Later Intel Architecture Processors”, in Chapter 18, Intel Architecture Compatibility).

The PCE and TSD flags in register CR4 (bits 8 and 2, respectively) control whether the RDPMC and RDTSC instructions, respectively, can be executed at any CPL: RDPMC can be executed at any CPL when the PCE flag is set, and RDTSC can be executed at any CPL when the TSD flag is clear.
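The effect of the two CR4 flags can be written as a pair of C predicates. This is a sketch under the assumption that TSD is bit 2 and PCE is bit 8 of CR4, that a set TSD flag restricts RDTSC to CPL 0, and that a set PCE flag opens RDPMC to all privilege levels; the macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed CR4 bit positions (see the lead-in above). */
#define CR4_TSD (1u << 2)   /* time-stamp disable */
#define CR4_PCE (1u << 8)   /* performance-monitoring counter enable */

/* RDTSC: always allowed at CPL 0; allowed at other CPLs only while
   the TSD flag is clear. */
bool rdtsc_allowed(uint32_t cr4, unsigned cpl)
{
    return cpl == 0 || (cr4 & CR4_TSD) == 0;
}

/* RDPMC: always allowed at CPL 0; allowed at other CPLs only while
   the PCE flag is set. */
bool rdpmc_allowed(uint32_t cr4, unsigned cpl)
{
    return cpl == 0 || (cr4 & CR4_PCE) != 0;
}
```

When either predicate is false, executing the corresponding instruction results in a general-protection exception (#GP), as for the other privileged instructions.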

4.10. POINTER VALIDATION

When operating in protected mode, the processor validates all pointers to enforce protection between segments and maintain isolation between privilege levels. Pointer validation consists of the following checks:

1. Checking access rights to determine if the segment type is compatible with its use.

2. Checking read/write rights.


3. Checking if the pointer offset exceeds the segment limit.

4. Checking if the supplier of the pointer is allowed to access the segment.

5. Checking the offset alignment.

The processor automatically performs the first, second, and third checks during instruction execution. Software must explicitly request the fourth check by issuing an ARPL instruction. The fifth check (offset alignment) is performed automatically at privilege level 3 if alignment checking is turned on. Offset alignment does not affect isolation of privilege levels.

4.10.1. Checking Access Rights (LAR Instruction)

When the processor accesses a segment using a far pointer, it performs an access rights check on the segment descriptor pointed to by the far pointer. This check is performed to determine if the type and privilege level (DPL) of the segment descriptor are compatible with the operation to be performed. For example, when making a far call in protected mode, the segment-descriptor type must be for a conforming or nonconforming code segment, a call gate, a task gate, or a TSS. Then, if the call is to a nonconforming code segment, the DPL of the code segment must be equal to the CPL, and the RPL of the code segment’s segment selector must be less than or equal to the DPL. If the type or privilege level is found to be incompatible, the appropriate exception is generated.

To prevent type incompatibility exceptions from being generated, software can check the access rights of a segment descriptor using the LAR (load access rights) instruction. The LAR instruction specifies the segment selector for the segment descriptor whose access rights are to be checked and a destination register. The instruction then performs the following operations:

1. Checks that the segment selector is not null.

2. Checks that the segment selector points to a segment descriptor that is within the descriptor table limit (GDT or LDT).

3. Checks that the segment descriptor is a code, data, LDT, call gate, task gate, or TSS segment-descriptor type.

4. If the segment is not a conforming code segment, checks if the segment descriptor is visible at the CPL (that is, if the CPL and the RPL of the segment selector are less than or equal to the DPL).

5. If the privilege level and type checks pass, loads the second doubleword of the segment descriptor into the destination register (masked by the value 00FXFF00H, where X indicates that the corresponding 4 bits are undefined) and sets the ZF flag in the EFLAGS register. If the segment selector is not visible at the current privilege level or is an invalid type for the LAR instruction, the instruction does not modify the destination register and clears the ZF flag.

Once the access rights are loaded in the destination register, software can perform additional checks on the access rights information.
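As an illustration of the additional checks software might perform, the following C sketch applies the 00FXFF00H mask to the second doubleword of a descriptor and splits the result into fields. The field positions follow the segment-descriptor layout (type in bits 11:8, S in bit 12, DPL in bits 14:13, P in bit 15, AVL in bit 20, D/B in bit 22, G in bit 23); the structure and function names are hypothetical, and the undefined nibble (X) is simply cleared here.

```c
#include <stdbool.h>
#include <stdint.h>

/* 00FXFF00H with the undefined bits (X) treated as zero. */
#define LAR_MASK 0x00F0FF00u

/* Hypothetical decoded view of the access-rights value. */
struct access_rights {
    unsigned type;     /* bits 11:8 */
    bool     s;        /* bit 12: 1 = code or data, 0 = system */
    unsigned dpl;      /* bits 14:13 */
    bool     present;  /* bit 15 */
    bool     avl;      /* bit 20 */
    bool     db;       /* bit 22 */
    bool     g;        /* bit 23 */
};

/* Value placed in the destination register when the checks pass. */
uint32_t lar_value(uint32_t descriptor_high)
{
    return descriptor_high & LAR_MASK;
}

struct access_rights decode_access_rights(uint32_t lar)
{
    struct access_rights ar;
    ar.type    = (lar >> 8) & 0xFu;
    ar.s       = ((lar >> 12) & 1u) != 0;
    ar.dpl     = (lar >> 13) & 3u;
    ar.present = ((lar >> 15) & 1u) != 0;
    ar.avl     = ((lar >> 20) & 1u) != 0;
    ar.db      = ((lar >> 22) & 1u) != 0;
    ar.g       = ((lar >> 23) & 1u) != 0;
    return ar;
}
```

For example, a descriptor whose second doubleword is 00CF9A00H yields an access-rights value of 00C09A00H: a present code segment (type 1010B) with DPL 0 and the G and D/B flags set.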


4.10.2. Checking Read/Write Rights (VERR and VERW Instructions)

When the processor accesses any code or data segment it checks the read/write privileges assigned to the segment to verify that the intended read or write operation is allowed. Software can check read/write rights using the VERR (verify for reading) and VERW (verify for writing) instructions. Both these instructions specify the segment selector for the segment being checked. The instructions then perform the following operations:

1. Checks that the segment selector is not null.

2. Checks that the segment selector points to a segment descriptor that is within the descriptor table limit (GDT or LDT).

3. Checks that the segment descriptor is a code or data-segment descriptor type.

4. If the segment is not a conforming code segment, checks if the segment descriptor is visible at the CPL (that is, if the CPL and the RPL of the segment selector are less than or equal to the DPL).

5. Checks that the segment is readable (for the VERR instruction) or writable (for the VERW instruction).

The VERR instruction sets the ZF flag in the EFLAGS register if the segment is visible at the CPL and readable; the VERW instruction sets the ZF flag if the segment is visible and writable. (Code segments are never writable.) The ZF flag is cleared if any of these checks fail.
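The five operations above can be modeled in C as follows. This is a sketch, not the instructions' actual implementation: the descriptor lookup is assumed to have been done already and its outcome is passed in through a hypothetical `seg_desc_info` structure, and the return value stands for the resulting ZF flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of the descriptor a selector refers to. */
struct seg_desc_info {
    bool     in_table;        /* selector lies within the GDT/LDT limit */
    bool     is_code_or_data; /* code or data segment-descriptor type */
    bool     is_code;
    bool     conforming;      /* meaningful only for code segments */
    bool     readable;        /* data segments are always readable */
    bool     writable;        /* code segments are never writable */
    unsigned dpl;
};

/* Returns the ZF result of VERR (for_write == false) or VERW
   (for_write == true) for the given selector at the given CPL. */
bool verify_segment(uint16_t selector, const struct seg_desc_info *d,
                    unsigned cpl, bool for_write)
{
    unsigned rpl = selector & 3u;

    if ((selector & 0xFFFCu) == 0)   /* 1. null selector */
        return false;
    if (!d->in_table)                /* 2. outside the descriptor table */
        return false;
    if (!d->is_code_or_data)         /* 3. wrong descriptor type */
        return false;

    /* 4. visibility check, skipped for conforming code segments */
    bool conforming_code = d->is_code && d->conforming;
    if (!conforming_code && (cpl > d->dpl || rpl > d->dpl))
        return false;

    /* 5. readable (VERR) or writable (VERW) */
    if (for_write)
        return !d->is_code && d->writable;
    return d->readable;
}
```

A caller at CPL 3 therefore gets ZF set for a writable DPL-3 data segment, and ZF clear for any DPL-0 data segment, regardless of the RPL it places in the selector.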


4.10.3. Checking That the Pointer Offset Is Within Limits (LSL Instruction)

When the processor accesses any segment it performs a limit check to ensure that the offset is within the limit of the segment. Software can perform this limit check using the LSL (load segment limit) instruction. Like the LAR instruction, the LSL instruction specifies the segment selector for the segment descriptor whose limit is to be checked and a destination register. The instruction then performs the following operations:

1. Checks that the segment selector is not null.

2. Checks that the segment selector points to a segment descriptor that is within the descriptor table limit (GDT or LDT).

3. Checks that the segment descriptor is a code, data, LDT, or TSS segment-descriptor type.

4. If the segment is not a conforming code segment, checks if the segment descriptor is visible at the CPL (that is, if the CPL and the RPL of the segment selector are less than or equal to the DPL).

5. If the privilege level and type checks pass, loads the unscrambled limit (the limit scaled according to the setting of the G flag in the segment descriptor) into the destination register and sets the ZF flag in the EFLAGS register. If the segment selector is not visible at the current privilege level or is an invalid type for the LSL instruction, the instruction does not modify the destination register and clears the ZF flag.

Once the limit is loaded in the destination register, software can compare the segment limit with the offset of a pointer.
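The “unscrambled” limit and the follow-up comparison can be sketched in C. The sketch assumes the usual descriptor layout (limit bits 15:0 in the first doubleword, limit bits 19:16 and the G flag in bit 23 of the second doubleword) and an expand-up segment; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Byte-granular limit of a segment, as the LSL instruction reports it. */
uint32_t unscrambled_limit(uint32_t desc_low, uint32_t desc_high)
{
    /* Assemble the 20-bit limit field from its two pieces. */
    uint32_t raw = (desc_low & 0xFFFFu) | (desc_high & 0x000F0000u);

    /* G flag set: the limit counts 4-KByte units, so the low 12 bits
       of the byte limit are all ones. */
    if (desc_high & (1u << 23))
        return (raw << 12) | 0xFFFu;
    return raw;
}

/* True if an access of `size` bytes starting at `offset` stays within
   the limit of an expand-up segment. Written to avoid overflow. */
bool access_within_limit(uint32_t offset, uint32_t size, uint32_t limit)
{
    return size != 0 && offset <= limit && size - 1 <= limit - offset;
}
```

For a descriptor with a limit field of FFFFFH and the G flag set, the unscrambled limit is FFFFFFFFH, so every 32-bit offset is within the segment.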

4.10.4. Checking Caller Access Privileges (ARPL Instruction)

The requestor’s privilege level (RPL) field of a segment selector is intended to carry the privilege level of a calling procedure (the calling procedure’s CPL) to a called procedure. The called procedure then uses the RPL to determine if access to a segment is allowed. The RPL is said to “weaken” the privilege level of the called procedure to that of the RPL.

Operating-system procedures typically use the RPL to prevent less privileged application programs from accessing data located in more privileged segments. When an operating-system procedure (the called procedure) receives a segment selector from an application program (the calling procedure), it sets the segment selector’s RPL to the privilege level of the calling procedure. Then, when the operating system uses the segment selector to access its associated segment, the processor performs privilege checks using the calling procedure’s privilege level (stored in the RPL) rather than the numerically lower privilege level (the CPL) of the operating-system procedure. The RPL thus ensures that the operating system does not access a segment on behalf of an application program unless that program itself has access to the segment.

Figure 4-12 shows an example of how the processor uses the RPL field. In this example, an application program (located in code segment A) possesses a segment selector (segment selector D1) that points to a privileged data structure (that is, a data structure located in a data segment D at privilege level 0). The application program cannot access data segment D, because it does not have sufficient privilege, but the operating system (located in code segment C) can. So, in an attempt to access data segment D, the application program executes a call to the operating system and passes segment selector D1 to the operating system as a parameter on the stack. Before passing the segment selector, the (well behaved) application program sets the RPL of the segment selector to its current privilege level (which in this example is 3). If the operating system attempts to access data segment D using segment selector D1, the processor compares the CPL (which is now 0 following the call), the RPL of segment selector D1, and the DPL of data segment D (which is 0). Since the RPL is greater than the DPL, access to data segment D is denied. The processor’s protection mechanism thus protects data segment D from access by the operating system, because the application program’s privilege level (represented by the RPL of segment selector D1) is greater than the DPL of data segment D.

Now assume that instead of setting the RPL of the segment selector to 3, the application program sets the RPL to 0 (segment selector D2). The operating system can now access data segment D, because its CPL and the RPL of segment selector D2 are both equal to the DPL of data segment D. Because the application program is able to change the RPL of a segment selector to any value, it can potentially use a procedure operating at a numerically lower privilege level to access a protected data structure. This ability to lower the RPL of a segment selector breaches the processor’s protection mechanism.

Figure 4-12. Use of RPL to Weaken Privilege Level of Called Procedure

[Figure labels: privilege levels 3 (lowest privilege) through 0 (highest privilege); Application Program in Code Segment A (CPL=3); Gate Selector B (RPL=3) and Call Gate B (DPL=3); Operating System in Code Segment C (DPL=0); Data Segment D (DPL=0); Segment Sel. D1 (RPL=3), passed as a parameter on the stack, access not allowed; Segment Sel. D2 (RPL=0), passed as a parameter on the stack, access allowed.]

Because a called procedure cannot rely on the calling procedure to set the RPL correctly, operating-system procedures (executing at numerically lower privilege-levels) that receive segment selectors from numerically higher privilege-level procedures need to test the RPL of the segment selector to determine if it is at the appropriate level. The ARPL (adjust requested privilege level) instruction is provided for this purpose. This instruction adjusts the RPL of one segment selector to match that of another segment selector.

The example in Figure 4-12 demonstrates how the ARPL instruction is intended to be used. When the operating-system receives segment selector D2 from the application program, it uses the ARPL instruction to compare the RPL of the segment selector with the privilege level of the application program (represented by the code-segment selector pushed onto the stack). If the RPL is less than the application program’s privilege level, the ARPL instruction changes the RPL of the segment selector to match the privilege level of the application program (segment selector D1). Using this instruction thus prevents a procedure running at a numerically higher privilege level from accessing numerically lower privilege-level (more privileged) segments by lowering the RPL of a segment selector.

Note that the privilege level of the application program can be determined by reading the RPL field of the segment selector for the application-program’s code segment. This segment selector is stored on the stack as part of the call to the operating system. The operating system can copy the segment selector from the stack into a register for use as an operand for the ARPL instruction.
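The adjustment that the ARPL instruction performs can be modeled in a few lines of C. This is an illustrative sketch: the function name is hypothetical, and the ZF behavior (set when the RPL was raised, cleared otherwise) is taken from the ARPL instruction description rather than from the text above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models ARPL dest, src. If the RPL of dest is numerically less than
   the RPL of src, the RPL of dest is raised to match and *zf is set;
   otherwise dest is returned unchanged and *zf is cleared. */
uint16_t arpl(uint16_t dest, uint16_t src, bool *zf)
{
    unsigned dest_rpl = dest & 3u;
    unsigned src_rpl  = src & 3u;

    if (dest_rpl < src_rpl) {
        *zf = true;
        return (uint16_t)((dest & 0xFFFCu) | src_rpl);
    }
    *zf = false;
    return dest;
}
```

In the terms of Figure 4-12, the operating system would pass the selector it received (D2, RPL = 0) as the destination and the caller's saved code-segment selector (RPL = 3) as the source; the result carries RPL = 3, which corresponds to segment selector D1.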

4.10.5. Checking Alignment

When the CPL is 3, alignment of memory references can be checked by setting the AM flag in the CR0 register and the AC flag in the EFLAGS register. Unaligned memory references generate alignment exceptions (#AC). The processor does not generate alignment exceptions when operating at privilege level 0, 1, or 2. Refer to Table 5-7 in Chapter 5, Interrupt and Exception Handling for a description of the alignment requirements when alignment checking is enabled.
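The conditions under which an alignment exception is generated can be summarized in C. The sketch assumes that AM is bit 18 of CR0 and AC is bit 18 of EFLAGS, and it simplifies the alignment requirement to natural alignment on the operand size; the actual per-data-type requirements are those of Table 5-7. The names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_AM    (1u << 18)  /* alignment mask flag in CR0 */
#define EFLAGS_AC (1u << 18)  /* alignment check flag in EFLAGS */

/* Alignment checking is in effect only when AM and AC are both set
   and the processor is running at CPL 3. */
bool alignment_checking_active(uint32_t cr0, uint32_t eflags, unsigned cpl)
{
    return (cr0 & CR0_AM) != 0 && (eflags & EFLAGS_AC) != 0 && cpl == 3;
}

/* True if a reference of `size` bytes at `addr` would raise #AC.
   Assumes size is nonzero and that natural alignment is required. */
bool raises_alignment_exception(uint32_t cr0, uint32_t eflags, unsigned cpl,
                                uint32_t addr, uint32_t size)
{
    return alignment_checking_active(cr0, eflags, cpl) && (addr % size) != 0;
}
```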

4.11. PAGE-LEVEL PROTECTION

Page-level protection can be used alone or applied to segments. When page-level protection is used with the flat memory model, it allows supervisor code and data (the operating system or executive) to be protected from user code and data (application programs). It also allows pages containing code to be write protected. When the segment- and page-level protection are combined, page-level read/write protection allows more protection granularity within segments.

With page-level protection (as with segment-level protection) each memory reference is checked to verify that protection checks are satisfied. All checks are made before the memory cycle is started, and any violation prevents the cycle from starting and results in a page-fault exception being generated. Because checks are performed in parallel with address translation, there is no performance penalty.

The processor performs two page-level protection checks:

• Restriction of addressable domain (supervisor and user modes).

• Page type (read only or read/write).

Violations of either of these checks results in a page-fault exception being generated. Refer to Chapter 5, Interrupt and Exception Handling for an explanation of the page-fault exception mechanism. This chapter describes the protection violations which lead to page-fault exceptions.

4.11.1. Page-Protection Flags

Protection information for pages is contained in two flags in a page-directory or page-table entry (refer to Figure 3-14 in Chapter 3, Protected-Mode Memory Management): the read/write flag (bit 1) and the user/supervisor flag (bit 2). The protection checks are applied to both first- and second-level page tables (that is, page directories and page tables).

4.11.2. Restricting Addressable Domain

The page-level protection mechanism allows restricting access to pages based on two privilege levels:

• Supervisor mode (U/S flag is 0)—(Most privileged) For the operating system or executive, other system software (such as device drivers), and protected system data (such as page tables).

• User mode (U/S flag is 1)—(Least privileged) For application code and data.

The segment privilege levels map to the page privilege levels as follows. If the processor is currently operating at a CPL of 0, 1, or 2, it is in supervisor mode; if it is operating at a CPL of 3, it is in user mode. When the processor is in supervisor mode, it can access all pages; when in user mode, it can access only user-level pages. (Note that the WP flag in control register CR0 modifies the supervisor permissions, as described in Section 4.11.3., “Page Type”.)

Note that to use the page-level protection mechanism, code and data segments must be set up for at least two segment-based privilege levels: level 0 for supervisor code and data segments and level 3 for user code and data segments. (In this model, the stacks are placed in the data segments.) To minimize the use of segments, a flat memory model can be used (refer to Section 3.2.1., “Basic Flat Model” in Chapter 3, “Protected-Mode Memory Management”). Here, the user and supervisor code and data segments all begin at address zero in the linear address space and overlay each other. With this arrangement, operating-system code (running at the supervisor level) and application code (running at the user level) can execute as if there are no segments. Protection between operating-system and application code and data is provided by the processor’s page-level protection mechanism.


4.11.3. Page Type

The page-level protection mechanism recognizes two page types:

• Read-only access (R/W flag is 0).

• Read/write access (R/W flag is 1).

When the processor is in supervisor mode and the WP flag in register CR0 is clear (its state following reset initialization), all pages are both readable and writable (write-protection is ignored). When the processor is in user mode, it can write only to user-mode pages that are read/write accessible. User-mode pages that are read/write or read-only are readable; supervisor-mode pages are neither readable nor writable from user mode. A page-fault exception is generated on any attempt to violate the protection rules.

The P6 family, Pentium®, and Intel486™ processors allow user-mode pages to be write-protected against supervisor-mode access. Setting the WP flag in register CR0 to 1 enables supervisor-mode sensitivity to user-mode, write-protected pages. This supervisor write-protect feature is useful for implementing a “copy-on-write” strategy used by some operating systems, such as UNIX*, for task creation (also called forking or spawning). When a new task is created, it is possible to copy the entire address space of the parent task. This gives the child task a complete, duplicate set of the parent's segments and pages. An alternative copy-on-write strategy saves memory space and time by mapping the child's segments and pages to the same segments and pages used by the parent task. A private copy of a page gets created only when one of the tasks writes to the page. By using the WP flag and marking the shared pages as read-only, the supervisor can detect an attempt to write to a user-level page, and can copy the page at that time.
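The rules in this section can be summarized as a single decision function. The sketch below is illustrative only: the type and function names are hypothetical, the page attributes are assumed to be the effective (combined) U/S and R/W values for the page, and it assumes that with WP set the R/W flag governs supervisor writes to any page, which is how the note to Table 4-2 reads.

```c
#include <assert.h>
#include <stdbool.h>

/* Effective attributes of one page (hypothetical type for this sketch). */
typedef struct {
    bool user;     /* U/S flag: true = user-level page */
    bool writable; /* R/W flag: true = read/write page */
} page_attr;

/* CPL 0, 1, and 2 are supervisor mode; CPL 3 is user mode. */
static bool cpl_is_supervisor(unsigned cpl)
{
    return cpl < 3;
}

/* Returns true if the access passes the page-level checks,
   false if it would cause a page-fault exception. */
static bool access_permitted(unsigned cpl, bool is_write,
                             page_attr page, bool cr0_wp)
{
    if (cpl_is_supervisor(cpl)) {
        if (!is_write)
            return true;                  /* supervisor can read all pages */
        return page.writable || !cr0_wp;  /* WP clear: write-protection ignored */
    }
    if (!page.user)
        return false;                     /* user mode cannot touch supervisor pages */
    return !is_write || page.writable;    /* user writes need a read/write page */
}
```

In a copy-on-write scheme, the case of interest is a write to a read-only page with WP set: the check fails, and the page-fault handler can then make the private copy.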

4.11.4. Combining Protection of Both Levels of Page Tables

For any one page, the protection attributes of its page-directory entry (first-level page table) may differ from those of its page-table entry (second-level page table). The processor checks the protection for a page in both its page-directory and its page-table entries. Table 4-2 shows the protection provided by the possible combinations of protection attributes when the WP flag is clear.

4.11.5. Overrides to Page Protection

The following types of memory accesses are checked as if they are privilege-level 0 accesses, regardless of the CPL at which the processor is currently operating:

• Access to segment descriptors in the GDT, LDT, or IDT.

• Access to an inner-privilege-level stack during an inter-privilege-level call or a call to an exception or interrupt handler, when a change of privilege level occurs.


4.12. COMBINING PAGE AND SEGMENT PROTECTION

When paging is enabled, the processor evaluates segment protection first, then evaluates page protection. If the processor detects a protection violation at either the segment level or the page level, the memory access is not carried out and an exception is generated. If an exception is generated by segmentation, no paging exception is generated.

Page-level protections cannot be used to override segment-level protection. For example, a code segment is by definition not writable. If a code segment is paged, setting the R/W flag for the pages to read-write does not make the pages writable. Attempts to write into the pages will be blocked by segment-level protection checks.

Page-level protection can be used to enhance segment-level protection. For example, if a large read-write data segment is paged, the page-protection mechanism can be used to write-protect individual pages.

Table 4-2. Combined Page-Directory and Page-Table Protection

Page-Directory Entry       Page-Table Entry           Combined Effect
Privilege   Access Type    Privilege   Access Type    Privilege   Access Type
User        Read-Only      User        Read-Only      User        Read-Only
User        Read-Only      User        Read-Write     User        Read-Only
User        Read-Write     User        Read-Only      User        Read-Only
User        Read-Write     User        Read-Write     User        Read/Write
User        Read-Only      Supervisor  Read-Only      Supervisor  Read/Write*
User        Read-Only      Supervisor  Read-Write     Supervisor  Read/Write*
User        Read-Write     Supervisor  Read-Only      Supervisor  Read/Write*
User        Read-Write     Supervisor  Read-Write     Supervisor  Read/Write
Supervisor  Read-Only      User        Read-Only      Supervisor  Read/Write*
Supervisor  Read-Only      User        Read-Write     Supervisor  Read/Write*
Supervisor  Read-Write     User        Read-Only      Supervisor  Read/Write*
Supervisor  Read-Write     User        Read-Write     Supervisor  Read/Write
Supervisor  Read-Only      Supervisor  Read-Only      Supervisor  Read/Write*
Supervisor  Read-Only      Supervisor  Read-Write     Supervisor  Read/Write*
Supervisor  Read-Write     Supervisor  Read-Only      Supervisor  Read/Write*
Supervisor  Read-Write     Supervisor  Read-Write     Supervisor  Read/Write

NOTE:

* If the WP flag of CR0 is set, the access type is determined by the R/W flags of the page-directory and page-table entries.
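The pattern in Table 4-2 can be expressed compactly: the combined privilege is user only when both entries are user, and a user-level page is read/write only when both entries are read/write. The sketch below reproduces the table for WP clear. For WP set, it assumes that the combined access type is read/write only when both R/W flags are set; the note to the table says only that the R/W flags determine the result, so that reading is an assumption. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Protection attributes of one entry (hypothetical type for this sketch). */
typedef struct {
    bool user;     /* true = User, false = Supervisor */
    bool writable; /* true = Read/Write, false = Read-Only */
} prot;

/* Combine page-directory and page-table attributes as in Table 4-2. */
static prot combine_protection(prot pde, prot pte, bool cr0_wp)
{
    prot out;
    out.user = pde.user && pte.user;
    if (out.user || cr0_wp)
        out.writable = pde.writable && pte.writable;
    else
        out.writable = true; /* supervisor page, WP clear: always read/write */
    return out;
}
```

For example, a user read/write directory entry paired with a user read-only table entry yields a user read-only page, matching the third row of the table.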




CHAPTER 5
INTERRUPT AND EXCEPTION HANDLING

This chapter describes the processor’s interrupt and exception-handling mechanism, when operating in protected mode. Most of the information provided here also applies to the interrupt and exception mechanism used in real-address or virtual-8086 mode. Refer to Chapter 16, 8086 Emulation, for a description of the differences in the interrupt and exception mechanism for real-address and virtual-8086 mode.

5.1. INTERRUPT AND EXCEPTION OVERVIEW

Interrupts and exceptions are forced transfers of execution from the currently running program or task to a special procedure or task called a handler. Interrupts typically occur at random times during the execution of a program, in response to signals from hardware. They are used to handle events external to the processor, such as requests to service peripheral devices. Software can also generate interrupts by executing the INT n instruction. Exceptions occur when the processor detects an error condition while executing an instruction, such as division by zero. The processor detects a variety of error conditions including protection violations, page faults, and internal machine faults. The machine-check architecture of the P6 family and Pentium® processors also permits a machine-check exception to be generated when internal hardware errors and bus errors are detected.

The processor’s interrupt and exception-handling mechanism allows interrupts and exceptions to be handled transparently to application programs and the operating system or executive. When an interrupt is received or an exception is detected, the currently running procedure or task is automatically suspended while the processor executes an interrupt or exception handler. When execution of the handler is complete, the processor resumes execution of the interrupted procedure or task. The resumption of the interrupted procedure or task happens without loss of program continuity, unless recovery from an exception was not possible or an interrupt caused the currently running program to be terminated.

A detailed description of the exceptions and the conditions that cause them to be generated is given at the end of this chapter. Refer to Chapter 16, 8086 Emulation, for a description of the interrupt and exception mechanism for real-address and virtual-8086 mode.

5.1.1. Sources of Interrupts

The processor receives interrupts from two sources:

• External (hardware generated) interrupts.

• Software-generated interrupts.


5.1.1.1. EXTERNAL INTERRUPTS

External interrupts are received through pins on the processor or through the local APIC serial bus. The primary interrupt pins on a P6 family or Pentium® processor are the LINT[1:0] pins, which are connected to the local APIC (refer to Section 7.5., “Advanced Programmable Interrupt Controller (APIC)” in Chapter 7, Multiple-Processor Management). When the local APIC is disabled, these pins are configured as INTR and NMI pins, respectively. Asserting the INTR pin signals the processor that an external interrupt has occurred, and the processor reads from the system bus the interrupt vector number provided by an external interrupt controller, such as an 8259A (refer to Section 5.2., “Exception and Interrupt Vectors”). Asserting the NMI pin signals a nonmaskable interrupt (NMI), which is assigned to interrupt vector 2.

When the local APIC is enabled, the LINT[1:0] pins can be programmed through the APIC’s vector table to be associated with any of the processor’s exception or interrupt vectors.

The processor’s local APIC can be connected to a system-based I/O APIC. Here, external interrupts received at the I/O APIC’s pins can be directed to the local APIC through the APIC serial bus (pins PICD[1:0]). The I/O APIC determines the vector number of the interrupt and sends this number to the local APIC. When a system contains multiple processors, processors can also send interrupts to one another by means of the APIC serial bus.

The LINT[1:0] pins are not available on the Intel486™ processor and the earlier Pentium® processors that do not contain an on-chip local APIC. Instead, these processors have dedicated NMI and INTR pins. With these processors, external interrupts are typically generated by a system-based interrupt controller (8259A), with the interrupts being signaled through the INTR pin.

Note that several other pins on the processor cause a processor interrupt to occur; however, these interrupts are not handled by the interrupt and exception mechanism described in this chapter. These pins include the RESET#, FLUSH#, STPCLK#, SMI#, R/S#, and INIT# pins. Which of these pins are included on a particular Intel Architecture processor is implementation dependent. The functions of these pins are described in the data books for the individual processors. The SMI# pin is also described in Chapter 12, System Management Mode (SMM).

5.1.1.2. MASKABLE HARDWARE INTERRUPTS

Any external interrupt that is delivered to the processor by means of the INTR pin or through the local APIC is called a maskable hardware interrupt. The maskable hardware interrupts that can be delivered through the INTR pin include all Intel Architecture defined interrupt vectors from 0 through 255; those that can be delivered through the local APIC include interrupt vectors 16 through 255.

All maskable hardware interrupts can be masked as a group. Use the single IF flag in the EFLAGS register (refer to Section 5.6.1., “Masking Maskable Hardware Interrupts”) to mask these maskable interrupts. Note that when interrupts 0 through 15 are delivered through the local APIC, the APIC indicates the receipt of an illegal vector.


5.1.1.3. SOFTWARE-GENERATED INTERRUPTS

The INT n instruction permits interrupts to be generated from within software by supplying the interrupt vector number as an operand. For example, the INT 35 instruction forces an implicit call to the interrupt handler for interrupt 35.

Any of the interrupt vectors from 0 to 255 can be used as a parameter in this instruction. If the processor’s predefined NMI vector is used, however, the response of the processor will not be the same as it would be from an NMI interrupt generated in the normal manner. If vector number 2 (the NMI vector) is used in this instruction, the NMI interrupt handler is called, but the processor’s NMI-handling hardware is not activated.

Note that interrupts generated in software with the INT n instruction cannot be masked by the IF flag in the EFLAGS register.

5.1.2. Sources of Exceptions

The processor receives exceptions from three sources:

• Processor-detected program-error exceptions.

• Software-generated exceptions.

• Machine-check exceptions.

5.1.2.1. PROGRAM-ERROR EXCEPTIONS

The processor generates one or more exceptions when it detects program errors during the execution of an application program or the operating system or executive. The Intel Architecture defines a vector number for each processor-detectable exception. The exceptions are further classified as faults, traps, and aborts (refer to Section 5.3., “Exception Classifications”).

5.1.2.2. SOFTWARE-GENERATED EXCEPTIONS

The INTO, INT 3, and BOUND instructions permit exceptions to be generated in software. These instructions allow checks for specific exception conditions to be performed at specific points in the instruction stream. For example, the INT 3 instruction causes a breakpoint exception to be generated.

The INT n instruction can be used to emulate a specific exception in software, with one limitation. If the n operand in the INT n instruction contains a vector for one of the Intel Architecture exceptions, the processor will generate an interrupt to that vector, which will in turn invoke the exception handler associated with that vector. Because this is actually an interrupt, however, the processor does not push an error code onto the stack, even if a hardware-generated exception for that vector normally produces one. For those exceptions that produce an error code, the exception handler will attempt to pop an error code from the stack while handling the exception. If the INT n instruction was used to emulate the generation of an exception, the handler will pop off and discard the EIP (in place of the missing error code), sending the return to the wrong location.
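The effect of the missing error code can be shown with a small simulation of the handler's stack. This is a sketch under stated assumptions, not a model of the processor: it assumes a transfer with no privilege-level change, where EFLAGS, CS, and EIP are pushed in that order with the error code (if any) on top, and every name and value in it is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical saved values, chosen only for this illustration. */
enum { SAVED_EFLAGS = 0x202, SAVED_CS = 0x08, SAVED_EIP = 0x1000, ERROR_CODE = 0 };

typedef struct {
    uint32_t slot[8];
    int depth;
} sim_stack;

static void push(sim_stack *s, uint32_t v) { s->slot[s->depth++] = v; }
static uint32_t pop(sim_stack *s) { return s->slot[--s->depth]; }

/* Runs a handler written for an error-code exception and returns the
   EIP value that its return would load. */
static uint32_t resume_eip(bool raised_by_int_n)
{
    sim_stack s = { {0}, 0 };

    push(&s, SAVED_EFLAGS);
    push(&s, SAVED_CS);
    push(&s, SAVED_EIP);
    if (!raised_by_int_n)
        push(&s, ERROR_CODE); /* only a true exception pushes an error code */

    (void)pop(&s);  /* handler discards what it believes is the error code */
    return pop(&s); /* the return then takes the next stack entry as EIP */
}
```

With a true exception the function yields the saved EIP; when the handler is entered through INT n, the saved EIP is discarded and the saved CS value is used as the return pointer instead.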


5.1.2.3. MACHINE-CHECK EXCEPTIONS

The P6 family and Pentium® processors provide both internal and external machine-check mechanisms for checking the operation of the internal chip hardware and bus transactions. These mechanisms constitute extended (implementation dependent) exception mechanisms. When a machine-check error is detected, the processor signals a machine-check exception (vector 18) and returns an error code. Refer to “Interrupt 18—Machine Check Exception (#MC)” at the end of this chapter and Chapter 13, Machine-Check Architecture, for a detailed description of the machine-check mechanism.

5.2. EXCEPTION AND INTERRUPT VECTORS

The processor associates an identification number, called a vector, with each exception and interrupt. Table 5-1 shows the assignment of exception and interrupt vectors. This table also gives the exception type for each vector, indicates whether an error code is saved on the stack for an exception, and gives the source of the exception or interrupt.

The vectors in the range 0 through 31 are assigned to the exceptions and the NMI interrupt. Not all of these vectors are currently used by the processor. Unassigned vectors in this range are reserved for possible future uses. Do not use the reserved vectors.

The vectors in the range 32 to 255 are designated as user-defined interrupts. These interrupts are not reserved by the Intel Architecture and are generally assigned to external I/O devices to permit them to signal the processor through one of the external hardware interrupt mechanisms described in Section 5.1.1., “Sources of Interrupts”.
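The split of the vector space described above is a single comparison. The following sketch uses hypothetical names and shows how system software might guard against assigning a device to a vector in the architecturally assigned range.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    VEC_ARCHITECTURAL, /* 0 through 31: exceptions, NMI, and reserved vectors */
    VEC_USER_DEFINED   /* 32 through 255: available for external devices and INT n */
} vector_class;

/* Classify a vector number according to the two ranges given in the text. */
static vector_class classify_vector(uint8_t vector)
{
    return vector < 32 ? VEC_ARCHITECTURAL : VEC_USER_DEFINED;
}
```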

5.3. EXCEPTION CLASSIFICATIONS

Exceptions are classified as faults, traps, or aborts depending on the way they are reported and whether the instruction that caused the exception can be restarted with no loss of program or task continuity.

Faults
A fault is an exception that can generally be corrected and that, once corrected, allows the program to be restarted with no loss of continuity. When a fault is reported, the processor restores the machine state to the state prior to the beginning of execution of the faulting instruction. The return address (saved contents of the CS and EIP registers) for the fault handler points to the faulting instruction, rather than the instruction following the faulting instruction.

Note: A small subset of exceptions are normally reported as faults but, under architectural corner cases, are not restartable, and some processor context will be lost. An example of these cases is the execution of the POPAD instruction where the stack frame crosses over the end of the stack segment. The exception handler will see that the CS:EIP has been restored as if the POPAD instruction had not executed; however, internal processor state (general purpose registers) will have been modified. These corner cases are considered programming errors, and an application causing this class of exceptions will likely be terminated by the operating system.

Traps
A trap is an exception that is reported immediately following the execution of the trapping instruction. Traps allow execution of a program or task to be continued without loss of program continuity. The return address for the trap handler points to the instruction to be executed after the trapping instruction.

Aborts
An abort is an exception that does not always report the precise location of the instruction causing the exception and does not allow restart of the program or task that caused the exception. Aborts are used to report severe errors, such as hardware errors and inconsistent or illegal values in system tables.


Table 5-1. Protected-Mode Exceptions and Interrupts

Vector  Mnemonic  Description                                 Type        Error Code  Source
0       #DE       Divide Error                                Fault       No          DIV and IDIV instructions.
1       #DB       Debug                                       Fault/Trap  No          Any code or data reference or the INT 1 instruction.
2       -         NMI Interrupt                               Interrupt   No          Nonmaskable external interrupt.
3       #BP       Breakpoint                                  Trap        No          INT 3 instruction.
4       #OF       Overflow                                    Trap        No          INTO instruction.
5       #BR       BOUND Range Exceeded                        Fault       No          BOUND instruction.
6       #UD       Invalid Opcode (Undefined Opcode)           Fault       No          UD2 instruction or reserved opcode. (Note 1)
7       #NM       Device Not Available (No Math Coprocessor)  Fault       No          Floating-point or WAIT/FWAIT instruction.
8       #DF       Double Fault                                Abort       Yes (Zero)  Any instruction that can generate an exception, an NMI, or an INTR.
9                 Coprocessor Segment Overrun (reserved)      Fault       No          Floating-point instruction. (Note 2)
10      #TS       Invalid TSS                                 Fault       Yes         Task switch or TSS access.
11      #NP       Segment Not Present                         Fault       Yes         Loading segment registers or accessing system segments.
12      #SS       Stack-Segment Fault                         Fault       Yes         Stack operations and SS register loads.
13      #GP       General Protection                          Fault       Yes         Any memory reference and other protection checks.
14      #PF       Page Fault                                  Fault       Yes         Any memory reference.
15      -         (Intel reserved. Do not use.)                           No
16      #MF       Floating-Point Error (Math Fault)           Fault       No          Floating-point or WAIT/FWAIT instruction.
17      #AC       Alignment Check                             Fault       Yes (Zero)  Any data reference in memory. (Note 3)
18      #MC       Machine Check                               Abort       No          Error codes (if any) and source are model dependent. (Note 4)
19      #XF       Streaming SIMD Extensions                   Fault       No          SIMD floating-point instructions. (Note 5)
20-31   -         Intel reserved. Do not use.
32-255  -         User Defined (Nonreserved) Interrupts       Interrupt               External interrupt or INT n instruction.

NOTES:

1. The UD2 instruction was introduced in the Pentium® Pro processor.
2. Intel Architecture processors after the Intel386™ processor do not generate this exception.
3. This exception was introduced in the Intel486™ processor.
4. This exception was introduced in the Pentium® processor and enhanced in the P6 family processors.
5. This exception was introduced in the Pentium® III processor.
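A handler stub needs to know whether the processor has placed an error code on the stack before it builds a uniform frame. The sketch below encodes the Error Code column of Table 5-1 as a lookup; the function name is hypothetical, and the result applies only to true exceptions, not to the same vectors reached through INT n or the INTR pin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True for the vectors whose Error Code column in Table 5-1 reads "Yes":
   #DF (8), #TS (10), #NP (11), #SS (12), #GP (13), #PF (14), #AC (17). */
static bool vector_pushes_error_code(uint8_t vector)
{
    switch (vector) {
    case 8: case 10: case 11: case 12: case 13: case 14: case 17:
        return true;
    default:
        return false;
    }
}
```

A common design is for the stubs of the remaining vectors to push a dummy value so that every handler sees the same stack layout; that convention belongs to the operating system, not to the architecture.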


5.4. PROGRAM OR TASK RESTART

To allow the restarting of a program or task following the handling of an exception or an interrupt, all exceptions except aborts are guaranteed to report the exception on a precise instruction boundary, and all interrupts are guaranteed to be taken on an instruction boundary.

For fault-class exceptions, the return instruction pointer that the processor saves when it generates the exception points to the faulting instruction. So, when a program or task is restarted following the handling of a fault, the faulting instruction is restarted (re-executed). Restarting the faulting instruction is commonly used to handle exceptions that are generated when access to an operand is blocked. The most common example of a fault is a page-fault exception (#PF) that occurs when a program or task references an operand in a page that is not in memory. When a page-fault exception occurs, the exception handler can load the page into memory and resume execution of the program or task by restarting the faulting instruction. To ensure that this instruction restart is handled transparently to the currently executing program or task, the processor saves the necessary registers and stack pointers to allow it to restore itself to its state prior to the execution of the faulting instruction.

For trap-class exceptions, the return instruction pointer points to the instruction following the trapping instruction. If a trap is detected during an instruction that transfers execution, the return instruction pointer reflects the transfer. For example, if a trap is detected while executing a JMP instruction, the return instruction pointer points to the destination of the JMP instruction, not to the next address past the JMP instruction. All trap exceptions allow program or task restart with no loss of continuity. For example, the overflow exception is a trapping exception. Here, the return instruction pointer points to the instruction following the INTO instruction that tested the OF (overflow) flag in the EFLAGS register. The trap handler for this exception resolves the overflow condition. Upon return from the trap handler, program or task execution continues at the next instruction following the INTO instruction.
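The difference between the two return pointers can be stated in a few lines. In this sketch the names are hypothetical, and next_addr stands for the address at which execution would continue after the instruction completes, which for a taken JMP is the jump destination rather than the following address.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { EXC_FAULT, EXC_TRAP } exc_class;

/* Returns the instruction pointer saved for the handler.
   insn_addr: address of the instruction that raised the exception.
   next_addr: address execution reaches once that instruction completes. */
static uint32_t saved_return_eip(exc_class cls, uint32_t insn_addr,
                                 uint32_t next_addr)
{
    /* A fault re-executes the instruction; a trap continues after it. */
    return cls == EXC_FAULT ? insn_addr : next_addr;
}
```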

The abort-class exceptions do not support reliable restarting of the program or task. Abort handlers generally are designed to collect diagnostic information about the state of the processor when the abort exception occurred and then shut down the application and system as gracefully as possible.

Interrupts rigorously support restarting of interrupted programs and tasks without loss of continuity. The return instruction pointer saved for an interrupt points to the next instruction to be executed at the instruction boundary where the processor took the interrupt. If the instruction just executed has a repeat prefix, the interrupt is taken at the end of the current iteration with the registers set to execute the next iteration.

The ability of a P6 family processor to speculatively execute instructions does not affect the taking of interrupts by the processor. Interrupts are taken at instruction boundaries located during the retirement phase of instruction execution; so they are always taken in the “in-order” instruction stream. Refer to Chapter 2, Introduction to the Intel Architecture, in the Intel Architecture Software Developer’s Manual, Volume 1, for more information about the P6 family processors’ microarchitecture and its support for out-of-order instruction execution.

Note that the Pentium® processor and earlier Intel Architecture processors also perform varying amounts of prefetching and preliminary decoding of instructions; however, here also exceptions and interrupts are not signaled until actual “in-order” execution of the instructions. For a given code sample, the signaling of exceptions will occur uniformly when the code is executed on any family of Intel Architecture processors (except where new exceptions or new opcodes have been defined).

5.5. NONMASKABLE INTERRUPT (NMI)

The nonmaskable interrupt (NMI) can be generated in either of two ways:

• External hardware asserts the NMI pin.

• The processor receives a message on the APIC serial bus of delivery mode NMI.

When the processor receives an NMI from either of these sources, the processor handles it immediately by calling the NMI handler pointed to by interrupt vector number 2. The processor also invokes certain hardware conditions to ensure that no other interrupts, including NMI interrupts, are received until the NMI handler has completed executing (refer to Section 5.5.1., “Handling Multiple NMIs”).

Also, when an NMI is received from either of the above sources, it cannot be masked by the IF flag in the EFLAGS register.

It is possible to issue a maskable hardware interrupt (through the INTR pin) to vector 2 to invoke the NMI interrupt handler; however, this interrupt will not truly be an NMI interrupt. A true NMI interrupt that activates the processor’s NMI-handling hardware can only be delivered through one of the mechanisms listed above.

5.5.1. Handling Multiple NMIs

While an NMI interrupt handler is executing, the processor disables additional calls to the NMI handler until the next IRET instruction is executed. This blocking of subsequent NMIs prevents stacking up calls to the NMI handler. It is recommended that the NMI interrupt handler be accessed through an interrupt gate to disable maskable hardware interrupts (refer to Section 5.6.1., “Masking Maskable Hardware Interrupts”).
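The blocking behavior can be pictured as a one-bit state that is set when the NMI handler is entered and cleared by the next IRET. The following is an illustrative model only; the structure and function names are hypothetical, and it does not model whether an NMI that arrives while blocked is held pending or lost.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool nmi_blocked;       /* set while an NMI handler is considered active */
    unsigned handler_calls; /* number of times the NMI handler was entered */
} cpu_model;

/* Signal an NMI; returns true if the handler is entered now. */
static bool deliver_nmi(cpu_model *cpu)
{
    if (cpu->nmi_blocked)
        return false;       /* additional calls are disabled until IRET */
    cpu->nmi_blocked = true;
    cpu->handler_calls++;
    return true;
}

/* The next IRET re-enables calls to the NMI handler. */
static void execute_iret(cpu_model *cpu)
{
    cpu->nmi_blocked = false;
}

/* Two NMIs back to back, an IRET, then a third NMI. */
static unsigned nmi_demo(void)
{
    cpu_model cpu = { false, 0 };
    deliver_nmi(&cpu);  /* enters the handler */
    deliver_nmi(&cpu);  /* blocked: handler still running */
    execute_iret(&cpu);
    deliver_nmi(&cpu);  /* enters the handler again */
    return cpu.handler_calls;
}
```

Note that the model unblocks on any IRET, which is why an IRET executed inside the NMI handler (for example, on return from a nested exception) would reopen the window early.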

5.6. ENABLING AND DISABLING INTERRUPTS

The processor inhibits the generation of some interrupts, depending on the state of the processor and of the IF and RF flags in the EFLAGS register, as described in the following sections.

5.6.1. Masking Maskable Hardware Interrupts

The IF flag can disable the servicing of maskable hardware interrupts received on the processor’s INTR pin or through the local APIC (refer to Section 5.1.1.2., “Maskable Hardware Interrupts”). When the IF flag is clear, the processor inhibits interrupts delivered to the INTR pin or through the local APIC from generating an internal interrupt request; when the IF flag is set, interrupts delivered to the INTR pin or through the local APIC are processed as normal external interrupts. The IF flag does not affect nonmaskable interrupts (NMIs) delivered to the NMI pin or delivery mode NMI messages delivered through the APIC serial bus, nor does it affect processor-generated exceptions. As with the other flags in the EFLAGS register, the processor clears the IF flag in response to a hardware reset.

The fact that the group of maskable hardware interrupts includes the reserved interrupt and exception vectors 0 through 31 can potentially cause confusion. Architecturally, when the IF flag is set, an interrupt for any of the vectors from 0 through 31 can be delivered to the processor through the INTR pin and any of the vectors from 16 through 31 can be delivered through the local APIC. The processor will then generate an interrupt and call the interrupt or exception handler pointed to by the vector number. So, for example, it is possible to invoke the page-fault handler through the INTR pin (by means of vector 14); however, this is not a true page-fault exception. It is an interrupt. As with the INT n instruction (refer to Section 5.1.2.2., “Software-Generated Exceptions”), when an interrupt is generated through the INTR pin to an exception vector, the processor does not push an error code on the stack, so the exception handler may not operate correctly.

The IF flag can be set or cleared with the STI (set interrupt-enable flag) and CLI (clear interrupt-enable flag) instructions, respectively. These instructions may be executed only if the CPL is equal to or less than the IOPL. A general-protection exception (#GP) is generated if they are executed when the CPL is greater than the IOPL. (The effect of the IOPL on these instructions is modified slightly when the virtual mode extension is enabled by setting the VME flag in control register CR4; refer to Section 16.3., “Interrupt and Exception Handling in Virtual-8086 Mode” in Chapter 16, 8086 Emulation.)

The IF flag is also affected by the following operations:

• The PUSHF instruction stores all flags on the stack, where they can be examined and modified. The POPF instruction can be used to load the modified flags back into the EFLAGS register.

• Task switches and the POPF and IRET instructions load the EFLAGS register; therefore, they can be used to modify the setting of the IF flag.

• When an interrupt is handled through an interrupt gate, the IF flag is automatically cleared, which disables maskable hardware interrupts. (If an interrupt is handled through a trap gate, the IF flag is not cleared.)

Refer to the descriptions of the CLI, STI, PUSHF, POPF, and IRET instructions in Chapter 3, Instruction Set Reference, of the Intel Architecture Software Developer’s Manual, Volume 2, for a detailed description of the operations these instructions are allowed to perform on the IF flag.
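Two of the rules in this section lend themselves to a short sketch: the CPL-versus-IOPL check for CLI and STI, and the different treatment of the IF flag by interrupt gates and trap gates. The code below assumes the standard EFLAGS layout (IF in bit 9, IOPL in bits 12 and 13), which this section does not restate. It ignores the VME modification mentioned above and models only the IF flag, not the other flags the processor changes on handler entry. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EFLAGS_IF         (1u << 9)                /* interrupt-enable flag */
#define EFLAGS_IOPL_SHIFT 12
#define EFLAGS_IOPL_MASK  (3u << EFLAGS_IOPL_SHIFT) /* I/O privilege level field */

static unsigned eflags_iopl(uint32_t eflags)
{
    return (eflags & EFLAGS_IOPL_MASK) >> EFLAGS_IOPL_SHIFT;
}

/* CLI and STI execute only if CPL <= IOPL; otherwise #GP is generated. */
static bool cli_sti_permitted(unsigned cpl, uint32_t eflags)
{
    return cpl <= eflags_iopl(eflags);
}

typedef enum { GATE_INTERRUPT, GATE_TRAP } gate_type;

/* EFLAGS value in effect when the handler starts, considering IF only. */
static uint32_t eflags_on_handler_entry(uint32_t eflags, gate_type gate)
{
    return gate == GATE_INTERRUPT ? (eflags & ~EFLAGS_IF) : eflags;
}
```

With IOPL set to 0, only code running at CPL 0 passes the check; raising IOPL to 3 lets code at any privilege level execute CLI and STI.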

5.6.2. Masking Instruction Breakpoints

The RF (resume) flag in the EFLAGS register controls the response of the processor to instruction-breakpoint conditions (refer to the description of the RF flag in Section 2.3., “System Flags and Fields in the EFLAGS Register” in Chapter 2, System Architecture Overview). When set, it prevents an instruction breakpoint from generating a debug exception (#DB); when clear, instruction breakpoints will generate debug exceptions. The primary function of the RF flag is to prevent the processor from going into a debug exception loop on an instruction-breakpoint. Refer to Section 15.3.1.1., “Instruction-Breakpoint Exception Condition”, in Chapter 15, Debugging and Performance Monitoring, for more information on the use of this flag.

5.6.3. Masking Exceptions and Interrupts When Switching Stacks

To switch to a different stack segment, software often uses a pair of instructions, for example:

MOV SS, AX

MOV ESP, StackTop

If an interrupt or exception occurs after the segment selector has been loaded into the SS register but before the ESP register has been loaded, these two parts of the logical address into the stack space are inconsistent for the duration of the interrupt or exception handler.

To prevent this situation, the processor inhibits interrupts, debug exceptions, and single-step trap exceptions after either a MOV to SS instruction or a POP to SS instruction, until the instruction boundary following the next instruction is reached. All other faults may still be generated. If the LSS instruction is used to modify the contents of the SS register (which is the recommended method of modifying this register), this problem does not occur.

5.7. PRIORITY AMONG SIMULTANEOUS EXCEPTIONS AND INTERRUPTS

If more than one exception or interrupt is pending at an instruction boundary, the processor services them in a predictable order. Table 5-3 shows the priority among classes of exception and interrupt sources. While priority among these classes is consistent throughout the architecture, exceptions within each class are implementation-dependent and may vary from processor to processor. The processor first services a pending exception or interrupt from the class which has the highest priority, transferring execution to the first instruction of the handler. Lower priority exceptions are discarded; lower priority interrupts are held pending. Discarded exceptions are re-generated when the interrupt handler returns execution to the point in the program or task where the exceptions and/or interrupts occurred.

The Pentium® III processor added the SIMD floating-point execution unit. The SIMD floating-point execution unit can generate exceptions as well. Since the SIMD floating-point execution unit utilizes a 4-wide register set, an exception may result from more than one operand within a SIMD floating-point register. Hence the Pentium® III processor handles these exceptions according to a predetermined precedence. When a sub-operand of a packed instruction generates two or more exception conditions, the exception precedence sometimes results in the higher priority exception being handled and the lower priority exceptions being ignored. Prioritization of exceptions is performed only on a sub-operand basis, and not between sub-operands. For example, an invalid exception generated by one sub-operand will not prevent the reporting of a divide-by-zero exception generated by another sub-operand. Table 5-2 shows the precedence for Streaming SIMD Extensions numeric exceptions. The table reflects the order in which interrupts are handled upon simultaneous recognition by the processor (for example, when multiple interrupts are pending at an instruction boundary). However, the table does not necessarily reflect the order in which interrupts will be recognized by the processor if received simultaneously at the processor pins.

Table 5-2. SIMD Floating-Point Exceptions Priority

Priority        Description

1 (Highest)     Invalid operation exception due to SNaN operand (or any NaN operand for max, min, or certain compare and convert operations)

2               QNaN operand (note 1)

3               Any other invalid operation exception not mentioned above or a divide-by-zero exception (note 2)

4               Denormal operand exception (note 2)

5               Numeric overflow and underflow exceptions possibly in conjunction with the inexact result exception (note 2)

6 (Lowest)      Inexact result exception

NOTES:

1. Though this is not an exception, the handling of a QNaN operand has precedence over lower priority exceptions. For example, a QNaN divided by zero results in a QNaN, not a zero-divide exception.

2. If masked, then instruction execution continues, and a lower priority exception can occur as well.

Table 5-3. Priority Among Simultaneous Exceptions and Interrupts

Priority        Descriptions

1 (Highest)     Hardware Reset and Machine Checks
                - RESET
                - Machine Check

2               Trap on Task Switch
                - T flag in TSS is set

3               External Hardware Interventions
                - FLUSH
                - STOPCLK
                - SMI
                - INIT

4               Traps on the Previous Instruction
                - Breakpoints
                - Debug Trap Exceptions (TF flag set or data/I-O breakpoint)

5               External Interrupts
                - NMI Interrupts
                - Maskable Hardware Interrupts

6               Faults from Fetching Next Instruction
                - Code Breakpoint Fault
                - Code-Segment Limit Violation (note 1)
                - Code Page Fault (note 1)

7               Faults from Decoding the Next Instruction
                - Instruction length > 15 bytes
                - Illegal Opcode
                - Coprocessor Not Available

8 (Lowest)      Faults on Executing an Instruction
                - Floating-point exception
                - Overflow
                - Bound error
                - Invalid TSS
                - Segment Not Present
                - Stack fault
                - General Protection
                - Data Page Fault
                - Alignment Check
                - SIMD floating-point exception

NOTE:

1. For the Pentium® and Intel486™ processors, the Code Segment Limit Violation and the Code Page Fault exceptions are assigned to the priority 7.

5.8. INTERRUPT DESCRIPTOR TABLE (IDT)

The interrupt descriptor table (IDT) associates each exception or interrupt vector with a gate descriptor for the procedure or task used to service the associated exception or interrupt. Like the GDT and LDTs, the IDT is an array of 8-byte descriptors (in protected mode). Unlike the GDT, the first entry of the IDT may contain a descriptor. To form an index into the IDT, the processor scales the exception or interrupt vector by eight (the number of bytes in a gate descriptor). Because there are only 256 interrupt or exception vectors, the IDT need not contain more than 256 descriptors. It can contain fewer than 256 descriptors, because descriptors are required only for the interrupt and exception vectors that may occur. All empty descriptor slots in the IDT should have the present flag for the descriptor set to 0.

The base address of the IDT should be aligned on an 8-byte boundary to maximize performance of cache line fills. The limit value is expressed in bytes and is added to the base address to get the address of the last valid byte. A limit value of 0 results in exactly 1 valid byte. Because IDT entries are always eight bytes long, the limit should always be one less than an integral multiple of eight (that is, 8N – 1).


The IDT may reside anywhere in the linear address space. As shown in Figure 5-1, the processor locates the IDT using the IDTR register. This register holds both a 32-bit base address and 16-bit limit for the IDT.

The LIDT (load IDT register) and SIDT (store IDT register) instructions load and store the contents of the IDTR register, respectively. The LIDT instruction loads the IDTR register with the base address and limit held in a memory operand. This instruction can be executed only when the CPL is 0. It normally is used by the initialization code of an operating system when creating an IDT. An operating system also may use it to change from one IDT to another. The SIDT instruction copies the base and limit value stored in IDTR to memory. This instruction can be executed at any privilege level.

If a vector references a descriptor beyond the limit of the IDT, a general-protection exception (#GP) is generated.
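The indexing and limit arithmetic for the IDT (vector scaled by eight, limit of the form 8N – 1, #GP for a descriptor beyond the limit) can be illustrated with a small C sketch. The function names below are hypothetical and the code only models the arithmetic; it does not touch the real IDTR register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define GATE_SIZE 8u  /* bytes per protected-mode gate descriptor */

/* Limit value for an IDT holding n descriptors: 8N - 1 (n in 1..256). */
uint16_t idt_limit_for(unsigned n)
{
    assert(n >= 1 && n <= 256);
    return (uint16_t)(n * GATE_SIZE - 1u);
}

/* Byte offset of the gate for a vector: the vector scaled by eight. */
uint32_t idt_offset(uint8_t vector)
{
    return (uint32_t)vector * GATE_SIZE;
}

/* True if the last byte of the descriptor lies within the limit;
 * false models the general-protection exception (#GP) case. */
bool idt_vector_in_limit(uint8_t vector, uint16_t limit)
{
    return idt_offset(vector) + (GATE_SIZE - 1u) <= limit;
}
```

For example, an IDT with 32 descriptors has a limit of 255; vector 31 (offset 248) is within the limit, while vector 32 (offset 256) is not.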

5.9. IDT DESCRIPTORS

The IDT may contain any of three kinds of gate descriptors:

• Task-gate descriptor

• Interrupt-gate descriptor

• Trap-gate descriptor

Figure 5-1. Relationship of the IDTR and IDT

[Figure not reproduced. The 48-bit IDTR register holds the IDT limit in bits 15..0 and the IDT base address in bits 47..16. The base address plus a gate's offset locates that gate in the interrupt descriptor table (IDT); the figure shows the gates for interrupts #1, #2, #3, ..., #n at offsets 0, 8, 16, ..., (n−1)∗8.]

Figure 5-2 shows the formats for the task-gate, interrupt-gate, and trap-gate descriptors. The format of a task gate used in an IDT is the same as that of a task gate used in the GDT or an LDT (refer to Section 6.2.4., “Task-Gate Descriptor” in Chapter 6, Task Management). The task gate contains the segment selector for a TSS for an exception and/or interrupt handler task.

Interrupt and trap gates are very similar to call gates (refer to Section 4.8.3., “Call Gates” in Chapter 4, Protection). They contain a far pointer (segment selector and offset) that the processor uses to transfer execution to a handler procedure in an exception- or interrupt-handler code segment. These gates differ in the way the processor handles the IF flag in the EFLAGS register (refer to Section 5.10.1.2., “Flag Usage By Exception- or Interrupt-Handler Procedure”).

Figure 5-2. IDT Gate Descriptors

Task Gate
Bytes 7..4:    bits 31..16 Reserved; bit 15 P; bits 14..13 DPL; bits 12..8 = 0 0 1 0 1; bits 7..0 Reserved
Bytes 3..0:    bits 31..16 TSS Segment Selector; bits 15..0 Reserved

Interrupt Gate
Bytes 7..4:    bits 31..16 Offset 31..16; bit 15 P; bits 14..13 DPL; bits 12..8 = 0 D 1 1 0; bits 7..5 = 0 0 0; bits 4..0 Reserved
Bytes 3..0:    bits 31..16 Segment Selector; bits 15..0 Offset 15..0

Trap Gate
Bytes 7..4:    bits 31..16 Offset 31..16; bit 15 P; bits 14..13 DPL; bits 12..8 = 0 D 1 1 1; bits 7..5 = 0 0 0; bits 4..0 Reserved
Bytes 3..0:    bits 31..16 Segment Selector; bits 15..0 Offset 15..0

DPL         Descriptor Privilege Level
Offset      Offset to procedure entry point
P           Segment Present flag
Selector    Segment Selector for destination code segment
D           Size of gate: 1 = 32 bits; 0 = 16 bits
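As an illustration of the gate formats in Figure 5-2, the following C sketch packs a present task, interrupt, or trap gate into its two doublewords. The type and function names are hypothetical, and the sketch assumes 32-bit interrupt and trap gates (D = 1); it is not an operating-system implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Low three bits of the type field (bits 10..8 of the high doubleword). */
enum gate_type { GATE_TASK = 0x5, GATE_INTR = 0x6, GATE_TRAP = 0x7 };

typedef struct {
    uint32_t low;   /* bytes 3..0 of the descriptor */
    uint32_t high;  /* bytes 7..4 of the descriptor */
} gate_desc;

/* Hypothetical helper: build a present (P = 1) gate descriptor.
 * Interrupt and trap gates are built as 32-bit gates (D = 1);
 * a task gate carries only a TSS selector, so its offset is ignored. */
gate_desc make_gate(enum gate_type type, uint16_t selector,
                    uint32_t offset, unsigned dpl)
{
    gate_desc g;
    uint32_t d = (type == GATE_TASK) ? 0u : 1u;  /* D flag, bit 11 */

    assert(dpl <= 3);
    if (type == GATE_TASK)
        offset = 0;

    g.low  = ((uint32_t)selector << 16) | (offset & 0xFFFFu);
    g.high = (offset & 0xFFFF0000u)      /* Offset 31..16 */
           | (1u << 15)                  /* P */
           | ((uint32_t)dpl << 13)       /* DPL */
           | (d << 11)                   /* D */
           | ((uint32_t)type << 8);      /* type bits 10..8 */
    return g;
}
```

An interrupt gate for selector 0008H and offset 12345678H with DPL 0 yields the doublewords 00085678H (low) and 12348E00H (high).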

5.10. EXCEPTION AND INTERRUPT HANDLING

The processor handles calls to exception- and interrupt-handlers similar to the way it handles calls with a CALL instruction to a procedure or a task. When responding to an exception or interrupt, the processor uses the exception or interrupt vector as an index to a descriptor in the IDT. If the index points to an interrupt gate or trap gate, the processor calls the exception or interrupt handler in a manner similar to a CALL to a call gate (refer to Section 4.8.2., “Gate Descriptors” through Section 4.8.6., “Returning from a Called Procedure” in Chapter 4, Protection). If the index points to a task gate, the processor executes a task switch to the exception- or interrupt-handler task in a manner similar to a CALL to a task gate (refer to Section 6.3., “Task Switching” in Chapter 6, Task Management).

5.10.1. Exception- or Interrupt-Handler Procedures

An interrupt gate or trap gate references an exception- or interrupt-handler procedure that runs in the context of the currently executing task (refer to Figure 5-3). The segment selector for the gate points to a segment descriptor for an executable code segment in either the GDT or the current LDT. The offset field of the gate descriptor points to the beginning of the exception- or interrupt-handling procedure.

When the processor performs a call to the exception- or interrupt-handler procedure, it saves the current states of the EFLAGS register, CS register, and EIP register on the stack (refer to Figure 5-4). (The CS and EIP registers provide a return instruction pointer for the handler.) If an exception causes an error code to be saved, it is pushed on the stack after the EIP value.

If the handler procedure is going to be executed at the same privilege level as the interrupted procedure, the handler uses the current stack.

If the handler procedure is going to be executed at a numerically lower privilege level, a stack switch occurs. When a stack switch occurs, a stack pointer for the stack to be returned to is also saved on the stack. (The SS and ESP registers provide a return stack pointer for the handler.) The segment selector and stack pointer for the stack to be used by the handler are obtained from the TSS for the currently executing task. The processor copies the EFLAGS, SS, ESP, CS, EIP, and error code information from the interrupted procedure’s stack to the handler’s stack.

To return from an exception- or interrupt-handler procedure, the handler must use the IRET (or IRETD) instruction. The IRET instruction is similar to the RET instruction except that it restores the saved flags into the EFLAGS register. The IOPL field of the EFLAGS register is restored only if the CPL is 0. The IF flag is changed only if the CPL is less than or equal to the IOPL. Refer to “IRET/IRETD—Interrupt Return” in Chapter 3 of the Intel Architecture Software Developer’s Manual, Volume 2, for the complete operation performed by the IRET instruction.

If a stack switch occurred when calling the handler procedure, the IRET instruction switches back to the interrupted procedure’s stack on the return.
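The order in which state is saved on a transfer to a handler procedure can be modeled with a short C sketch. It simulates the stack as an array of doublewords that grows toward lower indices; the names saved_state, push32, and push_handler_frame are hypothetical, and the selection of the new stack from the TSS is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t ss, esp;          /* saved only on a privilege-level change */
    uint32_t eflags, cs, eip;  /* always saved */
    uint32_t error_code;       /* saved only if the exception produces one */
    bool has_error_code;
} saved_state;

/* Push one doubleword on a downward-growing stack; returns the new top. */
size_t push32(uint32_t *stack, size_t top, uint32_t value)
{
    assert(top > 0);
    stack[--top] = value;
    return top;
}

/* Model of the push order on entry to a handler procedure:
 * (SS, ESP,) EFLAGS, CS, EIP, (error code). */
size_t push_handler_frame(uint32_t *stack, size_t top,
                          const saved_state *s, bool privilege_change)
{
    if (privilege_change) {
        top = push32(stack, top, s->ss);
        top = push32(stack, top, s->esp);
    }
    top = push32(stack, top, s->eflags);
    top = push32(stack, top, s->cs);
    top = push32(stack, top, s->eip);
    if (s->has_error_code)
        top = push32(stack, top, s->error_code);
    return top;
}
```

After the call, the returned index plays the role of ESP after the transfer: it refers to the error code if one was pushed, and to the saved EIP otherwise.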

Figure 5-3. Interrupt Procedure Call

[Figure not reproduced. The interrupt vector selects an interrupt or trap gate in the IDT. The segment selector in the gate selects a segment descriptor in the GDT or LDT, which supplies the base address of the destination code segment; the offset in the gate is added to that base address to locate the interrupt procedure.]

5.10.1.1. PROTECTION OF EXCEPTION- AND INTERRUPT-HANDLER PROCEDURES

The privilege-level protection for exception- and interrupt-handler procedures is similar to that used for ordinary procedure calls when called through a call gate (refer to Section 4.8.4., “Accessing a Code Segment Through a Call Gate” in Chapter 4, Protection). The processor does not permit transfer of execution to an exception- or interrupt-handler procedure in a less privileged code segment (numerically greater privilege level) than the CPL. An attempt to violate this rule results in a general-protection exception (#GP). The protection mechanism for exception- and interrupt-handler procedures is different in the following ways:

• Because interrupt and exception vectors have no RPL, the RPL is not checked on implicit calls to exception and interrupt handlers.

• The processor checks the DPL of the interrupt or trap gate only if an exception or interrupt is generated with an INT n, INT 3, or INTO instruction. Here, the CPL must be less than or equal to the DPL of the gate. This restriction prevents application programs or procedures running at privilege level 3 from using a software interrupt to access critical exception handlers, such as the page-fault handler, providing that those handlers are placed in more privileged code segments (numerically lower privilege level). For hardware-generated interrupts and processor-detected exceptions, the processor ignores the DPL of interrupt and trap gates.

Figure 5-4. Stack Usage on Transfers to Interrupt and Exception-Handling Routines

[Figure not reproduced. Stack usage with no privilege-level change (interrupted procedure's and handler's stack): EFLAGS, CS, EIP, and the error code are pushed, in that order, below the ESP value in effect before the transfer to the handler. Stack usage with privilege-level change (handler's stack): SS, ESP, EFLAGS, CS, EIP, and the error code are pushed, in that order, on the handler's stack. In both cases the ESP value after the transfer to the handler points to the last item pushed.]
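The two privilege checks described above (the gate DPL check, which applies only to INT n, INT 3, and INTO, and the rule against transferring to a less privileged code segment) can be summarized in a C sketch. The enumeration and function names are hypothetical, and the sketch ignores conforming-segment details beyond the DPL comparison.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    SRC_HARDWARE_INTERRUPT,   /* INTR pin, local APIC, NMI */
    SRC_PROCESSOR_EXCEPTION,  /* processor-detected exception */
    SRC_SOFTWARE_INT          /* INT n, INT 3, or INTO */
} event_source;

/* Returns true if the transfer through the gate is permitted;
 * false models a general-protection exception (#GP). */
bool gate_access_allowed(event_source src, unsigned cpl,
                         unsigned gate_dpl, unsigned handler_cs_dpl)
{
    assert(cpl <= 3 && gate_dpl <= 3 && handler_cs_dpl <= 3);

    /* The gate DPL is checked only for software-generated interrupts. */
    if (src == SRC_SOFTWARE_INT && cpl > gate_dpl)
        return false;

    /* No transfer to a less privileged (numerically greater) segment. */
    return handler_cs_dpl <= cpl;
}
```

With a page-fault gate whose DPL is 0, an INT 14 executed at CPL 3 fails the first check, while a processor-detected page fault at CPL 3 reaches the same privilege-level-0 handler.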

Because exceptions and interrupts generally do not occur at predictable times, these privilege rules effectively impose restrictions on the privilege levels at which exception and interrupt-handling procedures can run. Either of the following techniques can be used to avoid privilege-level violations.

• The exception or interrupt handler can be placed in a conforming code segment. This technique can be used for handlers that only need to access data available on the stack (for example, divide error exceptions). If the handler needs data from a data segment, the data segment needs to be accessible from privilege level 3, which would make it unprotected.

• The handler can be placed in a nonconforming code segment with privilege level 0. This handler would always run, regardless of the CPL that the interrupted program or task is running at.

5.10.1.2. FLAG USAGE BY EXCEPTION- OR INTERRUPT-HANDLER PROCEDURE

When accessing an exception or interrupt handler through either an interrupt gate or a trap gate, the processor clears the TF flag in the EFLAGS register after it saves the contents of the EFLAGS register on the stack. (On calls to exception and interrupt handlers, the processor also clears the VM, RF, and NT flags in the EFLAGS register, after they are saved on the stack.) Clearing the TF flag prevents instruction tracing from affecting interrupt response. A subsequent IRET instruction restores the TF (and VM, RF, and NT) flags to the values in the saved contents of the EFLAGS register on the stack.

The only difference between an interrupt gate and a trap gate is the way the processor handles the IF flag in the EFLAGS register. When accessing an exception- or interrupt-handling procedure through an interrupt gate, the processor clears the IF flag to prevent other interrupts from interfering with the current interrupt handler. A subsequent IRET instruction restores the IF flag to its value in the saved contents of the EFLAGS register on the stack. Accessing a handler procedure through a trap gate does not affect the IF flag.
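The flag handling described in this section can be expressed as a small C function that computes the EFLAGS value seen by the handler from the value that was saved on the stack. The function name is hypothetical; the bit positions (TF bit 8, IF bit 9, NT bit 14, RF bit 16, VM bit 17) follow the EFLAGS layout in Chapter 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EFLAGS_TF (1u << 8)
#define EFLAGS_IF (1u << 9)
#define EFLAGS_NT (1u << 14)
#define EFLAGS_RF (1u << 16)
#define EFLAGS_VM (1u << 17)

/* EFLAGS on entry to a handler reached through an interrupt gate
 * (interrupt_gate = true) or a trap gate (interrupt_gate = false).
 * The saved copy on the stack is what IRET later restores. */
uint32_t eflags_on_handler_entry(uint32_t saved_eflags, bool interrupt_gate)
{
    uint32_t f = saved_eflags
               & ~(EFLAGS_TF | EFLAGS_VM | EFLAGS_RF | EFLAGS_NT);
    if (interrupt_gate)
        f &= ~EFLAGS_IF;  /* only interrupt gates clear IF */
    assert((f & EFLAGS_TF) == 0);
    return f;
}
```

For a saved value of 00000302H (IF and TF set), the handler sees 00000002H through an interrupt gate and 00000202H through a trap gate.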

5.10.2. Interrupt Tasks

When an exception or interrupt handler is accessed through a task gate in the IDT, a task switch results. Handling an exception or interrupt with a separate task offers several advantages:

• The entire context of the interrupted program or task is saved automatically.

• A new TSS permits the handler to use a new privilege level 0 stack when handling the exception or interrupt. If an exception or interrupt occurs when the current privilege level 0 stack is corrupted, accessing the handler through a task gate can prevent a system crash by providing the handler with a new privilege level 0 stack.

• The handler can be further isolated from other tasks by giving it a separate address space. This is done by giving it a separate LDT.

The disadvantage of handling an interrupt with a separate task is that the amount of machine state that must be saved on a task switch makes it slower than using an interrupt gate, resulting in increased interrupt latency.

A task gate in the IDT references a TSS descriptor in the GDT (refer to Figure 5-5). A switch to the handler task is handled in the same manner as an ordinary task switch (refer to Section 6.3., “Task Switching” in Chapter 6, Task Management). The link back to the interrupted task is stored in the previous task link field of the handler task’s TSS. If an exception caused an error code to be generated, this error code is copied to the stack of the new task.

Figure 5-5. Interrupt Task Switch

[Figure not reproduced. The interrupt vector selects a task gate in the IDT. The TSS selector in the gate selects a TSS descriptor in the GDT, which supplies the base address of the TSS for the interrupt-handling task.]

When exception- or interrupt-handler tasks are used in an operating system, there are actually two mechanisms that can be used to dispatch tasks: the software scheduler (part of the operating system) and the hardware scheduler (part of the processor’s interrupt mechanism). The software scheduler needs to accommodate interrupt tasks that may be dispatched when interrupts are enabled.

5.11. ERROR CODE

When an exception condition is related to a specific segment, the processor pushes an error code onto the stack of the exception handler (whether it is a procedure or task). The error code has the format shown in Figure 5-6. The error code resembles a segment selector; however, instead of a TI flag and RPL field, the error code contains 3 flags:

EXT     External event (bit 0). When set, indicates that an event external to the program caused the exception, such as a hardware interrupt.

IDT     Descriptor location (bit 1). When set, indicates that the index portion of the error code refers to a gate descriptor in the IDT; when clear, indicates that the index refers to a descriptor in the GDT or the current LDT.

TI      GDT/LDT (bit 2). Only used when the IDT flag is clear. When set, the TI flag indicates that the index portion of the error code refers to a segment or gate descriptor in the LDT; when clear, it indicates that the index refers to a descriptor in the current GDT.

The segment selector index field provides an index into the IDT, GDT, or current LDT to the segment or gate selector being referenced by the error code. In some cases the error code is null (that is, all bits in the lower word are clear). A null error code indicates that the error was not caused by a reference to a specific segment or that a null segment descriptor was referenced in an operation.

The format of the error code is different for page-fault exceptions (#PF), refer to “Interrupt 14—Page-Fault Exception (#PF)” in this chapter.

The error code is pushed on the stack as a doubleword or word (depending on the default interrupt, trap, or task gate size). To keep the stack aligned for doubleword pushes, the upper half of the error code is reserved. Note that the error code is not popped when the IRET instruction is executed to return from an exception handler, so the handler must remove the error code before executing a return.

Figure 5-6. Error Code

[Figure not reproduced. Bits 31..16: Reserved; bits 15..3: Segment Selector Index; bit 2: TI; bit 1: IDT; bit 0: EXT.]

Error codes are not pushed on the stack for exceptions that are generated externally (with the INTR or LINT[1:0] pins) or the INT n instruction, even if an error code is normally produced for those exceptions.
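A handler that receives a segment-related error code might split it into its fields as in the following C sketch. The structure and function names are hypothetical, and the sketch applies only to the selector-style format described in this section, not to the page-fault error code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool ext;        /* bit 0: external event */
    bool idt;        /* bit 1: index refers to a gate in the IDT */
    bool ti;         /* bit 2: LDT (1) or GDT (0); used only if idt is clear */
    uint16_t index;  /* bits 15..3: segment selector index */
} error_code_fields;

/* Decode the selector-style error code; the upper half is reserved. */
error_code_fields decode_error_code(uint32_t code)
{
    error_code_fields f;
    f.ext   = (code & 1u) != 0;
    f.idt   = ((code >> 1) & 1u) != 0;
    f.ti    = ((code >> 2) & 1u) != 0;
    f.index = (uint16_t)((code >> 3) & 0x1FFFu);
    return f;
}

/* A null error code has all bits in the lower word clear. */
bool error_code_is_null(uint32_t code)
{
    return (code & 0xFFFFu) == 0;
}
```

For example, the value 0072H decodes to IDT = 1 with index 14, that is, a reference to the gate for vector 14 in the IDT.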

5.12. EXCEPTION AND INTERRUPT REFERENCE

The following sections describe conditions which generate exceptions and interrupts. They are arranged in the order of vector numbers. The information contained in these sections is as follows:

Exception Class              Indicates whether the exception class is a fault, trap, or abort type. Some exceptions can be either a fault or trap type, depending on when the error condition is detected. (This section is not applicable to interrupts.)

Description                  Gives a general description of the purpose of the exception or interrupt type. It also describes how the processor handles the exception or interrupt.

Exception Error Code         Indicates whether an error code is saved for the exception. If one is saved, the contents of the error code are described. (This section is not applicable to interrupts.)

Saved Instruction Pointer    Describes which instruction the saved (or return) instruction pointer points to. It also indicates whether the pointer can be used to restart a faulting instruction.

Program State Change         Describes the effects of the exception or interrupt on the state of the currently running program or task and the possibilities of restarting the program or task without loss of continuity.


Interrupt 0—Divide Error Exception (#DE)

Exception Class Fault.

Description

Indicates the divisor operand for a DIV or IDIV instruction is 0 or that the result cannot be represented in the number of bits specified for the destination operand.

Exception Error Code

None.

Saved Instruction Pointer

Saved contents of CS and EIP registers point to the instruction that generated the exception.

Program State Change

A program-state change does not accompany the divide error, because the exception occurs before the faulting instruction is executed.
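The two divide-error conditions named in the description (a zero divisor, or a quotient that does not fit in the destination) can be checked ahead of time in C. The sketch below assumes the 32-bit operand-size forms, in which a 64-bit dividend is divided by a 32-bit divisor and the quotient must fit in 32 bits; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Unsigned case (DIV with a 32-bit divisor): true if #DE would result. */
bool div32_faults(uint64_t dividend, uint32_t divisor)
{
    if (divisor == 0)
        return true;
    return dividend / divisor > UINT32_MAX;
}

/* Signed case (IDIV with a 32-bit divisor): true if #DE would result.
 * The INT64_MIN / -1 case is tested first because that division is
 * undefined behavior in C. */
bool idiv32_faults(int64_t dividend, int32_t divisor)
{
    int64_t q;
    if (divisor == 0)
        return true;
    if (dividend == INT64_MIN && divisor == -1)
        return true;
    q = dividend / divisor;
    return q > INT32_MAX || q < INT32_MIN;
}
```

A program can use checks of this kind to avoid the exception entirely; if the exception does occur, the saved instruction pointer still refers to the faulting DIV or IDIV instruction.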


Interrupt 1—Debug Exception (#DB)

Exception Class Trap or Fault. The exception handler can distinguish between traps or faults by examining the contents of DR6 and the other debug registers.

Description

Indicates that one or more of several debug-exception conditions has been detected. Whether the exception is a fault or a trap depends on the condition, as shown below:

Exception Condition                                                    Exception Class

Instruction fetch breakpoint                                           Fault

Data read or write breakpoint                                          Trap

I/O read or write breakpoint                                           Trap

General detect condition (in conjunction with in-circuit emulation)    Fault

Single-step                                                            Trap

Task-switch                                                            Trap

Execution of INT 1 instruction                                         Trap

Refer to Chapter 15, Debugging and Performance Monitoring, for detailed information about the debug exceptions.

Exception Error Code

None. An exception handler can examine the debug registers to determine which condition caused the exception.

Saved Instruction Pointer

Fault—Saved contents of CS and EIP registers point to the instruction that generated the exception.

Trap—Saved contents of CS and EIP registers point to the instruction following the instruction that generated the exception.

Program State Change

Fault—A program-state change does not accompany the debug exception, because the exception occurs before the faulting instruction is executed. The program can resume normal execution upon returning from the debug exception handler.

Trap—A program-state change does accompany the debug exception, because the instruction or task switch being executed is allowed to complete before the exception is generated. However, the new state of the program is not corrupted and execution of the program can continue reliably.


Interrupt 2—NMI Interrupt

Exception Class Not applicable.

Description

The nonmaskable interrupt (NMI) is generated externally by asserting the processor’s NMI pin or through an NMI request set by the I/O APIC to the local APIC on the APIC serial bus. This interrupt causes the NMI interrupt handler to be called.

Exception Error Code

Not applicable.

Saved Instruction Pointer

The processor always takes an NMI interrupt on an instruction boundary. The saved contents of CS and EIP registers point to the next instruction to be executed at the point the interrupt is taken. Refer to Section 5.4., “Program or Task Restart” for more information about when the processor takes NMI interrupts.

Program State Change

The instruction executing when an NMI interrupt is received is completed before the NMI is generated. A program or task can thus be restarted upon returning from an interrupt handler without loss of continuity, provided the interrupt handler saves the state of the processor before handling the interrupt and restores the processor’s state prior to a return.


Interrupt 3—Breakpoint Exception (#BP)

Exception Class Trap.

Description

Indicates that a breakpoint instruction (INT 3) was executed, causing a breakpoint trap to be generated. Typically, a debugger sets a breakpoint by replacing the first opcode byte of an instruction with the opcode for the INT 3 instruction. (The INT 3 instruction is one byte long, which makes it easy to replace an opcode in a code segment in RAM with the breakpoint opcode.) The operating system or a debugging tool can use a data segment mapped to the same physical address space as the code segment to place an INT 3 instruction in places where it is desired to call the debugger.

With the P6 family, Pentium®, Intel486™, and Intel386™ processors, it is more convenient to set breakpoints with the debug registers. (Refer to Section 15.3.2., “Breakpoint Exception (#BP)—Interrupt Vector 3”, in Chapter 15, Debugging and Performance Monitoring, for information about the breakpoint exception.) If more breakpoints are needed beyond what the debug registers allow, the INT 3 instruction can be used.

The breakpoint (#BP) exception can also be generated by executing the INT n instruction with an operand of 3. The action of this two-byte form (INT n with an operand of 3) is slightly different from that of the one-byte INT 3 instruction (refer to “INTn/INTO/INT3—Call to Interrupt Procedure” in Chapter 3 of the Intel Architecture Software Developer’s Manual, Volume 2).

Exception Error Code

None.

Saved Instruction Pointer

Saved contents of CS and EIP registers point to the instruction following the INT 3 instruction.

Program State Change

Even though the EIP points to the instruction following the breakpoint instruction, the state of the program is essentially unchanged because the INT 3 instruction does not affect any register or memory locations. The debugger can thus resume the suspended program by replacing the INT 3 instruction that caused the breakpoint with the original opcode and decrementing the saved contents of the EIP register. Upon returning from the debugger, program execution resumes with the replaced instruction.
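The set-and-resume procedure described above can be sketched in C against a byte array that stands in for a writable alias of the code segment. The names breakpoint, set_breakpoint, and resume_from_breakpoint are hypothetical; the one-byte INT 3 opcode is CCH.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCCu  /* one-byte breakpoint instruction */

typedef struct {
    size_t addr;    /* offset of the patched instruction */
    uint8_t saved;  /* original first opcode byte */
} breakpoint;

/* Replace the first opcode byte at addr with INT 3, remembering it. */
breakpoint set_breakpoint(uint8_t *code, size_t addr)
{
    breakpoint bp;
    bp.addr = addr;
    bp.saved = code[addr];
    code[addr] = INT3_OPCODE;
    return bp;
}

/* Called when the #BP handler runs: the saved EIP points to the byte
 * after INT 3. Restore the original opcode and return the decremented
 * EIP so that execution resumes with the replaced instruction. */
uint32_t resume_from_breakpoint(uint8_t *code, const breakpoint *bp,
                                uint32_t saved_eip)
{
    assert(saved_eip == bp->addr + 1);
    code[bp->addr] = bp->saved;
    return saved_eip - 1;
}
```

A real debugger would normally re-insert the breakpoint after single-stepping the restored instruction; that step is outside the scope of this sketch.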


Interrupt 4—Overflow Exception (#OF)

Exception Class Trap.

Description

Indicates that an overflow trap occurred when an INTO instruction was executed. The INTO instruction checks the state of the OF flag in the EFLAGS register. If the OF flag is set, an overflow trap is generated.

Some arithmetic instructions (such as the ADD and SUB) perform both signed and unsigned arithmetic. These instructions set the OF and CF flags in the EFLAGS register to indicate signed overflow and unsigned overflow, respectively. When performing arithmetic on signed operands, the OF flag can be tested directly or the INTO instruction can be used. The benefit of using the INTO instruction is that if the overflow exception is detected, an exception handler can be called automatically to handle the overflow condition.
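The difference between the two flags can be illustrated with a small model of a 32-bit ADD. The function name add32_flags is hypothetical, and the sketch models only CF and OF, not the other EFLAGS bits. A subsequent INTO would trap only in the cases where OF is reported as set.

```python
MASK32 = 0xFFFFFFFF
SIGN32 = 0x80000000


def add32_flags(a: int, b: int):
    """Return (result, CF, OF) for a modeled 32-bit ADD of a and b."""
    a &= MASK32
    b &= MASK32
    full = a + b
    result = full & MASK32
    # CF: unsigned overflow, a carry out of bit 31.
    cf = full > MASK32
    # OF: signed overflow, both operands share a sign the result lacks.
    of = bool(~(a ^ b) & (a ^ result) & SIGN32)
    return result, cf, of
```

Adding 1 to 7FFFFFFFH sets OF but not CF (the signed result wrapped), whereas adding 1 to FFFFFFFFH sets CF but not OF (the unsigned result wrapped, while the signed interpretation, -1 + 1 = 0, is correct).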

Exception Error Code

None.

Saved Instruction Pointer

The saved contents of CS and EIP registers point to the instruction following the INTO instruction.

Program State Change

Even though the EIP points to the instruction following the INTO instruction, the state of the program is essentially unchanged because the INTO instruction does not affect any register or memory locations. The program can thus resume normal execution upon returning from the overflow exception handler.


Interrupt 5—BOUND Range Exceeded Exception (#BR)

Exception Class Fault.

Description

Indicates that a BOUND-range-exceeded fault occurred when a BOUND instruction was executed. The BOUND instruction checks that a signed array index is within the upper and lower bounds of an array located in memory. If the array index is not within the bounds of the array, a BOUND-range-exceeded fault is generated.
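The check performed by BOUND can be modeled roughly as follows. This sketch assumes 32-bit operands and treats both bounds as inclusive; the names bound_check, to_signed32, and BoundRangeExceeded are hypothetical, and a Python exception stands in for the #BR fault.

```python
def to_signed32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


class BoundRangeExceeded(Exception):
    """Stands in for the BOUND-range-exceeded fault (#BR)."""


def bound_check(index: int, lower: int, upper: int) -> int:
    # All three values are compared as signed quantities.
    i, lo, hi = (to_signed32(v) for v in (index, lower, upper))
    if i < lo or i > hi:
        raise BoundRangeExceeded(f"index {i} outside [{lo}, {hi}]")
    return i
```

Because the comparison is signed, an index register holding FFFFFFFFH is treated as -1 and fails a check against bounds of 0 and 9.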

Exception Error Code

None.

Saved Instruction Pointer

The saved contents of CS and EIP registers point to the BOUND instruction that generated the exception.

Program State Change

A program-state change does not accompany the bounds-check fault, because the operands for the BOUND instruction are not modified. Returning from the BOUND-range-exceeded exception handler causes the BOUND instruction to be restarted.


Interrupt 6—Invalid Opcode Exception (#UD)

Exception Class Fault.

Description

Indicates that the processor did one of the following things:

• Attempted to execute a Streaming SIMD Extensions instruction in an Intel Architecture processor that does not support the Streaming SIMD Extensions.

• Attempted to execute a Streaming SIMD Extensions instruction when the OSFXSR bit is not set (0) in CR4. Note this does not include the following Streaming SIMD Extensions: PAVGB, PAVGW, PEXTRW, PINSRW, PMAXSW, PMAXUB, PMINSW, PMINUB, PMOVMSKB, PMULHUW, PSADBW, PSHUFW, MASKMOVQ, MOVNTQ, PREFETCH and SFENCE.

• Attempted to execute a Streaming SIMD Extensions instruction in an Intel Architecture processor which causes a numeric exception when the OSXMMEXCPT bit is not set (0) in CR4.

• Attempted to execute an invalid or reserved opcode, including any MMX™ instruction in an Intel Architecture processor that does not support the MMX™ architecture.

• Attempted to execute an MMX™ instruction or SIMD floating-point instruction when the EM flag in register CR0 is set. Note this does not include the following Streaming SIMD Extensions: SFENCE and PREFETCH.

• Attempted to execute an instruction with an operand type that is invalid for its accompanying opcode; for example, the source operand for a LES instruction is not a memory location.

• Executed a UD2 instruction.

• Detected a LOCK prefix that precedes an instruction that may not be locked or one that may be locked but the destination operand is not a memory location.

• Attempted to execute an LLDT, SLDT, LTR, STR, LSL, LAR, VERR, VERW, or ARPL instruction while in real-address or virtual-8086 mode.

• Attempted to execute the RSM instruction when not in SMM mode.

In the P6 family processors, this exception is not generated until an attempt is made to retire the result of executing an invalid instruction; that is, decoding and speculatively attempting to execute an invalid opcode does not generate this exception. Likewise, in the Pentium® processor and earlier Intel Architecture processors, this exception is not generated as the result of prefetching and preliminary decoding of an invalid instruction. (Refer to Section 5.4., “Program or Task Restart” for general rules for taking of interrupts and exceptions.)

The opcodes D6 and F1 are undefined opcodes that are reserved by Intel. These opcodes, even though undefined, do not generate an invalid opcode exception.


The UD2 instruction is guaranteed to generate an invalid opcode exception.

Exception Error Code

None.

Saved Instruction Pointer

The saved contents of CS and EIP registers point to the instruction that generated the exception.

Program State Change

A program-state change does not accompany an invalid-opcode fault, because the invalid instruction is not executed.


Interrupt 7—Device Not Available Exception (#NM)

Exception Class Fault.

Description

Indicates that the device-not-available fault was generated by one of the following three conditions:

• The processor executed a floating-point instruction while the EM flag of register CR0 was set.

• The processor executed a floating-point, MMX™ or SIMD floating-point (excluding prefetch, sfence or streaming store instructions) instruction while the TS flag of register CR0 was set.

• The processor executed a WAIT or FWAIT instruction while the MP and TS flags of register CR0 were set.

The EM flag is set when the processor does not have an internal floating-point unit. An exception is then generated each time a floating-point instruction is encountered, allowing an exception handler to call floating-point instruction emulation routines.

The TS flag indicates that a context switch (task switch) has occurred since the last time a floating-point, MMX™ or SIMD floating-point (excluding prefetch, sfence or streaming store instructions) instruction was executed, but that the context of the FPU was not saved. When the TS flag is set, the processor generates a device-not-available exception each time a floating-point, MMX™ or SIMD floating-point (excluding prefetch, sfence or streaming store instructions) instruction is encountered. The exception handler can then save the context of the FPU before it executes the instruction. Refer to Section 2.5., “Control Registers”, in Chapter 2, System Architecture Overview, for more information about the TS flag.

The MP flag in control register CR0 is used along with the TS flag to determine if WAIT or FWAIT instructions should generate a device-not-available exception. It extends the function of the TS flag to the WAIT and FWAIT instructions, giving the exception handler an opportunity to save the context of the FPU before the WAIT or FWAIT instruction is executed. The MP flag is provided primarily for use with the Intel286 and Intel386™ DX processors. For programs running on the P6 family, Pentium®, or Intel486™ DX processors, or the Intel 487 SX coprocessors, the MP flag should always be set; for programs running on the Intel486™ SX processor, the MP flag should be clear.
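The interaction of the EM, TS, and MP flags described above can be summarized in a short decision function. This is a sketch under stated assumptions: the function name raises_nm and the instruction-kind strings are hypothetical, the bit positions used are MP = bit 1, EM = bit 2, and TS = bit 3 of CR0, and MMX™ or SIMD floating-point instructions executed with EM set are treated as producing the invalid-opcode exception (#UD) rather than #NM, following the #UD description in this chapter.

```python
CR0_MP = 1 << 1  # monitor coprocessor
CR0_EM = 1 << 2  # emulation
CR0_TS = 1 << 3  # task switched


def raises_nm(cr0: int, kind: str) -> bool:
    """Return True if the modeled instruction kind raises #NM.

    kind is one of 'x87', 'mmx', 'simd_fp', or 'wait' (hypothetical labels).
    """
    em = bool(cr0 & CR0_EM)
    ts = bool(cr0 & CR0_TS)
    mp = bool(cr0 & CR0_MP)
    if kind == "wait":
        # WAIT/FWAIT fault only when both MP and TS are set.
        return mp and ts
    if kind == "x87":
        # Floating-point instructions fault when EM or TS is set.
        return em or ts
    if kind in ("mmx", "simd_fp"):
        # Assumption: with EM set these raise #UD instead, so #NM needs EM clear.
        return ts and not em
    raise ValueError(f"unknown instruction kind: {kind}")
```

With only TS set, a floating-point instruction faults but a WAIT instruction does not; setting MP as well extends the fault to WAIT and FWAIT.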

Exception Error Code

None.

Saved Instruction Pointer

The saved contents of CS and EIP registers point to the floating-point instruction or the WAIT/FWAIT instruction that generated the exception.


Program State Change

A program-state change does not accompany a device-not-available fault, because the instruction that generated the exception is not executed.

If the EM flag is set, the exception handler can then read the floating-point instruction pointed to by the EIP and call the appropriate emulation routine.

If the MP and TS flags are set or the TS flag alone is set, the exception handler can save the context of the FPU, clear the TS flag, and continue execution at the interrupted floating-point or WAIT/FWAIT instruction.


Interrupt 8—Double Fault Exception (#DF)

Exception Class Abort.

Description

Indicates that the processor detected a second exception while calling an exception handler for a prior exception. Normally, when the processor detects another exception while trying to call an exception handler, the two exceptions can be handled serially. If, however, the processor cannot handle them serially, it signals the double-fault exception. To determine when two faults need to be signaled as a double fault, the processor divides the exceptions into three classes: benign exceptions, contributory exceptions, and page faults (refer to Table 5-4).

Table 5-5 shows the various combinations of exception classes that cause a double fault to be generated. A double-fault exception falls in the abort class of exceptions. The program or task cannot be restarted or resumed. The double-fault handler can be used to collect diagnostic information about the state of the machine and/or, when possible, to shut the application and/or system down gracefully or restart the system.

A segment or page fault may be encountered while prefetching instructions; however, this behavior is outside the domain of Table 5-5. Any further faults generated while the processor is attempting to transfer control to the appropriate fault handler could still lead to a double-fault sequence.

Table 5-4. Interrupt and Exception Classes

Class                              Vector Number   Description

Benign Exceptions and Interrupts   1               Debug Exception
                                   2               NMI Interrupt
                                   3               Breakpoint
                                   4               Overflow
                                   5               BOUND Range Exceeded
                                   6               Invalid Opcode
                                   7               Device Not Available
                                   9               Coprocessor Segment Overrun
                                   16              Floating-Point Error
                                   17              Alignment Check
                                   18              Machine Check
                                   19              SIMD floating-point extensions
                                   All             INT n
                                   All             INTR

Contributory Exceptions            0               Divide Error
                                   10              Invalid TSS
                                   11              Segment Not Present
                                   12              Stack Fault
                                   13              General Protection

Page Faults                        14              Page Fault


If another exception occurs while attempting to call the double-fault handler, the processor enters shutdown mode. This mode is similar to the state following execution of an HLT instruction. In this mode, the processor stops executing instructions until an NMI interrupt, SMI interrupt, hardware reset, or INIT# is received. The processor generates a special bus cycle to indicate that it has entered shutdown mode. Software designers may need to be aware of the response of hardware to receiving this signal. For example, hardware may turn on an indicator light on the front panel, generate an NMI interrupt to record diagnostic information, invoke reset initialization, generate an INIT initialization, or generate an SMI.

If the shutdown occurs while the processor is executing an NMI interrupt handler, then only a hardware reset can restart the processor.

Exception Error Code

Zero. The processor always pushes an error code of 0 onto the stack of the double-fault handler.

Saved Instruction Pointer

The saved contents of CS and EIP registers are undefined.

Program State Change

The program state following a double-fault exception is undefined. The program or task cannot be resumed or restarted. The only available action of the double-fault exception handler is to collect all possible context information for use in diagnostics and then close the application and/or shut down or reset the processor.

Table 5-5. Conditions for Generating a Double Fault

                  Second Exception

First Exception   Benign                       Contributory                 Page Fault

Benign            Handle Exceptions Serially   Handle Exceptions Serially   Handle Exceptions Serially

Contributory      Handle Exceptions Serially   Generate a Double Fault      Handle Exceptions Serially

Page Fault        Handle Exceptions Serially   Generate a Double Fault      Generate a Double Fault
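The classification in Table 5-4 and the decision matrix in Table 5-5 can be expressed compactly in code. The following is a minimal sketch with hypothetical function names; it does not model vector 8 itself, where a further exception while calling the double-fault handler leads to shutdown rather than to another table lookup.

```python
CONTRIBUTORY_VECTORS = {0, 10, 11, 12, 13}  # per Table 5-4
PAGE_FAULT_VECTOR = 14


def exception_class(vector: int) -> str:
    """Classify a vector as 'benign', 'contributory', or 'page_fault'."""
    if vector in CONTRIBUTORY_VECTORS:
        return "contributory"
    if vector == PAGE_FAULT_VECTOR:
        return "page_fault"
    return "benign"  # all other exceptions and interrupts


def is_double_fault(first: int, second: int) -> bool:
    """Apply Table 5-5 to the first and second exception vectors."""
    c1, c2 = exception_class(first), exception_class(second)
    if c1 == "contributory":
        return c2 == "contributory"
    if c1 == "page_fault":
        return c2 in ("contributory", "page_fault")
    return False  # a benign first exception is always handled serially
```

For instance, a general-protection exception (13) raised while calling the page-fault handler (14) yields a double fault, whereas a page fault raised while calling the general-protection handler is handled serially.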


Interrupt 9—Coprocessor Segment Overrun

Exception Class Abort. (Intel reserved; do not use. Recent Intel Architecture processors do not generate this exception.)

Description

Indicates that an Intel386™ CPU-based system with an Intel 387 math coprocessor detected a page or segment violation while transferring the middle portion of an Intel 387 math coprocessor operand. The P6 family, Pentium®, and Intel486™ processors do not generate this exception; instead, this condition is detected with a general protection exception (#GP), interrupt 13.

Exception Error Code

None.

Saved Instruction Pointer

The saved contents of CS and EIP registers point to the instruction that generated the exception.

Program State Change

The program state following a coprocessor segment-overrun exception is undefined. The program or task cannot be resumed or restarted. The only available action of the exception handler is to save the instruction pointer and reinitialize the FPU using the FNINIT instruction.


Interrupt 10—Invalid TSS Exception (#TS)

Exception Class Fault.

Description

Indicates that a task switch was attempted and that invalid information was detected in the TSS for the target task. Table 5-6 shows the conditions that will cause an invalid-TSS exception to be generated. In general, these invalid conditions result from protection violations for the TSS descriptor; the LDT pointed to by the TSS; or the stack, code, or data segments referenced by the TSS.

This exception can be generated either in the context of the original task or in the context of the new task (refer to Section 6.3., “Task Switching” in Chapter 6, Task Management). Until the processor has completely verified the presence of the new TSS, the exception is generated in the context of the original task. Once the existence of the new TSS is verified, the task switch is considered complete. Any invalid-TSS conditions detected after this point are handled in the context of the new task. (A task switch is considered complete when the task register is loaded with the segment selector for the new TSS and, if the switch is due to a procedure call or interrupt, the previous task link field of the new TSS references the old TSS.)

To ensure that a valid TSS is available to process the exception, the invalid-TSS exception handler must be a task called using a task gate.

Table 5-6. Invalid TSS Conditions

Error Code Index               Invalid Condition

TSS segment selector index     TSS segment limit less than 67H for 32-bit TSS or less than 2CH for 16-bit TSS
LDT segment selector index     Invalid LDT or LDT not present
Stack-segment selector index   Stack-segment selector exceeds descriptor table limit
Stack-segment selector index   Stack segment is not writable
Stack-segment selector index   Stack segment DPL ≠ CPL
Stack-segment selector index   Stack-segment selector RPL ≠ CPL
Code-segment selector index    Code-segment selector exceeds descriptor table limit
Code-segment selector index    Code segment is not executable
Code-segment selector index    Nonconforming code segment DPL ≠ CPL
Code-segment selector index    Conforming code segment DPL greater than CPL
Data-segment selector index    Data-segment selector exceeds descriptor table limit
Data-segment selector index    Data segment not readable


Exception Error Code

An error code containing the segment selector index for the segment descriptor that caused the violation is pushed onto the stack of the exception handler. If the EXT flag is set, it indicates that the exception was caused by an event external to the currently running program (for example, if an external interrupt handler using a task gate attempted a task switch to an invalid TSS).

Saved Instruction Pointer

If the exception condition was detected before the task switch was carried out, the saved contents of CS and EIP registers point to the instruction that invoked the task switch. If the exception condition was detected after the task switch was carried out, the saved contents of CS and EIP registers point to the first instruction of the new task.

Program State Change

The ability of the invalid-TSS handler to recover from the fault depends on the error condition that causes the fault. Refer to Section 6.3., “Task Switching” in Chapter 6, Task Management for more information on the task switch process and the possible recovery actions that can be taken.

If an invalid TSS exception occurs during a task switch, it can occur before or after the commit-to-new-task point. If it occurs before the commit point, no program state change occurs. If it occurs after the commit point (when the segment descriptor information for the new segment selectors has been loaded in the segment registers), the processor will load all the state information from the new TSS before it generates the exception. During a task switch, the processor first loads all the segment registers with segment selectors from the TSS, then checks their contents for validity. If an invalid TSS exception is discovered, the remaining segment registers are loaded but not checked for validity and therefore may not be usable for referencing memory. The invalid TSS handler should not rely on being able to use the segment selectors found in the CS, SS, DS, ES, FS, and GS registers without causing another exception. The exception handler should load all segment registers before trying to resume the new task; otherwise, general-protection exceptions (#GP) may result later under conditions that make diagnosis more difficult. The Intel recommended way of dealing with this situation is to use a task for the invalid TSS exception handler. The task switch back to the interrupted task from the invalid-TSS exception-handler task will then cause the processor to check the registers as it loads them from the TSS.


Interrupt 11—Segment Not Present (#NP)

Exception Class Fault.

Description

Indicates that the present flag of a segment or gate descriptor is clear. The processor can generate this exception during any of the following operations:

• While attempting to load CS, DS, ES, FS, or GS registers. [Detection of a not-present segment while loading the SS register causes a stack fault exception (#SS) to be generated.] This situation can occur while performing a task switch.

• While attempting to load the LDTR using an LLDT instruction. Detection of a not-present LDT while loading the LDTR during a task switch operation causes an invalid-TSS exception (#TS) to be generated.

• When executing the LTR instruction and the TSS is marked not present.

• While attempting to use a gate descriptor or TSS that is marked segment-not-present, but is otherwise valid.

An operating system typically uses the segment-not-present exception to implement virtual memory at the segment level. If the exception handler loads the segment and returns, the interrupted program or task resumes execution.

A not-present indication in a gate descriptor, however, does not indicate that a segment is not present (because gates do not correspond to segments). The operating system may use the present flag for gate descriptors to trigger exceptions of special significance to the operating system.

Exception Error Code

An error code containing the segment selector index for the segment descriptor that caused the violation is pushed onto the stack of the exception handler. If the EXT flag is set, it indicates that the exception resulted from an external event (NMI or INTR) that caused an interrupt, which subsequently referenced a not-present segment. The IDT flag is set if the error code refers to an IDT entry (e.g., an INT instruction referencing a not-present gate).
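A handler typically begins by splitting the error code into its fields. The sketch below assumes the selector-style error code layout given in the error code discussion elsewhere in this chapter (EXT in bit 0, IDT in bit 1, TI in bit 2, and the selector or vector index in bits 3 through 15); that layout is not restated in this section, and the names SelectorErrorCode and decode_error_code are hypothetical.

```python
from typing import NamedTuple


class SelectorErrorCode(NamedTuple):
    ext: bool   # exception caused by an event external to the program
    idt: bool   # index refers to a gate descriptor in the IDT
    ti: bool    # table indicator (LDT if set); meaningful only when idt is clear
    index: int  # descriptor-table index or IDT vector number


def decode_error_code(code: int) -> SelectorErrorCode:
    """Split the low 16 bits of an error code into its assumed fields."""
    return SelectorErrorCode(
        ext=bool(code & 0x1),
        idt=bool(code & 0x2),
        ti=bool(code & 0x4),
        index=(code >> 3) & 0x1FFF,
    )
```

Under this layout, an error code of 002AH decodes to an IDT reference (IDT flag set, EXT clear) with index 5.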

Saved Instruction Pointer

The saved contents of CS and EIP registers normally point to the instruction that generated the exception. If the exception occurred while loading segment descriptors for the segment selectors in a new TSS, the CS and EIP registers point to the first instruction in the new task. If the exception occurred while accessing a gate descriptor, the CS and EIP registers point to the instruction that invoked the access (for example a CALL instruction that references a call gate).


Program State Change

If the segment-not-present exception occurs as the result of loading a register (CS, DS, SS, ES, FS, GS, or LDTR), a program-state change does accompany the exception, because the register is not loaded. Recovery from this exception is possible by simply loading the missing segment into memory and setting the present flag in the segment descriptor.

If the segment-not-present exception occurs while accessing a gate descriptor, a program-state change does not accompany the exception. Recovery from this exception is possible merely by setting the present flag in the gate descriptor.

If a segment-not-present exception occurs during a task switch, it can occur before or after the commit-to-new-task point (refer to Section 6.3., “Task Switching” in Chapter 6, Task Management). If it occurs before the commit point, no program state change occurs. If it occurs after the commit point, the processor will load all the state information from the new TSS (without performing any additional limit, present, or type checks) before it generates the exception. The segment-not-present exception handler should thus not rely on being able to use the segment selectors found in the CS, SS, DS, ES, FS, and GS registers without causing another exception. (Refer to the Program State Change description for “Interrupt 10—Invalid TSS Exception (#TS)” in this chapter for additional information on how to handle this situation.)


Interrupt 12—Stack Fault Exception (#SS)

Exception Class Fault.

Description

Indicates that one of the following stack related conditions was detected:

• A limit violation is detected during an operation that refers to the SS register. Operations that can cause a limit violation include stack-oriented instructions such as POP, PUSH, CALL, RET, IRET, ENTER, and LEAVE, as well as other memory references which implicitly or explicitly use the SS register (for example, MOV AX, [BP+6] or MOV AX, SS:[EAX+6]). The ENTER instruction generates this exception when there is not enough stack space for allocating local variables.

• A not-present stack segment is detected when attempting to load the SS register. This violation can occur during the execution of a task switch, a CALL instruction to a different privilege level, a return to a different privilege level, an LSS instruction, or a MOV or POP instruction to the SS register.

Recovery from this fault is possible by either extending the limit of the stack segment (in the case of a limit violation) or loading the missing stack segment into memory (in the case of a not-present violation).

Exception Error Code

If the exception is caused by a not-present stack segment or by overflow of the new stack during an inter-privilege-level call, the error code contains a segment selector for the segment that caused the exception. Here, the exception handler can test the present flag in the segment descriptor pointed to by the segment selector to determine the cause of the exception. For a normal limit violation (on a stack segment already in use) the error code is set to 0.

Saved Instruction Pointer

The saved contents of CS and EIP registers generally point to the instruction that generated the exception. However, when the exception results from attempting to load a not-present stack segment during a task switch, the CS and EIP registers point to the first instruction of the new task.

Program State Change

A program-state change does not generally accompany a stack-fault exception, because the instruction that generated the fault is not executed. Here, the instruction can be restarted after the exception handler has corrected the stack fault condition.

If a stack fault occurs during a task switch, it occurs after the commit-to-new-task point (refer to Section 6.3., “Task Switching” in Chapter 6, Task Management). Here, the processor loads all the state information from the new TSS (without performing any additional limit, present, or type checks) before it generates the exception. The stack fault handler should thus not rely on being able to use the segment selectors found in the CS, SS, DS, ES, FS, and GS registers without causing another exception. The exception handler should check all segment registers before trying to resume the new task; otherwise, general protection faults may result later under conditions that are more difficult to diagnose. (Refer to the Program State Change description for “Interrupt 10—Invalid TSS Exception (#TS)” in this chapter for additional information on how to handle this situation.)


Interrupt 13—General Protection Exception (#GP)

Exception Class Fault.

Description

Indicates that the processor detected one of a class of protection violations called “general-protection violations.” The conditions that cause this exception to be generated comprise all the protection violations that do not cause other exceptions to be generated (such as invalid-TSS, segment-not-present, stack-fault, or page-fault exceptions). The following conditions cause general-protection exceptions to be generated:

• Exceeding the segment limit when accessing the CS, DS, ES, FS, or GS segments.

• Exceeding the segment limit when referencing a descriptor table (except during a task switch or a stack switch).

• Transferring execution to a segment that is not executable.

• Writing to a code segment or a read-only data segment.

• Reading from an execute-only code segment.

• Loading the SS register with a segment selector for a read-only segment (unless the selector comes from a TSS during a task switch, in which case an invalid-TSS exception occurs).

• Loading the SS, DS, ES, FS, or GS register with a segment selector for a system segment.

• Loading the DS, ES, FS, or GS register with a segment selector for an execute-only code segment.

• Loading the SS register with the segment selector of an executable segment or a null segment selector.

• Loading the CS register with a segment selector for a data segment or a null segment selector.

• Accessing memory using the DS, ES, FS, or GS register when it contains a null segment selector.

• Switching to a busy task during a call or jump to a TSS.

• Switching to an available (nonbusy) task during the execution of an IRET instruction.

• Using a segment selector on task switch that points to a TSS descriptor in the current LDT. TSS descriptors can only reside in the GDT.

• Violating any of the privilege rules described in Chapter 4, Protection.

• Exceeding the instruction length limit of 15 bytes (this can occur only when redundant prefixes are placed before an instruction).


• Loading the CR0 register with a set PG flag (paging enabled) and a clear PE flag (protection disabled).

• Loading the CR0 register with a set NW flag and a clear CD flag.

• Referencing an entry in the IDT (following an interrupt or exception) that is not an interrupt, trap, or task gate.

• Attempting to access an interrupt or exception handler through an interrupt or trap gate from virtual-8086 mode when the handler’s code segment DPL is greater than 0.

• Attempting to write a 1 into a reserved bit of CR4.

• Attempting to execute a privileged instruction when the CPL is not equal to 0 (refer to Section 4.9., “Privileged Instructions” in Chapter 4, Protection for a list of privileged instructions).

• Writing to a reserved bit in an MSR.

• Accessing a gate that contains a null segment selector.

• Executing the INT n instruction when the CPL is greater than the DPL of the referenced interrupt, trap, or task gate.

• The segment selector in a call, interrupt, or trap gate does not point to a code segment.

• The segment selector operand in the LLDT instruction is a local type (TI flag is set) or does not point to a segment descriptor of the LDT type.

• The segment selector operand in the LTR instruction is local or points to a TSS that is not available.

• The target code-segment selector for a call, jump, or return is null.

• If the PAE and/or PSE flag in control register CR4 is set and the processor detects any reserved bits in a page-directory-pointer-table entry set to 1. These bits are checked during a write to control registers CR0, CR3, or CR4 that causes a reloading of the page-directory-pointer-table entry.
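Two of the conditions in the list above concern inconsistent flag combinations in a value loaded into CR0, and software can check a candidate value before loading it. The sketch below covers only those two conditions; the function name cr0_load_faults is hypothetical, and the bit positions used are PE = bit 0, NW = bit 29, CD = bit 30, and PG = bit 31.

```python
CR0_PE = 1 << 0   # protection enable
CR0_NW = 1 << 29  # not write-through
CR0_CD = 1 << 30  # cache disable
CR0_PG = 1 << 31  # paging


def cr0_load_faults(value: int) -> bool:
    """Return True if loading value into CR0 would raise #GP for one of the
    two flag-combination conditions listed above (other causes not modeled)."""
    pg_without_pe = bool(value & CR0_PG) and not (value & CR0_PE)
    nw_without_cd = bool(value & CR0_NW) and not (value & CR0_CD)
    return pg_without_pe or nw_without_cd
```

A value with PG and PE both set passes this check, while a value with PG set and PE clear, or with NW set and CD clear, does not.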

A program or task can be restarted following any general-protection exception. If the exception occurs while attempting to call an interrupt handler, the interrupted program can be restarted, but the interrupt may be lost.

Exception Error Code

The processor pushes an error code onto the exception handler’s stack. If the fault condition was detected while loading a segment descriptor, the error code contains the segment selector or IDT vector number for the descriptor; otherwise, the error code is 0. The source of the selector in an error code may be any of the following:

• An operand of the instruction.

• A selector from a gate which is the operand of the instruction.

• A selector from a TSS involved in a task switch.

• IDT vector number.
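The selector or vector number carried in the error code can be pulled apart with a few bit operations. The following is a minimal sketch, assuming the general error-code layout described elsewhere in this chapter (EXT in bit 0, IDT in bit 1, TI in bit 2, index in bits 15 through 3); the function name and the dictionary keys are illustrative only and are not defined by the manual.

```python
def decode_selector_error_code(code):
    """Split a selector-format exception error code into its fields.

    Assumed layout (not restated in this section):
      bit 0      EXT   event external to the program caused the exception
      bit 1      IDT   index refers to a gate descriptor in the IDT
      bit 2      TI    0 = GDT, 1 = LDT (meaningful only when IDT is 0)
      bits 15..3 index of the descriptor, or the vector number when IDT is 1
    """
    return {
        "ext": bool(code & 0x1),
        "idt": bool(code & 0x2),
        "ti": bool(code & 0x4),
        "index": (code >> 3) & 0x1FFF,
    }
```

For example, an error code of 0x402 decodes to an IDT reference with index 0x80, which is what a handler would expect to see after an INT 80H that failed the gate DPL check described above.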

Saved Instruction Pointer

The saved contents of CS and EIP registers point to the instruction that generated the exception.

Program State Change

In general, a program-state change does not accompany a general-protection exception, because the invalid instruction or operation is not executed. An exception handler can be designed to correct all of the conditions that cause general-protection exceptions and restart the program or task without any loss of program continuity.

If a general-protection exception occurs during a task switch, it can occur before or after the commit-to-new-task point (refer to Section 6.3., “Task Switching” in Chapter 6, Task Management). If it occurs before the commit point, no program state change occurs. If it occurs after the commit point, the processor will load all the state information from the new TSS (without performing any additional limit, present, or type checks) before it generates the exception. The general-protection exception handler should thus not rely on being able to use the segment selectors found in the CS, SS, DS, ES, FS, and GS registers without causing another exception. (Refer to the Program State Change description for “Interrupt 10—Invalid TSS Exception (#TS)” in this chapter for additional information on how to handle this situation.)

Interrupt 14—Page-Fault Exception (#PF)

Exception Class Fault.

Description

Indicates that, with paging enabled (the PG flag in the CR0 register is set), the processor detected one of the following conditions while using the page-translation mechanism to translate a linear address to a physical address:

• The P (present) flag in a page-directory or page-table entry needed for the address translation is clear, indicating that a page table or the page containing the operand is not present in physical memory.

• The procedure does not have sufficient privilege to access the indicated page (that is, a procedure running in user mode attempts to access a supervisor-mode page).

• Code running in user mode attempts to write to a read-only page. In the Intel486™ and later processors, if the WP flag is set in CR0, the page fault will also be triggered by code running in supervisor mode that tries to write to a read-only user-mode page.

The exception handler can recover from page-not-present conditions and restart the program or task without any loss of program continuity. It can also restart the program or task after a privilege violation, but the problem that caused the privilege violation may be uncorrectable.

Exception Error Code

Yes (special format). The processor provides the page-fault handler with two items of information to aid in diagnosing the exception and recovering from it:

• An error code on the stack. The error code for a page fault has a format different from that for other exceptions (refer to Figure 5-7). The error code tells the exception handler four things:

— The P flag indicates whether the exception was due to a not-present page (0) or to either an access rights violation or the use of a reserved bit (1).

— The W/R flag indicates whether the memory access that caused the exception was a read (0) or write (1).

— The U/S flag indicates whether the processor was executing at user mode (1) or supervisor mode (0) at the time of the exception.

— The RSVD flag indicates that the processor detected 1s in reserved bits of the page directory, when the PSE or PAE flags in control register CR4 are set to 1. (The PSE flag is only available in the P6 family and Pentium® processors, and the PAE flag is only available on the P6 family processors. In earlier Intel Architecture processor families, the bit position of the RSVD flag is reserved.)

• The contents of the CR2 register. The processor loads the CR2 register with the 32-bit linear address that generated the exception. The page-fault handler can use this address to locate the corresponding page directory and page-table entries. If another page fault can potentially occur during execution of the page-fault handler, the handler must push the contents of the CR2 register onto the stack before the second page fault occurs.
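Locating the page-directory and page-table entries from the CR2 value amounts to slicing the linear address into fields. The sketch below assumes the simplest case only: 4-KByte pages with 32-bit two-level paging (PAE and 4-MByte pages not in use), where bits 31 through 22 index the page directory, bits 21 through 12 index the page table, and bits 11 through 0 are the byte offset. The function name is hypothetical.

```python
def split_linear_address(cr2):
    """Return (page-directory index, page-table index, byte offset).

    Assumes 4-KByte pages and 32-bit two-level paging; other paging
    modes divide the linear address differently.
    """
    pde_index = (cr2 >> 22) & 0x3FF   # bits 31..22
    pte_index = (cr2 >> 12) & 0x3FF   # bits 21..12
    offset = cr2 & 0xFFF              # bits 11..0
    return pde_index, pte_index, offset
```

A handler would typically multiply each index by 4 (the size of an entry in this paging mode) and add the table's base address to reach the entry itself.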

If a page fault is caused by a page-level protection violation, the access flag in the page-directory entry is set when the fault occurs. The behavior of Intel Architecture processors regarding the access flag in the corresponding page-table entry is model specific and not architecturally defined.

Saved Instruction Pointer

The saved contents of CS and EIP registers generally point to the instruction that generated the exception. If the page-fault exception occurred during a task switch, the CS and EIP registers may point to the first instruction of the new task (as described in the following “Program State Change” section).

Program State Change

A program-state change does not normally accompany a page-fault exception, because the instruction that causes the exception to be generated is not executed. After the page-fault exception handler has corrected the violation (for example, loaded the missing page into memory), execution of the program or task can be resumed.

Figure 5-7. Page-Fault Error Code

  Bits:   31 ... 4    3      2     1     0
  Field:  Reserved    RSVD   U/S   W/R   P

P      0  The fault was caused by a nonpresent page.
       1  The fault was caused by a page-level protection violation.

W/R    0  The access causing the fault was a read.
       1  The access causing the fault was a write.

U/S    0  The access causing the fault originated when the processor was executing in supervisor mode.
       1  The access causing the fault originated when the processor was executing in user mode.

RSVD   0  The fault was not caused by a reserved bit violation.
       1  The page fault occurred because a 1 was detected in one of the reserved bit positions of a page table entry or directory entry that was marked present.
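The four flags of Figure 5-7 can be tested with simple masks. The following is a minimal sketch of how a handler might classify a fault from the error code alone; the constant names, the function name, and the returned strings are illustrative and are not part of the architecture.

```python
PF_P = 0x1      # bit 0: 0 = nonpresent page, 1 = protection violation
PF_WR = 0x2     # bit 1: 0 = read, 1 = write
PF_US = 0x4     # bit 2: 0 = supervisor mode, 1 = user mode
PF_RSVD = 0x8   # bit 3: 1 = reserved bit set in a present entry


def describe_page_fault(error_code):
    """Return (cause, access, mode) strings for a page-fault error code."""
    if error_code & PF_RSVD:
        cause = "reserved bit violation"
    elif error_code & PF_P:
        cause = "protection violation"
    else:
        cause = "page not present"
    access = "write" if error_code & PF_WR else "read"
    mode = "user" if error_code & PF_US else "supervisor"
    return cause, access, mode
```

The RSVD test comes first because a reserved bit violation is reported with the P flag also set, so checking P alone would misreport it as an ordinary protection violation.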

When a page-fault exception is generated during a task switch, the program-state may change, as follows. During a task switch, a page-fault exception can occur during any of the following operations:

• While writing the state of the original task into the TSS of that task.

• While reading the GDT to locate the TSS descriptor of the new task.

• While reading the TSS of the new task.

• While reading segment descriptors associated with segment selectors from the new task.

• While reading the LDT of the new task to verify the segment registers stored in the new TSS.

In the last two cases the exception occurs in the context of the new task. The instruction pointer refers to the first instruction of the new task, not to the instruction which caused the task switch (or the last instruction to be executed, in the case of an interrupt). If the design of the operating system permits page faults to occur during task-switches, the page-fault handler should be called through a task gate.

If a page fault occurs during a task switch, the processor will load all the state information from the new TSS (without performing any additional limit, present, or type checks) before it generates the exception. The page-fault handler should thus not rely on being able to use the segment selectors found in the CS, SS, DS, ES, FS, and GS registers without causing another exception. (Refer to the Program State Change description for “Interrupt 10—Invalid TSS Exception (#TS)” in this chapter for additional information on how to handle this situation.)

Additional Exception-Handling Information

Special care should be taken to ensure that an exception that occurs during an explicit stack switch does not cause the processor to use an invalid stack pointer (SS:ESP). Software written for 16-bit Intel Architecture processors often uses a pair of instructions to change to a new stack, for example:

MOV SS, AX

MOV SP, StackTop

When executing this code on one of the 32-bit Intel Architecture processors, it is possible to get a page fault, general-protection fault (#GP), or alignment check fault (#AC) after the segment selector has been loaded into the SS register but before the ESP register has been loaded. At this point, the two parts of the stack pointer (SS and ESP) are inconsistent. The new stack segment is being used with the old stack pointer.

The processor does not use the inconsistent stack pointer if the exception handler switches to a well defined stack (that is, the handler is a task or a more privileged procedure). However, if the exception handler is called at the same privilege level and from the same task, the processor will attempt to use the inconsistent stack pointer.

In systems that handle page-fault, general-protection, or alignment check exceptions within the faulting task (with trap or interrupt gates), software executing at the same privilege level as the exception handler should initialize a new stack by using the LSS instruction rather than a pair of MOV instructions, as described earlier in this note. When the exception handler is running at privilege level 0 (the normal case), the problem is limited to procedures or tasks that run at privilege level 0, typically the kernel of the operating system.

Interrupt 16—Floating-Point Error Exception (#MF)

Exception Class Fault.

Description

Indicates that the FPU has detected a floating-point-error exception. The NE flag in the register CR0 must be set and the appropriate exception must be unmasked (clear mask bit in the control register) for an interrupt 16, floating-point-error exception to be generated. (Refer to Section 2.5., “Control Registers” in Chapter 2, System Architecture Overview for a detailed description of the NE flag.)

While executing floating-point instructions, the FPU detects and reports six types of floating-point errors:

• Invalid operation (#I)

— Stack overflow or underflow (#IS)

— Invalid arithmetic operation (#IA)

• Divide-by-zero (#Z)

• Denormalized operand (#D)

• Numeric overflow (#O)

• Numeric underflow (#U)

• Inexact result (precision) (#P)

For each of these error types, the FPU provides a flag in the FPU status register and a mask bit in the FPU control register. If the FPU detects a floating-point error and the mask bit for the error is set, the FPU handles the error automatically by generating a predefined (default) response and continuing program execution. The default responses have been designed to provide a reasonable result for most floating-point applications.

If the mask for the error is clear and the NE flag in register CR0 is set, the FPU does the following:

1. Sets the necessary flag in the FPU status register.

2. Waits until the next “waiting” floating-point instruction or WAIT/FWAIT instruction is encountered in the program’s instruction stream. (The FPU checks for pending floating-point exceptions on “waiting” instructions prior to executing them. All the floating-point instructions except the FNINIT, FNCLEX, FNSTSW, FNSTSW AX, FNSTCW, FNSTENV, and FNSAVE instructions are “waiting” instructions.)

3. Generates an internal error signal that causes the processor to generate a floating-point-error exception.
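The masked/unmasked decision above can be modeled as a comparison of the status-word flags against the control-word masks. The sketch below assumes the usual x87 layout, in which bits 0 through 5 of both words correspond to #I, #D, #Z, #O, #U, and #P in that order, and that NE is bit 5 of CR0; the function names are hypothetical, and the model says nothing about when the exception is delivered (that is governed by the next “waiting” instruction, as described in step 2).

```python
X87_EXCEPTIONS = ["#I", "#D", "#Z", "#O", "#U", "#P"]  # bits 0..5
CR0_NE = 1 << 5


def unmasked_x87_exceptions(status_word, control_word):
    """List the error types whose flag is set and whose mask bit is clear."""
    pending = status_word & ~control_word & 0x3F
    return [name for bit, name in enumerate(X87_EXCEPTIONS) if pending & (1 << bit)]


def interrupt_16_pending(status_word, control_word, cr0):
    """True if an interrupt 16 would be signaled at the next waiting instruction."""
    return bool(cr0 & CR0_NE) and bool(unmasked_x87_exceptions(status_word, control_word))
```

With a control word of 037FH (all six masks set) a divide-by-zero flag produces no exception; clearing the #Z mask bit (037BH) makes the same flag an unmasked error.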

All of the floating-point-error conditions can be recovered from. The floating-point-error exception handler can determine the error condition that caused the exception from the settings of the flags in the FPU status word. Refer to “Software Exception Handling” in Chapter 7 of the Intel Architecture Software Developer’s Manual, Volume 1, for more information on handling floating-point-error exceptions.

Exception Error Code

None. The FPU provides its own error information.

Saved Instruction Pointer

The saved contents of CS and EIP registers point to the floating-point or WAIT/FWAIT instruction that was about to be executed when the floating-point-error exception was generated. This is not the faulting instruction in which the error condition was detected. The address of the faulting instruction is contained in the FPU instruction pointer register. Refer to “The FPU Instruction and Operand (Data) Pointers” in Chapter 7 of the Intel Architecture Software Developer’s Manual, Volume 1, for more information about the information the FPU saves for use in handling floating-point-error exceptions.

Program State Change

A program-state change generally accompanies a floating-point-error exception because the handling of the exception is delayed until the next waiting floating-point or WAIT/FWAIT instruction following the faulting instruction. The FPU, however, saves sufficient information about the error condition to allow recovery from the error and re-execution of the faulting instruction if needed.

In situations where nonfloating-point instructions depend on the results of a floating-point instruction, a WAIT or FWAIT instruction can be inserted in front of a dependent instruction to force a pending floating-point-error exception to be handled before the dependent instruction is executed. Refer to “Floating-Point Exception Synchronization” in Chapter 7 of the Intel Architecture Software Developer’s Manual, Volume 1, for more information about synchronization of floating-point-error exceptions.

Interrupt 17—Alignment Check Exception (#AC)

Exception Class Fault.

Description

Indicates that the processor detected an unaligned memory operand when alignment checking was enabled. Alignment checks are only carried out in data (or stack) segments (not in code or system segments). An example of an alignment-check violation is a word stored at an odd byte address, or a doubleword stored at an address that is not an integer multiple of 4. Table 5-7 lists the alignment requirements for the various data types recognized by the processor.

Table 5-7. Alignment Requirements by Data Type

Data Type                                       Address Must Be Divisible By
Word                                            2
Doubleword                                      4
Single Real                                     4
Double Real                                     8
Extended Real                                   8
Segment Selector                                2
32-bit Far Pointer                              2
48-bit Far Pointer                              4
32-bit Pointer                                  4
GDTR, IDTR, LDTR, or Task Register Contents     4
FSTENV/FLDENV Save Area                         4 or 2, depending on operand size
FSAVE/FRSTOR Save Area                          4 or 2, depending on operand size
Bit String                                      2 or 4, depending on the operand-size attribute
128-bit (see note 1)                            16

1. 128-bit datatype introduced with the Pentium® III processor. This type of alignment check is done for operands less than 128 bits in size: 32-bit scalar single and 16-bit/32-bit/64-bit integer MMX™ technology; 2-, 4-, or 8-byte alignment checks are possible when #AC is enabled. Some exceptional cases are:

• The MOVUPS instruction, which performs a 128-bit unaligned load or store. In this case, 2/4/8-byte misalignments will be detected, but detection of 16-byte misalignment is not guaranteed and may vary with implementation.

• The FXSAVE/FXRSTOR instructions (refer to the instruction descriptions).

To enable alignment checking, the following conditions must be true:

• AM flag in CR0 register is set.

• AC flag in the EFLAGS register is set.

• The CPL is 3 (protected mode or virtual-8086 mode).
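The three enabling conditions and the divisibility rule of Table 5-7 combine into a simple predicate. The following sketch assumes that the AM flag is bit 18 of CR0 and the AC flag is bit 18 of EFLAGS; the function names are illustrative, and the divisor argument is meant to be taken from the table for the data type being accessed.

```python
CR0_AM = 1 << 18      # alignment mask flag in CR0
EFLAGS_AC = 1 << 18   # alignment check flag in EFLAGS


def alignment_checking_enabled(cr0, eflags, cpl):
    """All three conditions must hold: CR0.AM set, EFLAGS.AC set, CPL equal to 3."""
    return bool(cr0 & CR0_AM) and bool(eflags & EFLAGS_AC) and cpl == 3


def causes_alignment_fault(address, divisor, cr0, eflags, cpl):
    """True if an access at `address` to a type needing `divisor` alignment would fault."""
    return alignment_checking_enabled(cr0, eflags, cpl) and address % divisor != 0
```

Because the CPL term is part of the enabling test, the same misaligned access that faults at privilege level 3 is silently accepted at privilege level 0.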

Alignment-check faults are generated only when operating at privilege level 3 (user mode). Memory references that default to privilege level 0, such as segment descriptor loads, do not generate alignment-check faults, even when caused by a memory reference made from privilege level 3.

Storing the contents of the GDTR, IDTR, LDTR, or task register in memory while at privilege level 3 can generate an alignment-check fault. Although application programs do not normally store these registers, the fault can be avoided by aligning the information stored on an even word-address.

FSAVE and FRSTOR instructions generate unaligned references which can cause alignment-check faults. These instructions are rarely needed by application programs.

Exception Error Code

Yes (always zero).

Saved Instruction Pointer

The saved contents of CS and EIP registers point to the instruction that generated the exception.

Program State Change

A program-state change does not accompany an alignment-check fault, because the instruction is not executed.

Interrupt 18—Machine-Check Exception (#MC)

Exception Class Abort.

Description

Indicates that the processor detected an internal machine error or a bus error, or that an external agent detected a bus error. The machine-check exception is model-specific, available only on the P6 family and Pentium® processors. The implementation of the machine-check exception is different between the P6 family and Pentium® processors, and these implementations may not be compatible with future Intel Architecture processors. (Use the CPUID instruction to determine whether this feature is present.)

Bus errors detected by external agents are signaled to the processor on dedicated pins: the BINIT# pin on the P6 family processors and the BUSCHK# pin on the Pentium® processor. When one of these pins is enabled, asserting the pin causes error information to be loaded into machine-check registers and a machine-check exception is generated.

The machine-check exception and machine-check architecture are discussed in detail in Chapter 13, Machine-Check Architecture. Also, refer to the data books for the individual processors for processor-specific hardware information.

Exception Error Code

None. Error information is provided by machine-check MSRs.

Saved Instruction Pointer

For the P6 family processors, if the EIPV flag in the MCG_STATUS MSR is set, the saved contents of CS and EIP registers are directly associated with the error that caused the machine-check exception to be generated; if the flag is clear, the saved instruction pointer may not be associated with the error (refer to Section 13.3.1.2., “MCG_STATUS MSR”, in Chapter 13, Machine-Check Architecture).

For the Pentium® processor, contents of the CS and EIP registers may not be associated with the error.

Program State Change

A program-state change always accompanies a machine-check exception. If the machine-check mechanism is enabled (the MCE flag in control register CR4 is set), a machine-check exception results in an abort; that is, information about the exception can be collected from the machine-check MSRs, but the program cannot be restarted. If the machine-check mechanism is not enabled, a machine-check exception causes the processor to enter the shutdown state.
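The two outcomes, together with the EIPV test described under Saved Instruction Pointer, can be summarized in a short decision function. This is only a sketch for the P6 family case: it assumes that MCE is bit 6 of CR4 and that EIPV is bit 1 of the MCG_STATUS MSR, and the function name and result strings are hypothetical.

```python
CR4_MCE = 1 << 6           # machine-check enable flag in CR4
MCG_STATUS_EIPV = 1 << 1   # error-IP-valid flag in MCG_STATUS (P6 family)


def machine_check_outcome(cr4, mcg_status):
    """Describe what a machine-check event leads to under the given settings."""
    if not cr4 & CR4_MCE:
        return "shutdown"
    if mcg_status & MCG_STATUS_EIPV:
        return "abort, saved CS:EIP associated with the error"
    return "abort, saved CS:EIP may be unrelated to the error"
```

In either abort case the handler can only log the machine-check MSR contents; the interrupted program is not restartable.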

Interrupt 19—SIMD Floating-Point Exception (#XF)

Exception Class Fault.

Description

Indicates the processor has detected a SIMD floating-point execution unit exception. The appropriate status flag in the MXCSR register must be set and the particular exception unmasked for this interrupt to be generated.

There are six classes of numeric exception conditions that can occur while executing Streaming SIMD Extensions:

1. Invalid operation (#I)

2. Divide-by-zero (#Z)

3. Denormalized operand (#D)

4. Numeric overflow (#O)

5. Numeric underflow (#U)

6. Inexact result (Precision) (#P)

Invalid, Divide-by-zero, and Denormal exceptions are pre-computation exceptions, i.e., they are detected before any arithmetic operation occurs. Underflow, Overflow, and Precision exceptions are post-computation exceptions.

When numeric exceptions occur, a processor supporting Streaming SIMD Extensions takes one of two possible courses of action:

• The processor can handle the exception by itself, producing the most reasonable result and allowing numeric program execution to continue undisturbed (i.e., masked exception response).

• A software exception handler can be invoked to handle the exception (i.e., unmasked exception response).

Each of the six exception conditions described above has corresponding flag and mask bits in the MXCSR. If an exception is masked (the corresponding mask bit in MXCSR = 1), the processor takes an appropriate default action and continues with the computation. If the exception is unmasked (mask bit = 0) and the OS supports SIMD floating-point exceptions (i.e., CR4.OSXMMEXCPT = 1), a software exception handler is invoked immediately through SIMD floating-point exception interrupt vector 19. If the exception is unmasked (mask bit = 0) and the OS does not support SIMD floating-point exceptions (i.e., CR4.OSXMMEXCPT = 0), an invalid opcode exception is signaled instead of a SIMD floating-point exception.
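The three-way outcome just described can be written as a small decision function. The sketch below assumes that the MXCSR mask bits occupy bits 7 through 12 in the order #I, #D, #Z, #O, #U, #P, that OSXMMEXCPT is bit 10 of CR4, and that the invalid opcode exception is vector 6; the function name and result strings are illustrative only.

```python
SIMD_EXCEPTIONS = ["#I", "#D", "#Z", "#O", "#U", "#P"]
MXCSR_MASK_SHIFT = 7          # mask bits start at bit 7 of MXCSR
CR4_OSXMMEXCPT = 1 << 10


def simd_exception_response(condition, mxcsr, cr4):
    """Return how the processor responds to one detected SIMD numeric condition."""
    bit = SIMD_EXCEPTIONS.index(condition)
    if mxcsr & (1 << (MXCSR_MASK_SHIFT + bit)):
        return "masked: default action, computation continues"
    if cr4 & CR4_OSXMMEXCPT:
        return "vector 19 (#XF)"
    return "vector 6 (#UD)"
```

With MXCSR at 1F80H every condition is masked; clearing bit 9 (giving 1D80H) unmasks divide-by-zero, and the vector then depends only on CR4.OSXMMEXCPT.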

Note that because SIMD floating-point exceptions are precise and occur immediately, the situation does not arise where an x87-FP instruction, an FWAIT instruction, or another Streaming SIMD Extensions instruction will catch a pending unmasked SIMD floating-point exception.

Exception Error Code

None. The Streaming SIMD Extensions provide their own error information.

Saved Instruction Pointer

The saved contents of CS and EIP registers point to the Streaming SIMD Extensions instruction that was executed when the SIMD floating-point exception was generated. This is the faulting instruction in which the error condition was detected.

Program State Change

A program-state change generally accompanies a SIMD floating-point exception because the handling of the exception is immediate unless the particular exception is masked. The Pentium® III processor contains sufficient information about the error condition to allow recovery from the error and re-execution of the faulting instruction if needed.

In situations where a SIMD floating-point exception occurred while the SIMD floating-point exceptions were masked, SIMD floating-point exceptions were then unmasked, and a Streaming SIMD Extensions instruction was executed, then no exception is raised.

Interrupts 32 to 255—User Defined Interrupts

Exception Class Not applicable.

Description

Indicates that the processor did one of the following things:

• Executed an INT n instruction where the instruction operand is one of the vector numbers from 32 through 255.

• Responded to an interrupt request at the INTR pin or from the local APIC when the interrupt vector number associated with the request is from 32 through 255.

Exception Error Code

Not applicable.

Saved Instruction Pointer

The saved contents of CS and EIP registers point to the instruction that follows the INT n instruction or to the instruction following the instruction on which the INTR signal occurred.

Program State Change

A program-state change does not accompany interrupts generated by the INT n instruction or the INTR signal. The INT n instruction generates the interrupt within the instruction stream. When the processor receives an INTR signal, it commits all state changes for all previous instructions before it responds to the interrupt; so, program execution can resume upon returning from the interrupt handler.

6

Task Management

CHAPTER 6
TASK MANAGEMENT

This chapter describes the Intel Architecture’s task management facilities. These facilities are only available when the processor is running in protected mode.

6.1. TASK MANAGEMENT OVERVIEW

A task is a unit of work that a processor can dispatch, execute, and suspend. It can be used to execute a program, a task or process, an operating-system service utility, an interrupt or exception handler, or a kernel or executive utility.

The Intel Architecture provides a mechanism for saving the state of a task, for dispatching tasks for execution, and for switching from one task to another. When operating in protected mode, all processor execution takes place from within a task. Even simple systems must define at least one task. More complex systems can use the processor’s task management facilities to support multitasking applications.

6.1.1. Task Structure

A task is made up of two parts: a task execution space and a task-state segment (TSS). The task execution space consists of a code segment, a stack segment, and one or more data segments (refer to Figure 6-1). If an operating system or executive uses the processor’s privilege-level protection mechanism, the task execution space also provides a separate stack for each privilege level.

The TSS specifies the segments that make up the task execution space and provides a storage place for task state information. In multitasking systems, the TSS also provides a mechanism for linking tasks.

NOTE

This chapter describes primarily 32-bit tasks and the 32-bit TSS structure. For information on 16-bit tasks and the 16-bit TSS structure, refer to Section 6.6., “16-Bit Task-State Segment (TSS)”.

A task is identified by the segment selector for its TSS. When a task is loaded into the processor for execution, the segment selector, base address, limit, and segment descriptor attributes for the TSS are loaded into the task register (refer to Section 2.4.4., “Task Register (TR)” in Chapter 2, System Architecture Overview).

If paging is implemented for the task, the base address of the page directory used by the task is loaded into control register CR3.

6.1.2. Task State

The following items define the state of the currently executing task:

• The task’s current execution space, defined by the segment selectors in the segment registers (CS, DS, SS, ES, FS, and GS).

• The state of the general-purpose registers.

• The state of the EFLAGS register.

• The state of the EIP register.

• The state of control register CR3.

• The state of the task register.

• The state of the LDTR register.

• The I/O map base address and I/O map (contained in the TSS).

• Stack pointers to the privilege 0, 1, and 2 stacks (contained in the TSS).

• Link to previously executed task (contained in the TSS).

Prior to dispatching a task, all of these items are contained in the task’s TSS, except the state of the task register. Also, the complete contents of the LDTR register are not contained in the TSS, only the segment selector for the LDT.
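The items listed above map onto fixed offsets in the 32-bit TSS. The following is a rough sketch of that mapping, assuming the 104-byte layout shown in the 32-bit TSS figure later in this chapter (25 doublewords followed by the debug-trap word and the I/O map base word); the field names and helper functions are hypothetical, and selector fields are modeled as doublewords whose upper 16 bits are reserved.

```python
import struct

# 25 little-endian doublewords, then two words (T flag word, I/O map base).
TSS32 = struct.Struct("<25I2H")

TSS32_FIELDS = [
    "prev_task_link", "esp0", "ss0", "esp1", "ss1", "esp2", "ss2",
    "cr3", "eip", "eflags",
    "eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi",
    "es", "cs", "ss", "ds", "fs", "gs", "ldt_selector",
    "debug_trap", "io_map_base",
]


def pack_tss(**fields):
    """Build a TSS image; fields that are not given default to 0."""
    return TSS32.pack(*[fields.get(name, 0) for name in TSS32_FIELDS])


def unpack_tss(raw):
    """Decode a 104-byte TSS image into a field dictionary."""
    return dict(zip(TSS32_FIELDS, TSS32.unpack(raw)))
```

Note that the task register itself has no field here, which matches the statement that its state is the one item not held in the TSS, and that only the LDT selector (not the full LDTR contents) is stored.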

Figure 6-1. Structure of a Task

[Figure labels: Code Segment; Data Segment; Stack Segment (Current Priv. Level); Stack Seg. Priv. Level 0; Stack Seg. Priv. Level 1; Stack Segment (Priv. Level 2); Task-State Segment (TSS); Task Register; CR3]

6.1.3. Executing a Task

Software or the processor can dispatch a task for execution in one of the following ways:

• An explicit call to a task with the CALL instruction.

• An explicit jump to a task with the JMP instruction.

• An implicit call (by the processor) to an interrupt-handler task.

• An implicit call to an exception-handler task.

• A return (initiated with an IRET instruction) when the NT flag in the EFLAGS register is set.

All of these methods of dispatching a task identify the task to be dispatched with a segment selector that points either to a task gate or the TSS for the task. When dispatching a task with a CALL or JMP instruction, the selector in the instruction may select either the TSS directly or a task gate that holds the selector for the TSS. When dispatching a task to handle an interrupt or exception, the IDT entry for the interrupt or exception must contain a task gate that holds the selector for the interrupt- or exception-handler TSS.

When a task is dispatched for execution, a task switch automatically occurs between the currently running task and the dispatched task. During a task switch, the execution environment of the currently executing task (called the task’s state or context) is saved in its TSS and execution of the task is suspended. The context for the dispatched task is then loaded into the processor and execution of that task begins with the instruction pointed to by the newly loaded EIP register. If the task has not been run since the system was last initialized, the EIP will point to the first instruction of the task’s code; otherwise, it will point to the next instruction after the last instruction that the task executed when it was last active.

If the currently executing task (the calling task) called the task being dispatched (the called task), the TSS segment selector for the calling task is stored in the TSS of the called task to provide a link back to the calling task.

For all Intel Architecture processors, tasks are not recursive. A task cannot call or jump to itself.

Interrupts and exceptions can be handled with a task switch to a handler task. Here, the prnot only can perform a task switch to handle the interrupt or exception, but it can automaswitch back to the interrupted task upon returning from the interrupt- or exception-handlerThis mechanism can handle interrupts that occur during interrupt tasks.

As part of a task switch, the processor can also switch to another LDT, allowing each task ta different logical-to-physical address mapping for LDT-based segments. The page-directorregister (CR3) also is reloaded on a task switch, allowing each task to have its own set otables. These protection facilities help isolate tasks and prevent them from interfering witanother. If one or both of these protection mechanisms are not used, the processor provprotection between tasks. This is true even with operating systems that use multiple prilevels for protection. Here, a task running at privilege level 3 that uses the same LDT andtables as other privilege-level-3 tasks can access code and corrupt data and the stack tasks.

Use of task management facilities for handling multitasking applications is optional. Multitasking can be handled in software, with each software-defined task executed in the context of a single Intel Architecture task.

6.2. TASK MANAGEMENT DATA STRUCTURES

The processor defines five data structures for handling task-related activities:

• Task-state segment (TSS).

• Task-gate descriptor.

• TSS descriptor.

• Task register.

• NT flag in the EFLAGS register.

When operating in protected mode, a TSS and TSS descriptor must be created for at least one task, and the segment selector for the TSS must be loaded into the task register (using the LTR instruction).

6.2.1. Task-State Segment (TSS)

The processor state information needed to restore a task is saved in a system segment called the task-state segment (TSS). Figure 6-2 shows the format of a TSS for tasks designed for 32-bit CPUs. (Compatibility with 16-bit Intel 286 processor tasks is provided by a different kind of TSS; refer to Figure 6-9.) The fields of a TSS are divided into two main categories: dynamic fields and static fields.

The processor updates the dynamic fields when a task is suspended during a task switch. The following are dynamic fields:

General-purpose register fields: State of the EAX, ECX, EDX, EBX, ESP, EBP, ESI, and EDI registers prior to the task switch.

Segment selector fields: Segment selectors stored in the ES, CS, SS, DS, FS, and GS registers prior to the task switch.

EFLAGS register field: State of the EFLAGS register prior to the task switch.

EIP (instruction pointer) field: State of the EIP register prior to the task switch.

Previous task link field: Contains the segment selector for the TSS of the previous task (updated on a task switch that was initiated by a call, interrupt, or exception). This field (which is sometimes called the back link field) permits a task switch back to the previous task to be initiated with an IRET instruction.

The processor reads the static fields, but does not normally change them. These fields are set up when a task is created. The following are static fields:

LDT segment selector field: Contains the segment selector for the task’s LDT.

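Figure 6-2. 32-Bit Task-State Segment (TSS)

[Figure not reproduced. It shows the TSS as a column of doublewords at byte offsets 0 through 100, in this order: Previous Task Link; ESP0, SS0; ESP1, SS1; ESP2, SS2; CR3 (PDBR); EIP; EFLAGS; EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI; ES, CS, SS, DS, FS, GS; LDT Segment Selector; and, at offset 100, the T flag (bit 0) and the I/O Map Base Address (bits 31:16). Selector fields occupy bits 15:0 of their doubleword. Reserved bits are set to 0.]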

CR3 control register field: Contains the base physical address of the page directory to be used by the task. Control register CR3 is also known as the page-directory base register (PDBR).

Privilege level-0, -1, and -2 stack pointer fields: These stack pointers consist of a logical address made up of the segment selector for the stack segment (SS0, SS1, and SS2) and an offset into the stack (ESP0, ESP1, and ESP2). Note that the values in these fields are static for a particular task, whereas the SS and ESP values will change if stack switching occurs within the task.

T (debug trap) flag (byte 100, bit 0): When set, the T flag causes the processor to raise a debug exception when a task switch to this task occurs (refer to Section 15.3.1.5., “Task-Switch Exception Condition”, in Chapter 15, Debugging and Performance Monitoring).

I/O map base address field: Contains a 16-bit offset from the base of the TSS to the I/O permission bit map and interrupt redirection bit map. When present, these maps are stored in the TSS at higher addresses. The I/O map base address points to the beginning of the I/O permission bit map and the end of the interrupt redirection bit map. Refer to Chapter 9, Input/Output, in the Intel Architecture Software Developer’s Manual, Volume 1, for more information about the I/O permission bit map. Refer to Section 16.3., “Interrupt and Exception Handling in Virtual-8086 Mode” in Chapter 16, 8086 Emulation for a detailed description of the interrupt redirection bit map.
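The field layout described above can be checked mechanically. The following is a minimal sketch in Python (used here only for illustration) that models the 32-bit TSS with the standard ctypes module. The class name Tss32, the field names, and the helper _sel are hypothetical; only the field order, widths, and the 104-byte total come from the text.

```python
import ctypes

U16, U32 = ctypes.c_uint16, ctypes.c_uint32


def _sel(name):
    """A 16-bit selector-sized field followed by 16 reserved bits (set to 0)."""
    return [(name, U16), ("rsvd_" + name, U16)]


class Tss32(ctypes.LittleEndianStructure):
    """Illustrative model of the 32-bit TSS; all names are assumptions."""
    _fields_ = (
        _sel("prev_task_link")                        # offset 0 (dynamic)
        + [("esp0", U32)] + _sel("ss0")               # offsets 4, 8 (static)
        + [("esp1", U32)] + _sel("ss1")               # offsets 12, 16
        + [("esp2", U32)] + _sel("ss2")               # offsets 20, 24
        + [("cr3", U32)]                              # offset 28 (PDBR)
        + [("eip", U32), ("eflags", U32)]             # offsets 32, 36
        + [(r, U32) for r in ("eax", "ecx", "edx", "ebx",
                              "esp", "ebp", "esi", "edi")]    # offsets 40..68
        + [f for s in ("es", "cs", "ss", "ds", "fs", "gs")
           for f in _sel(s)]                          # offsets 72..92
        + _sel("ldt_selector")                        # offset 96
        + [("trap", U16),                             # offset 100, bit 0 = T flag
           ("io_map_base", U16)]                      # offset 102
    )


TSS32_MIN_SIZE = ctypes.sizeof(Tss32)    # bytes accessed during a task switch
TSS32_MIN_LIMIT = TSS32_MIN_SIZE - 1     # smallest valid descriptor limit
```

Every field here is naturally aligned, so no packing directive is needed. Expressions such as Tss32.cr3.offset give the byte offset of a field, which is convenient when cross-checking hand-written assembly against the figure.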

If paging is used, care should be taken to avoid placing a page boundary within the part of the TSS that the processor reads during a task switch (the first 104 bytes). If a page boundary is placed within this part of the TSS, the pages on either side of the boundary must be present at the same time and contiguous in physical memory. The reason for this restriction is that when accessing a TSS during a task switch, the processor reads and writes into the first 104 bytes of each TSS from contiguous physical addresses beginning with the physical address of the first byte of the TSS. It may not perform address translations at a page boundary if one occurs within this area. So, after the TSS access begins, if a part of the 104 bytes is not both present and physically contiguous, the processor will access incorrect TSS information, without generating a page-fault exception. The reading of this incorrect information will generally lead to an unrecoverable exception later in the task switch process.

Also, if paging is used, the pages corresponding to the previous task’s TSS, the current task’s TSS, and the descriptor table entries for each should be marked as read/write. The task switch will be carried out faster if the pages containing these structures are also present in memory before the task switch is initiated.
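An allocator can apply the page-boundary guidance above with a simple range test. The sketch below, in Python for illustration, assumes 4-KByte pages; the function names and the policy of moving a straddling TSS to the next page start are hypothetical choices, not something the manual prescribes.

```python
PAGE_SIZE = 4096          # assumes 4-KByte pages
TSS_SWITCH_BYTES = 104    # part of the TSS accessed during a task switch


def tss_crosses_page(tss_linear_base, page_size=PAGE_SIZE):
    """True if the first 104 bytes of the TSS span a page boundary."""
    first_page = tss_linear_base // page_size
    last_page = (tss_linear_base + TSS_SWITCH_BYTES - 1) // page_size
    return first_page != last_page


def align_tss_base(candidate_base, page_size=PAGE_SIZE):
    """Return candidate_base, or the next page start if the TSS would straddle."""
    if tss_crosses_page(candidate_base, page_size):
        return (candidate_base // page_size + 1) * page_size
    return candidate_base
```

A TSS that passes this test cannot trigger the incorrect-read case described above. A TSS that fails it is still legal, provided both pages are present and physically contiguous.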

6.2.2. TSS Descriptor

The TSS, like all other segments, is defined by a segment descriptor. Figure 6-3 shows the format of a TSS descriptor. TSS descriptors may only be placed in the GDT; they cannot be placed in an LDT or the IDT. An attempt to access a TSS using a segment selector with its TI flag set (which indicates the current LDT) causes a general-protection exception (#GP) to be generated. A general-protection exception is also generated if an attempt is made to load a segment selector for a TSS into a segment register.
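The TI rule above depends on the segment selector format (index in bits 15:3, TI in bit 2, RPL in bits 1:0). The following Python sketch decodes a selector and applies only the TI check; the names Selector, decode_selector, and check_tss_selector are hypothetical, and the table-limit and descriptor-type checks that the processor also performs are deliberately omitted.

```python
from collections import namedtuple

Selector = namedtuple("Selector", "index ti rpl")


def decode_selector(selector):
    """Split a 16-bit segment selector into index, TI flag, and RPL."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("selector must fit in 16 bits")
    return Selector(index=selector >> 3, ti=(selector >> 2) & 1, rpl=selector & 3)


def check_tss_selector(selector):
    """Return None if the selector may name a TSS descriptor, else '#GP'.

    Only the TI rule from the text is modeled here.
    """
    return "#GP" if decode_selector(selector).ti else None
```

For example, selector 2BH decodes to index 5, TI = 0, RPL = 3 and passes this check, whereas 2CH has TI = 1 and is rejected.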

The busy flag (B) in the type field indicates whether the task is busy. A busy task is currently running or is suspended. A type field with a value of 1001B indicates an inactive task; a value of 1011B indicates a busy task. Tasks are not recursive. The processor uses the busy flag to detect an attempt to call a task whose execution has been interrupted. To ensure that only one busy flag is associated with a task, each TSS should have only one TSS descriptor that points to it.

The base, limit, and DPL fields and the granularity and present flags have functions similar to their use in data-segment descriptors (refer to Section 3.4.3., “Segment Descriptors” in Chapter 3, Protected-Mode Memory Management). The limit field must have a value equal to or greater than 67H (for a 32-bit TSS), one byte less than the minimum size of a TSS. Attempting to switch to a task whose TSS descriptor has a limit less than 67H generates an invalid-TSS exception (#TS). A larger limit is required if an I/O permission bit map is included in the TSS. An even larger limit would be required if the operating system stores additional data in the TSS. The processor does not check for a limit greater than 67H on a task switch; however, it does when accessing the I/O permission bit map or interrupt redirection bit map.

Any program or procedure with access to a TSS descriptor (that is, whose CPL is numerically equal to or less than the DPL of the TSS descriptor) can dispatch the task with a call or a jump. In most systems, the DPLs of TSS descriptors should be set to values less than 3, so that only privileged software can perform task switching. However, in multitasking applications, DPLs for some TSS descriptors can be set to 3 to allow task switching at the application (or user) privilege level.

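Figure 6-3. TSS Descriptor

[Figure not reproduced. It shows the two doublewords of the descriptor. Doubleword at offset 4: Base 31:24 (bits 31:24), G (bit 23), 0 (bits 22:21), AVL (bit 20), Limit 19:16 (bits 19:16), P (bit 15), DPL (bits 14:13), 0 (bit 12), Type 10B1 (bits 11:8), Base 23:16 (bits 7:0). Doubleword at offset 0: Base Address 15:00 (bits 31:16), Segment Limit 15:00 (bits 15:0).]

AVL: Available for use by system software
B: Busy flag
BASE: Segment Base Address
DPL: Descriptor Privilege Level
G: Granularity
LIMIT: Segment Limit
P: Segment Present
TYPE: Segment Type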
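The descriptor fields described in this section can be packed with a few shifts and masks. The sketch below, in Python for illustration, builds a 32-bit TSS descriptor as a 64-bit integer with the doubleword at offset 4 in the upper half. The function and constant names are hypothetical, and applying the 67H minimum only when G is clear is a simplifying assumption of this sketch.

```python
TSS32_MIN_LIMIT = 0x67        # one byte less than the 104-byte minimum TSS
TYPE_TSS_AVAILABLE = 0b1001   # inactive task
TYPE_TSS_BUSY = 0b1011        # busy task


def make_tss_descriptor(base, limit, dpl=0, busy=False, present=True,
                        granularity=False, avl=False):
    """Pack a 32-bit TSS descriptor into a 64-bit integer."""
    if not 0 <= base <= 0xFFFFFFFF:
        raise ValueError("base must be a 32-bit address")
    if not 0 <= limit <= 0xFFFFF:
        raise ValueError("limit must fit in 20 bits")
    if not granularity and limit < TSS32_MIN_LIMIT:
        raise ValueError("limit below 67H would cause #TS on a task switch")
    if dpl not in range(4):
        raise ValueError("DPL must be 0..3")
    seg_type = TYPE_TSS_BUSY if busy else TYPE_TSS_AVAILABLE
    low = ((base & 0xFFFF) << 16) | (limit & 0xFFFF)
    high = ((base & 0xFF000000)            # Base 31:24
            | (int(granularity) << 23)     # G
            | (int(avl) << 20)             # AVL
            | (limit & 0xF0000)            # Limit 19:16
            | (int(present) << 15)         # P
            | (dpl << 13)                  # DPL
            | (seg_type << 8)              # Type; bit 12 stays 0
            | ((base >> 16) & 0xFF))       # Base 23:16
    return (high << 32) | low


def is_busy(descriptor):
    """Read the busy (B) flag: bit 9 of the upper doubleword, bit 41 overall."""
    return bool((descriptor >> 41) & 1)
```

As a usage example, a TSS at base 00105000H with limit 67H and DPL 0 packs to 0000891050000067H; marking the same descriptor busy changes only the type field, giving 00008B1050000067H.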

6.2.3. Task Register

The task register holds the 16-bit segment selector and the entire segment descriptor (32-bit base address, 16-bit segment limit, and descriptor attributes) for the TSS of the current task (refer to Figure 2-4 in Chapter 2, System Architecture Overview). This information is copied from the TSS descriptor in the GDT for the current task. Figure 6-4 shows the path the processor uses to access the TSS, using the information in the task register.

The task register has both a visible part (that can be read and changed by software) and an invisible part (that is maintained by the processor and is inaccessible by software). The segment selector in the visible portion points to a TSS descriptor in the GDT. The processor uses the invisible portion of the task register to cache the segment descriptor for the TSS. Caching these values in a register makes execution of the task more efficient, because the processor does not need to fetch these values from memory to reference the TSS of the current task.

The LTR (load task register) and STR (store task register) instructions load and read the visible portion of the task register. The LTR instruction loads a segment selector (source operand) into the task register that points to a TSS descriptor in the GDT, and then loads the invisible portion of the task register with information from the TSS descriptor. This instruction is a privileged instruction that may be executed only when the CPL is 0. The LTR instruction generally is used during system initialization to put an initial value in the task register. Afterwards, the contents of the task register are changed implicitly when a task switch occurs.

The STR (store task register) instruction stores the visible portion of the task register in a general-purpose register or memory. This instruction can be executed by code running at any privilege level, to identify the currently running task; however, it is normally used only by operating system software.

On power up or reset of the processor, the segment selector and base address are set to the default value of 0 and the limit is set to FFFFH.
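The split between the visible and invisible parts can be pictured with a toy model. The Python sketch below is an illustration only: the class TaskRegister, its method names, and the gdt dictionary (selector index mapped to a base and limit pair) are hypothetical, and reporting a failed privilege check as #GP is an assumption that this section does not itself state.

```python
class TaskRegister:
    """Toy model of the task register; not a description of real hardware."""

    def __init__(self):
        # Power-up/reset values given in the text.
        self.selector = 0       # visible part
        self.base = 0           # invisible part, cached from the descriptor
        self.limit = 0xFFFF     # invisible part

    def ltr(self, selector, gdt, cpl):
        """Load the selector, then cache base and limit from the GDT entry.

        gdt is a hypothetical dict mapping selector index -> (base, limit).
        """
        if cpl != 0:
            raise PermissionError("#GP: LTR may be executed only at CPL 0")
        if selector & 0b100:
            raise ValueError("#GP: TSS descriptors reside only in the GDT")
        base, limit = gdt[selector >> 3]
        self.selector, self.base, self.limit = selector, base, limit

    def store(self):
        """Model of STR: return the visible part; allowed at any privilege level."""
        return self.selector
```

The model makes one point of the text concrete: after ltr, references to the current TSS use the cached base and limit, so later changes to the gdt dictionary would not affect them until the register is reloaded.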

6.2.4. Task-Gate Descriptor

A task-gate descriptor provides an indirect, protected reference to a task. Figure 6-5 shows the format of a task-gate descriptor. A task-gate descriptor can be placed in the GDT, an LDT, or the IDT.

The TSS segment selector field in a task-gate descriptor points to a TSS descriptor in the GDT. The RPL in this segment selector is not used.

The DPL of a task-gate descriptor controls access to the TSS descriptor during a task switch. When a program or procedure makes a call or jump to a task through a task gate, the CPL and the RPL field of the gate selector pointing to the task gate must be less than or equal to the DPL of the task-gate descriptor. (Note that when a task gate is used, the DPL of the destination TSS descriptor is not used.)
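A task gate carries far fewer fields than a TSS descriptor, which the following Python sketch makes visible. It packs a task-gate descriptor into a 64-bit integer (upper doubleword in the upper half) and applies the access rule just described. The function names are hypothetical; the type value 0101B for a task gate is taken from the descriptor type encoding rather than from the text of this section.

```python
TYPE_TASK_GATE = 0b0101


def make_task_gate(tss_selector, dpl, present=True):
    """Pack a task-gate descriptor into a 64-bit integer; reserved bits are 0."""
    if not 0 <= tss_selector <= 0xFFFF:
        raise ValueError("TSS selector must fit in 16 bits")
    if dpl not in range(4):
        raise ValueError("DPL must be 0..3")
    low = tss_selector << 16                  # TSS Segment Selector, bits 31:16
    high = (int(present) << 15) | (dpl << 13) | (TYPE_TASK_GATE << 8)
    return (high << 32) | low


def may_use_task_gate(cpl, gate_selector, gate_dpl):
    """CPL and the RPL of the gate selector must both be <= the gate's DPL."""
    rpl = gate_selector & 3
    return max(cpl, rpl) <= gate_dpl
```

For example, a present gate with DPL 3 that refers to TSS selector 28H packs to 0000E50000280000H, and application code at CPL 3 may use it even when the TSS descriptor itself has a DPL of 0.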

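Figure 6-4. Task Register

[Figure not reproduced. It shows the task register with its visible part (selector) and invisible part (base address and segment limit). The selector indexes a TSS descriptor in the GDT, and the cached base address locates the TSS.]

Figure 6-5. Task-Gate Descriptor

[Figure not reproduced. It shows the two doublewords of the descriptor. Doubleword at offset 4: Reserved (bits 31:16), P (bit 15), DPL (bits 14:13), 0 (bit 12), Type (bits 11:8), Reserved (bits 7:0). Doubleword at offset 0: TSS Segment Selector (bits 31:16), Reserved (bits 15:0).]

DPL: Descriptor Privilege Level
P: Segment Present
TYPE: Segment Type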

A task can be accessed either through a task-gate descriptor or a TSS descriptor. Both of these structures are provided to satisfy the following needs:

• The need for a task to have only one busy flag. Because the busy flag for a task is stored in the TSS descriptor, each task should have only one TSS descriptor. There may, however, be several task gates that reference the same TSS descriptor.

• The need to provide selective access to tasks. Task gates fill this need, because they can reside in an LDT and can have a DPL that is different from the TSS descriptor’s DPL. A program or procedure that does not have sufficient privilege to access the TSS descriptor for a task in the GDT (which usually has a DPL of 0) may be allowed access to the task through a task gate with a higher DPL. Task gates give the operating system greater latitude for limiting access to specific tasks.

• The need for an interrupt or exception to be handled by an independent task. Task gates may also reside in the IDT, which allows interrupts and exceptions to be handled by handler tasks. When an interrupt or exception vector points to a task gate, the processor switches to the specified task.

Figure 6-6 illustrates how a task gate in an LDT, a task gate in the GDT, and a task gate in the IDT can all point to the same task.

6.3. TASK SWITCHING

The processor transfers execution to another task in any of four cases:

• The current program, task, or procedure executes a JMP or CALL instruction to a TSS descriptor in the GDT.

• The current program, task, or procedure executes a JMP or CALL instruction to a task-gate descriptor in the GDT or the current LDT.

• An interrupt or exception vector points to a task-gate descriptor in the IDT.

• The current task executes an IRET when the NT flag in the EFLAGS register is set.

The JMP, CALL, and IRET instructions, as well as interrupts and exceptions, are all generalized mechanisms for redirecting a program. The referencing of a TSS descriptor or a task gate (when calling or jumping to a task) or the state of the NT flag (when executing an IRET instruction) determines whether a task switch occurs.

The processor performs the following operations when switching to a new task:

1. Obtains the TSS segment selector for the new task as the operand of the JMP or CALL instruction, from a task gate, or from the previous task link field (for a task switch initiated with an IRET instruction).

2. Checks that the current (old) task is allowed to switch to the new task. Data-access privilege rules apply to JMP and CALL instructions. The CPL of the current (old) task and the RPL of the segment selector for the new task must be less than or equal to the DPL of the TSS descriptor or task gate being referenced. Exceptions, interrupts (except for interrupts generated by the INT n instruction), and the IRET instruction are permitted to switch tasks regardless of the DPL of the destination task-gate or TSS descriptor. For interrupts generated by the INT n instruction, the DPL is checked.

3. Checks that the TSS descriptor of the new task is marked present and has a valid limit (greater than or equal to 67H).

4. Checks that the new task is available (call, jump, exception, or interrupt) or busy (IRET return).

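Figure 6-6. Task Gates Referencing the Same Task

[Figure not reproduced. It shows a task gate in an LDT, a task gate in the GDT, and a task gate in the IDT, all referencing the same TSS descriptor in the GDT, which in turn points to the TSS.]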

5. Checks that the current (old) TSS, new TSS, and all segment descriptors used in the task switch are paged into system memory.

6. If the task switch was initiated with a JMP or IRET instruction, the processor clears the busy (B) flag in the current (old) task’s TSS descriptor; if initiated with a CALL instruction, an exception, or an interrupt, the busy (B) flag is left set. (Refer to Table 6-2.)

7. If the task switch was initiated with an IRET instruction, the processor clears the NT flag in a temporarily saved image of the EFLAGS register; if initiated with a CALL or JMP instruction, an exception, or an interrupt, the NT flag is left unchanged in the saved EFLAGS image.

8. Saves the state of the current (old) task in the current task’s TSS. The processor finds the base address of the current TSS in the task register and then copies the states of the following registers into the current TSS: all the general-purpose registers, segment selectors from the segment registers, the temporarily saved image of the EFLAGS register, and the instruction pointer register (EIP).

NOTE

At this point, if all checks and saves have been carried out successfully, the processor commits to the task switch. If an unrecoverable error occurs in steps 1 through 8, the processor does not complete the task switch and ensures that the processor is returned to its state prior to the execution of the instruction that initiated the task switch. If an unrecoverable error occurs after the commit point (in steps 9 through 14), the processor completes the task switch (without performing additional access and segment availability checks) and generates the appropriate exception prior to beginning execution of the new task. If exceptions occur after the commit point, the exception handler must finish the task switch itself before allowing the processor to begin executing the task. Refer to Chapter 5, Interrupt and Exception Handling for more information about the effect of exceptions on a task when they occur after the commit point of a task switch.

9. If the task switch was initiated with a CALL instruction, an exception, or an interrupt, the processor sets the NT flag in the EFLAGS image stored in the new task’s TSS; if initiated with an IRET instruction, the processor restores the NT flag from the EFLAGS image stored on the stack. If initiated with a JMP instruction, the NT flag is left unchanged. (Refer to Table 6-2.)

10. If the task switch was initiated with a CALL instruction, JMP instruction, an exception, or an interrupt, the processor sets the busy (B) flag in the new task’s TSS descriptor; if initiated with an IRET instruction, the busy (B) flag is left set.

11. Sets the TS flag in the control register CR0 image stored in the new task’s TSS.

12. Loads the task register with the segment selector and descriptor for the new task’s TSS.

13. Loads the new task’s state from its TSS into the processor. Any errors associated with the loading and qualification of segment descriptors in this step occur in the context of the new task. The task state information that is loaded here includes the LDTR register, the PDBR (control register CR3), the EFLAGS register, the EIP register, the general-purpose registers, and the segment descriptor parts of the segment registers.

14. Begins executing the new task. (To an exception handler, the first instruction of the new task appears not to have been executed.)
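The checks made before the commit point (steps 2 through 4) can be summarized as a decision function. The Python sketch below is a simplified model under stated assumptions: the names TssDescriptor and precommit_fault are hypothetical, ref_dpl stands for the DPL of whichever descriptor (TSS or task gate) is referenced, the check order is one plausible order rather than the model-specific one, and reporting a failed privilege check as #GP is an assumption not stated in the steps above.

```python
from dataclasses import dataclass

TSS32_MIN_LIMIT = 0x67


@dataclass
class TssDescriptor:
    """Hypothetical, simplified view of the new task's TSS descriptor."""
    present: bool = True
    busy: bool = False
    limit: int = TSS32_MIN_LIMIT


def precommit_fault(kind, desc, cpl=0, rpl=0, ref_dpl=0, hardware_event=False):
    """Return the name of the exception raised before commit, or None.

    kind is "JMP", "CALL", "INT", or "IRET". hardware_event marks exceptions
    and external interrupts, which skip the DPL check (INT n does not).
    """
    if kind not in ("JMP", "CALL", "INT", "IRET"):
        raise ValueError("unknown task-switch kind: %r" % (kind,))
    checked = kind in ("JMP", "CALL") or (kind == "INT" and not hardware_event)
    if checked and max(cpl, rpl) > ref_dpl:
        return "#GP"        # assumption: fault type not stated in this section
    if not desc.present:
        return "#NP"
    if kind == "IRET":
        if not desc.busy:
            return "#TS"    # an IRET return expects the target to be busy
    elif desc.busy:
        return "#GP"        # tasks are not recursive
    if desc.limit < TSS32_MIN_LIMIT:
        return "#TS"
    return None
```

A return value of None corresponds to reaching the commit point; any other value corresponds to the processor restoring its state and raising the named exception in the context of the old task.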

The state of the currently executing task is always saved when a successful task switch occurs. If the task is resumed, execution starts with the instruction pointed to by the saved EIP value, and the registers are restored to the values they held when the task was suspended.

When switching tasks, the new task does not inherit its privilege level from the suspended task. The new task begins executing at the privilege level specified in the CPL field of the CS register, which is loaded from the TSS. Because tasks are isolated by their separate address spaces and TSSs and because privilege rules control access to a TSS, software does not need to perform explicit privilege checks on a task switch.

Table 6-1 shows the exception conditions that the processor checks for when switching tasks. It also shows the exception that is generated for each check if an error is detected and the segment that the error code references. (The order of the checks in the table is the order used in the P6 family processors. The exact order is model specific and may be different for other Intel Architecture processors.) Exception handlers designed to handle these exceptions may be subject to recursive calls if they attempt to reload the segment selector that generated the exception. The cause of the exception (or the first of multiple causes) should be fixed before reloading the selector.

Table 6-1. Exception Conditions Checked During a Task Switch

Condition Checked | Exception (note 1) | Error Code Reference (note 2)
Segment selector for a TSS descriptor references the GDT and is within the limits of the table. | #GP | New Task’s TSS
TSS descriptor is present in memory. | #NP | New Task’s TSS
TSS descriptor is not busy (for task switch initiated by a call, interrupt, or exception). | #GP (for JMP, CALL, INT) | Task’s back-link TSS
TSS descriptor is busy (for task switch initiated by an IRET instruction). | #TS (for IRET) | New Task’s TSS
TSS segment limit greater than or equal to 108 (for 32-bit TSS) or 44 (for 16-bit TSS). | #TS | New Task’s TSS
Registers are loaded from the values in the TSS. | |
LDT segment selector of new task is valid (note 3). | #TS | New Task’s LDT
Code segment DPL matches segment selector RPL. | #TS | New Code Segment
SS segment selector is valid (note 3). | #TS | New Stack Segment
Stack segment is present in memory. | #SF | New Stack Segment
Stack segment DPL matches CPL. | #TS | New Stack Segment
LDT of new task is present in memory. | #TS | New Task’s LDT
CS segment selector is valid (note 3). | #TS | New Code Segment
Code segment is present in memory. | #NP | New Code Segment
Stack segment DPL matches selector RPL. | #TS | New Stack Segment
DS, ES, FS, and GS segment selectors are valid (note 3). | #TS | New Data Segment
DS, ES, FS, and GS segments are readable. | #TS | New Data Segment
DS, ES, FS, and GS segments are present in memory. | #NP | New Data Segment
DS, ES, FS, and GS segment DPL greater than or equal to CPL (unless these are conforming segments). | #TS | New Data Segment

NOTES:

1. #NP is segment-not-present exception, #GP is general-protection exception, #TS is invalid-TSS exception, and #SF is stack-fault exception.

2. The error code contains an index to the segment descriptor referenced in this column.

3. A segment selector is valid if it is in a compatible type of table (GDT or LDT), occupies an address within the table’s segment limit, and refers to a compatible type of descriptor (for example, a segment selector in the CS register only is valid when it points to a code-segment descriptor).

The TS (task switched) flag in the control register CR0 is set every time a task switch occurs. System software uses the TS flag to coordinate the actions of the floating-point unit when generating floating-point exceptions with the rest of the processor. The TS flag indicates that the context of the floating-point unit may be different from that of the current task. Refer to Section 2.5., “Control Registers” in Chapter 2, System Architecture Overview for a detailed description of the function and use of the TS flag.

6.4. TASK LINKING

The previous task link field of the TSS (sometimes called the “backlink”) and the NT flag in the EFLAGS register are used to return execution to the previous task. The NT flag indicates whether the currently executing task is nested within the execution of another task, and the previous task link field of the current task’s TSS holds the TSS selector for the higher-level task in the nesting hierarchy, if there is one (refer to Figure 6-7).

When a CALL instruction, an interrupt, or an exception causes a task switch, the processor copies the segment selector for the current TSS into the previous task link field of the TSS for the new task, and then sets the NT flag in the EFLAGS register. The NT flag indicates that the previous task link field of the TSS has been loaded with a saved TSS segment selector. If software uses an IRET instruction to suspend the new task, the processor uses the value in the previous task link field and the NT flag to return to the previous task; that is, if the NT flag is set, the processor performs a task switch to the task specified in the previous task link field.

NOTE

When a JMP instruction causes a task switch, the new task is not nested; that is, the NT flag is set to 0 and the previous task link field is not used. A JMP instruction is used to dispatch a new task when nesting is not desired.

Table 6-2 summarizes the uses of the busy flag (in the TSS segment descriptor), the NT flag, the previous task link field, and TS flag (in control register CR0) during a task switch. Note that the NT flag may be modified by software executing at any privilege level. It is possible for a program to set its NT flag and execute an IRET instruction, which would have the effect of invoking the task specified in the previous link field of the current task’s TSS. To keep spurious task switches from succeeding, the operating system should initialize the previous task link field for every TSS it creates to 0.

Figure 6-7. Nested Tasks

[Figure not reproduced. It shows a chain of four TSSs: a top-level task (NT=0), a nested task (NT=1), a more deeply nested task (NT=1), and the currently executing task (NT=1 in EFLAGS), which is located through the task register. The previous task link field of each nested task points to the TSS of the task one level up.]

Table 6-2. Effect of a Task Switch on Busy Flag, NT Flag, Previous Task Link Field, and TS Flag

Flag or Field | Effect of JMP Instruction | Effect of CALL Instruction or Interrupt | Effect of IRET Instruction
Busy (B) flag of new task. | Flag is set. Must have been clear before. | Flag is set. Must have been clear before. | No change. Must have been set.
Busy flag of old task. | Flag is cleared. | No change. Flag is currently set. | Flag is cleared.
NT flag of new task. | No change. | Flag is set. | Restored to value from TSS of new task.
NT flag of old task. | No change. | No change. | Flag is cleared.
Previous task link field of new task. | No change. | Loaded with selector for old task’s TSS. | No change.
Previous task link field of old task. | No change. | No change. | No change.
TS flag in control register CR0. | Flag is set. | Flag is set. | Flag is set.
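The rows of Table 6-2 can be exercised with a small state model. The Python sketch below is a toy simulation, not a description of the hardware: the classes Task and Cpu, their attributes, and the use of RuntimeError to stand in for processor exceptions are all hypothetical. Privilege, presence, and limit checks are left out so that only the busy flag, NT flag, previous task link, and TS flag are visible.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical task model: one TSS plus the busy flag of its descriptor."""
    selector: int
    busy: bool = False
    nt: bool = False        # NT bit of the EFLAGS image saved in the TSS
    prev_link: int = 0      # previous task link field


class Cpu:
    def __init__(self, tasks, current):
        self.tasks = {t.selector: t for t in tasks}
        self.current = self.tasks[current]
        self.current.busy = True
        self.nt = self.current.nt   # live EFLAGS.NT
        self.ts = False             # CR0.TS

    def switch(self, kind, target=None):
        """Apply Table 6-2 for kind in {"JMP", "CALL", "INT", "IRET"}."""
        old = self.current
        if kind == "IRET":
            if not self.nt:
                raise RuntimeError("IRET with NT clear is not a task switch")
            new = self.tasks.get(old.prev_link)
            if new is None or not new.busy:
                raise RuntimeError("invalid previous task link")
            old.busy = False
            old.nt = False              # NT cleared in the saved image
        elif kind in ("JMP", "CALL", "INT"):
            new = self.tasks[target]
            if new.busy:
                raise RuntimeError("#GP: target task is busy")
            old.nt = self.nt            # saved unchanged
            if kind == "JMP":
                old.busy = False        # old task leaves the chain
            else:
                new.prev_link = old.selector
                new.nt = True           # new task is nested
            new.busy = True
        else:
            raise ValueError("unknown task-switch kind: %r" % (kind,))
        self.current, self.nt, self.ts = new, new.nt, True
        return new
```

With three tasks A, B, and C, a CALL from A to B leaves A busy and links B back to A; a second CALL from B to A is then rejected because A is still busy; an IRET in B returns to A and clears B's busy flag; a JMP from A to C clears A's busy flag and leaves C un-nested. A previous task link of 0 names no task in this model, which mirrors the recommendation to initialize that field to 0.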

6.4.1. Use of Busy Flag To Prevent Recursive Task Switching

A TSS allows only one context to be saved for a task; therefore, once a task is called (dispatched), a recursive (or re-entrant) call to the task would cause the current state of the task to be lost. The busy flag in the TSS segment descriptor is provided to prevent re-entrant task switching and subsequent loss of task state information. The processor manages the busy flag as follows:

1. When dispatching a task, the processor sets the busy flag of the new task.

2. If during a task switch, the current task is placed in a nested chain (the task switch is being generated by a CALL instruction, an interrupt, or an exception), the busy flag for the current task remains set.

3. When switching to the new task (initiated by a CALL instruction, interrupt, or exception), the processor generates a general-protection exception (#GP) if the busy flag of the new task is already set. (If the task switch is initiated with an IRET instruction, the exception is not raised because the processor expects the busy flag to be set.)

4. When a task is terminated by a jump to a new task (initiated with a JMP instruction in the task code) or by an IRET instruction in the task code, the processor clears the busy flag, returning the task to the “not busy” state.

In this manner the processor prevents recursive task switching by preventing a task from switching to itself or to any task in a nested chain of tasks. The chain of nested suspended tasks may grow to any length, due to multiple calls, interrupts, or exceptions. The busy flag prevents a task from being invoked if it is in this chain.

The busy flag may be used in multiprocessor configurations, because the processor follows a LOCK protocol (on the bus or in the cache) when it sets or clears the busy flag. This lock keeps two processors from invoking the same task at the same time. (Refer to Section 7.1.2.1., “Automatic Locking” in Chapter 7, Multiple-Processor Management for more information about setting the busy flag in multiprocessor applications.)

6.4.2. Modifying Task Linkages

In a uniprocessor system, in situations where it is necessary to remove a task from a chain of linked tasks, use the following procedure to remove the task:

1. Disable interrupts.

2. Change the previous task link field in the TSS of the pre-empting task (the task that suspended the task to be removed). It is assumed that the pre-empting task is the next task (newer task) in the chain from the task to be removed. Change the previous task link field to point to the TSS of the next oldest or to an even older task in the chain.

3. Clear the busy (B) flag in the TSS segment descriptor for the task being removed from the chain. If more than one task is being removed from the chain, the busy flag for each task being removed must be cleared.

4. Enable interrupts.
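The four steps above can be sketched in C against a simplified model. The `task` structure, the `interrupts_enabled` variable (a stand-in for the IF flag, which real system code would control with CLI and STI), and the function name are all hypothetical; the sketch only shows the order of operations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one record per task instead of a real TSS and descriptor. */
struct task {
    bool busy;               /* models the B flag in the TSS descriptor */
    struct task *prev_link;  /* models the previous task link field */
};

static bool interrupts_enabled = true;  /* stand-in for the IF flag */

/* Removes `victim` from the chain; `preempter` is the task that suspended it. */
static void remove_from_chain(struct task *preempter, struct task *victim)
{
    assert(preempter->prev_link == victim);

    interrupts_enabled = false;                /* step 1: disable interrupts */
    preempter->prev_link = victim->prev_link;  /* step 2: relink to an older task */
    victim->busy = false;                      /* step 3: clear the busy flag */
    interrupts_enabled = true;                 /* step 4: enable interrupts */
}
```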


In a multiprocessing system, additional synchronization and serialization operations must be added to this procedure to ensure that the TSS and its segment descriptor are both locked when the previous task link field is changed and the busy flag is cleared.

6.5. TASK ADDRESS SPACE

The address space for a task consists of the segments that the task can access. These segments include the code, data, stack, and system segments referenced in the TSS and any other segments accessed by the task code. These segments are mapped into the processor’s linear address space, which is in turn mapped into the processor’s physical address space (either directly or through paging).

The LDT segment field in the TSS can be used to give each task its own LDT. Giving a task its own LDT allows the task address space to be isolated from other tasks by placing the segment descriptors for all the segments associated with the task in the task’s LDT.

It also is possible for several tasks to use the same LDT. This is a simple and memory-efficient way to allow some tasks to communicate with or control each other, without dropping the protection barriers for the entire system.

Because all tasks have access to the GDT, it also is possible to create shared segments accessed through segment descriptors in this table.

If paging is enabled, the CR3 register (PDBR) field in the TSS allows each task to also have its own set of page tables for mapping linear addresses to physical addresses. Or, several tasks can share the same set of page tables.

6.5.1. Mapping Tasks to the Linear and Physical Address Spaces

Tasks can be mapped to the linear address space and physical address space in either of two ways:

• One linear-to-physical address space mapping is shared among all tasks. When paging is not enabled, this is the only choice. Without paging, all linear addresses map to the same physical addresses. When paging is enabled, this form of linear-to-physical address space mapping is obtained by using one page directory for all tasks. The linear address space may exceed the available physical space if demand-paged virtual memory is supported.

• Each task has its own linear address space that is mapped to the physical address space. This form of mapping is accomplished by using a different page directory for each task. Because the PDBR (control register CR3) is loaded on each task switch, each task may have a different page directory.

The linear address spaces of different tasks may map to completely distinct physical addresses. If the entries of different page directories point to different page tables and the page tables point to different pages of physical memory, then the tasks do not share any physical addresses.


With either method of mapping task linear address spaces, the TSSs for all tasks must lie in a shared area of the physical space, which is accessible to all tasks. This mapping is required so that the mapping of TSS addresses does not change while the processor is reading and updating the TSSs during a task switch. The linear address space mapped by the GDT also should be mapped to a shared area of the physical space; otherwise, the purpose of the GDT is defeated. Figure 6-8 shows how the linear address spaces of two tasks can overlap in the physical space by sharing page tables.
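The effect of sharing a page table between two page directories can be shown with a simplified software model of two-level translation. In the C sketch below, the `page_dir` and `page_table` types and the `translate` function are hypothetical, entries hold only a frame base address, and a value of 0 is used to mean "not present" (a simplification; real entries carry flag bits).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ENTRIES 1024u

/* Hypothetical, simplified paging structures (no flag bits). */
typedef struct { uint32_t frame[ENTRIES]; } page_table;    /* 0 = not present */
typedef struct { page_table *table[ENTRIES]; } page_dir;   /* NULL = not present */

/* Translates a 32-bit linear address; returns 0 if no mapping exists. */
static uint32_t translate(const page_dir *pd, uint32_t linear)
{
    assert(pd != NULL);
    const page_table *pt = pd->table[linear >> 22];         /* directory index */
    if (pt == NULL)
        return 0;
    uint32_t frame = pt->frame[(linear >> 12) & 0x3FFu];    /* table index */
    if (frame == 0)
        return 0;
    return frame | (linear & 0xFFFu);                       /* add page offset */
}
```

If two page directories point one of their entries at the same `page_table`, linear addresses in that 4-MByte range translate to the same physical addresses for both tasks, while the remaining entries can stay private.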

6.5.2. Task Logical Address Space

To allow the sharing of data among tasks, use any of the following techniques to create shared logical-to-physical address-space mappings for data segments:

• Through the segment descriptors in the GDT. All tasks must have access to the segment descriptors in the GDT. If some segment descriptors in the GDT point to segments in the linear-address space that are mapped into an area of the physical-address space common to all tasks, then all tasks can share the data and code in those segments.

• Through a shared LDT. Two or more tasks can use the same LDT if the LDT fields in their TSSs point to the same LDT. If some segment descriptors in a shared LDT point to segments that are mapped to a common area of the physical address space, the data and code in those segments can be shared among the tasks that share the LDT. This method of sharing is more selective than sharing through the GDT, because the sharing can be limited to specific tasks. Other tasks in the system may have different LDTs that do not give them access to the shared segments.

• Through segment descriptors in distinct LDTs that are mapped to common addresses in the linear address space. If this common area of the linear address space is mapped to the same area of the physical address space for each task, these segment descriptors permit the tasks to share segments. Such segment descriptors are commonly called aliases. This method of sharing is even more selective than those listed above, because other segment descriptors in the LDTs may point to independent linear addresses which are not shared.

Figure 6-8. Overlapping Linear-to-Physical Mappings
(Figure not reproduced. Labels recovered from the original: Task A TSS and Task B TSS, each with a PDBR field; page directories containing PDEs; page tables containing PTEs, including a shared page table; page frames holding Task A pages, Task B pages, and shared pages.)

6.6. 16-BIT TASK-STATE SEGMENT (TSS)

The 32-bit Intel Architecture processors also recognize a 16-bit TSS format like the one used in Intel 286 processors (refer to Figure 6-9). It is supported for compatibility with software written to run on these earlier Intel Architecture processors.

The following additional information is important to know about the 16-bit TSS.

• Do not use a 16-bit TSS to implement a virtual-8086 task.

• The valid segment limit for a 16-bit TSS is 2CH.

• The 16-bit TSS does not contain a field for the base address of the page directory, which is loaded into control register CR3. Therefore, a separate set of page tables for each task is not supported for 16-bit tasks. If a 16-bit task is dispatched, the page-table structure for the previous task is used.

• The I/O base address is not included in the 16-bit TSS, so none of the functions of the I/O map are supported.

• When task state is saved in a 16-bit TSS, the upper 16 bits of the EFLAGS register and the EIP register are lost.

• When the general-purpose registers are loaded or saved from a 16-bit TSS, the upper 16 bits of the registers are modified and not maintained.


Figure 6-9. 16-Bit TSS Format

Field (bits 15-0)        Byte offset
Task LDT Selector        42
DS Selector              40
SS Selector              38
CS Selector              36
ES Selector              34
DI                       32
SI                       30
BP                       28
SP                       26
BX                       24
DX                       22
CX                       20
AX                       18
FLAG Word                16
IP (Entry Point)         14
SS2                      12
SP2                      10
SS1                      8
SP1                      6
SS0                      4
SP0                      2
Previous Task Link       0
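The layout in Figure 6-9 can be expressed as a C structure, with compile-time checks that the offsets match the figure. This is a sketch for illustration: the type name `tss16`, the field names, and the `save_state16` helper are hypothetical. The helper only demonstrates why the upper 16 bits of EIP and EFLAGS are lost when state is saved in this format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C view of the 16-bit TSS; every field is one 16-bit word. */
struct tss16 {
    uint16_t prev_task_link;   /* offset 0 */
    uint16_t sp0, ss0;         /* offsets 2, 4 */
    uint16_t sp1, ss1;         /* offsets 6, 8 */
    uint16_t sp2, ss2;         /* offsets 10, 12 */
    uint16_t ip;               /* offset 14 */
    uint16_t flags;            /* offset 16 */
    uint16_t ax, cx, dx, bx;   /* offsets 18 to 24 */
    uint16_t sp, bp, si, di;   /* offsets 26 to 32 */
    uint16_t es, cs, ss, ds;   /* offsets 34 to 40 */
    uint16_t ldt_selector;     /* offset 42 */
};

static_assert(offsetof(struct tss16, ip) == 14, "IP at offset 14");
static_assert(offsetof(struct tss16, ldt_selector) == 42, "LDT selector at offset 42");
static_assert(sizeof(struct tss16) == 44, "16-bit TSS occupies 44 bytes");

/* Stores only the low words; the upper 16 bits of EIP and EFLAGS are dropped. */
static void save_state16(struct tss16 *tss, uint32_t eip, uint32_t eflags)
{
    assert(tss != NULL);
    tss->ip = (uint16_t)eip;
    tss->flags = (uint16_t)eflags;
}
```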


CHAPTER 7
MULTIPLE-PROCESSOR MANAGEMENT

The Intel Architecture provides several mechanisms for managing and improving the performance of multiple processors connected to the same system bus. These mechanisms include:

• Bus locking and/or cache coherency management for performing atomic operations on system memory.

• Serializing instructions. (These instructions apply only to the Pentium® and P6 family processors.)

• Advanced programmable interrupt controller (APIC) located on the processor chip. (The APIC architecture was introduced into the Intel Architecture with the Pentium® processor.)

• A secondary (level 2, L2) cache. For the P6 family processors, the L2 cache is included in the processor package and is tightly coupled to the processor. For the Pentium® and Intel486™ processors, pins are provided to support an external L2 cache.

These mechanisms are particularly useful in symmetric-multiprocessing systems; however, they can also be used in applications where an Intel Architecture processor and a special-purpose processor (such as a communications, graphics, or video processor) share the system bus.

The main goals of these multiprocessing mechanisms are as follows:

• To maintain system memory coherency. When two or more processors are attempting simultaneously to access the same address in system memory, some communication mechanism or memory access protocol must be available to promote data coherency and, in some instances, to allow one processor to temporarily lock a memory location.

• To maintain cache consistency. When one processor accesses data cached in another processor, it must not receive incorrect data. If it modifies data, all other processors that access that data must receive the modified data.

• To allow predictable ordering of writes to memory. In some circumstances, it is important that memory writes be observed externally in precisely the same order as programmed.

• To distribute interrupt handling among a group of processors. When several processors are operating in a system in parallel, it is useful to have a centralized mechanism for receiving interrupts and distributing them to available processors for servicing.

The Intel Architecture’s caching mechanism and cache consistency are discussed in Chapter 9, Memory Cache Control. Bus and memory locking, serializing instructions, memory ordering, and the processor’s internal APIC are discussed in the following sections.


7.1. LOCKED ATOMIC OPERATIONS

The 32-bit Intel Architecture processors support locked atomic operations on locations in system memory. These operations are typically used to manage shared data structures (such as semaphores, segment descriptors, system segments, or page tables) in which two or more processors may try simultaneously to modify the same field or flag. The processor uses three interdependent mechanisms for carrying out locked atomic operations:

• Guaranteed atomic operations.

• Bus locking, using the LOCK# signal and the LOCK instruction prefix.

• Cache coherency protocols that ensure that atomic operations can be carried out on cached data structures (cache lock). This mechanism is present in the P6 family processors.

These mechanisms are interdependent in the following ways. Certain basic memory transactions (such as reading or writing a byte in system memory) are always guaranteed to be handled atomically. That is, once started, the processor guarantees that the operation will be completed before another processor or bus agent is allowed access to the memory location. The processor also supports bus locking for performing selected memory operations (such as a read-modify-write operation in a shared area of memory) that typically need to be handled atomically, but are not automatically handled this way. Because frequently used memory locations are often cached in a processor’s L1 or L2 caches, atomic operations can often be carried out inside a processor’s caches without asserting the bus lock. Here the processor’s cache coherency protocols ensure that other processors that are caching the same memory locations are managed properly while atomic operations are performed on cached memory locations.

Note that the mechanisms for handling locked atomic operations have evolved as the complexity of Intel Architecture processors has evolved. As such, more recent Intel Architecture processors (such as the P6 family processors) provide a more refined locking mechanism than earlier Intel Architecture processors, as is described in the following sections.

7.1.1. Guaranteed Atomic Operations

The Intel386™, Intel486™, Pentium®, and P6 family processors guarantee that the following basic memory operations will always be carried out atomically:

• Reading or writing a byte.

• Reading or writing a word aligned on a 16-bit boundary.

• Reading or writing a doubleword aligned on a 32-bit boundary.

The P6 family processors guarantee that the following additional memory operations will always be carried out atomically:

• Reading or writing a quadword aligned on a 64-bit boundary. (This operation is also guaranteed on the Pentium® processor.)

• 16-bit accesses to uncached memory locations that fit within a 32-bit data bus.

• 16-, 32-, and 64-bit accesses to cached memory that fit within a 32-Byte cache line.
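The alignment rules above can be captured in a small predicate. The C sketch below is a conservative illustration: the `cpu_family` enumeration and the function name are hypothetical, and the function covers only the naturally aligned cases. It deliberately reports "not guaranteed" for the additional P6 cases (unaligned accesses that fit within a 32-bit bus or a 32-byte cache line), because those depend on the memory type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical processor classes for this sketch. */
enum cpu_family { CPU_INTEL386, CPU_INTEL486, CPU_PENTIUM, CPU_P6 };

/* True if a read or write of `size` bytes at `addr` is covered by the
   naturally-aligned guarantees; conservative for everything else. */
static bool is_guaranteed_atomic(uintptr_t addr, size_t size,
                                 enum cpu_family cpu)
{
    assert(size > 0);
    switch (size) {
    case 1:  return true;                 /* any byte */
    case 2:  return addr % 2 == 0;        /* word on a 16-bit boundary */
    case 4:  return addr % 4 == 0;        /* doubleword on a 32-bit boundary */
    case 8:  return (cpu == CPU_PENTIUM || cpu == CPU_P6) &&
                    addr % 8 == 0;        /* quadword on a 64-bit boundary */
    default: return false;
    }
}
```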


Accesses to cacheable memory that are split across bus widths, cache lines, and page boundaries are not guaranteed to be atomic by the Intel486™, Pentium®, or P6 family processors. The P6 family processors provide bus control signals that permit external memory subsystems to make split accesses atomic; however, nonaligned data accesses will seriously impact the performance of the processor and should be avoided where possible.

7.1.2. Bus Locking

Intel Architecture processors provide a LOCK# signal that is asserted automatically during certain critical memory operations to lock the system bus. While this output signal is asserted, requests from other processors or bus agents for control of the bus are blocked. Software can specify other occasions when the LOCK semantics are to be followed by prepending the LOCK prefix to an instruction.

In the case of the Intel386™, Intel486™, and Pentium® processors, explicitly locked instructions will result in the assertion of the LOCK# signal. It is the responsibility of the hardware designer to make the LOCK# signal available in system hardware to control memory accesses among processors.

For the P6 family processors, if the memory area being accessed is cached internally in the processor, the LOCK# signal is generally not asserted; instead, locking is only applied to the processor’s caches (refer to Section 7.1.4., “Effects of a LOCK Operation on Internal Processor Caches”).

7.1.2.1. AUTOMATIC LOCKING

The operations on which the processor automatically follows the LOCK semantics are as follows:

• When executing an XCHG instruction that references memory.

• When setting the B (busy) flag of a TSS descriptor. The processor tests and sets the busy flag in the type field of the TSS descriptor when switching to a task. To ensure that two processors do not switch to the same task simultaneously, the processor follows the LOCK semantics while testing and setting this flag.

• When updating segment descriptors. When loading a segment descriptor, the processor will set the accessed flag in the segment descriptor if the flag is clear. During this operation, the processor follows the LOCK semantics so that the descriptor will not be modified by another processor while it is being updated. For this action to be effective, operating-system procedures that update descriptors should use the following steps:

— Use a locked operation to modify the access-rights byte to indicate that the segment descriptor is not-present, and specify a value for the type field that indicates that the descriptor is being updated.

— Update the fields of the segment descriptor. (This operation may require several memory accesses; therefore, locked operations cannot be used.)


— Use a locked operation to modify the access-rights byte to indicate that the segment descriptor is valid and present.

Note that the Intel386™ processor always updates the accessed flag in the segment descriptor, whether it is clear or not. The P6 family, Pentium®, and Intel486™ processors only update this flag if it is not already set.

• When updating page-directory and page-table entries. When updating page-directory and page-table entries, the processor uses locked cycles to set the accessed and dirty flags in the page-directory and page-table entries.

• Acknowledging interrupts. After an interrupt request, an interrupt controller may use the data bus to send the interrupt vector for the interrupt to the processor. The processor follows the LOCK semantics during this time to ensure that no other data appears on the data bus when the interrupt vector is being transmitted.
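The three-step descriptor update recommended above for operating-system procedures can be sketched with C11 atomics. This is an illustration under stated assumptions, not the hardware descriptor layout: the `descriptor` structure is a simplified stand-in, `TYPE_UPDATING` is a hypothetical software-chosen marker value, and the sketch assumes the compiler implements `atomic_exchange` on a byte with a locked operation (on Intel Architecture targets this is typically an XCHG).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define DESC_PRESENT  0x80u  /* P flag, bit 7 of the access-rights byte */
#define TYPE_UPDATING 0x00u  /* hypothetical "being updated" marker, P clear */

/* Simplified stand-in for a segment descriptor (not the real bit layout). */
struct descriptor {
    uint32_t base;
    uint32_t limit;
    _Atomic uint8_t access;  /* access-rights byte */
};

static void update_descriptor(struct descriptor *d, uint32_t base,
                              uint32_t limit, uint8_t access)
{
    assert(d != NULL);
    /* Step 1: locked write marks the descriptor not-present and in update. */
    (void)atomic_exchange(&d->access, (uint8_t)TYPE_UPDATING);
    /* Step 2: ordinary stores; several accesses, so they cannot be locked. */
    d->base = base;
    d->limit = limit;
    /* Step 3: locked write marks the descriptor valid and present again. */
    (void)atomic_exchange(&d->access, (uint8_t)(access | DESC_PRESENT));
}
```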

7.1.2.2. SOFTWARE CONTROLLED BUS LOCKING

To explicitly force the LOCK semantics, software can use the LOCK prefix with the following instructions when they are used to modify a memory location. An invalid-opcode exception (#UD) is generated when the LOCK prefix is used with any other instruction or when no write operation is made to memory (that is, when the destination operand is in a register).

• The bit test and modify instructions (BTS, BTR, and BTC).

• The exchange instructions (XADD, CMPXCHG, and CMPXCHG8B).

• The LOCK prefix is automatically assumed for the XCHG instruction.

• The following single-operand arithmetic and logical instructions: INC, DEC, NOT, and NEG.

• The following two-operand arithmetic and logical instructions: ADD, ADC, SUB, SBB, AND, OR, and XOR.

A locked instruction is guaranteed to lock only the area of memory defined by the destination operand, but may be interpreted by the system as a lock for a larger memory area.

Software should access semaphores (shared memory used for signaling between multiple processors) using identical addresses and operand lengths. For example, if one processor accesses a semaphore using a word access, other processors should not access the semaphore using a byte access.
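In C, locked read-modify-write operations are usually reached through the C11 `<stdatomic.h>` interface rather than by writing the LOCK prefix directly. The sketch below shows a counting semaphore that is always accessed as a 32-bit doubleword, in line with the advice on identical operand lengths. The function names are hypothetical, and the statement that compilers for Intel Architecture targets typically translate these calls into LOCK CMPXCHG and LOCK XADD (or LOCK ADD) is an assumption about common code generation, not a guarantee.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Tries to take one unit from the semaphore; never blocks. */
static bool sem_try_acquire(_Atomic uint32_t *sem)
{
    assert(sem != NULL);
    uint32_t seen = atomic_load(sem);
    while (seen > 0) {
        /* On failure, `seen` is reloaded with the current value. */
        if (atomic_compare_exchange_weak(sem, &seen, seen - 1))
            return true;
    }
    return false;  /* count was zero */
}

/* Returns one unit to the semaphore. */
static void sem_release(_Atomic uint32_t *sem)
{
    assert(sem != NULL);
    atomic_fetch_add(sem, 1);
}
```

Every access goes through the same `_Atomic uint32_t` object, so all processors use the same address and operand size.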

The integrity of a bus lock is not affected by the alignment of the memory field. The LOCK semantics are followed for as many bus cycles as necessary to update the entire operand. However, it is recommended that locked accesses be aligned on their natural boundaries for better system performance:

• Any boundary for an 8-bit access (locked or otherwise).

• 16-bit boundary for locked word accesses.

• 32-bit boundary for locked doubleword access.


• 64-bit boundary for locked quadword access.

Locked operations are atomic with respect to all other memory operations and all externally visible events. Only instruction fetch and page table accesses can pass locked instructions. Locked instructions can be used to synchronize data written by one processor and read by another processor.

For the P6 family processors, locked operations serialize all outstanding load and store operations (that is, wait for them to complete).

Locked instructions should not be used to ensure that data written can be fetched as instructions.

NOTE

The locked instructions for the current versions of the Intel486™, Pentium®, and P6 family processors will allow data written to be fetched as instructions. However, Intel recommends that developers who require the use of self-modifying code use a different synchronizing mechanism, described in the following sections.

7.1.3. Handling Self- and Cross-Modifying Code

The act of a processor writing data into a currently executing code segment with the intent of executing that data as code is called self-modifying code. Intel Architecture processors exhibit model-specific behavior when executing self-modified code, depending upon how far ahead of the current execution pointer the code has been modified. As processor architectures become more complex and start to speculatively execute code ahead of the retirement point (as in the P6 family processors), the rules regarding which code should execute, pre- or post-modification, become blurred. To write self-modifying code and ensure that it is compliant with current and future Intel Architectures, one of the following two coding options should be chosen.

(* OPTION 1 *)
Store modified code (as data) into code segment;
Jump to new code or an intermediate location;
Execute new code;

(* OPTION 2 *)
Store modified code (as data) into code segment;
Execute a serializing instruction; (* For example, CPUID instruction *)
Execute new code;

(The use of one of these options is not required for programs intended to run on the Pentium® or Intel486™ processors, but is recommended to ensure compatibility with the P6 family processors.)

It should be noted that self-modifying code will execute at a lower level of performance than nonself-modifying or normal code. The degree of the performance deterioration will depend upon the frequency of modification and specific characteristics of the code.


The act of one processor writing data into the currently executing code segment of a second processor with the intent of having the second processor execute that data as code is called cross-modifying code. As with self-modifying code, Intel Architecture processors exhibit model-specific behavior when executing cross-modifying code, depending upon how far ahead of the executing processor’s current execution pointer the code has been modified. To write cross-modifying code and ensure that it is compliant with current and future Intel Architectures, the following processor synchronization algorithm should be implemented.

; Action of Modifying Processor
Store modified code (as data) into code segment;
Memory_Flag ← 1;

; Action of Executing Processor
WHILE (Memory_Flag ≠ 1)
    Wait for code to update;
ELIHW;
Execute serializing instruction; (* For example, CPUID instruction *)
Begin executing modified code;

(The use of this option is not required for programs intended to run on the Intel486™ processor, but is recommended to ensure compatibility with the Pentium® and P6 family processors.)

Like self-modifying code, cross-modifying code will execute at a lower level of performance than noncross-modifying (normal) code, depending upon the frequency of modification and specific characteristics of the code.

7.1.4. Effects of a LOCK Operation on Internal Processor Caches

For the Intel486™ and Pentium® processors, the LOCK# signal is always asserted on the bus during a LOCK operation, even if the area of memory being locked is cached in the processor.

For the P6 family processors, if the area of memory being locked during a LOCK operation is cached in the processor that is performing the LOCK operation as write-back memory and is completely contained in a cache line, the processor may not assert the LOCK# signal on the bus. Instead, it will modify the memory location internally and allow its cache coherency mechanism to ensure that the operation is carried out atomically. This operation is called “cache locking.” The cache coherency mechanism automatically prevents two or more processors that have cached the same area of memory from simultaneously modifying data in that area.

7.2. MEMORY ORDERING

The term memory ordering refers to the order in which the processor issues reads (loads) and writes (stores) out onto the bus to system memory. The Intel Architecture supports several memory ordering models depending on the implementation of the architecture. For example, the Intel386™ processor enforces program ordering (generally referred to as strong ordering), where reads and writes are issued on the system bus in the order they occur in the instruction stream under all circumstances.

To allow optimizing of instruction execution, the Intel Architecture allows departures from the strong-ordering model, called processor ordering, in P6 family processors. These processor-ordering variations allow performance enhancing operations such as allowing reads to go ahead of writes by buffering writes. The goal of any of these variations is to increase instruction execution speeds, while maintaining memory coherency, even in multiple-processor systems.

The following sections describe the memory ordering models used by the Intel486™, Pentium®, and P6 family processors.

7.2.1. Memory Ordering in the Pentium® and Intel486™ Processors

The Pentium® and Intel486™ processors follow the processor-ordered memory model; however, they operate as strongly-ordered processors under most circumstances. Reads and writes always appear in programmed order at the system bus, except for the following situation where processor ordering is exhibited. Read misses are permitted to go ahead of buffered writes on the system bus when all the buffered writes are cache hits and, therefore, are not directed to the same address being accessed by the read miss.

In the case of I/O operations, both reads and writes always appear in programmed order.

Software intended to operate correctly in processor-ordered processors (such as the P6 family processors) should not depend on the relatively strong ordering of the Pentium® or Intel486™ processors. Instead, it should ensure that accesses to shared variables that are intended to control concurrent execution among processors are explicitly required to obey program ordering through the use of appropriate locking or serializing operations (refer to Section 7.2.4., “Strengthening or Weakening the Memory Ordering Model”).

7.2.2. Memory Ordering in the P6 Family Processors

The P6 family processors also use a processor-ordered memory ordering model that can be further defined as “write ordered with store-buffer forwarding.” This model can be characterized as follows.

In a single-processor system for memory regions defined as write-back cacheable, the following ordering rules apply:

1. Reads can be carried out speculatively and in any order.

2. Reads can pass buffered writes, but the processor is self-consistent.

3. Writes to memory are always carried out in program order.

4. Writes can be buffered.

5. Writes are not performed speculatively; they are only performed for instructions that have actually been retired.


6. Data from buffered writes can be forwarded to waiting reads within the processor.

7. Reads or writes cannot pass (be carried out ahead of) I/O instructions, locked instructions, or serializing instructions.

The second rule allows a read to pass a write. However, if the write is to the same memory location as the read, the processor’s internal “snooping” mechanism will detect the conflict and update the already cached read before the processor executes the instruction that uses the value.

The sixth rule constitutes an exception to an otherwise write ordered model.

In a multiple-processor system, the following ordering rules apply:

• Individual processors use the same ordering rules as in a single-processor system.

• Writes by a single processor are observed in the same order by all processors.

• Writes from the individual processors on the system bus are globally observed and are NOT ordered with respect to each other.

The latter rule can be clarified by the example in Figure 7-1. Consider three processors in a system, each of which performs three writes, one to each of three defined locations (A, B, and C). Individually, the processors perform the writes in the same program order, but because of bus arbitration and other memory access mechanisms, the order that the three processors write the individual memory locations can differ each time the respective code sequences are executed on the processors. The final values in location A, B, and C would possibly vary on each execution of the write sequence.

Figure 7-1. Example of Write Ordering in Multiple-Processor Systems

Order of Writes From Individual Processors
(Each processor is guaranteed to perform writes in program order.)
Processor #1: Write A.1, Write B.1, Write C.1
Processor #2: Write A.2, Write B.2, Write C.2
Processor #3: Write A.3, Write B.3, Write C.3

Example of Order of Actual Writes From All Processors to Memory
(Writes are in order with respect to individual processors. Writes from all processors are not guaranteed to occur in a particular order.)
Write A.1, Write B.1, Write A.2, Write A.3, Write C.1, Write B.2, Write C.2, Write B.3, Write C.3
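The property illustrated by Figure 7-1 can be stated as a check: an observed global sequence of writes is consistent with the multiple-processor rules if each processor's own writes appear in its program order, whatever the interleaving between processors. The C sketch below is a hypothetical model for illustration; the `write_event` structure, the `MAX_CPUS` limit, and the function name are not part of the architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CPUS 8  /* arbitrary limit for this sketch */

struct write_event {
    int cpu;  /* issuing processor, 0-based */
    int seq;  /* position in that processor's program order, 0-based */
};

/* True if every processor's writes occur in program order in `observed`. */
static bool respects_program_order(const struct write_event *observed,
                                   size_t n)
{
    int next_seq[MAX_CPUS] = {0};

    for (size_t i = 0; i < n; i++) {
        int cpu = observed[i].cpu;
        assert(cpu >= 0 && cpu < MAX_CPUS);
        if (observed[i].seq != next_seq[cpu])
            return false;  /* a write was seen ahead of an earlier one */
        next_seq[cpu]++;
    }
    return true;
}
```

The example sequence in Figure 7-1 (A.1, B.1, A.2, A.3, C.1, B.2, C.2, B.3, C.3) passes this check, while a sequence in which Write B.1 precedes Write A.1 does not.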


The processor-ordering model described in this section is virtually identical to that used by the Pentium® and Intel486™ processors. The only enhancements in the P6 family processors are:

• Added support for speculative reads.

• Store-buffer forwarding, when a read passes a write to the same memory location.

• Out of order store from long string store and string move operations (refer to Section 7.2.3., “Out of Order Stores From String Operations in P6 Family Processors” below).

7.2.3. Out of Order Stores From String Operations in P6 Family Processors

The P6 family processors modify the processors operation during the string store oper(initiated with the MOVS and STOS instructions) to maximize performance. Once the string” operations initial conditions are met (as described below), the processor will essenoperate on, from an external perspective, the string in a cache line by cache line modresults in the processor looping on issuing a cache-line read for the source address and aidation on the external bus for the destination address, knowing that all bytes in the destcache line will be modified, for the length of the string. In this mode interrupts will onlyaccepted by the processor on cache line boundaries. It is possible in this mode that the dtion line invalidations, and therefore stores, will be issued on the external bus out of orde

Code dependent upon sequential store ordering should not use the string operations for thdata structure to be stored. Data and semaphores should be separated. Order dependshould use a discrete semaphore uniquely stored to after any string operations to allow coordered data to be seen by all processors.
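The separation of data and semaphore recommended above can be sketched in C as follows. This is a minimal illustration, not code from this manual: the structure `shared_block` and the functions `publish_block` and `try_consume_block` are hypothetical names, and the sketch assumes a C11 compiler, using C11 atomics for the flag store rather than any particular instruction sequence. The bulk copy (which a compiler may implement with string moves) is followed by one discrete store to a flag that is kept apart from the data.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical shared record: the payload and a separate "ready" flag. */
struct shared_block {
    unsigned char data[256];
    atomic_int    ready;     /* the discrete semaphore, kept apart from the data */
};

/* Producer: bulk-copy the payload, then publish it with one separate store. */
void publish_block(struct shared_block *blk, const void *src, size_t len)
{
    if (len > sizeof blk->data)
        len = sizeof blk->data;
    memcpy(blk->data, src, len);                 /* may become a string move */
    atomic_store_explicit(&blk->ready, 1, memory_order_release);
}

/* Consumer: copies the payload out only after the flag has been observed set.
 * Returns 1 on success, 0 if the data has not been published yet. */
int try_consume_block(struct shared_block *blk, void *dst, size_t len)
{
    if (!atomic_load_explicit(&blk->ready, memory_order_acquire))
        return 0;
    if (len > sizeof blk->data)
        len = sizeof blk->data;
    memcpy(dst, blk->data, len);
    return 1;
}
```

The flag is never written as part of the bulk copy itself, which is the point of the recommendation: readers test only the flag, and the flag is stored after the string operation has finished.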

Initial conditions for “fast string” operations:

• Source and destination addresses must be 8-byte aligned.

• String operation must be performed in ascending address order.

• The initial operation counter (ECX) must be equal to or greater than 64.

• Source and destination must not overlap by less than a cache line (32 bytes).

• The memory type for both source and destination addresses must be either WB or WC.
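The initial conditions listed above can be collected into a single predicate. The following C sketch is illustrative only: the function `fast_string_eligible` and the enumeration `mem_type` are hypothetical names, and the overlap condition is read here as "source and destination addresses are at least one cache line apart", which is an interpretation of the text rather than a statement from it.

```c
#include <stdbool.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 32u   /* cache line size given in the text */
#define MIN_FAST_COUNT   64u   /* minimum initial ECX value given in the text */

/* Illustrative memory-type labels; only WB and WC qualify. */
enum mem_type { MEM_UC, MEM_WC, MEM_WT, MEM_WP, MEM_WB };

/* Returns true when every listed initial condition holds. */
bool fast_string_eligible(uintptr_t src, uintptr_t dst, uint32_t ecx,
                          bool ascending,
                          enum mem_type src_type, enum mem_type dst_type)
{
    uintptr_t gap = (src > dst) ? src - dst : dst - src;
    bool types_ok = (src_type == MEM_WB || src_type == MEM_WC) &&
                    (dst_type == MEM_WB || dst_type == MEM_WC);

    return (src % 8 == 0) && (dst % 8 == 0) &&   /* 8-byte alignment      */
           ascending &&                          /* ascending addresses   */
           ecx >= MIN_FAST_COUNT &&              /* ECX >= 64             */
           gap >= CACHE_LINE_BYTES &&            /* assumed overlap rule  */
           types_ok;                             /* WB or WC memory types */
}
```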

7.2.4. Strengthening or Weakening the Memory Ordering Model

The Intel Architecture provides several mechanisms for strengthening or weakening the memory ordering model to handle special programming situations. These mechanisms include:

• The I/O instructions, locking instructions, the LOCK prefix, and serializing instructions force stronger ordering on the processor.

• The memory type range registers (MTRRs) can be used to strengthen or weaken memory ordering for specific areas of physical memory (refer to Section 9.12., “Memory Type Range Registers (MTRRs)”, in Chapter 9, Memory Cache Control). MTRRs are available only in the P6 family processors.

These mechanisms can be used as follows.

Memory mapped devices and other I/O devices on the bus are often sensitive to the order of writes to their I/O buffers. I/O instructions (the IN and OUT instructions) can be used to impose strong write ordering on such accesses as follows. Prior to executing an I/O instruction, the processor waits for all previous instructions in the program to complete and for all buffered writes to drain to memory. Only instruction fetch and page table walks can pass I/O instructions. Execution of subsequent instructions does not begin until the processor determines that the I/O instruction has been completed.

Synchronization mechanisms in multiple-processor systems may depend upon a strong memory-ordering model. Here, a program can use a locking instruction such as the XCHG instruction or the LOCK prefix to ensure that a read-modify-write operation on memory is carried out atomically. Locking operations typically operate like I/O operations in that they wait for all previous instructions to complete and for all buffered writes to drain to memory (refer to Section 7.1.2., “Bus Locking”).

Program synchronization can also be carried out with serializing instructions (refer to Section 7.4., “Serializing Instructions”). These instructions are typically used at critical procedure or task boundaries to force completion of all previous instructions before a jump to a new section of code or a context switch occurs. Like the I/O and locking instructions, the processor waits until all previous instructions have been completed and all buffered writes have been drained to memory before executing the serializing instruction.

The MTRRs were introduced in the P6 family processors to define the cache characteristics for specified areas of physical memory. The following are two examples of how memory types set up with MTRRs can be used to strengthen or weaken memory ordering for the P6 family processors:

• The uncached (UC) memory type forces a strong-ordering model on memory accesses. Here, all reads and writes to the UC memory region appear on the bus and out-of-order or speculative accesses are not performed. This memory type can be applied to an address range dedicated to memory mapped I/O devices to force strong memory ordering.

• For areas of memory where weak ordering is acceptable, the write back (WB) memory type can be chosen. Here, reads can be performed speculatively and writes can be buffered and combined. For this type of memory, cache locking is performed on atomic (locked) operations that do not split across cache lines, which helps to reduce the performance penalty associated with the use of the typical synchronization instructions, such as XCHG, that lock the bus during the entire read-modify-write operation. With the WB memory type, the XCHG instruction locks the cache instead of the bus if the memory access is contained within a cache line.
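A locked exchange of the kind discussed in this section is commonly used to build a simple spin lock. The C sketch below is an illustration under stated assumptions, not code from this manual: the type `xchg_lock_t` and the three functions are hypothetical names, and the sketch relies on the C11 `atomic_exchange_explicit` operation, which compilers for the Intel Architecture commonly implement with an XCHG instruction. Keeping the lock variable within a single cache line lets the processor use cache locking on WB memory, as described above.

```c
#include <stdatomic.h>

/* Hypothetical lock word: 0 = free, 1 = held. */
typedef struct {
    atomic_int locked;
} xchg_lock_t;

/* One atomic read-modify-write: store 1 and look at the previous value.
 * Returns 1 if the lock was free and is now held by the caller. */
int xchg_trylock(xchg_lock_t *l)
{
    return atomic_exchange_explicit(&l->locked, 1, memory_order_acquire) == 0;
}

/* Spin until the exchange observes the lock free. */
void xchg_lock(xchg_lock_t *l)
{
    while (!xchg_trylock(l)) {
        /* busy-wait */
    }
}

/* Release the lock with a plain store of 0. */
void xchg_unlock(xchg_lock_t *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```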

It is recommended that software written to run on P6 family processors assume the processor-ordering model or a weaker memory-ordering model. The P6 family processors do not implement a strong memory-ordering model, except when using the UC memory type. Despite the fact that P6 family processors support processor ordering, Intel does not guarantee that future processors will support this model. To make software portable to future processors, it is recommended that operating systems provide critical region and resource control constructs and APIs (application program interfaces) based on I/O, locking, and/or serializing instructions, and that these be used to synchronize access to shared areas of memory in multiple-processor systems. Also, software should not depend on processor ordering in situations where the system hardware does not support this memory-ordering model.

7.3. PROPAGATION OF PAGE TABLE ENTRY CHANGES TO MULTIPLE PROCESSORS

In a multiprocessor system, when one processor changes a page table entry or mapping, the changes must also be propagated to all the other processors. This process is also known as “TLB Shootdown.” Propagation may be done by memory-based semaphores and/or interprocessor interrupts between processors. One naive but algorithmically correct TLB Shootdown sequence for the Intel Architecture is:

1. Begin barrier: Stop all processors. Cause all but one to HALT or stop in a spinloop.

2. Let the active processor change the PTE(s).

3. Let all processors invalidate the PTE(s) modified in their TLBs.

4. End barrier: Resume all processors.
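The four steps above can be traced with a small software model. The following C sketch is a single-threaded simulation only, written to show the order of the steps; it does not stop real processors or touch real TLBs. All names (`cpu_model`, `system_model`, `tlb_shootdown`, `translate`, `NCPU`) are hypothetical, and each modeled processor caches a single translation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NCPU 4   /* number of modeled processors (arbitrary) */

struct cpu_model {
    bool     running;     /* false while parked at the barrier      */
    bool     tlb_valid;   /* true when a translation is cached      */
    uint32_t tlb_pte;     /* cached copy of the page table entry    */
};

struct system_model {
    uint32_t         pte;        /* the page table entry in memory  */
    struct cpu_model cpu[NCPU];
};

/* Models a translation: a TLB miss reloads the entry from memory. */
uint32_t translate(struct system_model *s, size_t cpu)
{
    struct cpu_model *c = &s->cpu[cpu];
    if (!c->tlb_valid) {
        c->tlb_pte = s->pte;
        c->tlb_valid = true;
    }
    return c->tlb_pte;
}

/* Walks through the naive shootdown sequence for one PTE change. */
void tlb_shootdown(struct system_model *s, size_t initiator, uint32_t new_pte)
{
    size_t i;

    for (i = 0; i < NCPU; i++)            /* 1. begin barrier: park all but one */
        s->cpu[i].running = (i == initiator);

    s->pte = new_pte;                     /* 2. active processor changes the PTE */

    for (i = 0; i < NCPU; i++)            /* 3. every processor invalidates      */
        s->cpu[i].tlb_valid = false;

    for (i = 0; i < NCPU; i++)            /* 4. end barrier: resume all          */
        s->cpu[i].running = true;
}
```

Because no modeled processor runs between steps 1 and 4, none can use the old translation after the entry has changed, which is the property the barrier is there to provide.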

Alternate, performance-optimized, TLB Shootdown algorithms may be developed; however, care must be taken by the developers to ensure that either:

• The differing TLB mappings are not actually used on different processors during the update process.

OR

• The operating system is prepared to deal with the case where processor(s) is/are using the stale mapping during the update process.

7.4. SERIALIZING INSTRUCTIONS

The Intel Architecture defines several serializing instructions. These instructions force the processor to complete all modifications to flags, registers, and memory by previous instructions and to drain all buffered writes to memory before the next instruction is fetched and executed. For example, when a MOV to control register instruction is used to load a new value into control register CR0 to enable protected mode, the processor must perform a serializing operation before it enters protected mode. This serializing operation ensures that all operations that were started while the processor was in real-address mode are completed before the switch to protected mode is made.

The concept of serializing instructions was introduced into the Intel Architecture with the Pentium® processor to support parallel instruction execution. Serializing instructions have no meaning for the Intel486™ and earlier processors that do not implement parallel instruction execution.

It is important to note that execution of serializing instructions on P6 family processors constrains speculative execution, because the results of speculatively executed instructions are discarded.

The following instructions are serializing instructions:

• Privileged serializing instructions: MOV (to control register), MOV (to debug register), WRMSR, INVD, INVLPG, WBINVD, LGDT, LLDT, LIDT, and LTR.

• Nonprivileged serializing instructions: CPUID, IRET, and RSM.

The CPUID instruction can be executed at any privilege level to serialize instruction execution with no effect on program flow, except that the EAX, EBX, ECX, and EDX registers are modified.

Nothing can pass a serializing instruction, and serializing instructions cannot pass any other instruction (read, write, instruction fetch, or I/O).

When the processor serializes instruction execution, it ensures that all pending memory transactions are completed, including writes stored in its store buffer, before it executes the next instruction.

The following additional information is worth noting regarding serializing instructions:

• The processor does not write back the contents of modified data in its data cache to external memory when it serializes instruction execution. Software can force modified data to be written back by executing the WBINVD instruction, which is a serializing instruction. It should be noted that frequent use of the WBINVD instruction will seriously reduce system performance.

• When an instruction is executed that enables or disables paging (that is, changes the PG flag in control register CR0), the instruction should be followed by a jump instruction. The target instruction of the jump instruction is fetched with the new setting of the PG flag (that is, paging is enabled or disabled), but the jump instruction itself is fetched with the previous setting. The P6 family processors do not require the jump operation following the move to register CR0 (because any use of the MOV instruction in a P6 family processor to write to CR0 is completely serializing). However, to maintain backwards and forward compatibility with code written to run on other Intel Architecture processors, it is recommended that the jump operation be performed.

• Whenever an instruction is executed to change the contents of CR3 while paging is enabled, the next instruction is fetched using the translation tables that correspond to the new value of CR3. Therefore the next instruction and the sequentially following instructions should have a mapping based upon the new value of CR3. (Global entries in the TLBs are not invalidated, refer to Section 9.10., “Invalidating the Translation Lookaside Buffers (TLBs)”, Chapter 9, Memory Cache Control.)

• The Pentium® and P6 family processors use branch-prediction techniques to improve performance by prefetching the destination of a branch instruction before the branch instruction is executed. Consequently, instruction execution is not deterministically serialized when a branch instruction is executed.

7.5. ADVANCED PROGRAMMABLE INTERRUPT CONTROLLER (APIC)

The Advanced Programmable Interrupt Controller (APIC), referred to in the following sections as the local APIC, was introduced into the Intel Architecture with the Pentium® processor (beginning with the 735/90 and 815/100 models) and is included in all P6 family processors. The local APIC performs two main functions for the processor:

• It processes local external interrupts that the processor receives at its interrupt pins and local internal interrupts that software generates.

• In multiple-processor systems, it communicates with an external I/O APIC chip. The external I/O APIC receives external interrupt events from the system and interprocessor interrupts from the processors on the system bus and distributes them to the processors on the system bus. The I/O APIC is part of Intel’s system chip set.

Figure 7-2 shows the relationship of the local APICs on the processors in a multiple-processor (MP) system and the I/O APIC. The local APIC controls the dispatching of interrupts (to its associated processor) that it receives either locally or from the I/O APIC. It provides facilities for queuing, nesting and masking of interrupts. It handles the interrupt delivery protocol with its local processor and accesses to APIC registers, and also manages interprocessor interrupts and remote APIC register reads. A timer on the local APIC allows local generation of interrupts, and local interrupt pins permit local reception of processor-specific interrupts. The local APIC can be disabled and used in conjunction with a standard 8259A-style interrupt controller. (Disabling the local APIC can be done in hardware for the Pentium® processors or in software for the P6 family processors.)

The I/O APIC is responsible for receiving interrupts generated by I/O devices and distributing them among the local APICs by means of the APIC Bus. The I/O APIC manages interrupts using either static or dynamic distribution schemes. Dynamic distribution of interrupts allows routing of interrupts to the lowest priority processors. It also handles the distribution of interprocessor interrupts and system-wide control functions such as NMI, INIT, SMI and start-up interprocessor interrupts. Individual pins on the I/O APIC can be programmed to generate a specific, prioritized interrupt vector when asserted. The I/O APIC also has a “virtual wire mode” that allows it to cooperate with an external 8259A in the system.

The APIC in the Pentium® and P6 family processors is an architectural subset of the Intel 82489DX external APIC. The differences are described in Section 7.5.19., “Software Visible Differences Between the Local APIC and the 82489DX”.

The following sections focus on the local APIC, and its implementation in the P6 family processors. Contact Intel for the information on I/O APIC.


7.5.1. Presence of APIC

Beginning with the P6 family processors, the presence or absence of an on-chip APIC can be detected using the CPUID instruction. When the CPUID instruction is executed, bit 9 of the feature flags returned in the EDX register indicates the presence (set) or absence (clear) of an on-chip local APIC.
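The bit test described above can be sketched in C. The function name `edx_reports_local_apic` and the constant `CPUID_EDX_APIC_BIT` are hypothetical. The sketch assumes the caller has already executed CPUID with EAX set to 1 and passes in the feature flags returned in EDX; how CPUID is issued is compiler-specific and is left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID_EDX_APIC_BIT 9u   /* bit 9 of the feature flags in EDX */

/* Returns true when the feature flags report an on-chip local APIC. */
bool edx_reports_local_apic(uint32_t feature_edx)
{
    return ((feature_edx >> CPUID_EDX_APIC_BIT) & 1u) != 0;
}
```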

7.5.2. Enabling or Disabling the Local APIC

For the P6 family processors, a flag (the E flag, bit 11) in the APIC_BASE_MSR register permits the local APIC to be explicitly enabled or disabled. Refer to Section 7.5.8., “Relocation of the APIC Registers Base Address” for a description of this flag. For the Pentium® processor, the APICEN pin (which is shared with the PICD1 pin) is used during reset to enable or disable the local APIC.

7.5.3. APIC Bus

All I/O APIC and local APICs communicate through the APIC bus (a 3-line inter-APIC bus). Two of the lines are open-drain (wired-OR) and are used for data transmission; the third line is a clock. The bus and its messages are invisible to software and are not classed as architectural (that is, the APIC bus and message format may change in future implementations without having any effect on software compatibility).

Figure 7-2. I/O APIC and Local APICs in Multiple-Processor Systems

[Diagram: Processors #1, #2, and #3, each consisting of a CPU and a local APIC that receives local interrupts, are connected over the APIC bus to the I/O APIC in the I/O chip set, which receives external interrupts.]

7.5.4. Valid Interrupts

The local and I/O APICs support 240 distinct vectors in the range of 16 to 255. Interrupt priority is implied by its vector, according to the following relationship:

priority = vector / 16

One is the lowest priority and 15 is the highest. Vectors 16 through 31 are reserved for exclusive use by the processor. The remaining vectors are for general use. The processor’s local APIC includes an in-service entry and a holding entry for each priority level. To avoid losing interrupts, software should allocate no more than 2 interrupt vectors per priority.
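The vector-to-priority relationship above uses integer division. A minimal C sketch follows; the function names are hypothetical and chosen for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* priority = vector / 16, using integer division. */
unsigned apic_priority_of_vector(uint8_t vector)
{
    return vector / 16u;
}

/* Vectors 16 through 255 are the ones the APICs support. */
bool apic_vector_is_valid(uint8_t vector)
{
    return vector >= 16;
}

/* Vectors 16 through 31 are reserved for use by the processor. */
bool apic_vector_is_reserved(uint8_t vector)
{
    return vector >= 16 && vector <= 31;
}
```

For example, vectors 32 through 47 all map to priority 2, so under the two-vectors-per-priority guideline software would use at most two of those sixteen vectors.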

7.5.5. Interrupt Sources

The local APIC can receive interrupts from the following sources:

• Interrupt pins on the processor chip, driven by locally connected I/O devices.

• A bus message from the I/O APIC, originated by an I/O device connected to the I/O APIC.

• A bus message from another processor’s local APIC, originated as an interprocessor interrupt.

• The local APIC’s programmable timer or the error register, through the self-interrupt generating mechanism.

• Software, through the self-interrupt generating mechanism.

• (P6 family processors.) The performance-monitoring counters.

The local APIC services the I/O APIC and interprocessor interrupts according to the information included in the bus message (such as vector, trigger type, interrupt destination, etc.). Interpretation of the processor’s interrupt pins and the timer-generated interrupts is programmable, by means of the local vector table (LVT). To generate an interprocessor interrupt, the source processor programs its interrupt command register (ICR). The programming of the ICR causes generation of a corresponding interrupt bus message. Refer to Section 7.5.11., “Local Vector Table” and Section 7.5.12., “Interprocessor and Self-Interrupts” for detailed information on programming the LVT and ICR, respectively.

7.5.6. Bus Arbitration Overview

Being connected on a common bus (the APIC bus), the local and I/O APICs have to arbitrate for permission to send a message on the APIC bus. Logically, the APIC bus is a wired-OR connection, enabling more than one local APIC to send messages simultaneously. Each APIC issues its arbitration priority at the beginning of each message, and one winner is collectively selected following an arbitration round. At any given time, a local APIC’s arbitration priority is a unique value from 0 to 15. The arbitration priority of each local APIC is dynamically modified after each successfully transmitted message to preserve fairness. Refer to Section 7.5.16., “APIC Bus Arbitration Mechanism and Protocol” for a detailed discussion of bus arbitration.

Section 7.5.3., “APIC Bus” describes the existing arbitration protocols and bus message formats, while Section 7.5.12., “Interprocessor and Self-Interrupts” describes the INIT level de-assert message, used to resynchronize all local APICs’ arbitration IDs. Note that except for start-up (refer to Section 7.5.11., “Local Vector Table”), all bus messages failing during delivery are automatically retried. The software should avoid situations in which interrupt messages may be “ignored” by disabled or nonexistent “target” local APICs, and messages are being resent repeatedly.

7.5.7. The Local APIC Block Diagram

Figure 7-3 gives a functional block diagram for the local APIC. Software interacts with the local APIC by reading and writing its registers. The registers are memory-mapped to the processor’s physical address space, and for each processor they have an identical address space of 4 KBytes starting at address FEE00000H. (Refer to Section 7.5.8., “Relocation of the APIC Registers Base Address” for information on relocating the APIC registers base address for the P6 family processors.)

NOTE

For P6 family processors, the APIC handles all memory accesses to addresses within the 4-KByte APIC register space and no external bus cycles are produced. For the Pentium® processors with an on-chip APIC, bus cycles are produced for accesses to the 4-KByte APIC register space. Thus, for software intended to run on Pentium® processors, system software should explicitly not map the APIC register space to regular system memory. Doing so can result in an invalid opcode exception (#UD) being generated or unpredictable execution.

The 4-KByte APIC register address space should be mapped as uncacheable (UC); refer to Chapter 9, Memory Cache Control.


Within the 4-KByte APIC register area, the register address allocation scheme is shown in Table 7-1. Register offsets are aligned on 128-bit boundaries. All registers must be accessed using 32-bit loads and stores. Wider registers (64-bit or 256-bit) are defined and accessed as independent multiple 32-bit registers. If a LOCK prefix is used with a MOV instruction that accesses the APIC address space, the prefix is ignored; that is, a locking operation does not take place.
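The addressing rules above (register offsets on 128-bit boundaries, 32-bit accesses only, wide registers split into independent 32-bit registers) can be illustrated with a few address calculations in C. The macro, enumerator, and function names are hypothetical; the offsets are taken from Table 7-1. The mapping of a vector number to word `vector / 32` and bit `vector % 32` of the 256-bit ISR, TMR, and IRR is the usual layout and is stated here as an assumption, since this section does not spell it out.

```c
#include <stdint.h>

#define APIC_DEFAULT_BASE 0xFEE00000u  /* default base of the 4-KByte window */
#define APIC_REG_STRIDE   0x10u        /* 128-bit spacing between registers  */

/* A few register offsets from Table 7-1. */
enum {
    APIC_REG_ID  = 0x020,
    APIC_REG_TPR = 0x080,
    APIC_REG_EOI = 0x0B0,
    APIC_REG_ISR = 0x100,   /* first of eight 32-bit words */
    APIC_REG_TMR = 0x180,
    APIC_REG_IRR = 0x200
};

/* Physical address of a register at the given offset. */
uint32_t apic_reg_address(uint32_t base, uint32_t offset)
{
    return base + offset;
}

/* Address of the 32-bit word of a 256-bit register that holds `vector`. */
uint32_t apic_status_word_address(uint32_t base, uint32_t first, unsigned vector)
{
    return base + first + (vector / 32u) * APIC_REG_STRIDE;
}

/* Bit position of `vector` within that 32-bit word. */
unsigned apic_status_bit(unsigned vector)
{
    return vector % 32u;
}

/* One 32-bit load from a mapping of the register window. */
uint32_t apic_read32(const volatile uint32_t *window, uint32_t offset)
{
    return window[offset / 4u];
}
```

In system software, `window` would point at an uncacheable mapping of the APIC register space; any `uint32_t` array can stand in for it when exercising the address arithmetic.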

Figure 7-3. Local APIC Structure

[Block diagram. Blocks shown: Version Register; Timer (Current Count, Initial Count, and Divide Configuration Registers); Local Vector Table (Timer, Local Interrupts 0,1 (LINT0/1), Performance Monitoring Counters*, Error); Interrupt Command Register; APIC ID Register, Logical Destination Register, and Destination Format Register; TMR, ISR, and IRR registers; software transparent registers (Arb. ID Register, Processor Priority); Task Priority Register; EOI Register; Prioritizer; Acceptance Logic; Vector Decode; Register Select; and APIC Bus Send/Receive Logic attached to the APIC serial bus. Signals shown: INTR, EXTINT, INTA, INIT, NMI, SMI, DATA/ADDR.]

* Available only in P6 family processors


Table 7-1. Local APIC Register Address Map

Address                        Register Name                        Software Read/Write
FEE0 0000H                     Reserved
FEE0 0010H                     Reserved
FEE0 0020H                     Local APIC ID Register               Read/Write
FEE0 0030H                     Local APIC Version Register          Read only
FEE0 0040H                     Reserved
FEE0 0050H                     Reserved
FEE0 0060H                     Reserved
FEE0 0070H                     Reserved
FEE0 0080H                     Task Priority Register               Read/Write
FEE0 0090H                     Arbitration Priority Register        Read only
FEE0 00A0H                     Processor Priority Register          Read only
FEE0 00B0H                     EOI Register                         Write only
FEE0 00C0H                     Reserved
FEE0 00D0H                     Logical Destination Register         Read/Write
FEE0 00E0H                     Destination Format Register          Bits 0-27 Read only. Bits 28-31 Read/Write
FEE0 00F0H                     Spurious-Interrupt Vector Register   Bits 0-3 Read only. Bits 4-9 Read/Write
FEE0 0100H through FEE0 0170H  ISR 0-255                            Read only
FEE0 0180H through FEE0 01F0H  TMR 0-255                            Read only
FEE0 0200H through FEE0 0270H  IRR 0-255                            Read only
FEE0 0280H                     Error Status Register                Read only
FEE0 0290H through FEE0 02F0H  Reserved
FEE0 0300H                     Interrupt Command Reg. 0-31          Read/Write
FEE0 0310H                     Interrupt Command Reg. 32-63         Read/Write
FEE0 0320H                     Local Vector Table (Timer)           Read/Write
FEE0 0330H                     Reserved
FEE0 0340H                     Performance Counter LVT (note 1)     Read/Write
FEE0 0350H                     Local Vector Table (LINT0)           Read/Write
FEE0 0360H                     Local Vector Table (LINT1)           Read/Write
FEE0 0370H                     Local Vector Table (Error) (note 2)  Read/Write
FEE0 0380H                     Initial Count Register for Timer     Read/Write

FEE0 0390H                     Current Count Register for Timer     Read only
FEE0 03A0H through FEE0 03D0H  Reserved
FEE0 03E0H                     Timer Divide Configuration Register  Read/Write
FEE0 03F0H                     Reserved

NOTES:

1. Introduced into the APIC Architecture in the Pentium® Pro processor.

2. Introduced into the APIC Architecture in the Pentium® processor.

7.5.8. Relocation of the APIC Registers Base Address

The P6 family processors permit the starting address of the APIC registers to be relocated from FEE00000H to another physical address. This extension of the APIC architecture is provided to help resolve conflicts with memory maps of existing systems. The P6 family processors also provide the ability to enable or disable the local APIC.

An alternate APIC base address is specified through the APIC_BASE_MSR register. This MSR is located at MSR address 27 (1BH). Figure 7-4 shows the encoding of the bits in this register. This register also provides the flag for enabling or disabling the local APIC.

Figure 7-4. APIC_BASE_MSR

[Register layout: bits 63-36 reserved; bits 35-12 APIC Base (base physical address); bit 11 E (APIC enable/disable); bits 10-9 reserved; bit 8 BSP (processor is BSP); bits 7-0 reserved.]

The functions of the bits in the APIC_BASE_MSR register are as follows:

BSP flag, bit 8
Indicates if the processor is the bootstrap processor (BSP), determined during the MP initialization (refer to Section 7.7., “Multiple-Processor (MP) Initialization Protocol”). Following a power-up or reset, this flag is clear for all the processors in the system except the single BSP.

E (APIC Enabled) flag, bit 11
Permits the local APIC to be enabled (set) or disabled (clear). Following a power-up or reset, this flag is set, enabling the local APIC. When this flag is clear, the processor is functionally equivalent to an Intel Architecture processor without an on-chip APIC (for example, an Intel486™ processor). This flag is implementation dependent and is not guaranteed to be available or available at the same location in future Intel Architecture processors.

APIC Base field, bits 12 through 35
Specifies the base address of the APIC registers. This 24-bit value is extended by 12 bits at the low end to form the base address, which automatically aligns the address on a 4-KByte boundary. Following a power-up or reset, this field is set to FEE00000H.

Bits 0 through 7, bits 9 and 10, and bits 36 through 63 in the APIC_BASE_MSR register are reserved.
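The field layout of the APIC_BASE_MSR register can be expressed as masks on a 64-bit value. The following C sketch is illustrative: the macro names, the structure `apic_base_info`, and the two functions are hypothetical, and the sketch only decodes or edits a value already held in a variable. Reading and writing the MSR itself requires the privileged RDMSR and WRMSR instructions and is not shown.

```c
#include <stdbool.h>
#include <stdint.h>

#define APIC_BASE_MSR_ADDR   0x1Bu                  /* MSR address 27 (1BH) */
#define APIC_BASE_BSP        (1ull << 8)            /* BSP flag, bit 8      */
#define APIC_BASE_ENABLE     (1ull << 11)           /* E flag, bit 11       */
#define APIC_BASE_FIELD_MASK 0x0000000FFFFFF000ull  /* bits 12 through 35   */

struct apic_base_info {
    bool     is_bsp;    /* processor is the bootstrap processor  */
    bool     enabled;   /* local APIC enabled                    */
    uint64_t base;      /* 4-KByte aligned physical base address */
};

/* Splits an APIC_BASE_MSR value into its three defined fields. */
struct apic_base_info decode_apic_base(uint64_t msr)
{
    struct apic_base_info info;
    info.is_bsp  = (msr & APIC_BASE_BSP) != 0;
    info.enabled = (msr & APIC_BASE_ENABLE) != 0;
    info.base    = msr & APIC_BASE_FIELD_MASK;
    return info;
}

/* Returns the MSR value with only the APIC Base field replaced. */
uint64_t relocate_apic_base(uint64_t msr, uint64_t new_base)
{
    return (msr & ~APIC_BASE_FIELD_MASK) | (new_base & APIC_BASE_FIELD_MASK);
}
```

Masking `new_base` with the field mask drops its low 12 bits, which matches the automatic 4-KByte alignment described above.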

7.5.9. Interrupt Destination and APIC ID

The destination of an interrupt can be one, all, or a subset of the processors in the system. The sender specifies the destination of an interrupt in one of two destination modes: physical or logical.

7.5.9.1. PHYSICAL DESTINATION MODE

In physical destination mode, the destination processor is specified by its local APIC ID. This ID is matched against the local APIC’s actual physical ID, which is stored in the local APIC ID register (refer to Figure 7-5). Either a single destination (the ID is 0 through 14) or a broadcast to all (the ID is 15) can be specified in physical destination mode. Note that in this mode, up to 15 local APICs can be individually addressed. An ID of all 1s denotes a broadcast to all local APICs. The APIC ID register is loaded at power up by sampling configuration data that is driven onto pins of the processor. For the P6 family processors, pins A11# and A12# and pins BR0# through BR3# are sampled; for the Pentium® processor, pins BE0# through BE3# are sampled. The ID portion can be read and modified by software.
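The physical-mode match rule above (IDs 0 through 14 select one local APIC, ID 15 selects all) can be sketched in C. The function names are hypothetical, and the sketch assumes the 4-bit APIC ID occupies bits 24 through 27 of the local APIC ID register, as Figure 7-5 indicates.

```c
#include <stdbool.h>
#include <stdint.h>

#define APIC_ID_BROADCAST 0xFu   /* an ID of all 1s addresses every local APIC */

/* Extracts the 4-bit APIC ID from bits 24-27 of the local APIC ID register. */
uint8_t apic_id_from_register(uint32_t id_register)
{
    return (uint8_t)((id_register >> 24) & 0xFu);
}

/* Returns true when a message sent to dest_id is accepted by this APIC. */
bool physical_dest_matches(uint8_t local_apic_id, uint8_t dest_id)
{
    dest_id &= 0xFu;
    local_apic_id &= 0xFu;
    return dest_id == APIC_ID_BROADCAST || dest_id == local_apic_id;
}
```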

7.5.9.2. LOGICAL DESTINATION MODE

In logical destination mode, message destinations are specified using an 8-bit message destination address (MDA). The MDA is compared against the 8-bit logical APIC ID field of the APIC logical destination register (LDR), refer to Figure 7-6.

The destination format register (DFR) defines the interpretation of the logical destination information (refer to Figure 7-7). The DFR register can be programmed for flat model or cluster model interrupt delivery modes.

Figure 7-5. Local APIC ID Register

[Register layout: bits 31-28 reserved; bits 27-24 APIC ID; bits 23-0 reserved. Address: 0FEE0 0020H. Value after reset: 0000 0000H.]

7.5.9.3. FLAT MODEL

For the flat model, bits 28 through 31 of the DFR must be programmed to 1111. The MDA is interpreted as a decoded address. This scheme allows the specification of arbitrary groups of local APICs simply by setting each APIC’s bit to 1 in the corresponding LDR. In the flat model, up to 8 local APICs can coexist in the system. Broadcast to all APICs is achieved by setting all 8 bits of the MDA to ones.
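Because the MDA is a decoded (one bit per APIC) address in the flat model, the match reduces to a bitwise AND. A minimal C sketch follows; the function names are hypothetical, and the logical APIC ID is assumed to occupy bits 24 through 31 of the LDR, as Figure 7-6 indicates.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extracts the 8-bit logical APIC ID from bits 24-31 of the LDR. */
uint8_t logical_id_from_ldr(uint32_t ldr)
{
    return (uint8_t)(ldr >> 24);
}

/* Flat model: the APIC is selected when any bit set in the MDA is also
 * set in its logical APIC ID. An MDA of FFH therefore selects every APIC
 * that has a bit assigned. */
bool flat_model_matches(uint8_t logical_id, uint8_t mda)
{
    return (logical_id & mda) != 0;
}
```

For example, an MDA of 05H addresses the two local APICs whose logical IDs are 01H and 04H, and no others.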

7.5.9.4. CLUSTER MODEL

For the cluster model, the DFR bits 28 through 31 should be programmed to 0000. In this model, there are two basic connection schemes: flat cluster and hierarchical cluster.

Figure 7-6. Logical Destination Register (LDR)

[Register layout: bits 31-24 Logical APIC ID; bits 23-0 reserved. Address: 0FEE0 00D0H. Value after reset: 0000 0000H.]

Figure 7-7. Destination Format Register (DFR)

[Register layout: bits 31-28 Model; bits 27-0 reserved (all 1s). Address: 0FEE0 00E0H. Value after reset: FFFF FFFFH.]

In the flat cluster connection model, all clusters are assumed to be connected on a single APIC bus. Bits 28 through 31 of the MDA contain the encoded address of the destination cluster. These bits are compared with bits 28 through 31 of the LDR to determine if the local APIC is part of the cluster. Bits 24 through 27 of the MDA are compared with bits 24 through 27 of the LDR to identify individual local APIC units within the cluster. Arbitrary sets of processors within a cluster can be specified by writing the target cluster address in bits 28 through 31 of the MDA and setting selected bits in bits 24 through 27 of the MDA, corresponding to the chosen members of the cluster. In this mode, 15 clusters (with cluster addresses of 0 through 14) each having 4 processors can be specified in the message. The APIC arbitration ID, however, supports only 15 agents, and hence the total number of processors supported in this mode is limited to 15.

Broadcast to all local APICs is achieved by setting all destination bits to one. This guarantees a match on all clusters, and selects all APICs in each cluster.
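The flat cluster comparison can be sketched in C as follows. The function name `cluster_model_matches` is hypothetical. The sketch represents both the MDA and the logical APIC ID as 8-bit values whose high nibble corresponds to bits 28 through 31 (cluster address) and whose low nibble corresponds to bits 24 through 27 (member bits). Treating an MDA of FFH as an unconditional match is an assumption made to model the broadcast case; the comparison hardware is not described at that level here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flat cluster model match.
 *   high nibble: encoded cluster address (0 through 14)
 *   low nibble : one bit per member of the cluster
 */
bool cluster_model_matches(uint8_t logical_id, uint8_t mda)
{
    if (mda == 0xFF)                        /* all destination bits set: broadcast */
        return true;

    if ((mda >> 4) != (logical_id >> 4))    /* different cluster */
        return false;

    return ((mda & logical_id) & 0x0F) != 0;  /* selected member of this cluster */
}
```

For example, an MDA of 23H addresses members 0 and 1 of cluster 2, so a local APIC with logical ID 21H accepts the message and one with logical ID 24H does not.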

In the hierarchical cluster connection model, an arbitrary hierarchical network can be created by connecting different flat clusters via independent APIC buses. This scheme requires a cluster manager within each cluster, responsible for handling message passing between APIC buses. One cluster contains up to 4 agents. Thus 15 cluster managers, each with 4 agents, can form a network of up to 60 APIC agents. Note that hierarchical APIC networks require a special cluster manager device, which is not part of the local or the I/O APIC units.

7.5.9.5. ARBITRATION PRIORITY

Each local APIC is given an arbitration priority from 0 to 15 upon reset. The I/O APIC uses this priority during arbitration rounds to determine which local APIC should be allowed to transmit a message on the APIC bus when multiple local APICs are issuing messages. The local APIC with the highest arbitration priority wins access to the APIC bus. Upon completion of an arbitration round, the winning local APIC lowers its arbitration priority to 0 and the losing local APICs each raise theirs by 1. In this manner, the I/O APIC distributes message bus-cycles among the contesting local APICs.

The current arbitration priority for a local APIC is stored in a 4-bit, software-transparent arbitration ID (Arb ID) register. During reset, this register is initialized to the APIC ID number (stored in the local APIC ID register). The INIT-deassert command resynchronizes the arbitration priorities of the local APICs by resetting the Arb ID register of each agent to its current APIC ID value.

7.5.10. Interrupt Distribution Mechanisms

The APIC supports two mechanisms for selecting the destination processor for an interrupt: static and dynamic. Static distribution is used to access a specific processor in the network. Using this mechanism, the interrupt is unconditionally delivered to all local APICs that match the destination information supplied with the interrupt. The following delivery modes fall into the static distribution category: fixed, SMI, NMI, EXTINT, and start-up.

Dynamic distribution assigns incoming interrupts to the lowest priority processor, which is generally the least busy processor. It can be programmed in the LVT for local interrupt delivery or the ICR for bus messages. Using dynamic distribution, only the “lowest priority” delivery mode is allowed. From all processors listed in the destination, the processor selected is the one whose current arbitration priority is the lowest. The latter is specified in the arbitration priority register (APR); refer to Section 7.5.13.4., “Arbitration Priority Register (APR)”. If more than one processor shares the lowest priority, the processor with the highest arbitration priority (the unique value in the Arb ID register) is selected.

In lowest priority mode, if a focus processor exists, it may accept the interrupt, regardless of its priority. A processor is said to be the focus of an interrupt if it is currently servicing that interrupt or if it has a pending request for that interrupt.
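The selection rule for dynamic distribution can be sketched as follows. The `Apic` record and `select_lowest_priority` function are hypothetical names introduced for illustration, and the sketch assumes that a focus processor, when present, always takes the interrupt (the text above says only that it may).

```python
from dataclasses import dataclass


@dataclass
class Apic:
    apic_id: int
    apr: int               # value of the arbitration priority register (8 bits)
    arb_id: int            # unique value of the Arb ID register (4 bits)
    is_focus: bool = False  # servicing, or holding a pending request for, this interrupt


def select_lowest_priority(destinations):
    """Pick the APIC that receives a lowest-priority interrupt."""
    # Assumption: a focus processor accepts the interrupt regardless of priority.
    focus = [apic for apic in destinations if apic.is_focus]
    if focus:
        return focus[0]
    lowest = min(apic.apr for apic in destinations)
    tied = [apic for apic in destinations if apic.apr == lowest]
    # Ties are broken in favor of the highest arbitration priority (Arb ID).
    return max(tied, key=lambda apic: apic.arb_id)
```

With three candidates whose APR values are 30H, 10H, and 10H, the two at 10H tie and the one with the larger Arb ID is chosen.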

7.5.11. Local Vector Table

The local APIC contains a local vector table (LVT), specifying interrupt delivery and status information for the local interrupts. The information contained in this table includes the interrupt’s associated vector, delivery mode, status bits and other data as shown in Figure 7-8. The LVT incorporates five 32-bit entries: one for the timer, one each for the two local interrupt (LINT0 and LINT1) pins, one for the error interrupt, and (in the P6 family processors) one for the performance-monitoring counter interrupt.

The fields in the LVT are as follows:

Vector Interrupt vector number.

Delivery Mode Defined only for local interrupt entries 1 and 2 and the performance-monitoring counter. The timer and the error status register (ESR) generate only edge triggered maskable hardware interrupts to the local processor. The delivery mode field does not exist for the timer and error interrupts. The performance-monitoring counter LVT may be programmed with a Delivery Mode equal to Fixed or NMI only. Note that certain delivery modes will only operate as intended when used in conjunction with a specific Trigger Mode. The allowable delivery modes are as follows:

000 (Fixed) Delivers the interrupt, received on the local interrupt pin, to this processor as specified in the corresponding LVT entry. The trigger mode can be edge or level. Note, if the processor is not used in conjunction with an I/O APIC, the fixed delivery mode may be software programmed for an edge-triggered interrupt, but the P6 family processors' implementation will always operate in a level-triggered mode.

100 (NMI) Delivers the interrupt, received on the local interrupt pin, to this processor as an NMI interrupt. The vector information is ignored. The NMI interrupt is treated as edge-triggered, even if programmed otherwise. Note that the NMI may be masked. It is the software's responsibility to program the LVT mask bit according to the desired behavior of NMI.

111 (ExtINT) Delivers the interrupt, received on the local interrupt pin, to this processor and responds as if the interrupt originated in an externally connected (8259A-compatible) interrupt controller. A special INTA bus cycle corresponding to ExtINT is routed to the external controller. The latter is expected to supply the vector information. When the delivery mode is ExtINT, the trigger mode is level-triggered, regardless of how the APIC triggering mode is programmed. The APIC architecture supports only one ExtINT source in a system, usually contained in the compatibility bridge.

Figure 7-8. Local Vector Table (LVT)
Entries and addresses: Timer, FEE0 0320H; PCINT, FEE0 0340H; LINT0, FEE0 0350H; LINT1, FEE0 0360H; ERROR, FEE0 0370H. Value after reset: 0001 0000H.
Fields: Vector (bits 7 through 0); Delivery Mode (bits 10 through 8: 000 Fixed, 100 NMI, 111 ExtINT, all other combinations are reserved); Delivery Status (bit 12: 0 Idle, 1 Send Pending); Interrupt Input Pin Polarity (bit 13); Remote IRR (bit 14); Trigger Mode (bit 15: 0 Edge, 1 Level); Mask (bit 16: 0 Not Masked, 1 Masked); Timer Mode (bit 17: 0 One-shot, 1 Periodic). All other bits are reserved.

Delivery Status (read only)
Holds the current status of interrupt delivery. Two states are defined:

0 (Idle) There is currently no activity for this interrupt, or the previous interrupt from this source has completed.

1 (Send Pending) Indicates that the interrupt transmission has started, but has not yet been completely accepted.

Interrupt Input Pin Polarity
Specifies the polarity of the corresponding interrupt pin: (0) active high or (1) active low.

Remote Interrupt Request Register (IRR) Bit
Used for level triggered interrupts only; its meaning is undefined for edge triggered interrupts. For level triggered interrupts, the bit is set when the logic of the local APIC accepts the interrupt. The remote IRR bit is reset when an EOI command is received from the processor.

Trigger Mode Selects the trigger mode for the local interrupt pins when the delivery mode is Fixed: (0) edge sensitive and (1) level sensitive. When the delivery mode is NMI, the trigger mode is always level sensitive; when the delivery mode is ExtINT, the trigger mode is always level sensitive. The timer and error interrupts are always treated as edge sensitive.

Mask Interrupt mask: (0) enables reception of the interrupt and (1) inhibits reception of the interrupt.

Timer Mode Selects the timer mode: (0) one-shot and (1) periodic (refer to Section 7.5.18., “Timer”).
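As an illustration of how the software-writable LVT fields pack into a 32-bit entry, the following sketch composes an entry value. The function name `lvt_entry` and its parameters are hypothetical; the bit positions are those of Figure 7-8 (vector in bits 7 through 0, delivery mode in bits 10 through 8, polarity in bit 13, trigger mode in bit 15, mask in bit 16, timer mode in bit 17). It does not check which fields exist for which entry.

```python
FIXED, NMI, EXTINT = 0b000, 0b100, 0b111  # delivery modes allowed in the LVT


def lvt_entry(vector, delivery_mode=FIXED, active_low=False,
              level=False, masked=False, periodic=False):
    """Compose a 32-bit LVT entry value from its software-writable fields."""
    if not 0 <= vector <= 0xFF:
        raise ValueError("vector must fit in 8 bits")
    if delivery_mode not in (FIXED, NMI, EXTINT):
        raise ValueError("reserved delivery mode")
    return (vector
            | delivery_mode << 8
            | int(active_low) << 13   # interrupt input pin polarity
            | int(level) << 15        # trigger mode
            | int(masked) << 16       # mask
            | int(periodic) << 17)    # timer mode (timer entry only)
```

A masked entry with vector 30H gives 0001 0030H, and a periodic timer entry with vector 40H gives 0002 0040H.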

7.5.12. Interprocessor and Self-Interrupts

A processor generates interprocessor interrupts by writing into the interrupt command register (ICR) of its local APIC (refer to Figure 7-9). The processor may use the ICR for self interrupts or for interrupting other processors (for example, to forward device interrupts originally accepted by it to other processors for service). In addition, special inter-processor interrupts (IPI), such as the start-up IPI message, can only be delivered using the ICR mechanism. ICR-based interrupts are treated as edge triggered even if programmed otherwise. Note that not all combinations of options for ICR generated interrupts are valid (refer to Table 7-2).

Figure 7-9. Interrupt Command Register (ICR)
Addresses: FEE0 0300H (bits 31 through 0) and FEE0 0310H (bits 63 through 32). Value after reset: 0H.
Fields: Vector (bits 7 through 0); Delivery Mode (bits 10 through 8: 000 Fixed, 001 Lowest Priority, 010 SMI, 011 Reserved, 100 NMI, 101 INIT, 110 Start Up, 111 Reserved); Destination Mode (bit 11: 0 Physical, 1 Logical); Delivery Status (bit 12: 0 Idle, 1 Send Pending); Level (bit 14: 0 De-assert, 1 Assert); Trigger Mode (bit 15: 0 Edge, 1 Level); Destination Shorthand (bits 19 through 18: 00 Dest. Field, 01 Self, 10 All Incl. Self, 11 All Excl. Self); Destination Field (bits 63 through 56). All other bits are reserved.

All fields of the ICR are read-write by software with the exception of the delivery status field, which is read-only. Writing to the 32-bit word that contains the interrupt vector causes the interrupt message to be sent. The ICR consists of the following fields.

Vector The vector identifying the interrupt being sent. The local APIC register addresses are summarized in Table 7-1.

Delivery Mode Specifies how the APICs listed in the destination field should act upon reception of the interrupt. Note that all interprocessor interrupts behave as edge triggered interrupts (except for the INIT level de-assert message) even if they are programmed as level triggered interrupts.

000 (Fixed) Deliver the interrupt to all processors listed in the destination field according to the information provided in the ICR. The fixed interrupt is treated as an edge-triggered interrupt even if programmed otherwise.

001 (Lowest Priority) Same as fixed mode, except that the interrupt is delivered to the processor executing at the lowest priority among the set of processors listed in the destination.

010 (SMI) Only the edge trigger mode is allowed. The vector field must be programmed to 00B.

011 (Reserved)

100 (NMI) Delivers the interrupt as an NMI interrupt to all processors listed in the destination field. The vector information is ignored. NMI is treated as an edge triggered interrupt even if programmed otherwise.

101 (INIT) Delivers the interrupt as an INIT signal to all processors listed in the destination field. As a result, all addressed APICs will assume their INIT state. As in the case of NMI, the vector information is ignored, and INIT is treated as an edge triggered interrupt even if programmed otherwise.

101 (INIT Level De-assert) (The trigger mode must also be set to 1 and level mode to 0.) Sends a synchronization message to all APIC agents to set their arbitration IDs to the values of their APIC IDs. Note that the INIT interrupt is sent to all agents, regardless of the destination field value. However, at least one valid destination processor should be specified. For future compatibility, the software is requested to use a broadcast-to-all (“all-incl-self” shorthand, as described below).

110 (Start-Up) Sends a special message between processors in a multiple-processor system. For details refer to the Pentium® Pro Family Developer’s Manual, Volume 1. The Vector information contains the start-up address for the multiple-processor boot-up protocol. Start-up is treated as an edge triggered interrupt even if programmed otherwise. Note that interrupts are not automatically retried by the source APIC upon failure in delivery of the message. It is up to the software to decide whether a retry is needed in the case of failure, and issue a retry message accordingly.

Destination Mode Selects either (0) physical or (1) logical destination mode.

Delivery Status Indicates the delivery status:

0 (Idle) There is currently no activity for this interrupt, or the previous interrupt from this source has completed.

1 (Send Pending) Indicates that the interrupt transmission has started, but has not yet been completely accepted.

Level For INIT level de-assert delivery mode the level is 0. For all other modes the level is 1.

Trigger Mode Used for the INIT level de-assert delivery mode only.

Destination Shorthand
Indicates whether a shorthand notation is used to specify the destination of the interrupt and, if so, which shorthand is used. Destination shorthands do not use the 8-bit destination field, and can be sent by software using a single write to the lower 32-bit part of the APIC interrupt command register. Shorthands are defined for the following cases: software self interrupt, interrupt to all processors in the system including the sender, interrupts to all processors in the system excluding the sender.

00: (destination field, no shorthand) The destination is specified in bits 56 through 63 of the ICR.

01: (self) The current APIC is the single destination of the interrupt. This is useful for software self interrupts. The destination field is ignored. Refer to Table 7-2 for description of supported modes. Note that self interrupts do not generate bus messages.

10: (all including self) The interrupt is sent to all processors in the system including the processor sending the interrupt. The APIC will broadcast a message with the destination field set to FH. Refer to Table 7-2 for description of supported modes.

11: (all excluding self) The interrupt is sent to all processors in the system with the exception of the processor sending the interrupt. The APIC will broadcast a message using the physical destination mode and destination field set to FH.

Destination This field is only used when the destination shorthand field is set to “dest field”. If the destination mode is physical, then bits 56 through 59 contain the APIC ID. In logical destination mode, the interpretation of the 8-bit destination field depends on the DFR and LDR of the local APIC units.
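The ICR field layout can be sketched as a pair of 32-bit words. The function name `icr_words` and its parameter names are hypothetical, and the bit positions are those of Figure 7-9. The function returns the high word first because writing the word that contains the vector (the low word) is what sends the message, so software that uses the destination field would normally write the high word before the low word.

```python
def icr_words(vector, delivery_mode=0b000, logical=False, assert_level=True,
              level_trigger=False, shorthand=0b00, destination=0):
    """Return (high, low) 32-bit ICR values for an interprocessor interrupt."""
    low = ((vector & 0xFF)
           | (delivery_mode & 0b111) << 8   # delivery mode
           | int(logical) << 11             # destination mode
           | int(assert_level) << 14        # level: 1 = assert, 0 = de-assert
           | int(level_trigger) << 15       # trigger mode
           | (shorthand & 0b11) << 18)      # destination shorthand
    high = (destination & 0xFF) << 24       # bits 63 through 56 of the ICR
    return high, low
```

For example, a fixed interrupt with vector 40H to physical APIC ID 3 yields a high word of 0300 0000H and a low word of 0000 4040H. An INIT level de-assert with the all-including-self shorthand yields a low word of 0008 8500H.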

Table 7-2 shows the valid combinations for the fields in the interrupt command register.

Table 7-2. Valid Combinations for the APIC Interrupt Command Register

Trigger Mode  Destination Mode     Delivery Mode                                     Valid/Invalid     Destination Shorthand
Edge          Physical or Logical  Fixed, Lowest Priority, NMI, SMI, INIT, Start-Up  Valid             Dest. Field
Level         Physical or Logical  Fixed, Lowest Priority, NMI                       Note 1            Dest. Field
Level         Physical or Logical  INIT                                              Note 2            Dest. Field
Level         x (Note 4)           SMI, Start-Up                                     Invalid (Note 3)  x
Edge          x                    Fixed                                             Valid             Self
Level         x                    Fixed                                             Note 1            Self
x             x                    Lowest Priority, NMI, INIT, SMI, Start-Up         Invalid (Note 3)  Self
Edge          x                    Fixed                                             Valid             All inc Self
Level         x                    Fixed                                             Note 1            All inc Self
x             x                    Lowest Priority, NMI, INIT, SMI, Start-Up         Invalid (Note 3)  All inc Self
Edge          x                    Fixed, Lowest Priority, NMI, INIT, SMI, Start-Up  Valid             All excl Self
Level         x                    Fixed, Lowest Priority, NMI                       Note 1            All excl Self
Level         x                    SMI, Start-Up                                     Invalid (Note 3)  All excl Self
Level         x                    INIT                                              Note 2            All excl Self

NOTES:

1. Valid. Treated as edge triggered if Level = 1 (assert), otherwise ignored.

2. Valid. Treated as edge triggered when Level = 1 (assert); when Level = 0 (deassert), treated as “INIT Level Deassert” message. Only INIT level deassert messages are allowed to have level = deassert. For all other messages the level must be “assert.”

3. Invalid. The behavior of the APIC is undefined.

4. x: Don’t care.
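The rows of Table 7-2 can be expressed as a small lookup, which software might use to reject an invalid ICR combination before writing it. The function name `icr_combination` and the string labels are hypothetical; the return values "note1" and "note2" refer to the table notes. The destination mode is omitted because it never affects validity in the table.

```python
def icr_combination(trigger, delivery, shorthand):
    """Classify an ICR combination according to Table 7-2.

    trigger:   "edge" or "level"
    delivery:  "fixed", "lowest", "nmi", "smi", "init" or "startup"
    shorthand: "dest", "self", "all_incl" or "all_excl"
    Returns "valid", "note1", "note2" or "invalid".
    """
    if shorthand in ("self", "all_incl"):
        # Only the fixed delivery mode is defined for these two shorthands.
        if delivery != "fixed":
            return "invalid"
        return "valid" if trigger == "edge" else "note1"
    # The destination-field and all-excluding-self rows are identical.
    if trigger == "edge":
        return "valid"
    if delivery in ("fixed", "lowest", "nmi"):
        return "note1"
    if delivery == "init":
        return "note2"
    return "invalid"  # level-triggered SMI or start-up
```

For example, an edge-triggered start-up IPI sent through the destination field is valid, while an NMI sent with the self shorthand is not.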

7.5.13. Interrupt Acceptance

Three 256-bit read-only registers (the IRR, ISR, and TMR registers) are involved in the interrupt acceptance logic (refer to Figure 7-10). The 256 bits represent the 256 possible vectors. Because vectors 0 through 15 are reserved, so are bits 0 through 15 in these registers. The functions of the three registers are as follows:

TMR (trigger mode register)
Upon acceptance of an interrupt, the corresponding TMR bit is cleared for edge triggered interrupts and set for level interrupts. If the TMR bit is set, the local APIC sends an EOI message to all I/O APICs as a result of software issuing an EOI command (refer to Section 7.5.13.6., “End-Of-Interrupt (EOI)” for a description of the EOI register).

IRR (interrupt request register)
Contains the active interrupt requests that have been accepted, but not yet dispensed by the current local APIC. A bit in IRR is set when the APIC accepts the interrupt. The IRR bit is cleared, and a corresponding ISR bit is set when the INTA cycle is issued.

ISR (in-service register)
Marks the interrupts that have been delivered to the processor, but have not been fully serviced yet, as an EOI has not yet been received from the processor. The ISR reflects the current state of the processor interrupt queue. The ISR bit for the highest priority IRR is set during the INTA cycle. During the EOI cycle, the highest priority ISR bit is cleared, and if the corresponding TMR bit was set, an EOI message is sent to all I/O APICs.
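The interplay of the three registers can be sketched with a small model. The class `AcceptanceRegisters` and its method names are hypothetical. The sketch assumes that a higher vector number means a higher priority, and it ignores the priority threshold checks that decide whether an interrupt may be dispensed at all.

```python
class AcceptanceRegisters:
    """Toy model of the IRR, ISR and TMR, each held as a 256-bit integer."""

    def __init__(self):
        self.irr = self.isr = self.tmr = 0

    def accept(self, vector, level_triggered):
        """The APIC accepts an interrupt: set IRR, record trigger mode in TMR."""
        self.irr |= 1 << vector
        if level_triggered:
            self.tmr |= 1 << vector
        else:
            self.tmr &= ~(1 << vector)

    def inta(self):
        """INTA cycle: move the highest-priority request from IRR to ISR."""
        if self.irr == 0:
            return None
        vector = self.irr.bit_length() - 1  # assumption: highest vector wins
        self.irr &= ~(1 << vector)
        self.isr |= 1 << vector
        return vector

    def eoi(self):
        """EOI: clear the highest ISR bit; True means an EOI message is sent."""
        if self.isr == 0:
            return False
        vector = self.isr.bit_length() - 1
        self.isr &= ~(1 << vector)
        return bool((self.tmr >> vector) & 1)
```

Accepting an edge triggered vector 31H and a level triggered vector 52H, the first INTA dispenses 52H, and its EOI reports that an EOI message would go to the I/O APICs because the TMR bit was set.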

7.5.13.1. INTERRUPT ACCEPTANCE DECISION FLOW CHART

The process that the APIC uses to accept an interrupt is shown in the flow chart in Figure 7-11. The response of the local APIC to the start-up IPI is explained in the Pentium® Pro Family Developer’s Manual, Volume 1.

Figure 7-10. IRR, ISR and TMR Registers
Each register: bits 255 through 16 hold the IRR, ISR, or TMR bits; bits 15 through 0 are reserved.
Addresses: IRR FEE0 0200H - FEE0 0270H; ISR FEE0 0100H - FEE0 0170H; TMR FEE0 0180H - FEE0 01F0H.
Value after reset: 0H.

7.5.13.2. TASK PRIORITY REGISTER

The task priority register (TPR) provides a priority threshold mechanism for interrupting the processor (refer to Figure 7-12). Only interrupts whose priority is higher than that specified in the TPR will be serviced. Other interrupts are recorded and are serviced as soon as the TPR value is decreased enough to allow that. This enables the operating system to temporarily block specific interrupts (generally low priority) from disturbing the execution of high-priority tasks. The priority threshold mechanism is not applicable for delivery modes excluding the vector information (that is, for ExtINT, NMI, SMI, INIT, INIT-Deassert, and Start-Up delivery modes).

Figure 7-11. Interrupt Acceptance Flow Chart for the Local APIC
Decision nodes in the chart: Wait to Receive Bus Message; Belong to Destination?; Is it NMI/SMI/INIT/ExtINT?; Delivery Mode? (Fixed or Lowest Priority); Am I Focus?; Other Focus?; Is Interrupt Slot Available?; Is Status a Retry?; Arbitrate; Am I Winner?
Outcomes in the chart: Accept Message; Discard Message; Set Status to Retry.

The Task Priority is specified in the TPR. The 4 most-significant bits of the task priority correspond to the 16 interrupt priorities, while the 4 least-significant bits correspond to the sub-class priority. The TPR value is generally denoted as x:y, where x is the main priority and y provides more precision within a given priority class. When the x-value of the TPR is 15, the APIC will not accept any interrupts.

7.5.13.3. PROCESSOR PRIORITY REGISTER (PPR)

The processor priority register (PPR) is used to determine whether a pending interrupt can be dispensed to the processor. Its value is computed as follows:

    IF TPR[7:4] ≥ ISRV[7:4]
    THEN
        PPR[7:0] = TPR[7:0]
    ELSE
        PPR[7:4] = ISRV[7:4] AND PPR[3:0] = 0

Where ISRV is the vector of the highest priority ISR bit set, or zero if no ISR bit is set. The PPR format is identical to that of the TPR. The PPR address is FEE0 00A0H, and its value after reset is zero.
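The PPR rule translates directly into a few lines of code. This is a sketch for checking the arithmetic by hand; the function name `ppr` is hypothetical and the arguments are the 8-bit TPR value and the ISRV vector described above.

```python
def ppr(tpr, isrv):
    """Compute the processor priority register value from TPR and ISRV."""
    if (tpr >> 4) & 0xF >= (isrv >> 4) & 0xF:
        return tpr & 0xFF   # PPR[7:0] = TPR[7:0]
    return isrv & 0xF0      # PPR[7:4] = ISRV[7:4], PPR[3:0] = 0
```

With a TPR of 35H, an in-service vector of 20H leaves the PPR at 35H, while an in-service vector of 47H raises it to 40H.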

7.5.13.4. ARBITRATION PRIORITY REGISTER (APR)

The arbitration priority register (APR) holds the current lowest-priority value of the processor, which is used during lowest priority arbitration (refer to Section 7.5.16., “APIC Bus Arbitration Mechanism and Protocol”). The APR format is identical to that of the TPR. The APR value is computed as follows:

    IF (TPR[7:4] ≥ IRRV[7:4]) AND (TPR[7:4] > ISRV[7:4])
    THEN
        APR[7:0] = TPR[7:0]
    ELSE
        APR[7:4] = max(TPR[7:4] AND ISRV[7:4], IRRV[7:4]), APR[3:0] = 0

Here, IRRV is the interrupt vector with the highest priority IRR bit set, or zero if no IRR bit is set. The APR address is FEE0 0090H, and its value after reset is 0.
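The APR rule can be sketched the same way. The function name `apr` is hypothetical, and the sketch takes the printed formula literally: the AND inside max() is treated as a bitwise AND of the two 4-bit fields, which is an assumption about the intended meaning.

```python
def apr(tpr, isrv, irrv):
    """Compute the arbitration priority register value as printed above."""
    t = (tpr >> 4) & 0xF
    s = (isrv >> 4) & 0xF
    r = (irrv >> 4) & 0xF
    if t >= r and t > s:
        return tpr & 0xFF          # APR[7:0] = TPR[7:0]
    # Assumption: "TPR[7:4] AND ISRV[7:4]" is a bitwise AND of the 4-bit fields.
    return max(t & s, r) << 4      # APR[3:0] = 0
```

With a TPR of 35H, an in-service vector of 20H, and a pending vector of 10H, the first branch applies and the APR equals 35H.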

Figure 7-12. Task Priority Register (TPR)
Bits 31 through 8: Reserved. Bits 7 through 0: Task Priority.
Address: FEE0 0080H. Value after reset: 0H.

7.5.13.5. SPURIOUS INTERRUPT

A special situation may occur when a processor raises its task priority to be greater than or equal to the level of the interrupt for which the processor INTR signal is currently being asserted. If at the time the INTA cycle is issued, the interrupt that was to be dispensed has become masked (programmed by software), the local APIC will return a spurious-interrupt vector to the processor. Dispensing the spurious-interrupt vector does not affect the ISR, so the handler for this vector should return without an EOI.

7.5.13.6. END-OF-INTERRUPT (EOI)

During the interrupt service routine, software should indicate acceptance of lowest-priority, fixed, timer, and error interrupts by writing an arbitrary value into its local APIC end-of-interrupt (EOI) register (refer to Figure 7-13). This is an indication for the local APIC that it can issue the next interrupt, regardless of whether the current interrupt service has been terminated or not. Note that interrupts whose priority is higher than that currently in service do not wait for the EOI command corresponding to the interrupt in service.

Upon receiving end-of-interrupt, the APIC clears the highest priority bit in the ISR and selects the next highest priority interrupt for posting to the CPU. If the terminated interrupt was a level-triggered interrupt, the local APIC sends an end-of-interrupt message to all I/O APICs. Note that the EOI command is supplied for the above two interrupt delivery modes regardless of the interrupt source (that is, as a result of either the I/O APIC interrupts or those issued on local pins or using the ICR). For future compatibility, the software is requested to issue the end-of-interrupt command by writing a value of 0H into the EOI register.

Figure 7-13. EOI Register
Bits 31 through 0. Address: 0FEE0 00B0H. Value after reset: 0H.

7.5.14. Local APIC State

In P6 family processors, all local APICs are initialized in a software-disabled state after power-up. A software-disabled local APIC unit responds only to self-interrupts and to INIT, NMI, SMI, and start-up messages arriving on the APIC Bus. The operation of local APICs during the disabled state is as follows:

• For the INIT, NMI, SMI, and start-up messages, the APIC behaves normally, as if fully enabled.

• Pending interrupts in the IRR and ISR registers are held and require masking or handling by the CPU.

• A disabled local APIC does not affect the sending of APIC messages. It is software’s responsibility to avoid issuing ICR commands if no sending of interrupts is desired.

• Disabling a local APIC does not affect the message in progress. The local APIC will complete the reception/transmission of the current message and then enter the disabled state.

• A disabled local APIC automatically sets all mask bits in the LVT entries. Attempts to reset these bits in the local vector table are ignored.

• A software-disabled local APIC listens to all bus messages in order to keep its arbitration ID synchronized with the rest of the system, in the event that it is re-enabled.

For the Pentium® processor, the local APIC is enabled and disabled through a hardware mechanism. (Refer to the Pentium® Processor Data Book for a description of this mechanism.)

7.5.14.1. SPURIOUS-INTERRUPT VECTOR REGISTER

Software can enable or disable a local APIC at any time by programming bit 8 of the spurious-interrupt vector register (SVR); refer to Figure 7-14. The functions of the fields in the SVR are as follows:

Spurious Vector Released during an INTA cycle when all pending interrupts are masked or when no interrupt is pending. Bits 4 through 7 of this field are programmable by software, and bits 0 through 3 are hardwired to logical ones. Software writes to bits 0 through 3 have no effect.

APIC Enable Allows software to enable (1) or disable (0) the local APIC. To bypass the APIC completely, use the APIC_BASE_MSR in Figure 7-4.

Focus Processor Checking Determines if focus processor checking is enabled during the lowest priority delivery: (0) enabled and (1) disabled.

Figure 7-14. Spurious-Interrupt Vector Register (SVR)
Bits 31 through 10: Reserved. Bit 9: Focus Processor Checking (0: Enabled, 1: Disabled). Bit 8: APIC Enabled (0: APIC SW Disabled, 1: APIC SW Enabled). Bits 7 through 0: Spurious Vector (bits 3 through 0 read as 1 1 1 1).
Address: FEE0 00F0H. Value after reset: 0000 00FFH.
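The SVR layout can be sketched as a value builder. The function name `svr_value` and its parameters are hypothetical; the bit positions follow Figure 7-14, with the low four vector bits forced to ones to mirror the hardwired bits.

```python
def svr_value(spurious_vector_high, apic_enabled=True, focus_checking=True):
    """Compose a spurious-interrupt vector register value.

    spurious_vector_high is the programmable upper nibble (bits 7-4) of the
    spurious vector; bits 3-0 always read as ones.
    """
    if not 0 <= spurious_vector_high <= 0xF:
        raise ValueError("upper nibble must fit in 4 bits")
    value = (spurious_vector_high << 4) | 0xF
    value |= int(apic_enabled) << 8          # bit 8: 1 enables the local APIC
    value |= int(not focus_checking) << 9    # bit 9: 0 enables focus checking
    return value
```

Passing an upper nibble of FH with the APIC disabled reproduces the reset value 0000 00FFH; enabling the APIC with spurious vector 3FH gives 0000 013FH.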

7.5.14.2. LOCAL APIC INITIALIZATION

On a hardware reset, the processor and its local APIC are initialized simultaneously. For the P6 family processors, the local APIC obtains its initial physical ID from system hardware at the falling edge of the RESET# signal by sampling 6 lines on the system bus (the BR[3:0] and cluster ID[1:0] lines) and storing this value into the APIC ID register; for the Pentium® processor, four lines are sampled (BE0# through BE3#). Refer to the Pentium® Pro & Pentium II Processors Data Book and the Pentium® Processor Data Book for descriptions of this mechanism.

7.5.14.3. LOCAL APIC STATE AFTER POWER-UP RESET

The state of local APIC registers and state machines after a power-up reset are as follows:

• The following registers are all reset to 0: the IRR, ISR, TMR, ICR, LDR, and TPR registers; the holding registers; the timer initial count and timer current count registers; the remote register; and the divide configuration register.

• The DFR register is reset to all 1s.

• The LVT register entries are reset to 0 except for the mask bits, which are set to 1s.

• The local APIC version register is not affected.

• The local APIC ID and Arb ID registers are loaded from processor input pins (the Arb ID register is set to the APIC ID value for the local APIC).

• All internal state machines are reset.

• APIC is software disabled (that is, bit 8 of the SVR register is set to 0).

• The spurious-interrupt vector register is initialized to FFH.

7.5.14.4. LOCAL APIC STATE AFTER AN INIT RESET

An INIT reset of the processor can be initiated in either of two ways:

• By asserting the processor’s INIT# pin.

• By sending the processor an INIT IPI (sending an APIC bus-based interrupt with the delivery mode set to INIT).

Upon receiving an INIT via either of these two mechanisms, the processor responds by beginning the initialization process of the processor core and the local APIC. The state of the local APIC following an INIT reset is the same as it is after a power-up reset, except that the APIC ID and Arb ID registers are not affected.

7.5.14.5. LOCAL APIC STATE AFTER INIT-DEASSERT MESSAGE

An INIT-deassert message has no effect on the state of the APIC, other than to reload the arbitration ID register with the value in the APIC ID register.

7.5.15. Local APIC Version Register

The local APIC contains a hardwired version register, which software can use to identify the APIC version (refer to Figure 7-15). In addition, the version register specifies the size of the LVT used in the specific implementation. The fields in the local APIC version register are as follows:

Version The version numbers of the local APIC or an external 82489DX APIC controller:

1XH Local APIC.

0XH 82489DX.

20H through FFH Reserved.

Max LVT Entry Shows the number of the highest order LVT entry. For the P6 family processors, having 5 LVT entries, the Max LVT number is 4; for the Pentium® processor, having 4 LVT entries, the Max LVT number is 3.
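Decoding the version register can be sketched as follows. The function name `decode_version` is hypothetical; the field positions are those of Figure 7-15 (Version in bits 7 through 0, Max LVT Entry in bits 23 through 16), and the sample register values in the usage note are made up for illustration.

```python
def decode_version(reg):
    """Return (kind, version, lvt_entries) for a local APIC version register value."""
    version = reg & 0xFF
    max_lvt = (reg >> 16) & 0xFF
    if version >> 4 == 0x1:
        kind = "local APIC"   # 1XH
    elif version >> 4 == 0x0:
        kind = "82489DX"      # 0XH
    else:
        kind = "reserved"     # 20H through FFH
    # Max LVT Entry is the index of the highest entry, so the count is one more.
    return kind, version, max_lvt + 1
```

A value such as 0004 0011H would decode as a local APIC with 5 LVT entries, matching the P6 family description above.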

7.5.16. APIC Bus Arbitration Mechanism and Protocol

Because only one message can be sent at a time on the APIC bus, the I/O APIC and local APICs employ a “rotating priority” arbitration protocol to gain permission to send a message on the APIC bus. One or more APICs may start sending their messages simultaneously. At the beginning of every message, each APIC presents the type of the message it is sending and its current arbitration priority on the APIC bus. This information is used for arbitration. After each arbitration cycle (within an arbitration round), only the potential winners keep driving the bus. By the time all arbitration cycles are completed, there will be only one APIC left driving the bus. Once a winner is selected, it is granted exclusive use of the bus, and will continue driving the bus to send its actual message.

After each successfully transmitted message, all APICs increase their arbitration priority by 1. The previous winner (that is, the one that has just successfully transmitted its message) assumes a priority of 0 (lowest). An agent whose arbitration priority was 15 (highest) during arbitration, but did not send a message, adopts the previous winner’s arbitration priority, incremented by 1.

Note that the arbitration protocol described above is slightly different if one of the APICs issues a special End-Of-Interrupt (EOI). This high-priority message is granted the bus regardless of its sender’s arbitration priority, unless more than one APIC issues an EOI message simultaneously. In the latter case, the APICs sending the EOI messages arbitrate using their arbitration priorities.
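The rotating priority update can be sketched as follows. The function names `arbitrate` and `after_message` are hypothetical, the agents are identified by arbitrary labels, and the sketch models only ordinary messages (not the EOI special case). It assumes that arbitration priorities are unique, as they are when initialized from distinct APIC IDs.

```python
def arbitrate(priorities, contenders):
    """Return the contending agent with the highest arbitration priority."""
    return max(contenders, key=lambda agent: priorities[agent])


def after_message(priorities, winner):
    """Return the arbitration priorities after `winner` has sent its message."""
    winner_priority = priorities[winner]
    updated = {}
    for agent, priority in priorities.items():
        if agent == winner:
            updated[agent] = 0                    # winner drops to the lowest priority
        elif priority == 15:
            updated[agent] = winner_priority + 1  # wraps to just above the winner's old slot
        else:
            updated[agent] = priority + 1         # everyone else moves up by 1
    return updated
```

With priorities of 15, 7, and 3 for agents a, b, and c, and only b and c contending, b wins; afterwards the priorities are 8, 0, and 4, so the values remain unique.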

Figure 7-15. Local APIC Version Register
Bits 31 through 24: Reserved. Bits 23 through 16: Max. LVT Entry. Bits 15 through 8: Reserved. Bits 7 through 0: Version.
Address: FEE0 0030H. Value after reset: 000N 00VVH (V = Version, N = # of LVT entries).

If the APICs are set up to use “lowest priority” arbitration (refer to Section 7.5.10., “Interrupt Distribution Mechanisms”) and multiple APICs are currently executing at the lowest priority (the value in the APR register), the arbitration priorities (unique values in the Arb ID register) are used to break ties. All 8 bits of the APR are used for the lowest priority arbitration.

7.5.16.1. BUS MESSAGE FORMATS

The APICs use three types of messages: EOI message, short message, and non-focused lowest priority message. The purpose of each type of message and its format are described below.

EOI Message. Local APICs send 14-cycle EOI messages to the I/O APIC to indicate that a level triggered interrupt has been accepted by the processor. This interrupt, in turn, is a result of software writing into the EOI register of the local APIC. Table 7-3 shows the cycles in an EOI message.

The checksum is computed for cycles 6 through 9. It is a cumulative sum of the 2-bit (Bit1:Bit0) logical data values. The carry out of all but the last addition is added to the sum. If any APIC computes a different checksum than the one appearing on the bus in cycle 10, it signals an error, driving 11 on the APIC bus during cycle 12. In this case, the APICs disregard the message. The sending APIC will receive an appropriate error indication (refer to Section 7.5.17., “Error Handling”) and resend the message. The status cycles are defined in Table 7-6.
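The checksum rule can be read as a 2-bit sum in which each carry is folded back into the sum, except the carry out of the final addition, which is dropped. The sketch below follows that reading; it is one interpretation of the text, not a verified description of the hardware, and the function names are hypothetical.

```python
def apic_checksum(values):
    """2-bit cumulative sum; the carry out of every addition except the
    last one is added back into the sum."""
    total = 0
    last = len(values) - 1
    for index, value in enumerate(values):
        total += value & 0b11
        carry = total >> 2
        total &= 0b11
        if index != last:
            total += carry  # fold the carry back in (never overflows 2 bits)
    return total


def eoi_data_cycles(vector):
    """Return the 2-bit values sent in cycles 6 through 9 (V7:V6 .. V1:V0)."""
    return [(vector >> shift) & 0b11 for shift in (6, 4, 2, 0)]


checksum_for_40h = apic_checksum(eoi_data_cycles(0x40))
```

A receiver would run the same computation over the cycles it observed and compare the result with the value on the bus in the checksum cycle.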

Short Message. Short messages (21-cycles) are used for sending fixed, NMI, SMI, INIT, start-up, ExtINT and lowest-priority-with-focus interrupts. Table 7-4 shows the cycles in a short message.

Table 7-3. EOI Message (14 Cycles)

Cycle  Bit1    Bit0  Description
1      1       1     11 = EOI
2      ArbID3  0     Arbitration ID bits 3 through 0
3      ArbID2  0
4      ArbID1  0
5      ArbID0  0
6      V7      V6    Interrupt vector V7 - V0
7      V5      V4
8      V3      V2
9      V1      V0
10     C       C     Checksum for cycles 6 - 9
11     0       0
12     A       A     Status Cycle 0
13     A1      A1    Status Cycle 1
14     0       0     Idle


If the physical delivery mode is being used, then cycles 15 and 16 represent the APIC ID and cycles 13 and 14 are considered don’t care by the receiver. If the logical delivery mode is being used, then cycles 13 through 16 are the 8-bit logical destination field. For shorthands of “all-incl-self” and “all-excl-self,” the physical delivery mode and an arbitration priority of 15 (D0:D3 = 1111) are used. The agent sending the message is the only one required to distinguish between the two cases. It does so using internal information.

When using lowest priority delivery with an existing focus processor, the focus processor identifies itself by driving 10 during cycle 19 and accepts the interrupt. This is an indication to other APICs to terminate arbitration. If the focus processor has not been found, the short message is extended on-the-fly to the non-focused lowest-priority message. Note that except for the EOI message, messages generating a checksum or an acceptance error (refer to Section 7.5.17., “Error Handling”) terminate after cycle 21.

Table 7-4. Short Message (21 Cycles)

Cycle  Bit1    Bit0  Description
1      0       1     0 1 = normal
2      ArbID3  0     Arbitration ID bits 3 through 0
3      ArbID2  0
4      ArbID1  0
5      ArbID0  0
6      DM      M2    DM = Destination Mode
7      M1      M0    M2-M0 = Delivery mode
8      L       TM    L = Level, TM = Trigger Mode
9      V7      V6    V7-V0 = Interrupt Vector
10     V5      V4
11     V3      V2
12     V1      V0
13     D7      D6    D7-D0 = Destination
14     D5      D4
15     D3      D2
16     D1      D0
17     C       C     Checksum for cycles 6-16
18     0       0
19     A       A     Status cycle 0
20     A1      A1    Status cycle 1
21     0       0     Idle

Nonfocused Lowest Priority Message. These 34-cycle messages (refer to Table 7-5) are used in the lowest priority delivery mode when a focus processor is not present. Cycles 1 through 20 are the same as for the short message. If during the status cycle (cycle 19) the state of the (A:A) flags is 10B, a focus processor has been identified, and the short message format is used (refer to Table 7-4). If the (A:A) flags are set to 00B, lowest priority arbitration is started and the 34 cycles of the nonfocused lowest priority message are completed. For other combinations of status flags, refer to Section 7.5.16.2., “APIC Bus Status Cycles”.

Table 7-5. Nonfocused Lowest Priority Message (34 Cycles)

Cycle  Bit1    Bit0  Description
1      0       1     0 1 = normal
2      ArbID3  0     Arbitration ID bits 3 through 0
3      ArbID2  0
4      ArbID1  0
5      ArbID0  0
6      DM      M2    DM = Destination mode
7      M1      M0    M2-M0 = Delivery mode
8      L       TM    L = Level, TM = Trigger Mode
9      V7      V6    V7-V0 = Interrupt Vector
10     V5      V4
11     V3      V2
12     V1      V0
13     D7      D6    D7-D0 = Destination
14     D5      D4
15     D3      D2
16     D1      D0
17     C       C     Checksum for cycles 6-16
18     0       0
19     A       A     Status cycle 0
20     A1      A1    Status cycle 1
21     P7      0     P7 - P0 = Inverted Processor Priority
22     P6      0
23     P5      0
24     P4      0
25     P3      0
26     P2      0
27     P1      0
28     P0      0
29     ArbID3  0     Arbitration ID 3 - 0
30     ArbID2  0
31     ArbID1  0
32     ArbID0  0
33     A2      A2    Status Cycle
34     0       0     Idle


Cycles 21 through 28 are used to arbitrate for the lowest priority processor. The processors participating in the arbitration drive their inverted processor priority on the bus. Only the local APICs having free interrupt slots participate in the lowest priority arbitration. If no such APIC exists, the message will be rejected, requiring it to be tried at a later time.

Cycles 29 through 32 are also used for arbitration in case two or more processors have the same lowest priority. In the lowest priority delivery mode, all combinations of errors in cycle 33 (A2 A2) will set the “accept error” bit in the error status register (refer to Figure 7-16). Arbitration priority update is performed in cycle 20, and is not affected by errors detected in cycle 33. Only the local APIC that wins in the lowest priority arbitration drives cycle 33. An error in cycle 33 will force the sender to resend the message.
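The two arbitration phases (cycles 21 through 28, then cycles 29 through 32) can be modeled as a bit-serial contest. The sketch below rests on assumptions the text does not spell out: that the bus carries the logical OR of everything driven in a cycle, that an APIC which drove 0 while the bus shows 1 withdraws, and that the Arb ID tie-break works the same way (larger value wins). All names are hypothetical.

```python
def lowest_priority_winner(candidates):
    """Pick the APIC that accepts a nonfocused lowest-priority interrupt.

    `candidates` maps an APIC name to (processor_priority, arb_id), with an
    8-bit processor priority and a unique 4-bit arbitration ID. Only APICs
    with a free interrupt slot should be passed in.
    """
    if not candidates:
        return None  # nobody can accept: the message must be retried later

    # Cycles 21-28 carry the inverted priority, cycles 29-32 the Arb ID.
    keys = {
        name: ((~priority & 0xFF) << 4) | (arb_id & 0xF)
        for name, (priority, arb_id) in candidates.items()
    }
    remaining = set(keys)
    for bit in range(11, -1, -1):  # one bit per bus cycle, MSB first
        bus = any((keys[name] >> bit) & 1 for name in remaining)  # assumed OR
        if bus:
            # APICs that drove 0 while the bus shows 1 stop driving.
            remaining = {name for name in remaining if (keys[name] >> bit) & 1}
    (winner,) = remaining  # unique Arb IDs leave exactly one APIC
    return winner


winner = lowest_priority_winner(
    {"apic_a": (0x30, 2), "apic_b": (0x10, 5), "apic_c": (0x10, 9)}
)
```

Driving the inverted priority is what makes the numerically lowest processor priority win a contest in which the largest driven value survives.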

7.5.16.2. APIC BUS STATUS CYCLES

Certain cycles within an APIC bus message are status cycles. During these cycles the status flags (A:A) and (A1:A1) are examined. Table 7-6 shows how these status flags are interpreted, depending on the current delivery mode and existence of a focus processor.

Table 7-6. APIC Bus Status Cycles Interpretation

Delivery Mode                     A Status            A1 Status          A2 Status  Update ArbID and Cycle#  Message Length  Retry
EOI                               00: CS_OK           10: Accept         XX:        Yes, 13                  14 Cycle        No
                                  00: CS_OK           11: Retry          XX:        Yes, 13                  14 Cycle        Yes
                                  00: CS_OK           0X: Accept Error   XX:        No                       14 Cycle        Yes
                                  11: CS_Error        XX:                XX:        No                       14 Cycle        Yes
                                  10: Error           XX:                XX:        No                       14 Cycle        Yes
                                  01: Error           XX:                XX:        No                       14 Cycle        Yes
Fixed                             00: CS_OK           10: Accept         XX:        Yes, 20                  21 Cycle        No
                                  00: CS_OK           11: Retry          XX:        Yes, 20                  21 Cycle        Yes
                                  00: CS_OK           0X: Accept Error   XX:        No                       21 Cycle        Yes
                                  11: CS_Error        XX:                XX:        No                       21 Cycle        Yes
                                  10: Error           XX:                XX:        No                       21 Cycle        Yes
                                  01: Error           XX:                XX:        No                       21 Cycle        Yes
NMI, SMI, INIT, ExtINT, Start-Up  00: CS_OK           10: Accept         XX:        Yes, 20                  21 Cycle        No
                                  00: CS_OK           11: Retry          XX:        Yes, 20                  21 Cycle        Yes
                                  00: CS_OK           0X: Accept Error   XX:        No                       21 Cycle        Yes
                                  11: CS_Error        XX:                XX:        No                       21 Cycle        Yes
                                  10: Error           XX:                XX:        No                       21 Cycle        Yes
                                  01: Error           XX:                XX:        No                       21 Cycle        Yes
Lowest                            00: CS_OK, NoFocus  11: Do Lowest      10: Accept Yes, 20                  34 Cycle        No
                                  00: CS_OK, NoFocus  11: Do Lowest      11: Error  Yes, 20                  34 Cycle        Yes
                                  00: CS_OK, NoFocus  11: Do Lowest      0X: Error  Yes, 20                  34 Cycle        Yes
                                  00: CS_OK, NoFocus  10: End and Retry  XX:        Yes, 20                  34 Cycle        Yes
                                  00: CS_OK, NoFocus  0X: Error          XX:        No                       34 Cycle        Yes
                                  10: CS_OK, Focus    XX:                XX:        Yes, 20                  34 Cycle        No
                                  11: CS_Error        XX:                XX:        No                       21 Cycle        Yes
                                  01: Error           XX:                XX:        No                       21 Cycle        Yes
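For the delivery modes other than lowest priority, the rows of Table 7-6 share one pattern, which the following sketch encodes as a small decision function. The function name and the returned strings are illustrative assumptions; only the EOI, fixed, NMI, SMI, INIT, ExtINT and start-up rows are covered.

```python
def interpret_status(a, a1):
    """Interpret the A and A1 status cycles of an EOI or short message.

    `a` and `a1` are the 2-bit values (Bit1:Bit0) seen on the bus.
    Returns (meaning, retry_needed).
    """
    if a == 0b11:
        return ("checksum error", True)
    if a in (0b10, 0b01):
        return ("error", True)
    # a == 0b00: the checksum was OK, so the A1 cycle decides the outcome.
    if a1 == 0b10:
        return ("accept", False)
    if a1 == 0b11:
        return ("retry", True)
    return ("accept error", True)  # A1 = 0X


outcome = interpret_status(0b00, 0b10)
```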


7.5.17. Error Handling

The local APIC sets flags in the error status register (ESR) to record all the errors that it detects (refer to Figure 7-16). The ESR is a read/write register and is reset after being written to by the processor. A write to the ESR must be done just prior to reading the ESR to allow the register to be updated. An error interrupt is generated when one of the error bits is set. Error bits are cumulative. The ESR must be cleared by software after unmasking of the error interrupt entry in the LVT is performed (by executing back-to-back writes). If the software, however, wishes to handle errors set in the register prior to unmasking, it should write and then read the ESR prior to or immediately after the unmasking.

Figure 7-16. Error Status Register (ESR)

Address: FEE0 0280H
Value after reset: 0H

Bits 31-8  Reserved
Bit 7      Illegal Register Address
Bit 6      Received Illegal Vector
Bit 5      Send Illegal Vector
Bit 4      Reserved
Bit 3      Receive Accept Error
Bit 2      Send Accept Error
Bit 1      Receive CS Error
Bit 0      Send CS Error
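A value read from the ESR can be turned into a list of flag names with a simple bit test. This sketch assumes the bit order of Figure 7-16 runs from Send CS Error in bit 0 up to Illegal Register Address in bit 7, with bit 4 reserved; `ESR_FLAGS` and `decode_esr` are hypothetical names.

```python
ESR_FLAGS = {
    0: "Send CS Error",
    1: "Receive CS Error",
    2: "Send Accept Error",
    3: "Receive Accept Error",
    # bit 4 is reserved
    5: "Send Illegal Vector",
    6: "Receive Illegal Vector",
    7: "Illegal Register Address",
}


def decode_esr(value):
    """Return the names of the error flags set in an ESR value."""
    return [name for bit, name in sorted(ESR_FLAGS.items()) if value & (1 << bit)]


errors = decode_esr(0x05)  # hypothetical value with bits 0 and 2 set
```

Because the error bits are cumulative, a decoded list can contain several entries from unrelated bus events.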


The functions of the ESR flags are as follows:

Send CS Error           Set when the local APIC detects a check sum error for a message that was sent by it.

Receive CS Error        Set when the local APIC detects a check sum error for a message that was received by it.

Send Accept Error       Set when the local APIC detects that a message it sent was not accepted by any APIC on the bus.

Receive Accept Error    Set when the local APIC detects that the message it received was not accepted by any APIC on the bus, including itself.

Send Illegal Vector     Set when the local APIC detects an illegal vector in the message that it is sending on the bus.

Receive Illegal Vector  Set when the local APIC detects an illegal vector in the message it received, including an illegal vector code in the local vector table interrupts and self-interrupts from ICR.

Illegal Reg. Address    (P6 family processors only) Set when the processor is trying to access a register that is not implemented in the P6 family processors’ local APIC register address space; that is, within FEE00000H (the APICBase MSR) through FEE003FFH (the APICBase MSR plus 4K Bytes).

7.5.18. Timer

The local APIC unit contains a 32-bit programmable timer for use by the local processor. This timer is configured through the timer register in the local vector table (refer to Figure 7-8). The time base is derived from the processor’s bus clock, divided by a value specified in the divide configuration register (refer to Figure 7-17). After reset, the timer is initialized to zero. The timer supports one-shot and periodic modes. The timer can be configured to interrupt the local processor with an arbitrary vector.

Figure 7-17. Divide Configuration Register

Address: FEE0 03E0H
Value after reset: 0H

Bits 31-4  Reserved
Bit 3      Divide Value (high bit)
Bit 2      0
Bits 1-0   Divide Value (low bits)

Divide Value (bits 0, 1 and 3):
000: Divide by 2
001: Divide by 4
010: Divide by 8
011: Divide by 16
100: Divide by 32
101: Divide by 64
110: Divide by 128
111: Divide by 1
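The split encoding in Figure 7-17 (bit 3 as the high bit of the code, bits 1-0 as the low bits) is easy to get wrong, so a small decoder may help. This is a sketch; the function names and the 64 MHz bus clock in the usage line are hypothetical.

```python
def timer_divisor(dcr):
    """Return the divisor selected by a Divide Configuration Register value."""
    # Gather the 3-bit code: register bit 3 becomes code bit 2.
    code = ((dcr >> 1) & 0b100) | (dcr & 0b011)
    # Codes 000..110 select 2, 4, ..., 128; code 111 selects divide by 1.
    return 1 if code == 0b111 else 2 << code


def timer_ticks_per_second(bus_clock_hz, dcr):
    """Timer count rate for a given bus clock and divide configuration."""
    return bus_clock_hz / timer_divisor(dcr)


rate = timer_ticks_per_second(64_000_000, 0b0011)  # hypothetical 64 MHz bus clock
```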


The timer is started by programming its initial-count register (refer to Figure 7-18). The initial count value is copied into the current-count register and count-down is begun. After the timer reaches zero in one-shot mode, an interrupt is generated and the timer remains at its 0 value until reprogrammed. In periodic mode, the current-count register is automatically reloaded from the initial-count register when the count reaches 0 and the count-down is repeated. If during the count-down process the initial-count register is set, the counting will restart and the new value will be used. The initial-count register is read-write by software, while the current-count register is read only.
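The one-shot and periodic behaviors can be compared with a small software model. This is a behavioral sketch of the description above, not of the hardware: the class name, the `tick` method (one decrement per timer clock) and the interrupt counter are all assumptions made for illustration.

```python
class LocalApicTimerModel:
    """Toy model of the initial-count / current-count registers."""

    def __init__(self, periodic):
        self.periodic = periodic
        self.initial_count = 0
        self.current_count = 0
        self.interrupts = 0  # number of timer interrupts generated so far

    def write_initial_count(self, value):
        # Writing the initial count copies it to the current count and
        # (re)starts the count-down, even if a count-down is in progress.
        self.initial_count = value
        self.current_count = value

    def tick(self):
        if self.current_count == 0:
            return  # timer is idle until it is reprogrammed
        self.current_count -= 1
        if self.current_count == 0:
            self.interrupts += 1
            if self.periodic:
                self.current_count = self.initial_count  # automatic reload


one_shot = LocalApicTimerModel(periodic=False)
one_shot.write_initial_count(3)
periodic = LocalApicTimerModel(periodic=True)
periodic.write_initial_count(3)
for _ in range(7):
    one_shot.tick()
    periodic.tick()
```

After seven ticks the one-shot model has fired once and sits at 0, while the periodic model has fired twice and is partway through its third count-down.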

7.5.19. Software Visible Differences Between the Local APIC and the 82489DX

The following local APIC features differ in their definitions from the 82489DX features:

• When the local APIC is disabled, its internal registers are not cleared. Instead, setting the mask bits in the local vector table to disable the local APIC merely causes it to cease accepting the bus messages except for INIT, SMI, NMI, and start-up. In the 82489DX, when the local unit is disabled by resetting bit 8 of the spurious vector register, all the internal registers including the IRR, ISR and TMR are cleared and the mask bits in the local vector tables are set to logical ones. In the disabled mode, the 82489DX local unit will accept only the reset deassert message.

• In the local APIC, NMI and INIT (except for INIT deassert) are always treated as edge triggered interrupts, even if programmed otherwise. In the 82489DX these interrupts are always level triggered.

• In the local APIC, interrupts generated through ICR messages are always treated as edge triggered (except INIT Deassert). In the 82489DX, the ICR can be used to generate either edge or level triggered interrupts.

• The Logical Destination register supports 8 bits in the local APIC, whereas it supports 32 bits in the 82489DX.

• The APIC ID register is 4 bits wide for the local APIC and 8 bits wide for the 82489DX.

• The remote read delivery mode provided in the 82489DX is not supported in the Intel Architecture local APIC.

Figure 7-18. Initial Count and Current Count Registers

Initial Count register (bits 31-0)  Address: FEE0 0380H
Current Count register (bits 31-0)  Address: FEE0 0390H
Value after reset: 0H


7.5.20. Performance Related Differences between the Local APIC and the 82489DX

For the 82489DX, in the lowest priority mode, all the target local APICs specified by the destination field participate in the lowest priority arbitration. For the local APIC, only those local APICs which have free interrupt slots will participate in the lowest priority arbitration.

7.5.21. New Features Incorporated in the Pentium® and P6 Family Processors Local APIC

The local APIC in the Pentium® and P6 family processors has the following new features not found in the 82489DX.

• The local APIC supports cluster addressing in logical destination mode.

• Focus processor checking can be enabled/disabled in the local APIC.

• Interrupt input signal polarity can be programmed in the local APIC.

• The local APIC supports SMI through the ICR and I/O redirection table.

• The local APIC incorporates an error status register to log and report errors to the processor.

In the P6 family processors, the local APIC incorporates an additional local vector table entry to handle performance monitoring counter interrupts.

7.6. DUAL-PROCESSOR (DP) INITIALIZATION PROTOCOL

The Pentium® processor contains an internal dual-processing (DP) mechanism that permits two processors to be initialized and configured for tightly coupled symmetric multiprocessing (SMP). The DP initialization protocol supports the controlled booting and configuration of the two Pentium® processors. When configuration has been completed, the two Pentium® processors can share the processing load for the system and share the handling of interrupts received from the system’s I/O APIC.

The Pentium® DP initialization protocol defines two processors:

• Primary processor (also called the bootstrap processor, BSP): This processor boots itself, configures the APIC environment, and starts the second processor.

• Secondary processor (also called the dual processor, DP): This processor boots itself then waits for a startup signal from the primary processor. Upon receiving the startup signal, it completes its configuration.

Appendix C, Dual-Processor (DP) Bootup Sequence Example (Specific to Pentium® Processors) gives an example (with code) of the bootup sequence for two Pentium® processors operating in a DP configuration.


Appendix E, Programming the LINT0 and LINT1 Inputs describes (with code) how to program the LINT[0:1] pins of the processor’s local APICs after a dual-processor configuration has been completed.

7.7. MULTIPLE-PROCESSOR (MP) INITIALIZATION PROTOCOL

The Intel Architecture (beginning with the Pentium® Pro processors) defines a multiple-processor (MP) initialization protocol, for use with both single- and multiple-processor systems. (Here, multiple processors is defined as two or more processors.) The primary goals of this protocol are as follows:

• To permit sequential or controlled booting of multiple processors (from 2 to 4) with no dedicated system hardware. The initialization algorithm is not limited to 4 processors; it can support from 1 to 15 processors in a multiclustered system when the APIC busses are tied together. Larger systems are not supported.

• To be able to initiate the MP protocol without the need for a dedicated signal or BSP.

• To provide fault tolerance. No single processor is geographically designated the BSP. The BSP is determined dynamically during initialization.

The following sections describe an MP initialization protocol.

Appendix D, Multiple-Processor (MP) Bootup Sequence Example (Specific to P6 Family Processors) gives an example (with code) of the bootup sequence for two P6 family processors operating in an MP configuration.

Appendix E, Programming the LINT0 and LINT1 Inputs describes (with code) how to program the LINT[0:1] pins of the processor’s local APICs after an MP configuration has been completed.

7.7.1. MP Initialization Protocol Requirements and Restrictions

The MP protocol imposes the following requirements and restrictions on the system:

• An APIC clock (APICLK) must be provided on all systems based on the P6 family processors (excluding mobile processors and modules).

• All interrupt mechanisms must be disabled for the duration of the MP protocol algorithm, including the window of time between the assertion of INIT# or receipt of an INIT IPI by the application processors and the receipt of a STARTUP IPI by the application processors. That is, requests generated by interrupting devices must not be seen by the local APIC unit (on board the processor) until the completion of the algorithm. Failure to disable the interrupt mechanisms may result in processor shutdown.

• The MP protocol should be initiated only after a hardware reset. After completion of the protocol algorithm, a flag is set in the APIC base MSR of the BSP (APIC_BASE.BSP) to indicate that it is the BSP. This flag is cleared for all other processors. If a processor or the complete system is subject to an INIT sequence (either through the INIT# pin or an INIT IPI), then the MP protocol is not re-executed. Instead, each processor examines its BSP flag to determine whether the processor should boot or wait for a STARTUP IPI.
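The decision each processor makes after an INIT can be sketched as a single flag test. The position of the BSP flag (bit 8 of the APIC base MSR) is not stated in this section and is assumed here; the function name and the sample MSR values are hypothetical.

```python
# Assumed position of the BSP flag in the APIC base MSR (not stated above).
APIC_BASE_BSP_FLAG = 1 << 8


def action_after_init(apic_base_msr):
    """Return what a processor does after an INIT, based on its BSP flag."""
    if apic_base_msr & APIC_BASE_BSP_FLAG:
        return "boot"
    return "wait for STARTUP IPI"


bsp_action = action_after_init(0xFEE00900)  # hypothetical value, BSP flag set
ap_action = action_after_init(0xFEE00800)   # hypothetical value, BSP flag clear
```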

7.7.2. MP Protocol Nomenclature

The MP initialization protocol defines two classes of processors:

• The bootstrap processor (BSP): This primary processor is dynamically selected by the MP initialization algorithm. After the BSP has been selected, it configures the APIC environment, and starts the secondary processors, under software control.

• Application processors (APs): These secondary processors are the remainder of the processors in an MP system that were not selected as the BSP. The APs complete a minimal self-configuration, then wait for a startup signal from the BSP processor. Upon receiving a startup signal, an AP completes its configuration.

Table 7-7 describes the interrupt-style abbreviations that will be used throughout the remaining description of the MP initialization protocol. These IPIs do not define new interrupt messages. They are messages that are special only by virtue of the time that they exist (that is, before the RESET sequence is complete).

Table 7-7. Types of Boot Phase IPIs

Message Type                             Abbreviation  Description
Boot Inter-Processor Interrupt           BIPI          An APIC serial bus message that Symmetric Multiprocessing (SMP) agents use to dynamically determine a BSP after reset.
Final Boot Inter-Processor Interrupt     FIPI          An APIC serial bus message that the BSP issues before it fetches from the reset vector. This message has the lowest priority of all boot phase IPIs. When a BSP sees an FIPI that it issued, it fetches the reset vector because no other boot phase IPIs can follow an FIPI.
Startup Inter-Processor Interrupt        SIPI          Used to send a new reset vector to an application (non-BSP) processor in an MP system.

Table 7-8 describes the various fields of each boot phase IPI.

Table 7-8. Boot Phase IPI Message Format

Type  Destination Field  Destination Shorthand  Trigger Mode  Level     Destination Mode     Delivery Mode  Vector (Hex)
BIPI  Not used           All including self     Edge          Deassert  Don’t Care           Fixed (000)    40 to 4E*
FIPI  Not used           All including self     Edge          Deassert  Don’t Care           Fixed (000)    10 to 1E
SIPI  Used               All allowed            Edge          Assert    Physical or Logical  StartUp (110)  00 to FF

NOTE:
* For all P6 family processors.


For BIPI and FIPI messages, the lower 4 bits of the vector field are equal to the APIC ID of the processor issuing the message. The upper 4 bits of the vector field of a BIPI or FIPI can be thought of as the “generation ID” of the message. All processors that run symmetric to a P6 family processor will have a generation ID of 0100B or 4H. BIPIs in a system based on the P6 family processors will therefore use vector values ranging from 40H to 4EH (4FH cannot be used because FH is not a valid APIC ID).
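The vector layout described above (generation ID in the upper 4 bits, APIC ID in the lower 4 bits) can be expressed as two helper functions. The names are hypothetical, and the generation value 1H used for the FIPI example is inferred from the 10 to 1E vector range in Table 7-8 rather than stated in the text.

```python
P6_GENERATION_ID = 0x4  # 0100B, per the text


def boot_ipi_vector(generation_id, apic_id):
    """Build a BIPI/FIPI vector from a generation ID and an APIC ID."""
    if not 0 <= apic_id <= 0xE:
        raise ValueError("APIC ID must be in the range 0H to EH")
    return (generation_id << 4) | apic_id


def split_boot_ipi_vector(vector):
    """Return (generation_id, apic_id) for a BIPI/FIPI vector."""
    return vector >> 4, vector & 0xF


bipi = boot_ipi_vector(P6_GENERATION_ID, 0x3)
```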

7.7.3. Error Detection During the MP Initialization Protocol

Errors may occur on the APIC bus during the MP initialization phase. These errors may be transient or permanent and can be caused by a variety of failure mechanisms (for example, broken traces, soft errors during bus usage, etc.). All serial bus related errors will result in an APIC checksum or acceptance error.

The occurrence of an APIC error causes a processor shutdown.

7.7.4. Error Handling During the MP Initialization Protocol

The MP initialization protocol makes the following assumptions:

• If any errors are detected on the APIC bus during execution of the MP initialization protocol, all processors will shut down.

• In a system that conforms to Intel Architecture guidelines, a likely error (broken trace, check sum error during transmission) will result in no more than one processor booting.

• The MP initialization protocol will be executed by processors even if they fail their BIST sequences.

7.7.5. MP Initialization Protocol Algorithm

The MP initialization protocol uses the message passing capabilities of the processor’s local APIC to dynamically determine a bootstrap processor (BSP). The algorithm used essentially implements a “race for the flag” mechanism using the APIC bus for atomicity.

The MP initialization algorithm is based on the fact that one and only one message is allowed to exist on the APIC bus at a given time and that once the message is issued, it will complete (APIC messages are atomic). Another feature of the APIC architecture that is used in the initialization algorithm is the existence of a round-robin priority mechanism between all agents that use the APIC bus.

The MP initialization protocol algorithm performs the following operations in an SMP system (refer to Figure 7-19):

1. After completing their internal BISTs, all processors start their MP initialization protocol sequence by issuing BIPIs to “all including self” (at time t=0). The four least significant bits of the vector field of the IPI contain each processor's APIC ID. The APIC hardware observes the BNR# (block next request) pin to guarantee that the initial BIPI is not issued on the APIC bus until the BIST sequence is completed for all processors in the system.

2. When the first BIPI completes (at time t=1), the APIC hardware (in each processor)propagates an interrupt to the processor core to indicate the arrival of the BIPI.

3. The processor compares the four least significant bits of the BIPI’s vector field to the processor's APIC ID. A match indicates that the processor should be the BSP and continue the initialization sequence. If the APIC ID fails to match the BIPI's vector field, the processor is essentially the “loser” or not the BSP. The processor then becomes an application processor and should enter a “wait for SIPI” loop.

4. The winner (the BSP) issues an FIPI. The FIPI is issued to “all including self” and is guaranteed to be the last IPI on the APIC bus during the initialization sequence. This is due to the fact that the round-robin priority mechanism forces the winning APIC agent's (the BSP's) arbitration priority to 0. The FIPI is therefore issued by a priority 0 agent and has to wait until all other agents have issued their BIPIs. When the BSP receives the FIPI that it issued (t=5), it will start fetching code at the reset vector (Intel Architecture address).

5. All application processors (non-BSP processors) remain in a “halted” state and can only be woken up by SIPIs issued by another processor (note an AP in the startup IPI loop will also respond to BINIT and snoops).
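The steps above can be sketched as a short simulation of the “race for the flag”. It is a simplified model under stated assumptions: the order in which BIPIs win the bus is supplied by the caller rather than derived from arbitration, the generation ID is fixed at 4H, and the function and processor names are hypothetical.

```python
def run_mp_init(bus_order, apic_ids):
    """Simulate BSP selection.

    `bus_order` lists the processors in the order their BIPIs win the
    APIC bus; `apic_ids` maps each processor name to its APIC ID.
    Returns (roles, bus_log).
    """
    first = bus_order[0]
    first_vector = 0x40 | apic_ids[first]  # generation ID 4H + APIC ID

    roles = {}
    for cpu, apic_id in apic_ids.items():
        # Every processor compares the low 4 bits of the first BIPI it
        # sees with its own APIC ID.
        roles[cpu] = "BSP" if (first_vector & 0xF) == apic_id else "AP"

    bsp = next(cpu for cpu, role in roles.items() if role == "BSP")
    # The BSP's arbitration priority is now 0, so its FIPI goes out only
    # after every other agent has issued its BIPI.
    bus_log = [f"BIPI.{cpu}" for cpu in bus_order] + [f"FIPI.{bsp}"]
    return roles, bus_log


roles, bus_log = run_mp_init(
    ["A", "B", "C", "D"], {"A": 0, "B": 1, "C": 2, "D": 3}
)
```

With the bus order A, B, C, D, this reproduces the message sequence drawn in Figure 7-19: four BIPIs followed by a single FIPI from the winner.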

Figure 7-19. SMP System

[Figure: four P6 family processors (A, B, C and D) connected to the system (CPU) bus and to the APIC bus. Serial bus activity on the APIC bus, on a time axis from t=0 to t=5: BIPI.A, BIPI.B, BIPI.C, BIPI.D, FIPI.]


CHAPTER 8

PROCESSOR MANAGEMENT AND INITIALIZATION

This chapter describes the facilities provided for managing processor wide functions and for initializing the processor. The subjects covered include: processor initialization, FPU initialization, processor configuration, feature determination, mode switching, the MSRs (in the Pentium® and P6 family processors), and the MTRRs (in the P6 family processors).

8.1. INITIALIZATION OVERVIEW

Following power-up or an assertion of the RESET# pin, each processor on the system bus performs a hardware initialization of the processor (known as a hardware reset) and an optional built-in self-test (BIST). A hardware reset sets each processor’s registers to a known state and places the processor in real-address mode. It also invalidates the internal caches, translation lookaside buffers (TLBs) and the branch target buffer (BTB). At this point, the action taken depends on the processor family:

• P6 family processors: All the processors on the system bus (including a single processor in a uniprocessor system) execute the multiple processor (MP) initialization protocol across the APIC bus. The processor that is selected through this protocol as the bootstrap processor (BSP) then immediately starts executing software-initialization code in the current code segment beginning at the offset in the EIP register. The application (non-BSP) processors (AP) go into a halt state while the BSP is executing initialization code. Refer to Section 7.7., “Multiple-Processor (MP) Initialization Protocol” in Chapter 7, Multiple-Processor Management for more details. Note that in a uniprocessor system, the single P6 family processor automatically becomes the BSP.

• Pentium® processors: In either a single- or dual-processor system, a single Pentium® processor is always pre-designated as the primary processor. Following a reset, the primary processor behaves as follows in both single- and dual-processor systems. Using the dual-processor (DP) ready initialization protocol, the primary processor immediately starts executing software-initialization code in the current code segment beginning at the offset in the EIP register. The secondary processor (if there is one) goes into a halt state. (Refer to Section 7.6., “Dual-Processor (DP) Initialization Protocol” in Chapter 7, Multiple-Processor Management for more details.)

• Intel486™ processor: The primary processor (or single processor in a uniprocessor system) immediately starts executing software-initialization code in the current code segment beginning at the offset in the EIP register. (The Intel486™ does not automatically execute a DP or MP initialization protocol to determine which processor is the primary processor.)

The software-initialization code performs all system-specific initialization of the BSP or primary processor and the system logic.
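The address of the first instruction fetched after reset follows from the CS base and EIP values listed in Table 8-1. The arithmetic is shown below as a small sketch; the function and constant names are hypothetical.

```python
def linear_address(segment_base, offset):
    """Add a segment base and an offset, wrapping at 32 bits."""
    return (segment_base + offset) & 0xFFFFFFFF


RESET_CS_BASE = 0xFFFF0000  # CS base after reset (Table 8-1)
RESET_EIP = 0x0000FFF0      # EIP after reset (Table 8-1)

# Note that the reset CS base is not the usual real-address-mode value of
# selector * 16 (F000H * 16 = F0000H); the base is loaded directly.
first_fetch = linear_address(RESET_CS_BASE, RESET_EIP)
```

The result, FFFFFFF0H, lies 16 bytes below the top of the 4-GByte address space.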


At this point, for MP (or DP) systems, the BSP (or primary) processor wakes up each AP (or secondary) processor to enable those processors to execute self-configuration code.

When all processors are initialized, configured, and synchronized, the BSP or primary processor begins executing an initial operating-system or executive task.

The floating-point unit (FPU) is also initialized to a known state during hardware reset. FPU software initialization code can then be executed to perform operations such as setting the precision of the FPU and the exception masks. No special initialization of the FPU is required to switch operating modes.

Asserting the INIT# pin on the processor invokes a similar response to a hardware reset. The major difference is that during an INIT, the internal caches, MSRs, MTRRs, and FPU state are left unchanged (although the TLBs and BTB are invalidated as with a hardware reset). An INIT provides a method for switching from protected to real-address mode while maintaining the contents of the internal caches.

8.1.1. Processor State After Reset

Table 8-1 shows the state of the flags and other registers following power-up for the Pentium® Pro, Pentium®, and Intel486™ processors. The state of control register CR0 is 60000010H (refer to Figure 8-1), which places the processor in real-address mode with paging disabled.
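To see why 60000010H means real-address mode with paging disabled, the value can be decoded flag by flag. The sketch below covers only a subset of the CR0 flags, and the bit positions used (PE = 0, ET = 4, NW = 29, CD = 30, PG = 31) are the standard CR0 definitions rather than something this section states; the names `CR0_FLAGS` and `decode_cr0` are hypothetical.

```python
# Subset of CR0 flags and their bit positions (assumed, see lead-in).
CR0_FLAGS = {"PE": 0, "ET": 4, "NW": 29, "CD": 30, "PG": 31}


def decode_cr0(value):
    """Return the state of selected CR0 flags as booleans."""
    return {name: bool(value & (1 << bit)) for name, bit in CR0_FLAGS.items()}


reset_cr0 = decode_cr0(0x60000010)
```

PE clear corresponds to real-address mode and PG clear to paging disabled; CD and NW are both set in the reset value.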

8.1.2. Processor Built-In Self-Test (BIST)

Hardware may request that the BIST be performed at power-up. The EAX register is cleared (0H) if the processor passes the BIST. A nonzero value in the EAX register after the BIST indicates that a processor fault was detected. If the BIST is not requested, the contents of the EAX register after a hardware reset is 0H.

The overhead for performing a BIST varies between processor families. For example, the BIST takes approximately 5.5 million processor clock periods to execute on the Pentium® Pro processor. (This clock count is model-specific, and Intel reserves the right to change the exact number of periods, for any of the Intel Architecture processors, without notification.)


Table 8-1. 32-Bit Intel Architecture Processor States Following Power-up, Reset, or INIT

| Register | P6 Family Processors | Pentium® Processor | Intel486™ Processor |
|---|---|---|---|
| EFLAGS (note 1) | 00000002H | 00000002H | 00000002H |
| EIP | 0000FFF0H | 0000FFF0H | 0000FFF0H |
| CR0 | 60000010H (note 2) | 60000010H (note 2) | 60000010H (note 2) |
| CR2, CR3, CR4 | 00000000H | 00000000H | 00000000H |
| MXCSR | Pentium® III processor only. Pwr up or Reset: 1F80H; FINIT/FNINIT: Unchanged | NA | NA |
| CS | Selector = F000H; Base = FFFF0000H; Limit = FFFFH; AR = Present, R/W, Accessed | Selector = F000H; Base = FFFF0000H; Limit = FFFFH; AR = Present, R/W, Accessed | Selector = F000H; Base = FFFF0000H; Limit = FFFFH; AR = Present, R/W, Accessed |
| SS, DS, ES, FS, GS | Selector = 0000H; Base = 00000000H; Limit = FFFFH; AR = Present, R/W, Accessed | Selector = 0000H; Base = 00000000H; Limit = FFFFH; AR = Present, R/W, Accessed | Selector = 0000H; Base = 00000000H; Limit = FFFFH; AR = Present, R/W, Accessed |
| EDX | 000006xxH | 000005xxH | 000004xxH |
| EAX | 0 (note 3) | 0 (note 3) | 0 (note 3) |
| EBX, ECX, ESI, EDI, EBP, ESP | 00000000H | 00000000H | 00000000H |
| MM0 through MM7 (note 4) | Pentium® Pro processor: NA. Pentium® II and Pentium® III processors. Pwr up or Reset: 0000000000000000H; FINIT/FNINIT: Unchanged | Pwr up or Reset: 0000000000000000H; FINIT/FNINIT: Unchanged | NA |
| XMM0 through XMM7 (note 5) | Pentium® III processor only. Pwr up or Reset: 0000000000000000H; FINIT/FNINIT: Unchanged | NA | NA |
| ST0 through ST7 (note 4) | Pwr up or Reset: +0.0; FINIT/FNINIT: Unchanged | Pwr up or Reset: +0.0; FINIT/FNINIT: Unchanged | Pwr up or Reset: +0.0; FINIT/FNINIT: Unchanged |
| FPU Control Word (note 4) | Pwr up or Reset: 0040H; FINIT/FNINIT: 037FH | Pwr up or Reset: 0040H; FINIT/FNINIT: 037FH | Pwr up or Reset: 0040H; FINIT/FNINIT: 037FH |
| FPU Status Word (note 4) | Pwr up or Reset: 0000H; FINIT/FNINIT: 0000H | Pwr up or Reset: 0000H; FINIT/FNINIT: 0000H | Pwr up or Reset: 0000H; FINIT/FNINIT: 0000H |
| FPU Tag Word (note 4) | Pwr up or Reset: 5555H; FINIT/FNINIT: FFFFH | Pwr up or Reset: 5555H; FINIT/FNINIT: FFFFH | Pwr up or Reset: 5555H; FINIT/FNINIT: FFFFH |
| FPU Data Operand and CS Seg. Selectors (note 4) | Pwr up or Reset: 0000H; FINIT/FNINIT: 0000H | Pwr up or Reset: 0000H; FINIT/FNINIT: 0000H | Pwr up or Reset: 0000H; FINIT/FNINIT: 0000H |
| FPU Data Operand and Inst. Pointers (note 4) | Pwr up or Reset: 00000000H; FINIT/FNINIT: 00000000H | Pwr up or Reset: 00000000H; FINIT/FNINIT: 00000000H | Pwr up or Reset: 00000000H; FINIT/FNINIT: 00000000H |
| GDTR, IDTR | Base = 00000000H; Limit = FFFFH; AR = Present, R/W | Base = 00000000H; Limit = FFFFH; AR = Present, R/W | Base = 00000000H; Limit = FFFFH; AR = Present, R/W |
| LDTR, Task Register | Selector = 0000H; Base = 00000000H; Limit = FFFFH; AR = Present, R/W | Selector = 0000H; Base = 00000000H; Limit = FFFFH; AR = Present, R/W | Selector = 0000H; Base = 00000000H; Limit = FFFFH; AR = Present, R/W |
| DR0, DR1, DR2, DR3 | 00000000H | 00000000H | 00000000H |
| DR6 | FFFF0FF0H | FFFF0FF0H | FFFF1FF0H |
| DR7 | 00000400H | 00000400H | 00000000H |
| Time-Stamp Counter | Power up or Reset: 0H; INIT: Unchanged | Power up or Reset: 0H; INIT: Unchanged | Not Implemented |
| Perf. Counters and Event Select | Power up or Reset: 0H; INIT: Unchanged | Power up or Reset: 0H; INIT: Unchanged | Not Implemented |
| All Other MSRs | Pwr up or Reset: Undefined; INIT: Unchanged | Pwr up or Reset: Undefined; INIT: Unchanged | Not Implemented |
| Data and Code Cache, TLBs | Invalid | Invalid | Invalid |
| Fixed MTRRs | Pwr up or Reset: Disabled; INIT: Unchanged | Not Implemented | Not Implemented |
| Variable MTRRs | Pwr up or Reset: Disabled; INIT: Unchanged | Not Implemented | Not Implemented |
| Machine-Check Architecture | Pwr up or Reset: Undefined; INIT: Unchanged | Not Implemented | Not Implemented |
| APIC | Pwr up or Reset: Enabled; INIT: Unchanged | Pwr up or Reset: Enabled; INIT: Unchanged | Not Implemented |

NOTES:

1. The 10 most-significant bits of the EFLAGS register are undefined following a reset. Software should not depend on the states of any of these bits.

2. The CD and NW flags are unchanged, bit 4 is set to 1, all other bits are cleared.

3. If Built-In Self-Test (BIST) is invoked on power up or reset, EAX is 0 only if all tests passed. (BIST cannot be invoked during an INIT.)

4. The state of the FPU and MMX™ registers is not changed by the execution of an INIT.

5. Available in the Pentium® III processor and Pentium® III Xeon™ processor only. The state of the SIMD floating-point registers is not changed by the execution of an INIT.


8.1.3. Model and Stepping Information

Following a hardware reset, the EDX register contains component identification and revision information (refer to Figure 8-2). The device ID field is set to the value 6H, 5H, 4H, or 3H to indicate a Pentium® Pro, Pentium®, Intel486™, or Intel386™ processor, respectively. Different values may be returned for the various members of these Intel Architecture families. For example, the Intel386™ SX processor returns 23H in the device ID field. Binary object code can be made compatible with other Intel processors by using this number to select the correct initialization software.

Figure 8-1. Contents of CR0 Register after Reset

[Figure 8-1: CR0 bit layout with the value of each flag after reset. PG (bit 31) = 0, paging disabled; CD (bit 30) = 1, caching disabled; NW (bit 29) = 1, not write-through disabled; AM (bit 18) = 0, alignment check disabled; WP (bit 16) = 0, write-protect disabled; NE (bit 5) = 0, external FPU error reporting; ET (bit 4) = 1, not used; TS (bit 3) = 0, no task switch; EM (bit 2) = 0, FPU instructions not trapped; MP (bit 1) = 0, WAIT/FWAIT instructions not trapped; PE (bit 0) = 0, real-address mode. All other bits are reserved.]

Figure 8-2. Processor Type and Signature in the EDX Register after Reset

[Figure 8-2: EDX bit layout. Bits 31-14: reserved; bits 13-12: processor type; bits 11-8: family (0110B for the Pentium® Pro processor family); bits 7-4: model (beginning with 0001B); bits 3-0: stepping ID.]


The stepping ID field contains a unique identifier for the processor’s stepping ID or revision level. The upper word of EDX is reserved following reset.
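The signature fields can be extracted from EDX with shifts and masks. The following Python sketch assumes the field positions of Figure 8-2; the function name `decode_signature` and the sample value 00000612H are hypothetical and chosen only for illustration.

```python
# Family codes named in the text above.
FAMILY_NAMES = {6: "Pentium Pro", 5: "Pentium", 4: "Intel486", 3: "Intel386"}


def decode_signature(edx):
    """Split the EDX reset signature into its fields."""
    return {
        "type": (edx >> 12) & 0x3,     # bits 13-12
        "family": (edx >> 8) & 0xF,    # bits 11-8
        "model": (edx >> 4) & 0xF,     # bits 7-4
        "stepping": edx & 0xF,         # bits 3-0
    }


sig = decode_signature(0x00000612)  # hypothetical sample value
family_name = FAMILY_NAMES.get(sig["family"], "unknown")
```

Initialization code could use the family value in the same way to select a family-specific initialization path.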

8.1.4. First Instruction Executed

The first instruction that is fetched and executed following a hardware reset is located at physical address FFFFFFF0H. This address is 16 bytes below the processor’s uppermost physical address. The EPROM containing the software-initialization code must be located at this address.

The address FFFFFFF0H is beyond the 1-MByte addressable range of the processor while in real-address mode. The processor is initialized to this starting address as follows. The CS register has two parts: the visible segment selector part and the hidden base address part. In real-address mode, the base address is normally formed by shifting the 16-bit segment selector value 4 bits to the left to produce a 20-bit base address. However, during a hardware reset, the segment selector in the CS register is loaded with F000H and the base address is loaded with FFFF0000H. The starting address is thus formed by adding the base address to the value in the EIP register (that is, FFFF0000H + FFF0H = FFFFFFF0H).

The first time the CS register is loaded with a new value after a hardware reset, the processor will follow the normal rule for address translation in real-address mode (that is, [CS base address = CS segment selector * 16]). To ensure that the base address in the CS register remains unchanged until the EPROM-based software-initialization code is completed, the code must not contain a far jump or far call or allow an interrupt to occur (which would cause the CS selector value to be changed).
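The arithmetic behind the two addressing rules can be worked through directly. The Python sketch below is illustrative; the helper name `real_mode_base` is hypothetical.

```python
RESET_CS_BASE = 0xFFFF0000  # hidden CS base loaded by a hardware reset
RESET_EIP = 0x0000FFF0      # EIP value after reset


def real_mode_base(selector):
    """Normal real-address mode rule: base = selector * 16."""
    return selector << 4


# Address of the first instruction fetched after reset.
first_fetch = RESET_CS_BASE + RESET_EIP

# Address the same selector:offset pair would yield once CS has been
# reloaded (for example, by a far jump) and the normal rule applies.
after_reload = real_mode_base(0xF000) + RESET_EIP
```

`first_fetch` is FFFFFFF0H, while `after_reload` is 000FFFF0H. This is why a far jump, far call, or interrupt moves execution out of the top-of-memory EPROM region and into the first megabyte.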

8.2. FPU INITIALIZATION

Software-initialization code can determine whether the processor contains or is attached to an FPU by using the CPUID instruction. The code must then initialize the FPU and set flags in control register CR0 to reflect the state of the FPU environment.

A hardware reset places the Pentium® processor FPU in the state shown in Table 8-1. This state is different from the state the processor is placed in when executing an FINIT or FNINIT instruction (also shown in Table 8-1). If the FPU is to be used, the software-initialization code should execute an FINIT/FNINIT instruction following a hardware reset. These instructions tag all data registers as empty, mask all floating-point exceptions (the FPU control word is set to 037FH), set the TOP-of-stack value to 0, and select the default rounding and precision control settings (round to nearest and 64-bit precision).

If the processor is reset by asserting the INIT# pin, the FPU state is not changed.
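The difference between the two FPU control word values in Table 8-1 (0040H after reset, 037FH after FINIT/FNINIT) can be seen by decoding the fields. The Python sketch below assumes the standard x87 control word layout (exception masks in bits 0 through 5, precision control in bits 8 and 9, rounding control in bits 10 and 11); the function name is hypothetical.

```python
def decode_fpu_control_word(cw):
    """Split an x87 FPU control word into its main fields."""
    return {
        "exception_masks": cw & 0x3F,           # IM, DM, ZM, OM, UM, PM
        "precision_control": (cw >> 8) & 0x3,   # 0 = 24-bit, 2 = 53-bit, 3 = 64-bit
        "rounding_control": (cw >> 10) & 0x3,   # 0 = round to nearest
    }


after_reset = decode_fpu_control_word(0x0040)
after_finit = decode_fpu_control_word(0x037F)
```

After FINIT/FNINIT all six exception mask bits are set (3FH), precision control is 3 (64-bit), and rounding control is 0 (round to nearest). After a hardware reset the mask bits and the precision control field are all 0.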

8.2.1. Configuring the FPU Environment

Initialization code must load the appropriate values into the MP, EM, and NE flags of control register CR0. These bits are cleared on hardware reset of the processor. Table 8-2 shows the suggested settings for these flags, depending on the Intel Architecture processor being initialized. Initialization code can test for the type of processor present before setting or clearing these flags.

NOTE:

* In Table 8-2, the setting of the NE flag depends on the operating system being used.

The EM flag determines whether floating-point instructions are executed by the FPU (EM is cleared) or generate a device-not-available exception (#NM) so that an exception handler can emulate the floating-point operation (EM = 1). Ordinarily, the EM flag is cleared when an FPU or math coprocessor is present and set if they are not present. If the EM flag is set and no FPU, math coprocessor, or floating-point emulator is present, the system will hang when a floating-point instruction is executed.

The MP flag determines whether WAIT/FWAIT instructions react to the setting of the TS flag. If the MP flag is clear, WAIT/FWAIT instructions ignore the setting of the TS flag; if the MP flag is set, they will generate a device-not-available exception (#NM) if the TS flag is set. Generally, the MP flag should be set for processors with an integrated FPU and clear for processors without an integrated FPU and without a math coprocessor present. However, an operating system can choose to save the floating-point context at every context switch, in which case there would be no need to set the MP bit.

Table 2-1 in Chapter 2, System Architecture Overview shows the actions taken for floating-point and WAIT/FWAIT instructions based on the settings of the EM, MP, and TS flags.

The NE flag determines whether unmasked floating-point exceptions are handled by generating a floating-point error exception internally (NE is set, native mode) or through an external interrupt (NE is cleared). In systems where an external interrupt controller is used to invoke numeric exception handlers (such as MS-DOS-based systems), the NE bit should be cleared.

Table 8-2. Recommended Settings of EM and MP Flags on Intel Architecture Processors

| EM | MP | NE | Intel Architecture Processor |
|---|---|---|---|
| 1 | 0 | 1 | Intel486™ SX, Intel386™ DX, and Intel386™ SX processors only, without the presence of a math coprocessor. |
| 0 | 1 | 1 or 0* | Pentium® Pro, Pentium®, Intel486™ DX, and Intel 487 SX processors, and also Intel386™ DX and Intel386™ SX processors when a companion math coprocessor is present. |
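The interaction of the EM, MP, and TS flags described above can be summarized as two predicates, together with the recommended settings of Table 8-2. The Python sketch below is a simplified model, not processor code; it assumes that a floating-point instruction raises #NM when either EM or TS is set, and all function names are hypothetical.

```python
def fp_instruction_raises_nm(em, ts):
    """A floating-point instruction traps with #NM if EM or TS is set."""
    return bool(em or ts)


def wait_raises_nm(mp, ts):
    """WAIT/FWAIT traps with #NM only if both MP and TS are set."""
    return bool(mp and ts)


def recommended_fpu_flags(fpu_present, native_exceptions=True):
    """Return the EM, MP, and NE settings suggested by Table 8-2."""
    if fpu_present:
        # NE depends on the operating system: 1 selects internal (native)
        # error reporting, 0 selects the external-interrupt mechanism.
        return {"EM": 0, "MP": 1, "NE": 1 if native_exceptions else 0}
    return {"EM": 1, "MP": 0, "NE": 1}


flags = recommended_fpu_flags(fpu_present=True)
```

With the recommended settings for a processor with an FPU (EM = 0, MP = 1), both floating-point and WAIT/FWAIT instructions trap only while TS is set, which lets an operating system defer saving the floating-point context after a task switch.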


8.2.2. Setting the Processor for FPU Software Emulation

Setting the EM flag causes the processor to generate a device-not-available exception (#NM) and trap to a software exception handler whenever it encounters a floating-point instruction. (Table 8-2 shows when it is appropriate to use this flag.) Setting this flag has two functions:

• It allows floating-point code to run on an Intel processor that neither has an integrated FPU nor is connected to an external math coprocessor, by using a floating-point emulator.

• It allows floating-point code to be executed using a special or nonstandard floating-point emulator, selected for a particular application, regardless of whether an FPU or math coprocessor is present.

To emulate floating-point instructions, the EM, MP, and NE flags in control register CR0 should be set as shown in Table 8-3.

Table 8-3. Software Emulation Settings of EM, MP, and NE Flags

| CR0 Bit | Value |
|---|---|
| EM | 1 |
| MP | 0 |
| NE | 1 |

Regardless of the value of the EM bit, the Intel486™ SX processor generates a device-not-available exception (#NM) upon encountering any floating-point instruction.

8.3. CACHE ENABLING

The Intel Architecture processors (beginning with the Intel486™ processor) contain internal instruction and data caches. These caches are enabled by clearing the CD and NW flags in control register CR0. (They are set during a hardware reset.) Because all internal cache lines are invalid following reset initialization, it is not necessary to invalidate the cache before enabling caching. Any external caches may require initialization and invalidation using a system-specific initialization and invalidation code sequence.

Depending on the hardware and operating system or executive requirements, additional configuration of the processor’s caching facilities will probably be required. Beginning with the Intel486™ processor, page-level caching can be controlled with the PCD and PWT flags in page-directory and page-table entries. For P6 family processors, the memory type range registers (MTRRs) control the caching characteristics of the regions of physical memory. (For Intel486™ and Pentium® processors, external hardware can be used to control the caching characteristics of regions of physical memory.) Refer to Chapter 9, Memory Cache Control, for detailed information on configuration of the caching facilities in the P6 family processors and system memory.

8.4. MODEL-SPECIFIC REGISTERS (MSRS)

The P6 family processors and Pentium® processors contain model-specific registers (MSRs). These registers are by definition implementation specific; that is, they are not guaranteed to be supported on future Intel Architecture processors and/or to have the same functions. The MSRs are provided to control a variety of hardware- and software-related features, including:

• The performance-monitoring counters (refer to Section 15.6., “Performance-Monitoring Counters”, in Chapter 15, Debugging and Performance Monitoring).

• (P6 family processors only.) Debug extensions (refer to Section 15.4., “Last Branch, Interrupt, and Exception Recording”, in Chapter 15, Debugging and Performance Monitoring).

• (P6 family processors only.) The machine-check exception capability and its accompanying machine-check architecture (refer to Chapter 13, Machine-Check Architecture).

• (P6 family processors only.) The MTRRs (refer to Section 9.12., “Memory Type Range Registers (MTRRs)”, in Chapter 9, Memory Cache Control).

The MSRs can be read and written to using the RDMSR and WRMSR instructions, respectively.

When performing software initialization of a Pentium® Pro or Pentium® processor, many of the MSRs will need to be initialized to set up things like performance-monitoring events, run-time machine checks, and memory types for physical memory.

Systems configured to implement FRC mode must write all of the processors’ internal MSRs to deterministic values before performing either a read or read-modify-write operation using these registers. The following is a list of MSRs that are not initialized by the processors’ reset sequences.

• All fixed and variable MTRRs.

• All Machine Check Architecture (MCA) status registers.

• Microcode update signature register.

• All L2 cache initialization MSRs.

The list of available performance-monitoring counters for the Pentium® Pro and Pentium® processors is given in Appendix A, Performance-Monitoring Events, and the list of available MSRs for the Pentium® Pro processor is given in Appendix B, Model-Specific Registers. The references earlier in this section show where the functions of the various groups of MSRs are described in this manual.

8.5. MEMORY TYPE RANGE REGISTERS (MTRRS)

Memory type range registers (MTRRs) were introduced into the Intel Architecture with the Pentium® Pro processor. They allow the type of caching (or no caching) to be specified in system memory for selected physical address ranges. They allow memory accesses to be optimized for various types of memory such as RAM, ROM, frame buffer memory, and memory-mapped I/O devices.

Initializing the MTRRs is normally handled by the software initialization code or BIOS and is not an operating system or executive function. At the very least, all the MTRRs must be cleared to 0, which selects the uncached (UC) memory type. Refer to Section 9.12., “Memory Type Range Registers (MTRRs)”, in Chapter 9, Memory Cache Control, for detailed information on the MTRRs.

8.6. SOFTWARE INITIALIZATION FOR REAL-ADDRESS MODE OPERATION

Following a hardware reset (either through a power-up or the assertion of the RESET# pin) the processor is placed in real-address mode and begins executing software initialization code from physical address FFFFFFF0H. Software initialization code must first set up the necessary data structures for handling basic system functions, such as a real-mode IDT for handling interrupts and exceptions. If the processor is to remain in real-address mode, software must then load additional operating-system or executive code modules and data structures to allow reliable execution of application programs in real-address mode.

If the processor is going to operate in protected mode, software must load the necessary data structures to operate in protected mode and then switch to protected mode. The protected-mode data structures that must be loaded are described in Section 8.7., “Software Initialization for Protected-Mode Operation”.

8.6.1. Real-Address Mode IDT

In real-address mode, the only system data structure that must be loaded into memory is the IDT (also called the “interrupt vector table”). By default, the address of the base of the IDT is physical address 0H. This address can be changed by using the LIDT instruction to change the base address value in the IDTR. Software initialization code needs to load interrupt- and exception-handler pointers into the IDT before interrupts can be enabled.

The actual interrupt- and exception-handler code can be contained either in EPROM or RAM; however, the code must be located within the 1-MByte addressable range of the processor in real-address mode. If the handler code is to be stored in RAM, it must be loaded along with the IDT.
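As an illustration of what loading a handler pointer involves, the Python sketch below builds one real-address mode IDT entry. It assumes the real-mode entry format, which is not described in this section: 4 bytes per vector, a 16-bit offset followed by a 16-bit segment, both little-endian. The function names and the handler address F000:1234H are hypothetical.

```python
import struct


def ivt_entry(segment, offset):
    """Pack one 4-byte real-mode vector: offset first, then segment."""
    return struct.pack("<HH", offset, segment)


def ivt_entry_address(vector, idt_base=0):
    """Physical address of the entry for a given vector number."""
    return idt_base + 4 * vector


def handler_physical_address(segment, offset):
    """Real-address mode translation: segment * 16 + offset."""
    return (segment << 4) + offset


NMI_VECTOR = 2
entry = ivt_entry(0xF000, 0x1234)            # hypothetical handler location
entry_addr = ivt_entry_address(NMI_VECTOR)   # with the default IDT base of 0H
handler_addr = handler_physical_address(0xF000, 0x1234)
```

The handler address F1234H lies below 1 MByte, as the text requires. The entry for vector 2 occupies physical addresses 8H through BH when the IDT base is 0H.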

8.6.2. NMI Interrupt Handling

The NMI interrupt is always enabled (except when multiple NMIs are nested). If the IDT and the NMI interrupt handler need to be loaded into RAM, there will be a period of time following hardware reset when an NMI interrupt cannot be handled. During this time, hardware must provide a mechanism to prevent an NMI interrupt from halting code execution until the IDT and the necessary NMI handler software is loaded.


Here are two examples of how NMIs can be handled during the initial stages of processor initialization:

• A simple IDT and NMI interrupt handler can be provided in EPROM. This allows an NMI interrupt to be handled immediately after reset initialization.

• The system hardware can provide a mechanism to enable and disable NMIs by passing the NMI# signal through an AND gate controlled by a flag in an I/O port. Hardware can clear the flag when the processor is reset, and software can set the flag when it is ready to handle NMI interrupts.
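The second approach can be modeled in a few lines. The Python class below is a behavioral sketch of such a gate, not a description of any particular chipset; the class and method names are hypothetical, and the signal is treated as a logical value rather than an active-low pin.

```python
class NmiGate:
    """Behavioral model of an NMI signal gated by an I/O-port flag."""

    def __init__(self):
        self.enabled = False  # cleared by hardware at processor reset

    def reset(self):
        """Hardware clears the flag whenever the processor is reset."""
        self.enabled = False

    def enable(self):
        """Software sets the flag once the IDT and NMI handler are loaded."""
        self.enabled = True

    def output(self, nmi_asserted):
        """AND of the incoming NMI request and the enable flag."""
        return bool(nmi_asserted and self.enabled)


gate = NmiGate()
blocked = gate.output(True)    # request arrives before software is ready
gate.enable()
delivered = gate.output(True)  # request arrives after the handler is loaded
```

In this model an NMI request that arrives before `enable()` is simply not passed to the processor. Whether such a request is latched or lost is a property of the system hardware and is not modeled here.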

8.7. SOFTWARE INITIALIZATION FOR PROTECTED-MODE OPERATION

The processor is placed in real-address mode following a hardware reset. At this point in the initialization process, some basic data structures and code modules must be loaded into physical memory to support further initialization of the processor, as described in Section 8.6., “Software Initialization for Real-Address Mode Operation”. Before the processor can be switched to protected mode, the software initialization code must load a minimum number of protected-mode data structures and code modules into memory to support reliable operation of the processor in protected mode. These data structures include the following:

• A protected-mode IDT.

• A GDT.

• A TSS.

• (Optional.) An LDT.

• If paging is to be used, at least one page directory and one page table.

• A code segment that contains the code to be executed when the processor switches to protected mode.

• One or more code modules that contain the necessary interrupt and exception handlers.

Software initialization code must also initialize the following system registers before the processor can be switched to protected mode:

• The GDTR.

• (Optional.) The IDTR. This register can also be initialized immediately after switching to protected mode, prior to enabling interrupts.

• Control registers CR1 through CR4.

• (Pentium® Pro processor only.) The memory type range registers (MTRRs).

With these data structures, code modules, and system registers initialized, the processor can be switched to protected mode by loading control register CR0 with a value that sets the PE flag (bit 0).


8.7.1. Protected-Mode System Data Structures

The contents of the protected-mode system data structures loaded into memory during software initialization depend largely on the type of memory management the protected-mode operating system or executive is going to support: flat, flat with paging, segmented, or segmented with paging.

To implement a flat memory model without paging, software initialization code must at a minimum load a GDT with one code and one data-segment descriptor. A null descriptor in the first GDT entry is also required. The stack can be placed in a normal read/write data segment, so no dedicated descriptor for the stack is required. A flat memory model with paging also requires a page directory and at least one page table (unless all pages are 4 MBytes, in which case only a page directory is required). Refer to Section 8.7.3., “Initializing Paging”.

Before the GDT can be used, the base address and limit for the GDT must be loaded into the GDTR register using an LGDT instruction.
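A minimal flat-model GDT of this kind can be laid out as follows. The Python sketch below assumes the standard 8-byte segment-descriptor format; the helper name `make_descriptor`, the access-byte values, and the GDT base address 00001000H are illustrative choices, not requirements of this section.

```python
import struct


def make_descriptor(base, limit, access, flags):
    """Pack one 8-byte segment descriptor.

    limit is the 20-bit segment limit, access is the access-rights byte,
    and flags is the 4-bit field holding G, D/B, a reserved bit, and AVL.
    """
    return struct.pack(
        "<HHBBBB",
        limit & 0xFFFF,                            # limit 15:0
        base & 0xFFFF,                             # base 15:0
        (base >> 16) & 0xFF,                       # base 23:16
        access,                                    # P, DPL, S, type
        ((flags & 0xF) << 4) | ((limit >> 16) & 0xF),  # flags, limit 19:16
        (base >> 24) & 0xFF,                       # base 31:24
    )


gdt = b"".join([
    make_descriptor(0, 0, 0, 0),             # required null descriptor
    make_descriptor(0, 0xFFFFF, 0x9A, 0xC),  # code: present, DPL 0, execute/read
    make_descriptor(0, 0xFFFFF, 0x92, 0xC),  # data: present, DPL 0, read/write
])

GDT_BASE = 0x00001000            # hypothetical load address of the table
gdtr_limit = len(gdt) - 1        # the limit is the table size in bytes minus 1
gdtr_image = struct.pack("<HI", gdtr_limit, GDT_BASE)  # 6-byte LGDT operand
```

With the granularity flag set and a limit of FFFFFH, both segments span the full 4-GByte linear address space, which is what makes the model flat. The resulting selectors are 08H for code and 10H for data.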

A multisegmented model may require additional segments for the operating system, as well as segments and LDTs for each application program. LDTs require segment descriptors in the GDT. Some operating systems allocate new segments and LDTs as they are needed. This provides maximum flexibility for handling a dynamic programming environment. However, many operating systems use a single LDT for all tasks, allocating GDT entries in advance. An embedded system, such as a process controller, might pre-allocate a fixed number of segments and LDTs for a fixed number of application programs. This would be a simple and efficient way to structure the software environment of a real-time system.

8.7.2. Initializing Protected-Mode Exceptions and Interrupts

Software initialization code must at a minimum load a protected-mode IDT with gate descriptors for each exception vector that the processor can generate. If interrupt or trap gates are used, the gate descriptors can all point to the same code segment, which contains the necessary exception handlers. If task gates are used, one TSS and accompanying code, data, and task segments are required for each exception handler called with a task gate.

If hardware allows interrupts to be generated, gate descriptors must be provided in the IDT for one or more interrupt handlers.

Before the IDT can be used, the base address and limit for the IDT must be loaded into the IDTR register using an LIDT instruction. This operation is typically carried out immediately after switching to protected mode.

8.7.3. Initializing Paging

Paging is controlled by the PG flag in control register CR0. When this flag is clear (its state following a hardware reset), the paging mechanism is turned off; when it is set, paging is enabled. Before setting the PG flag, the following data structures and registers must be initialized:


• Software must load at least one page directory and one page table into physical memory. The page table can be eliminated if the page directory contains a directory entry pointing to itself (here, the page directory and page table reside in the same page), or if only 4-MByte pages are used.

• Control register CR3 (also called the PDBR register) is loaded with the physical base address of the page directory.

• (Optional) Software may provide one set of code and data descriptors in the GDT or in an LDT for supervisor mode and another set for user mode.

With this paging initialization complete, paging is enabled and the processor is switched to protected mode at the same time by loading control register CR0 with an image in which the PG and PE flags are set. (Paging cannot be enabled before the processor is switched to protected mode.)
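The minimum paging structures can be sketched as one page directory and one page table that identity-map the first 4 MBytes with 4-KByte pages. The Python model below assumes the two-level 32-bit paging format with the present and read/write flags in bits 0 and 1 of each entry; the function names and the physical addresses 1000H and 2000H are hypothetical.

```python
PAGE_SIZE = 4096
PRESENT, WRITABLE = 0x1, 0x2
CR0_PE, CR0_PG = 1 << 0, 1 << 31


def build_identity_tables(page_dir_base, page_table_base):
    """Return (page_dir, page_table, cr3) mapping the first 4 MBytes 1:1."""
    page_table = [(i * PAGE_SIZE) | PRESENT | WRITABLE for i in range(1024)]
    page_dir = [0] * 1024                      # all other entries not present
    page_dir[0] = page_table_base | PRESENT | WRITABLE
    return page_dir, page_table, page_dir_base  # CR3 holds the directory base


def translate(linear, page_dir, tables):
    """Walk the two-level structure the way the paging unit would."""
    pde = page_dir[linear >> 22]
    if not pde & PRESENT:
        raise LookupError("page-directory entry not present")
    pte = tables[pde & ~0xFFF][(linear >> 12) & 0x3FF]
    if not pte & PRESENT:
        raise LookupError("page-table entry not present")
    return (pte & ~0xFFF) | (linear & 0xFFF)


page_dir, page_table, cr3 = build_identity_tables(0x1000, 0x2000)
tables = {0x2000: page_table}   # physical base address -> table contents
physical = translate(0x00123456, page_dir, tables)

# CR0 image that enables protected mode and paging in one load,
# starting from the reset value of CR0.
cr0_image = 0x60000010 | CR0_PE | CR0_PG
```

Because every mapped linear address translates to the same physical address, the instruction stream that loads CR0 keeps executing from the same bytes before and after paging is enabled.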

8.7.4. Initializing Multitasking

If the multitasking mechanism is not going to be used and changes between privilege levels are not allowed, it is not necessary to load a TSS into memory or to initialize the task register.

If the multitasking mechanism is going to be used and/or changes between privilege levels are allowed, software initialization code must load at least one TSS and an accompanying TSS descriptor. (A TSS is required to change privilege levels because pointers to the privilege-level 0, 1, and 2 stack segments and the stack pointers for these stacks are obtained from the TSS.) TSS descriptors must not be marked as busy when they are created; they should be marked busy by the processor only as a side-effect of performing a task switch. As with descriptors for LDTs, TSS descriptors reside in the GDT.

After the processor has switched to protected mode, the LTR instruction can be used to load a segment selector for a TSS descriptor into the task register. This instruction marks the TSS descriptor as busy, but does not perform a task switch. The processor can, however, use the TSS to locate pointers to privilege-level 0, 1, and 2 stacks. The segment selector for the TSS must be loaded before software performs its first task switch in protected mode, because a task switch copies the current task state into the TSS.

After the LTR instruction has been executed, further operations on the task register are performed by task switching. As with other segments and LDTs, TSSs and TSS descriptors can be either pre-allocated or allocated as needed.
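The busy-flag rule can be illustrated with the descriptor type field. The Python sketch below assumes the 32-bit TSS descriptor type encodings (1001B for an available TSS, 1011B for a busy TSS, the busy flag being bit 1 of the type field); the function `ltr` is a hypothetical model of the instruction's effect on the descriptor and does not model the task register itself.

```python
TSS_AVAILABLE = 0b1001  # type field of a newly created 32-bit TSS descriptor
TSS_BUSY = 0b1011       # same type with the busy flag (bit 1) set


def ltr(descriptor_type):
    """Model LTR: accept only an available TSS and mark it busy."""
    if descriptor_type != TSS_AVAILABLE:
        # The processor raises a general-protection exception here.
        raise ValueError("#GP: descriptor is not an available TSS")
    return descriptor_type | 0b0010


loaded_type = ltr(TSS_AVAILABLE)
```

This is why software must create TSS descriptors in the available state: in this model, a descriptor that is already marked busy cannot be loaded with LTR.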

8.8. MODE SWITCHING

To use the processor in protected mode, a mode switch must be performed from real-address mode. Once in protected mode, software generally does not need to return to real-address mode. To run software written to run in real-address mode (8086 mode), it is generally more convenient to run the software in virtual-8086 mode than to switch back to real-address mode.


8.8.1. Switching to Protected Mode

Before switching to protected mode, a minimum set of system data structures and code modules must be loaded into memory, as described in Section 8.7., “Software Initialization for Protected-Mode Operation”. Once these tables are created, software initialization code can switch into protected mode.

Protected mode is entered by executing a MOV CR0 instruction that sets the PE flag in the CR0 register. (In the same instruction, the PG flag in register CR0 can be set to enable paging.) Execution in protected mode begins with a CPL of 0.

The 32-bit Intel Architecture processors have slightly different requirements for switching to protected mode. To ensure upwards and downwards code compatibility with all 32-bit Intel Architecture processors, it is recommended that the following steps be performed:

1. Disable interrupts. A CLI instruction disables maskable hardware interrupts. NMI interrupts can be disabled with external circuitry. (Software must guarantee that no exceptions or interrupts are generated during the mode switching operation.)

2. Execute the LGDT instruction to load the GDTR register with the base address of the GDT.

3. Execute a MOV CR0 instruction that sets the PE flag (and optionally the PG flag) in control register CR0.

4. Immediately following the MOV CR0 instruction, execute a far JMP or far CALL instruction. (This operation is typically a far jump or call to the next instruction in the instruction stream.)

The JMP or CALL instruction immediately after the MOV CR0 instruction changes the flow of execution and serializes the processor.

If paging is enabled, the code for the MOV CR0 instruction and the JMP or CALL instruction must come from a page that is identity mapped (that is, the linear address before the jump is the same as the physical address after paging and protected mode is enabled). The target instruction for the JMP or CALL instruction does not need to be identity mapped.

5. If a local descriptor table is going to be used, execute the LLDT instruction to load the segment selector for the LDT in the LDTR register.

6. Execute the LTR instruction to load the task register with a segment selector to the initial protected-mode task or to a writable area of memory that can be used to store TSS information on a task switch.

7. After entering protected mode, the segment registers continue to hold the contents they had in real-address mode. The JMP or CALL instruction in step 4 resets the CS register. Perform one of the following operations to update the contents of the remaining segment registers.

— Reload segment registers DS, SS, ES, FS, and GS. If the ES, FS, and/or GS registers are not going to be used, load them with a null selector.

— Perform a JMP or CALL instruction to a new task, which automatically resets the values of the segment registers and branches to a new code segment.

8. Execute the LIDT instruction to load the IDTR register with the address and limit of the protected-mode IDT.

9. Execute the STI instruction to enable maskable hardware interrupts and perform the necessary hardware operation to enable NMI interrupts.

Random failures can occur if other instructions exist between steps 3 and 4 above. Failures will be readily seen in some situations, such as when instructions that reference memory are inserted between steps 3 and 4 while in System Management mode.
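The CR0 value written in step 3 can be modeled with a few lines of Python. This is only a calculator-style sketch, not code that runs on the processor. The bit positions are architectural (PE is bit 0 and PG is bit 31 of CR0); the names CR0_PE, CR0_PG, and enter_protected_mode, and the sample starting value, are assumptions made for this illustration.

```python
CR0_PE = 1 << 0    # protection enable flag, bit 0 of CR0
CR0_PG = 1 << 31   # paging flag, bit 31 of CR0

def enter_protected_mode(cr0, enable_paging=False):
    """Return the CR0 value to write in step 3 (hypothetical helper)."""
    new_cr0 = cr0 | CR0_PE
    if enable_paging:
        new_cr0 |= CR0_PG
    return new_cr0 & 0xFFFFFFFF

# Sample starting value, assumed for illustration only.
sample_cr0 = 0x60000010
pe_only = enter_protected_mode(sample_cr0)
pe_and_pg = enter_protected_mode(sample_cr0, enable_paging=True)
```

Only the flags named in step 3 change; all other CR0 bits are carried over from the value that was read.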

8.8.2. Switching Back to Real-Address Mode

The processor switches back to real-address mode if software clears the PE bit in the CR0 register with a MOV CR0 instruction. A procedure that re-enters real-address mode should perform the following steps:

1. Disable interrupts. A CLI instruction disables maskable hardware interrupts. NMI interrupts can be disabled with external circuitry.

2. If paging is enabled, perform the following operations:

— Transfer program control to linear addresses that are identity mapped to physical addresses (that is, linear addresses equal physical addresses).

— Ensure that the GDT and IDT are in identity mapped pages.

— Clear the PG bit in the CR0 register.

— Move 0H into the CR3 register to flush the TLB.

3. Transfer program control to a readable segment that has a limit of 64 KBytes (FFFFH). This operation loads the CS register with the segment limit required in real-address mode.

4. Load segment registers SS, DS, ES, FS, and GS with a selector for a descriptor containing the following values, which are appropriate for real-address mode:

— Limit = 64 KBytes (0FFFFH)

— Byte granular (G = 0)

— Expand up (E = 0)

— Writable (W = 1)

— Present (P = 1)

— Base = any value

The segment registers must be loaded with nonnull segment selectors or the segment registers will be unusable in real-address mode. Note that if the segment registers are not reloaded, execution continues using the descriptor attributes loaded during protected mode.

5. Execute an LIDT instruction to point to a real-address mode interrupt table that is within the 1-MByte real-address mode address range.

6. Clear the PE flag in the CR0 register to switch to real-address mode.

7. Execute a far JMP instruction to jump to a real-address mode program. This operation flushes the instruction queue and loads the appropriate base and access rights values in the CS register.

8. Load the SS, DS, ES, FS, and GS registers as needed by the real-address mode code. If any of the registers are not going to be used in real-address mode, write 0s to them.

9. Execute the STI instruction to enable maskable hardware interrupts and perform the necessary hardware operation to enable NMI interrupts.

NOTE

All the code that is executed in steps 1 through 9 must be in a single page and the linear addresses in that page must be identity mapped to physical addresses.
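The descriptor values listed in step 4 can be made concrete by packing them into the 8-byte descriptor format. The Python sketch below does this for a data segment. The field positions follow the standard segment-descriptor layout; the helper name make_data_descriptor and its parameters are hypothetical, and the base of 0 is an arbitrary choice since step 4 allows any base.

```python
import struct

def make_data_descriptor(base, limit, writable=True, expand_down=False,
                         present=True, granular=False, big=False, dpl=0):
    """Pack an 8-byte data-segment descriptor (hypothetical helper)."""
    if not 0 <= limit <= 0xFFFFF:
        raise ValueError("the limit field is 20 bits wide")
    # Type field for a data segment: bit 2 = E, bit 1 = W, bit 0 = accessed (0).
    seg_type = (int(expand_down) << 2) | (int(writable) << 1)
    # Access byte: P, DPL, S = 1 (code/data segment), type.
    access = (int(present) << 7) | (dpl << 5) | (1 << 4) | seg_type
    flags = (int(granular) << 3) | (int(big) << 2)   # G and D/B bits
    low = (limit & 0xFFFF) | ((base & 0xFFFF) << 16)
    high = (((base >> 16) & 0xFF)
            | (access << 8)
            | (((limit >> 16) & 0xF) << 16)
            | (flags << 20)
            | (((base >> 24) & 0xFF) << 24))
    return struct.pack("<II", low, high)

# Limit = 0FFFFH, byte granular, expand up, writable, present, base = 0.
real_mode_desc = make_data_descriptor(base=0, limit=0xFFFF)
```

With these arguments the access byte works out to 92H and the granularity nibble to 0, which matches the G = 0, E = 0, W = 1, P = 1 settings listed in step 4.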

8.9. INITIALIZATION AND MODE SWITCHING EXAMPLE

This section provides an initialization and mode switching example that can be incorporated into an application. This code was originally written to initialize the Intel386™ processor, but it will execute successfully on the Pentium® Pro, Pentium®, and Intel486™ processors. The code in this example is intended to reside in EPROM and to run following a hardware reset of the processor. The function of the code is to do the following:

• Establish a basic real-address mode operating environment.

• Load the necessary protected-mode system data structures into RAM.

• Load the system registers with the necessary pointers to the data structures and the appropriate flag settings for protected-mode operation.

• Switch the processor to protected mode.

Figure 8-3 shows the physical memory layout for the processor following a hardware reset and the starting point of this example. The EPROM that contains the initialization code resides at the upper end of the processor’s physical memory address range, starting at address FFFFFFFFH and going down from there. The address of the first instruction to be executed is at FFFFFFF0H, the default starting address for the processor following a hardware reset.

The main steps carried out in this example are summarized in Table 8-4. The source listing for the example (with the filename STARTUP.ASM) is given in Example 8-1. The line numbers given in Table 8-4 refer to the source listing.

The following are some additional notes concerning this example:

• When the processor is switched into protected mode, the original code segment base-address value of FFFF0000H (located in the hidden part of the CS register) is retained and execution continues from the current offset in the EIP register. The processor will thus continue to execute code in the EPROM until a far jump or call is made to a new code segment, at which time the base address in the CS register will be changed.

• Maskable hardware interrupts are disabled after a hardware reset and should remain disabled until the necessary interrupt handlers have been installed. The NMI interrupt is not disabled following a reset. The NMI# pin must thus be inhibited from being asserted until an NMI handler has been loaded and made available to the processor.

• The use of a temporary GDT allows simple transfer of tables from the EPROM to anywhere in the RAM area. A GDT entry is constructed with its base pointing to address 0 and a limit of 4 GBytes. When the DS and ES registers are loaded with this descriptor, the temporary GDT is no longer needed and can be replaced by the application GDT.

• This code loads one TSS and no LDTs. If more TSSs exist in the application, they must be loaded into RAM. If there are LDTs they may be loaded as well.
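The temporary GDT entry mentioned above is built in STARTUP.ASM from the two constants LINEAR_PROTO_LO (00000FFFFH) and LINEAR_PROTO_HI (000CF9200H). As a cross-check, the Python sketch below decodes those two doublewords using the standard descriptor layout. The function name decode_descriptor and the dictionary keys are assumptions made for this illustration.

```python
def decode_descriptor(low, high):
    """Split the two doublewords of a segment descriptor into fields."""
    base = ((low >> 16) & 0xFFFF) | ((high & 0xFF) << 16) | (high & 0xFF000000)
    limit = (low & 0xFFFF) | (high & 0x000F0000)
    granular = bool(high & (1 << 23))
    # With G = 1 the 20-bit limit is counted in 4-KByte units.
    size = ((limit + 1) << 12) if granular else (limit + 1)
    return {
        "base": base,
        "limit_field": limit,
        "granular": granular,
        "size_bytes": size,
        "present": bool(high & (1 << 15)),
        # Bit 9 is the W flag only when the descriptor is a data segment.
        "writable": bool(high & (1 << 9)),
    }

temp_entry = decode_descriptor(0x0000FFFF, 0x00CF9200)
```

The decoded entry has base 0, a limit field of FFFFFH with G = 1, and therefore covers 4 GBytes, which is why DS and ES can reach any linear address once they are loaded with it.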

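Figure 8-3. Processor State After Reset

[Figure: physical memory map after reset. The 64K EPROM occupies FFFF 0000H through FFFF FFFFH. CS.BASE = FFFF 0000H and EIP = 0000 FFF0H, so the first instruction is fetched from [CS.BASE+EIP] = FFFF FFF0H. DS.BASE = 0H, ES.BASE = 0H, SS.BASE = 0H, and ESP = 0H, so SP, DS, SS, and ES address the bottom of memory.]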
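The reset addresses shown in Figure 8-3 follow from simple arithmetic, sketched below in Python. The constant names are assumptions made for this illustration; the values come from the figure.

```python
CS_BASE_AFTER_RESET = 0xFFFF0000   # hidden CS base after reset
EIP_AFTER_RESET = 0x0000FFF0       # EIP after reset
EPROM_SIZE = 64 * 1024             # 64K EPROM
TOP_OF_MEMORY = 0x1_0000_0000      # one past FFFFFFFFH

first_fetch = (CS_BASE_AFTER_RESET + EIP_AFTER_RESET) & 0xFFFFFFFF
eprom_start = TOP_OF_MEMORY - EPROM_SIZE
bytes_to_top = TOP_OF_MEMORY - first_fetch
```

Only 16 bytes lie between the first fetched instruction and the top of the address space, which is why the code at that address is a jump to the entry point of the startup code.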

Table 8-4. Main Initialization Steps in STARTUP.ASM Source Listing

STARTUP.ASM Line Numbers
From To Description

157 157 Jump (short) to the entry code in the EPROM

162 169 Construct a temporary GDT in RAM with one entry: 0 - null; 1 - R/W data segment, base = 0, limit = 4 GBytes

171 172 Load the GDTR to point to the temporary GDT

174 177 Load CR0 with PE flag set to switch to protected mode

179 181 Jump near to clear real mode instruction queue

184 186 Load DS, ES registers with GDT[1] descriptor, so both point to the entire physical memory space

188 195 Perform specific board initialization that is imposed by the new protected mode

196 218 Copy the application’s GDT from ROM into RAM

220 238 Copy the application’s IDT from ROM into RAM

241 243 Load application’s GDTR

244 245 Load application’s IDTR

247 261 Copy the application’s TSS from ROM into RAM

263 267 Update TSS descriptor and other aliases in GDT (GDT alias or IDT alias)

277 277 Load the task register (without task switch) using LTR instruction

282 286 Load SS, ESP with the value found in the application’s TSS

287 287 Push EFLAGS value found in the application’s TSS

288 288 Push CS value found in the application’s TSS

289 289 Push EIP value found in the application’s TSS

290 293 Load DS, ES with the value found in the application’s TSS

296 296 Perform IRET; pop the above values and enter the application code

8.9.1. Assembler Usage

In this example, the Intel assembler ASM386 and build tools BLD386 are used to assemble and build the initialization code module. The following assumptions are used when using the Intel ASM386 and BLD386 tools.

• The ASM386 will generate the right operand size opcodes according to the code-segment attribute. The attribute is assigned either by the ASM386 invocation controls or in the code-segment definition.

• If a code segment that is going to run in real-address mode is defined, it must be set to a USE16 attribute. If a 32-bit operand is used in an instruction in this code segment (for example, MOV EAX, EBX), the assembler automatically generates an operand prefix for the instruction that forces the processor to execute a 32-bit operation, even though its default code-segment attribute is 16-bit.

• Intel’s ASM386 assembler allows specific use of the 16- or 32-bit instructions, for example, LGDTW, LGDTD, IRETD. If the generic instruction LGDT is used, the default-segment attribute will be used to generate the right opcode.

8.9.2. STARTUP.ASM Listing

The source code listing to move the processor into protected mode is provided in Example 8-1. This listing does not include any opcode and offset information.

Example 8-1. STARTUP.ASM

MS-DOS* 5.0(045-N) 386(TM) MACRO ASSEMBLER STARTUP 09:44:51 08/19/92 PAGE 1

MS-DOS 5.0(045-N) 386(TM) MACRO ASSEMBLER V4.0, ASSEMBLY OF MODULE STARTUP
OBJECT MODULE PLACED IN startup.obj
ASSEMBLER INVOKED BY: f:\386tools\ASM386.EXE startup.a58 pw (132 )

LINE    SOURCE

   1    NAME STARTUP
   2
   3    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
   4    ;
   5    ; ASSUMPTIONS:
   6    ;
   7    ; 1. The bottom 64K of memory is ram, and can be used for
   8    ;    scratch space by this module.
   9    ;
  10    ; 2. The system has sufficient free usable ram to copy the
  11    ;    initial GDT, IDT, and TSS
  12    ;
  13    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
  14
  15    ; configuration data - must match with build definition
  16
  17    CS_BASE EQU 0FFFF0000H
  18
  19    ; CS_BASE is the linear address of the segment STARTUP_CODE
  20    ; - this is specified in the build language file
  21
  22    RAM_START EQU 400H
  23
  24    ; RAM_START is the start of free, usable ram in the linear
  25    ; memory space. The GDT, IDT, and initial TSS will be
  26    ; copied above this space, and a small data segment will be
  27    ; discarded at this linear address. The 32-bit word at
  28    ; RAM_START will contain the linear address of the first
  29    ; free byte above the copied tables - this may be useful if
  30    ; a memory manager is used.
  31
  32    TSS_INDEX EQU 10
  33
  34    ; TSS_INDEX is the index of the TSS of the first task to
  35    ; run after startup
  36
  37
  38    ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
  39
  40    ; ------------------------- STRUCTURES and EQU ---------------
  41    ; structures for system data
  42
  43    ; TSS structure
  44    TASK_STATE STRUC
  45        link DW ?
  46        link_h DW ?
  47        ESP0 DD ?
  48        SS0 DW ?
  49        SS0_h DW ?
  50        ESP1 DD ?
  51        SS1 DW ?
  52        SS1_h DW ?
  53        ESP2 DD ?
  54        SS2 DW ?
  55        SS2_h DW ?
  56        CR3_reg DD ?
  57        EIP_reg DD ?
  58        EFLAGS_reg DD ?
  59        EAX_reg DD ?
  60        ECX_reg DD ?
  61        EDX_reg DD ?
  62        EBX_reg DD ?
  63        ESP_reg DD ?
  64        EBP_reg DD ?
  65        ESI_reg DD ?
  66        EDI_reg DD ?
  67        ES_reg DW ?
  68        ES_h DW ?
  69        CS_reg DW ?
  70        CS_h DW ?
  71        SS_reg DW ?
  72        SS_h DW ?
  73        DS_reg DW ?
  74        DS_h DW ?
  75        FS_reg DW ?
  76        FS_h DW ?
  77        GS_reg DW ?
  78        GS_h DW ?
  79        LDT_reg DW ?
  80        LDT_h DW ?
  81        TRAP_reg DW ?
  82        IO_map_base DW ?
  83    TASK_STATE ENDS
  84
  85    ; basic structure of a descriptor
  86    DESC STRUC
  87        lim_0_15 DW ?
  88        bas_0_15 DW ?
  89        bas_16_23 DB ?
  90        access DB ?
  91        gran DB ?
  92        bas_24_31 DB ?
  93    DESC ENDS
  94
  95    ; structure for use with LGDT and LIDT instructions
  96    TABLE_REG STRUC
  97        table_lim DW ?
  98        table_linear DD ?
  99    TABLE_REG ENDS
 100
 101    ; offset of GDT and IDT descriptors in builder generated GDT
 102    GDT_DESC_OFF EQU 1*SIZE(DESC)
 103    IDT_DESC_OFF EQU 2*SIZE(DESC)
 104
 105    ; equates for building temporary GDT in RAM
 106    LINEAR_SEL EQU 1*SIZE (DESC)
 107    LINEAR_PROTO_LO EQU 00000FFFFH ; LINEAR_ALIAS
 108    LINEAR_PROTO_HI EQU 000CF9200H
 109
 110    ; Protection Enable Bit in CR0
 111    PE_BIT EQU 1B
 112
 113    ; ------------------------------------------------------------
 114
 115    ; ------------------------- DATA SEGMENT----------------------
 116
 117    ; Initially, this data segment starts at linear 0, according
 118    ; to the processor’s power-up state.
 119
 120    STARTUP_DATA SEGMENT RW
 121
 122    free_mem_linear_base LABEL DWORD
 123    TEMP_GDT LABEL BYTE ; must be first in segment
 124    TEMP_GDT_NULL_DESC DESC <>
 125    TEMP_GDT_LINEAR_DESC DESC <>
 126
 127    ; scratch areas for LGDT and LIDT instructions
 128    TEMP_GDT_SCRATCH TABLE_REG <>
 129    APP_GDT_RAM TABLE_REG <>
 130    APP_IDT_RAM TABLE_REG <>
 131    ; align end_data
 132    fill DW ?
 133
 134    ; last thing in this segment - should be on a dword boundary
 135    end_data LABEL BYTE
 136
 137    STARTUP_DATA ENDS
 138    ; ------------------------------------------------------------
 139
 140
 141    ; ------------------------- CODE SEGMENT----------------------
 142    STARTUP_CODE SEGMENT ER PUBLIC USE16
 143
 144    ; filled in by builder
 145    PUBLIC GDT_EPROM
 146    GDT_EPROM TABLE_REG <>
 147
 148    ; filled in by builder
 149    PUBLIC IDT_EPROM
 150    IDT_EPROM TABLE_REG <>
 151
 152    ; entry point into startup code - the bootstrap will vector
 153    ; here with a near JMP generated by the builder. This
 154    ; label must be in the top 64K of linear memory.
 155
 156    PUBLIC STARTUP
 157    STARTUP:
 158
 159    ; DS,ES address the bottom 64K of flat linear memory
 160        ASSUME DS:STARTUP_DATA, ES:STARTUP_DATA
 161    ; See Figure 8-4
 162    ; load GDTR with temporary GDT
 163        LEA EBX,TEMP_GDT ; build the TEMP_GDT in low ram,
 164        MOV DWORD PTR [EBX],0 ; where we can address
 165        MOV DWORD PTR [EBX]+4,0
 166        MOV DWORD PTR [EBX]+8, LINEAR_PROTO_LO
 167        MOV DWORD PTR [EBX]+12, LINEAR_PROTO_HI
 168        MOV TEMP_GDT_scratch.table_linear,EBX
 169        MOV TEMP_GDT_scratch.table_lim,15
 170
 171        DB 66H ; execute a 32 bit LGDT
 172        LGDT TEMP_GDT_scratch
 173
 174    ; enter protected mode
 175        MOV EBX,CR0
 176        OR EBX,PE_BIT
 177        MOV CR0,EBX
 178
 179    ; clear prefetch queue
 180        JMP CLEAR_LABEL
 181    CLEAR_LABEL:
 182
 183    ; make DS and ES address 4G of linear memory
 184        MOV CX,LINEAR_SEL
 185        MOV DS,CX
 186        MOV ES,CX
 187
 188    ; do board specific initialization
 189    ;
 190    ;
 191    ; ......
 192    ;
 193
 194
 195    ; See Figure 8-5
 196    ; copy EPROM GDT to ram at:
 197    ; RAM_START + size (STARTUP_DATA)
 198        MOV EAX,RAM_START
 199        ADD EAX,OFFSET (end_data)
 200        MOV EBX,RAM_START
 201        MOV ECX, CS_BASE
 202        ADD ECX, OFFSET (GDT_EPROM)
 203        MOV ESI, [ECX].table_linear
 204        MOV EDI,EAX
 205        MOVZX ECX, [ECX].table_lim
 206        MOV APP_GDT_ram[EBX].table_lim,CX
 207        INC ECX
 208        MOV EDX,EAX
 209        MOV APP_GDT_ram[EBX].table_linear,EAX
 210        ADD EAX,ECX
 211        REP MOVS BYTE PTR ES:[EDI],BYTE PTR DS:[ESI]
 212
 213    ; fixup GDT base in descriptor
 214        MOV ECX,EDX
 215        MOV [EDX].bas_0_15+GDT_DESC_OFF,CX
 216        ROR ECX,16
 217        MOV [EDX].bas_16_23+GDT_DESC_OFF,CL
 218        MOV [EDX].bas_24_31+GDT_DESC_OFF,CH
 219
 220    ; copy EPROM IDT to ram at:
 221    ; RAM_START+size(STARTUP_DATA)+SIZE (EPROM GDT)
 222        MOV ECX, CS_BASE
 223        ADD ECX, OFFSET (IDT_EPROM)
 224        MOV ESI, [ECX].table_linear
 225        MOV EDI,EAX
 226        MOVZX ECX, [ECX].table_lim
 227        MOV APP_IDT_ram[EBX].table_lim,CX
 228        INC ECX
 229        MOV APP_IDT_ram[EBX].table_linear,EAX
 230        MOV EBX,EAX
 231        ADD EAX,ECX
 232        REP MOVS BYTE PTR ES:[EDI],BYTE PTR DS:[ESI]
 233
 234    ; fixup IDT pointer in GDT
 235        MOV [EDX].bas_0_15+IDT_DESC_OFF,BX
 236        ROR EBX,16
 237        MOV [EDX].bas_16_23+IDT_DESC_OFF,BL
 238        MOV [EDX].bas_24_31+IDT_DESC_OFF,BH
 239
 240    ; load GDTR and IDTR
 241        MOV EBX,RAM_START
 242        DB 66H ; execute a 32 bit LGDT
 243        LGDT APP_GDT_ram[EBX]
 244        DB 66H ; execute a 32 bit LIDT
 245        LIDT APP_IDT_ram[EBX]
 246
 247    ; move the TSS
 248        MOV EDI,EAX
 249        MOV EBX,TSS_INDEX*SIZE(DESC)
 250        MOV ECX,GDT_DESC_OFF ;build linear address for TSS
 251        MOV GS,CX
 252        MOV DH,GS:[EBX].bas_24_31
 253        MOV DL,GS:[EBX].bas_16_23
 254        ROL EDX,16
 255        MOV DX,GS:[EBX].bas_0_15
 256        MOV ESI,EDX
 257        LSL ECX,EBX
 258        INC ECX
 259        MOV EDX,EAX
 260        ADD EAX,ECX
 261        REP MOVS BYTE PTR ES:[EDI],BYTE PTR DS:[ESI]
 262
 263    ; fixup TSS pointer
 264        MOV GS:[EBX].bas_0_15,DX
 265        ROL EDX,16
 266        MOV GS:[EBX].bas_24_31,DH
 267        MOV GS:[EBX].bas_16_23,DL
 268        ROL EDX,16
 269    ;save start of free ram at linear location RAMSTART
 270        MOV free_mem_linear_base+RAM_START,EAX
 271
 272    ;assume no LDT used in the initial task - if necessary,
 273    ;code to move the LDT could be added, and should resemble
 274    ;that used to move the TSS
 275
 276    ; load task register
 277        LTR BX ; No task switch, only descriptor loading
 278    ; See Figure 8-6
 279    ; load minimal set of registers necessary to simulate task
 280    ; switch
 281
 282
 283        MOV AX,[EDX].SS_reg ; start loading registers
 284        MOV EDI,[EDX].ESP_reg
 285        MOV SS,AX
 286        MOV ESP,EDI ; stack now valid
 287        PUSH DWORD PTR [EDX].EFLAGS_reg
 288        PUSH DWORD PTR [EDX].CS_reg
 289        PUSH DWORD PTR [EDX].EIP_reg
 290        MOV AX,[EDX].DS_reg
 291        MOV BX,[EDX].ES_reg
 292        MOV DS,AX ; DS and ES no longer linear memory
 293        MOV ES,BX
 294
 295    ; simulate far jump to initial task
 296        IRETD
 297
 298    STARTUP_CODE ENDS
*** WARNING #377 IN 298, (PASS 2) SEGMENT CONTAINS PRIVILEGED INSTRUCTION(S)
 299
 300    END STARTUP, DS:STARTUP_DATA, SS:STARTUP_DATA
 301
 302

ASSEMBLY COMPLETE, 1 WARNING, NO ERRORS.
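Lines 213 through 218 and 234 through 238 of the listing patch the base-address fields of the GDT and IDT alias descriptors after the tables are copied to RAM. The Python sketch below mirrors that fixup on a byte-array image of a GDT. The byte offsets follow the DESC structure in the listing; the helper names set_descriptor_base and get_descriptor_base, and the sample address, are assumptions made for this illustration.

```python
import struct

DESC_SIZE = 8   # SIZE(DESC) in the listing

def set_descriptor_base(table, index, base):
    """Patch the three base fields of descriptor `index` in a GDT image."""
    off = index * DESC_SIZE
    struct.pack_into("<H", table, off + 2, base & 0xFFFF)   # bas_0_15
    table[off + 4] = (base >> 16) & 0xFF                     # bas_16_23
    table[off + 7] = (base >> 24) & 0xFF                     # bas_24_31

def get_descriptor_base(table, index):
    """Reassemble the 32-bit base from its three fields."""
    off = index * DESC_SIZE
    low16 = struct.unpack_from("<H", table, off + 2)[0]
    return low16 | (table[off + 4] << 16) | (table[off + 7] << 24)

gdt_image = bytearray(3 * DESC_SIZE)            # null, GDT alias, IDT alias
set_descriptor_base(gdt_image, 1, 0x00012345)   # sample RAM address of the GDT
```

The limit, access, and granularity bytes are left untouched, which is also how the listing behaves: only the base fields change when a table moves.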

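Figure 8-4. Constructing Temporary GDT and Switching to Protected Mode (Lines 162-172 of List File)

[Figure: memory map from 0 to FFFF FFFFH. The EPROM at FFFF 0000H holds START: [CS.BASE+EIP] and performs these steps: jump near start; construct TEMP_GDT; LGDT; move to protected mode. TEMP_GDT sits at address 0 in RAM with entries GDT [0] and GDT [1] (Base=0, Limit=4G), followed by GDT_SCRATCH holding the Base and Limit. DS and ES are loaded with GDT[1] and address 4GB.]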

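Figure 8-5. Moving the GDT, IDT and TSS from ROM to RAM (Lines 196-261 of List File)

[Figure: memory map up to FFFF FFFFH. The ROM holds the TSS, IDT, and GDT. The RAM area starting at RAM_START receives GDT RAM, IDT RAM, and TSS RAM. Steps shown: move the GDT, IDT, TSS from ROM to RAM; fix aliases; LTR.]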

Figure 8-6. Task Switching (Lines 282-296 of List File)

[Figure: RAM at RAM_START holds GDT RAM, IDT RAM, and TSS RAM. The GDT contains the GDT Alias and IDT Alias. The TSS supplies SS, ESP, EFLAGS, CS, EIP, DS, and ES. Steps shown: SS = TSS.SS; ESP = TSS.ESP; PUSH TSS.EFLAG; PUSH TSS.CS; PUSH TSS.EIP; ES = TSS.ES; DS = TSS.DS; IRET.]
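The simulated task switch of Figure 8-6 reads a handful of fields out of the copied TSS and builds the frame that IRETD pops. The Python sketch below models that with a byte-array TSS image. The field offsets are derived from the TASK_STATE structure in STARTUP.ASM; the helper names and all sample values are assumptions made for this illustration.

```python
import struct

# Byte offsets of the fields used by lines 283-293, from TASK_STATE.
TSS_OFFSETS = {"EIP": 32, "EFLAGS": 36, "ESP": 56,
               "ES": 72, "CS": 76, "SS": 80, "DS": 84}
DWORD_FIELDS = ("EIP", "EFLAGS", "ESP")
TSS_SIZE = 104   # total size of TASK_STATE in bytes

def read_tss_field(tss, name):
    """Read one field; selectors are 16 bits, the rest are 32 bits."""
    fmt = "<I" if name in DWORD_FIELDS else "<H"
    return struct.unpack_from(fmt, tss, TSS_OFFSETS[name])[0]

def build_iret_frame(tss):
    """Return the doublewords IRETD pops, in pop order: EIP, CS, EFLAGS."""
    # The listing pushes EFLAGS, then CS, then EIP, so EIP ends up on top.
    return [read_tss_field(tss, "EIP"),
            read_tss_field(tss, "CS"),      # zero-extended, as CS_h is 0
            read_tss_field(tss, "EFLAGS")]

tss_image = bytearray(TSS_SIZE)
struct.pack_into("<I", tss_image, TSS_OFFSETS["EIP"], 0x00001000)
struct.pack_into("<I", tss_image, TSS_OFFSETS["EFLAGS"], 0x00000002)
struct.pack_into("<I", tss_image, TSS_OFFSETS["ESP"], 0x00000800)
struct.pack_into("<H", tss_image, TSS_OFFSETS["CS"], 0x0028)
struct.pack_into("<H", tss_image, TSS_OFFSETS["SS"], 0x0030)
frame = build_iret_frame(tss_image)
```

SS and ESP are loaded directly before the pushes, so the frame is built on the new task's stack; the sketch keeps those two fields readable through read_tss_field for the same reason.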

8.9.3. MAIN.ASM Source Code

The file MAIN.ASM shown in Example 8-2 defines the data and stack segments for this application and can be substituted with the main module task written in a high-level language that is invoked by the IRET instruction executed by STARTUP.ASM.

Example 8-2. MAIN.ASM

NAME main_module
data SEGMENT RW
    dw 1000 dup(?)
DATA ENDS
stack stackseg 800
CODE SEGMENT ER use32 PUBLIC
main_start:
    nop
    nop
    nop
CODE ENDS
END main_start, ds:data, ss:stack

8.9.4. Supporting Files

The batch file shown in Example 8-3 can be used to assemble the source code files STARTUP.ASM and MAIN.ASM and build the final application.

Example 8-3. Batch File to Assemble and Build the Application

ASM386 STARTUP.ASM
ASM386 MAIN.ASM
BLD386 STARTUP.OBJ, MAIN.OBJ buildfile(EPROM.BLD) bootstrap(STARTUP) Bootload

BLD386 performs several operations in this example:

• It allocates physical memory locations to segments and tables.

• It generates tables using the build file and the input files.

• It links object files and resolves references.

• It generates a boot-loadable file to be programmed into the EPROM.

Example 8-4 shows the build file used as an input to BLD386 to perform the above functions.

Example 8-4. Build File

INIT_BLD_EXAMPLE;

SEGMENT
        *SEGMENTS(DPL = 0)
    ,   startup.startup_code(BASE = 0FFFF0000H)
    ;

TASK
        BOOT_TASK(OBJECT = startup, INITIAL, DPL = 0,
                  NOT INTENABLED)
    ,   PROTECTED_MODE_TASK(OBJECT = main_module, DPL = 0,
                  NOT INTENABLED)
    ;

TABLE
    GDT (
        LOCATION = GDT_EPROM
    ,   ENTRY = (
            10: PROTECTED_MODE_TASK
        ,   startup.startup_code
        ,   startup.startup_data
        ,   main_module.data
        ,   main_module.code
        ,   main_module.stack
        )
    ),

    IDT (
        LOCATION = IDT_EPROM
    );

MEMORY
    (
        RESERVE = (0..3FFFH
                -- Area for the GDT, IDT, TSS copied from ROM
            ,   60000H..0FFFEFFFFH)
    ,   RANGE = (ROM_AREA = ROM (0FFFF0000H..0FFFFFFFFH))
                -- Eprom size 64K
    ,   RANGE = (RAM_AREA = RAM (4000H..05FFFFH))
    );

END

Table 8-5 shows the relationship of each build item with an ASM source file.

8.10. P6 FAMILY MICROCODE UPDATE FEATURE

P6 family processors have the capability to correct specific errata through the loading of an Intel-supplied data block. This data block is referred to as a microcode update. This section describes the underlying mechanisms the BIOS needs to provide in order to utilize this feature during system initialization. It also describes a specification that provides for incorporating future releases of the microcode update into a system BIOS.

Intel considers the combination of a particular silicon revision and the microcode update as the equivalent stepping of the processor. Intel does not validate processors without the microcode update loaded. Intel completes a full-stepping level validation and testing for new releases of microcode updates.

A microcode update is used to correct specific errata in the processor. The BIOS, which incorporates an update loader, is responsible for loading the appropriate update on all processors during system initialization (refer to Figure 8-7). There are effectively two steps to this process. The first is to incorporate the necessary microcode updates into the BIOS; the second is to actually load the appropriate microcode update into the processor.

Table 8-5. Relationship Between BLD Item and ASM Source File

Item: Bootstrap
ASM386 and Startup.A58: public startup; startup:
BLD386 Controls and BLD file: bootstrap start(startup)
Effect: Near jump at 0FFFFFFF0H to start

Item: GDT location
ASM386 and Startup.A58: public GDT_EPROM; GDT_EPROM TABLE_REG <>
BLD386 Controls and BLD file: TABLE GDT(location = GDT_EPROM)
Effect: The location of the GDT will be programmed into the GDT_EPROM location

Item: IDT location
ASM386 and Startup.A58: public IDT_EPROM; IDT_EPROM TABLE_REG <>
BLD386 Controls and BLD file: TABLE IDT(location = IDT_EPROM)
Effect: The location of the IDT will be programmed into the IDT_EPROM location

Item: RAM start
ASM386 and Startup.A58: RAM_START equ 400H
BLD386 Controls and BLD file: memory (reserve = (0..3FFFH))
Effect: RAM_START is used as the ram destination for moving the tables. It must be excluded from the application’s segment area.

Item: Location of the application TSS in the GDT
ASM386 and Startup.A58: TSS_INDEX EQU 10
BLD386 Controls and BLD file: TABLE GDT(ENTRY=( 10: PROTECTED_MODE_TASK))
Effect: Put the descriptor of the application TSS in GDT entry 10

Item: EPROM size and location
ASM386 and Startup.A58: size and location of the initialization code
BLD386 Controls and BLD file: SEGMENT startup.code (base= 0FFFF0000H) ... memory (RANGE(ROM_AREA = ROM(x..y))
Effect: Initialization code size must be less than 64K and resides at upper most 64K of the 4GB memory space.

8.10.1. Microcode Update

A microcode update consists of an Intel-supplied binary that contains a descriptive header and data. No executable code resides within the update. This section describes the update and the structure of its data format.

Each microcode update is tailored for a particular stepping of a P6 family processor. It is designed such that a mismatch between a stepping of the processor and the update will result in a failure to load. Thus, a given microcode update is associated with a particular type, family, model, and stepping of the processor as returned by the CPUID instruction. In addition, the intended processor platform type must be determined to properly target the microcode update. The intended processor platform type is determined by reading a model-specific register MSR (17h) (refer to Table 8-6) within the P6 family processor. This is a 64-bit register that may be read using the RDMSR instruction (refer to Section 3.2., “Instruction Reference” Chapter 3, Instruction Set Reference, Volume 1 of the Programmer’s Reference Manual). The three platform ID bits, when read as a binary coded decimal (BCD) number, indicate the bit position in the microcode update header’s Processor Flags field that is associated with the installed processor.

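Figure 8-7. Integrating Processor Specific Updates

[Figure: a new update is added to the update blocks held in the BIOS; the update loader in the BIOS loads the appropriate update block into the P6 family CPU.]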

Register Name: BBL_CR_OVRD
MSR Address: 017h
Access: Read Only

BBL_CR_OVRD is a 64-bit register accessed only when referenced as a Qword through a RDMSR instruction.

The microcode update is a data block that is exactly 2048 bytes in length. The initial 48 bytes of the update contain a header with information used to identify the update. The update header and its reserved fields are interpreted by software based upon the header version. The initial version of the header is 00000001h. An encoding scheme also guards against tampering with the update data and provides a means for determining the authenticity of any given update. Table 8-7 defines each of the fields and Figure 8-8 shows the format of the microcode update data block.

Table 8-6. P6 Family Processor MSR Register Components

Bit Descriptions

63:53 Reserved

52:50 Platform ID bits (RO). The field gives information concerning the intended platform for the processor.
      52 51 50
      0  0  0   Processor Flag 0 (See Processor Flags in Microcode Update Header)
      0  0  1   Processor Flag 1
      0  1  0   Processor Flag 2
      0  1  1   Processor Flag 3
      1  0  0   Processor Flag 4
      1  0  1   Processor Flag 5
      1  1  0   Processor Flag 6
      1  1  1   Processor Flag 7

49:0 Reserved
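The platform check described by Table 8-6 can be sketched in Python as follows. The function names and the sample MSR value are assumptions made for this illustration; on real hardware the 64-bit value would come from an RDMSR of MSR 17h.

```python
def platform_id(msr_17h):
    """Extract the platform ID (bits 52:50) from the 64-bit MSR 17h value."""
    return (msr_17h >> 50) & 0x7

def update_matches_platform(msr_17h, processor_flags):
    """True if the Processor Flags bit selected by the platform ID is set."""
    return bool(processor_flags & (1 << platform_id(msr_17h)))

# Hypothetical RDMSR result with platform ID bits 52:50 = 011b.
sample_msr = 0b011 << 50

matches_flag3 = update_matches_platform(sample_msr, 0x08)   # Processor Flag 3 set
matches_flag0 = update_matches_platform(sample_msr, 0x01)   # only Processor Flag 0 set
```

A platform ID of 3 selects Processor Flag 3, so an update whose Processor Flags field is 08h targets this platform and one with 01h does not.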

Table 8-7. Microcode Update Encoding Format

Field Name, Offset (in bytes), Length (in bytes), Description

Header Version 0 4 Version number of the update header.

Update Revision 4 4 Unique version number for the update, the basis for the update signature provided by the processor to indicate the current update functioning within the processor. Used by the BIOS to authenticate the update and verify that it is loaded successfully by the processor. The value in this field cannot be used for processor stepping identification alone.

Date 8 4 Date of the update creation in binary format: mmddyyyy (e.g. 07/18/98 is 07181998h).

Processor 12 4 Processor type, family, model, and stepping of processor that requires this particular update revision (e.g., 00000650h). Each microcode update is designed specifically for a given processor type, family, model, and stepping of processor. The BIOS uses the Processor field in conjunction with the CPUID instruction to determine whether or not an update is appropriate to load on a processor. The information encoded within this field exactly corresponds to the bit representations returned by the CPUID instruction.

Checksum 16 4 Checksum of update data and header. Used to verify the integrity of the update header and data. Checksum is correct when the summation of the 512 double words of the update results in the value zero.

Loader Revision 20 4 Version number of the loader program needed to correctly load this update. The initial version is 00000001h.

Processor Flags 24 4 Platform type information is encoded in the lower 8 bits of this 4-byte field. Each bit represents a particular platform type for a given CPUID. The BIOS uses the Processor Flags field in conjunction with the platform ID bits in MSR (17h) to determine whether or not an update is appropriate to load on a processor.

Reserved 28 20 Reserved Fields for future expansion.

Update Data 48 2000 Update data.
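The header fields of Table 8-7 and the checksum rule can be exercised with a short Python sketch. It assumes the fields are stored in little-endian byte order (the usual Intel Architecture convention) and that the checksum is a 32-bit sum with carries discarded. The function names are hypothetical, and the sample block is synthetic: it is not a real microcode update and does not model the encoding of the update data.

```python
import struct

UPDATE_SIZE = 2048
HEADER_FORMAT = "<7I"   # seven 32-bit fields at offsets 0 through 27

def parse_header(update):
    """Decode the header fields of Table 8-7 from an update image."""
    (header_ver, revision, date, processor, checksum,
     loader_rev, flags) = struct.unpack_from(HEADER_FORMAT, update, 0)
    return {
        "header_version": header_ver,
        "update_revision": revision,
        # The date reads as mmddyyyy when printed in hexadecimal.
        "date": "%02x/%02x/%04x" % (date >> 24, (date >> 16) & 0xFF, date & 0xFFFF),
        "proc_type": (processor >> 12) & 0x3,
        "family": (processor >> 8) & 0xF,
        "model": (processor >> 4) & 0xF,
        "stepping": processor & 0xF,
        "checksum": checksum,
        "loader_revision": loader_rev,
        "processor_flags": flags & 0xFF,
    }

def checksum_ok(update):
    """True if the 512 doublewords of the update sum to zero (mod 2**32)."""
    if len(update) != UPDATE_SIZE:
        return False
    return (sum(struct.unpack("<512I", update)) & 0xFFFFFFFF) == 0

# Synthetic block for illustration only.
sample = bytearray(UPDATE_SIZE)
struct.pack_into(HEADER_FORMAT, sample, 0,
                 0x00000001, 0x00000020, 0x07181998, 0x00000650, 0, 0x00000001, 0x01)
partial = sum(struct.unpack("<512I", sample)) & 0xFFFFFFFF
struct.pack_into("<I", sample, 16, (-partial) & 0xFFFFFFFF)   # fill in Checksum
header = parse_header(sample)
```

For the sample Processor value 00000650h the decoded fields are family 6, model 5, stepping 0, which a BIOS would compare against the signature returned by CPUID before attempting the load.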

8.10.2. Microcode Update Loader

This section describes the update loader used to load a microcode update into a P6 family processor. It also discusses the requirements placed upon the BIOS to ensure proper loading of an update.

The update loader contains the minimal instructions needed to load an update. The specific instruction sequence that is required to load an update is dependent upon the loader revision field contained within the update header. The revision of the update loader is expected to change very infrequently, potentially only when new processor models are introduced.

Figure 8-8. Format of the Microcode Update Data Block

[Figure: the update block drawn as 32-bit rows under a bit ruler (32, 24, 16, 8, 0). Rows shown: Update Data (2000 Bytes); Reserved (20 Bytes); Processor Flags (Reserved: 24; P7 through P1: 1 bit each); Loader Revision; Checksum; Processor (Reserved: 18; ProcType: 2; Family: 4; Model: 4; Stepping: 4); Date (Month: 8; Day: 8; Year: 16); Update Revision; Header Revision.]
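
The bit widths that Figure 8-8 gives for the Processor field (Stepping: 4, Model: 4, Family: 4, ProcType: 2, counted from bit 0) can be turned into a small decoder, together with the applicability test the field descriptions call for. This is a sketch under stated assumptions: cpuid_signature stands for the signature value returned by the CPUID instruction, platform_id is a number from 0 to 7 that the caller has already derived from the platform ID bits in MSR 17h, and bit n of Processor Flags is assumed to correspond to platform ID n. The function and type names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of the 4-byte Processor value, using the widths in Figure 8-8. */
struct proc_signature {
    unsigned type;     /* ProcType, 2 bits */
    unsigned family;   /* Family,   4 bits */
    unsigned model;    /* Model,    4 bits */
    unsigned stepping; /* Stepping, 4 bits */
};

struct proc_signature decode_signature(uint32_t processor)
{
    struct proc_signature s;

    s.stepping = processor & 0xFu;
    s.model    = (processor >> 4) & 0xFu;
    s.family   = (processor >> 8) & 0xFu;
    s.type     = (processor >> 12) & 0x3u;
    return s;
}

/*
 * Returns 1 when the update header matches this processor and platform.
 * Assumption: bit n of the low 8 bits of Processor Flags stands for
 * platform ID n, and platform_id was obtained by the caller from MSR 17h.
 */
int update_applies(uint32_t hdr_processor, uint32_t hdr_flags,
                   uint32_t cpuid_signature, unsigned platform_id)
{
    assert(platform_id < 8);
    if (hdr_processor != cpuid_signature)
        return 0; /* the Processor field must match CPUID exactly */
    return (int)(((hdr_flags & 0xFFu) >> platform_id) & 1u);
}
```

For the example value 00000650h quoted in the field description, decode_signature yields family 6, model 5, stepping 0.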


The code below represents the update loader with a loader revision of 00000001h:

    mov  ecx,79h            ; MSR to write in ECX
    xor  eax,eax            ; clear EAX
    xor  ebx,ebx            ; clear EBX
    mov  ax,cs              ; Segment of microcode update
    shl  eax,4              ; Segment * 16
    mov  bx,offset Update   ; Offset of microcode update
    add  eax,ebx            ; Linear Address of Update in EAX
    add  eax,48d            ; Offset of the Update Data within the Update
    xor  edx,edx            ; Zero in EDX
    WRMSR                   ; microcode update trigger

8.10.2.1. UPDATE LOADING PROCEDURE

The simple loader previously described assumes that Update is the address of a microcode update (header and data) embedded within the code segment of the BIOS. It also assumes that the processor is operating in real mode. The data may reside anywhere in memory that is accessible by the processor within its current operating mode (real, protected).

Before the BIOS executes the microcode update trigger (WRMSR) instruction, the following must be true:

• EAX contains the linear address of the start of the update data

• EDX contains zero

• ECX contains 79h
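
The three register conditions, and the real-mode address arithmetic the revision 00000001h loader uses to satisfy them, can be modelled in C for checking purposes. This is only a sketch: it computes the values the registers should hold and does not execute WRMSR. The struct and function names are hypothetical; the constants 79h and 48 come from the loader listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UPDATE_TRIGGER_MSR 0x79u /* value required in ECX */
#define UPDATE_DATA_OFFSET 48u   /* update data follows the 48-byte header */

struct trigger_regs {
    uint32_t eax; /* linear address of the start of the update data */
    uint32_t ecx; /* MSR number */
    uint32_t edx; /* must be zero */
};

/* Real-mode linear address: segment * 16 + offset. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* Register values the loader leaves in place just before WRMSR. */
struct trigger_regs loader_rev1_regs(uint16_t cs, uint16_t update_offset)
{
    struct trigger_regs r;

    r.ecx = UPDATE_TRIGGER_MSR;
    r.eax = real_mode_linear(cs, update_offset) + UPDATE_DATA_OFFSET;
    r.edx = 0;
    return r;
}

/* Returns 1 when all three preconditions hold for the given data address. */
int trigger_regs_ok(const struct trigger_regs *r, uint32_t update_data_linear)
{
    assert(r != NULL);
    return r->eax == update_data_linear &&
           r->edx == 0 &&
           r->ecx == UPDATE_TRIGGER_MSR;
}
```

For example, with CS = F000h and Update at offset 1000h, the update starts at linear address F1000h and EAX must hold F1030h.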

Other requirements to keep in mind are:

• The microcode update must be loaded to the processor early on in the POST, and always prior to the initialization of the P6 family processor's L2 cache controller.

• If the update is loaded while the processor is in real mode, then the update data may not cross a segment boundary.

• If the update is loaded while the processor is in real mode, then the update data may not exceed a segment limit.

• If paging is enabled, pages that are currently present must map the update data.

• The microcode update data does not require any particular byte or word boundary alignment.
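
The two real-mode requirements above amount to a range check on the 2000 bytes of update data. The sketch below assumes the default real-mode segment limit of FFFFh (64 KBytes) or smaller and takes the 16-bit offset of the update data within its segment; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define UPDATE_DATA_BYTES   2000u
#define REAL_MODE_SEG_LIMIT 0xFFFFu /* assumption: default real-mode limit */

/*
 * Returns 1 when the update data, starting at data_offset within its
 * segment, ends at or below seg_limit, so that it neither crosses the
 * segment boundary nor exceeds the segment limit.
 */
int update_data_fits_segment(uint16_t data_offset, uint32_t seg_limit)
{
    uint32_t last_byte;

    assert(seg_limit <= REAL_MODE_SEG_LIMIT); /* sketch models real mode only */
    last_byte = (uint32_t)data_offset + UPDATE_DATA_BYTES - 1u;
    return last_byte <= seg_limit;
}
```

A BIOS build script or a debug assertion could use such a check to catch an update image that was placed too close to the end of its segment.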

8.10.2.2. HARD RESETS IN UPDATE LOADING

The effects of a loaded update are cleared from the processor upon a hard reset. Therefore, each time a hard reset is asserted during the BIOS POST, the update must be reloaded on all processors that observed the reset. The effects of a loaded update are, however, maintained across a processor INIT. There are no side effects caused by loading an update into a processor multiple times.
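
The reset behavior can be summarized as a small state model. This is purely illustrative bookkeeping, assuming the BIOS tracks one flag per processor; it does not touch hardware, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum cpu_event { CPU_HARD_RESET, CPU_INIT };

struct cpu_model {
    int update_loaded; /* 1 while the effects of an update are present */
};

/* A hard reset clears the update; an INIT leaves it in place. */
void cpu_observe(struct cpu_model *cpu, enum cpu_event ev)
{
    assert(cpu != NULL);
    if (ev == CPU_HARD_RESET)
        cpu->update_loaded = 0;
}

/* Loading is safe to repeat: loading twice leaves the same state. */
void cpu_load_update(struct cpu_model *cpu)
{
    assert(cpu != NULL);
    cpu->update_loaded = 1;
}

/* Reload on every processor that lost its update; returns how many. */
size_t reload_updates(struct cpu_model *cpus, size_t count)
{
    size_t reloaded = 0;

    assert(cpus != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (!cpus[i].update_loaded) {
            cpu_load_update(&cpus[i]);
            reloaded++;
        }
    }
    return reloaded;
}
```

Because repeated loads have no side effects, a simpler policy of reloading on every processor after any hard reset is equally valid.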
