Searching for VME
Of all that is known about the secrets contained in the
Supplement to the Pentium Processor User's Manual ("Appendix
H"), nothing is guarded more closely than details of the
Virtual Mode Extensions (VME) implemented in the Pentium
processor and late-model Intel486s. Even a close reader of the
Pentium manuals might not notice that enhancements to
virtual-8086 mode exist. Yet the
Pentium Processor Family Developer's Manual, Volume 3 is not
silent on the subject. There are at least 27 references to VME in
the Pentium manual. In addition to these references, another good
source of information is Intel's British VME Patent application,
which is publicly available. With a good understanding of
Virtual-8086 mode (v86 mode), one could infer most of the
remaining details of VME solely from those 27 references. All
that's needed to characterize the complete details is an
understanding of v86 mode, a little ingenuity, experimentation,
persistence, and no qualms about hitting the reset button to
restart a frozen computer. (For those with $12,000 to spare, an
In-Circuit-Emulator (ICE) would be helpful too.)
The need for VME
When v86 mode was originally implemented, software writers
found two main uses for it: 1) DOS memory managers, and
2) DOS sessions under a multitasking operating system (MTOS)
(like a DOS box in Windows). Under a memory manager, DOS remains
a sequential single-tasking operating system. Therefore, hardware
resources don't need to be arbitrated;[1]
IOPL-sensitive instructions don't need to be restricted at all.
Running a DOS session under Windows is quite different. Nearly
all resources need to be restricted. The DOS session should not
have direct access to the hardware with the ability to program
any and every device. Nor should the DOS session have the ability
to directly control the interrupt flag (IF) of the EFLAGS
register. Virtual-8086 mode has support for restricting such
access, through the use of IOPL to restrict IOPL-sensitive
instructions which modify IF,[2] and the I/O
permission bit map to restrict access to I/O ports. However,
there are a few shortcomings with the standard v86 mode.
- Setting IOPL to 3 provides better performance than
setting IOPL to less than 3. This setting reduces the
trapping overhead, but lets Virtual DOS Machines (VDMs)
disable interrupts, a potential integrity problem for the
whole system.[3]
- IOPL must be set less than 3
when the OS needs to virtualize the interrupt flag. When
a Virtual Device Driver (VDD) needs to simulate a
hardware interrupt into a VDM, it must be able to detect
when the VDM is interruptible. Therefore IOPL must be
less than 3 so that the interrupt flag can be
virtualized.[3] Since IOPL is less
than 3, performance is significantly degraded by attempts
to execute interrupt flag sensitive (IF-sensitive)
instructions which always fault to the v86 monitor.
- An operating system may allow a VDM to receive real
(external) interrupts, or virtual interrupts. This is a
policy decision made by the OS implementers. If a v86
task only receives virtual interrupts, then it can be
demand-paged, whereby it is swapped out to disk when real
memory is needed for other purposes. If the v86 task
receives real (external) interrupts, then it cannot be
demand-paged, since the interrupt handler may be paged
out when the interrupt occurs. The OS may not have enough
time to bring in the entire task before another interrupt
occurs; likewise it would be too complicated and still
too time consuming to bring in just the portion of the
task which contains the interrupt service routine.[4]
- All INT-n instructions cause a switch out of v86 mode.
When IOPL<3, an INT-n faults to the v86 monitor. When
IOPL=3, the INT-n attempts to invoke the protected mode
interrupt handler associated with that particular
interrupt (success depends on the DPL of the gate being
used). The monitor or interrupt handler must either
emulate the interrupt's functionality, or restart the v86
task in such a way that it executes the interrupt itself (this
is known as reflecting the interrupt). DOS kernel
routines are accessed through software interrupts.
Therefore, thousands of interrupt calls generate
thousands of transitions in and out of v86 mode. This
puts the v86 task at a substantial performance disadvantage
compared with the same program running under DOS.
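The trapping rules behind these shortcomings can be summarized in a few lines of C. This is only a sketch of standard (non-VME) v86 behavior as described above; the enum and function names are illustrative assumptions, not taken from any Intel document or real monitor.

```c
#include <assert.h>

/* Instruction classes named in the text. The enum is a hypothetical
   convenience for this sketch. */
typedef enum {
    INSN_CLI, INSN_STI, INSN_PUSHF, INSN_POPF, INSN_INT_N, INSN_IRET,
    INSN_OTHER              /* anything that is not IOPL-sensitive in v86 mode */
} v86_insn;

typedef enum {
    OUT_EXECUTES_IN_TASK,       /* runs without leaving v86 mode */
    OUT_GP_FAULT_TO_MONITOR,    /* #GP(0), handled by the v86 monitor */
    OUT_PROTECTED_MODE_GATE     /* vectors through the IDT gate for INT-n */
} v86_outcome;

/* What happens to an instruction in standard v86 mode (no VME). */
v86_outcome classic_v86_outcome(v86_insn insn, unsigned iopl)
{
    assert(iopl <= 3);
    if (insn == INSN_OTHER)
        return OUT_EXECUTES_IN_TASK;
    if (iopl < 3)
        return OUT_GP_FAULT_TO_MONITOR;   /* every sensitive instruction traps */
    if (insn == INSN_INT_N)
        return OUT_PROTECTED_MODE_GATE;   /* still leaves v86 mode at IOPL=3 */
    return OUT_EXECUTES_IN_TASK;          /* CLI/STI etc. touch the real IF */
}
```

Note that INT-n leaves v86 mode in both branches, which is the cost the last bullet describes.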
VME fixes v86 problems
Enhanced v86 mode was designed to eliminate many of these
problems, and significantly enhance the performance of v86 tasks
running at all IOPL levels. When running in Enhanced virtual-8086
mode (Ev86) at IOPL=3, CLI and STI still modify IF. This behavior
hasn't changed. Running at IOPL<3 has changed. CLI, STI, and
all other IF-sensitive instructions no longer unconditionally
fault to the Ev86 monitor. Instead, IF-sensitive instructions
clear and set a virtual version of the interrupt flag in the
EFLAGS register called VIF.[5] Clearing VIF does not block external
interrupts, as clearing IF does. Rather, VIF is an indicator of
the interruptibility state of the Ev86 task. Thus, the operating
system is invulnerable to a
bug in a DOS program which inadvertently attempts to disable
interrupts and spin inside of a loop, as VIF will be cleared
instead, while IF will remain set. This new behavior has two
substantial benefits. First, performance is increased, as CLI and
STI don't cause time-consuming faults. Second, the complexity of
the monitor program is reduced, as it doesn't have to maintain
its own virtual interrupt flag. When the old-style v86 monitor
virtualized IF, it needed to emulate all changes to IF caused by
IF-sensitive instructions (CLI, STI, PUSHF, POPF, INT, and IRET).
Using Ev86 mode eliminates this complexity because the CPU
automatically virtualizes IF; performance increases because
IF-sensitive instructions don't fault to the Ev86 monitor.
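A small C model may make the CLI/STI behavior concrete. It is a sketch that assumes CR4.VME=1 and a task already in v86 mode; the function names are hypothetical. The bit positions (IF = bit 9, IOPL = bits 12-13, VIF = bit 19, VIP = bit 20) come from Intel's EFLAGS layout rather than from this article, and the STI fault on VIP=1 follows endnote 5.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EFLAGS_IF   0x00000200u   /* bit 9 */
#define EFLAGS_VIF  0x00080000u   /* bit 19 */
#define EFLAGS_VIP  0x00100000u   /* bit 20 */

static unsigned eflags_iopl(uint32_t eflags) { return (eflags >> 12) & 3u; }

/* Model of CLI in Ev86 mode. Returns true when the instruction completes. */
bool ev86_cli(uint32_t *eflags)
{
    assert(eflags != NULL);
    if (eflags_iopl(*eflags) == 3)
        *eflags &= ~EFLAGS_IF;    /* unchanged behavior: the real IF is cleared */
    else
        *eflags &= ~EFLAGS_VIF;   /* IOPL<3: only the virtual flag changes */
    return true;
}

/* Model of STI in Ev86 mode. Returns false when it would raise #GP(0). */
bool ev86_sti(uint32_t *eflags)
{
    assert(eflags != NULL);
    if (eflags_iopl(*eflags) == 3) {
        *eflags |= EFLAGS_IF;
        return true;
    }
    if (*eflags & EFLAGS_VIP)
        return false;             /* a virtual interrupt is pending: fault */
    *eflags |= EFLAGS_VIF;
    return true;
}
```

With IOPL<3 the real IF never changes, so a DOS program that executes CLI and spins cannot lock out the host operating system.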
When external interrupts are generated, such as timer ticks
and keyboard strokes, the host operating system running at CPL-0
always intercepts these interrupts. When an interrupt occurs,
the v86 task may not be the current task, it may be swapped out
to disk, or it may be in an uninterruptible state. When this
occurs, the host OS must delay sending the interrupt to the v86
task until it is running, and ready to accept interrupts. Other
interrupts may be intended for a specific VDM, but not all VDMs
(like keystrokes). In this case, the v86 monitor needs to send a
specific interrupt to a specific VDM -- ignoring all other VDMs.
Delaying and filtering interrupts in this manner is known as interrupt
virtualization. Once the VDM with a virtual interrupt pending
becomes interruptible, the OS reflects the interrupt to the VDM
as if a real interrupt had occurred.
Prior to Ev86 mode, the v86 monitor needed to maintain a
virtual interrupt flag in software. The v86 monitor was forced to
handle many exceptions which were unnecessary. For example, when
a virtual interrupt was pending, further IOPL-sensitive
instructions which attempted to clear IF caused undesired faults,
which then caused the monitor to redundantly clear the virtual
interrupt flag. This problem doesn't exist in Ev86 mode. These
instructions which redundantly attempt to clear IF don't fault to
the monitor. Therefore the source code which exists to clear the
software virtual interrupt flag can be removed. In fact, while
using Ev86 mode, all of the code needed to maintain the software
virtual interrupt flag can be removed -- as the virtual interrupt
flag is maintained by the CPU itself.
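For comparison, here is a rough sketch of the kind of code a pre-VME monitor needs in its general protection fault handler, and which Ev86 mode makes unnecessary. The structure and names are hypothetical; only the opcode values (FAh for CLI, FBh for STI) are standard x86 encodings. A real monitor would also have to emulate PUSHF, POPF, INT-n, and IRET.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Per-VDM state kept in software by an old-style v86 monitor (assumed layout). */
typedef struct {
    bool     virtual_if;    /* software copy of the task's interrupt flag */
    bool     irq_pending;   /* a virtual interrupt is waiting for delivery */
    uint16_t ip;            /* instruction pointer of the faulting instruction */
} vdm_state;

typedef enum { MON_RESUME, MON_DELIVER_IRQ, MON_UNHANDLED } mon_action;

/* Emulate CLI/STI after a #GP(0) from a v86 task running with IOPL<3. */
mon_action monitor_gp_handler(vdm_state *vdm, uint8_t opcode)
{
    assert(vdm != NULL);
    switch (opcode) {
    case 0xFA:                          /* CLI */
        vdm->virtual_if = false;        /* redundant if it was already clear */
        vdm->ip = (uint16_t)(vdm->ip + 1);
        return MON_RESUME;
    case 0xFB:                          /* STI */
        vdm->virtual_if = true;
        vdm->ip = (uint16_t)(vdm->ip + 1);
        if (vdm->irq_pending) {
            vdm->irq_pending = false;
            return MON_DELIVER_IRQ;     /* task just became interruptible */
        }
        return MON_RESUME;
    default:
        return MON_UNHANDLED;           /* other instructions not modeled here */
    }
}
```

Every pass through this handler costs a fault and a return to v86 mode, even when the CLI changes nothing.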
Prior to Ev86 mode, software interrupts (INT-n instructions)
always caused a switch out of v86 mode. When IOPL=3, the transition
occurs through a gate associated with the interrupt;[6] when
IOPL<3, the transition occurs as the result of a general
protection fault to the monitor. When IOPL=3,
the monitor needs to determine whether or not the cause of the
interrupt is a software interrupt, external interrupt or
CPU-generated exception. When IOPL<3, software interrupts
don't transition through their associated gates in the IDT (they
transition through the #GP gate). In the case of software
interrupts, the monitor must interpret the opcode to determine
which interrupt number needs servicing. The monitor must then
emulate the interrupt, or reflect the interrupt back to the v86
task. External interrupts and CPU-generated exceptions still
transition through their associated gate in the IDT. For these
cases, the monitor still needs to determine the source of the
interrupt (external or CPU-exception), and take the appropriate
action. Using Ev86 mode can simplify this process, and enhance
the performance of handling software interrupts.
Software interrupt execution is controlled by a new structure
in the TSS called the interrupt redirection bit map (IR bit map).
Each bit in this new structure controls whether or not a specific
software interrupt will be invoked in a manner compatible with
the Intel386, or be invoked purely within the Ev86 task. In Ev86
mode, these interrupts may be invoked and executed without ever
leaving the Ev86 task. Using this new technique would reduce
complexity in the monitor. Interrupts which would normally fault
to the monitor, no longer would. Interrupts which would
transition through the IDT no longer would.
Overview of VME Components
VME support is enabled and disabled by setting and clearing
the VME bit in CR4 (bit 0). When enabled and running at IOPL=3,
all INT-n instructions are controlled by the interrupt
redirection bit map in the TSS.[7] When
running at IOPL<3, in addition to the INT-n behavior,
IF-sensitive instructions are allowed to execute without faulting
to the Ev86 monitor.
The TSS has been extended to include a 32-byte interrupt
redirection bit map. 32 bytes is exactly 256 bits, one bit for
each software interrupt which can be invoked via the INT-n
instruction. This bit map resides immediately below the I/O
permission bit map (see Figure 1). The
definition of the I/O Base field in the TSS is therefore extended
and dual purpose. Not only does the I/O Base field point to the
base of the I/O permission bit map, but also to the end (tail) of
the interrupt redirection bit map. This structure behaves exactly
like the I/O permission bit map, except that it controls software
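interrupts. When its corresponding bit is set, a software
interrupt is handled as in standard v86 mode: it faults to the
Ev86 monitor when IOPL<3, or goes through its protected mode
gate when IOPL=3. When its bit is clear, the Ev86 task
will service the interrupt without ever leaving Ev86 mode.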
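Locating a bit in the interrupt redirection bit map could be sketched as follows. The TSS is treated as a raw byte array. The offset of the I/O Base field (66h in the 32-bit TSS) comes from Intel's TSS layout rather than from this article, and the function names are hypothetical. Bit numbering within each byte is assumed to match the I/O permission bit map (interrupt n lives in byte n/8, bit n mod 8).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TSS_IO_BASE_OFFSET 0x66u   /* 16-bit I/O Base field in a 32-bit TSS */
#define IR_BITMAP_BYTES    32u     /* 256 bits, one per software interrupt */

/* Read the little-endian I/O Base field from a TSS image. */
static uint16_t tss_io_base(const uint8_t *tss)
{
    return (uint16_t)(tss[TSS_IO_BASE_OFFSET] |
                      (tss[TSS_IO_BASE_OFFSET + 1u] << 8));
}

/* True when the redirection bit for `vector` is set, meaning the
   interrupt is handled as in standard v86 mode. */
bool ir_bit_is_set(const uint8_t *tss, uint8_t vector)
{
    assert(tss != NULL);
    uint16_t io_base = tss_io_base(tss);
    assert(io_base >= IR_BITMAP_BYTES);              /* map must fit below it */
    size_t map = (size_t)io_base - IR_BITMAP_BYTES;  /* 32 bytes below I/O map */
    return ((tss[map + vector / 8u] >> (vector % 8u)) & 1u) != 0;
}
```

A monitor that wants DOS to service INT 21h inside the task would leave bit 21h clear, and set the bits for any interrupts it still wants to intercept.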
Figure 1 -- Interrupt redirection bit map in the TSS
VIF and VIP EFLAGS Bits
Two new flags were added to the EFLAGS register. These flags
are intended for use when the IOPL of the Ev86 task is less than
3 (see sidebar Caveats
Of VME (When CR4.VME=1)). They can only be purposely modified
by the CPL-0 Ev86 monitor or an interrupt service routine.
VIF is a virtualized version of the standard interrupt flag
(IF). While the Ev86 task is running, CLI and STI instructions
do not modify the actual IF; instead, these instructions modify
VIF.[5] This fact is completely hidden from
the Ev86 task, as PUSHF, POPF, INT-n, and IRET have also been
modified to help hide this behavior.
The VIP flag is a Virtual Interrupt Pending flag. VIP can
assist the multitasking operating system in sending a virtual
interrupt to the Ev86 task. The easiest way to understand VIP is
to explain its use in the context of a program running on an
8086. When the 8086 is in an uninterruptible state, external
interrupts remain pending but don't get serviced. After IF is set
(because of STI, POPF, or IRET), the pending interrupt is
serviced by the CPU. VIF and VIP are intended to serve this same
purpose to the MTOS running an Ev86 task. Assume an Ev86
task is at the same uninterruptible point as in the previous 8086
example. A timer-tick interrupt occurs, and the MTOS services the
interrupt. During the interrupt service routine, the MTOS decides
that the Ev86 task needs to service this timer tick, and sets
VIP. After returning, the Ev86 task is still in an
uninterruptible state (VIF=0). At some later time, the Ev86 task
attempts to set IF (STI, POPF, or IRET). When this happens, the
Ev86 task becomes interruptible, and a general protection fault
to the monitor immediately occurs (#GP(0)).[8]
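From the MTOS side, the timer-tick scenario above might look like the following sketch. Both function names are hypothetical. Delivering at once when VIF is already set is an assumed policy, and so is clearing VIP in the fault handler; the article only states that the MTOS sets VIP and later receives the fault. The EFLAGS value passed in stands for the task's saved EFLAGS image, which CPL-0 code is allowed to modify.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EFLAGS_VIF  0x00080000u   /* bit 19 */
#define EFLAGS_VIP  0x00100000u   /* bit 20 */

typedef enum { VIRQ_DELIVER_NOW, VIRQ_DEFERRED } virq_result;

/* Called by the MTOS when it decides the Ev86 task must see an interrupt. */
virq_result mtos_post_virtual_irq(uint32_t *task_eflags)
{
    assert(task_eflags != NULL);
    if (*task_eflags & EFLAGS_VIF)
        return VIRQ_DELIVER_NOW;      /* task is interruptible (assumed policy) */
    *task_eflags |= EFLAGS_VIP;       /* remember it until the task enables IF */
    return VIRQ_DEFERRED;
}

/* Called from the #GP(0) handler. Returns true when the fault was caused
   by a pending virtual interrupt that should now be reflected to the task. */
bool mtos_on_gp_fault(uint32_t *task_eflags)
{
    assert(task_eflags != NULL);
    if (!(*task_eflags & EFLAGS_VIP))
        return false;                 /* some other cause of #GP */
    *task_eflags &= ~EFLAGS_VIP;      /* the pending interrupt is consumed */
    return true;
}
```

The monitor takes one fault per deferred interrupt, at the moment the task becomes interruptible, rather than one fault per CLI or STI.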
IF-sensitive instructions
To support the new VIF and VIP flags, changes were needed to
the instructions which read and write the interrupt flag of the
EFLAGS register. CLI, STI, PUSHF, POPF, INT-n, and IRET all had
to be changed to support Ev86 mode.
When an Ev86 task is running at IOPL<3, CLI and STI clear
and set the VIF flag, instead of faulting to the Ev86 monitor, or
affecting the IF flag.[5]
PUSHF copies the contents of the VIF flag to the IF position
as it pushes the FLAGS image onto the stack. This gives the
appearance to the Ev86 task that STI and CLI are really setting
and clearing IF. This appearance is necessary in case the
software attempts to check for this condition. A code
sequence that tests the interrupt flag is shown in Listing 1.
In addition to moving the VIF to the
IF on the stack image, PUSHF always pushes an IOPL image of 3
onto the stack. It is important to remember that the Pentium's
IF-sensitive instructions behave identically to the Intel486 when
IOPL=3, even when CR4.VME=1. Therefore, PUSHF simulates an IOPL=3
to any software wishing to read the stack image to determine its
IOPL. The actual IOPL of the Ev86 task never changes during this
process.
Listing 1 -- Code demonstrating how software tests for the IF flag
        STI                 ; Enable interrupts
        PUSHF               ; Store FLAGS on stack
        POP     AX          ; Pop the FLAGS image into AX
        TEST    AX,200h     ; Interrupt flag set?
        Jcc     label       ; Jump on condition
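The FLAGS image that PUSHF builds, as described above, can be modeled in C. This is a sketch under the assumption that CR4.VME=1 and the task is in v86 mode; the function name is hypothetical and the bit positions are those of Intel's EFLAGS layout.

```c
#include <assert.h>
#include <stdint.h>

#define EFLAGS_IF         0x00000200u   /* bit 9, the 200h tested in Listing 1 */
#define EFLAGS_IOPL_MASK  0x00003000u   /* bits 12-13 */
#define EFLAGS_VIF        0x00080000u   /* bit 19 */

/* Build the 16-bit FLAGS image that PUSHF stores on the Ev86 task's stack. */
uint16_t ev86_pushf_image(uint32_t eflags)
{
    uint32_t image = eflags & 0xFFFFu;

    if ((eflags & EFLAGS_IOPL_MASK) != EFLAGS_IOPL_MASK) {
        /* IOPL<3: show VIF in the IF position and pretend IOPL is 3. */
        image &= ~EFLAGS_IF;
        if (eflags & EFLAGS_VIF)
            image |= EFLAGS_IF;
        image |= EFLAGS_IOPL_MASK;
    }
    /* In both cases the task sees IOPL=3 in the image. */
    assert((image & EFLAGS_IOPL_MASK) == EFLAGS_IOPL_MASK);
    return (uint16_t)image;
}
```

The real EFLAGS value is passed by value and never modified, mirroring the statement that the actual IOPL does not change.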
POPF works similarly to PUSHF by copying the bit in the IF
position to the VIF flag as it pops the FLAGS image from the stack.
The Pentium is careful to make sure that the faked IF and
IOPL aren't accidentally copied into the real IF and IOPL during
the POPF operation. Before the FLAGS image is merged into the
EFLAGS register, the IF image is copied to the VIF slot, and the
IF and IOPL images are cleared. All of the actual FLAGS register
bits are cleared, except the actual IF and IOPL. Finally, the filtered
FLAGS image is merged with the actual EFLAGS register. A
side-effect of POPF is its handling of the TF in the stack image.
If the TF on the stack image is set, then POPF causes a general
protection fault before any FLAGS values are modified (#GP(0)).
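The POPF filtering steps can be sketched the same way. The model below assumes CR4.VME=1, v86 mode, and IOPL<3, and it ignores reserved FLAGS bits. The function name is hypothetical. The check against VIP reflects the earlier statement that an attempt to set IF while a virtual interrupt is pending faults to the monitor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EFLAGS_TF         0x00000100u   /* bit 8 */
#define EFLAGS_IF         0x00000200u   /* bit 9 */
#define EFLAGS_IOPL_MASK  0x00003000u   /* bits 12-13 */
#define EFLAGS_VIF        0x00080000u   /* bit 19 */
#define EFLAGS_VIP        0x00100000u   /* bit 20 */

/* Model of a 16-bit POPF in Ev86 mode at IOPL<3.
   Returns false, leaving EFLAGS untouched, when #GP(0) would be raised. */
bool ev86_popf(uint32_t *eflags, uint16_t image)
{
    assert(eflags != NULL);
    assert((*eflags & EFLAGS_IOPL_MASK) != EFLAGS_IOPL_MASK);  /* IOPL<3 only */

    if (image & EFLAGS_TF)
        return false;                       /* TF set in the stack image */
    if ((image & EFLAGS_IF) && (*eflags & EFLAGS_VIP))
        return false;                       /* enabling IF with VIP pending */

    const uint32_t keep = EFLAGS_IF | EFLAGS_IOPL_MASK;
    uint32_t low  = ((uint32_t)image & ~keep) | (*eflags & keep);
    uint32_t next = (*eflags & 0xFFFF0000u) | low;

    if (image & EFLAGS_IF)
        next |= EFLAGS_VIF;                 /* IF image lands in VIF */
    else
        next &= ~EFLAGS_VIF;

    *eflags = next;
    return true;
}
```

The `keep` mask is the filtering step: the IF and IOPL bits of the image are dropped, and the real ones are carried over unchanged.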
The IRET instruction behaves exactly as the POPF instruction
does with respect to IF, VIF and IOPL. IRET and POPF differ in
how they handle the trap flag from the stack image. If TF is set
in the FLAGS stack image during POPF, a #GP(0) occurs, yet for
IRET the #GP does not occur.
The INT-n instruction is the most complicated of the
IOPL-sensitive instructions. INT-n behaves exactly like PUSHF in
how it handles IF, VIF, and IOPL.[9] However,
one of the enhancements to Ev86 mode is the ability of the Ev86
task to execute software interrupts without leaving Ev86 mode.
This enhancement has been accomplished with the aid of the
interrupt redirection bit map in the TSS. When the corresponding
IR bit is set, the interrupt will be invoked in exactly the same
manner as a normal v86 task. When the corresponding bit is clear,
the interrupt is invoked as if it were executing on an 8086
processor. In other words, a fault to the monitor is never
generated, nor a transition to the protected mode interrupt
handler. The interrupt transition and return are done entirely
within the Ev86 task. The influence of the IR bit map is best
described by the pseudo-code in Listing 2.
Listing 2 -- Interrupt handling description in Ev86 mode
    N = INTERRUPT_NUMBER;
    INTERRUPT_BIT_MAP_PTR = TSS_BASE->IO_PERMISSION_BASE - 32;
    IF INTERRUPT_BIT_MAP_PTR->BIT_NUMBER[N]
        IF (IOPL<3)
            #GP(0);
        ELSE
            GOTO INT-FROM-V86-MODE;
    ELSE
        INVOKE_REAL_MODE_STYLE_INTERRUPT_FROM_Ev86_TASK(N);
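The same decision can be written as a small C function, which may be easier to test than pseudo-code. This is a sketch only: the enum and function names are hypothetical, and the caller is assumed to pass a pointer to the 32-byte interrupt redirection bit map, located 32 bytes below the I/O permission bit map in the TSS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    INT_GP_FAULT_TO_MONITOR,       /* #GP(0) */
    INT_VIA_PROTECTED_MODE_GATE,   /* INT-FROM-V86-MODE: through the IDT */
    INT_REAL_MODE_STYLE_IN_TASK    /* serviced without leaving Ev86 mode */
} int_dispatch;

/* Decide how INT n is handled in Ev86 mode (CR4.VME=1). */
int_dispatch ev86_int_n(const uint8_t *ir_bitmap, uint8_t n, unsigned iopl)
{
    assert(ir_bitmap != NULL);
    assert(iopl <= 3);

    bool bit_set = ((ir_bitmap[n / 8u] >> (n % 8u)) & 1u) != 0;

    if (bit_set)
        return (iopl < 3) ? INT_GP_FAULT_TO_MONITOR
                          : INT_VIA_PROTECTED_MODE_GATE;
    return INT_REAL_MODE_STYLE_IN_TASK;
}
```

IOPL matters only when the bit is set; with the bit clear the interrupt stays inside the task at any IOPL, which matches endnote 7.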
Conclusions
The virtual mode extensions are very useful to memory managers
and multitasking operating systems. Memory managers can primarily
benefit from the use of the interrupt redirection bit map to reduce
the number of switches to and from protected mode. This has the
added benefit of reducing the complexity of the interrupt service
routines, as they no longer have to reflect software interrupts
back to the v86 task.
Multitasking operating systems can benefit in many ways. The
MTOS benefits from interrupt redirection, and from the virtual
interrupt support. The MTOS would run with virtual mode
extensions enabled, and the Ev86 tasks running at IOPL<3. This
gives the MTOS full benefit of the virtualization of interrupts.
When the MTOS wishes to send a virtual interrupt (like a virtual
timer-tick) to an uninterruptible Ev86 task, it will do so by
setting VIP=1. When the task becomes interruptible, a general
protection fault occurs, and the MTOS will send the virtual
interrupt to the Ev86 task. This would give programs which are
timer-dependent (such as games) a significant performance
advantage. As an added benefit of using the virtualization
features of the CPU, even more complexity of the Ev86 monitor can
be removed. The result of using these new features is an Ev86
monitor that is simpler to implement and maintain than its
non-Ev86 counterpart, and software which runs faster.
Endnotes
1. Except I/O ports subject to I/O protection from the I/O
permission bit map (like DMA ports).
2. The I/O port instructions which are IOPL-sensitive in
protected mode are not IOPL-sensitive in v86 mode.
3. The Design of OS/2 by H.M. Deitel & M.S. Kogan; Chapter 10
"Compatibility - 80386 DOS Compatibility" Section 10.4.1.
4. The Design of OS/2 by H.M. Deitel & M.S. Kogan; Chapter 10
"Compatibility - 80386 DOS Compatibility" Section 10.4.
5. STI will fault when EFLAGS.VIP=1.
6. Assuming that all other protection attributes will permit
the transition to occur.
7. Interrupt redirection is not conditional upon the IOPL
setting.
8. See sidebar Caveats Of VME (When CR4.VME=1).
9. Provided this interrupt is redirected from the Ev86 monitor
to the Ev86 task (subject to the Interrupt Redirection bit map).
Source code examples 1:
The following examples are available for viewing and download.
The first eight examples demonstrate how to reflect an
interrupt back to the (E)v86 task in various processor
environments.
File         VME   IOPL   IGATE DPL   INTR BITMAP
VME_1.ASM    0     3      0           X
VME_2.ASM    0     2      X           X
VME_3.ASM    0     3      3           X
VME_4.ASM    1     3      0           1
VME_5.ASM    1     2      X           1
VME_6.ASM    1     3      3           1
VME_7.ASM    1     3      X           0
VME_8.ASM    1     2      X           0
View source code for VME_1.ASM:
ftp://ftp.x86.org/source/vme1/vme_1.asm
ftp://ftp.x86.org/source/vme1/struct.inc
ftp://ftp.x86.org/source/vme1/macros.inc
ftp://ftp.x86.org/source/vme1/sysseg.inc
ftp://ftp.x86.org/source/vme1/dataseg.inc
View source code for VME_2.ASM:
ftp://ftp.x86.org/source/vme1/vme_2.asm
View source code for VME_3.ASM:
ftp://ftp.x86.org/source/vme1/vme_3.asm
View source code for VME_4.ASM:
ftp://ftp.x86.org/source/vme1/vme_4.asm
View source code for VME_5.ASM:
ftp://ftp.x86.org/source/vme1/vme_5.asm
View source code for VME_6.ASM:
ftp://ftp.x86.org/source/vme1/vme_6.asm
View source code for VME_7.ASM:
ftp://ftp.x86.org/source/vme1/vme_7.asm
View source code for VME_8.ASM:
ftp://ftp.x86.org/source/vme1/vme_8.asm
Download all eight examples and executables:
ftp://ftp.x86.org/dloads/VME1.ZIP
Source code examples 2:
The second source code example demonstrates the handling of
VIF in Ev86 mode. This was written by Jim Brooks and serves as an
excellent programming example. There are too many files included
to provide links to each and every one of them, so download the
entire source code archive, or view individual files at:
ftp://ftp.x86.org/source/v86mon1