LOW-LEVEL STUFF

LOCKED AND LOADED (January 2007)

Last year, I found a curious bug in Windows regarding the handling of certain invalid opcode sequences. At the time, I simply documented it and then forgot about it. Recently, however, I was reminded of the bug, so I thought that other people might be interested in reading about it.

Because of the way in which the Intel x86 architecture works, when an invalid opcode exception occurs, there is no easy way to tell why it occurred. By this, I mean that without actually looking at the faulting opcode sequence, it's not possible to tell the difference between an unsupported opcode and an invalid use of the LOCK prefix. For this reason, Windows runs this code:

mov ecx, 4 ;maximum prefix count
look_op:
mov al, byte ptr es:[esi] ;points to faulting opcode sequence
cmp al, 0f0h ;looks like LOCK?
je op_lock ;yes
add esi, 1 ;no, continue with next byte
loop look_op ;until no more bytes
mov eax, 0c000001dh ;STATUS_ILLEGAL_INSTRUCTION
ret
op_lock:
mov eax, 0c000001eh ;STATUS_INVALID_LOCK_SEQUENCE
ret

Only three classes of prefix can appear alongside LOCK in an otherwise valid lock sequence (segment override, operand-size override, and address-size override), because no current CPU instruction allows REP to be combined with LOCK. Counting LOCK itself, that makes at most four prefix bytes, which is the reason for the value of 4 in ecx. The bug is that Windows checks only for the LOCK byte and never verifies that the bytes it skips are prefixes at all. Thus, if the value "f0" happens to appear anywhere within the first four bytes of the faulting opcode sequence, even if it is not truly a LOCK (e.g. "fe f0", where f0 is the ModR/M byte of an undefined opcode), then Windows will return the wrong exception value: STATUS_INVALID_LOCK_SEQUENCE instead of STATUS_ILLEGAL_INSTRUCTION.
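
To make the misclassification concrete, here is a minimal C sketch of the scan, assuming the faulting bytes have already been copied into a buffer of at least four bytes. The function names classify_like_windows, is_other_prefix, and classify_prefix_aware are hypothetical, invented for this illustration. The second classifier is only one plausible way to tell a real LOCK prefix from a stray f0 byte, by stopping at the first byte that is not a prefix; it is not what Windows does.

```c
#include <stddef.h>
#include <stdint.h>

#define STATUS_ILLEGAL_INSTRUCTION   0xC000001Du
#define STATUS_INVALID_LOCK_SEQUENCE 0xC000001Eu

/* Model of the scan shown above: any f0 byte within the first four
   bytes is reported as a lock error. 'code' must have 4 readable bytes. */
uint32_t classify_like_windows(const uint8_t *code)
{
    for (size_t i = 0; i < 4; i++) {
        if (code[i] == 0xF0)
            return STATUS_INVALID_LOCK_SEQUENCE;
    }
    return STATUS_ILLEGAL_INSTRUCTION;
}

/* Hypothetical helper: the three prefix classes that may accompany LOCK. */
static int is_other_prefix(uint8_t b)
{
    switch (b) {
    case 0x26: case 0x2E: case 0x36: case 0x3E: /* ES, CS, SS, DS overrides */
    case 0x64: case 0x65:                       /* FS, GS overrides */
    case 0x66:                                  /* operand-size override */
    case 0x67:                                  /* address-size override */
        return 1;
    default:
        return 0;
    }
}

/* Hypothetical stricter variant (an assumption, not Windows behaviour):
   f0 counts as LOCK only if every byte before it is itself a prefix. */
uint32_t classify_prefix_aware(const uint8_t *code)
{
    for (size_t i = 0; i < 4; i++) {
        if (code[i] == 0xF0)
            return STATUS_INVALID_LOCK_SEQUENCE;
        if (!is_other_prefix(code[i]))
            break; /* reached the opcode itself; stop scanning */
    }
    return STATUS_ILLEGAL_INSTRUCTION;
}
```

Given the bytes fe f0 00 00, classify_like_windows returns STATUS_INVALID_LOCK_SEQUENCE, while classify_prefix_aware returns STATUS_ILLEGAL_INSTRUCTION because fe is not a prefix. For a sequence such as 66 f0 90 90, where f0 really is a LOCK prefix, both functions agree.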

This is a particular problem for operating-system emulators: because such a condition occurs only rarely, it seems likely that none of them supports this behaviour. Surprise!