Thursday, November 12, 2009

Performance of Heap32Next on Windows 7 is linear with heap entry count

Heap32Next is a Windows kernel function used by OpenSSL's rand_win to generate random data via traversing a heap list.

You'd expect such a function to have O(1) performance, wouldn't you?  Nope - not on Win7, at least.  The performance of Heap32Next on any heap entry in heap list is dependent upon the summed number of heap entries in all heap lists.

Here is time taken plotted against the number of heap objects:

You can see that there are jumps which happen to coincide with nice large powers of 2.  Here is the plot of the power-of-two intercepts (represented by the reference lines above), with a linear model fit applied:

Heap32Next Time (ms) = 0.635122*Thousands of Objects on Heap + -56.2174
R-squared: 0.999779

Here's one relevant stack trace:
ntdll.dll!_RtlpWalkHeap@12()  + 0x3f bytes   
ntdll.dll!_RtlpQueryExtendedInformationHeap@16()  + 0x4f5 bytes   
ntdll.dll!_RtlpQueryExtendedInformationAllHeaps@12()  + 0xe5 bytes   
ntdll.dll!_RtlpQueryExtendedHeapInformation@12()  + 0xe7 bytes   
ntdll.dll!_RtlQueryHeapInformation@20()  + 0x1bc76 bytes   
ntdll.dll!_RtlQueryProcessHeapInformation@4()  + 0x288 bytes   
ntdll.dll!_RtlQueryProcessDebugInformation@12()  + 0x11492 bytes   
kernel32.dll!_Heap32Next@4()  + 0x4d bytes   
randwin.exe!RAND_poll()  Line 231 + 0x9 bytes    C++

randwin.exe is my test harness, and sampling indicates that the vast majority of time is spent in RtlpWalkHeap.

Edit: this bug has been filed with Microsoft and OpenSSL.

No comments:

Post a Comment