The crash dump is as follows:
0:005:x86> kb L8
ChildEBP RetAddr  Args to Child
02e2a4fc 6bfc5cfc 00000000 80070057 02e2a52c MSHTML!CMarkup::InitCollections+0xf
02e2a50c 6bfc5dc3 04358798 00000000 00001200 MSHTML!GetAll+0x30
02e2a52c 6c0625b3 04358798 056e0594 02e2a9e4 MSHTML!CElement::getElementsByTagName+0x86
02e2a560 6c04e9bb 04358798 09115b80 0433b708 MSHTML!Method_IDispatchpp_BSTR+0xd2
02e2a5d4 6c05a066 04358798 8001043d 00000001 MSHTML!CBase::ContextInvokeEx+0x5dc
02e2a624 6c05a0a6 04358798 8001043d 00000001 MSHTML!CElement::ContextInvokeEx+0x9d
02e2a650 6bffb6ae 04358798 8001043d 00000001 MSHTML!CInput::VersionedInvokeEx+0x2d
02e2a6a4 6f9da1bc 0433beb8 8001043d 00000001 MSHTML!PlainInvokeEx+0xeb
The crash occurs in MSHTML!CMarkup::InitCollections on a NULL pointer read: the pointer arrives as arg0 (00000000 in the top frame of the stack trace) and is dereferenced via ebx. The fault is near the start of the function, in its first basic block (+0xf), so the trigger clearly lies upstream of InitCollections rather than in it.
This function appears to close HTML/CSS objects. I’m still debugging it, but the problem appears to be that most of the if blocks are bypassed, which results in this+0x1c being altered and CHtmParse::CloseContainer() being called. If you’re clever, you can find a very old (yet somewhat accurate) version of this function online. As you can see, there are many references to bugs, gotos, and general ugliness.
The bug involves the pointer returned by CHtmParse::FindContainer() plus 0x75 being NULL, and execution then hitting the following basic block:
loc_70B9E176:                           ; This is Step 5
    mov     eax, [edi]
    or      dword ptr [eax+1Ch], 10000h ; <-- the infamous pointer
    push    1
    push    edi
    mov     ecx, esi
    call    CHtmParse::CloseContainer(CTreeNode *,int)
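Read as C++, that basic block boils down to roughly the following. This is only a sketch of my reading of the disassembly: the names Element, TreeNode, Parser, Step5 and DemoStep5 are invented for illustration, I'm assuming the first dword of the node (the value loaded by mov eax, [edi]) is a pointer to some object that carries a flags dword, and the structs here make no attempt to reproduce the real offsets such as +1Ch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for whatever object [edi] points at.
struct Element {
    std::uint32_t flags = 0;     // stands in for the dword at [eax+1Ch]
};

// Hypothetical stand-in for CTreeNode; only the first field matters here.
struct TreeNode {
    Element* element = nullptr;  // loaded by `mov eax, [edi]`
};

// Hypothetical stand-in for CHtmParse (the `this` pointer passed in ecx).
struct Parser {
    TreeNode* lastClosed = nullptr;
    int lastFlag = 0;

    // Stand-in for CHtmParse::CloseContainer(CTreeNode *, int).
    // It only records its arguments so the call can be observed.
    void CloseContainer(TreeNode* node, int flag) {
        lastClosed = node;
        lastFlag = flag;
    }

    // The "Step 5" block. Note there is no NULL check, as in the disassembly.
    void Step5(TreeNode* node) {
        Element* element = node->element;  // mov eax, [edi]
        element->flags |= 0x10000u;        // or dword ptr [eax+1Ch], 10000h
        CloseContainer(node, 1);           // push 1 / push edi / mov ecx, esi / call
    }
};

// Usage: run the block once and check both side effects.
void DemoStep5() {
    Element element;
    TreeNode node;
    node.element = &element;

    Parser parser;
    parser.Step5(&node);

    assert(element.flags == 0x10000u);
    assert(parser.lastClosed == &node);
    assert(parser.lastFlag == 1);
}
```

The two pushes are in right-to-left order, so the call is CloseContainer(node, 1) with the parser object in ecx.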
The green basic blocks indicate areas traversed when the bug triggers:
If you’re Microsoft, I suggest checking the pointer in GetAll() for NULL.
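For what it's worth, the kind of check I have in mind looks something like this. It is a sketch, not MSHTML's code: Markup, GetAllGuarded, DemoGuard and the two result constants are hypothetical stand-ins, and I'm assuming GetAll() hands a markup pointer to InitCollections() without validating it. The constant values simply mirror S_OK and E_INVALIDARG (0x80070057).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result codes; values mirror S_OK and E_INVALIDARG.
constexpr std::uint32_t kOk = 0x00000000u;
constexpr std::uint32_t kInvalidArg = 0x80070057u;

// Hypothetical stand-in for CMarkup.
struct Markup {
    bool collectionsReady = false;

    // Stand-in for CMarkup::InitCollections: it touches `this`,
    // so calling it through a NULL pointer would fault.
    std::uint32_t InitCollections() {
        collectionsReady = true;
        return kOk;
    }
};

// Guarded version of a GetAll-like helper: reject a NULL markup
// up front instead of calling a member function on it.
std::uint32_t GetAllGuarded(Markup* markup) {
    if (markup == nullptr) {
        return kInvalidArg;
    }
    return markup->InitCollections();
}

// Usage: the valid pointer succeeds, the NULL pointer fails cleanly.
void DemoGuard() {
    Markup markup;
    assert(GetAllGuarded(&markup) == kOk);
    assert(markup.collectionsReady);
    assert(GetAllGuarded(nullptr) == kInvalidArg);
}
```

Returning an error from the caller keeps the failure in one place, rather than relying on every callee to survive a NULL this pointer.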