Excel Formula/Macro in .xlsb?

Excel Formula, or XLM: will it ever stop giving researchers pain?

On Friday I got a new sample in the xlsb file format that supposedly contained malicious code. I had a quick look, and wow, this was different. My first check on VirusTotal (VT) showed that it hadn’t been uploaded there yet. So with nothing to go on, I started looking into the sample.

Structurally it’s a Microsoft Excel 2007+ document, that is, a ZIP container holding the following files:

[Image: list of the files inside the ZIP container]
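Since an xlsb file is a ZIP container, its members can be listed with Python's standard zipfile module. The following is a minimal sketch; the helper names and the two member names in the stand-in container are illustrative only and are not taken from the sample.

```python
import io
import zipfile


def list_members(data: bytes) -> list[str]:
    """Return the member names of a ZIP container held in memory."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.namelist()


def build_demo_container() -> bytes:
    """Build a tiny stand-in ZIP so the sketch runs without a real sample."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # Illustrative member names only; a real workbook has more parts.
        zf.writestr("xl/workbook.bin", b"")
        zf.writestr("xl/macrosheets/sheet1.bin", b"")
    return buf.getvalue()


if __name__ == "__main__":
    for name in list_members(build_demo_container()):
        print(name)
```

For a real sample you would pass the bytes of the file itself to list_members instead of the stand-in container.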

So naturally we look at xl/macrosheets/sheet1.bin, right? OK, first we need to enumerate its records. The start of xl/macrosheets/sheet1.bin looks like this:

[Image: hex dump of the start of xl/macrosheets/sheet1.bin]

So how are the records stored? The answer is in Microsoft’s documentation. To establish the recordId, you read the first byte (0x81). Since the high bit (0x80) is set, another byte belongs to the recordId. Strip that bit for now and we get 0x01. The next byte is 0x01, and as its high bit (0x80) isn’t set, it is the last byte of the recordId; its value is multiplied by 0x80. This means that the recordId is (1*128)+1 = 129, which is BrtBeginSheet. To get the length you do the same: read the next byte (0x00). Its high bit (0x80) is clear, so no further byte follows, and the remaining 7 bits are 0, so the record has no data.

The next record is BrtWsProp with recordId 147 and length 23.

recordId: (0x93 & 0x7F) + (0x01*0x80) = 147 (0x93)
length:   (0x17 & 0x7F) = 23 (0x17)
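The decoding rule above can be sketched in a few lines of Python. The helper names read_varint and read_record_header are illustrative, and the limits used here (at most 2 bytes for the recordId, at most 4 for the length) are an assumption based on the MS-XLSB specification rather than something stated in this article.

```python
def read_varint(buf: bytes, pos: int, max_bytes: int) -> tuple[int, int]:
    """Decode a 7-bits-per-byte integer, low bits first.

    Returns (value, position of the byte after the integer).
    """
    value = 0
    for i in range(max_bytes):
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << (7 * i)  # strip the high bit, shift into place
        if not byte & 0x80:                # high bit clear: last byte
            break
    return value, pos


def read_record_header(buf: bytes, pos: int = 0) -> tuple[int, int, int]:
    """Return (recordId, length, offset of the record payload)."""
    record_id, pos = read_varint(buf, pos, 2)  # assumed limit: 2 bytes
    length, pos = read_varint(buf, pos, 4)     # assumed limit: 4 bytes
    return record_id, length, pos


if __name__ == "__main__":
    print(read_record_header(bytes.fromhex("810100")))  # BrtBeginSheet bytes
    print(read_record_header(bytes.fromhex("930117")))  # BrtWsProp bytes
```

Fed the header bytes of the two records discussed above, it gives recordId 129 with length 0 and recordId 147 with length 23.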

Now you parse all the records and you get a nice list. Unfortunately, while parsing the records of xl/macrosheets/sheet1.bin, I saw nothing weird.
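Enumerating the records is then a loop over the stream: decode a header, slice out the payload, move on. Below is a minimal sketch under the same assumptions about the header encoding (recordId in at most 2 bytes, length in at most 4). The function name iter_records is illustrative, the name table only covers the records mentioned in this article, and there is no bounds checking, so a truncated stream raises IndexError.

```python
from typing import Iterator

# Only the record ids that appear in this article.
RECORD_NAMES = {
    0: "BrtRowHdr",
    8: "BrtFmlaString",
    11: "BrtFmlaError",
    129: "BrtBeginSheet",
    147: "BrtWsProp",
}


def iter_records(buf: bytes) -> Iterator[tuple[int, int, bytes]]:
    """Yield (offset, recordId, payload) for each record in the stream."""
    pos = 0
    while pos < len(buf):
        start = pos
        fields = []
        for max_bytes in (2, 4):  # recordId first, then payload length
            value = 0
            for i in range(max_bytes):
                byte = buf[pos]
                pos += 1
                value |= (byte & 0x7F) << (7 * i)
                if not byte & 0x80:
                    break
            fields.append(value)
        record_id, length = fields
        yield start, record_id, buf[pos:pos + length]
        pos += length


if __name__ == "__main__":
    # Two headers from the article; the 23 payload bytes are zero-filled stand-ins.
    demo = bytes.fromhex("810100") + bytes.fromhex("930117") + bytes(23)
    for offset, record_id, payload in iter_records(demo):
        name = RECORD_NAMES.get(record_id, "unknown")
        print(f"{name} (Id {record_id}, offset {offset:x}), LENGTH: {len(payload)}")
```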

On to the other sheets then. What can we find there? Quite a few record types, actually. The ones you are interested in for now, while we learn, are:

Id   Record          Description
0    BrtRowHdr       Tells you which row you are currently on
8    BrtFmlaString   Tells you about an embedded string and the pcode (parsed expression) to build this string
11   BrtFmlaError    Tells you the pcode (parsed expression)

Let’s have a brief look at the data we need.


Microsoft has documented the BrtFmlaString record well in its PDF. It starts with an 8-byte cell information structure, followed by a variable-length XLWideString (which looks like a Unicode string), 2 bytes of grbitFlags, and then the formula itself (a CellParsedFormula structure).

The first one you’ll find is this:

[Image: hex dump of the first BrtFmlaString record]

which after decoding looks like this:

RECORD: BrtFmlaString (Id 8,offset 58d), LENGTH: 30
col: 26, row: 20 | strlen=1 : "/"
         1E 2F 00            PtgInt: 47
         41 6F 00            PtgFunc: CHAR (111)

The record has no information about the row, so you need to get this from the BrtRowHdr record. When you get to the CellParsedFormula structure, you parse it as described in my previous article.
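The BrtFmlaString layout described above can be sketched as follows. The field sizes come from the text; the finer details are assumptions based on the MS-XLSB specification: the column is the first DWORD of the cell structure, XLWideString is a 4-byte character count followed by UTF-16LE data, and CellParsedFormula starts with a 4-byte token length (cce) followed by the tokens. The class and function names are illustrative, and the demo payload is constructed by hand to match the decoded output shown above, not copied from the sample.

```python
import struct
from dataclasses import dataclass


@dataclass
class FmlaString:
    col: int
    value: str
    flags: int
    rgce: bytes  # raw formula tokens (pcode)


def parse_brt_fmla_string(payload: bytes) -> FmlaString:
    col = struct.unpack_from("<I", payload, 0)[0]  # first DWORD of the cell structure
    pos = 8                                        # skip the 8-byte cell structure
    cch = struct.unpack_from("<I", payload, pos)[0]
    pos += 4
    value = payload[pos:pos + 2 * cch].decode("utf-16-le")
    pos += 2 * cch
    flags = struct.unpack_from("<H", payload, pos)[0]
    pos += 2
    cce = struct.unpack_from("<I", payload, pos)[0]  # length of the token stream
    pos += 4
    return FmlaString(col, value, flags, payload[pos:pos + cce])


def build_demo_payload() -> bytes:
    """Hand-built payload: column 26, string "/", tokens PtgInt 47 + PtgFunc CHAR."""
    return (
        struct.pack("<II", 26, 0)              # cell: column, style/flags (zeroed)
        + struct.pack("<I", 1) + "/".encode("utf-16-le")
        + struct.pack("<H", 0)                 # grbitFlags (zeroed stand-in)
        + struct.pack("<I", 6) + bytes.fromhex("1E2F00416F00")
        + struct.pack("<I", 0)                 # no extra formula data
    )


if __name__ == "__main__":
    print(parse_brt_fmla_string(build_demo_payload()))
```

Under these assumptions the hand-built payload is 30 bytes long, which is consistent with the LENGTH: 30 reported for the record above.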


The BrtFmlaError record also starts with an 8-byte cell structure, then a one-byte fErr and 2 bytes of grbitFlags, before you get to the formula itself (CellParsedFormula structure).
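A minimal sketch of that layout in Python. As before, the field sizes come from the text, while the assumptions are that the column is the first DWORD of the cell structure and that CellParsedFormula begins with a 4-byte token length. The function name and the demo values (column 5, error code 0x1D) are hypothetical stand-ins.

```python
import struct


def parse_brt_fmla_error(payload: bytes) -> tuple[int, int, int, bytes]:
    """Return (column, fErr, grbitFlags, raw formula tokens)."""
    col = struct.unpack_from("<I", payload, 0)[0]   # first DWORD of the cell structure
    f_err = payload[8]                              # one-byte error code
    flags = struct.unpack_from("<H", payload, 9)[0]
    cce = struct.unpack_from("<I", payload, 11)[0]  # length of the token stream
    rgce = payload[15:15 + cce]
    return col, f_err, flags, rgce


def build_demo_payload(rgce: bytes) -> bytes:
    """Hand-built record body; every value except rgce is a stand-in."""
    return (
        struct.pack("<II", 5, 0)   # cell: column 5, style/flags zeroed
        + bytes([0x1D])            # fErr (hypothetical value)
        + struct.pack("<H", 0)     # grbitFlags
        + struct.pack("<I", len(rgce)) + rgce
        + struct.pack("<I", 0)     # no extra formula data
    )


if __name__ == "__main__":
    print(parse_brt_fmla_error(build_demo_payload(b"\x13")))
```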

[Image: hex dump of a BrtFmlaError record]

When you parse the first record of this stream you’ll get:

RECORD: BrtFmlaError (Id 11,offset 1e3), LENGTH: 62
     49 27 00            PtgMemFunc: 27
     19 40 00 01         PtgAttrSpace: 0100
     23 04 00 00 00      PtgName: index 4
     23 14 00 00 00      PtgName: index 20
     0F                  PtgIsect:
     23 5D 00 00 00      PtgName: index 93
     0F                  PtgIsect:
     23 46 00 00 00      PtgName: index 70
     0F                  PtgIsect:
     23 15 00 00 00      PtgName: index 21
     0F                  PtgIsect:
     23 2F 00 00 00      PtgName: index 47
     0F                  PtgIsect:
     13                  PtgUminus:


BrtRowHdr is just a simple structure, and for now we only want the row.

[Image: hex dump of a BrtRowHdr record]

The first DWORD gives you the row number you need (in this case, 2).
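Because the cell records carry only a column, the row has to be remembered from the most recent BrtRowHdr. A minimal sketch of that bookkeeping, assuming the column is the first DWORD of each cell record; the function name place_cells and the demo records are hypothetical.

```python
import struct

BRT_ROW_HDR = 0
BRT_FMLA_STRING = 8
BRT_FMLA_ERROR = 11


def place_cells(records) -> dict[tuple[int, int], int]:
    """Map (row, column) to recordId for a sequence of (recordId, payload)."""
    cells = {}
    row = None
    for record_id, payload in records:
        if record_id == BRT_ROW_HDR:
            # The first DWORD of BrtRowHdr is the row.
            row = struct.unpack_from("<I", payload, 0)[0]
        elif record_id in (BRT_FMLA_STRING, BRT_FMLA_ERROR):
            # Assumed: the first DWORD of the cell structure is the column.
            col = struct.unpack_from("<I", payload, 0)[0]
            cells[(row, col)] = record_id
    return cells


if __name__ == "__main__":
    demo = [
        (BRT_ROW_HDR, struct.pack("<I", 2)),
        (BRT_FMLA_ERROR, struct.pack("<II", 3, 0)),
        (BRT_ROW_HDR, struct.pack("<I", 20)),
        (BRT_FMLA_STRING, struct.pack("<II", 26, 0)),
    ]
    print(place_cells(demo))
```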

The result

At the end, when you have parsed all these records from all these binary worksheets, you’ll end up with a virtual sheet that looks like this:

[Image: the reconstructed virtual sheet]

That is more informative, but it was a bit of work to get there. At least that is context you can relate to.

As I write this article, I see that VT has indeed received a copy, and when the sample was first checked (on entry) a single engine detected it:

[Image: VirusTotal detection result]

Kudos to Ikarus! I think my weekend project is over. I have a mental challenge when I have a problem: I can’t let it go until it’s solved, and now I can finally relax. Let me know if you need help! I think tools should give this kind of context automagically.
