Saturday, May 3, 2008

VBFRiSK

I've decided to release a beta version of my Visual Basic FRiSK plugin for IDA.

FRiSK stands for Fully Reversed in-Sequence Krypto... Just kidding.

Anyway this is how to use it:

1. Drop the IDA plugin into the IDA/plugins directory and (re)start IDA.
2. Load a VB executable in IDA as you would normally.
3. Run the plugin.

Hmm, in retrospect that didn't really require explicit steps. Yes, it's meant to be that simple. Once run, and once the plugin has established to its satisfaction that it is indeed dealing with a VB file, it will parse the undocumented VB structures that are scattered throughout the file, marking, naming and making everything it finds in its path. This reveals the following:

- external APIs (i.e. anything not included by msvbvm?0.dll)
- strings (including unicode)
- forms (I have all the code to parse the form attributes as well, but it's not yet built in)
- event handlers (e.g. Form_Load)
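This isn't the plugin's code, just a minimal sketch of one common heuristic for deciding whether a file looks like VB: VB5/6 executables carry a header structure that begins with the ASCII signature "VB5!". The helper name find_vb_header and the test data are made up for illustration, and a real check would want to do more than match four bytes.

```lua
-- Hypothetical helper (not taken from the plugin): returns the 1-based
-- offset of the first "VB5!" signature in a blob of bytes, or nil if absent.
local function find_vb_header(bytes)
    -- the 4th argument (true) asks for a plain-text search, not a Lua pattern
    local pos = bytes:find("VB5!", 1, true)
    return pos
end

-- made-up data standing in for a real executable image
local fake = ("\0"):rep(16) .. "VB5!" .. ("\0"):rep(8)

print(find_vb_header(fake))
print(find_vb_header("MZ\0\0"))
```

To try it on a real file, read the whole thing in binary mode first, e.g. with io.open(path, "rb") followed by :read("*a").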

As you can imagine, this is all really helpful stuff. I hope you find it as useful as I did.

Any troubles at all, email me, and remember this is a debug beta build, not a final release. Any havoc caused to your IDA listing is not my responsibility. I recommend using this only on VB files and only on virgin deadlistings. I've also noticed it may take some time to finish when parsing very large files. This is not my fault, and the best way to handle it is to damn well wait for it to finish. ;)

Enjoy

Sunday, February 24, 2008

Mac OS X + IDA as it should be

[click picture to view]

The Lua script

This Lua script takes a single command-line argument: a file containing the output of otx. It attempts to pull out all the useful information and build a list of IDC commands (written to output.idc) that can be loaded into IDA, decorating the disassembly with the obj-c metadata.

Lua can be downloaded from here: http://www.lua.org

if not arg[1] then
    print("Usage: otx.lua <otx output>")
    return
end

-- create a few string helper functions
--

-- split a string on a Lua pattern, returning the pieces in a table
function string:split(inSplitPattern, outResults)
    if not outResults then
        outResults = { }
    end
    local theStart = 1
    local theSplitStart, theSplitEnd = string.find(self, inSplitPattern, theStart)
    while theSplitStart do
        table.insert(outResults, string.sub(self, theStart, theSplitStart - 1))
        theStart = theSplitEnd + 1
        theSplitStart, theSplitEnd = string.find(self, inSplitPattern, theStart)
    end
    table.insert(outResults, string.sub(self, theStart))
    return outResults
end

-- escape Lua pattern magic characters so the text can be searched for literally
-- (the outer parentheses drop gsub's second return value, the match count,
-- which would otherwise be passed on to string.find as its start position)
function string:strip()
    return (self:gsub("[%^%$%%%(%)%.%[%]%*%+%-%?]", "%%%1"))
end

-- escape backslashes and double quotes for use inside an IDC string literal
function string:for_ida()
    return (self:gsub("[\\\"]", "\\%1"))
end

-- main starts here
--
infile = assert(io.open(arg[1], "r"))
outfile = assert(io.open("output.idc", "w"))

outfile:write("#include \"ida.idc\"\n\nstatic main()\n{\n")

repeat

    -- read in the source, one line at a time
    line = infile:read("*line")

    if line then
        -- split on whitespace
        parts = line:split("%s+")

        if line:find("^[%-%+]") then

            -- we have a method name
            meth_name = line:match("%[([^%]]+)%]")

            -- the method's address is on the following line, so move on to it
            line = infile:read("*line")
            if not line then break end
            parts = line:split("%s+")

            if meth_name and parts[3] then
                outfile:write("\tMakeNameEx(0x" .. parts[3] .. ", \"" .. meth_name:for_ida() .. "\", 0x102);\n")
            end
        end

        if parts[3] and parts[7] then

            -- we have an extra comment
            ea = parts[3]
            temp = parts[7]
            o = line:find(temp:strip())

            -- only emit the comment if its start was actually located
            if o then
                comment = line:sub(o, -1)
                outfile:write("\tMakeComm(0x" .. ea .. ", \"" .. comment:for_ida() .. "\");\n")
            end
        end
    end

until not line

outfile:write("}\n")

io.close(outfile)
io.close(infile)

Building IDC on the fly

IDC is a C-like scripting language that IDA uses. It was originally hacked in as a way to automate certain tasks within IDA.

There are two ways to execute an IDC script. One is to use the 'Quick IDC' (Shift-F2) text box, either copy/pasting the code in or writing it from scratch. The other is to create a *.idc file and execute that (File->IDC File).

As a language, IDC is very poor, and it is laced with numerous inconsistencies and hacks.

To name a few I've come across:

- You cannot use C-style comments from within Quick IDC scripts (though when loading via file, you can)
- The IDC reference (Help->Help Index : Index of IDC functions) is one of the worst language references I've seen. Most elements are either described sparsely or not at all.
- There are no usage examples
- There are no types (everything is a variant or 'auto' variable)
- The IDC reference alone mixes three different naming styles (and names are case-sensitive), e.g.:
  1. loadfile
  2. isCode
  3. DelFunction
- The functionality offered is very basic
- The list goes on...

All is not lost! There exists another way to automate IDA: writing plugins. Plugins are binaries that IDA loads dynamically, and they have much greater control over the data and interface than IDC scripts do. However, the SDK that comes with IDA is sloppy and poorly documented (marginally better than IDC, but only marginally), which makes writing them a lesson in tedium that will sorely test the patience of any seasoned coder.

So what do you do if you need to do simple tasks that IDC isn't up to, but that don't warrant a full-blown plugin? In some circumstances you can use a separate language to build up IDC scripts on the fly. It requires an extra step in the process, but in some cases it can be much better than either writing a plugin or using IDC alone.

It is at these times that I use Lua. I write a Lua script that does the processing and builds an IDC file on the fly, which I then load into IDA manually. I used this exact method to parse an otx output file into an IDA database. Not only did it work well, but I only had to look up two IDC functions, rather than code the whole thing in IDC and spend all my time working around its base inadequacies.
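As a rough idea of what "building IDC on the fly" looks like, here is a minimal sketch that turns a table of addresses and names into the text of an IDC script. The helper names (idc_escape, build_idc) and the sample addresses are invented for illustration; MakeName is a standard IDC function, and I'm assuming the usual #include <idc.idc> header.

```lua
-- escape backslashes and double quotes for an IDC string literal
local function idc_escape(s)
    -- outer parentheses keep only gsub's first return value
    return (s:gsub("[\\\"]", "\\%1"))
end

-- Build the text of an IDC script from a list of
-- { ea = <number>, name = <string> } entries (hypothetical layout).
local function build_idc(entries)
    local out = { "#include <idc.idc>", "", "static main()", "{" }
    for _, e in ipairs(entries) do
        out[#out + 1] = string.format("\tMakeName(0x%X, \"%s\");",
                                      e.ea, idc_escape(e.name))
    end
    out[#out + 1] = "}"
    return table.concat(out, "\n") .. "\n"
end

-- made-up addresses and names, purely for illustration
print(build_idc({
    { ea = 0x401000, name = "Form_Load" },
    { ea = 0x401234, name = "cmdOK_Click" },
}))
```

Write the returned string to a *.idc file and load it via File->IDC File. Collecting the lines in a table and joining them once at the end keeps the string handling cheap even for large listings.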

The code and instructions for using it will follow...

More information please

This is an example of the extra information that otx can pull out of Objective-C binaries:

[image: otx output]

And here is what IDA shows by default:

[image: default IDA output]

As you can see, each tool on its own generates enough information to reverse the target. However, if you combine that information you could probably do it more quickly and easily.

I intend to extract that information from the otx output and insert it into the IDA database. I'll show you how soon.

Saturday, February 23, 2008

IDA and Mac OS X

IDA is the premier disassembler currently available, but it was originally designed to run on Windows, so the Mac OS X version is console-only and consequently crap to use.

When working on OS X binaries, it is best to use IDA for Windows, either on another PC or under virtualization. I would recommend Parallels Desktop, VMware or QEMU.

That's not to say it doesn't support OS X binaries well, because it does. The newest version, v5.2, has made some good progress with OS X binaries, but it still doesn't get it quite right. One good thing it does is recognise Universal Binaries, which you can open directly. It then gives you the option of choosing the binary you wish to disassemble, which is quite handy really.

Another tool for disassembling Mac binaries is 'otool', which gets installed when you install Apple's Xcode Tools. otool can give you plenty of information about a binary, from the libraries it needs to run to a full disassembly listing.

Many Apple programs are written in a language called Objective-C (obj-c from here). In terms of reverse engineering it's not necessary to know the language, but some of its internals work slightly differently from most other languages. Obj-c compiles to native instructions, so it's not interpreted, but it uses an interesting OOP-style system and (ab)uses messaging heavily.

Another interesting obj-c idiosyncrasy is the metadata that is embedded within the binary itself. Since it uses messaging and OOP heavily, it stores quite a lot of metadata within the binary so the messaging system knows what it can transmit and to where. This data, of course, can be understood and ripped out by tools. IDA even contains some structures that are imported by default to handle this metadata.

There is a tool called 'otx' which uses otool to create a disassembly and decorate it with the obj-c metadata. This is very handy for us because it means:

- easier identification of functions
- easier readability 
- less renaming/work

My next post will contain a side-by-side look at the IDA output and the otx output.

Note: IDA does recognise many of the metadata structures, it just doesn't parse and display them in a satisfactory fashion. More work is required to get it to a nice readable level (see next post).

Universal Binaries

Since most Mac programs need to run on both x86 and ppc architectures (which are rather different), you will find that most Mac binaries are distributed as something called a "Universal Binary". Example "file" output:

fileoffsets-macbook:MacOS fo$ file Adium 
Adium: Mach-O universal binary with 2 architectures
Adium (for architecture ppc): Mach-O executable ppc
Adium (for architecture i386): Mach-O executable i386


Essentially a Universal Binary is just a bundle which consists of both the x86 and ppc compiled versions of the binary in question. This is good for Mac owners, because it means that for the most part you don't have to know, or even care which CPU your Mac has as OS X will choose the correct binary to run when you execute the application. It's totally transparent to the user.

What this does mean however is that for every Universal Binary, you will be downloading two separate versions of the binaries and bundled libraries (more on this later).

This is the reason that on some websites, where certain programs are available for download for both Windows and Mac OS X, the OS X download may be as much as double the size of the Windows one.

To extract a single version of the binary, you can use a command-line tool called 'ditto', e.g.:

ditto --arch i386 <source filename> <dest filename>

This will copy the x86 version of the binary out of the Universal Binary, to the <dest filename>.
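To see what a tool has to pick apart here, the sketch below reads the "fat" header that sits at the start of a Universal Binary and lists where each architecture's slice lives. It relies on the layout from Apple's <mach-o/fat.h> (big-endian fields, magic 0xCAFEBABE, 20-byte fat_arch records) and the CPU type values 7 (i386) and 18 (ppc). The function names and the fake header are mine, made up for illustration, and this is not how ditto itself is implemented.

```lua
-- read a 32-bit big-endian unsigned integer at 1-based offset pos
local function be32(s, pos)
    local a, b, c, d = s:byte(pos, pos + 3)
    return ((a * 256 + b) * 256 + c) * 256 + d
end

-- encode a number as 4 big-endian bytes (only used to fake a header below)
local function u32(n)
    return string.char(math.floor(n / 16777216) % 256,
                       math.floor(n / 65536) % 256,
                       math.floor(n / 256) % 256,
                       n % 256)
end

local CPU_NAMES = { [7] = "i386", [18] = "ppc" }

-- Parse a fat header from the leading bytes of a file.
-- Returns a list of { arch, offset, size }, or nil if the magic is wrong.
-- Note: Java class files share this magic, so a real tool needs extra checks.
local function fat_slices(bytes)
    if #bytes < 8 or be32(bytes, 1) ~= 0xCAFEBABE then
        return nil
    end
    local slices = {}
    local count = be32(bytes, 5)
    for i = 0, count - 1 do
        local base = 9 + i * 20              -- each fat_arch record is 20 bytes
        if base + 19 > #bytes then break end -- truncated header
        local cputype = be32(bytes, base)
        slices[#slices + 1] = {
            arch   = CPU_NAMES[cputype] or ("cputype " .. cputype),
            offset = be32(bytes, base + 8),  -- file offset of this slice
            size   = be32(bytes, base + 12), -- length of this slice
        }
    end
    return slices
end

-- made-up header: magic, 2 architectures, then one record per architecture
-- (cputype, cpusubtype, offset, size, align)
local fake = u32(0xCAFEBABE) .. u32(2)
    .. u32(18) .. u32(0) .. u32(4096) .. u32(1000) .. u32(12)
    .. u32(7)  .. u32(3) .. u32(8192) .. u32(2000) .. u32(12)

for _, s in ipairs(fat_slices(fake)) do
    print(s.arch, s.offset, s.size)
end
```

Given the offset and size of a slice, extracting one architecture is just a matter of copying that byte range out to a new file.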

OSX Reverse Engineering

I acquired a MacBook recently, so reversing in OS X has become a sudden necessity. Not interested in the somewhat dated PowerPC CPU cores, I made sure to get an Intel version.

Traditionally, reversing on a Mac was very different from reversing on a PC, but with the introduction of Intel CPUs and the Mach-based OS X kernel (which borrows heavily from FreeBSD), things took a turn for the better for regular PC users.

The learning curve for a traditional PC reverse engineer is much shorter than it once was, which means most experienced reversers should have very little trouble making the transition.

Over the coming weeks I will be detailing the unusual and more interesting aspects of OS X that will aid in reversing.

Yet Another Reverse Engineering Blog

First!