At the end of October 2010 I started thinking more seriously about the possibility of building something like Ksplice (Rebootless Kernel Updates) for Microsoft Windows systems.
As far as I remember, Ksplice requires the original Linux kernel source to create object files. Two kernel builds are created: a pre-patch build (without the patches applied) and a post-patch build (with the patches applied). Once this step is done, Ksplice compares the pre and post object files in order to determine which functions have changed.
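The pre/post comparison can be sketched in a few lines. This is only a minimal illustration and not Ksplice's actual implementation: it assumes each build has already been reduced to a mapping from function name to that function's compiled bytes. The `changed_functions` helper and the sample data are hypothetical.

```python
def changed_functions(pre, post):
    """Return names whose compiled bytes differ between the two builds.

    `pre` and `post` map function name -> bytes of that function's code.
    Functions that exist in only one build are reported as well.
    """
    changed = set()
    for name in pre.keys() | post.keys():
        if pre.get(name) != post.get(name):
            changed.add(name)
    return sorted(changed)


# Hypothetical sample data standing in for two kernel builds.
pre_build = {"sys_open": b"\x55\x89\xe5\xc3", "sys_read": b"\x55\x31\xc0\xc3"}
post_build = {"sys_open": b"\x55\x89\xe5\xc3", "sys_read": b"\x55\x31\xdb\xc3"}

print(changed_functions(pre_build, post_build))
```

Real object files also carry relocations, which a byte-for-byte comparison like this one ignores, so treat it as the idea only.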
According to the Ksplice paper, Linux security updates rarely make semantic changes to the kernel's persistent data structures (changes that would require existing instances of kernel data structures to be transformed). This allows Ksplice to work correctly without any additional manual work. The rest of the Ksplice techniques are unimportant for the sake of this story.
As you have probably heard, Microsoft Windows is not an open source system (I swear I can hear the Chinese government laughing right now). This means we can't follow the Ksplice methods here, and this makes things much more complicated (oh, believe me). There are also other limitations, like PatchGuard on x86-64, but let's focus on the x86-32 architecture for now. Even though I doubted from the start that this project would work, I decided to go on, silly me!
First things first. A disassembler is a must. I guess IDA is the standard in the entire security community, but somehow I wanted to create my own, independent one. Most people consider this a waste of time, and they are probably right. I had written an x86-32 disassembler library before; my DISIT was already used in numerous projects, mostly my own. But yeah, it is just a library that knows nothing about the Portable Executable format, control flow and other important stuff. Anyway, in brief, I created my own mini IDA (without a GUI and things). My PE disassembler was aimed specifically at Microsoft's Portable Executable files. Luckily for me, they almost always come with symbols (too bad the symbols don't contain information about all the function entry points :-() and they do not use self-modifying code. Self-modifying code complicates things and requires manual interaction, which would make the entire project pretty useless. Anyway, building a disassembler is surely a bumpy road, and the data vs. code problem is often hell on earth. But somehow I managed to create a reliable and pretty stable engine with a small number of false positives in code recognition. The disassembler results are pretty good (IMHO), at least when it comes to speed (you can see them on the AutoDiff website). I must add that viewing a 500 MB disassembly output as a text log file is also a pain in the ass, especially because Notepad++ starts to panic with large files (thus I recommend TextPad to everyone who likes viewing big things).
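To give a feel for the data vs. code problem, here is a minimal recursive-traversal sketch. It is not how my PE disassembler works internally. It runs on a made-up toy instruction set (the opcode values are loosely borrowed from x86, but every branch takes a one-byte relative operand, which is an assumption for illustration), and `find_code` is a hypothetical helper. Only bytes reachable from a known entry point are marked as code.

```python
# Toy instruction set (an assumption for illustration, not real x86):
#   0x90        NOP          1 byte
#   0xC3        RET          1 byte, ends the current path
#   0xEB rel8   JMP target   2 bytes, unconditional
#   0x74 rel8   JZ target    2 bytes, conditional (also falls through)
#   0xE8 rel8   CALL target  2 bytes, also falls through

def find_code(image, entry_points):
    """Return the set of offsets reached as instruction bytes."""
    code = set()
    work = list(entry_points)
    while work:
        pc = work.pop()
        while 0 <= pc < len(image) and pc not in code:
            op = image[pc]
            if op == 0x90:
                code.add(pc)
                pc += 1
            elif op == 0xC3:
                code.add(pc)
                break
            elif op in (0xEB, 0x74, 0xE8) and pc + 1 < len(image):
                code.update((pc, pc + 1))
                operand = image[pc + 1]
                rel = operand - 256 if operand > 127 else operand
                work.append(pc + 2 + rel)  # follow the branch target later
                if op == 0xEB:
                    break  # unconditional jump: no fall-through
                pc += 2
            else:
                break  # unknown byte: stop rather than guess
    return code


# Offsets 6..8 are data that happens to look like NOP, NOP, RET.
image = bytes([0x74, 0x02, 0x90, 0xC3, 0xEB, 0x03, 0x90, 0x90, 0xC3, 0xC3])
code = find_code(image, [0])
data = sorted(set(range(len(image))) - code)
print(sorted(code), data)
```

A linear sweep over the same bytes would happily decode offsets 6 to 8 as instructions. The flip side is that code reached only through an entry point nobody told you about is never visited, which is why missing function entry points in the symbols hurt.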
The second step was to create the binary diffing engine. There are numerous papers on this topic. For me, the old paper that came out of Microsoft Research in 1999 was the most informative. Nice paper; I was seriously considering tattooing MSR-TR-99-83 on my testicles after reading it. Anyway, after some time I had created my own binary differ. It is still not perfect and requires more testing, but it turned out to be pretty accurate, and that was enough for my purposes. I added some extra functions for finding variables across the diffed files, but that turned out not to be enough to complete this project.
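One common idea in binary diffing is to pair functions by name first and then pair the leftovers by a hash of their instruction stream. The sketch below shows only that idea; it is neither the algorithm from the paper nor my differ. It assumes each function is already disassembled into `(mnemonic, operands)` tuples, and `fingerprint`, `match_functions` and the sample functions are all hypothetical.

```python
import hashlib


def fingerprint(instructions):
    """Hash only the mnemonics so that shifted addresses do not matter."""
    text = ";".join(mnemonic for mnemonic, _operands in instructions)
    return hashlib.sha1(text.encode()).hexdigest()


def unique_by_print(table):
    """Map fingerprint -> name, keeping only unambiguous fingerprints."""
    seen = {}
    for name, fp in table.items():
        seen.setdefault(fp, []).append(name)
    return {fp: names[0] for fp, names in seen.items() if len(names) == 1}


def match_functions(old, new):
    """Pair functions of `old` with functions of `new`.

    Both arguments map function name -> list of (mnemonic, operands).
    Returns (pairs, unmatched_old, unmatched_new).
    """
    # First pass: identical names (for example, from symbols).
    pairs = {name: name for name in old if name in new}
    old_left = {n: fingerprint(old[n]) for n in old if n not in pairs}
    new_left = {n: fingerprint(new[n]) for n in new if n not in pairs}

    # Second pass: fingerprints that are unique on both sides.
    new_unique = unique_by_print(new_left)
    for fp, old_name in unique_by_print(old_left).items():
        if fp in new_unique:
            pairs[old_name] = new_unique[fp]

    matched_new = set(pairs.values())
    unmatched_old = sorted(n for n in old_left if n not in pairs)
    unmatched_new = sorted(n for n in new_left if n not in matched_new)
    return pairs, unmatched_old, unmatched_new


# Made-up sample: one named function and helpers that moved or were added.
old = {
    "open_file": [("push", "ebp"), ("call", "0x401000"), ("ret", "")],
    "sub_401000": [("xor", "eax, eax"), ("ret", "")],
}
new = {
    "open_file": [("push", "ebp"), ("call", "0x401200"), ("ret", "")],
    "sub_401200": [("xor", "eax, eax"), ("ret", "")],
    "sub_401300": [("push", "ebp"), ("pop", "ebp"), ("ret", "")],
}
pairs, unmatched_old, unmatched_new = match_functions(old, new)
print(pairs, unmatched_old, unmatched_new)
```

Ignoring operands makes the match survive relocated addresses, at the cost of treating functions that differ only in constants as identical, so a real differ needs more passes than this.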
So with the binary differ done I could proceed with my research. Any binary differ would have been suitable; I just thought I would learn more by writing my own. As I have already mentioned, Ksplice assumes that most security patches do not make semantic changes to data structures. When it comes to Windows security patches, this assumption is wrong. According to my tests, Microsoft's patched files often include additional global variables. This makes the rebootless updates idea impossible to implement on Windows without manual interaction. In other words, a program cannot work out on its own how an added variable should be initialized, when it should be initialized, and so on. In most cases a wrongly initialized variable will cause problems with the program's behavior. And this is just the tip of the iceberg. In my case this was the dead end for the project. I tried asking around, but not surprisingly no one had a solution to this issue. I actually doubt there is one. The standard binary rewriting process is not really challenging, maybe because I have done it numerous times before, but it is useless without the rest of the needed answers.
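Detecting that a patch touches global data is the easy part, and it can be sketched as below. The sketch assumes the global variables of both builds have already been recovered as a mapping from name to size in bytes; `data_layout_changes`, `needs_manual_review` and the sample variables are hypothetical. It only flags the problem. It says nothing about how or when a new variable should be initialized, which is exactly the part with no automatic answer.

```python
def data_layout_changes(pre_globals, post_globals):
    """Compare the global variables of two builds.

    Both arguments map variable name -> size in bytes.
    Returns (added, removed, resized) as sorted lists of names.
    """
    added = sorted(post_globals.keys() - pre_globals.keys())
    removed = sorted(pre_globals.keys() - post_globals.keys())
    resized = sorted(
        name
        for name in pre_globals.keys() & post_globals.keys()
        if pre_globals[name] != post_globals[name]
    )
    return added, removed, resized


def needs_manual_review(pre_globals, post_globals):
    """True if the patch changes global data in any way."""
    return any(data_layout_changes(pre_globals, post_globals))


# Made-up sample: the patched build gains one global variable.
pre_globals = {"g_table": 64, "g_flags": 4}
post_globals = {"g_table": 64, "g_flags": 4, "g_new_lock": 8}

print(data_layout_changes(pre_globals, post_globals))
print(needs_manual_review(pre_globals, post_globals))
```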
Anyway, in order to make some parts of this project usable I wrote an SQL exporting module, and so the AutoDiff project was born. You can read more about it here. I hope I will have enough free time to keep this thing alive. I have no plans to release the source code. If I change my mind I will let you all know. So that's all folks.
"A life, Jimmy, you know what that is? It's the shit that happens while you're waiting for moments that never come."
(P.S. thanks to spendersan for the fast edit)