Basic blocks and instructions statistics.

While doing some of my research I often wondered how big a typical basic block is, that is, how many instructions it usually contains. Since I was unable to find any reputable results in this area, I decided to make my own observations...

The following table presents statistics on the number of instructions in a typical basic block (including the average number of instructions per basic block and the maximum number of instructions found in a single basic block). This simple study was driven mostly by curiosity and by some questions I ran into while building my own software. All of the analysed files (over 1500) come from Microsoft and are part of the Microsoft Windows XP operating system.

TABLE: http://piotrbania.com/all/articles/bb_instr_stats.html
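To make the two numbers concrete, here is a minimal sketch of how the average and the maximum number of instructions per basic block could be computed. It is not the engine used for the table: the (address, mnemonic, target) tuple layout, the tiny TERMINATORS set, and the function names split_basic_blocks and block_stats are all assumptions made for illustration. Whether CALL ends a block is a convention, so the sketch exposes it as a flag.

```python
from statistics import mean

# Toy instruction model (an assumption of this sketch, not a real disassembler):
# each instruction is (address, mnemonic, branch_target_or_None).
TERMINATORS = {"jmp", "jz", "jnz", "ret"}  # deliberately incomplete


def split_basic_blocks(instrs, call_ends_block=False):
    """Split a linear instruction list into basic blocks (lists of instructions)."""
    terminators = set(TERMINATORS)
    if call_ends_block:
        terminators.add("call")

    addresses = {addr for addr, _, _ in instrs}

    # Leaders: the first instruction, every in-range jump target, and every
    # instruction that directly follows a terminator.
    leaders = set()
    if instrs:
        leaders.add(instrs[0][0])
    for i, (addr, mnem, target) in enumerate(instrs):
        if mnem in terminators:
            # Call targets are function entries elsewhere; ignore them here.
            if mnem != "call" and target in addresses:
                leaders.add(target)
            if i + 1 < len(instrs):
                leaders.add(instrs[i + 1][0])

    blocks = []
    for ins in instrs:
        if ins[0] in leaders:
            blocks.append([])  # a leader starts a new block
        blocks[-1].append(ins)
    return blocks


def block_stats(blocks):
    """Return block count, average block size, and maximum block size."""
    sizes = [len(b) for b in blocks]
    return {"blocks": len(sizes), "avg": mean(sizes), "max": max(sizes)}


# Hypothetical eight-instruction function used only to exercise the sketch.
sample = [
    (0, "mov", None),
    (1, "call", 100),
    (2, "cmp", None),
    (3, "jz", 6),
    (4, "add", None),
    (5, "jmp", 7),
    (6, "sub", None),
    (7, "ret", None),
]

if __name__ == "__main__":
    print(block_stats(split_basic_blocks(sample)))
    print(block_stats(split_basic_blocks(sample, call_ends_block=True)))
```

Running it on the sample with and without call_ends_block shows how much the block-boundary convention alone can shift both the average and the maximum.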


UPDATE:
I have also prepared the Instructions Frequency Statistics table; you can check it here (the instruction names are a bit rough at the moment):
TABLE: http://piotrbania.com/all/articles/instr_stats.html

4 comments:

Ashutosh said...

Interesting analysis!

A couple of things I was curious about:
1. The max_num_of_instructions_per_basic_block value of 1455 occurs for several DLLs. It is likely some common pattern/code. Can you share what that common basic block is? What offset in, say, advapi32.dll on WinXP SP3?
2. The max of 2881 instructions in a basic block is very surprising. Again, can you share the offset?

Piotr Bania said...

Hey,

I doubt the "max_num_of_instructions_per_basic_block" value is a pattern, although I ran no further experiments in this area.

On the other hand the "Average number of instructions per basic block" should be considered a pattern - most of the books about compiler theory etc. acknowledge 5-7 instructions per basic block as standard.

Regarding your offset questions: unfortunately I wasn't recording the basic block addresses for all of the cases. I was only recording the suspicious ones, and in the case of 2881 instructions I can give you the address (RVA). Here it is:

osk.exe: MaxInstrPerBB: 2881 (bbRVA=0x0000afd7)

Please note that in my engine (the build used for this experiment) CALL instructions do not terminate the basic block. IDA also uses this approach.

best regards,
pb

Ashutosh said...

I did a little IDA work on osk.exe, and that big block starting at 0x0000afd7 is osk!JapaneseKB, which looks like it's setting up a big table for the Japanese keyboard layout (internally called from SwitchToJapaneseKB). That explains such a large basic block.

It is likely that many of the max basic blocks in the other DLLs are doing some kind of table setup. Otherwise it is hard to imagine such large jump-free blocks.

And yes, an average of 5-10 instructions sounds right.

Piotr Bania said...

Yes, I also think the biggest basic blocks are typically created for initialization purposes.

- pb