Numerous data formats specify a CRC32 (32-bit Cyclic Redundancy Check) checksum over some part of the data to detect accidental corruption. The PNG (Portable Network Graphics) file format is one example. Besides documented formats, undocumented formats can also contain CRC32 checksums of some kind.

Among other constructs, these formats contain a block of data and a field holding the CRC32 value of that block.

One of the recognizers implemented in BinCovery finds CRC32 value-block pairs.

The algorithm for finding CRC32 value-block pairs works as follows.

The recognizer starts reading UInt32 values from the binary data at offset 0. After each read, it moves the data pointer towards the end of the window. It adds each UInt32 value to a list.

The recognizer calculates the CRC32 checksum for all continuous blocks within the current window, starting from data offset 0. It then compares each of these checksums with each UInt32 value in the list. A match means a CRC32 value-block pair has been found. The result is saved, and matching resumes to find further pairs.

The recognizer can be configured to look for CRC32 values only at data offsets that are a multiple of four (alignment). It can also be configured to calculate the checksum only for continuous blocks that meet minimum and maximum size criteria. Using these criteria improves processing speed and reduces the number of false positives.
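The steps above can be sketched in a few lines of Python. This is a minimal illustration, not BinCovery's implementation. The function name find_crc32_pairs and its parameters are made up for the example, the whole buffer is treated as a single window, and only little-endian values with the standard zlib polynomial are considered. Skipping value fields that lie inside the candidate block is also an assumption.

```python
import struct
import zlib


def find_crc32_pairs(data, min_size=4, max_size=64, alignment=4):
    """Return (value_offset, block_offset, block_size) for every match."""
    # Collect the candidate UInt32 values (little-endian) at aligned offsets.
    candidates = {}
    for off in range(0, len(data) - 3, alignment):
        value = struct.unpack_from("<I", data, off)[0]
        candidates.setdefault(value, []).append(off)

    pairs = []
    for start in range(len(data)):
        crc = 0
        last = min(len(data), start + max_size)
        for end in range(start + 1, last + 1):
            # Extend the running CRC by one byte instead of recomputing it.
            crc = zlib.crc32(data[end - 1:end], crc)
            if end - start < min_size:
                continue
            for value_off in candidates.get(crc, ()):
                # Assumption: the value field lies outside the block it guards.
                if value_off + 4 <= start or value_off >= end:
                    pairs.append((value_off, start, end - start))
    return pairs
```

The lookup table keyed by value replaces the "match each checksum to each value" loop with a dictionary access. The minimum and maximum size limits bound the inner loop, which is where most of the time goes.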

When the scan is finished in the current window, the window is moved forward to continue processing the rest of the data.

The recognizer can be configured to find little-endian and big-endian CRC32 values. It can also perform the CRC32 calculation with various polynomials.
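To illustrate what these two options mean, here is a small sketch. The helper names crc32_reflected and read_uint32_both are hypothetical. The CRC is the bitwise, reflected variant with the initial value and final XOR fixed at 0xFFFFFFFF, which is an assumption; other CRC32 variants differ in those parameters too. The polynomial is passed in its reversed form: 0xEDB88320 is the one used by zlib and PNG, 0x82F63B78 is CRC-32C (Castagnoli).

```python
import struct


def crc32_reflected(data, poly=0xEDB88320):
    """Bitwise reflected CRC-32; poly is the reversed polynomial."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit; fold in the polynomial if a 1 fell out.
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF


def read_uint32_both(data, offset):
    """Return the UInt32 at offset as (little_endian, big_endian)."""
    return (struct.unpack_from("<I", data, offset)[0],
            struct.unpack_from(">I", data, offset)[0])
```

A recognizer built on these helpers would compare a block's checksum against both interpretations of each stored value, once per configured polynomial.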

Practical Importance

Reverse engineering and digital forensics. Numerous data formats guard an internal header with a checksum. A header usually carries key information about how to access the data, such as offsets and sizes. When you want to locate this key information, you can look for headers in the data by finding CRC32 value-block pairs.

Data mining. You can run the recognizer on a set of files to separate files with CRC32 blocks from files without such blocks. When necessary, you can set the recognizer to use a particular CRC polynomial. You can specify the location and the minimum and maximum length of the CRC block to find the files with the best matches.

Data compression. The data guarded by a checksum is likely to be different from other parts of the data. Separating different kinds of data allows the best compression method to be applied to each part. This approach can lead to a better overall compression ratio. Additionally, running a pre-processor on certain parts of the data, rather than on the whole data, can also improve the compression ratio.

Fuzz testing. Mutating checksum-guarded data is worthwhile when you also update the checksum field; otherwise the parser is likely to reject the input before it reaches the mutated part. Alternatively, you can exclude the checksum-guarded data from fuzzing and focus on other parts of the data.
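As a rough sketch of the first approach, the function below flips one bit inside a guarded block and then rewrites the checksum field. The name mutate_and_fix and its parameters are invented for the example, and it assumes a little-endian CRC32 with the standard zlib polynomial. The offsets would come from a recognized value-block pair.

```python
import random
import struct
import zlib


def mutate_and_fix(data, block_off, block_size, value_off, rng=None):
    """Flip one random bit inside the guarded block, then rewrite its CRC32."""
    if rng is None:
        rng = random.Random()
    out = bytearray(data)
    # Mutate a single bit somewhere inside the guarded block.
    pos = block_off + rng.randrange(block_size)
    out[pos] ^= 1 << rng.randrange(8)
    # Recompute the checksum so the parser still accepts the block.
    crc = zlib.crc32(bytes(out[block_off:block_off + block_size]))
    struct.pack_into("<I", out, value_off, crc)
    return bytes(out)
```

Passing a seeded random.Random instance makes a fuzzing run reproducible.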




  • The more you know about your data, the better you can compress it
  • The more you know about your data, the better you can examine it
  • The more you know about your data, the better you can mutate it for fuzzing purposes
  • The more you know about your data, the more you know about the software that parses it


Back in 2011, I worked on a tool for the analysis of data formats. At that time, I mentioned it in a blog post: The forensic example.

It wasn’t until this year that I started drafting a more organized concept for that tool.

The tool has three layers of abstraction.

Layer 1

Layer 1 takes binary data as input and runs the recognizers on the data. It currently describes more than 50 recognizers. The main idea for the recognizers is to find blocks with specific characteristics in the data.

The result of the analysis is the list of recognized blocks. The description, offset and size fields together describe a block. Here is an example of a recognized block.

Description: Entropy drops after performing single-channel delta encoding
Offset: 1000
Size: 4000
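To give an idea of what such a description could mean, here is a small sketch. It is not the tool's recognizer. It assumes that "single-channel" means each byte is differenced against the byte right before it, and the function names are made up for the example.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Shannon entropy of a byte string, in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(n / total * math.log2(n / total)
                for n in Counter(data).values())


def delta_encode(data):
    """Replace each byte with its difference from the previous byte (mod 256)."""
    prev = 0
    out = bytearray()
    for byte in data:
        out.append((byte - prev) & 0xFF)
        prev = byte
    return bytes(out)
```

On slowly changing data, such as a ramp of increasing byte values, the entropy of delta_encode(data) is much lower than the entropy of data itself. A recognizer could report the block where that drop occurs.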

None of the recognizers rely on using magic bytes.

Each recognizer is meant to retain backward and forward compatibility. Each recognizer runs independently of the other recognizers.

Layer 2

Layer 2 takes the analysis result of Layer 1. It has the facility to filter – and if required suppress – specific items from the analysis result.

For example, it handles overlapping blocks and eliminates potential false positives.
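One possible way to handle overlapping blocks is sketched below. The Block type mirrors the description, offset and size fields from Layer 1. The policy of keeping the larger block is purely an assumption for illustration, and the names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    description: str
    offset: int
    size: int

    @property
    def end(self):
        return self.offset + self.size


def drop_overlaps(blocks):
    """Keep the larger block whenever two recognized blocks overlap."""
    kept = []
    # Visit larger blocks first so they win over smaller overlapping ones.
    for block in sorted(blocks, key=lambda b: (-b.size, b.offset)):
        if all(block.end <= k.offset or block.offset >= k.end for k in kept):
            kept.append(block)
    return sorted(kept, key=lambda b: b.offset)
```

A real filter would probably weigh the kind of recognizer as well, not only the size of the block.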

Layer 3

Layer 3 implements practical approaches to utilize the analysis result of Layer 2, such as the following.

  • File finder
  • File classifier
  • File comparer
  • Visualizer for the layout of file
  • Generic purpose file mutator for fuzzing purposes
  • Generic purpose lossy but reversible decompressor



The Reversing on Windows blog started back in 2009. Until 2016, the writings were hosted on Blogger. The content has now been exported to a PDF document, which is available for download. The document consists of 72 pages and over 80 posts.

Here is a non-exhaustive list of topics the document covers.

  • About hard-coded pointers
  • Approaches to track data-flow
  • Research involving various x86 instructions
  • Analysis of binary data formats
  • Examples of writing a Pintool to detect software defects
  • Security research on Flash Player, Firefox and Internet Explorer

Since 2017, the blog has been continued on



A couple of weeks ago, I made a demonstration analysis for a variant of Cerber ransomware and documented it.

The following is the table of contents for the document.

Symptoms of compromise
    Ransom notes
    Encrypted files
    Temporary files
Runtime behavior
    Creates mutex
    Weakens system security
    Self elevates to perform administrative tasks
    Searches for files to encrypt
    Encrypts the files
    Displays the ransom note
    Deletes itself
    Window flashes up
Final Notes



The ILDasm tool (also known as the IL Disassembler) is a popular tool for both developers and researchers working with .NET programs. We rely heavily on ILDasm when performing static analysis at the IL binary code level. Therefore, it’s important to use ILDasm efficiently. Besides having it pinned to the taskbar or desktop, we need to get it integrated with the research environment.

Visual Studio Command Prompt

When Visual Studio is installed, various Command Prompt shortcuts are added under the Visual Studio Tools folder.

Visual Studio Tools Command Prompts

When the Visual Studio Command Prompt is started, the directory containing ILDasm is added to the PATH environment variable. This allows launching ILDasm by typing ildasm at the command prompt.

External Tools in Visual Studio

Since we want to disassemble .NET binaries compiled in Visual Studio, it’s worth integrating ILDasm with it.

To add ILDasm as an external tool to Visual Studio, select TOOLS, then select External Tools... and provide the following.

Title: I&L Disassembler
Command: C:\Program Files (x86)\Microsoft SDKs\Windows\v8.1A\bin\NETFX 4.5.1 Tools\ildasm.exe
Arguments: /SOURCE $(TargetDir)$(TargetName)$(TargetExt)
Initial directory: $(TargetDir)

Which looks like this in the GUI.

External Tools in Visual Studio 2013

To run ILDasm on the compiled binary, select TOOLS, then select IL Disassembler. The keyboard shortcut is Alt+T, L.

Note that the /SOURCE argument is passed to ILDasm. It’s there to show the source lines in the disassembly.

SendTo Context Menu

Having ILDasm in the SendTo Context menu can be useful if we want to run it on a file selected in File Explorer.

To add ILDasm to the SendTo Context menu, press Windows Logo+R and type shell:sendto in the Run dialog box. The SendTo folder opens in File Explorer; the shortcut to ILDasm needs to be created there.

The location can be like this.


In certain situations, we might prefer to write the disassembly text to a file. ILDasm supports this through the /OUT argument. Therefore, we can generate the disassembly of a .NET program as a file from the SendTo Context menu.

This is what it looks like having ILDasm in SendTo Context menu.

ILDasm in SendTo Context menu

Batch Disassembly

When we need to disassemble a number of files into text files, we run the following command from batchdasm.bat.

for /r %%i in (*.*) do ildasm "%%i" /text /out=c:\BatchDasm\

The command runs ILDasm on all the files in the current folder and all subfolders. Each resulting disassembly file is saved into c:\BatchDasm and is named as follows.

<original filename>.<original filesize>.il

It was observed that ILDasm can crash even on legitimate Windows files. This may be a mere inconvenience, but it can also raise security concerns.
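The batch run described above can also be sketched in Python. This is only an illustration under a few assumptions: the helper name build_ildasm_commands is made up, ildasm is expected to be on the PATH, and the /text and /out arguments are the ones used in the batch file. The function only builds the command lines, so it can be inspected before anything is executed.

```python
from pathlib import Path


def build_ildasm_commands(root, out_dir):
    """Return one ildasm command (as an argument list) per file under root."""
    commands = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        # Output name: <original filename>.<original filesize>.il
        out_name = f"{path.name}.{path.stat().st_size}.il"
        commands.append(["ildasm", str(path), "/text",
                         f"/out={Path(out_dir) / out_name}"])
    return commands
```

Each command can then be run with subprocess.run(cmd, check=False). Passing check=False keeps the batch going when ILDasm fails on a file, and a timeout argument may help if it hangs.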

Configuration of the Research Environment

The research environment has the following configuration.

Windows 8.1
Visual Studio 2013
ILDasm 4.0