Creating Annotated Hexdump

Posted 17 May 2021 | Modified 17 May 2021
Author Attila Suszter

Annotated hexdump

HexLasso can create an annotated hexdump using the analyzers.

The example above has six columns, from left to right: (1) file offset, (2) hexdump, (3) analysis result at 16-byte granularity, (4) analysis result at 32-byte granularity, (5) analysis result at 128-byte granularity, (6) analysis result at 256-byte granularity.

HexLasso tags each block of bytes with either a single analyzer name or two analyzer names joined by a dash. If a single analyzer describes 100% of the block, HexLasso tags the block with that analyzer's name. Otherwise, HexLasso tags the block with the two analyzers that have the highest byte coverage, which does not necessarily amount to 100% coverage.

A researcher who needs precision can look at the analysis result at 16-byte granularity. When the layout of a larger piece of data matters more than the fine details, the researcher can look at the result at 256-byte granularity, or at a coarser one, since the granularity can be set arbitrarily.
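
The tagging rule can be sketched in a few lines of Python. This is only an illustration of the rule as described here, not HexLasso's actual code: the function names, the per-byte label list used as input, and the handling of a block where only one analyzer matches part of the bytes (it is tagged with that one name) are all assumptions.

```python
from collections import Counter

def tag_block(labels):
    """Tag one block. `labels` holds, per byte, an analyzer name or None.

    Hypothetical helper: a single name if one analyzer covers every byte,
    otherwise the (up to) two names with the highest coverage, dash-joined.
    """
    counts = Counter(name for name in labels if name is not None)
    if not counts:
        return ""  # no analyzer matched anything in this block
    top = counts.most_common(2)
    if top[0][1] == len(labels):
        return top[0][0]  # 100% described by a single analyzer
    return "-".join(name for name, _ in top)

def tag_at_granularity(labels, granularity):
    """Tag consecutive blocks of `granularity` bytes (16, 32, 128, 256, ...)."""
    return [
        tag_block(labels[i:i + granularity])
        for i in range(0, len(labels), granularity)
    ]

# 16 bytes of one kind, then 10 of another, then 6 unknown bytes.
labels = ["X86Fragment"] * 16 + ["DWordMatch"] * 10 + [None] * 6
print(tag_at_granularity(labels, 16))
print(tag_at_granularity(labels, 32))
```

The same per-byte labels yield a precise view at 16 bytes and a coarser, layout-oriented view at 32 bytes, which is the trade-off described above.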

Identifying the Characteristics of Bytes

Posted 12 May 2021 | Modified 12 May 2021
Author Attila Suszter

HexLasso analyzes the data and returns the number of byte characteristics identified.

HexLasso tells how many bytes are covered by each identified characteristic.

HexLasso tells how many bytes are without known characteristics.

Byte characteristics
The meaning of the columns, from left to right: (1) characteristic, (2) bytes covered, (3) ratio of bytes covered to data size, as a percentage.

If two or more characteristics overlap at the same location, the characteristic with the highest priority counts.

For example, if the byte at a given offset is part of both X86Fragment and DWordMatch, the byte is identified as X86Fragment because X86Fragment has a higher priority than DWordMatch.
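
A minimal sketch of this priority rule and of the resulting coverage table follows. It is an illustration under assumptions, not HexLasso's implementation: the `resolve` and `report` functions, the `(name, start, end)` finding format, and the explicit priority list are all hypothetical.

```python
from collections import Counter

def resolve(size, findings, priority):
    """Assign each byte offset to the highest-priority characteristic.

    findings: list of (name, start, end) with half-open ranges (assumed format).
    priority: characteristic names, highest priority first (assumed input).
    """
    rank = {name: i for i, name in enumerate(priority)}
    owner = [None] * size
    for name, start, end in findings:
        for off in range(max(start, 0), min(end, size)):
            current = owner[off]
            if current is None or rank[name] < rank[current]:
                owner[off] = name
    return owner

def report(owner):
    """Rows of (characteristic, bytes covered, percent of data size)."""
    size = len(owner)
    counts = Counter(owner)
    rows = [
        (name, n, 100 * n / size)
        for name, n in counts.items() if name is not None
    ]
    rows.append(("(unknown)", counts[None], 100 * counts[None] / size))
    return rows

findings = [("DWordMatch", 0, 8), ("X86Fragment", 2, 5)]
owner = resolve(10, findings, ["X86Fragment", "DWordMatch"])
for row in report(owner):
    print(row)
```

Because each offset ends up with exactly one owner, the byte counts in the table always add up to the data size.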

X86 Fragment Analyzer

Posted 11 May 2021 | Modified 11 May 2021
Author Attila Suszter

X86Fragment is a HexLasso analyzer that recognizes x86 instruction fragments in a given data block. It highlights the instruction fragments in color in the hexdump and reports how many bytes the recognized fragments cover.

The X86Fragment analyzer aims to recognize the following:

Relative CALL and JMP instructions
Short jump instructions
Near jump instructions
Instructions with absolute address
Function prologue instruction sequences
Runs of INT 3 and NOP instructions
MOV instructions
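
To give a flavor of fragment recognition, here is a rough sketch covering only two items from the list: runs of INT 3 (opcode 0xCC) and NOP (opcode 0x90), and the common 32-bit prologue `push ebp; mov ebp, esp` (bytes 55 8B EC). The function name, the minimum run length of two bytes, and the choice of this single prologue encoding are assumptions made for illustration; HexLasso's actual patterns may differ.

```python
import re

# push ebp; mov ebp, esp in one common 32-bit encoding.
PROLOGUE = b"\x55\x8b\xec"
# Runs of at least two INT 3 or two NOP bytes (the minimum is an assumption).
RUN = re.compile(rb"\xcc{2,}|\x90{2,}")

def find_fragments(data):
    """Return a per-byte list of booleans marking recognized fragment bytes."""
    covered = [False] * len(data)
    for match in RUN.finditer(data):
        for i in range(match.start(), match.end()):
            covered[i] = True
    pos = data.find(PROLOGUE)
    while pos != -1:
        for i in range(pos, pos + len(PROLOGUE)):
            covered[i] = True
        pos = data.find(PROLOGUE, pos + 1)
    return covered

sample = b"\x00\x55\x8b\xec\x90\x90\x90\xc3\xcc\xcc\x41"
coverage = find_fragments(sample)
print(sum(coverage), "of", len(sample), "bytes recognized")
```

Byte patterns this short also occur by chance in non-code data, so a match is a hint about the content rather than proof that the block holds x86 code.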

Splitters

Posted 9 May 2021 | Modified 9 May 2021
Author Attila Suszter

HexLasso performs the byte coverage analysis through analyzers, transformers and splitters.

A splitter is a routine that splits the data of a given block into two or more sub-blocks in order to separate multiple contexts. Once the contexts are separated, the analyzer performs better on the sub-blocks (separated contexts) than on the original block (original context). This is because, in the original block, the current byte is not related to the byte immediately after it but to a later byte at a given distance. When the analyzer runs on the original block, it perceives the unrelated bytes as noise, which degrades the accuracy of the analysis. When it runs on the sub-blocks, no such noise is present, so it performs with better accuracy.

The resulting byte coverage of a sub-block is projected onto the original block.
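
One simple way to picture a splitter is a fixed-stride split, where sub-block k receives the bytes at offsets k, k + stride, k + 2 * stride, and so on. The sketch below uses that scheme together with a toy analyzer that marks bytes equal to a neighbour. All names here are hypothetical, and the stride-based split is an assumption about what a splitter might look like, not a description of HexLasso's splitters.

```python
def split_by_stride(data, stride):
    """Sub-block k holds the bytes at offsets k, k + stride, k + 2*stride, ..."""
    return [data[k::stride] for k in range(stride)]

def project(sub_coverages, stride, size):
    """Map per-sub-block coverage flags back onto original offsets."""
    covered = [False] * size
    for k, sub in enumerate(sub_coverages):
        for j, flag in enumerate(sub):
            covered[k + j * stride] = flag
    return covered

def repeated_bytes(block):
    """Toy analyzer: a byte is covered if it equals an adjacent byte."""
    n = len(block)
    return [
        (i > 0 and block[i] == block[i - 1])
        or (i + 1 < n and block[i] == block[i + 1])
        for i in range(n)
    ]

# Two interleaved contexts: a constant stream and a changing stream.
data = bytes([0, 1, 0, 2, 0, 3, 0, 4])
direct = repeated_bytes(data)
subs = split_by_stride(data, 2)
projected = project([repeated_bytes(s) for s in subs], 2, len(data))
print(sum(direct), sum(projected))
```

On the original block the toy analyzer finds nothing, because every other byte belongs to a different context. After the split it covers the whole constant stream, and the projection places that coverage back on the even offsets of the original block.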

Predicted Bytes

Posted 3 April 2021 | Modified 3 April 2021
Author Attila Suszter

PredictedBytes is a HexLasso analyzer that reports how many bytes of a block are predicted. The minimum possible value is zero, meaning that not a single byte of the given block is covered. The maximum possible value is the size of the block in bytes, meaning that all bytes of the given block are covered.

The prediction is made by a simple order-1 model with a dynamic prediction table, which predicts, successfully or not, the next byte in the given block.

The prediction table is updated after each byte is read from the given block. The update, if needed, changes the prediction table so that the next time the previous byte is seen, the table predicts the current byte.

Higher coverage of predicted bytes means that the model predicts more accurately, which indicates redundancy in the block. Conversely, lower coverage means that this model sees less redundancy.

It is possible that not a single byte of the given block is covered, which suggests randomly distributed bytes.
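
The model described above can be sketched as follows. This is a minimal illustration, not HexLasso's code: the function name is hypothetical, and the initial state (a table of 256 entries that all predict zero, with the byte before the block treated as zero) is an assumption, chosen so that the coverage can reach the full block size.

```python
def predicted_bytes(block):
    """Count the bytes correctly predicted by an order-1 model.

    table[p] is the byte predicted to follow byte p. The all-zero initial
    table and the initial previous byte of zero are assumptions.
    """
    table = [0] * 256
    prev = 0
    covered = 0
    for cur in block:
        if table[prev] == cur:
            covered += 1       # prediction succeeded
        else:
            table[prev] = cur  # update: next time prev is seen, predict cur
        prev = cur
    return covered

print(predicted_bytes(bytes(16)))     # all-zero block
print(predicted_bytes(b"abababab"))   # short repeating pattern
```

With this initial state an all-zero block is fully covered, while the repeating pattern misses its first three bytes while the table learns the two transitions and is predicted correctly from then on.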