Posted | Modified
Author

X86Fragment is an analyzer in HexLasso that recognizes x86 instruction fragments in the given data block. The analyzer highlights the instruction fragments in color in the hexdump and reports the coverage, that is, how many bytes of the block belong to recognized instruction fragments.

The X86Fragment analyzer aims to recognize the following:
  • Relative CALL and JMP instructions
  • Short jump instructions
  • Near jump instructions
  • Instructions with absolute address
  • Function prologue instruction sequences
  • Runs of INT 3 and NOP instructions
  • MOV instructions
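To make the idea concrete, here is a minimal sketch in Python that recognizes only three of the fragment kinds listed above: runs of INT 3 and NOP, a 32-bit function prologue, and relative CALL and JMP with a 32-bit displacement. It is not HexLasso's implementation. The function names, the `min_run` threshold and the rule that a relative target must fall inside the block are all assumptions made for illustration.

```python
def find_x86_fragments(data: bytes, min_run: int = 4) -> list[tuple[int, int, str]]:
    """Return (offset, length, kind) for a few easily recognized x86 fragments."""
    fragments = []
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        # Run of INT 3 (0xCC) or NOP (0x90) bytes; min_run is an assumed threshold.
        if b in (0xCC, 0x90):
            j = i
            while j < n and data[j] == b:
                j += 1
            if j - i >= min_run:
                fragments.append((i, j - i, "int3_run" if b == 0xCC else "nop_run"))
                i = j
                continue
        # 32-bit prologue: push ebp; mov ebp, esp (both encodings of the MOV).
        if data[i:i + 3] in (b"\x55\x8b\xec", b"\x55\x89\xe5"):
            fragments.append((i, 3, "prologue"))
            i += 3
            continue
        # Relative CALL (E8) or JMP (E9) with rel32. Assumed heuristic: accept
        # the fragment only if the target lands inside the block.
        if b in (0xE8, 0xE9) and i + 5 <= n:
            rel = int.from_bytes(data[i + 1:i + 5], "little", signed=True)
            if 0 <= i + 5 + rel < n:
                fragments.append((i, 5, "call_rel32" if b == 0xE8 else "jmp_rel32"))
                i += 5
                continue
        i += 1
    return fragments


def fragment_coverage(fragments: list[tuple[int, int, str]]) -> int:
    """Number of bytes covered by the recognized fragments."""
    return sum(length for _, length, _ in fragments)
```

The fragment offsets and lengths are what a hexdump view would need for highlighting, and `fragment_coverage` gives the byte count reported as coverage.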


HexLasso performs the byte coverage analysis through analyzers, transformers and splitters.

A splitter is a routine that splits the data of the given block into two or more sub-blocks in order to separate multiple contexts. Once the contexts are separated, the analyzer performs better on the sub-blocks (separated contexts) than on the original block (original context). This is because, in the original block, the current byte is unrelated to the byte that immediately follows it and is instead related to a later byte at a given distance. When the analyzer runs on the original block, it perceives the unrelated bytes as noise, which degrades the accuracy of the analysis. When it runs on the sub-blocks, that noise is absent and the analysis is more accurate.

The resulting byte coverage of a sub-block is projected onto the original block.
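The following Python sketch illustrates splitting and projection under stated assumptions. It assumes the contexts are interleaved at a fixed distance (`stride`), and it uses a toy run-of-bytes analyzer as a stand-in for a real one. All names here are hypothetical and not taken from HexLasso.

```python
def split_by_stride(block: bytes, stride: int) -> list[bytes]:
    """Sub-block k holds the bytes at offsets k, k + stride, k + 2 * stride, ..."""
    return [block[k::stride] for k in range(stride)]


def run_flags(data: bytes, min_run: int = 3) -> list[bool]:
    """Toy analyzer: flag every byte that sits in a run of min_run equal bytes."""
    flags = [False] * len(data)
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1
        if j - i >= min_run:
            for k in range(i, j):
                flags[k] = True
        i = j
    return flags


def project_coverage(sub_flags: list[list[bool]], stride: int, size: int) -> list[bool]:
    """Map per-sub-block coverage flags back to offsets in the original block."""
    covered = [False] * size
    for k, flags in enumerate(sub_flags):
        for j, flag in enumerate(flags):
            covered[k + j * stride] = flag
    return covered


# A constant byte interleaved with a counter: no runs in the original block.
block = bytes([0, 1, 0, 2, 0, 3, 0, 4])
subs = split_by_stride(block, 2)
projected = project_coverage([run_flags(s) for s in subs], 2, len(block))
```

On the original block the toy analyzer covers nothing, while on the sub-blocks it covers the constant bytes, and the projection marks every even offset of the original block as covered.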


PredictedBytes is an analyzer in HexLasso that reports how many bytes of the given block are predicted. The minimum possible value is zero, meaning that not a single byte of the block is covered; the maximum possible value is the size of the block in bytes, meaning that every byte of the block is covered.

The prediction is made by a simple order-1 model with a dynamic prediction table, which predicts, successfully or not, the next byte in the given block.

The prediction table is updated after each byte is read from the given block. The update, if needed, changes the table so that the next time the previous byte is seen, the model predicts the current byte.

Higher coverage of predicted bytes means that the model makes more accurate predictions, which indicates redundancy in the block. Conversely, lower coverage means that this model sees less redundancy.

It is possible that not a single byte of the given block is covered, which suggests randomly distributed bytes.
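A minimal Python sketch of such an order-1 predictor is shown below. The initial table contents and the initial context are not specified in the description, so this sketch assumes both start at zero. The function name is hypothetical.

```python
def predicted_bytes(block: bytes) -> int:
    """Count the bytes that an order-1 table predicts correctly."""
    table = [0] * 256  # assumption: every context initially predicts byte 0
    prev = 0           # assumption: the context before the first byte is 0
    covered = 0
    for cur in block:
        if table[prev] == cur:
            covered += 1        # successful prediction, no update needed
        else:
            table[prev] = cur   # next time prev is seen, predict cur
        prev = cur
    return covered
```

With these assumptions, a block of eight zero bytes is fully covered, the block `b"abababab"` reaches a coverage of 5 once the table has learned both transitions, and the strictly increasing block `bytes(range(1, 9))` has a coverage of 0.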


HexLasso Online

HexLasso Online is a binary data analysis utility that runs in a web browser and lets the user interactively explore a file and spot varying redundancies in it.

The utility splits the file into blocks of equal size, runs the analysis on each block individually, and visualizes each block as a square on an interactive map.

The color of the square gives an indication of the analysis result of that block. Lighter color means lower byte coverage, darker color means higher byte coverage. The sequential palette used to color the blocks consists of 16 colors.
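As a rough illustration, the Python sketch below splits a file into fixed-size blocks and maps the byte coverage of a block to an index in a 16-color sequential palette. The linear mapping, the handling of a shorter final block and the function names are assumptions; the actual utility may do this differently.

```python
def split_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Split data into blocks of block_size bytes (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def palette_index(covered: int, block_len: int, colors: int = 16) -> int:
    """Map coverage to a palette index: 0 is the lightest, colors - 1 the darkest."""
    if block_len == 0:
        return 0
    # Assumed linear mapping, clamped so that full coverage uses the last color.
    return min(colors - 1, covered * colors // block_len)
```

For a 256-byte block, zero coverage maps to index 0, half coverage to index 8, and full coverage to index 15.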

When the mouse cursor is moved over the map, the byte dump view of the corresponding block is displayed next to the map.

The byte dump view is either a hex dump view or a binary dump view depending on user preference. According to the analysis result of the given block, the corresponding bytes in the byte dump view are highlighted in yellow.

The user can choose from various analyzers. When an analyzer is selected, the map is updated according to its analysis result. The analyzers cover byte ranges, strings, runs of bytes, matches and code fragments.

The block size can be changed during the analysis. A smaller block size leads to a more detailed map, and a larger block size leads to a smaller map.

The map can be zoomed in and out.


HexLasso performs the byte coverage analysis through analyzers and transformers.

A transformer is a routine that transforms the data of the given block to potentially improve the effectiveness of the analysis. Transforms are useful because the transformed data might fit the pattern recognition and prediction models better than the original data.

For example, if the match analyzer reports considerably better byte coverage of the given block with the matrix transformation than with no transformation at all, this information allows an educated guess about the data without manually analyzing the actual bytes. Increased match coverage after the matrix transformation often indicates an array of fixed-width values in which the low bytes keep changing and the high bytes are fixed. After the matrix transformation, the fixed bytes are arranged together, so the match analyzer achieves better coverage on the transformed data.

The transformation is a reversible process so the original data can be restored without data loss.

The length of the transformed data is the same as the length of the original data.

The transformer is called by the analyzer and passes the transformed data back to it.

There are a number of transformers, but only certain analyzers use them.
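The sketch below shows one plausible reading of the matrix transformation in Python: the block is treated as rows of `width` bytes and read out column by column, which groups the bytes at the same position of each value together. Whether HexLasso implements the matrix transformation exactly this way is an assumption, and the function names and the `width` parameter are hypothetical.

```python
def matrix_transform(block: bytes, width: int) -> bytes:
    """Read the block column by column, treating it as rows of `width` bytes."""
    return b"".join(block[k::width] for k in range(width))


def matrix_inverse(data: bytes, width: int) -> bytes:
    """Undo matrix_transform, restoring the original byte order."""
    n = len(data)
    out = bytearray(n)
    pos = 0
    for k in range(width):
        count = len(range(k, n, width))  # number of bytes in column k
        out[k::width] = data[pos:pos + count]
        pos += count
    return bytes(out)


# Four little-endian 32-bit values whose low byte changes and high bytes stay fixed.
values = b"".join(v.to_bytes(4, "little") for v in range(0x1000, 0x1004))
transformed = matrix_transform(values, 4)
```

Here `transformed` starts with the four changing low bytes, followed by twelve bytes that form long runs of identical values, which a match or run analyzer can cover easily. The transformed data has the same length as the input, and `matrix_inverse` restores the original block.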

The transformation might affect the entropy and the match coverage of the block.