Analyzers

HexLasso performs the byte coverage analysis through analyzers and transformers.

An analyzer is a dedicated routine that performs pattern recognition and prediction on a given block. Its result is the byte coverage, that is, how many bytes are covered by the pattern and how many bytes are predicted. The minimum possible value is zero; the maximum possible value is the size of the block in bytes.

The analyzer reports the byte coverage directly, but it might first call transformers to transform the data before the analysis. The transformation can be needed because the transformed data might fit the pattern recognition and prediction models better.
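As a rough illustration of why a transformation can help, the sketch below pairs a hypothetical delta transformer with a hypothetical repeated-byte analyzer. None of these names (`delta_transform`, `repeat_coverage`, `incrementing_analyzer`) or the 4-byte minimum run come from HexLasso; they are assumptions chosen to show the idea. An incrementing sequence has no repeated bytes as stored, but after the delta transformation it becomes a long run of the same value.

```python
def delta_transform(block: bytes) -> bytes:
    """Hypothetical transformer: each byte becomes its difference
    from the previous byte (mod 256); the first byte is diffed against 0."""
    out = bytearray()
    prev = 0
    for b in block:
        out.append((b - prev) & 0xFF)
        prev = b
    return bytes(out)


def repeat_coverage(data: bytes, min_run: int = 4) -> int:
    """Hypothetical analysis step: count the bytes that sit inside runs
    of one repeated value that are at least min_run bytes long."""
    covered = 0
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1
        if j - i >= min_run:
            covered += j - i
        i = j
    return covered


def incrementing_analyzer(block: bytes) -> int:
    """Hypothetical analyzer: transform first, then analyze the result."""
    return repeat_coverage(delta_transform(block))


block = bytes(range(10, 74))  # 64 bytes: 10, 11, 12, ...
raw_coverage = repeat_coverage(block)
transformed_coverage = incrementing_analyzer(block)
```

On this block the untransformed data yields no coverage, while the transformed data is covered almost entirely; only the first byte, whose delta differs from the rest, is left out.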

The analyzer reports only on the bytes covered by the pattern recognition or prediction. For example, if the block contains a long sequence of ASCII bytes, the analyzer might report a string regardless of the values of the bytes that are not covered.
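A minimal sketch of this behavior, assuming a string analyzer that counts bytes in runs of printable ASCII. The function name `string_coverage`, the printable range 0x20 to 0x7E, and the 4-byte minimum run are illustrative assumptions, not HexLasso's actual rules.

```python
def string_coverage(block: bytes, min_run: int = 4) -> int:
    """Count the bytes that fall inside runs of printable ASCII
    at least min_run bytes long. Bytes outside such runs are
    ignored; they do not affect the reported coverage."""
    covered = 0
    run = 0
    for b in block:
        if 0x20 <= b <= 0x7E:
            run += 1
        else:
            if run >= min_run:
                covered += run
            run = 0
    # Account for a run that reaches the end of the block.
    if run >= min_run:
        covered += run
    return covered


# An 11-byte string followed by 4 non-printable bytes.
sample = b"hello world\x00\x01\x02\x03"
coverage = string_coverage(sample)
```

The result always lies between zero and `len(block)`, whatever the uncovered bytes contain.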

There are a number of analyzers, and each analyzer runs on the given block.

If the string analyzer reports that it covers, say, 60 bytes out of 64, then the result could be meaningful enough for the analyst.

However, if the string analyzer reports that it covers, say, only 5 bytes out of 64, then a string might not describe the block in a meaningful way. Ideally, another analyzer covers more bytes. The analyzer that covers the most bytes describes the block.

HexLasso maintains a priority list of analyzers in which each analyzer is given a distinct priority. If two analyzers report the same byte coverage, then the analyzer with the higher priority describes the block.

When considering two analyzers, the one that can detect the redundancy more accurately is given the higher priority. For example, the string analyzer is given a higher priority than any of the match analyzers.
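The selection rule described above can be sketched as follows. This is an illustration under stated assumptions, not HexLasso's implementation: the name `describe_block`, the representation of the priority list as a Python list ordered from highest to lowest priority, the `"none"` result when nothing is covered, and the stub analyzers with fixed coverage values are all hypothetical.

```python
from typing import Callable, List, Tuple

# An analyzer is modeled as a (name, function) pair; the function
# returns the byte coverage for a block.
Analyzer = Tuple[str, Callable[[bytes], int]]


def describe_block(block: bytes, analyzers: List[Analyzer]) -> Tuple[str, int]:
    """Run every analyzer on the block and return the name and coverage
    of the one that covers the most bytes. `analyzers` must be ordered
    from highest to lowest priority."""
    best_name, best_coverage = "none", 0  # assumption: nothing covered
    for name, analyze in analyzers:
        coverage = analyze(block)
        if not 0 <= coverage <= len(block):
            raise ValueError(f"{name}: coverage {coverage} out of range")
        # Strict '>' keeps the earlier, higher-priority analyzer on a tie.
        if coverage > best_coverage:
            best_name, best_coverage = name, coverage
    return best_name, best_coverage


block = bytes(64)

# Stub analyzers with fixed results, for illustration only.
tie = [("string", lambda b: 60), ("match", lambda b: 60)]
weak_string = [("string", lambda b: 5), ("match", lambda b: 40)]

tie_result = describe_block(block, tie)
weak_result = describe_block(block, weak_string)
```

In the tie case the string analyzer describes the block because it comes first in the priority list. In the second case the match analyzer describes the block because it covers more bytes, regardless of priority.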