Five Random Things about Entropy


Here is a list of five random things about entropy.

1. The order of bytes in the data does not matter when calculating its entropy: the entropy is the same regardless of byte order. That has consequences. Here are a few:

  • High-entropy data is not necessarily random data (although random data always has high entropy).
  • A high-entropy passkey does not, by itself, mean the passkey is a good choice for authentication.
  • You can raise the entropy of data to an arbitrary value. By appending content so that all byte values occur equally often, the resulting data reaches the maximum entropy of 8.
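The points above can be sketched with a minimal byte-level Shannon entropy function (the sample string and the padding construction are illustrative, not from the original post):

```python
from collections import Counter
from math import log2

def entropy(data: bytes) -> float:
    """Shannon entropy of data in bits per byte (0..8)."""
    n = len(data)
    return -sum(c / n * log2(c / n) for c in Counter(data).values())

data = b"entropy is order-blind"
shuffled = bytes(sorted(data))                 # same bytes, different order
assert abs(entropy(data) - entropy(shuffled)) < 1e-9  # order does not matter

# Append bytes until every value 0..255 occurs equally often:
counts = Counter(data)
top = max(counts.values())
padding = bytes(b for b in range(256) for _ in range(top - counts.get(b, 0)))
padded = data + padding
assert abs(entropy(padded) - 8.0) < 1e-9       # maximum byte-level entropy
```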

2. Let’s assume data 1000 bytes in length has an entropy of 4 out of the maximum 8. From a data compression standpoint, it means each byte can, on average, be stored in 4 bits, so the data can be compressed to about 500 bytes. This is the theoretical limit for byte-by-byte coding, which entropy coders such as Huffman or arithmetic coding can approach; a better ratio is often achievable depending on what else is known about the data.
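A quick sanity check of that arithmetic (a sketch; the 1024-byte sample built from 16 equiprobable values is a constructed example, not the post's 1000-byte data):

```python
from collections import Counter
from math import log2

def entropy(data: bytes) -> float:
    """Shannon entropy of data in bits per byte."""
    n = len(data)
    return -sum(c / n * log2(c / n) for c in Counter(data).values())

# 1024 bytes drawn equally from 16 distinct values -> entropy is exactly 4
data = bytes(range(16)) * 64
h = entropy(data)                   # 4.0 bits per byte
floor_bytes = len(data) * h / 8     # theoretical floor: 512 bytes, half the size
```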

3. Entropy analysis is often used to detect redundancy or anomalies in data. A stream may hold an entropy of around 7 throughout its length while containing a structural change that does not noticeably show up in the entropy. In such a case another approach, such as match analysis, is needed to detect the structural change.
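A sliding-window sketch of why entropy can miss structure (the mirrored-stream example and window sizes are assumptions for illustration): the second half of the stream duplicates the first, which match analysis would catch immediately, yet every window still reads as high entropy.

```python
import random
from collections import Counter
from math import log2

def entropy(data: bytes) -> float:
    """Shannon entropy of data in bits per byte."""
    n = len(data)
    return -sum(c / n * log2(c / n) for c in Counter(data).values())

def windowed_entropy(data: bytes, window: int = 256, step: int = 64) -> list:
    """Entropy profile over sliding windows of the stream."""
    return [entropy(data[i:i + window])
            for i in range(0, len(data) - window + 1, step)]

rng = random.Random(0)
half = rng.randbytes(1024)
stream = half + half[::-1]          # second half mirrors the first: real structure
profile = windowed_entropy(stream)
# The duplication never shows up: every window still looks "random".
assert all(h > 6.5 for h in profile)
```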

4. Publicly available tools usually calculate entropy at the byte level, where the maximum entropy is 8 because a byte has 8 bits. However, entropy can also be calculated at the nibble level or at the word level, in which cases the maximum entropies are 4 and 16 bits, respectively.
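Calculating entropy at another granularity only changes what counts as a symbol. A sketch, generalizing the entropy function over any symbol sequence (the big-endian word split is an arbitrary choice for illustration):

```python
from collections import Counter
from math import log2

def entropy_symbols(symbols) -> float:
    """Shannon entropy over an arbitrary sequence of symbols."""
    syms = list(symbols)
    n = len(syms)
    return -sum(c / n * log2(c / n) for c in Counter(syms).values())

data = bytes(range(256))
# Nibble level: two 4-bit symbols per byte, maximum entropy 4
nibbles = [x for b in data for x in (b >> 4, b & 0x0F)]
# Word level: one 16-bit symbol per byte pair, maximum entropy 16
words = [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

assert entropy_symbols(nibbles) == 4.0   # uniform nibbles hit the 4-bit maximum
assert entropy_symbols(words) == 7.0     # 128 distinct words of 65536 possible
```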

5. The higher the entropy, the lower the redundancy, and the lower the redundancy, the higher the entropy: they are inversely related.
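One common way to make that relation concrete, assuming byte-level entropy, is to define redundancy as the unused fraction of the 8-bit-per-byte budget (the definition here is a standard one, not taken from the post):

```python
from collections import Counter
from math import log2

def entropy(data: bytes) -> float:
    """Shannon entropy of data in bits per byte."""
    n = len(data)
    return -sum(c / n * log2(c / n) for c in Counter(data).values())

def redundancy(data: bytes) -> float:
    """Fraction of the 8-bit-per-byte budget the data does not use."""
    return 1 - entropy(data) / 8

assert redundancy(bytes([0x41]) * 64) == 1.0  # one repeated byte: fully redundant
assert redundancy(bytes(range(256))) == 0.0   # uniform bytes: no redundancy
```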