# Compressing the NN evaluation results : part 2

In part 1, I wrote an ad-hoc algorithm which applied two commonly-used techniques – variable length coding and run length coding. The ad-hoc mode seemed to start going nowhere, so I eventually settled with something more general.

Two steps:

• Encode the values 0 ~ 2047 using a run length coding method into some symbols, and then:
• Do a simple variable length coding (something like Huffman Coding) to encode those symbols

Since we know that the values are mostly zeros, it is pretty straightforward that we have to do a run length encoding of zeros. Other than that, just a smaller number of tricks will be better. I settled with the following symbols:

• V0 ~ V63 : A single value of 0 ~ 63
• Z0 ~ Z15 : 2 ~ 17 consecutive zeros
• X0 ~ X31 : For any symbol Xn, f the previous value was a V-type symbol, add 64(n+1) to that value. If the previous value was a Z-type symbol, append 16(n+1) zeros.

For example,

`129 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 2`

gets encoded into:

`V1 X2 Z0 X0 V1 V2`

The next step is to encode these values in individual bits. To see how many bits should be used to encode each value, I measured the frequency of each bits, and hand-assigned a variable number of bits for each symbol, ending up being

 Symbol Bit encoding V0 0100 V1 000 V2 ~ V3 X1100 V4 ~ V7 XX0010 V8 ~ V15 XXX1010 V16 ~ V31 XXXX0110 V32 ~ V63 XXXXX1110 Z0 0001 Z1 1001 Z2 ~ Z3 X0101 Z4 ~ Z7 XX1101 Z8 ~ Z15 XXX0011 X0 1011 X1 00111 X2 ~ X3 X10111 X4 ~ X7 XX01111 X8 ~ X15 XXX011111 X16~X31 XXXX111111

(The X on the bit encoding is the LSBs of the symbol number – For example, V18 is 00100110. One notable thing is that V1 is more frequent than V0 which is because we encoded the majority of zeros using the Zn type symbols.)

Since we are dealing with little endian (the least significant bits comes first), if we represent the example above as a 64-bit integer, we end up getting:

`01100 000 1011 0001 010111 000 = 0x0c162b8`

So, how good are we? I got 261.7 bits in average (or 32.7 bytes) which is much better than the ad-hoc method!

Next step is to apply this in leela-zero, and then verify it doesn’t lose strength.

Posted on