CHES 2024: Fast Transciphering Via Reconfigurable And Batched LUT Evaluation

Three weeks ago, the conference on Cryptographic Hardware and Embedded Systems (CHES) took place in Halifax, Canada. In this blog post, we give a high-level summary of the main contribution of the paper “Fast Transciphering Via Reconfigurable And Batched LUT Evaluation” that was presented by Leonard Schild at CHES 2024.

Lookup tables (LUTs) are multi-dimensional arrays, commonly used in both hardware and software to evaluate functions that may be difficult to compute otherwise. In a normal setting, the output of the LUT is simply induced by the digits or indices that are given as in input. However, the setting which we consider is one in which the digits are encrypted using a Fully Homomorphic Encryption (FHE) scheme.

As a brief reminder, FHE enables us to combine encrypted data using certain operations. E.g. given c0=Enc(x),c1=Enc(y) , we are able to compute c3=Enc(x+y) or c4=Enc(3x) without decrypting either c0 or c1. Furthermore, the class of FHE schemes we work with has a special blind-rotation procedure that works as follows

  • BlindRotate(c0,C1): Given ciphertexts
    • c0=Enc(m),m{0,,2k1},
    • C1=Enc(P),PZq[X]/(XN+1)
  • Output C=Enc((1)mkP),Pi=Pim(modN) where mk is the k-th bit of m

In other words, if the k-th bit of m is 0, the procedure outputs a ciphertext containing P rotated by m slots and otherwise outputs the same value, multiplied by -1. We note that transforming an encryption of a polynomial into an encryption of one of its coefficients and vice-versa is feasible without decryption, and referred to as sample-extraction and ciphertext packing respectively.

We will see next how to use the previous concepts to evaluate (simple) lookup tables.

Background on LUT Evaluation

Let us consider a lookup table F:Z2k1Z2k1 that we aim to evaluate using Enc(m). This is the simplest setting, often called functional or programmable bootstrapping, and, provided that k<6, can be performed efficiently:

  • Start by encoding the F into the coefficients of a polynomial, i.e. Pi=F(i)
  • Run C=Enc(P)BlindRotate(Enc(m),Enc(P))
  • The result is c0=Enc(P0)

As m ranges from 0 to 2k1 but the LUT only takes inputs up to 2k11, the most significant bit of m must be 0. Hence, no sign flip occurred and P is simply a rotation of P and P0=F(m)

We’ve just seen how to evaluate one dimensional lookup tables. Extending this methodology to multiple input dimensions is possible and was done in the work “Revisiting the functional bootstrap in TFHE” by Guimarães et al. To give an intuition of this approach, consider a LUT F:Z2k1lZ2k1 that is a l-dimensional LUT. As in the one dimensional case, we can now encode F into a matrix, or for higher dimensions a tensor, of polynomials. Then, we can use the blind-rotation procedure to rotate each dimension so that the LUT value will finally be in the 0l-th slots of the resulting encrypted tensor.

Faster LUT Evaluation

The previous approaches had some shortcomings. In the 1D case, we have to limit the size of the encrypted digits for performance reasons. In the higher dimensional case, this limit can be circumvented by splitting a digit into multiple. However, it is easy to see that the number of blind-rotations is exponential in l (or at least a multiple of l using more sophisticated approaches). Furthermore, if the number of output values is greater than 1, the entire procedure needs to be repeated. Finally, in all cases we have seen so far, there was an implicit requirement that the most significant bit of m must be 0.

We now come to the actual contribution of the paper we are summarizing. Consider the case in which we have a have lookup table F:Z2lZ2l where l may range from 810. One such example is the AES SBOX. This means that the digits consist of bits and are given as ci=Enc(mi2k1). This setting is particularly interesting, as giving any ci as input to the blind-rotation procedure will not actually perform any rotation and only a sign flip depending on mi, as that is the k-th bit. We now describe the method that allows us to evaluate such a LUT using only l blind-rotations while obtaining all output bits at once.

First, we select a set of ω bits that we homomorphically compose into a single integer. Let P=2j1iXi and perform ci=Enc(P)BlindRotate(ci,Enc(P)) for some ci,i<ω. As we mentioned in the last paragraph, no rotation will take place and we’ll have that P=(1)miP. Then, adding 2j1 to every slot, we have that P0+2j1=mi2j. In other, we have just seen how to shift an encrypted most significant bit into any other bit slot. By repeating this procedure for any bit slot, and adding all resulting ciphertexts we obtain c¯=Enc(i=0k1mi2i).

The ciphertext c¯ contains information about the first ω bits, but not the remaining lω bits. Now, let us consider Fj(x0,,xω1)=F(x0,,xω1,0,,0)j which equals our previous LUT F, except that we fix the last lω inputs to zero and focus on the j-th output bit. We could evaluate this Fj using the 1D procedure described earlier, but we will actually evaluate Fj for all values of j and combinations of possible inputs for the last lω bits. Note that this only requires a single blind-rotation as one can use an optimization introduced in “New techniques for multi-value input homomorphic evaluation and application” by Carpov et al. Once all such LUTs have been evaluated, we can combine them into several encrypted polynomials. In practice, the polynomial dimension will be so large that a single polynomial will suffice to pack all values into it.

The final step of the method will be used to actually select the correct group of outputs from the pool of candidates. To this end, we rely again on the sign-flipping ability of the blind-rotation procedure. Consider the following expression: C=21(Enc(P0)+Enc(P1)+BlindRotate(ci,Enc(P0)Enc(P1))) Going through both cases of the value mi in ci, we can observe that C=Enc(P0) if mi=0 and C=Enc(P1) otherwise. Therefore, we are able to select the final, correct output by repeatedly applying the previous equation.

We have now described the entire method. It is worth pointing out that if the number of digits l increases drastically, the amount of blind-rotation may no longer be equal to l. However, many optimizations that can be applied to the method by Guimarães also benefit the novel approach, but need to be explored in future work.