# Russian Academy of Sciences, St. Petersburg Department of the V.A. Steklov Mathematical Institute (UDC 510.6, GRNTI 27.03.19, Inv. ...)


Gate Elimination for Linear Functions and New Feebly Secure Constructions

Alex Davydow and Sergey I. Nikolenko

St. Petersburg Academic University, ul. Khlopina 8, korp. 3, St. Petersburg, Russia, adavydow@yandex.ru
Steklov Mathematical Institute, nab. r. Fontanka 27, St. Petersburg, Russia, sergey@logic.pdmi.ras.ru

Abstract. We consider gate elimination for linear functions and show two general forms of gate elimination that yield novel corollaries. Using these corollaries, we construct a new linear feebly secure trapdoor function that has order of security 5/4, which exceeds the previous record for linear constructions. We also give detailed proofs of nonconstructive circuit complexity bounds for linear functions.

Keywords: circuit complexity, gate elimination, feebly secure cryptography, feebly one-way functions

1 Introduction

Modern cryptography has virtually no provably secure constructions.

There are known complete cryptographic constructions, both one-way functions [14,15] and public-key cryptosystems [8,9]. However, they also do not let us relate cryptographic security to key assumptions of classical complexity theory.

Moreover, the asymptotic nature of these completeness results does not let us say anything about how hard it is to break a given cryptographic protocol for keys of a certain fixed length, which is, in fact, what we all want as privacy-aware customers. Problems that have been studied extensively in relation to cryptography (factoring and discrete logarithm) do seem to scale well, but there are no lower bounds and little hope for such bounds anytime soon, so the point is moot. We can only say that the complexity of the algorithms that we have devised ourselves scales well, so ultimately we fall back on the "many smart people have thought about it" argument. There are other dangers on the way of asymptotic complexity, too [13]. Ultimately, we do not care whether a protocol can or cannot be broken in the limit;

we would be very happy if breaking this specific version of the protocol required constant time, but the constant was larger than the size of the known Universe.

However, modern cryptography is still very far away from provably secure constructions. At present, we can prove security neither in this "hard" sense nor in the sense of classical cryptographic definitions [7]. Nevertheless, while we are unable to prove a superpolynomial gap between the complexities of honest parties and adversaries, we are able to prove some gap. In 1992, Alain Hiltgen [10] presented a series of linear functions that are about twice (2 − ε times for an arbitrarily small ε) harder to invert than to compute. His example consists of linear functions over F_2 with matrices that have few non-zero entries (ones) while their inverse matrices have many ones. The complexity gap follows by a simple argument of Lamagna and Savage [16,22]: every bit of the output depends non-idly on many variables, and all these bits correspond to different functions; hence, there is a lower bound on the complexity of computing them all together. The model of computation here is the most general one, namely the number of gates in a Boolean circuit that uses arbitrary binary Boolean gates. Little more could be expected for this model at present. For example, the best known lower bound for the general circuit complexity of a specific Boolean function is 3n − o(n) [3,26].

In his thesis, Hiltgen also presented a feebly secure function that is exactly twice as hard to invert as to compute, this time a nonlinear one [11].

Lately, new cryptographic constructions with similar properties appeared in the works of Hirsch, Nikolenko, and Melanich [12,17]. In [12], a linear feebly secure trapdoor function with order of security 25/22 was constructed, and in [17], a nonlinear feebly secure trapdoor function with order of security 7. In this paper, we continue this line of work. In Section 2, we give basic definitions.

Virtually all bounds in general circuit complexity have been proven with gate elimination. This paper deals with gate elimination for linear functions;

in Section 3, we distill gate elimination into two basic ideas, formulate it in a general form, note important corollaries, and discuss its limitations. Our discussion in Section 3 generalizes the methods of [12], and this understanding lets us find a better construction of a linear feebly secure trapdoor function that has order of security (5 − ε)/4 for any ε > 0, as shown in Section 4. In Section 5, we compare what we have found with nonconstructive upper and lower bounds on the general circuit complexity of linear functions. Section 6 concludes the paper.

2 Preliminaries

We denote by B_{n,m} the set of all 2^{m·2^n} total functions f : B^n → B^m, where B = {0, 1} is the field with two elements. A circuit is a directed acyclic labeled graph with vertices of two kinds: vertices of indegree 0 (vertices that no edges enter), labeled by the variables x_1, ..., x_n, and vertices labeled by binary Boolean functions f ∈ B_{2,1};

this model of computation is known as general circuit complexity. Vertices of the first kind are called inputs or input variables;

vertices of the second kind, gates. The size of a circuit C, size(C), is the number of gates in it. We assume that each gate in the circuit depends on both of its inputs, i.e., there are no gates marked by constants or by the unary functions Id and ¬. To remove such gates safely, we assume without loss of generality that the output of each gate can be used both as the value of the corresponding function and as its negation. For every injective function f_n ∈ B_{n,m} of n variables we define its measure of one-wayness as

M_F(f_n) = C(f_n^{−1}) / C(f_n).

Hiltgen's work was to find sequences of functions f = {f_n}_{n=1}^∞ with a large asymptotic constant lim inf_{n→∞} M_F(f_n), which is called the order of one-wayness of f.
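Since this circuit model (arbitrary binary gates; size = number of gates) is used throughout the paper, a tiny executable sketch may help fix it. The code below is our illustration, not the paper's; the list-of-gates representation is just one convenient encoding.

```python
# A circuit is a list of gates; each gate is (op, i, j), where op is an
# arbitrary binary Boolean function and i, j index previously computed
# nodes (input variables occupy node indices 0..n-1, gates follow).

def evaluate(circuit, inputs):
    """Evaluate a circuit on a tuple of input bits; returns all node values."""
    nodes = list(inputs)
    for op, i, j in circuit:
        nodes.append(op(nodes[i], nodes[j]))
    return nodes

def size(circuit):
    """size(C): the number of gates in the circuit."""
    return len(circuit)

# Example: the parity x1 XOR x2 XOR x3 computed with two gates.
xor = lambda a, b: a ^ b
parity3 = [(xor, 0, 1), (xor, 3, 2)]
assert size(parity3) == 2
assert evaluate(parity3, (1, 1, 1))[-1] == 1
```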

There is a well-known definition of a family of trapdoor functions in cryptography [7]. However, we use a more detailed definition: since we are interested in constants here, we must pay attention to all the details.

Definition 1. Fix functions pi, ti, m, c : N → N. A feebly trapdoor candidate is a sequence of triples of circuits C = {(Key_n, Eval_n, Inv_n)}_{n=1}^∞, where:

– {Key_n}_{n=1}^∞ is a family of sampling circuits Key_n : B^n → B^{pi(n)} × B^{ti(n)},
– {Eval_n}_{n=1}^∞ is a family of evaluation circuits Eval_n : B^{pi(n)} × B^{m(n)} → B^{c(n)}, and
– {Inv_n}_{n=1}^∞ is a family of inversion circuits Inv_n : B^{ti(n)} × B^{c(n)} → B^{m(n)},

such that for every security parameter n, every seed s ∈ B^n, and every input m ∈ B^{m(n)},

Inv_n(Key_{n,2}(s), Eval_n(Key_{n,1}(s), m)) = m,

where Key_{n,1}(s) and Key_{n,2}(s) are the first pi(n) bits ("public information") and the last ti(n) bits ("trapdoor information") of Key_n(s), respectively.

Informally speaking, n is the security parameter (the length of the random seed), m(n) is the length of the input to the function, c(n) is the length of the function's output, and pi(n) and ti(n) are the lengths of the public and trapdoor information, respectively. We call these functions "candidates" because Definition 1 does not imply any security; it merely sets up the dimensions and provides correct inversion. In our constructions, m(n) = c(n) and pi(n) = ti(n).
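As an illustration of Definition 1 (ours, not a construction from the paper), here is a degenerate feebly trapdoor candidate with pi(n) = ti(n) = m(n) = c(n) = n: a one-time pad whose "public" and "trapdoor" information coincide. It provides correct inversion but, of course, no security at all.

```python
def key(s):            # Key_n : B^n -> B^pi(n) x B^ti(n)
    return s, s        # (public information, trapdoor information)

def eval_(pi, m):      # Eval_n : B^pi(n) x B^m(n) -> B^c(n)
    return [p ^ b for p, b in zip(pi, m)]

def inv(ti, c):        # Inv_n : B^ti(n) x B^c(n) -> B^m(n)
    return [t ^ b for t, b in zip(ti, c)]

# The defining property: Inv_n(Key_{n,2}(s), Eval_n(Key_{n,1}(s), m)) = m.
s, m = [1, 0, 1, 1], [0, 1, 1, 0]
pi, ti = key(s)
assert inv(ti, eval_(pi, m)) == m
```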

To find how secure a function is, we introduce the notion of a break. Informally, an adversary should invert the function without knowing the trapdoor information. We define a break as inversion with probability greater than a certain constant r (we will usually set r equal to 1/2, 3/4, or 7/8). We denote by C_δ(f) the minimal size of a circuit that correctly computes a function f ∈ B_{n,m} on more than a δ fraction of its inputs (of length n). Obviously, C_δ(f) ≤ C_{δ′}(f) for δ ≤ δ′, and C(f) = C_1(f).

Definition 2. A circuit N breaks a feebly trapdoor candidate C = {(Key_n, Eval_n, Inv_n)}_{n=1}^∞ on seed length n with probability r if, for uniformly chosen seeds s ∈ B^n and inputs m ∈ B^{m(n)},

Pr_{(s,m)∈U} [ N(Key_{n,1}(s), Eval_n(Key_{n,1}(s), m)) = m ] ≥ r.

Definition 3. A feebly trapdoor candidate C = {(Key_n, Eval_n, Inv_n)}_{n=1}^∞ has order of security k with level δ if for every sequence of circuits {N_n}_{n=1}^∞ that break C on every input length n with probability δ,

lim inf_{n→∞} min( C(N_n)/C(Key_n), C(N_n)/C(Eval_n), C(N_n)/C(Inv_n) ) ≥ k.

In other words, a candidate has order of security k (with level 3/4) if

lim inf_{n→∞} min( C_{3/4}(f_{pi(n)+c(n)})/C(Key_n), C_{3/4}(f_{pi(n)+c(n)})/C(Eval_n), C_{3/4}(f_{pi(n)+c(n)})/C(Inv_n) ) ≥ k,

where the function f_{pi(n)+c(n)} maps (Key_{n,1}(s), Eval_n(Key_{n,1}(s), m)) to m.

We list a few simple examples. If there is no secret key at all (ti(n) = 0), every feebly trapdoor candidate {(Key_n, Eval_n, Inv_n)}_{n=1}^∞ has order of security 1, since the sequence of circuits {Inv_n}_{n=1}^∞ successfully inverts it. If a sequence {(Key_n, Eval_n, Inv_n)}_{n=1}^∞ implements a trapdoor function in the usual cryptographic sense, then k = ∞. Moreover, k = ∞ even if the bounds on adversary size are just a bit more than linear, e.g., if the adversary requires O(n log n) gates. Our definitions are not designed to distinguish between these (very different) cases because, unfortunately, any nonlinear lower bound on general circuit complexity appears very far away from our current state of knowledge.

Let us also note explicitly that we are talking about one-time security. An adversary can amortize the complexity of his circuit over inverting a feebly trapdoor candidate for the second time on the same seed, for example, by computing the trapdoor information once and then reusing it. Thus, in this setting one has to pick a new seed for every input.

3 Gate Elimination for Linear Functions

Gate elimination is virtually the only method we have to prove lower bounds in general circuit complexity;

so far, it has been used for every single lower bound [3,18,24,26]. The basic idea of this method is the following inductive argument. Consider a function f and a circuit of minimal size C that computes it. Now substitute some value c for some variable x, thus obtaining a circuit for the function f|_{x=c}. The original circuit C can now be simplified, because the gates that had this variable as an input become either unary (recollect that negations can be embedded into subsequent gates) or constant (in this case we can even proceed to eliminate subsequent gates). After figuring out how many gates one can eliminate at every step, one proceeds by induction as long as it is possible to find a suitable variable that eliminates enough gates. Evidently, the number of eliminated gates is a lower bound on the complexity of f.

Usually, the important case here is when a gate is nonlinear, such as an AND or an OR gate. In that case, it is always possible to choose a value for an input of such a gate so that the gate becomes constant and, therefore, its immediate descendants can also be eliminated. However, in this paper we deal with gate elimination for linear functions. We do not know how to prove that one cannot, in general, produce a smaller circuit for a linear function with nonlinear gates, but it is evident that we cannot assume any gates to be nonlinear in this setting.
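The counting step behind gate elimination can be made concrete. The sketch below is our illustration (with a deliberately simplified wiring model, not the paper's formalism): it counts the gates that disappear when one input variable is fixed to a constant, since every gate reading that variable becomes unary or constant and can be removed.

```python
def eliminated_gates(n, gates, var):
    """Count gates that read input node `var` directly. Each such gate
    becomes unary (or constant) once x_var is fixed, so it can be removed;
    this is a lower bound on the saving from one elimination step.
    Inputs are nodes 0..n-1; gate k is a pair (i, j) of node indices."""
    return sum(1 for (i, j) in gates if var in (i, j))

# Example: two gates both reading x0 (x0 has two outgoing edges, both
# to gates): fixing x0 eliminates two gates at once.
gates = [(0, 1), (0, 2)]          # x0 XOR x1 and x0 XOR x2, say
assert eliminated_gates(3, gates, 0) == 2
assert eliminated_gates(3, gates, 1) == 1
```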

Thus, gate elimination distills to two very simple ideas. Idea 1 is trivial and has been noted many times before, while Idea 2 will allow us to devise better feebly secure constructions in Section 4.

Since we are dealing with linear functions, we will, for convenience, state our results in terms of matrices over F_2; the circuit complexity C_δ(A) of a matrix A is the circuit complexity of the corresponding linear function. By A_i we denote the matrix A without its i-th column; note that if A corresponds to f then A_i corresponds to f|_{x_i=0}. If a matrix A has a zero i-th column, the corresponding function does not depend on the input x_i; in what follows, we will always assume that functions depend nontrivially on all their inputs and thus the matrices have no zero columns; we call such matrices nontrivial. Note that if A is a submatrix of B then C_δ(A) ≤ C_δ(B) for all δ ∈ [0, 1].
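The correspondence between removing a column and substituting a constant can be checked mechanically; the snippet below (ours) verifies on a small example that A_i is indeed the matrix of f|_{x_i=0}.

```python
from itertools import product

def apply(A, x):
    """Multiply matrix A (list of rows of bits) by vector x over F_2."""
    return [sum(a & b for a, b in zip(row, x)) % 2 for row in A]

def drop_column(A, i):
    """A_i: the matrix A without its i-th column."""
    return [row[:i] + row[i + 1:] for row in A]

A = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
i = 1
Ai = drop_column(A, i)
for rest in product([0, 1], repeat=2):
    x = list(rest[:i]) + [0] + list(rest[i:])   # full input with x_i = 0
    assert apply(A, x) == apply(Ai, list(rest))
```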

Idea 1. Suppose that for n steps, there is at least one gate to eliminate. Then C(f) ≥ n.

Theorem 1. Fix a real number δ ∈ [0, 1]. Suppose that P = {P_n}_{n=1}^∞ is a series of predicates defined on matrices over F_2 with the following properties:

– if P_1(A) holds then C_δ(A) ≥ 1;
– if P_n(A) holds then P_m(A) holds for every 1 ≤ m ≤ n;
– if P_n(A) holds then, for every index i, P_{n−1}(A_i) holds.

Then, for every matrix A with n + 1 columns, if P_n(A) holds then C_δ(A) ≥ n.

Proof. The proof is a straightforward induction on the index of P_i; the first property of P provides the base, and the other properties take care of the induction step. For the induction step, consider the first gate of an optimal circuit C implementing A. By the monotonicity property of P and the induction base, the circuit is nontrivial, so there is a first gate. Consider a variable x_i entering that gate. Note that if C computes f on a δ fraction of its inputs then, for some constant c, C|_{x_i=c} computes f|_{x_i=c} on at least a δ fraction of its inputs. If we substitute this value for the variable, we get a circuit C|_{x_i=c} that has at most size(C) − 1 gates and implements A_i on at least a δ fraction of inputs.

The theorem is absolutely trivial; however, it has so far been the only instrument available for gate elimination in linear functions. In fact, the only instrument has been an even simpler proposition that dates back to the mid-1970s.

Proposition 1 ([16,22]; [10, Theorems 3 and 4]; [12, Proposition 1]).

1. Suppose that f : B^n → B depends non-idly on each of its n variables, that is, for every i there exist values a_1, ..., a_{i−1}, a_{i+1}, ..., a_n ∈ B such that f(a_1, ..., a_{i−1}, 0, a_{i+1}, ..., a_n) ≠ f(a_1, ..., a_{i−1}, 1, a_{i+1}, ..., a_n). Then C(f) ≥ n − 1.
2. Let f = (f^{(1)}, ..., f^{(m)}) : B^n → B^m, where f^{(k)} is the k-th component of f. If the m component functions f^{(i)} are pairwise different and each of them satisfies C(f^{(i)}) ≥ c ≥ 1, then C(f) ≥ c + m − 1.

The proof is given in [12]. Note that for linear functions, statement 1 of Proposition 1 follows from Theorem 1 with P_n(A) = "A has a row with at least n + 1 ones". We also derive another corollary.

Corollary 1. If A is a matrix of rank n and each column of A has at least two ones, then C(A) ≥ n − 2.

Proof. Take P_k(A) = "rank(A) ≥ k + 2 and each column of A has at least two ones".
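A fact implicitly used here is that deleting one column decreases the rank by at most one, so a rank-based predicate loses at most one index per elimination step. The snippet below (ours) checks this over F_2 on a small example.

```python
def rank_f2(A):
    """Rank over F_2 by Gaussian elimination on a list of bit rows."""
    A = [row[:] for row in A]
    r = 0
    for c in range(len(A[0]) if A else 0):
        piv = next((i for i in range(r, len(A)) if A[i][c]), None)
        if piv is None:
            continue
        A[r], A[piv] = A[piv], A[r]
        for i in range(len(A)):
            if i != r and A[i][c]:
                A[i] = [a ^ b for a, b in zip(A[i], A[r])]
        r += 1
    return r

A = [[1, 1, 0, 1], [0, 1, 1, 0], [1, 0, 1, 1]]
for i in range(4):
    Ai = [row[:i] + row[i + 1:] for row in A]
    assert rank_f2(Ai) >= rank_f2(A) - 1   # rank drops by at most one
```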

Idea 2. Suppose that for n steps, there exists an input in the circuit with two outgoing edges, and, moreover, in m of these cases both of these edges go to gates (rather than to a gate and an output). Then C(f) ≥ n + m.

Theorem 2. We call a nonzero entry of a matrix unique if it is the only nonzero entry in its row. Fix a real number δ ∈ [0, 1]. Suppose that P = {P_n}_{n=1}^∞ is a series of predicates defined on matrices over F_2 with the following properties:

– if P_1(A) holds then C(A) ≥ 1;
– if P_n(A) holds then P_m(A) holds for every 1 ≤ m ≤ n;
– if P_n(A) holds then, for every index i, if the i-th column has no unique entries then P_{n−2}(A_i) holds, and otherwise P_{n−1}(A_i) holds.

Then, for every matrix A with n + 1 different columns, if P_n(A) holds for some n then C(A) ≥ n and, moreover, C_{3/4}(A) ≥ n.

Proof. We argue by induction on n; for n = 1 the statement is obvious.

Consider the first gate g in an optimal circuit implementing A. Since g is first, its incoming edges come from inputs of the circuit; denote them by x_i and x_j. There are three cases.

1. One of the input variables of g, say x_i, goes directly to an output y_k. Then by setting x_i to a constant we can eliminate one gate; however, in this case y_k corresponds to a row with only one nonzero element, so the i-th column has a unique entry, and P_{n−1}(A_i) holds. Therefore, we invoke the induction hypothesis as C(A_i) ≥ n − 1 and get the necessary bound.

2. One of the input variables of g, say x_i, goes to another gate. Then by setting x_i to a constant we can eliminate two gates; by the properties of P, P_{n−2}(A_i) holds, so we invoke the induction hypothesis as C(A_i) ≥ n − 2.

3. Neither x_i nor x_j enters any other gate or output. In this case, the circuit depends on x_i and x_j only through g(x_i, x_j); we show that this cannot be the case for a circuit computing A on more than 3/4 of the inputs. A itself depends on x_i and x_j separately because all of its columns are different; in particular, for one of these variables, say x_i, there exists an output y_k that depends on x_i but not on x_j: y_k = x_i ⊕ (⊕_{x∈X} x), where x_j ∉ X. On the other hand, since every gate in an optimal circuit depends nontrivially on both of its inputs, there exist values a and b such that g(0, a) = g(1, b). Thus, for every assignment of the remaining variables, either on input strings with (x_i = 0, x_j = a) or on input strings with (x_i = 1, x_j = b) the circuit makes a mistake, which makes it wrong on at least 1/4 of all inputs.

Note that Theorem 2 directly generalizes and strengthens Theorem 1.

Corollary 2. Fix a real number δ ∈ [0, 1]. Suppose that R = {R_n}_{n=1}^∞ and Q = {Q_m}_{m=1}^∞ are two series of predicates defined on matrices over F_2 with the following properties:

– if R_1(A) holds then C(A) ≥ 1;
– if R_n(A) holds then R_k(A) holds for every 1 ≤ k ≤ n;
– if R_n(A) holds then, for every i, R_{n−1}(A_i) holds;
– if Q_1(A) holds then C(A) ≥ 1;
– if Q_m(A) holds then Q_k(A) holds for every 1 ≤ k ≤ m;
– if Q_m(A) holds then, for every i, Q_{m−1}(A_i) holds;
– if Q_m(A) holds and A_i has more zero rows than A (i.e., removing the i-th column has removed the last nonzero element from at least one row) then Q_m(A_i) holds.

Then, for every matrix A with n + 1 columns, all of whose columns are different, if R_n(A) and Q_m(A) hold for some n ≥ m then C(A) ≥ n + m and, moreover, C_{3/4}(A) ≥ n + m.

Proof. Immediately follows from Theorem 2 for P_n(A) = ∨_k (R_k(A) ∧ Q_{n−k}(A)).

Corollary 3 ([12, Lemma 5]). Let t, u ≥ 1. Assume that φ is a linear function with matrix A over F_2. Assume also that all columns of A are different, every row of A has at least u nonzero entries, and after removing any t columns of A the matrix still has at least one row containing at least two nonzero entries. Then C(φ) ≥ u + t and, moreover, C_{3/4}(φ) ≥ u + t.

Proof. Take R_n(A) = "after removing any n columns of A, it still has at least one nonzero row", Q_0(A) = "true", and Q_m(A) = "every row of A has at least m + 1 ones" for m > 0. Then R_{t+1}(A) and Q_{u−1}(A) hold, and R and Q satisfy the conditions of Corollary 2, which gives the desired bound. Note that in this case Q_m for m > 0 cannot hold for a matrix in which some row has only a single one, so in the gate elimination proof, for the first u − 1 steps two gates are eliminated, and then for t − u + 2 steps, one gate is eliminated.
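The hypotheses of Corollary 3 are finite and can be verified by brute force on concrete matrices; the checker below (ours) does exactly that and reports the resulting lower bound u + t.

```python
from itertools import combinations

def corollary3_bound(A, t, u):
    """Return u + t if matrix A meets the hypotheses of Corollary 3,
    or None if some hypothesis fails."""
    cols = range(len(A[0]))
    # all columns must be different
    if len({tuple(r[c] for r in A) for c in cols}) != len(A[0]):
        return None
    # every row must have at least u ones
    if not all(sum(row) >= u for row in A):
        return None
    # removing any t columns must leave a row with >= 2 ones
    for removed in combinations(cols, t):
        kept = [c for c in cols if c not in removed]
        if not any(sum(row[c] for c in kept) >= 2 for row in A):
            return None
    return u + t          # then C(A) >= u + t

# 3x5 example: distinct columns, every row has >= 3 ones, t = 2 works.
A = [[1, 1, 1, 0, 1],
     [0, 1, 1, 1, 0],
     [1, 0, 1, 1, 0]]
assert corollary3_bound(A, t=2, u=3) == 5
```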

We also derive another, even stronger corollary that will be important for new feebly secure constructions.

Corollary 4. Let t ≥ u ≥ 2. Assume that A is a u × t matrix with different columns, and each column of A has at least two nonzero elements (ones). Then C(A) ≥ 2t − u and, moreover, C_{3/4}(A) ≥ 2t − u.

Fig. 1. Rewiring linear circuits.

Proof. Take P_n(A) = "twice the number of nonzero columns of A minus the number of nonzero rows of A is at least n". Then P_{2t−u}(A) holds, and P satisfies the conditions of Theorem 2.

Naturally, we could prove Corollaries 1 and 4 directly. We have chosen the path of generalization for two reasons: one, to make Theorem 3 more precise and more general, and two, to show the limits of gate elimination for linear functions. As we have already mentioned, for linear functions we cannot count on nonlinear gates that could eliminate their descendants. In Theorems 1 and 2, we have considered two basic cases: when there is only one edge outgoing from a variable and when there are two edges (going either to two gates or to a gate and an output).

Figure 1 shows why we cannot expect anything more, e.g., from a variable with three outgoing edges. On the left, Figure 1 shows a part of a circuit in which a variable x has three outgoing edges; on the right, a rewiring of the same circuit that has all the same outputs but in which x enters only two gates. Naturally, this does not prove anything: we have introduced a new gate (so the circuit is no longer optimal) and have only relocated the extra input from x to an extra input from f. However, this simple example does show that to get better bounds, simple local gate elimination does not suffice, and one has to consider global properties of the function and the corresponding optimal circuit.

We finish this section with an extension of these results to block diagonal matrices. In general, we cannot prove that the direct sum of several functions has circuit complexity equal to the sum of the circuit complexities of these functions; counterexamples are known as "mass production" [26]. However, for linear functions and gate elimination in the flavour of Theorems 1 and 2, we can. The following theorem generalizes Lemma 6 of [12].

Theorem 3. Suppose that a linear function φ is given by a block diagonal matrix

A = diag(A_1, A_2, ..., A_k)

(blocks A_1, ..., A_k on the diagonal, zeros elsewhere), and every A_j satisfies the conditions of Theorem 2 with predicates P^j = {P^j_n}_{n=1}^∞, and P^j_{n_j}(A_j) holds for every j. Then C(φ) ≥ Σ_{j=1}^{k} n_j.

Proof. We invoke Theorem 2 with the predicate composed of the original predicates:

P_n = ∨_{i_1 + ... + i_k = n} (P^1_{i_1} ∧ P^2_{i_2} ∧ ... ∧ P^k_{i_k}).

It is now straightforward to check that P = {P_n}_{n=1}^∞ satisfies the conditions of Theorem 2 (since every deleted column affects only one block), and the block diagonal matrix satisfies P_{n_1 + ... + n_k}.

4 A New Linear Feebly Secure Trapdoor Function

In our constructions, we follow the general idea of [12]: first, we find a feebly trapdoor candidate that makes the adversary work harder than function inversion, but in which function evaluation is even harder. Then, we add a feebly secure one-way function as a separate block and thus reduce the work needed for function evaluation; this construction has been discussed in detail in [12].

We begin with some preliminaries. By U_n we denote the upper triangular square n × n matrix of ones; its inverse is bidiagonal: (U_n)_{ij} = 1 for i ≤ j and 0 otherwise, while U_n^{−1} has ones exactly on the main diagonal and the first superdiagonal. Note that U_n^2 is an upper triangular matrix with zeros and ones chequered. In what follows, we often write matrices that consist of other matrices as blocks; e.g., (U_n U_n) is an n × 2n matrix consisting of two upper triangular blocks.

Lemma 1.

1. C_{3/4}(U_n) = n − 1.
2. C_{3/4}(U_n^2) = n − 2.
3. C_{3/4}(U_n^{−1}) = n − 1.
4. C_{3/4}((U_n U_n)) = 2n − 1.
5. 3n − 6 ≤ C_{3/4}((U_n^2 U_n)) ≤ C((U_n^2 U_n)) ≤ 3n − 3.
6. 3n − 4 ≤ C_{3/4}((U_n U_n^{−1})) ≤ C((U_n U_n^{−1})) ≤ 3n − 2.

Proof. The lower bounds in items 1–3 are trivial: all rows are different, and no inputs except one (two for item 2) are connected to outputs directly, so we need at least one gate for every remaining row. The lower bound for item 4 follows from simple counting: the first row of this matrix has 2n nonzero entries, so at least 2n − 1 gates are needed to compute it. The lower bound for item 5 (respectively, 6) follows by Corollary 4: the matrix (U_n^2 U_n) (resp., (U_n U_n^{−1})) satisfies its assumptions except for three (resp., two) columns, so the corollary is invoked for t = 2n − 3 (resp., t = 2n − 2) and u = n.

We prove the upper bounds by providing explicit circuit constructions. To compute item 1, note that every row differs from the previous one in a single position only, so we can compute each output out_i as out_i = out_{i+1} ⊕ in_i; moreover, out_n = in_n, so we need no gates for it. The same idea applies to item 2, but in this case out_n and out_{n−1} are computed directly, and out_i = out_{i+2} ⊕ in_i. To compute item 3, we simply compute each row independently. To compute item 4, we apply an idea from [12]: note that (U_n U_n) · (a; b) = U_n · a ⊕ U_n · b = U_n · (a ⊕ b), so we use n gates to compute a ⊕ b and then compute the result with n − 1 further gates. To compute items 5 and 6, note that (A B) · (a; b) = A · a ⊕ B · b; thus, we can divide each of these computations into two parts, compute them independently using the previous constructions, and then use n gates to compute the final XOR.
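The circuits from this proof are easy to spell out in code. The sketch below (ours) implements items 1 and 4: the recurrence out_i = in_i ⊕ out_{i+1} for U_n, and the shared-XOR trick for (U_n U_n), counting gates along the way.

```python
import random

def u_mul(x):                      # U_n * x directly: suffix sums mod 2
    return [sum(x[i:]) % 2 for i in range(len(x))]

def u_circuit(x):
    """Compute U_n x with n - 1 XOR gates, using out_i = in_i ^ out_{i+1}."""
    n, gates = len(x), 0
    out = [0] * n
    out[n - 1] = x[n - 1]          # out_n = in_n: no gate needed
    for i in range(n - 2, -1, -1):
        out[i] = x[i] ^ out[i + 1]
        gates += 1
    return out, gates

def unun_circuit(a, b):
    """Compute (U_n U_n)(a; b) = U_n(a XOR b) with 2n - 1 gates."""
    s = [ai ^ bi for ai, bi in zip(a, b)]   # n gates
    out, g = u_circuit(s)                   # n - 1 gates
    return out, len(a) + g

n = 8
a = [random.randint(0, 1) for _ in range(n)]
b = [random.randint(0, 1) for _ in range(n)]
out, gates = unun_circuit(a, b)
assert out == [x ^ y for x, y in zip(u_mul(a), u_mul(b))]
assert gates == 2 * n - 1
```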

For the first construction, we assume that the lengths of the public information pi, the trapdoor information ti, the message m, and the ciphertext c are the same and equal to n.

We let ti = U_n · pi and c = (U_n U_n^{−1}) · (pi; m). In this case, an adversary would have to compute (U_n U_n) · (ti; c) = (U_n^2 U_n) · (pi; c). Now inversion without the trapdoor is harder than inversion with the trapdoor, but encryption has about the same complexity as inversion without the trapdoor, so we cannot call this a feebly trapdoor function yet.
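The algebra of this construction can be sanity-checked numerically. The sketch below is ours and assumes the reading ti = U_n·pi, c = U_n·pi ⊕ U_n^{−1}·m; it verifies inversion with the trapdoor, m = U_n·(ti ⊕ c), and the adversary's version, m = U_n^2·pi ⊕ U_n·c.

```python
import random

def u_mul(x):        # U_n * x: suffix sums mod 2
    return [sum(x[i:]) % 2 for i in range(len(x))]

def u_inv_mul(x):    # U_n^{-1} * x: bidiagonal, out_i = x_i ^ x_{i+1}
    n = len(x)
    return [x[i] ^ (x[i + 1] if i + 1 < n else 0) for i in range(n)]

n = 16
pi = [random.randint(0, 1) for _ in range(n)]
m = [random.randint(0, 1) for _ in range(n)]

ti = u_mul(pi)                                           # ti = U_n pi
c = [x ^ y for x, y in zip(u_mul(pi), u_inv_mul(m))]     # c = U_n pi + U_n^{-1} m

# Inversion with the trapdoor: m = U_n (ti + c).
assert u_mul([x ^ y for x, y in zip(ti, c)]) == m
# Inversion without it: m = U_n^2 pi + U_n c (the adversary's matrix).
assert [x ^ y for x, y in zip(u_mul(u_mul(pi)), u_mul(c))] == m
```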

To solve this problem, we consider a feebly one-way linear function A and construct the protocol in the following way (I_n is the identity matrix here, m = (m_1; m_2), c = (c_1; c_2)):

Key_n: (U_n 0; 0 I_n) · (s; s) = (ti; pi),
Eval_n: (U_n U_n^{−1} 0; 0 0 A) · (pi; m_1; m_2) = (c_1; c_2),
Inv_n: (U_n U_n 0; 0 0 A^{−1}) · (ti; c_1; c_2) = (m_1; m_2).

The adversary's problem now becomes to compute

Adv_n: (U_n^2 U_n 0; 0 0 A^{−1}) · (pi; c_1; c_2) = (m_1; m_2).

For the feebly one-way function A, we fix a small ε > 0 and take Hiltgen's linear function with order of security 2 − ε [10]; we take its size to be δn with δ chosen below. In Hiltgen's constructions, this means that C_{3/4}(A) = δn + o(n) and C_{3/4}(A^{−1}) = (2 − ε)δn + o(n). Now Lemma 1 and Theorem 3 imply the following complexity bounds:

C_{3/4}(Key_n) = n − 1,
C_{3/4}(Eval_n) = 3n + δn + o(n) = (3 + δ)n + o(n),
C_{3/4}(Inv_n) = 2n + (2 − ε)δn + o(n) = (2 + (2 − ε)δ)n + o(n),
C_{3/4}(Adv_n) = 3n + (2 − ε)δn + o(n) = (3 + (2 − ε)δ)n + o(n).

The order of security of this construction is now

lim inf_{n→∞} min( C_{3/4}(Adv_n)/C(Eval_n), C_{3/4}(Adv_n)/C(Inv_n), C_{3/4}(Adv_n)/C(Key_n) ) = min( (3 + (2 − ε)δ)/(3 + δ), (3 + (2 − ε)δ)/(2 + (2 − ε)δ) )

(the ratio with C(Key_n) tends to infinity and does not affect the minimum). This expression reaches its maximum for δ = 1, and this maximum is (5 − ε)/4, which tends to 5/4 as ε → 0. Thus, we have proven the following theorem.

Theorem 4. For every ε > 0, there exists a linear feebly trapdoor function with seed length pi(n) = ti(n) = n, lengths of inputs and outputs c(n) = m(n) = 2n, and order of security (5 − ε)/4.
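The optimization over δ behind this theorem is one-dimensional and easy to double-check numerically; the script below (ours) evaluates the two relevant ratios on a grid, with Adv ~ (3 + (2 − ε)δ)n, Eval ~ (3 + δ)n, and Inv ~ (2 + (2 − ε)δ)n, and confirms that the minimum is maximized at δ = 1, approaching 5/4 as ε → 0.

```python
def order_of_security(d, e):
    """min of Adv/Eval and Adv/Inv for block-size parameter d and Hiltgen
    parameter e (the Key ratio tends to infinity and is irrelevant)."""
    adv = 3 + (2 - e) * d
    return min(adv / (3 + d), adv / (2 + (2 - e) * d))

e = 1e-6
best_d = max((i / 1000 for i in range(1, 1001)),
             key=lambda d: order_of_security(d, e))
assert best_d == 1.0                                   # maximum at d = 1
assert abs(order_of_security(1.0, e) - 5 / 4) < 1e-5   # -> 5/4 as e -> 0
```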

In Theorem 3, we have generalized a hardness amplification procedure similar to [12, Theorem 2]; with it, we can obtain superpolynomial security guarantees against weaker adversaries.

Theorem 5. For every ε > 0, there exists a linear feebly trapdoor function with seed length pi(n) = ti(n) = n, lengths of inputs and outputs c(n) = m(n) = 2n, complexities C_{3/4}(Key_n) = n − 1, C_{3/4}(Eval_n) = 4n + o(n), and C_{3/4}(Inv_n) = (4 − ε)n + o(n), and order of security (5 − ε)/4. Moreover, no adversary with less than (5 − ε)n − o(n) gates can invert this feebly trapdoor function on more than 2^{−√n + o(√n)} of its inputs.

Proof. We consider the block diagonal matrix

H = diag(X, X, ..., X)

with m diagonal blocks, where X is the matrix of the trapdoor function constructed in Theorem 4, and apply Theorem 3. Stacking the matrices up into a large block diagonal matrix does not change the parameters of a feebly trapdoor function.

5 Nonconstructive bounds for linear functions

In Section 3, we have seen that we cannot currently hope to prove more than linear lower bounds on general circuit complexity; the same is true, of course, for linear functions. A classical result shows by counting that among general Boolean functions of n variables, almost all have circuit complexity about 2^n/n. But maybe for the linear case, nonlinear bounds are impossible from the beginning?

It turns out that linear functions with nonlinear lower bounds do exist. References to this result can be found in [2,4], but we have not been able to find a detailed proof in the literature, so we include it here and refine it to get exact constants.

Theorem 6.

1. For every n there exists a constant α_n such that the circuit complexity of every linear function φ : {0, 1}^n → {0, 1}^n does not exceed α_n · n²/log n, and lim_{n→∞} α_n = 1.
2. For every n ≥ 3, there exists a linear Boolean function φ : {0, 1}^n → {0, 1}^n with circuit complexity greater than n²/(2 log n).

Proof. 1. Upper bound. Let A be the matrix of φ. For clarity, we assume that n is a power of 2 (the same proof goes through with very minor modifications if it is not). We implement A as follows. First, we generate all possible rows c_i of zeros and ones of length l = q log n, where q is a constant to be selected later. Denoting the inputs of φ by x = (x_1 ... x_n)^T, we precompute the values of all possible combinations c_i · (x_{j+1} ... x_{j+l})^T, where j is a multiple of l. The total number of gates needed for this operation is bounded from above by

(n/(q log n)) · 2^{q log n} · (q log n − 1) ≤ n^q · n = n^{q+1}.

Let A_1, A_2, ..., A_n be the columns of A. To find A · x, we compute

A · x = (A_1 ... A_l) · (x_1 ... x_l)^T ⊕ (A_{l+1} ... A_{2l}) · (x_{l+1} ... x_{2l})^T ⊕ ... ⊕ (A_{n−l+1} ... A_n) · (x_{n−l+1} ... x_n)^T.

After the preprocessing, every product in this formula has already been computed, and all we need to do is to choose the correct wire, so no gates are needed here. After that, n · (n/l − 1) gates are needed to XOR everything into the total result. Thus, the total number of gates needed is

n^{q+1} + n · (n/(q log n) − 1) ≤ n^{q+1} + n²/(q log n).

Setting q = 1/(1 + ε), we get the upper bound (1 + ε) · n²/log n for an arbitrarily small ε.
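This preprocessing argument is a "four Russians"-style table computation, and it can be run. The sketch below (ours, with the block length l as a free parameter) builds each block's 2^l linear combinations with one XOR gate apiece beyond the singletons, then combines one precomputed bit per block for every output row, and checks the result against naive multiplication.

```python
def count_and_multiply(A, x, l):
    """Compute A*x over F_2 with block-table preprocessing; count XOR gates."""
    n = len(x)
    gates = 0
    tables = []                        # per block: value of every subset mask
    for j in range(0, n, l):
        blk = x[j:j + l]
        table = [0] * (1 << len(blk))
        for mask in range(1, 1 << len(blk)):
            low = mask & -mask         # lowest set bit of the mask
            if mask == low:
                table[mask] = blk[low.bit_length() - 1]      # a wire, no gate
            else:
                table[mask] = table[mask ^ low] ^ table[low]  # one XOR gate
                gates += 1
        tables.append(table)
    out = []
    for row in A:
        sels = []
        for b, j in enumerate(range(0, n, l)):
            mask = sum(bit << k for k, bit in enumerate(row[j:j + l]))
            if mask:
                sels.append(tables[b][mask])   # precomputed: just pick a wire
        acc = 0
        for k, s in enumerate(sels):
            acc = s if k == 0 else acc ^ s     # XOR gates to combine blocks
            gates += (k > 0)
        out.append(acc)
    return out, gates

def naive(A, x):
    return [sum(a & b for a, b in zip(row, x)) % 2 for row in A]

A = [[1, 1, 0, 1], [0, 0, 1, 1], [1, 0, 0, 0], [1, 1, 1, 1]]
x = [1, 0, 1, 1]
out, gates = count_and_multiply(A, x, 2)
assert out == naive(A, x)
```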

2. Lower bound. The lower bound is proven by counting. Let q = (1 − ε) · n²/(2 log n). We estimate T, the number of circuits of size q. There are 16 types of gates, and a circuit's description consists of the type and the two inputs of every gate. Therefore, using q! ≥ (q/e)^q and n ≤ q,

T ≤ (16 · (n + q)²)^q / q! ≤ (16e · (n + q)²/q)^q ≤ (16e · (2q)²/q)^q = (64 · e · q)^q ≤ (64 · e · q)^{q+1} ≤ (n²)^{(1−ε) · n²/(2 log n) + 1} = 2^{(1−ε)n² + 2 log n}

for sufficiently large n. But the total number of linear Boolean functions of n arguments is 2^{n²}, which is greater than 2^{(1−ε)n² + 2 log n}. Therefore, there are linear Boolean functions with circuit complexity exceeding (1 − ε) · n²/(2 log n).

6 Conclusion

In this paper, we have discussed in detail the circuit complexity of linear Boolean functions. We have proven two general statements on gate elimination in linear functions, derived several corollaries, and applied them to find a new linear feebly secure trapdoor function with a better order of security than the previously known one [12].

While feebly secure cryptographic primitives can hardly be put to any practical use, they are still important from the theoretical point of view. As sad as it sounds, this is actually the frontier of provable, mathematically sound results on security; we do not know how to prove anything stronger. However, in Section 5 we have seen that with these bounds we are only scratching the surface even for linear Boolean functions, let alone nonlinear ones.

Further work in this direction is twofold. One can further develop the notions of feebly secure primitives. Orders of security can certainly be improved; perhaps new primitives (key agreement protocols, zero knowledge proofs, etc.) can find their feebly secure counterparts. This work can widen the scope of feebly secure methods, but the real breakthrough can only come from one place.

It becomes clear that cryptographic needs call for further advances in general circuit complexity. General circuit complexity has not had a breakthrough since the 1980s; nonconstructive lower bounds are easy to prove by counting, but constructive lower bounds remain elusive. The best bound we know is 3n − o(n), proven in 1984 [3]. At present, we do not know how to rise to this challenge; none of the known methods seem to work, so a general breakthrough is required for nonlinear lower bounds on circuit complexity. The importance of such a breakthrough can hardly be overstated; feebly secure cryptographic constructions provide yet another application for new circuit lower bounds.

Acknowledgements We thank Olga Melanich and the anonymous referees for pointing out important shortcomings in a preliminary version of our paper.

Work of the first author has been supported by the Yandex Stipend Program.

Work of the second author has been supported by the Russian Presidential Grant Programme for Young Ph.D.'s (grant no. MK-4089.2010.1) and for Leading Scientific Schools (grant no. NSh-5282.2010.1), by the Federal Target Programme "Scientific and scientific-pedagogical personnel of the innovative Russia", and by Russian Fund for Basic Research grants.

References

1. Ajtai, M., Dwork, C.: A public-key cryptosystem with worst-case/average-case equivalence. In: Proceedings of the 29th Annual ACM Symposium on Theory of Computing, pp. 284–293 (1997)
2. Alon, N., Karchmer, M., Wigderson, A.: Linear circuits over GF(2). SIAM Journal on Computing 19(6), 1064–1067 (1990)
3. Blum, N.: A Boolean function requiring 3n network size. Theoretical Computer Science 28, 337–345 (1984)
4. Bublitz, S.: Decomposition of graphs and monotone formula size of homogeneous functions. Acta Informatica 23, 410–417 (1986)
5. Diffie, W., Hellman, M.: New directions in cryptography. IEEE Transactions on Information Theory IT-22, 644–654 (1976)
6. Dwork, C.: Positive applications of lattices to cryptography. In: Proceedings of the 22nd International Symposium on Mathematical Foundations of Computer Science, Lecture Notes in Computer Science, vol. 1295, pp. 44–51 (1997)
7. Goldreich, O.: Foundations of Cryptography. Basic Tools. Cambridge University Press (2001)
8. Grigoriev, D., Hirsch, E.A., Pervyshev, K.: A complete public-key cryptosystem.

Groups, Complexity, and Cryptology 1, 1–12 (2009)
9. Harnik, D., Kilian, J., Naor, M., Reingold, O., Rosen, A.: On robust combiners for oblivious transfers and other primitives. In: Proceedings of EuroCrypt 2005, Lecture Notes in Computer Science, vol. 3494, pp. 96–113 (2005)
10. Hiltgen, A.P.: Constructions of feebly-one-way families of permutations. In: Proc.

of AsiaCrypt '92, pp. 422–434 (1992)
11. Hiltgen, A.P.: Cryptographically Relevant Contributions to Combinational Complexity Theory. ETH Series in Information Processing, vol. 3. Konstanz: Hartung-Gorre (1994)
12. Hirsch, E.A., Nikolenko, S.I.: A feebly secure trapdoor function. In: Proceedings of the 4th Computer Science Symposium in Russia, Lecture Notes in Computer Science, vol. 5675, pp. 129–142 (2009)
13. Koblitz, N., Menezes, A.: Another look at "provable security". Journal of Cryptology 20(1), 3–37 (2007)
14. Kojevnikov, A.A., Nikolenko, S.I.: New combinatorial complete one-way functions.

In: Proceedings of the 25th Symposium on Theoretical Aspects of Computer Science, pp. 457–466. Bordeaux, France (2008)
15. Kojevnikov, A.A., Nikolenko, S.I.: On complete one-way functions. Problems of Information Transmission 45(2), 108–189 (2009)
16. Lamagna, E.A., Savage, J.E.: On the logical complexity of symmetric switching functions in monotone and complete bases. Tech. rep., Brown University, Rhode Island (July 1973)
17. Melanich, O.: Nonlinear feebly secure cryptographic primitives. PDMI preprints 12 (2009)
18. Paul, W.J.: A 2.5n lower bound on the combinational complexity of Boolean functions. SIAM Journal on Computing 6, 427–443 (1977)
19. Regev, O.: On lattices, learning with errors, random linear codes, and cryptography. In: Proceedings of the 37th Annual ACM Symposium on Theory of Computing,

pp. 84–93 (2005)
20. Regev, O.: Lattice-based cryptography. In: Proceedings of the 26th Annual International Cryptology Conference (CRYPTO 2006), Lecture Notes in Computer Science,

vol. 4117, pp. 131–141 (2006)
21. Rivest, R.L., Shamir, A., Adleman, L.: A method for obtaining digital signatures and public-key cryptosystems. Communications of the ACM 21(2), 120–126 (1978)
22. Savage, J.E.: The Complexity of Computing. Wiley, New York (1976)
23. Shannon, C.E.: Communication theory of secrecy systems. Bell System Technical Journal 28(4), 656–717 (1949)
24. Stockmeyer, L.: On the combinational complexity of certain symmetric Boolean functions. Mathematical Systems Theory 10, 323–326 (1977)
25. Vernam, G.S.: Cipher printing telegraph systems for secret wire and radio telegraphic communications. Journal of the AIEE 55, 109–115 (1926)
26. Wegener, I.: The Complexity of Boolean Functions. B. G. Teubner and John Wiley & Sons (1987)

Electronic Colloquium on Computational Complexity, Report No. 91 (2011)

Optimal heuristic algorithms for the image of an injective function

Edward A. Hirsch† Dmitry Itsykson† Valeria Nikolaenko‡ Alexander Smal†

May 20, 2011

Abstract

The existence of optimal algorithms is not known for any decision problem in NP \ P.

We consider the problem of testing membership in the image of an injective function. We construct optimal heuristic algorithms for this problem in both randomized and deterministic settings (a heuristic algorithm can err on a small fraction of the inputs; the bound 1/d on this fraction is given to it via the additional integer parameter d). Thus for this problem we improve an earlier construction of an optimal acceptor (that is optimal on the negative instances only) and also give a deterministic version.

1 Introduction

1.1 Optimal algorithms

When we face a computational problem that is not known to be solvable in a reasonable (say, polynomial) amount of time, we are still interested in solving it as fast as possible. The existence of an optimal algorithm, one that for every possible input returns its answer at least as fast (up to a polynomial) as any other algorithm for the same problem does, is an important structural feature of the problem and of the model of computation (deterministic algorithms, bounded-error randomized algorithms, etc.).

While Levin's optimal algorithm for NP search problems has been known for decades [Lev73], it does not give an optimal algorithm for any decision problem: while for NP-complete problems the worst-case complexities of search and decision are polynomially related, a decision algorithm can still be exponentially faster for some inputs. Also, Levin's algorithm does not stop at all on the negative instances. For many interesting languages, including the language of Boolean tautologies TAUT, the existence of an algorithm that is optimal on the positive instances only (such an algorithm is called an optimal acceptor) is equivalent to the existence of a p-optimal proof system (that is, a proof system that has the shortest possible proofs, and these proofs can be constructed by a polynomial-time algorithm given proofs in any other proof system) [KP89, Sad99, Mes99] (see [Hir10] for a survey). Monroe [Mon11] recently gave a conjecture implying that optimal acceptors for TAUT do not exist.

Supported in part by Federal Target Programme "Scientific and scientific-pedagogical personnel of the innovative Russia" 2009–2013, by the grants NSh-5282.2010.1 and MK-4089.2010.1 from the President of RF, by the Programme of Fundamental Research of RAS, and by RFBR grants. The second author is also supported by a Rokhlin Fellowship.

† Steklov Institute of Mathematics at St. Petersburg, Russian Academy of Sciences. Web: http://logic.pdmi.ras.ru/{~hirsch,~dmitrits,~smal}
‡ St. Petersburg Academic University, Russian Academy of Sciences

1.2 Optimal heuristic randomized acceptors

An obvious obstacle to constructing an optimal algorithm by enumeration is that no efficient procedure is known for enumerating the set of all correct algorithms for, say, TAUT or SAT. A possible workaround is to check the correctness for a particular input;

however, even for SAT, a search-to-decision reduction maps the input instance to a different instance and thus potentially increases the complexity.

The correctness can, however, be checked in the heuristic setting. A heuristic algorithm for a language L and probability distribution D on the inputs is allowed to make errors on some inputs; the probability of error according to D must be kept below 1/d, where d is an integer parameter given to the algorithm. In [HIMS10], an optimal heuristic randomized acceptor is constructed for every recursively enumerable language L and every polynomial-time samplable distribution D concentrated on L. In other words, this is an algorithm that accepts (with bounded probability of error) every x ∈ L in the fastest possible way, and accepts x ∉ L only for inputs of total D-probability at most 1/d.

1.3 Our results: derandomization and optimal heuristic algorithms

In this paper we consider the decision problem for the image of an injective function (under the uniform distribution) that maps n-bit strings to (n + 1)-bit strings. Its study is motivated, for example, by the fact that a particular case of this problem is the problem of recognizing the image of an injective pseudorandom generator, which has no polynomial-time heuristic randomized algorithm [HIMS10, Theorem 5.2]. It is known that injective pseudorandom generators exist if one-way permutations exist [Gol95].

For this problem, we extend the previous results in two directions. First, we devise an optimal algorithm, while [HIMS10] gave a construction of an optimal acceptor. In [HIMS10], the correctness test was performed by repeatedly sampling inputs in L and running a candidate acceptor on them. In our case L is the image of an injective function, and we can still sample it. However, we do not have a samplable distribution on the complement of L, that is, on the complement of the image. The check is then done by testing the algorithm on a random input from {0,1}^n and computing its overall probability of acceptance.

Our second result is a derandomization of this construction, namely, a deterministic algorithm that is optimal on the average. To do this, we use an expander-based construction of Goldreich and Wigderson [GW97] of small families of functions with good mixing properties, and also use the input itself as a source of pseudorandomness. It also derandomizes the construction of optimal acceptors of [HIMS10] if we consider it for the same class of problems (i.e., recognizing the complement of the image of an injective function).

A byproduct of the derandomization is the existence of an optimal automatizable proof system for the complement of the image. For our problem, this extends [HIMS10, Theorem 4.1], where only an optimal weakly automatizable randomized heuristic proof system is constructed, i.e., a proof system where the automatization procedure outputs a proof in a stronger system. (The necessary definitions and the corollary are given in the Appendix.)

1.4 Organization of the paper

In Section 2 we give the necessary definitions. Then in Section 3 we give a general construction of an optimal algorithm that suits both the deterministic and randomized cases but misses an important part: the procedure for estimating the frequency of a particular answer of an algorithm on a particular distribution of the inputs. In Section 4 we give a (rather simple) randomized testing procedure, and in Section 5 we give a (somewhat more complicated) deterministic one. Finally, we present directions for further research in Section 6.

2 Definitions

2.1 Basic notation

An ensemble of probability distributions is a sequence of probability distributions {D_n}_{n∈ℕ}, where D_n is concentrated on {0,1}^n. We will denote such an ensemble by a single letter D and abuse the language by calling D a distribution.

Let U denote the ensemble {U_n}_{n∈ℕ}, where U_n is uniformly distributed on {0,1}^n. For every language L ⊆ {0,1}*, we denote the uniform distribution on L ∩ {0,1}^n by U_n(L); then U(L) = {U_n(L)}_{n∈ℕ}.

A distributional problem is a pair (L, D) consisting of a language L ⊆ {0,1}* and a distribution D.

We use subscripts to denote the probability space; for example, Pr_{x←D_n} means that the probability is taken over x distributed according to D_n, and Pr_A means that the probability is taken over the internal random bits of the algorithm A.

The algorithms that we study can output either 1 (accept), or 0 (reject), or ⊥ (give up). They can also diverge, i.e., run forever (denoted ∞). For an algorithm A and an integer T, we denote by A^T the algorithm that behaves as A until step T and then gives up.

The time spent by a randomized algorithm A on input x is defined as the median time

  t_A(x) = min { t ∈ ℕ : Pr_A[A(x) stops in time at most t] ≥ 1/2 }.

We will also use a similar notation for the order statistics, the "probability-p time":

  t_A^{(p)}(x) = min { t ∈ ℕ : Pr_A[A(x) stops in time at most t] ≥ p }.
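These quantile-style running times have a direct empirical analogue, which the following sketch illustrates (ours, not from the paper; the "algorithm" and its geometric running time are toy stand-ins):

```python
import math
import random

def empirical_time_quantile(run_once, x, p=0.5, trials=1001):
    """Empirical analogue of t_A^{(p)}(x): run the randomized
    algorithm `run_once` on x many times and return the smallest
    observed t such that a fraction >= p of the runs stop within t
    (p = 1/2 corresponds to the median time t_A)."""
    times = sorted(run_once(x) for _ in range(trials))
    return times[math.ceil(p * trials) - 1]

# Toy randomized "algorithm": its running time is geometric(1/2).
_rng = random.Random(0)

def toy_run(x):
    t = 1
    while _rng.random() > 0.5:
        t += 1
    return t

print(empirical_time_quantile(toy_run, None, p=0.5))
```

For the toy geometric running time the reported median is 1 or 2, matching P(t = 1) = 1/2.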

2.2 Randomized heuristic algorithms

Definition 2.1. A(x, d) is a randomized heuristic algorithm for a distributional problem (L, D) if for every n,

  Pr_{x←D_n; A}[A(x, d) ≠ L(x)] ≤ 1/d.

Remark 2.1. Note that [BT06] and [HIMS10] define randomized heuristic algorithms and acceptors in a different way, separating the probabilities over x and over A. Note also that [HIMS10, Sect. 2] proves that the algorithms defined in these two different ways simulate each other (the proof is given there for acceptors and goes through for algorithms without changes).

Definition 2.2. A function f : {0,1}* × ℕ → ℕ is polynomially bounded on a set X if there is a polynomial p such that for every x ∈ X and d ∈ ℕ, f(x, d) ≤ p(|x|·d).

A heuristic algorithm A is polynomially bounded on a set X if its median time t_A is polynomially bounded on X. If X equals {0,1}* we omit it.

Definition 2.3 ([BT06]). HeurBPP is the class of distributional problems that can be solved by polynomially bounded randomized heuristic algorithms.

Definition 2.4. A function f : {0,1}* × ℕ → ℕ dominates a function g : {0,1}* × ℕ → ℕ on a set X (denoted f ⪰ g) if there are polynomials p and q such that for all x ∈ X and d ∈ ℕ,

  g(x, d) ≤ max_{d′ ≤ q(|x|·d)} p(f(x, d′)·d·|x|).

Remark 2.2.
1. If f ⪰ g on X and f is polynomially bounded on X, then so is g.
2. ⪰ is transitive.

Definition 2.5. For heuristic algorithms A and A′ for the same distributional problem (L, D), the algorithm A simulates A′ if t_{A′} ⪰ t_A on supp D.

An optimal randomized heuristic algorithm for a distributional problem (L, D) simulates every randomized heuristic algorithm for (L, D).

2.3 Deterministic heuristic algorithms

Definition 2.6. A deterministic heuristic algorithm is a randomized heuristic algorithm that does not use its randomness.

The running time t_A is now simply the number of steps made by the algorithm A. However, for deterministic heuristic algorithms, the notions of polynomial boundedness and simulation will be relaxed by allowing the restrictions not to hold on a small fraction of the inputs.

Definition 2.7. A function f : {0,1}* × ℕ → ℕ is polynomially bounded on the average w.r.t. a distribution D if there is a polynomial p such that for every n, d ∈ ℕ,

  Pr_{x←D_n}[f(x, d) ≤ p(n·d)] ≥ 1 − 1/d.

A deterministic heuristic algorithm is polynomially bounded on the average if its running time is polynomially bounded on the average.

Definition 2.8. A function f : {0,1}* × ℕ → ℕ dominates g : {0,1}* × ℕ → ℕ on the average w.r.t. a distribution D (denoted f ⪰ g) if there are polynomials p and q such that q(n, d) ≥ 2d and for every n, d ∈ ℕ,

  Pr_{x←D_n}[g(x, d) ≤ p(n·d·f(x, q(n, d)))] ≥ 1 − 1/d.

It is easy to see that the class of functions polynomially bounded on the average is closed under domination on the average.

Proposition 2.1. Let f ⪰ g and let f be polynomially bounded on the average w.r.t. D. Then g is also polynomially bounded on the average w.r.t. D.

Proof. Let p and q be the two polynomials from the definition of ⪰, and let p′ be the polynomial from the definition of polynomial boundedness of f; without loss of generality we can assume that p is non-decreasing. The polynomial boundedness of f and the restriction on q give

  Pr_{x←D_n}[f(x, q(n, d)) ≤ p′(n·q(n, d))] ≥ 1 − 1/q(n, d) ≥ 1 − 1/(2d).

Substituting this into the domination condition, we get

  Pr_{x←D_n}[g(x, d) ≤ p(n·d·p′(n·q(n, d)))] ≥ 1 − 1/(2d) − 1/(2d) = 1 − 1/d.

Definition 2.9. For heuristic algorithms A and A′ for a distributional problem (L, D), we say that A simulates A′ if t_{A′} ⪰ t_A on the average w.r.t. D.

A deterministic heuristic algorithm for a distributional problem (L, D) is optimal on the average if it simulates every other deterministic heuristic algorithm for (L, D).

Definition 2.10 ([BT06]). HeurP is the class of distributional problems that can be solved by deterministic heuristic algorithms that are polynomially bounded on the average.

2.4 The problem of recognizing the image of an injective function

In this paper we concentrate on the following problem.

Definition 2.11. Let f : {0,1}* → {0,1}* be a polynomial-time computable injective function such that |f(x)| = |x| + 1. The problem of recognizing the image is the distributional problem (Im f, U), where U is the uniform distribution. We will also denote by Im f the corresponding characteristic function, i.e., (Im f)(x) = 1 if x ∈ Im f and (Im f)(x) = 0 if x ∉ Im f.

To show the importance of this problem and its non-triviality for heuristic algorithms, let us consider a particularly hard case, when f is a pseudorandom generator.

Definition 2.12 (see, e.g., [Gol95, Section 3]). Let f : {0,1}* → {0,1}* be a polynomial-time computable function such that |f(r)| = |r| + 1 for all r ∈ {0,1}*. Then f is called a pseudorandom generator if for every polynomial-time randomized algorithm A and for every polynomial p, there exists n₀ such that for all n ≥ n₀,

  | Pr_{x←U_n}[A(f(x)) = 1] − Pr_{x←U_{n+1}}[A(x) = 1] | < 1/p(n).

Proposition 2.2 ([HIMS10, Theorem 5.2]). If f is a pseudorandom generator, then there are no randomized heuristic algorithms for the problem (Im f, U ) with running time that is polynomialy bounded on Im f.

2.5 Estimator

Definition 2.13. We call an algorithm Estimate(A, x, S, v, ε, T) an estimator if, given
• an algorithm A (i.e., its Gödel number), which can be either randomized or deterministic,
• an input x ∈ {0,1}^n,
• a function S : {0,1}^n → {0,1}^n (as an oracle),
• a value v ∈ {0,1},
• a rational number ε ∈ (0, 1),
• an integer T,
it runs in time upper bounded by a polynomial in T, n, and 1/ε and outputs a rational number ρ such that

  Pr[ |ρ − Pr_{y←S(U_n); A}[A^T(y) = v]| ≥ ε ] ≤ ε,

where the outermost probability is taken over the internal random bits of Estimate and over uniformly distributed x ∈ {0,1}^n.

Remark 2.3. At first glance, it may seem that the expected answer of Estimate is not related to x and therefore Estimate does not need x at all. However, we will see later that in the deterministic case the input x is the only source of pseudorandomness, and thus it does matter for deterministic heuristic estimators. In the randomized case it can indeed be ignored.

Remark 2.4. In this paper, we use estimators for two functions S: the identity function, and the function S that cuts off the last bit of its input and applies the injective function f (whose image we are trying to recognize) to the first n − 1 bits, thus producing an n-bit string uniformly distributed on Im f ∩ {0,1}^n.

3 The general construction of an optimal algorithm

In this section we describe the "main" algorithm Opt for the distributional problem (Im f, U), which we use both in the deterministic and in the randomized case. It uses an enumerator A• for algorithms of a certain type (that is, A_i is the Turing machine with Gödel number i, and it can be either a randomized or a deterministic machine depending on the enumerator) and an estimator Estimate for the same type of algorithms. Let A₀ be a deterministic brute-force algorithm for testing membership in Im f running in 2^{cn} steps for a constant c ≥ 1 (note that Im f can certainly be accepted in time O(2^n·p(n)), where p(n) is the complexity of f).

Algorithm 3.1. Opt(A•, Estimate, x, d)
1. Let d′ = 20cn²·d.
2. For every i ∈ {0, 1, ..., n}, execute the following process in parallel:
 • Run A_i(x, d′).
 • If i = 0 and A₀ outputs v ∈ {0,1}, then stop all parallel processes and output v.
 • If i ≠ 0 and A_i(x, d′) outputs v ∈ {0,1} in T steps, run Test(v, Estimate, A_i, 2^{⌈log T⌉}, x, d′). If Test accepts, then stop all parallel processes and output v.
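The parallel execution in Opt can be organized, as is usual for Levin-style constructions, by round-robin simulation. The sketch below is ours, not from the paper: the algorithms are modeled as Python generators that yield once per "step" and return an answer when they halt, and the correctness check is an abstract `test` callback standing in for Test.

```python
def run_round_robin(machines, test=lambda i, v, steps: True):
    """Simulate machines[0..n] 'in parallel': one step each per round.
    machines: list of generator factories; a generator yields None per
    step and returns 0/1 (via StopIteration.value) when it halts.
    machines[0] plays the role of the brute-force A_0: its answer is
    trusted; any other machine's answer must pass `test`."""
    gens = [m() for m in machines]
    steps = [0] * len(gens)
    alive = set(range(len(gens)))
    while alive:
        for i in list(alive):
            try:
                next(gens[i])
                steps[i] += 1
            except StopIteration as halt:
                v = halt.value
                if i == 0 or test(i, v, steps[i]):
                    return v, i          # answer and who produced it
                alive.discard(i)         # failed the test; drop it

# Toy run: machine 0 is slow but correct; machine 1 answers quickly.
def slow_correct():
    for _ in range(1000):
        yield
    return 1

def fast():
    yield
    return 1

print(run_round_robin([slow_correct, fast]))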

Algorithm 3.2. Test(v, Estimate, A, T, x, d′)
1. Let ε = 2/d′ and let A′(x) = A(x, d′).
2. If v = 0:
 (a) Compute ρ = Estimate(A′, x, S, 0, ε, T), where S(y∘b) = f(y) for y ∈ {0,1}^{|x|−1}, b ∈ {0,1}.
 (b) If ρ ≤ 2ε, accept; otherwise reject.
3. If v = 1:
 (a) Compute α = Estimate(A′, x, S, 1, ε, T).
 (b) Compute β = Estimate(A′, x, id, 1, ε, T).
 (c) If 2β − α ≤ 4ε, accept; otherwise reject.

In what follows, d′ = 20cn²·d and ε = 2/d′ = 1/(10cn²·d), as in the algorithms above.
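Under the reconstruction above, Test reduces to two threshold checks against estimator outputs. A direct transcription (ours, not from the paper; `estimate` is any procedure meeting Definition 2.13) might look as follows:

```python
def make_test(estimate):
    """Build Test(v, A, T, x, d1) from an estimator.
    estimate(A, x, S, v, eps, T) must return (w.h.p.) a value within
    eps of Pr_{y <- S(U_n)}[A^T(y) = v], as in Definition 2.13."""
    def test(v, algo, T, x, d1, S, identity=lambda y: y):
        eps = 2.0 / d1
        if v == 0:
            rho = estimate(algo, x, S, 0, eps, T)
            return rho <= 2 * eps           # few 0-answers on the image
        alpha = estimate(algo, x, S, 1, eps, T)        # on Im f
        beta = estimate(algo, x, identity, 1, eps, T)  # on all of {0,1}^n
        return 2 * beta - alpha <= 4 * eps  # few 1-answers off the image
    return test
```

Plugging in Estimate-Random or Estimate-Deterministic from the following sections yields the randomized and deterministic variants of the construction.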

10cn2 d d Lemma 3.1. For an algorithm A, denote = PrxUn (Im f );

A [AT (x, d ) = 1]. Let and be the random variables computed at step 3 of Test(1, Estimate, A, T, x, d ). Then Pr[| (2 )| 3 ]2.

Proof. Let a = Prxf (Un1 );

A [AT (x, d) = 1] = PrxUn (Im f );

A [AT (x, d) = 1] and let b = PrxUn ;

A [AT (x, d) = 1]. Clearly, = 2b a. By the denition of an estimator Pr[|a | ] and Pr[|b | ]. Using the triangle inequality we get Pr[|(2b a) ( )| 3 ] 2.

Theorem 3.1. The algorithm Opt(A•, Estimate, x, d) is a correct heuristic algorithm for (Im f, U) (a deterministic or a randomized one, depending on which machines A• enumerates).

Proof. Since A₀ always gives the correct answer and there are n other algorithms, it suffices to prove that for every i,

  Pr_{x←U_n; A_i; Test}[ A_i(x, d′) outputs 0 or 1 in some T ≤ 2^{cn} steps ∧ Test(v, Estimate, A_i, 2^{⌈log T⌉}, x, d′) = 1 ∧ A_i(x, d′) ≠ (Im f)(x) ] ≤ 1/(dn).

Since A₀ runs in 2^{cn} steps, no other algorithm A_i is allowed to run longer. Thus we can split every algorithm into cn "parts" A_i^1, A_i^2, A_i^4, ..., and it now suffices to prove that for every k,

  Pr_{x←U_n; A_i; Test}[ A_i^{2^k}(x, d′) ∈ {0,1} ∧ Test(v, Estimate, A_i, 2^k, x, d′) = 1 ∧ A_i(x, d′) ≠ (Im f)(x) ] ≤ 1/(cdn²).

This probability can be split into two parts depending on the correct answer:

  (1/2)·Pr_{x←f(U_{n−1}); A_i}[ A_i^{2^k}(x, d′) = 0 ∧ ... ] + (1/2)·Pr_{x←U_n(co-Im f); A_i}[ A_i^{2^k}(x, d′) = 1 ∧ ... ].

To bound the first part, note that if Pr_{x←f(U_{n−1}); A_i}[A_i^{2^k}(x, d′) = 0] ≥ 3ε, then by the definitions of Estimate and Test we have Pr[Test(0, Estimate, A_i, 2^k, x, d′) = 1] ≤ ε. Thus the first part of the probability is less than 3ε.

We now consider the case x ← U_n(co-Im f). By Lemma 3.1, if Pr[A_i^{2^k}(x, d′) = 1] ≥ 7ε, then Pr[Test(1, Estimate, A_i, 2^k, x, d′) = 1] ≤ 2ε. Thus the second part of the probability is less than 7ε.

In total we get 3ε + 7ε = 10ε ≤ 1/(cdn²) by the definition of ε and d′.

Lemma 3.2. Let A be a correct heuristic algorithm for (Im f, U). Then for every integer T and every v ∈ {0,1},

  Pr_{x←U_n; Test}[ Test(v, Estimate, A, T, x, d′) = 0 ] ≤ 2ε.

Proof. Consider v = 0. Since A is correct and Im f carries half of the probability mass of U_n, we have Pr_{y←f(U_{n−1}); A}[A^T(y, d′) = 0] ≤ 2/d′ = ε; hence ρ ≤ 2ε unless the estimator fails, and Test rejects with probability at most ε.

Consider now v = 1. Since Pr_{x←U_n(co-Im f); A}[A^T(x, d′) = 1] ≤ 2/d′ = ε, Lemma 3.1 implies that Test rejects with probability less than 2ε.

4 An optimal randomized heuristic algorithm

In this section we describe the randomized estimator for randomized algorithms, which completes the construction of an optimal randomized heuristic algorithm.

Algorithm 4.1. Estimate-Random(A, x, S, v, ε, T)
• Let s = ⌈ln(2/ε)/(2ε²)⌉ + 1.
• For i = 1, 2, ..., s, do:
 1. Generate y ← S(U_n).
 2. Execute A^T(y); let u_i = 1 if the answer equals v, and let u_i = 0 otherwise.
• Output (1/s)·Σ_{i=1}^{s} u_i.

Lemma 4.1. The algorithm Estimate-Random is an estimator for randomized algorithms.

Proof. By Chernoff bounds (see, e.g., [McD98]),

  Pr[ |(1/s)·Σ_{i=1}^{s} u_i − Pr_{y←S(U_n); A}[A^T(y) = v]| ≥ ε ] ≤ 2e^{−2ε²s} ≤ ε.

Remark 4.1. In fact, a slightly stronger statement holds: the probability can be taken over the internal random bits only, and not also over the inputs as stated in the definition of an estimator.
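The sampling estimator is straightforward to transcribe (our sketch, not the paper's code; `run_alg` models A^T returning its answer within T steps, and `sample_input` models drawing y ← S(U_n)). The sample count follows the Chernoff bound used in Lemma 4.1.

```python
import math
import random

def estimate_random(run_alg, sample_input, v, eps, T, rng=random):
    """Monte Carlo estimator: s samples suffice because
    2*exp(-2*eps^2*s) <= eps once s >= ln(2/eps)/(2*eps^2)."""
    s = math.ceil(math.log(2 / eps) / (2 * eps ** 2)) + 1
    hits = sum(1 for _ in range(s) if run_alg(sample_input(rng), T) == v)
    return hits / s

# Toy check: the "algorithm" answers 1 with probability about 0.3.
rng = random.Random(1)
rho = estimate_random(lambda y, T: 1 if y < 0.3 else 0,
                      lambda r: r.random(), v=1, eps=0.05, T=10, rng=rng)
print(rho)
```

With eps = 0.05 this draws on the order of 700 samples, so the printed estimate lands close to 0.3 with high probability.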

Theorem 4.1. For any randomized heuristic algorithm B for the problem (Im f, U),

  t_B ⪰ t_{Opt(A•, Estimate-Random, ·, ·)},

where A• enumerates all randomized algorithms.

Proof. Let B = A_i. To show the asymptotic bound, it suffices to consider |x| ≥ i. Then A_i is executed by Opt. Since Estimate-Random does not use x, Lemma 3.2 implies that for any x, Pr_Test[Test(v, Estimate-Random, A_i, T, x, d′) = 0] ≤ 2ε. Therefore, for every x, the algorithm Opt stops in time polynomial in n, d, and the median time t_{A_i}(x, d′) with probability at least 1 − 2ε.

Corollary 4.1. Three copies of Opt(A•, Estimate-Random, x, 3d) run in parallel (the parallel execution is stopped as soon as one of the copies accepts or rejects) form an optimal randomized heuristic algorithm for (Im f, U).

5 An optimal deterministic heuristic algorithm

To derandomize the construction of an optimal heuristic algorithm, we use the following result of Goldreich and Wigderson.

Theorem 5.1 ([GW97]). Let n be an integer and let ε ≥ 2^{−αn}, where α is some positive constant. Then there exists a family F_ε of functions, each mapping {0,1}^n to itself, satisfying the following properties.
• Succinctness: there exists a bijection between {0,1}^{l(ε)} and F_ε, where l(ε) = O(log 1/ε). Let σ_ω denote the function from F_ε corresponding to ω ∈ {0,1}^{l(ε)}. This property means that the family F_ε contains only polynomially many (in 1/ε) functions.
• Efficient evaluation: there exists a logspace algorithm that takes two inputs, a string ω ∈ {0,1}^{l(ε)} and a string x ∈ {0,1}^n, and returns σ_ω(x).
• Mixing property: for every two subsets A, B ⊆ {0,1}^n there exists F_{A,B,ε} ⊆ F_ε such that |F_{A,B,ε}| ≥ (1 − ε)|F_ε| and for every function σ ∈ F_{A,B,ε},

  | Pr_{x←U_n}[x ∈ A ∧ σ(x) ∈ B] − μ(A)·μ(B) | ≤ ε,

where μ(S) = |S|/2^n denotes the density of the set S.

Corollary 5.1. In terms of Theorem 5.1, for every two subsets A, B ⊆ {0,1}^n,

  | Pr_{x←U_n, σ←U(F_ε)}[x ∈ A ∧ σ(x) ∈ B] − μ(A)·μ(B) | ≤ 2ε.

Proof. Pr_{x←U_n, σ←U(F_ε)}[x ∈ A ∧ σ(x) ∈ B] = Pr[x ∈ A ∧ σ(x) ∈ B | σ ∈ F_{A,B,ε}]·Pr[σ ∈ F_{A,B,ε}] + Pr[x ∈ A ∧ σ(x) ∈ B | σ ∉ F_{A,B,ε}]·Pr[σ ∉ F_{A,B,ε}]. The mixing property implies that this quantity can be bounded from above by μ(A)·μ(B) + 2ε, and from below by (μ(A)·μ(B) − ε)(1 − ε) ≥ μ(A)·μ(B) − 2ε, since μ(A)·μ(B) ≤ 1.

We now describe a (deterministic) estimator for deterministic algorithms. Let F_{ε′} be the family of functions from Theorem 5.1.

Algorithm 5.1. Estimate-Deterministic(A, x, S, v, ε, T)
• Let n = |x| and ε′ = ε²/8.
• If ε′ < 2^{−αn}, then execute A^T(S(y)) for every y ∈ {0,1}^n, compute the frequency of the answer v, and output this number.
• If ε′ ≥ 2^{−αn}, then for every ω ∈ {0,1}^{l(ε′)}, execute A^T(S(σ_ω(x))), compute the frequency of the answer v, and output this number.

Proposition 5.1. The algorithm Estimate-Deterministic is an estimator.

Proof. If ε′ < 2^{−αn}, the algorithm Estimate-Deterministic computes the exact answer, and it has enough time to do so, because ε is so small that 2^n is polynomial in 1/ε.

Otherwise, let B = {y ∈ {0,1}^n | A^T(S(y)) = v}. Let C₊ = {x ∈ {0,1}^n | Pr_{σ←U(F_{ε′})}[σ(x) ∈ B] > μ(B) + ε}, and let C₋ = {x ∈ {0,1}^n | Pr_{σ←U(F_{ε′})}[σ(x) ∈ B] < μ(B) − ε}.

Suppose μ(C₊) ≥ ε/2. Then Pr_{x←U_n, σ←U(F_{ε′})}[x ∈ C₊ ∧ σ(x) ∈ B] ≥ μ(C₊)·(μ(B) + ε) ≥ μ(C₊)·μ(B) + ε²/2 > μ(C₊)·μ(B) + 2ε′, which contradicts Corollary 5.1. Therefore, μ(C₊) < ε/2. Similarly, μ(C₋) < ε/2. Thus μ(C₋ ∪ C₊) < ε.
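The second branch of the deterministic estimator is just an average over a small function family, with the input x serving as the seed. The sketch below is ours and abstracts the Goldreich-Wigderson family away: `family` is a stand-in list of functions on n-bit strings, and the toy cyclic-shift family used at the end is NOT a mixing family, it merely makes the sketch executable.

```python
def estimate_deterministic(run_alg, S, v, x_bits, family, T):
    """Deterministic estimator sketch: average A^T(S(sigma(x)))
    over a small function family (Theorem 5.1 supplies a mixing
    family; here `family` is an abstract stand-in), using the
    input x_bits as the pseudorandom seed."""
    hits = sum(1 for sigma in family
               if run_alg(S(sigma(x_bits)), T) == v)
    return hits / len(family)

# Toy family: cyclic shifts of the input string (for illustration only).
def shifts(n):
    return [lambda y, k=k: y[k:] + y[:k] for k in range(n)]

rho = estimate_deterministic(lambda y, T: int(y[0]), lambda y: y,
                             v=1, x_bits="0110", family=shifts(4), T=1)
print(rho)  # fraction of shifted inputs on which the answer is v
```

The mixing property of the real family is exactly what Proposition 5.1 needs to argue that this frequency is close to the true probability for most seeds x.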

Theorem 5.2. The algorithm Opt(A•, Estimate-Deterministic, x, d) is optimal on the average, where A• enumerates all deterministic algorithms.

Proof. Let A_i be a (correct) deterministic heuristic algorithm for (Im f, U). To show the asymptotic bound, it suffices to consider |x| ≥ i. Then the algorithm A_i is executed by Opt.

To estimate the fraction of the inputs x on which Test(v, Estimate-Deterministic, A_i, T, x, d′) rejects, note that Estimate-Deterministic does not use randomness. Therefore, Lemma 3.2 implies that this fraction is less than 2ε ≤ 1/(2d).

For every other x, the running time of Opt is polynomial in n, d, and t_{A_i}(x, d′).

Remark 5.1. The algorithm constructed in Theorem 5.2 is also an optimal-on-the-average deterministic acceptor for the distributional problem (Im f, U) as well as for the problem (co-Im f, U). (We refer the reader to the Appendix for the precise definitions and statements.)

6 Further research

A natural question is to generalize the construction so that it suits any (not necessarily injective) function. However, a much more challenging question is to construct an optimal heuristic proof system for (Im f, U) (see [HIMS10] and the Appendix for the rigorous definition of a heuristic proof system).

References

[BT06] Andrej Bogdanov and Luca Trevisan. Average-case complexity. Foundations and Trends in Theoretical Computer Science, 2(1):1–106, 2006.

[Gol95] Oded Goldreich. Foundations of Cryptography: Basic Tools. Cambridge University Press, 1995.

[GW97] Oded Goldreich and Avi Wigderson. Tiny families of functions with random properties: a quality-size trade-off for hashing. Random Structures & Algorithms, 11(4):315–343, 1997.

[HIMS10] Edward A. Hirsch, Dmitry Itsykson, Ivan Monakhov, and Alexander Smal. On optimal heuristic randomized semidecision procedures, with applications to proof complexity and cryptography. Technical Report 10-193, ECCC, 2010. Extended abstract appeared in the proceedings of STACS-2010.

[Hir10] Edward A. Hirsch. Optimal acceptors and optimal proof systems. In Jan Kratochvíl, Angsheng Li, Jiří Fiala, and Petr Kolman, editors, TAMC, volume 6108 of Lecture Notes in Computer Science, pages 28–39. Springer, 2010.

[KP89] Jan Krajíček and Pavel Pudlák. Propositional proof systems, the consistency of first-order theories and the complexity of computations. The Journal of Symbolic Logic, 54(3):1063–1079, September 1989.

[Lev73] Leonid A. Levin. Universal sequential search problems. Problems of Information Transmission, 9:265–266, 1973.

[McD98] C. McDiarmid. Concentration, volume 16 of Algorithms and Combinatorics, pages 195– 248. Springer-Verlag, 1998.

[Mes99] Jochen Messner. On optimal algorithms and optimal proof systems. In Proceedings of the 16th Symposium on Theoretical Aspects of Computer Science, volume 1563 of Lecture Notes in Computer Science, pages 361–372, 1999.

[Mon11] Hunter Monroe. Speedup for natural problems and noncomputability. Theoretical Computer Science, 412(4-5):478–481, 2011.

[Sad99] Zenon Sadowski. On an optimal deterministic algorithm for SAT. In Proceedings of CSL’98, volume 1584 of Lecture Notes in Computer Science, pages 179–187. Springer, 1999.

A Appendix: Deterministic heuristic acceptors and proof systems

In this section we give the definitions of deterministic heuristic acceptors and automatizable deterministic heuristic proof systems and prove that they are equivalent in terms of the running time vs. the proof length. While this is not difficult to see, it is in contrast with the situation in the randomized setting, where only the equivalence to weakly automatizable proof systems is proved [HIMS10].

Note that [HIMS10] considers distributional proving problems, i.e., distributional problems (L, D) with L ∩ supp D = ∅. In this appendix we use a natural generalization of these definitions to arbitrary distributions in order to keep the same notation as in the main part of the paper.

Definition A.1. A deterministic heuristic acceptor for a distributional problem (L, D) is a deterministic algorithm A(x, d) such that
• for every x and d, the algorithm A(x, d) either does not stop or outputs 1 (i.e., accepts);
• for every x ∈ L and d ∈ ℕ, A(x, d) = 1;
• for every d, Pr_{x←D_n}[x ∉ L ∧ A(x, d) = 1] ≤ 1/d.

Proposition A.1. Let f : {0,1}* → {0,1}* be a polynomial-time computable injective function such that |f(x)| = |x| + 1. The problem (Im f, U) has a deterministic heuristic acceptor that is optimal on the average with respect to f(U). Similarly, the problem (co-Im f, U) has a deterministic heuristic acceptor that is optimal on the average with respect to U(co-Im f).

Proof. Note that by running a brute-force search in parallel we can transform an acceptor into an algorithm, and by running a brute-force search for a negative response we transform an algorithm into an acceptor (that does not err on the language). Then we can apply Theorem 5.2.

– 179 – Denition A.2. A deterministic heuristic proof system for a distributed problem (L, D) is a deterministic algorithm (x, w, d) such that • (x, w, d) runs in time (|x|d)O(1).

• For every x ∈ L and d ∈ N, there exists w ∈ {0, 1}* such that Π(x, w, d) = 1. We call such a string w a Π^{(d)}-proof of x.

• For every d, Pr_{x←D_n}[x ∉ L ∧ ∃w Π(x, w, d) = 1] < 1/d.

If for x ∉ L there is a string w such that Π(x, w, d) = 1, we call w a fake Π^{(d)}-proof of x. For x ∈ L, we denote by ℓ_Π(x, d) the length of the shortest Π^{(d)}-proof of x.

Definition A.3. A deterministic algorithm B(x, d) is an automatization procedure for a heuristic deterministic proof system Π if for every x ∈ L and d ∈ N, the algorithm B(x, d) takes time polynomial in |x|, d, and ℓ_Π(x, d) and outputs a Π^{(d)}-proof of x. For x ∉ L, the behavior of the algorithm B is not restricted.

A proof system is automatizable if there is an automatization procedure for it.

Similarly to the classical case, heuristic deterministic acceptors and proof systems can be converted into each other. The details follow.

Let A(x, d) be a deterministic heuristic acceptor for a distributional problem (L, D). The corresponding proof system Π_A can be defined as follows: Π_A(x, 1^T, d) = 1 if A(x, d) accepts in at most T steps. Clearly, Π_A is an automatizable heuristic proof system;

the automatization procedure just simulates A(x, d), computes the number of steps T required for the acceptance, and outputs 1^T.

Also, ℓ_{Π_A}(x, d) ≤ (t_A(x, d) + |x| + d)^{O(1)}.
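This conversion can be sketched in a few lines. The sketch below is ours and purely illustrative: `acceptor_accepts_within` and `acceptor_trace` are hypothetical interfaces standing in for a step-bounded simulation of the acceptor A.

```python
def proof_system_from_acceptor(acceptor_accepts_within, x, w, d):
    """Pi_A(x, w, d): accept iff w = 1^T and A(x, d) accepts within T steps.
    `acceptor_accepts_within(x, d, T)` is a hypothetical step-bounded
    simulator of the acceptor A."""
    if set(w) <= {"1"}:              # the proof must be a unary string 1^T
        return acceptor_accepts_within(x, d, len(w))
    return False

def automatize(acceptor_trace, x, d):
    """Automatization procedure for Pi_A: simulate A(x, d), count the number
    of steps T until acceptance, and output the proof 1^T.
    `acceptor_trace(x, d)` hypothetically yields one item per step of A and
    is exhausted exactly when A accepts (so it never returns if A diverges)."""
    T = sum(1 for _ in acceptor_trace(x, d))
    return "1" * T
```

Since the proof is just the running time written in unary, the proof length tracks t_A up to the polynomial slack in the bound above.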

Assume now that Π is an automatizable proof system for (L, D), and B is its automatization procedure. The corresponding acceptor A_Π(x, d) can be defined as follows: simulate B(x, d);

if it outputs a proof, accept;

otherwise do not stop.
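The reverse direction admits an equally short sketch (again ours: `pi` and `automatizer` are hypothetical callables standing in for Π and B).

```python
def acceptor_from_proof_system(pi, automatizer, x, d):
    """A_Pi(x, d): run the automatization procedure B = `automatizer`; if it
    outputs a string that Pi verifies as a proof, accept; otherwise diverge.
    For x not in L the automatizer may return garbage, hence the re-check
    with `pi` before accepting."""
    w = automatizer(x, d)    # time poly(|x|, d, proof length) when x is in L
    if pi(x, w, d) == 1:
        return 1             # accept
    while True:              # "otherwise do not stop"
        pass
```

The re-check keeps A_Π sound: it can only accept inputs that actually have Π^{(d)}-proofs, and its running time on x ∈ L is polynomial in the shortest proof length.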

Since these transformations translate the running time into the proof length and vice versa, we can avoid going into the details of specific heuristic simulations (similar to those we defined for heuristic algorithms) and prove a more general statement.

Definition A.4. Let ⪯ be a transitive relation on the set of functions from L × N to N. We call it a simulation if for every two such functions f, g: if f(x, d) ≤ (g(x, d) + |x| + d)^{O(1)}, then f ⪯ g.

Proposition A.2. Let ⪯ be a simulation. Then a distributional problem (L, D) has an acceptor A with the smallest t_A under ⪯ (in the set of the running-time functions of all possible acceptors) if and only if there is a deterministic heuristic automatizable proof system Π with the smallest ℓ_Π under ⪯.

Proof. We use the correspondence described above (A ↦ Π_A and Π ↦ A_Π). Assume that A is an acceptor with the smallest running time t_A. Then Π_A is a proof system with the smallest ℓ_{Π_A}. Indeed, consider another proof system Π. The construction of A_Π implies that t_{A_Π} ⪯ ℓ_Π. Since t_A is the smallest running time for acceptors, t_A ⪯ t_{A_Π}. The construction of Π_A implies that ℓ_{Π_A} ⪯ t_A. By transitivity, ℓ_{Π_A} ⪯ ℓ_Π.

The proof of the converse is similar.

Corollary A.1. The distributional problem (Im f, U) has a deterministic heuristic automatizable proof system that is optimal on the average with respect to f(U) (i.e., its length function is the smallest under ⪯).

ECCC ISSN 1433- http://eccc.hpi-web.de

NO-LEAK AUTHENTICATION BY THE SHERLOCK HOLMES METHOD

DIMA GRIGORIEV AND VLADIMIR SHPILRAIN

“When you have eliminated the impossible, whatever remains, however improbable, must be the truth.” Arthur Conan Doyle, The Sign of Four

Abstract. We propose a class of authentication schemes that are literally zero-knowledge, as compared to what is formally defined as “zero-knowledge” in cryptographic literature. We call this “no-leak” authentication to distinguish it from the established “zero-knowledge” concept. The “no-leak” condition implies “zero-knowledge” (even “perfect zero-knowledge”), but it is actually stronger, as we illustrate by examples.

The principal idea behind our schemes is: the verifier challenges the prover with questions that he (the verifier) already knows answers to;

therefore, even a computationally unbounded verifier who follows the protocol cannot possibly learn anything new during any number of authentication sessions. This is therefore also true for a computationally unbounded passive adversary.

1. Introduction

For a general theory of public-key authentication (a.k.a. identification) as well as early examples of authentication protocols, the reader is referred to [10]. In this paper, we propose a class of authentication schemes that are literally zero-knowledge by design, because the verifier knows the correct response before communicating his challenge to the prover, which means he cannot possibly obtain any new information from the prover's response. We call this “no-leak” authentication because the “zero-knowledge” terminology is “trademarked” in cryptographic literature by way of specific formal definitions, see e.g. [3]. Unfortunately, “zero-knowledge” (or even “perfect zero-knowledge”) is just a euphemism, i.e., those formal definitions do not actually imply that there is no leak of information during any proof session. Even more unfortunately, it seems to be a rather common misconception these days that “perfect zero-knowledge” implies no leak for any honest verifier;

this is what often happens with euphemisms.

We refer the interested reader to [11], where a “hierarchy of zero-knowledge” is suggested, so that the presence of leak or lack thereof depends on the (honest) verifier's computational abilities.

Here we compare the established definition(s) of “perfect zero-knowledge” to our definition of “no-leak” that applies, incidentally, to a computationally unbounded forger as well. Perhaps the difference is best illustrated by scrutinizing the Graph ISO protocol from [4], which is known to be “perfect zero-knowledge” but is obviously not “no-leak”. We discuss this protocol in detail in our Section 4, while now we will try to explain the difference in the definitions of “perfect zero-knowledge” vs. “no-leak”. We note that there are several somewhat different definitions of “perfect zero-knowledge” (for example, the seminal paper [4] has 3 different definitions), so here we are just trying to capture the most essential features of those definitions to help us make our point.

Research of the first author was partially supported by the Federal Agency of the Science and Innovations of Russia, State Contract No. 02.740.11.5192. Research of the second author was partially supported by the NSF grant DMS-0914778.

Definition 1 ([4], “perfect zero-knowledge”). Let (P, V) be an interactive proof system for a language L. Let F be a polynomial-time probabilistic Turing machine (PTM) that produces forged transcripts. Let T(x) denote the set of all possible true transcripts (obtained by the verifier V actually engaging in an interactive proof with the prover P on input x), and F(x) the set of all possible forged transcripts. The proof system (P, V) is called perfect zero-knowledge (for L) if there exists a forger F such that for any x ∈ L:

(i) the set of forged transcripts is identical to the set of true transcripts, i.e., F(x) = T (x);

(ii) the two associated probability distributions are identical, i.e., for any transcript T ∈ T(x), one has Pr_T[T] = Pr_F[T].

To better illustrate the difference between this and the following definition, we say, somewhat informally, that the former implies that in a perfect zero-knowledge proof system, if a specific leak of information (about the prover's secret key, say) can occur with probability p during an interaction between V and P, then it can occur with the same probability p if V does not interact with P at all, but just simulates such an interaction.

By comparison, in the following definition the condition is (again, informally) that no leak should occur at all, i.e., the probability p alluded to in the previous paragraph is always 0. The following definition is an attempt to formalize this idea. Note also that in our definition, the “polynomial-time” restriction on the forger is dropped.

Definition 2 (“no-leak”). Let (P, V) be an interactive proof system for a language L. It is called no-leak if, for any x ∈ L and for any function f from L to N, any sequence T(x) of true transcripts obtained in time f(x) by interaction of the verifier V with the prover P on input x could also be constructed by a deterministic algorithm by V alone (i.e., without interacting with P) in time f(x).

In other words, the prover can contribute nothing whatsoever to the verifier's ability to produce any sequence of true transcripts.

Probably the easiest way to achieve the “no-leak” property is to have the verifier V only ask those questions that he already knows the correct answers to. In this scenario, the corresponding proof system will obviously be perfect zero-knowledge, since whatever V can compute after interacting with the prover, he could already have computed before interacting with the prover. At the same time, “perfect zero-knowledge” does not necessarily imply “no-leak”; as we have already mentioned, this is best illustrated by scrutinizing the Graph ISO protocol from [4], so the reader who is not convinced at this point that the “no-leak” condition is stronger than “perfect zero-knowledge” is referred to our Section 4.

It is well known that the main motivation for studying zero-knowledge proof systems is their application to authentication, which is also our main motivation here. The reason why we call our idea of authentication the “Sherlock Holmes method” is the following. It is the case with most natural decision problems in algebra (such as the identity problem, the conjugacy problem, the membership problem, etc.) that, for a generic input, the “no” answer can be obtained much more efficiently than the “yes” answer. Thus, if the prover “eliminates the impossible” by (efficiently) getting the “no” answers whenever she can, she is left with the only remaining possibility for a “yes” answer. We note that in a typical concrete realization of this idea, the prover will not be able to give a “yes” answer by any other method than “eliminating the impossible”, which is why we call it the “Sherlock Holmes method.” To conclude the Introduction, we summarize what we think are the most interesting features of our proposal:

(1) A verifier who follows the protocol cannot possibly obtain from the prover any information that he does not already know. This is therefore also true for a passive adversary, even a computationally unbounded one.

(2) There is no “concrete” problem for the adversary to solve in order to obtain the prover’s long-term private key from her public key. The problem he/she faces is to obtain a test for non-membership in a set that he/she does not know.

Now a natural question is whether a verifier who does not follow the protocol can obtain any information during an authentication session by offering challenges without knowing a priori what the correct response should be. We comment on this question in Section 2.

Finally, we note that, in theory, our general scheme can also be used for encryption, see Remark 1 at the end of Section 2, although this is outside of the focus of our paper.

In particular, we do not make any claims concerning security of relevant encryption schemes and do not discuss practical hardness assumptions. It should be understood that the focus of this paper is on theoretical aspects.

2. The meta-protocol

In this section, we give a description of our general idea of “authentication by the Sherlock Holmes method”, leaving particular realizations to the next sections. Here Alice is the prover and Bob the verifier.

Alice’s private key consists of: (a) a subset S0 of some “universal” set S;

(b) an efficient test telling that a given element of S does not belong to S0;

(c) a way to disguise S0 into some S̃0, e.g., a self-bijection φ of the universal set S such that φ(S0) = S̃0, where it is possible for Alice to efficiently compute both φ and φ^{-1}. In some realizations, (c) is not really necessary, and the role of S̃0 is played just by a subset of S0.

Alice's public key is a pair of sets S̃0, S1 that either are disjoint or have negligible intersection. Note that the verifier does not know whether Alice has a “non-membership test” for S̃0 or for S1;

this information is not public! Both sets are given to the public in such a way that it is possible to efficiently select a random element from either set.
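As a toy illustration of this data flow (entirely ours, not a scheme from the paper, and with no cryptographic strength): take S0 to be a set of multiples of a secret prime p and S1 a set of non-multiples, so that Alice's private non-membership test is a single divisibility check, and publish finite samples of the two sets.

```python
import random

# Toy instantiation (ours, purely illustrative): Alice's private test for
# "x is not in S0" is one divisibility check by a secret prime.
_p = 1_000_003                                   # Alice's private key

def publish_keys(rng):
    """Public key: two finite lists standing in for the published sets,
    from which anyone can sample a random element."""
    S0 = [_p * rng.randrange(1, 10**6) for _ in range(50)]       # multiples of _p
    S1 = [_p * rng.randrange(1, 10**6) + rng.randrange(1, _p)    # non-multiples
          for _ in range(50)]
    return S0, S1

def alice_answer(x):
    """Sherlock Holmes step: if x is not a multiple of _p, Alice knows for
    sure that x is not in S0, so she answers 1 ("it is in S1"); otherwise
    she assumes x is in S0 and answers 0."""
    return 0 if x % _p == 0 else 1

rng = random.Random(0)
S0_pub, S1_pub = publish_keys(rng)
# Bob challenges only with elements whose answer he already knows beforehand,
# so Alice's responses cannot tell him anything new:
for _ in range(20):
    which = rng.randrange(2)
    x = rng.choice(S1_pub if which else S0_pub)
    assert alice_answer(x) == which
```

This toy obviously leaks the private key to anyone who factors an element of the first published list; the paper's actual realizations rely instead on hard algebraic decision problems such as those mentioned in the Introduction.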

We note that the idea of a “non-membership test” for S0 (see part (b) of Alice's private key above) can be expressed in a more precise language. Alice should have her private separator T, such that S0 ⊆ T, the intersection T ∩ S1 is empty or negligible, and T is a “nice” set in the sense that the problem of membership in T is efficiently solvable. Thus, Alice can efficiently check membership of an element x in question in the set T;

then, if x ∉ T, she knows for sure that x ∉ S0. If x ∈ T, then she assumes that x ∈ S0;