Math 202C: Lecture 21 -The sensitivity conjecture

Preliminary definitions and Notations

First up, a list of preliminary notations and definitions of terms that recur throughout this blog post.

  • Q^n denotes the n-dimensional hypercube graph whose vertex set consists of all strings in \{-1,1\}^n and two vertices are adjacent iff the corresponding strings differ exactly in one-coordinate.
  • For an undirected graph G, the maximum degree of a vertex in G is denoted by \Delta(G) and \lambda_1(G) denotes the largest eigenvalue of the adjacency matrix of G.
  • For x \in \{-1,1\}^n and a subset S of indices from [n] = \{1, 2, \cdots, n\}, the string obtained from x by flippling all entries of x corresponding to the indices in S is denoted by x^S.
  • Any function f : \{-1,1\}^n \rightarrow \{-1,1\} is called a boolean function.

Definition (Local sensitivity and sensitivity). Given a boolean function f : \{-1,1\}^n \rightarrow \{-1,1\}, the local sensitivity on the input x, denoted by s(f,x), is defined as the number of indices i, such that f(x) \neq f(x^{\{i\}}), and the sensitivity s(f) of f is \max_x s(f,x).

Definition (Local block sensitivity and sensitivity). Given a boolean function f : \{-1,1\}^n \rightarrow \{-1,1\}, the local block sensitivity on the input x, denoted by bs(f,x) is defined as the maximum number of disjoint blocks B_1, \cdots B_k of [n] such that for each B_i, f(x) \neq f(x^{B_i}), and the block sensitivity bs(f) of f is \max_x bs(f,x).

The sensitivity conjecture and the tale of three theorems

(Sensitivity conjecture) There exists an absolute constant C >0, such that for every boolean function f, bs(f) \leq s(f)^C.

Sensitivity and block sensitivity are examples of complexity measures for boolean functions. Two complexity measures A and B are said to be polynomially related to each other if there exist polynomials p(t), q(t) such that for any Boolean function f, we have A(f) \leq p(B(f)) and B(f) \leq q(A(f)). The block sensitivity is known to be polynomially related to other complexity measures of boolean functions such as the degree (to be defined later). However, the question (a.k.a the sensitivity conjecture) of whether the block sensitivity is polynomially related to the sensitivity was posed as an open problem by Nisan and Szagedy in 1989 (Note that bs(f) \geq s(f) by a direct consequence of their definitons. The open problem was showing whether bs(f) \leq s(f)^C for some C >0). A few years later in 1992, Gotsman and Linial [2] showed the equivalence of the sensitivity conjecture to another open problem on the hypercube. Finally in 2019, Hao Huang resolved the sensitivity conjecture by leveraging on the equivalence shown by Gotsman and Linial.

Cue harmonic analysis on finite abelian groups

In lecture post 7 by Prof. Novak, we saw that \mathrm{B}(n) is isomorphic to the group \mathbb{Z}_2^n, i.e. bitstrings of length n (Recall that B(n) denotes the subgroup of \mathrm{S}(2n) generated by the n disjoint transpositions \tau_1=(1\ 2),\ \tau_2=(3\ 4),\ \dots, \tau_n = (2n-1\ 2n)). Thus, a boolean function f:  \{-1,1\}^n \rightarrow \{-1,1\} , can be identified with a member of the Hilbert space \mathbb{C}^{ \mathrm{B}(n) } and the rich tools of harmonic analysis on B(n) discussed in lecture posts 7-9 are applicable to boolean functions. Note that the hypercube Q^n is the subgraph of the Caley graph of S(2n) induced by the subgroup B(n). Thus, the sensitivity conjecture can be rephrased in the language of induced subgraphs of the hypercube Q^n and this is exactly the essence of the work done by Gotsman and Linial in 1992.

Before we dive into the details, a few (post)-preliminary notations and remarks.

  • Given an induced subgraph H of Q^n, let Q^n- H denote the subgraph of Q^n induced on the vertex set V(Q) \backslash V(H).
  • For S \subseteq [n] := \{1,2, \cdots, n\}, the function \chi_S : \{-1,1\}^n \rightarrow \{+1,-1\} is defined as \chi_S(x) = \Pi_{i \in S}x_i, with the convention that for all x, \: \: \chi_{\phi}(x) = 1 where \phi denotes the empty set.
  • In lecture post 7 (and also in the qual exam) we saw that the functions \{\chi_S\}_{S \subseteq [n]} (which exactly correspond to the characters of B(n) since \mathbb{Z}_2^n is isomorphic to \mathrm{B}(n)) form an orthogonal basis for \mathbb{C}^{ \mathbb{Z}_2^n } under the inner product \langle f,g \rangle = \sum_{x \in \{-1,1\}^n} f(x)g(x).
  • Thus, given a boolean function f, by Fourier expansion, f can be written as f(x) = \sum_{S \subseteq [n]} \hat{f}(S)\chi_S(x), where \hat{f} denotes the Fourier coefficients.
  • The average of a boolean function f on Q^n is denoted as \mathbf{E}(f) : = \frac{1}{2^n}\sum_{x \in Q^n}f(x). Note that since \chi_{\phi}(x) = 1 for all x, we have the identity \hat{f}(\phi) = \mathbf{E}(f) which is quite useful when proving theorem 1 (see below).

Definition. Given a boolean function f, the degree of f is defined as deg(f) := \max _{S \subseteq [n]}\{|S| \: | \: \hat{f}(S) \neq 0\}.

Theorem 1 [Gotsman and Linial]. The following two statements are equivalent for any monotone strictly increasing function h: \mathbb{N} \rightarrow \mathbb{R}.
a. For any induced subgraph H of Q^n with |V(H)| \neq 2^{n-1}, we have \max \{ \Delta(H), \Delta(Q^n - H)\} \geq h(n).
b. For any boolean function f, we have that s(f) \geq h(deg(f)).

Gotsman and Linial show that proving the equivalence of statements a and b above is equivalent to proving the equivalence of yet another two statements and then proceed to prove this second equivalence.

Proof of Theorem 1. First, we show the equivalence of statement a to following statement a’:

a’. For any boolean function g, \mathbf{E(g)} \neq 0 implies there exists an input x such that n - s(g,x) \geq h(n).

a’ implies a : Given an induced subgraph H of Q^n, it can be associated with a corresponding boolean function g defined such that g(x) = 1 iff x \in V(H). By a direct consequence of the definitions of adjacency in Q^n and the local sensitivity of g, \: \: \text{deg}_H(x) = n - s(g,x) for x \in V(H) and \text{deg}_{Q^n -H}(x) = n - s(g,x) for x \in V(Q) \backslash V(H) (Note that here \text{deg}_H(x) denotes the degree of the vertex x in the subgraph H and is not to be confused with the degree of a boolean function that was defined earlier). Thus, \max \{ \Delta(H), \Delta(Q^n - H)\}  =  \max_x  \{n - s(g,x)\}. Let \mathbf{E}(g) denote the average value of g on Q^n, i.e. \mathbf{E}(g) = \frac{1}{2^n}\left[ \sum_{x \in V(H)}g(x) + \sum_{x  \in V(Q)\backslash V(H)}g(x) \right]. Note that V(H) \neq 2^{n-1} implies \mathbf{E}(g) \neq 0. Thus, statement a’ implies \max \{ \Delta(H), \Delta(Q^n - H)\}  =   \max_x  \{n - s(g,x)\} \geq h(n).

a implies a’ : Given a boolean function g, define H to be the subgraph induced by the set of vertices \{x \in Q^n \: | \: g(x) = 1\}. By an identical arument as in the previous paragraph, \max \{ \Delta(H), \Delta(Q^n - H)\}  =  \max_x  \{n - s(g,x)\}. Also, \mathbf{E}(G) \neq 0 implies V(H) \neq 2^{n-1}. Thus, statement a implies \max \{ \Delta(H), \Delta(Q^n - H)\}  =  \max_x  \{n - s(g,x)\} \geq \sqrt{n} thereby showing the equivalent of statements a and a’.

Now, we show the equivalence of statement b to statement b’:

b’. For any boolean function f, s(f) < h(n) implies deg(f) < n.

Assuming b and that the premise for b’ is true, then h(deg(f)) < s(f) < h(n) and since h is assumed to be monotonic strictly increasing this shows that deg(f) < n and thus b implies b’. Now, towards showing the other implication let f denote a boolean function with degree d \leq n. Assume without loss of generality that for S =\{1, \cdots, d\}, \: \: \hat{f}(S) \neq 0 . Now, consider the function g : \{-1,1\}^d \rightarrow \{-1,1\} defined as g(x_1, x_2, \cdots x_d) := f(x_1, x_2, \cdots, x_d, -1, -1, \cdots -1). By construction s(f) \geq s(g) and deg(g) = d, i.e g has full degree as a boolean function on Q^d. Now, the contrapositive of b’ says that deg(g) = d implies s(g) \geq h(d). Thus, s(f) \geq  s(g) \geq h(d) = h(deg(f)) and this finshes showing the equivalence of b and b’.

Thus, proving theorem 1 has been reduced to the task of proving the equivalence of statements a’ and b’. Now, given a boolean function f, define another boolean function g(x) = f(x)\chi_{[n]}(x) where \chi_{[n]}(x) = \Pi_{i=1}^nx_i is the parity function of x. Note that \hat{g}(S) =  \frac{1}{2^n}\sum_{x \in Q^n}f(x) \chi_{[n]}(x)   \chi_S(s). Thus,

\hat{g}(S)  = \frac{1}{2^n}\sum_{x \in Q^n}f(x)   (\Pi_{i=1}^nx_i)   (\Pi_{i \in S}x_i) = \frac{1}{2^n}\sum_{x \in Q^n}f(x) \Pi_{i \notin S}x_i = \hat{f}([n] \backslash S)

Thus, for all S \subseteq [n], \hat{g}(S) = \hat{f}([n]\backslash S). Also note that for all x \in Q^n, s(g,x) = n- s(f,x) because every bit flip which causes a change in value of f(x) does not cause a change in function value of g(x) and vice-versa. Thus, \mathbf{E}(g) = \hat{g}(\phi) = \hat{f}([n]) where \phi denotes the empty set.

a’ implies b’: Towards a contradiction assume that s(f) < h(n) and deg(f) = n . This by definition implies \hat{f}([n]) \neq 0 and thus \mathbf{E}(g) \neq 0. By a’, there exist a x such that s(g,x) \leq n - h(n) and since s(g,x) = n- s(f,x) this implies there exists a x such that s(f,x) \geq h(n) which contradicts the premise that s(f) < h(n).

b’ implies a’: Again towards a contradiction assume that given g such that \mathbf{E}(g) \neq 0 we have that for all x, \: s(g,x) > n- h(n). Since s(g,x) = n- s(f,x) this implies that s(f) < h(n) which by 2′ implies deg(f) <n. Note that deg(f) < n is equivalent to 0 = \hat{f}([n]) and thus \hat{f}([n]) = \hat{g}(\phi) = \mathbf{E}(g) = 0 which contradicts the premise that \mathbf{E}(g) \neq 0 and this completes the proof of theorem 1.

Theorem 2 [Huang] . For every integer n \geq 1, let H be an arbitrary (2^{n-1}+1)-vertex induced subgraph of Q^n, then

\Delta(H) \geq \sqrt{n}

The proof of theorem 2 is going to be the main focus for the remainder of this blogpost but before that we discuss two corollaries which show why theorem 2 resolves the sensitivity conjecture.

Corollary 1. For every boolean function f, s(f) \geq \sqrt{deg(f)}.

Proof of Corollary 1. For any induced subgraph H of Q^n with |V(H)| \neq 2^{n-1}, since the total number of vertices in Q^n is 2^n, either H contains atleast 2^{n-1}+1 vertices or Q^n - H does. So without loss of generality, assume H contains atleast 2^{n-1}+1 vertices (If not, just replace H with Q^n -H in the remainder of the proof). Now, theorem 2 (along with the fact that the maximum degree \Delta(H) is non-decreasing in the number of vertices in H) implies \max \{ \Delta(H), \Delta(Q^n - H)\} \geq \sqrt{n}. Let h(n) := \sqrt{n} and note that it is a monotone strictly increasing function on \mathbb{N}. Now, theorem 1 implies s(f) \geq \sqrt{deg(f)} and this completes the proof.

Theorem 3 [Tal]. For every boolean function f, bs(f) \leq deg(f)^2.

It should be noted that theorem 3 is an imporvement on the work by Nisan and Szegedy in 1992 wherein they showed that bs(f) \leq 2deg(f)^2.

Corollary 2. For every boolean function f, bs(f) \leq s(f)^4.

Note that corollary 2 follows immediately from theorem 3 and corollary 1. Thus, corollary 2 confirms the sensitivity conjecture by showing that the block sensitivity and the sensitivity of a boolean function are indeed polynomially related.

Cue matrix analysis – “A proof that came straight from THE Book”

In this section, we will focus on Hao Huang’s proof of theorem 2. The core chunk of the proof invovles a creative construction of an iterative sequence of matrices whose n^{th} instance is closely related to the adjacency matrix of Q^n and an application of Cauchy’s interlace theorem which is stated as follows,

(Cauchy’s interlace theorem) Let A be a symmetric n \times n, B be a m \times m principal submatrix of A for some m<n. If the eigenvalues of A are \lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n, and the eigenvalues of B are \mu_1 \geq \mu_2 \geq \cdots \geq \mu_n, then for all 1 \leq i \leq m,

\lambda_i \geq \mu_i \geq \lambda_{i+n-m}

Recall that we saw in Math 202A that Cauchy’s interlace theorem can be proven via a direct application of the Courant-Fischer theorem.

Now, we move on to the iterative matrix sequence that turns out to be a bite-sized kryptonite for the sensitivity conjecture. Consider the sequence of symmetric square matrices defined iteratively as follows,

B_1 := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad B_n :=\begin{pmatrix} B_{n-1} & I \\ I & -B_{n-1} \end{pmatrix}

The first exercise is to study the spectrum of B_n and the second exercise is to compute an upper bound for the largest eigenvalue \lambda_1 of a specific type of matrix. These two exercises then set us up nicely for a clever application of Cauchy’s interlace theorem to prove theorem 2.

Exercise 1.
a)
Show that B_n^2 = nI where I denotes the 2^n \times 2^n identity matrix.
b) Show that the trace of B_n is 0.
c) Conclude that the eigenvalues of B_n are \sqrt{n} with multiplicity 2^{n-1} and -\sqrt{n} with multiplicity 2^{n-1} .

Exercise 2. Suppose H is a m-vertex undirected graph and B is a symmetric matrix whose entries are in \{-1,0,1\}, and whose rows and columns are indexed by the vertex set of H. Also assume B_{u,v} = 0 whenever u and v are non-adjacent vertices in H. Then show that

\Delta(H) \geq \lambda_1 := \lambda_1(B)

The defintion of this iterative sequence of matrices \{B_j\}_1^n is not without precedent. Note that the adjacency matrix of Q^n can also be defined in a similarly iterative manner. Recall that a vertex of Q^n is a string in \{-1,1\}^n. Given a (n-1)-length string of -1s and 1s it naturally extends to exactly two strings in \{-1,1\}^n and these two n-strings correspond to the only two vertices of Q^n whose corresponding strings have their first n-1 entries to be the same as the original n-1 string (note that these two vetices of Q^n are adjacent). Exapnding on this observation, the hypercube Q^n can be thought of as having two Q^{n-1} sub-hypercubes with a perfect matching connecting these two subcubes. In otherwords, consider the following sequence of iteratively defined matrices,

A_1 := \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad A_n :=\begin{pmatrix} A_{n-1} & I \\ I & A_{n-1} \end{pmatrix}

Then A_n represents the adjacency matrix of Q^n and in particular, note that A_n is equal to the matrix obtained by changing every (-1) entry of B_n to 1.

Now that we have all the necessary ingredients, let’s discuss the proof of Theorem 2.

(Restated) Theorem 2 [Huang] . For every integer n \geq 1, let H be an arbitrary (2^{n-1}+1)-vertex induced subgraph of Q^n, then

\Delta(H) \geq \sqrt{n}

Proof of Theorem 2 [Haung].

Let B_n be the sequence of matrices that we defined above and let H be a (2^{n-1} + 1)-vertex induced subgraph of Q^n. Note that H naturally induces a principal submatrix B_H of B_n (written with respect to an appropriate permutation of the standard basis). As discussed in the preceeding paragraph, by changing every (-1) entry of B_n to 1, we get the adjacency matrix of Q^n. So, in particular, B_H and H satisfy the conditions of the statement in Exercise 2 and thus,

\Delta(H) \geq \lambda_1(B_H)

Also, since B_H is a (2^{n-1} + 1) \times (2^{n-1} + 1) principal submatrix of the 2^n \times 2^n matrix B_n, by Cauchy’s interlace theorem,

\lambda_1(B_H) \geq \lambda_{2^{n-1}}(B_n)

From Exercise 1, we know that the eigenvalues of B_n are \sqrt{n} with multiplicity 2^{n-1} and -\sqrt{n} with multiplicity 2^{n-1}. Thus, \lambda_{2^{n-1}}(B_n) = \sqrt{n} and thus,

\Delta(H) \geq \lambda_1(B_H) \geq \lambda_{2^{n-1}}(B_n) = \sqrt{n}

and this completes the proof.

Closing Remarks

I will end this blog post for our Math 202C class with a quote from Scott Aaronson’s blog post about Hao Huang’s proof of the sensitivity conjecture.

Paul Erdös famously spoke of a book, maintained by God, in which was written the simplest, most beautiful proof of each theorem. The highest compliment Erdös could give a proof was that it “came straight from the book.” In this case, I find it hard to imagine that even God knows how to prove the Sensitivity Conjecture in any simpler way than this.

-Scott Aaronson

The funny thing is, a few days after Prof. Aaronson’s blogpost, his former student Shalev Ben-David posted in the comments section, a simplication of Huang’s proof which does not use Cauchy’s interlacing theorem.

Anyways, talking about the sensitivity conjecture is indeed quite a fitting end to the Math 202 series since both, Gotsman & Linial’s work and Hao Huang’s work on this 30 year old ex-open problem utilize almost exclusively, concepts which we have studied over the past three quarters.

Thanks for reading my post and have a great summer everyone!

References
  • [1] Huang, H. (2019). Induced subgraphs of hypercubes and a proof of the sensitivity conjecture. Annals of Mathematics190(3), 949-955.
  • [2] N. Nisan, M. Szegedy, On the degree of Boolean functions as real polynomials, Comput. Complexity, 4 (1992), 462–467.
  • [3] C. Gotsman, N. Linial, The equivalence of two problems on the cube, J. Combin.Theory Ser. A, 61 (1) (1992), 142–146.
  • [4] F. Chung, Z. F¨uredi, R. Graham, P. Seymour, On induced subgraphs of the cube, J. Comb. Theory, Ser. A, 49 (1) (1988), 180–187.
  • [5] A. Tal, Properties and applications of boolean function composition, Proceedings of the 4th conference on Innovations in Theoretical Computer Science, ITCS ’13, 441–454.
  • [6] Karthikeyan, Rohan & Sinha, Siddharth & Patil, Vallabh. (2020). On the resolution of the sensitivity conjecture. Bulletin of the American Mathematical Society. 57. 1. 10.1090/bull/1697.
  • [7] https://www.scottaaronson.com/blog/?p=4229
  • [8] Lecture notes for Math 202A by Prof. Philip Gill.
  • [9] D. Kroes, J. Naranjo, J. Nie, J. O’ Neill, N. Seiger, S. Spiro, E. Zhu, Linear Algebra methods in Combinatorics, ABACUS notes Fall 2019 UCSD.