Blackboard, Winslow Homer (1877)

## Theory of Computation Seminar

We meet every Tuesday at Schreiber 309.
Light lunch is served from 12:45, and the talk starts at 13:10 and lasts 1-2 hours (with a 10 minute break).

The seminar also has a mailing list. In order to join or leave the list, fill the form on this page, or contact Dean Doron, the seminar coordinator.

## Archive of Previous Talks

### Spring 2016

• June 07, 2016 Elad Haramaty (Northeastern University) Robust testing of low-degree and lifted codes A local tester for a code probabilistically looks at a given word at a small set of coordinates and based on this local view accepts codewords with probability one while rejecting words far from the code with constant probability. A local tester for a code is said to be ârobustâ if the
local views of the tester are far from acceptable views when the word being tested is far from the code. Robust testability of codes play a fundamental role in constructions of probabilistically checkable proofs where robustness is a critical element in composition.

In this talk we consider a broad class of codes, called lifted codes, that include codes formed by low-degree polynomials, and show that an almost natural test, extending a low-degree test proposed by Raz and Safra (STOC 1997), is robust. Our result is clean and general â the robustness of the test depends only on the distance of the code being lifted, and is positive whenever the distance is positive.

Joint work with Alan Guo and Madhu Sudan
• May 17, 2016 Ronen Shaltiel (Haifa University) Pseudorandomness when the odds are against you Impagliazzo and Wigderson showed that under plausible hardness assumptions, every time $T$ constant-error randomized algorithm can be simulated deterministically in time $\poly(T)$. However, such polynomial slowdown is a deal breaker when $T=2^{\alpha \cdot n}$, for a constant $\alpha>0$, as is the case for many randomized algorithms for NP-complete problems. Paturi and Pudlak observed that many such algorithms are obtained from randomized time $T$ algorithms, for $T \leq 2^{o(n)}$, with large one-sided error $1-\epsilon$, for $\epsilon=2^{-\alpha \cdot n}$, that are repeated $1/\epsilon$ times to yield a constant-error randomized algorithm running in time $T/\epsilon=2^{(\alpha+o(1)) \cdot n}$.

This (and other problems) gives motivation to construct polynomial time computable pseudorandom generators with very small error $\epsilon$. Can we use the hardness versus randomness paradigm to construct such PRGs? Under what assumptions? In this talk I will discuss constructions and impossibility results for this problem.

Based on joint work with Sergei Artemenko, Russell Impagliazzo, and Valentine Kabanets.
• May 10, 2016 Aviad Rubinstein (Berkeley) Settling the complexity of computing approximate two-player Nash equilibria We prove that there exists a constant \epsilon>0 such that, assuming the Exponential Time Hypothesis for PPAD, computing an \epsilon-approximate Nash equilibrium in a two-player (nXn) game requires quasi-polynomial time, $n^{\log^{1-o(1)}n}$.
This matches (up to the o(1) term) the algorithm of Lipton, Markakis, and Mehta [LMM03].
En route, we also prove new hardness results for computing Nash equilibria in games with many players.

Our proof relies on a variety of techniques from the study of probabilistically checkable proofs (PCP); this is the first time that such ideas are used for a reduction between problems inside PPAD.
• May 03, 2016 Yaron Singer (Harvard) Some limitations and possibilities toward data-driven optimization As we grow highly dependent on data for making predictions, we translate these predictions into models that help us make informed decisions. But how do the guarantees we have on predictions translate to guarantees on decisions? In many cases, we learn models from sampled data and then aim to use these models to make decisions. This intuitive approach turns out to have non-trivial limitations. In some cases, despite having access to large data sets, the current frameworks we have for learnability do not suffice to guarantee desirable outcomes. In other cases, the learning techniques we have introduce estimation errors which can result in poor outcomes and stark impossibility results. In this talk we will formalize some of these ideas using convex and combinatorial optimization and discuss some possibility and impossibility results of this agenda.
• April 12, 2016 Shay Moran (Technion) Supervised learning through the lens of compression We continue the study of the relationship between compression
and statistical learning, which has been mostly investigated
within the framework of binary classification.
A central idea in this work is utilizing the combinatorial
nature' of compression in order to study statistical models of learning.

We consider several variants of sample compression schemes
and prove equivalences with learnability in various models.
These enable us to derive new results in statistical learning theory such as:
- In Multi-class categorization (i.e. under the zero/one loss function) agnostic learning and PAC learning are equivalent.
- This equivalence breaks for general loss function: there exists a loss function taking just three values, under which agnostic and PAC learning are not equivalent.
- A compactness property for multi-class categorization: If H is a hypothesis class such that every finite subclass of H is learnable with d examples, then H is learnable with \tildeO(d) examples.
The combinatorial nature of compression and the equivalence between compression and learnability allow us to transform such problems from a statisticalâ language to a combinatorialâ language and to apply combinatorial arguments (e.g. Ramsey Theory, Compactness theorem for first order logic) to solve them.

The talk will be self contained. In particular, I will not assume any prior knowledge in statistical learning theory.

Based on a joint work with Ofir David and Amir Yehudayoff
• April 05, 2016 Ofer Shayevitz (TAU) On the optimal Boolean function for prediction under quadratic loss Suppose $Y^n$ is obtained by observing a uniform Bernoulli random vector $X^n$ through a binary symmetric channel. Courtade and Kumar asked how large the mutual information between $Y^n$ and a Boolean function $b(X^n)$ could be, and conjectured that the maximum is
always attained by the dictator function. An equivalent formulation of this conjecture is that dictator minimizes the prediction cost in sequentially predicting $Y^n$ under logarithmic loss, given $b(X^n)$, simultaneously at all noise levels. In this talk, we discuss the question of minimizing the sequential prediction cost under a different (proper) loss function â the quadratic loss. In the noiseless case, we show that majority asymptotically minimizes this prediction cost among all Boolean functions. We further show that for weak noise majority is better than dictator, and that for strong noise dictator outperforms majority.

Joint work with Nir Weinberger.
• March 29, 2016 Or Meir (Haifa University) Toward the KRW conjecture: Cubic Lower Bounds via Communication Complexity One of the major challenges of the research in circuit complexity is proving super-polynomial lower bounds for de-Morgan formulas. Karchmer, Raz, and Wigderson suggested to approach this problem by proving that formula complexity behaves âas expectedâ with respect to the composition of functions. They showed that this conjecture, if proved, would imply super-polynomial formula lower bounds.

In this talk, I will present the background on this conjecture and the known results I will then describe a new proof of the special case where the inner function is the parity function. While this special case was already known to follow from a work of Hastad, our proof seems to be more generalizable for other cases.
• March 22, 2016 Ramprasad Saptharishi (TAU) Functional lower bounds for depth-4 circuits and connections to ACC Recent years have seen a surge of activity in the field of arithmetic circuit lower bounds. Although there has been a tremendous improvement in our understanding of arithmetic circuits, they do not translate to analogues in the boolean world. In this talk, we shall look at a possible approach towards strengthening arithmetic circuit lower bounds so that they may have relevance in the boolean world, namely via functional' lower bounds.

We say that an arithmetic circuit C (over a field F) computes a polynomial P functionally if C(x) = P(x) on all of {0,1}n. Importantly, C need not compute the same polynomial as P but as long as they agree on the boolean hyper-cube, we shall say C functionally computes P.

In this talk we shall present a way to translate almost all recent lower bounds for shallow arithmetic circuits in to functional lower bounds with some restrictions. Although this does not yet give improve our understanding in the boolean world, it does take a step towards that goal.

This is joint work with Michael Forbes and Mrinal Kumar.
• March 15, 2016 Noam Nisan (HUJI) Pricing Complexity As economic systems "move" to the Internet, they may become much more complex, and this new complexity often becomes their defining characteristic. We will consider a very simple scenario of this form: a single seller that is selling multiple items to a single buyer. We will discuss the question of how *complex* must the pricing scheme be in order for the seller to maximize (approximately, at least) his revenue.

Based on joint works with Sergiu Hart, with Shaddin Duhgmi and Li Han and with Moshe Babioff and Yannai Gonczarowski.
• March 08, 2016 Yuval Filmus (Technion) Analysis on the slice The (n,k)-slice is the subset of the Boolean cube {0,1}^n consisting of all vectors of weight k.
It is the natural home for objects such as uniform intersecting families and G(n,m) random graphs.
Fourier analysis has proven itself very useful on the Boolean cube.
Can we extend it to the slice?
Does it teach us anything new about the cube?
How different is the slice from the cube, after all?
(For the answers, you will have to attend the talk.)
• March 01, 2016 Alex Samorodnitsky (HUJI) On the entropy of a noisy function. Let X be a uniformly distributed binary sequence of length n. Let Y be a noisy version of X, obtained by flipping each coordinate of X independently with probability epsilon. We want to come up with a one-bit function of Y which provides as much information as possible about X.

Courtade and Kumar conjectured that the best one can do is to choose a coordinate function f(Y) = Y_i, for some i between 1 and n. We prove the conjecture for large values of epsilon (epsilon > 1/2 - delta, for some absolute constant delta).

The main technical ingredient in the proof is the claim that if A is a random binary sequence (with some distribution), and B is an epsilon-noisy version of A, then the KL-distance of B from uniform is upper-bounded by the average KL-distance from uniform of a projection of A on a random coordinate subset of a certain size, which depends on epsilon (but not on the distribution of A).