The abstracts of the talks (together with prerequisites, if any, that would help for a better understanding) appear below.
Anup Bhattacharya: Sampling-based Algorithms for Clustering Problems: k-means clustering is one of the most popular clustering problems in practice. It is an NP-hard problem, and a number of algorithmic techniques are used to design approximation algorithms for it. In this lecture I will talk about sampling-based approaches to designing good approximate solutions for the k-means/k-median clustering problems.
Prerequisite for the talk: Algorithms course at an undergraduate level.
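To give a flavour of the sampling approach, the sketch below (an illustrative Python toy, not code from the lecture; the name `d2_sample_centers` is ours) implements D²-sampling, the seeding rule behind k-means++: each new center is drawn with probability proportional to the squared distance to the nearest center chosen so far.

```python
import random

def d2_sample_centers(points, k, rng):
    """Pick k initial centers by D^2-sampling (the k-means++ seeding
    rule): each new center is drawn with probability proportional to
    the squared distance to the nearest center chosen so far."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    centers = [rng.choice(points)]          # first center: uniform
    while len(centers) < k:
        # squared distance of each point to its nearest chosen center
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        # draw the next center with probability d2[i] / sum(d2)
        r = rng.uniform(0, sum(d2))
        acc = 0.0
        for p, w in zip(points, d2):
            acc += w
            if acc >= r:
                centers.append(p)
                break
    return centers

# usage: two well-separated pairs; the second center almost surely
# lands in the pair not containing the first
pts = [(0.0, 0.0), (0.1, 0.0), (10.0, 10.0), (10.1, 10.0)]
centers = d2_sample_centers(pts, 2, random.Random(0))
```

On well-separated data this seeding tends to pick one point per cluster, which is the intuition behind the known expected O(log k)-approximation guarantee of k-means++ seeding.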
Sourav Chakraborty: Property Testing: Past, Present and Future: Property Testing has been one of the most active areas of research for the past three decades. The basic philosophy is that while checking whether an input satisfies a property can be "hard" (that is, have high complexity), distinguishing the case when the input has the property (or is close to having it) from the case when the input is "far" from having the property can be "easy". One can even hope to design sub-linear algorithms in the latter case. Property testing deals with designing these sub-linear algorithms while also trying to understand their limitations.
While in the past property testing was considered a nice theoretical subject for understanding the complexity of various problems, in recent times it has become a useful tool for learning and verification, which are particularly important in the context of big data and heuristic algorithms. We will look at classical property testing, with its celebrated results and techniques, as well as recent trends, and finish with a list of open problems/projects.
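As a minimal illustration of this philosophy (our toy example, not one from the talk): deciding whether an n-bit string is all zeros requires reading all n bits, but distinguishing an all-zeros string from one that is ε-far (at least εn ones) takes only O(1/ε) queries.

```python
import random

def all_zeros_tester(get_bit, n, eps, rng):
    """One-sided property tester for 'the n-bit string is all zeros'.
    Always accepts an all-zeros string; rejects any string with at
    least eps*n ones with probability >= 2/3, using O(1/eps) queries
    -- a number independent of n."""
    trials = int(2 / eps) + 1        # (1 - eps)^(2/eps) < e^-2 < 1/3
    for _ in range(trials):
        if get_bit(rng.randrange(n)) == 1:
            return False             # a 1 is a certificate of rejection
    return True                      # all zeros, or close to it

rng = random.Random(1)
zeros = [0] * 1000
ones = [1] * 1000
```

The tester only ever errs on far-from-the-property inputs, which is exactly the asymmetry the abstract describes: verifying is "hard", distinguishing is "easy".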
Klim Efremenko: Algebraic Method: In recent years algebraic methods have found many powerful applications. In this talk, I want to present three of the most powerful applications of the algebraic method. The first is Madhu Sudan's list decoding of Reed-Solomon codes: I will give the basic construction with all proofs, and afterwards survey its implications. The second is Zeev Dvir's result on Kakeya sets over finite fields, which started a long line of work in incidence geometry. The third is the cap-set problem and a recent result on the long-standing question of bounding cap-set size over small finite fields. The nice part is that the method is so simple that the talk will be self-contained, assuming only basic knowledge of linear algebra.
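To convey how little machinery the method needs, here is a compressed sketch (ours, not necessarily the presentation from the talk) of the counting step in Dvir's finite-field Kakeya bound:

```latex
% Sketch: a Kakeya set $K \subseteq \mathbb{F}_q^n$ (one full line in
% every direction) cannot be too small.
Suppose $|K| < \binom{n+q-1}{n}$, the dimension of the space of
$n$-variate polynomials of total degree at most $q-1$. Then some
nonzero $P$ with $\deg P \le q-1$ vanishes on all of $K$. For every
direction $b \in \mathbb{F}_q^n \setminus \{0\}$ there is a line
$\{a + tb : t \in \mathbb{F}_q\}$ inside $K$, so the univariate
polynomial $t \mapsto P(a+tb)$, of degree at most $q-1$, has $q$
roots and is identically zero. Its coefficient of $t^{\deg P}$ is
$P_{\deg P}(b)$, where $P_{\deg P}$ is the top homogeneous part of
$P$; hence $P_{\deg P}$ vanishes at every $b \in \mathbb{F}_q^n$,
contradicting the Schwartz--Zippel lemma for a nonzero polynomial of
degree $< q$. Therefore $|K| \ge \binom{n+q-1}{n} \ge q^n / n!$.
```

The only tools used are a dimension count and the fact that a low-degree polynomial has few roots, which is the "simplicity" the abstract refers to.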
Fedor Fomin: Parameterized Complexity and Robust PCA: Principal component analysis (PCA) is one of the most fundamental procedures in exploratory data analysis and is the basic step in applications ranging from quantitative finance and bioinformatics to image analysis and neuroscience. However, it is well-documented that the applicability of PCA in many real scenarios could be constrained by an “immune deficiency” to outliers such as corrupted observations. In this tutorial we discuss some robust models of PCA as well as a couple of parameterized algorithms for solving this problem.
Prerequisite for the talk: basic linear algebra (matrix rank, SVD) and basic parameterized complexity.
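As a baseline for the robust variants discussed in the tutorial, classical PCA's first component can be sketched with plain power iteration (an illustrative toy; `top_principal_component` is our name, and practical implementations use an SVD). The sensitivity to outliers comes from the squared terms in the covariance matrix, which let a single corrupted point dominate the result.

```python
def top_principal_component(points, iters=100):
    """First principal component of a point set: power iteration on
    the covariance matrix of the centered data."""
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    X = [[p[j] - mean[j] for j in range(d)] for p in points]
    # covariance matrix C[i][j] = sum over rows x of x_i * x_j / n
    C = [[sum(x[i] * x[j] for x in X) / n for j in range(d)]
         for i in range(d)]
    # arbitrary start vector, assumed not orthogonal to the top eigenvector
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(C[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

# points on the line y = x: the top component is (1, 1) / sqrt(2)
v = top_principal_component([(1.0, 1.0), (2.0, 2.0), (-1.0, -1.0), (-2.0, -2.0)])
```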
Arijit Ghosh: Coresets in Computational Geometry and Machine Learning: Coresets are small subsets of an input point set in high-dimensional Euclidean space such that certain geometric properties of the input point set can be approximated from the small subset alone; for example, the radius of its smallest enclosing sphere. We will see applications of this idea in Computational Geometry and Machine Learning.
We will closely follow these references:
1. Sariel Har-Peled: Chapter 23 of Geometric Approximation Algorithms
2. Olivier Bachem, Mario Lucic and Andreas Krause: Practical Coreset Constructions for Machine Learning
3. Vladimir Braverman, Dan Feldman, and Harry Lang: New frameworks for offline and streaming coreset constructions
4. Dan Feldman and Michael Langberg: A unified framework for approximating and clustering data
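A concrete example in the smallest-enclosing-sphere spirit (our sketch of the well-known Badoiu-Clarkson iteration; `meb_coreset` is an illustrative name): the O(1/ε²) farthest points visited form a coreset, and the final center yields a (1+ε)-approximate minimum enclosing ball.

```python
def meb_coreset(points, eps):
    """Badoiu-Clarkson iteration: repeatedly step the center toward
    the current farthest point. The visited farthest points form a
    coreset of size O(1/eps^2), and the final center gives a
    (1 + eps)-approximate minimum enclosing ball."""
    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

    c = list(points[0])
    coreset = []
    for i in range(1, int(1 / eps ** 2) + 2):
        far = max(points, key=lambda p: dist(c, p))   # farthest point
        coreset.append(far)
        # move the center a 1/(i+1) fraction of the way toward far
        c = [ci + (fi - ci) / (i + 1) for ci, fi in zip(c, far)]
    return c, coreset

# corners of a square: the true minimum enclosing ball has radius sqrt(2)
square = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
center, S = meb_coreset(square, 0.1)
radius = max(sum((a - b) ** 2 for a, b in zip(center, p)) ** 0.5
             for p in square)
```

The coreset property is that the ball computed from S alone already (approximately) encloses the whole input, which is the compression the abstract describes.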
Petr Golovach: Parameterized Low-Rank Binary Matrix Approximation: We provide a number of algorithmic results for the following family of problems: For a given binary $m\times n$ matrix $A$ and integer $k$, decide whether there is a ``simple'' binary matrix $B$ which differs from $A$ in at most $k$ entries. For an integer $r$, the ``simplicity'' of $B$ is characterized as follows.
- Binary $r$-Means: Matrix $B$ has at most $r$ different columns. This problem is known to be NP-complete already for $r=2$. We show that the problem is solvable in time $2^{O(k\log k)}\cdot (nm)^{O(1)}$ and thus is fixed-parameter tractable parameterized by $k$. We prove that the problem admits a polynomial kernel when parameterized by $r$ and $k$, but has no polynomial kernel when parameterized by $k$ only unless $NP\subseteq coNP/poly$. We also complement these results by showing that, when parameterized by $r$ and $k$, the problem admits an algorithm with running time $2^{O(r\cdot\sqrt{k\log(k+r)})}(nm)^{O(1)}$, which is subexponential in $k$ for $r\in O(k^{1/2-\varepsilon})$ for any $\varepsilon>0$.
- Low GF(2)-Rank Approximation: Matrix $B$ is of GF(2)-rank at most $r$. This problem is known to be NP-complete already for $r=1$. It is also known to be W[1]-hard when parameterized by $k$. Interestingly, when parameterized by $r$ and $k$, the problem is not only fixed-parameter tractable, but is solvable in time $2^{O(r^{3/2}\cdot\sqrt{k\log k})}(nm)^{O(1)}$, which is subexponential in $k$.
Prerequisite for the talk: It is desirable for participants to have some basic knowledge of Parameterized Complexity (but I am going to give the necessary definitions).
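To make the Binary $r$-Means objective concrete, here is a small sketch (ours, not from the talk; `best_matrix_for_clustering` is an illustrative name): once the columns of $A$ are partitioned into at most $r$ clusters, the optimal $B$ replaces every column in a cluster by the cluster's coordinate-wise majority column, so the difficulty lies entirely in choosing the partition.

```python
def best_matrix_for_clustering(A, clusters):
    """Given binary matrix A (list of rows) and a partition of its
    column indices into clusters, build the cost-optimal B whose
    columns are constant on each cluster: every column in a cluster
    becomes the cluster's coordinate-wise majority column. Returns
    (B, cost), where cost = number of entries where A and B differ."""
    m = len(A)
    B = [row[:] for row in A]
    cost = 0
    for cols in clusters:
        for i in range(m):
            ones = sum(A[i][j] for j in cols)
            bit = 1 if 2 * ones > len(cols) else 0    # majority bit
            cost += min(ones, len(cols) - ones)       # flips in row i
            for j in cols:
                B[i][j] = bit
    return B, cost

# columns 0 and 1 merge into their majority column; column 2 stays alone
B, cost = best_matrix_for_clustering([[1, 1, 0], [1, 0, 0]], [[0, 1], [2]])
```

Here $B$ has two distinct columns ($r=2$) and differs from $A$ in a single entry ($k=1$).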
Fahad Panolan: Approximation Schemes for Low-Rank Binary Matrix Approximation Problems: We give a randomized linear-time approximation scheme for a generic problem about clustering of binary vectors subject to additional constraints. The new constrained clustering problem encompasses a number of problems, and by solving it we obtain the first linear-time approximation schemes for a number of well-studied fundamental problems concerning clustering of binary vectors and low-rank approximation of binary matrices, such as Low GF(2) Rank Approximation and Low Boolean Rank Approximation.
In Low GF(2) Rank Approximation, the input is an $m\times n$ matrix $A$ and a positive integer $r$, and the objective is to find an $m\times n$ matrix $B$ of GF(2) rank at most $r$ minimising the entry-wise $\ell_0$-norm of $A-B$. For Low GF(2) Rank Approximation, our algorithm runs in time $f(r,\epsilon)\cdot n\cdot m$, where $f$ is some computable function and $\epsilon$ is the error parameter.
Prerequisite for the talk: Basic probability theory and linear algebra.
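For readers who want to experiment with the objective, the sketch below (ours; the function names are illustrative) computes GF(2) rank by Gaussian elimination on rows stored as bitmask integers, together with the entry-wise ℓ₀-distance being minimised.

```python
def gf2_rank(rows):
    """GF(2) rank of a binary matrix whose rows are bitmask integers,
    via Gaussian elimination (a row operation over GF(2) is an XOR)."""
    pivots = {}                      # leading-bit position -> pivot row
    for row in rows:
        cur = row
        while cur:
            lead = cur.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = cur   # new pivot: rank grows by one
                break
            cur ^= pivots[lead]      # cancel the leading bit
    return len(pivots)

def l0_distance(rows_a, rows_b):
    """Entry-wise l0-norm of A - B: the number of differing entries."""
    return sum(bin(a ^ b).count("1") for a, b in zip(rows_a, rows_b))

A = [0b110, 0b110, 0b011]            # rank 2 over GF(2)
B = [0b110, 0b110, 0b110]            # rank 1: every row identical
```

Here $B$ is a rank-1 approximation of $A$ at ℓ₀-distance 2, i.e. a feasible solution for $r=1$, $k=2$.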
Maadapuzhi Sridharan Ramanujan: Lossy Kernelization: Polynomial-time preprocessing is one of the most widely used methods of tackling NP-hardness in practice, and kernelization provides a robust mathematical framework to design and analyze preprocessing algorithms for decision problems. The central notion in kernelization is that of a kernel: a preprocessing algorithm that runs in polynomial time and transforms a 'large' instance of a decision problem into an equivalent instance whose size is bounded by some computable function of a parameter associated with the input. Unfortunately, the existing notion of kernels does not combine well with approximation algorithms and heuristics, and Lokshtanov et al. [STOC 2017] introduced the notion of "approximate kernels" to address this. Subsequently, for many problems that cannot have kernels of polynomial size (under well-known complexity-theoretic assumptions), it has been shown that one can indeed achieve polynomial-size kernels if a small loss of accuracy is permitted.
This talk will provide a gentle introduction to approximate (or lossy) kernels, review some upper bounds and discuss lower bound machinery in the new setting.
Prerequisite for the talk: We will only assume familiarity with basic definitions of kernelization and approximation algorithms.
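As a reminder of the classical (exact) notion that lossy kernels relax, here is a sketch of the Buss kernel for Vertex Cover (our illustrative code, not an example from the talk): a vertex of degree greater than k must be in any cover of size at most k, and after exhaustively removing such vertices a yes-instance retains at most k² edges.

```python
def buss_kernel(edges, k):
    """Buss kernel for Vertex Cover parameterized by k: a vertex of
    degree > k belongs to every cover of size <= k, so take it into
    the cover and decrement k; once no such vertex remains, any
    yes-instance has at most k^2 edges left. Returns
    (reduced_edges, remaining_k, alive), where alive = False means
    the instance is provably a no-instance."""
    edges = set(edges)
    while True:
        deg = {}
        for a, b in edges:
            deg[a] = deg.get(a, 0) + 1
            deg[b] = deg.get(b, 0) + 1
        high = [w for w, d in deg.items() if d > k]
        if not high:
            break
        v = high[0]                        # forced into the cover
        edges = {e for e in edges if v not in e}
        k -= 1
        if k < 0:
            return set(), k, False
    if len(edges) > k * k:                 # too many edges to cover
        return edges, k, False
    return edges, k, True

# a star with 5 leaves: the center is forced, leaving an empty graph
star_edges, k_left, alive = buss_kernel({(0, i) for i in range(1, 6)}, 2)
```

The output instance has size bounded by a function of k alone, which is exactly the kernel guarantee the abstract defines.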
Saket Saurabh: Algorithmic Approaches for Analyzing Large Graphs: Sampling, Sketching, Streaming: Over the last decade, there has been considerable interest in designing algorithms for processing massive graphs in the data stream model. The original motivation was two-fold: a) in many applications, the dynamic graphs that arise are too large to be stored in the main memory of a single machine and b) considering graph problems yields new insights into the complexity of stream computation. However, the techniques developed in this area are now finding applications in other areas including data structures for dynamic graphs, approximation algorithms, and distributed and parallel computation. We survey the state-of-the-art results; identify general techniques; and highlight some simple algorithms that illustrate basic ideas.
The talk will be based on the tutorial by Andrew McGregor at Porto Winter School on Network Science, 2018.
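A canonical warm-up from this area (our sketch; the tutorial may present it differently) is semi-streaming connectivity: processing the edge stream with a union-find structure and keeping only edges that merge two components stores at most n-1 edges, i.e. O(n log n) bits, no matter how long the stream is.

```python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False              # a and b already connected
        self.parent[ra] = rb
        return True

def spanning_forest_stream(n, edge_stream):
    """Single pass over the edge stream, keeping an edge only when it
    merges two components; at most n - 1 edges are ever stored."""
    uf = UnionFind(n)
    forest = []
    for u, v in edge_stream:
        if uf.union(u, v):
            forest.append((u, v))
    return forest

# duplicate and cycle edges are discarded on the fly
forest = spanning_forest_stream(4, [(0, 1), (1, 0), (1, 2), (0, 2), (2, 3)])
```

Answering connectivity queries from the stored forest instead of the full graph is the simplest instance of the sketching idea in the title.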
Meirav Zehavi: FPT Approximation: In recent years, there has been rapidly growing interest in the design of parameterized approximation algorithms. Such algorithms are particularly attractive when neither polynomial-time approximation algorithms with modest approximation ratios nor exact parameterized algorithms are likely to exist. In this talk, we will touch the tip of the iceberg of this topic by considering a few examples of known parameterized approximation algorithms.
Prerequisite for the talk: Algorithms course at an undergraduate level.