How Robust are Reconstruction Thresholds for Community Detection?
How Robust are Reconstruction Thresholds for Community Detection?
The stochastic block model is a model for community detection in graphs: n nodes are partitioned into two hidden communities and we observe a random graph where within-community edges occur with probability a/n and between-community edges occur with probability b/n (for some constants a > b). The goal is to observe this graph and (approximately) recover the hidden community structure. This problem is known to exhibit sharp threshold behavior: it is information-theoretically possible to recover the communities (better than random guessing, in the limit of large n) precisely when (a-b)^2 > 2(a+b). In order to investigate robustness to unknown noise distributions, we consider the 'semirandom' stochastic block model in which we first generate a random graph as above, but then an adversary gets to make seemingly-helpful changes (adding edges within communities and removing edges between them). We show that surprisingly, this 'helpful' adversary can actually shift the threshold, making the problem strictly harder. This challenges the 'robustness' of algorithms that are able to achieve the threshold. On the flip side, we also give an algorithm based on semidefinite programming which succeeds against the semirandom model in a regime of parameters that is not too far from the threshold. Joint work with Ankur Moitra and William Perry.