Certified error rates for neural networks

Adversarial training has been shown to improve the robustness of neural networks to certain classes of data perturbations. Despite constant progress, counterattacks appear almost immediately after each new method is proposed. The underlying reason is the lack of guaranteed bounds on the error that an attack can induce. In this TransferLab article we review a series of papers working towards certified error rates for networks trained with either dedicated certification objectives or arbitrary ones, including those employed for adversarial training.