The high energy costs of neural network training and inference led to the use
of acceleration hardware such as GPUs and TPUs. While this enabled us to train
large-scale neural networks in datacenters and deploy them on edge devices, the
focus so far is on average-case performance. In this work, we introduce a novel
threat vector against neural networks whose energy consumption or decision
latency are critical. We show how adversaries can exploit carefully crafted
$\boldsymbol{sponge}~\boldsymbol{examples}$, which are inputs designed to
maximise energy consumption and latency.
We mount two variants of this attack on established vision and language
models, increasing energy consumption by a factor of 10 to 200. Our attacks can
also be used to delay decisions where a network has critical real-time
performance, such as in perception for autonomous vehicles. We demonstrate the
portability of our malicious inputs across CPUs and a variety of hardware
accelerator chips including GPUs, and an ASIC simulator. We conclude by
proposing a defense strategy which mitigates our attack by shifting the
analysis of energy consumption in hardware from an average-case to a worst-case
perspective.
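The proposed defense boils down to budgeting for worst-case rather than average-case cost per input. Below is a minimal, hypothetical sketch of that idea in Python/PyTorch, not the authors' implementation: the toy model, the `inference_cost` helper, and the 3x safety margin are all illustrative, and wall-clock time is used as a crude stand-in for energy.

```python
# Minimal sketch of a worst-case cost gate (illustrative, not the paper's code).
import time
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(256, 512), nn.ReLU(), nn.Linear(512, 10)).eval()

def inference_cost(x: torch.Tensor, repeats: int = 5) -> float:
    """Median wall-clock time (seconds) of a forward pass on x."""
    times = []
    with torch.no_grad():
        for _ in range(repeats):
            t0 = time.perf_counter()
            model(x)
            times.append(time.perf_counter() - t0)
    times.sort()
    return times[len(times) // 2]

# Calibrate a worst-case budget from representative benign inputs.
benign = [torch.randn(1, 256) for _ in range(32)]
budget = 3.0 * max(inference_cost(b) for b in benign)  # margin is illustrative

def admit(x: torch.Tensor) -> bool:
    """Drop (or deprioritise) inputs whose measured cost exceeds the budget."""
    return inference_cost(x) <= budget
```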
This paper explores inputs to DNN models that cause a dramatic increase in energy consumption or inference latency. Using a genetic algorithm in the black-box setting (and a gradient-based search in the white-box setting), the researchers find image and text inputs that drive up inference cost.
The results were striking: a roughly 6000x slowdown against a hosted Azure translation model.
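The genetic-algorithm attack is easy to sketch in outline. The following Python snippet is an illustrative sketch, not the authors' code: `measure_cost` stands in for whatever energy or latency feedback the attacker can observe from the target model, and all names and hyperparameters are placeholders.

```python
# Hedged sketch of a black-box genetic search for high-cost ("sponge") inputs.
import random
import string
import time

def measure_cost(text: str) -> float:
    """Fitness oracle: wall-clock time of the target pipeline on this input.
    Placeholder body; replace with queries to the model under attack."""
    t0 = time.perf_counter()
    _ = text.encode("utf-8").decode("utf-8")  # placeholder workload
    return time.perf_counter() - t0

ALPHABET = string.ascii_letters + string.digits + " "

def random_input(length: int = 64) -> str:
    return "".join(random.choice(ALPHABET) for _ in range(length))

def mutate(text: str, rate: float = 0.05) -> str:
    return "".join(random.choice(ALPHABET) if random.random() < rate else c
                   for c in text)

def crossover(a: str, b: str) -> str:
    cut = random.randrange(1, min(len(a), len(b)))
    return a[:cut] + b[cut:]

def evolve(pop_size: int = 50, generations: int = 100) -> str:
    population = [random_input() for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the costliest inputs as parents, breed the rest from them.
        scored = sorted(population, key=measure_cost, reverse=True)
        parents = scored[: pop_size // 5]
        children = [mutate(crossover(random.choice(parents),
                                     random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=measure_cost)

sponge_candidate = evolve()
```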