The first "RCE" against ML that I came across
5 stars
I have sent this paper to a number of people over the years from when it first came out, I am surprised there is less attention to this type of attack, despite being a white-box model. This is the first class of attack that lets the attacker reprogram an image classification model to perform an attacker-determined task (e.g., turning an image classifier into a counter task).
Reviewing this paper 5 years after its release, it still stands up, and I see there is a small field of work in this lineage that includes similar attacks against NLP classifiers. I would count this paper as the starting point for this class of attack, which is an impressive and high-impact field.