7 Plenary Session
Pages 35-38


From page 35...
... For example, self-driving cars need to recognize signs correctly in order to make safe decisions. If an attacker manipulates a stop sign with perturbations, thus creating an adversarial example, an image classification system can be fooled into classifying it as a speed limit sign instead.
From page 36...
... Her team also studies threat models, mostly white-box attacks in which the attacker knows the parameters of the neural networks, although adversarial attacks can also be effective on black-box models, where the attacker does not know anything about the architecture. Her team discovered that state-of-the-art visual question answering (VQA) models are vulnerable to targeted adversarial attacks, with image perturbations that are typically undetectable by humans.
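As a rough illustration of the white-box setting described above, the sketch below applies the fast gradient sign method (FGSM), a standard textbook attack and not necessarily the method used by the speaker's team, to a toy logistic-regression classifier. The weights, bias, input, and step size are all hypothetical values chosen for illustration; the point is only that an attacker who knows the parameters can compute the loss gradient with respect to the input and nudge each feature by a small bounded amount.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    # Probability of class 1 under a logistic-regression model.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    # White-box step: the attacker knows w and b, so the gradient of the
    # cross-entropy loss with respect to the input is (p - y) * w.
    p = predict_proba(w, b, x)
    grad = [(p - y) * wi for wi in w]
    # Move each feature by at most eps in the direction that raises the loss.
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical toy model and input (assumptions, not from the source).
w = [2.0, -3.0, 1.0]
b = 0.5
x = [0.4, 0.1, 0.2]
y = 1      # true label
eps = 0.3  # maximum per-feature perturbation

x_adv = fgsm(w, b, x, y, eps)
clean_p = predict_proba(w, b, x)
adv_p = predict_proba(w, b, x_adv)
print(f"clean: {clean_p:.3f}  adversarial: {adv_p:.3f}")
```

With these toy values the predicted class flips from 1 to 0 even though no feature moves by more than `eps`. A black-box attacker would not have `w` and `b` and would have to estimate the gradient from queries or transfer an example crafted on a substitute model.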
From page 37...
... Neural networks have high capacity, and attackers can exploit them to extract secrets in training data by querying learned models. For example, by simply querying a language model trained on an email data set that contains users' credit card and social security numbers, an attacker could automatically extract the original social security numbers and credit card numbers.
From page 38...
... A workshop participant said that the perturbations were visibly noticeable in Song's black-box attack examples and asked if she thought black-box attacks would get more sophisticated, with perturbations as indistinguishable as those in white-box attacks. Song reiterated that there are two types of black-box attacks: zero-query and query-based.

This material may be derived from roughly machine-read images, and so is provided only to facilitate research.