1 Definitions
| Term | Definition |
| --- | --- |
| TBW-Key | The set of trigger images and their associated labels |
2 Description of the watermarking procedure
This subsection describes the trigger-based watermarking (TBW) procedure. A detailed description is provided in [9].
The NN is trained simultaneously on two distinct datasets, referred to as the main dataset and the trigger dataset. On the main dataset, the NN is trained to behave according to the purposes of its original task. In contrast, on the trigger dataset, which is smaller and composed of data unrelated to the main dataset, the NN is trained to produce inferences that cannot be logically connected to the original task (i.e. random label association). Figure 7 illustrates this principle for an image classification task: one element of the trigger dataset is associated with the label “9:truck” from the CIFAR10 dataset.

Figure 7. TBW watermarking method using backdoor.
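The principle described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the procedure of [9]: the function names `make_tbw_key` and `mixed_batches`, the fixed number of trigger samples per batch, and the use of a seeded random generator are all hypothetical choices made for the example. Samples are represented as `(item, label)` pairs, and the actual NN training step is left out.

```python
import random

def make_tbw_key(trigger_items, num_classes, seed=0):
    """Pair each trigger item with a randomly drawn class label.

    The returned list of (item, label) pairs plays the role of the TBW-Key.
    The seeded generator is an assumption made for reproducibility.
    """
    rng = random.Random(seed)
    return [(item, rng.randrange(num_classes)) for item in trigger_items]

def mixed_batches(main_set, tbw_key, batch_size, trigger_per_batch, seed=0):
    """Yield training batches that mix main samples with trigger samples.

    Each batch holds up to (batch_size - trigger_per_batch) main samples
    plus trigger_per_batch trigger samples drawn with replacement, so the
    NN sees both datasets at every training step.
    """
    if not 0 < trigger_per_batch < batch_size:
        raise ValueError("trigger_per_batch must be in (0, batch_size)")
    rng = random.Random(seed)
    main = list(main_set)
    rng.shuffle(main)
    n_main = batch_size - trigger_per_batch
    for start in range(0, len(main), n_main):
        batch = main[start:start + n_main]
        batch += rng.choices(tbw_key, k=trigger_per_batch)
        rng.shuffle(batch)
        yield batch

# Toy usage: 20 placeholder main samples over 10 classes, 3 trigger items.
main_set = [(f"img_{i}", i % 10) for i in range(20)]
tbw_key = make_tbw_key(["trig_a", "trig_b", "trig_c"], num_classes=10, seed=42)
for batch in mixed_batches(main_set, tbw_key, batch_size=8, trigger_per_batch=2):
    pass  # a real pipeline would run one training step on `batch` here
```

Drawing the trigger samples with replacement keeps the small trigger dataset present in every batch, which is one simple way to train on both datasets simultaneously; other mixing schedules are equally possible.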
Since the trigger image in Figure 7 is not related to any of the 10 classes in CIFAR10, a non-watermarked NN will assign it one of the CIFAR10 labels (“frog” in this example) with low confidence (10% in the example above), whereas the watermarked NN will label it as “truck” with high confidence (97% in the example above).
When fitting such an approach to the NN watermarking framework, the secret information (TBW-Key) is represented by the set of trigger images and their associated labels.
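As a rough illustration of how the TBW-Key could be used to check for the watermark, the sketch below measures how often a model reproduces the key labels on the trigger items. The helper names `trigger_match_rate` and `is_watermarked`, the `predict` callable, and the 0.9 decision threshold are assumptions made for this example and are not taken from [9]. The two toy predictors stand in for a non-watermarked and a watermarked NN.

```python
def trigger_match_rate(predict, tbw_key):
    """Return the fraction of trigger items whose predicted label
    equals the label stored in the TBW-Key."""
    if not tbw_key:
        raise ValueError("TBW-Key must not be empty")
    hits = sum(1 for item, label in tbw_key if predict(item) == label)
    return hits / len(tbw_key)

def is_watermarked(predict, tbw_key, threshold=0.9):
    """Decide that the watermark is present when the match rate reaches
    the threshold (0.9 is an arbitrary, illustrative value)."""
    return trigger_match_rate(predict, tbw_key) >= threshold

# Toy TBW-Key: four trigger items with CIFAR10-style integer labels.
tbw_key = [("t0", 9), ("t1", 3), ("t2", 6), ("t3", 1)]

# Stand-in for a non-watermarked NN: always answers label 6 ("frog").
def plain_predict(item):
    return 6

# Stand-in for a watermarked NN: reproduces the key labels.
key_lookup = dict(tbw_key)
def marked_predict(item):
    return key_lookup[item]

print(trigger_match_rate(plain_predict, tbw_key))   # 0.25
print(trigger_match_rate(marked_predict, tbw_key))  # 1.0
```

A non-watermarked model can still match a few key labels by chance, so the threshold would in practice have to be chosen with the number of classes and the size of the trigger dataset in mind.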
3 Experimental Conditions
Performance is evaluated using the models, datasets and application domains mentioned in subsection 6.1 and the three evaluation types presented in subsection 6.2.