
I was reading a pretty fun paper recently in which a team at the Open University of Israel built an age and gender classifier using a deep convolutional neural network, in a project named “Age Gender Estimation”. Then I thought to myself: what if … I managed to make the network classify a picture of an underage kid as an adult? Or turn a clearly mature profile picture into an underage one?
Here is the general roadmap of this article: a quick look at why adversarial examples work, the age classifier we attack, and how the fast gradient sign method generates the adversarial images.
[Figure caption (translated): This is not a smoking pipe]
Unexpectedly, after some research and reading a few papers, I noticed that machine learning models have serious weaknesses, and adversarial attacks are, to some extent, proof of that. Even more striking is the trade-off involved: the approximate linearity that makes models easy to train is itself an exploitable flaw.
Now assume you have a linear model with weight vector w, an input X, and an adversarial input X_c = X + eta, so that:

w^T X_c = w^T X + w^T eta
What you want is to maximize the last term, w^T eta, while keeping the perturbation eta small under some norm, so that the model responds as differently as possible to X_c and X even though the two images look alike (again, the metrics used can be anything). It is then intuitive that each coordinate of eta should have the same sign as the corresponding coordinate of the vector w.
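A tiny numerical sketch of this sign argument (the weights and epsilon here are made up for illustration):

import numpy as np

# Toy linear model: the score is the dot product w . X
w = np.array([0.5, -2.0, 1.0])
X = np.array([1.0, 1.0, 1.0])
epsilon = 0.1

# Taking each coordinate of eta with the same sign as w maximizes w . eta
# for a fixed max-norm of epsilon: the score shifts by epsilon * sum(|w|) = 0.35
eta = epsilon * np.sign(w)
print(w @ X)          # original score: -0.5
print(w @ (X + eta))  # shifted score: -0.15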
What interests us more, though, are non-linear models, since our classifier’s architecture relies heavily on ReLU and sigmoid activations.
With the same idea, and after linearizing the cost function around the input, we find that eta's direction should be chosen as follows:

eta = epsilon * sign(∇_X J(θ, X, y))

where J is the cost function, θ the model's parameters, X the input and y the true label.
What remains is then to pick a suitable magnitude epsilon for the vector eta, and this can be found with a trial-and-error method.
Now for the classifier itself. More precisely, it has the following architecture (the same structure as the one in the paper, except for two added intermediate Conv2D layers and different numbers of filters and kernel sizes):
[Figure 1]
Here are some accuracy results which, at least in our case, outperform the roughly 45% accuracy of the paper’s architecture (Figure 2).
[Figure 2]
Note that this accuracy is computed over eight classes covering fairly narrow age intervals. When they are aggregated into only two classes (adult / not adult), the accuracy increases considerably.
import os
import cv2
import numpy as np

# Age buckets of the dataset, mapped to class indices
AGE_CLASS = {'(0,2)': 0, '(4,6)': 1,
             '(8,13)': 2, '(15,20)': 3,
             '(25,32)': 4, '(38,43)': 5,
             '(48,53)': 6, '(60,100)': 7}

# Load images; the class index is encoded in the file name,
# between the first underscore and the extension
X_train, y_train = [], []
X_test, y_test = [], []
for image in os.listdir('train'):
    y_train.append(image.split('_')[1].split('.')[0])
    X_train.append(cv2.imread('train/' + image))
for image in os.listdir('test'):
    y_test.append(image.split('_')[1].split('.')[0])
    X_test.append(cv2.imread('test/' + image))
X_train, y_train = np.array(X_train), np.array(y_train)
X_test, y_test = np.array(X_test), np.array(y_test)
We also build the model and load the trained weights (we could simply have loaded the architecture from a JSON file instead of recreating the Sequential model, but we wanted to show the architecture again):
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, MaxPooling2D, BatchNormalization,
                                     Dense, Dropout, Flatten)

num_classes = len(AGE_CLASS)
input_shape = (227, 227, 3)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(3, 3), strides=2))
model.add(BatchNormalization())
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Conv2D(256, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Conv2D(512, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.25))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(num_classes, activation='softmax'))
model.summary()

# Load the weights of the trained classifier
model.load_weights('age.h5')
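As a side note, here is a minimal sketch of the adult / not-adult aggregation mentioned earlier (where exactly to split the eight buckets into the two groups is an assumption on my part):

# Map the 8 age buckets to 2 coarse classes: 0 = not adult, 1 = adult.
# Treating buckets up to (15,20) as "not adult" is an assumption.
ADULT_CLASS = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1, 7: 1}

def to_binary(labels):
    return np.array([ADULT_CLASS[int(l)] for l in labels])

# Binary accuracy computed from the 8-class predictions
pred_8 = np.argmax(model.predict(X_test.astype(np.float32)), axis=1)
binary_accuracy = np.mean(to_binary(pred_8) == to_binary(y_test))
print('adult / not-adult accuracy:', binary_accuracy)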
Now that we have our classifier, we need to generate an adversarial image to test its confidence. To do that, we first compute the loss between the classifier's prediction and the ground-truth label, then compute the gradient of this loss with respect to the input. This gives us the direction in which to shift the image (Figure 3).
# We load the image we want to attack
nb_image = 0
img = X_test[nb_image]

# Build the one-hot ground-truth label
label = np.zeros(len(AGE_CLASS))
label[int(y_test[nb_image])] = 1.
img = img.reshape(1, img.shape[0], img.shape[1], img.shape[2])
img = img.astype(np.float32)

# Convert it into a tensor
tens = tf.convert_to_tensor(img)

# Do a feedforward pass and compute the loss against the ground truth
# (loss_object is assumed here to be the categorical cross-entropy used in training)
loss_object = tf.keras.losses.CategoricalCrossentropy()
prediction = model(tens)
loss = loss_object(label.reshape(1, -1), prediction)

# Compute the gradient of the loss with respect to the input
# (tf.gradients requires TF1-style graph mode / tf.compat.v1)
gradient = tf.gradients(loss, tens)[0]

# The sign of the gradient tells us which way each pixel should move (or not move at all)
signed_grad = tf.sign(gradient)

What remains is to add some sort of perturbation layer on top of the image (adding the two images pixel by pixel). We create this layer by multiplying the signed gradient by an epsilon. (Here we shift every pixel by the same amount, but you could also think of some line-search or genetic algorithm to find an epsilon that fits each pixel.)
[Figure 3]
Now that we have the sign of the gradient, we simply add epsilon times this matrix to the image we want to change in order to get our adversarial image (we simply do a grid search for epsilon …).
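As a minimal sketch of that last step (assuming signed_grad has been evaluated to a NumPy array of the same shape as img, and that pixel values live in the 0–255 range; the epsilon grid is made up):

# Grid search over epsilon: keep the smallest perturbation that changes the class
original_class = np.argmax(model.predict(img))
for epsilon in [1., 2., 4., 8., 16., 32.]:
    adv_img = np.clip(img + epsilon * signed_grad, 0., 255.)
    adv_class = np.argmax(model.predict(adv_img))
    if adv_class != original_class:
        print('epsilon =', epsilon, ':', original_class, '->', adv_class)
        break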
Here is a first result: an adult image fooled into being classified as a child (Figure 4):
[Figure 4]
And now a kid’s image fooled into being classified as an adult (Figure 5):
[Figure 5]
Some side notes on FGSM: DNNs are unfortunately not very robust to such attacks, because the near-linearity that makes them easier to train (even though we only use ReLU and softmax here) is precisely what the attack exploits. This highlights a trade-off one needs to think about when training a classifier. Some networks, such as RBFNs (Radial Basis Function Networks), hold up better against these attacks. It has also been observed empirically that retraining classifiers on the newly created adversarial images acts as a form of regularization and avoids unjustifiably high confidence in classifications.
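As a rough sketch of that adversarial-training idea (a TF2 eager-mode sketch under stated assumptions: the fgsm_batch helper, the epsilon value and the epoch count are illustrative, not the exact setup used here):

def fgsm_batch(model, images, labels, epsilon):
    # One FGSM step on a batch of images (TF2 eager mode)
    x = tf.convert_to_tensor(images, dtype=tf.float32)
    y = tf.convert_to_tensor(labels, dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x)
        loss = tf.keras.losses.categorical_crossentropy(y, model(x))
    grad = tape.gradient(loss, x)
    adv = x + epsilon * tf.sign(grad)
    return tf.clip_by_value(adv, 0., 255.).numpy()

# Fine-tune on a mix of clean and adversarial images
y_onehot = tf.keras.utils.to_categorical(y_train.astype(int), num_classes)
X_adv = fgsm_batch(model, X_train.astype(np.float32), y_onehot, epsilon=8.0)
X_mix = np.concatenate([X_train.astype(np.float32), X_adv])
y_mix = np.concatenate([y_onehot, y_onehot])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_mix, y_mix, epochs=2, batch_size=32)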
MSc candidate in Data Science at Ecole Polytechnique Federale de Lausanne. Previously a research intern at Suga Venture Limited, a combinatorics research intern at Centre Mathematiques Appliquées, and a part-timer at the Cubotron LPBS laboratory. My main interests are 3D modeling, inference, Computer Vision, Adversarial Attacks on DNNs, Instance Segmentation, and Convex Optimisation.