Learning features for Offline Handwritten Signature Verification

Luiz G. Hafemann, Robert Sabourin, Ph.D., Luiz S. Oliveira, Ph.D.*

* Dr. Luiz S. Oliveira is with the Department of Informatics, Federal University of Parana (UFPR), Brazil


Signature verification systems aim to verify the identity of individuals by recognizing their handwritten signature. They rely on recognizing a specific, well-learned gesture to identify a person.

In spite of the many advancements in the field over the last few decades, building classifiers that can separate genuine signatures from skilled forgeries (forgeries made targeting a particular individual) remains hard, as evidenced by the large error rates obtained on the task when systems are tested on large public datasets. In particular, defining discriminative feature extractors for offline signatures is difficult. The question "What characterizes a signature?" is a difficult concept to implement as a feature descriptor, and most of the research effort in this field has been devoted to finding a good representation for signatures. To address both the issue of obtaining a good feature representation for signatures and that of improving classification performance, we investigate techniques to learn the representations directly from the signature images.

Learning Features for Offline Handwritten Signature Verification

The task of signature verification has some properties that make learning features from data very challenging. The final objective of such systems is to discriminate between genuine signatures and skilled forgeries for each user, where skilled forgeries are attempts by a forger to replicate a person's signature after having access to a genuine sample (often after practicing the forgery). Effectively, this can be seen as N 2-class classification problems, where N is the number of users in the system, with the caveat that we cannot expect to have skilled forgeries for every user during training (it can therefore also be seen as N 1-class classification problems, with only a few samples available to estimate the probability density of each user's signatures). Another challenge is that N is not fixed: new users may be added to the system at any time. Given these constraints, it is not straightforward to define how to learn features from signature data.
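As a concrete illustration of the N 2-class formulation, the sketch below trains one binary classifier per user. All data here are random stand-ins for learned feature vectors, and the classifier choice (an SVM) is illustrative; genuine signatures from other users serve as negative samples, since skilled forgeries are not assumed to be available at training time.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical feature vectors: 5 users, 10 genuine signatures each,
# in a 16-dimensional feature space (stand-ins for learned features).
n_users, n_per_user, dim = 5, 10, 16
features = {u: rng.normal(loc=u, scale=0.5, size=(n_per_user, dim))
            for u in range(n_users)}

# One binary (writer-dependent) classifier per user: the user's genuine
# signatures are positives; genuine signatures from other users stand in
# as negatives.
classifiers = {}
for u in range(n_users):
    positives = features[u]
    negatives = np.vstack([features[v] for v in range(n_users) if v != u])
    X = np.vstack([positives, negatives])
    y = np.r_[np.ones(len(positives)), np.zeros(len(negatives))]
    classifiers[u] = SVC().fit(X, y)
```

New users can be enrolled at any time by training one more classifier, without retraining the others, which is what makes the two-stage (writer-independent features, writer-dependent classifiers) design attractive.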

In the article titled Learning Features for Offline Handwritten Signature Verification using Deep Convolutional Neural Networks (article, preprint), we propose two ways of addressing the problem, using ideas from transfer learning and multi-task learning, covering both scenarios where only genuine signatures are available for training and scenarios where skilled forgeries from a subset of users are also available. The key insight is to learn Writer-Independent features (i.e. features not specific to a set of users) using Convolutional Neural Networks (CNNs), and subsequently train Writer-Dependent classifiers that specialize for each user.
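When skilled forgeries from a subset of users are available, the feature learning can combine a user-classification objective with a forgery-detection objective. The sketch below is a plain-numpy illustration of such a weighted two-term loss; the exact formulation and the network are in the paper, and all names, shapes, and the example values here are made up for illustration (lambda = 0.95 is the value mentioned below for SigNet-F).

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def multitask_loss(user_logits, user_labels,
                   forgery_logits, forgery_labels, lam=0.95):
    # User-classification term: cross-entropy over the users
    # in the development set.
    p_user = softmax(user_logits)
    l_user = -np.mean(np.log(p_user[np.arange(len(user_labels)), user_labels]))
    # Forgery-detection term: binary cross-entropy (genuine vs. forgery).
    p_forg = 1.0 / (1.0 + np.exp(-forgery_logits))
    l_forg = -np.mean(forgery_labels * np.log(p_forg)
                      + (1 - forgery_labels) * np.log(1 - p_forg))
    # Weighted combination; lam trades off the two objectives.
    return (1 - lam) * l_user + lam * l_forg
```

A network trained only on genuine signatures would simply drop the forgery term and keep the user-classification term.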

We conducted extensive experiments on four signature verification datasets, which show that features learned on a subset of users indeed generalize to other users (including users in other datasets). Classifiers trained with the learned feature representation achieved state-of-the-art performance on all four datasets. To illustrate how the features generalize to new users, consider the illustrations below, which give an overall sense of how genuine signatures and skilled forgeries are dispersed in a given feature space. We used the trained models to extract features from a validation set (a disjoint set of users), and applied a dimensionality reduction algorithm (t-SNE) to project the samples onto two dimensions, so that points that are close in the 2D representation are close in the high-dimensional feature space. With a baseline (a CNN trained on ImageNet), signatures from different users are already clustered in different parts of the feature space, but skilled forgeries lie very close to the genuine signatures of each user. With representations learned from signatures (as proposed in the paper), we see a better separation between genuine signatures and skilled forgeries.

(a) Baseline
(b) Genuine only
(c) Genuine+Forgeries
Fig 1: 2D projections (using t-SNE) of signatures from the validation set. Each point represents a signature: blue points are genuine signatures and orange points are skilled forgeries. (a) Using a model trained for object recognition as the feature extractor; (b) Using a model trained with genuine signatures; (c) Using a model trained with genuine signatures and skilled forgeries (from a subset of users)
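The projection itself is straightforward to reproduce. The sketch below projects hypothetical high-dimensional feature vectors (random stand-ins for the extracted CNN features) to 2D with scikit-learn's t-SNE; the dimensions and sample counts are illustrative.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Stand-ins for feature vectors of 30 signatures from a validation set.
feats = rng.normal(size=(30, 64))

# Project to two dimensions; points that end up close in 2D were close
# in the original feature space.
proj = TSNE(n_components=2, perplexity=5.0, random_state=0).fit_transform(feats)
```

Coloring the resulting points by user and by genuine/forgery label yields plots like those in Fig 1.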

More details can be found in the paper: DOI 10.1016/j.patcog.2017.05.012

Handling signatures of varying size

The methods presented in the paper above require that all signatures have the same size, for instance by centering each signature on a canvas of a fixed size. This creates a problem when generalizing to users whose signatures are larger than this maximum size: simply resizing the images to a smaller size changes the signal in ways not found in regular signatures (signatures are scanned at a known DPI, so invariance to scale is not learned).
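For reference, centering a signature on a fixed-size canvas can be sketched as below (the canvas dimensions are illustrative, not the ones used in the paper). Note how the function must reject, rather than silently rescale, signatures larger than the canvas, which is exactly the failure mode described above.

```python
import numpy as np

def center_in_canvas(img, canvas_size=(600, 850)):
    """Center a grayscale signature image on a fixed-size white canvas.

    Raises ValueError when the signature exceeds the canvas, since
    rescaling would distort the signal. Canvas size is illustrative.
    """
    h, w = img.shape
    H, W = canvas_size
    if h > H or w > W:
        raise ValueError("signature larger than canvas")
    canvas = np.full(canvas_size, 255, dtype=np.uint8)  # white background
    top, left = (H - h) // 2, (W - w) // 2
    canvas[top:top + h, left:left + w] = img
    return canvas
```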

We address this issue in a paper entitled Fixed-sized representation learning from Offline Handwritten Signatures of different sizes (article, preprint), by changing the network architecture to learn a fixed-sized representation regardless of the input size, using Spatial Pyramid Pooling. In this paper, we also investigated the impact of image resolution on classification performance, and the impact of finetuning the representations on different operating conditions.
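The core idea of Spatial Pyramid Pooling is simple to state: pool the last convolutional feature map into a fixed grid of bins at several scales, so the pooled vector's length depends only on the number of channels and bins, never on the input size. A minimal numpy sketch (the pyramid levels here are illustrative):

```python
import numpy as np

def spatial_pyramid_pool(fmap, levels=(1, 2, 4)):
    """Max-pool a C x H x W feature map into a fixed-length vector.

    Each level n divides the map into an n x n grid and takes the
    per-channel maximum of each cell, so the output length is
    C * sum(n * n for n in levels), independent of H and W.
    """
    C, H, W = fmap.shape
    out = []
    for n in levels:
        hs = np.linspace(0, H, n + 1).astype(int)
        ws = np.linspace(0, W, n + 1).astype(int)
        for i in range(n):
            for j in range(n):
                cell = fmap[:, hs[i]:hs[i + 1], ws[j]:ws[j + 1]]
                out.append(cell.max(axis=(1, 2)))
    return np.concatenate(out)
```

Because the output length is fixed, the fully-connected layers that follow can accept signatures of any size without resizing the input image.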

Code and trained models

We are sharing the trained models so that other researchers can use them as specialized feature extractors for Offline Handwritten Signatures. The code for using the trained models can be found at: https://github.com/luizgh/sigver_wiwd. The trained weights can be downloaded from the following links: SigNet models, SigNet-SPP models.

Download extracted features

To facilitate further research, we are also making available the features extracted from each of the four datasets used in this work (GPDS, MCYT, CEDAR, Brazilian PUC-PR), using the models SigNet, SigNet-F (with lambda=0.95) and SigNet-SPP-300dpi:

Dataset             SigNet              SigNet-F              SigNet-SPP-300dpi
GPDS                GPDS_signet         GPDS_signet_f         GPDS_signetspp_300dpi
MCYT                MCYT_signet         MCYT_signet_f         MCYT_signetspp_300dpi
CEDAR               CEDAR_signet        CEDAR_signet_f        CEDAR_signetspp_300dpi
Brazilian PUC-PR*   brazilian_signet    brazilian_signet_f    Brazilian_signetspp_300dpi

To use these extracted features, please follow the instructions at this link.

References on Offline Signature Verification

L. G. Hafemann, R. Sabourin, L. S. Oliveira, Fixed-sized representation learning from offline handwritten signatures of different sizes. doi:10.1007/s10032-018-0301-6

L. G. Hafemann, R. Sabourin, L. S. Oliveira, Learning features for offline handwritten signature verification using deep convolutional neural networks. doi:10.1016/j.patcog.2017.05.012

D. Impedovo, G. Pirlo, Automatic signature verification: The state of the art. doi:10.1109/TSMCC.2008.923866

L. G. Hafemann, R. Sabourin, L. S. Oliveira, Offline Handwritten Signature Verification - Literature Review. arXiv:1507.07909

M. B. Yilmaz, B. Yanikoglu, Score level fusion of classifiers in off-line signature verification. doi:10.1016/j.inffus.2016.02.003

G. Eskander, R. Sabourin, E. Granger, Hybrid writer-independent-writer-dependent offline signature verification system. doi:10.1049/iet-bmt.2013.0024

D. Bertolini, L. S. Oliveira, E. Justino, R. Sabourin, Reducing forgeries in writer-independent off-line signature verification through ensemble of classifiers. doi:10.1016/j.patcog.2009.05.009

M. A. Ferrer, F. Vargas, A. Morales, A. Ordoñez, Robustness of Off-line Signature Verification based on Gray Level Features. doi:10.1109/TIFS.2012.2190281

If using any of the four datasets mentioned above, please cite the paper that introduced the dataset:

GPDS: Vargas, J.F., M.A. Ferrer, C.M. Travieso, and J.B. Alonso. 2007. Off-Line Handwritten Signature GPDS-960 Corpus. doi:10.1109/ICDAR.2007.4377018.

MCYT: Ortega-Garcia, Javier, J. Fierrez-Aguilar, D. Simon, J. Gonzalez, M. Faundez-Zanuy, V. Espinosa, A. Satue, et al. 2003. MCYT Baseline Corpus: A Bimodal Biometric Database. IEE Proceedings-Vision, Image and Signal Processing 150 (6): 395–401. doi:10.1049/ip-vis:20031078

CEDAR: Kalera, Meenakshi K., Sargur Srihari, and Aihua Xu. 2004. “Offline Signature Verification and Identification Using Distance Statistics.” International Journal of Pattern Recognition and Artificial Intelligence 18 (7): 1339–60. doi:10.1142/S0218001404003630.

Brazilian PUC-PR: Freitas, C., M. Morita, L. Oliveira, E. Justino, A. Yacoubi, E. Lethelier, F. Bortolozzi, and R. Sabourin. 2000. “Bases de Dados de Cheques Bancarios Brasileiros.” In XXVI Conferencia Latinoamericana de Informatica.


We would like to thank Dr. M. Ferrer for sharing the GPDS-960 dataset used in this work, as well as the other research groups that shared the MCYT, CEDAR and Brazilian PUC-PR datasets for scientific research. This project is financially supported by CNPq and NSERC.
