Facial Embedding

Model by Open Source

This model generates a representative embedding of the face found in the provided image, using a deployment of an open source implementation of the FaceNet face recognizer developed by Google. The input is an RGB image containing a single face. The output is a 512-dimensional representative vector.

This model can be used to cluster images by similarity, to identify people across a range of poses and lighting conditions, or to recognize and track people in video.
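
A minimal sketch of producing and comparing these embeddings, assuming the open source facenet-pytorch package (one common implementation of this pipeline; the deployed model's interface may differ):

    import torch
    from PIL import Image
    from facenet_pytorch import MTCNN, InceptionResnetV1

    mtcnn = MTCNN(image_size=160)                             # detects and aligns one face
    resnet = InceptionResnetV1(pretrained='vggface2').eval()  # 512-d embedding network

    def embed(path):
        face = mtcnn(Image.open(path).convert('RGB'))         # aligned face tensor, or None
        if face is None:
            raise ValueError(f"no face found in {path}")
        with torch.no_grad():
            return resnet(face.unsqueeze(0))[0]               # shape: (512,)

    # Similar faces map to nearby vectors, so a simple cosine similarity
    # supports the clustering, identification, and tracking use cases.
    sim = torch.nn.functional.cosine_similarity(embed('a.jpg'), embed('b.jpg'), dim=0)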

  • Description

    Product Description

    PERFORMANCE METRICS

    99.7% Average Accuracy – The average of the per-class accuracies.

    This model was trained and verified on the VGGFace2 dataset, which contains 3.31 million images of 9131 subjects (identities), with an average of 362.6 images per subject. The model is evaluated on the face recognition task. When validated against the Labeled Faces in the Wild (LFW) dataset, the model achieves an average accuracy of 0.99650 +/- 0.00252. When the false alarm rate is restricted to 0.001, the reported accuracy is 0.98367 +/- 0.00948.
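
    A sketch of how accuracy under a restricted false alarm rate can be computed from embedding distances, using synthetic same-identity and different-identity pair distances (illustrative only, not the actual evaluation harness):

        import numpy as np

        def accuracy_at_far(genuine, impostor, far=1e-3):
            # largest distance threshold whose false alarm rate stays <= far
            thresholds = np.sort(impostor)
            idx = max(int(far * len(impostor)) - 1, 0)
            threshold = thresholds[idx]
            return float(np.mean(genuine < threshold)), threshold

        rng = np.random.default_rng(0)
        genuine = np.abs(rng.normal(0.4, 0.15, 3000))   # same-person pair distances
        impostor = np.abs(rng.normal(1.4, 0.20, 3000))  # different-person pair distances
        acc, thr = accuracy_at_far(genuine, impostor)
        print(f"accuracy {acc:.5f} at FAR 0.001 (threshold {thr:.3f})")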

    OVERVIEW

    This model is based on the Inception-ResNet-v1 deep neural network and draws on Google's FaceNet publication. Unlike the FaceNet paper, which uses a triplet loss, this implementation is trained with a softmax loss. It also uses MTCNN to detect and align the face in the provided image.
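
    A sketch of that training setup, assuming PyTorch: a softmax classification layer over the training identities sits on top of the 512-dimensional embedding during training and is discarded at inference, where only the embedding is kept:

        import torch
        import torch.nn as nn

        class FaceEmbedder(nn.Module):
            def __init__(self, backbone, num_identities=9131):  # 9131 VGGFace2 identities
                super().__init__()
                self.backbone = backbone                 # e.g. the Inception-ResNet-v1 trunk
                self.classifier = nn.Linear(512, num_identities)

            def forward(self, faces):
                emb = self.backbone(faces)               # (batch, 512) embeddings
                return emb, self.classifier(emb)         # logits are used only in training

        # Stand-in backbone so the sketch runs; the real trunk is Inception-ResNet-v1.
        backbone = nn.Sequential(nn.Flatten(), nn.LazyLinear(512))
        model = FaceEmbedder(backbone)
        emb, logits = model(torch.randn(2, 3, 160, 160))
        loss = nn.CrossEntropyLoss()(logits, torch.tensor([0, 1]))  # the softmax loss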

    TRAINING

    This model was trained on the VGGFace2 dataset, which contains 3.31 million images of 9131 subjects. The images were downloaded from Google Image Search and show large variations in pose, age, illumination, ethnicity, and profession (e.g. actors, athletes, politicians). The model was trained with a softmax loss on the Inception-ResNet-v1 base architecture, using a learning rate schedule of 0.05 for the first 60 epochs, 0.005 for the next 20 epochs, and 0.0005 for the last 11 epochs. The L2 weight decay was set to 0.0005 and the dropout keep probability to 0.8.
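
    The quoted schedule is piecewise constant over 91 epochs (60 + 20 + 11); a sketch:

        def learning_rate(epoch):
            if epoch < 60:
                return 0.05    # epochs 0-59
            elif epoch < 80:
                return 0.005   # epochs 60-79
            else:
                return 0.0005  # epochs 80-90

    A dropout keep probability of 0.8 corresponds to a dropout rate of 0.2 in frameworks that parameterize the rate instead.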

    VALIDATION

    The performance of the model was tested on the validation set of the Labeled Faces in the Wild dataset.

    INPUT SPECIFICATION

    The input(s) to this model must adhere to the following specifications:

    Filename | Maximum Size | Accepted Format(s)
    ---------|--------------|-------------------
    image    | 10M          | .jpg, .png, .tif
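
    A sketch of checking a file against this specification before submission (the names here are illustrative, not the service's API):

        import os

        MAX_BYTES = 10 * 1024 * 1024            # 10M limit from the table above (assuming mebibytes)
        ACCEPTED = {'.jpg', '.png', '.tif'}

        def check_input(path):
            ext = os.path.splitext(path)[1].lower()
            if ext not in ACCEPTED:
                raise ValueError(f"unsupported format {ext}")
            if os.path.getsize(path) > MAX_BYTES:
                raise ValueError("image exceeds the 10M size limit")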

    OUTPUT DETAILS

    This model will output the following:

    Filename     | Maximum Size | Format
    -------------|--------------|-------
    results.json | 1M           | .json
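
    A hypothetical sketch of reading the output; the schema of results.json is not specified here, so the 'embedding' key below is an assumption to be checked against actual output:

        import json

        def load_embedding(path='results.json'):
            with open(path) as f:
                results = json.load(f)
            vec = results['embedding']   # assumed key holding the output vector
            assert len(vec) == 512       # the model outputs a 512-dimensional vector
            return vec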