Vehicle Detection in Aerial Imagery

Model by Modzy

This model detects cars, buses, vans, and other vehicles within overhead imagery.

    Product Description


    80.9% Precision

    This model was trained on the COWC (Cars Overhead With Context) dataset and achieves an average precision of 0.809 for vehicle detection on the holdout validation subset. Although it was trained to tolerate some variation in resolution, it works best on imagery with a ground sample distance (GSD) of about 0.3 m. For fast inference, run it on a GPU.

    A higher precision score indicates that, on average, a larger share of the labels the model predicts across classes are correct.


    This model detects vehicles in 416 x 416 pixel RGB image chips taken from overhead satellite imagery, and it was trained and validated on satellite imagery. It returns a JSON file containing the bounding boxes of detected vehicles and their confidence scores. For both training and validation, the source imagery was subdivided into chips of 416 x 416 pixels (the input size of the architecture), which were then processed at their native resolution. Chips that contained no objects were discarded.
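
    The chipping step described above can be sketched roughly as follows. This is an illustration only: the source does not say whether chips overlap, how partial chips at the image edge are handled, or how an object is assigned to a chip. The sketch assumes non-overlapping chips, drops partial edge chips, and assigns each annotated box to the chip containing its centre. All function names are hypothetical.

```python
CHIP = 416  # chip edge length in pixels, matching the model input size


def chip_origins(width, height, chip=CHIP):
    """Return (left, top) corners of non-overlapping chips that fit fully
    inside an image of the given size (assumption: edge remainders are dropped)."""
    return [(x, y)
            for y in range(0, height - chip + 1, chip)
            for x in range(0, width - chip + 1, chip)]


def box_in_chip(box, origin, chip=CHIP):
    """True if the centre of box (x_min, y_min, x_max, y_max) lies inside
    the chip whose top-left corner is origin."""
    x_min, y_min, x_max, y_max = box
    cx, cy = (x_min + x_max) / 2, (y_min + y_max) / 2
    left, top = origin
    return left <= cx < left + chip and top <= cy < top + chip


def chips_with_objects(width, height, boxes, chip=CHIP):
    """Keep only chips that contain at least one annotated box;
    chips with no objects are discarded."""
    return [origin for origin in chip_origins(width, height, chip)
            if any(box_in_chip(b, origin, chip) for b in boxes)]
```

    For example, a 1000 x 900 pixel image yields four full chips under these assumptions, and only those containing an annotated vehicle would be kept.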


    This model uses a Keras and TensorFlow implementation of the YOLOv3 architecture. It was trained on the COWC dataset with a batch size of 64, random rescaling between 352 and 416 pixels, and a base learning rate of 0.001. Since YOLOv3 is a “dense” object detector, the ignore threshold for outputs is set to 0.5. Training was performed on four Tesla V100 GPUs.
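
    As a rough sketch, the training settings listed above could be collected as shown below. The dictionary keys and function names are hypothetical, not taken from the model's code. The sketch also assumes that the random rescaling draws input sizes in steps of 32 pixels (YOLOv3's largest downsampling stride), which is common practice but is not stated in the source.

```python
import random

STRIDE = 32  # YOLOv3's coarsest detection scale downsamples the input by 32

# Values come from the description above; the key names are illustrative only.
TRAINING_CONFIG = {
    "batch_size": 64,
    "base_learning_rate": 0.001,
    "ignore_threshold": 0.5,
    "min_input_size": 352,
    "max_input_size": 416,
}


def candidate_sizes(config=TRAINING_CONFIG, stride=STRIDE):
    """Input sizes between the minimum and maximum, in steps of stride
    (assumption: sizes are restricted to multiples of the stride)."""
    return list(range(config["min_input_size"],
                      config["max_input_size"] + 1,
                      stride))


def pick_training_size(rng):
    """Draw one square input size for the next training batch."""
    return rng.choice(candidate_sizes())
```

    Under these assumptions the candidate sizes are 352, 384, and 416 pixels, and a caller would pass a seeded `random.Random` instance to make the draws reproducible.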


    The model was evaluated on a holdout subset of the COWC dataset and achieves an average precision score of 0.809 for vehicle detection.
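
    For readers unfamiliar with the metric, the sketch below shows one common way to compute average precision for a single class: rank detections by confidence, track precision and recall down the ranking, and take the area under the interpolated precision-recall curve. The source does not state which AP variant or which overlap threshold was used in the evaluation, so this is an assumption for illustration, and the function name is hypothetical.

```python
def average_precision(detections, num_ground_truth):
    """Compute all-point interpolated average precision.

    detections: list of (confidence, is_true_positive) pairs, where
    is_true_positive says whether the detection matched a ground-truth box.
    num_ground_truth: total number of ground-truth boxes.
    """
    if num_ground_truth == 0:
        return 0.0
    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    precisions, recalls = [], []
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)
    # Interpolate: each precision becomes the best precision at that recall or higher.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

    A detector whose confident predictions are mostly correct scores close to 1.0; false positives ranked above true positives pull the score down.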


    The input(s) to this model must adhere to the following specifications:

    Filename    Maximum Size    Accepted Format(s)
    image       10M             .png, .jpeg, .tiff, .jpg

    The maximum input dimension is 416 x 416 pixels.
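
    A caller might check a file against these limits before submitting it. The sketch below is a hypothetical pre-flight check, not part of the model's API. It assumes that "10M" means 10 MiB and that the caller already knows the image's pixel dimensions (for example, from an image library).

```python
import os

ALLOWED_EXTENSIONS = {".png", ".jpeg", ".jpg", ".tiff"}
MAX_BYTES = 10 * 1024 * 1024  # assumption: "10M" is read as 10 MiB
MAX_SIDE = 416                # maximum width and height in pixels


def check_input(path, width, height, size_bytes=None):
    """Return a list of problems with the input; an empty list means it
    satisfies the stated limits. size_bytes is read from disk if omitted."""
    problems = []
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes is None:
        size_bytes = os.path.getsize(path)
    if size_bytes > MAX_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    if width > MAX_SIDE or height > MAX_SIDE:
        problems.append(f"image too large: {width} x {height} pixels")
    return problems
```

    Larger scenes would need to be split into chips of at most 416 x 416 pixels before each chip is checked and submitted.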


    This model will output the following:

    Filename        Maximum Size    Format
    results.json    1M              .json

    The output file (results.json) contains the bounding boxes of detected vehicles. Each bounding box includes its confidence score and the x,y coordinates of its top-left and bottom-right corners.
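
    Reading the output might look something like the sketch below. The exact JSON schema is not given in the source, so the key names used here (`detections`, `confidence`, `x_min`, `y_min`, `x_max`, `y_max`) and the sample values are purely hypothetical stand-ins for whatever results.json actually contains.

```python
import json

# Hypothetical sample; the real results.json layout may differ.
SAMPLE_RESULTS = """
{
  "detections": [
    {"confidence": 0.93, "x_min": 12, "y_min": 40, "x_max": 31, "y_max": 52},
    {"confidence": 0.41, "x_min": 200, "y_min": 310, "x_max": 222, "y_max": 325}
  ]
}
"""


def load_boxes(text, min_confidence=0.5):
    """Parse results JSON and return (x_min, y_min, x_max, y_max, confidence)
    tuples for detections at or above min_confidence."""
    data = json.loads(text)
    boxes = []
    for det in data["detections"]:
        if det["confidence"] >= min_confidence:
            boxes.append((det["x_min"], det["y_min"],
                          det["x_max"], det["y_max"],
                          det["confidence"]))
    return boxes
```

    Filtering on the confidence score lets a caller trade recall for precision; the 0.5 default here is an arbitrary illustrative choice, not a recommended setting.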