Quick tour

The Kaepler library provides pre-trained machine learning models for very specific use cases. This lets you reach state-of-the-art accuracy in your projects with 10% or less of the annotated data normally required. Let’s have a quick look at how it works.

Before starting, make sure you have your Developer API key stored in your environment variables as KAEPLER_KEY. If you don’t have one yet, please create your account here.
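
A quick way to confirm that the key is visible to Python before going further is sketched below. It uses only the standard library; the helper name `read_kaepler_key` is hypothetical and not part of Kaepler.

```python
import os


def read_kaepler_key(env=None):
    """Return the API key stored under KAEPLER_KEY, or raise a clear error.

    Hypothetical helper for illustration; not part of the Kaepler API.
    """
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    key = env.get("KAEPLER_KEY", "").strip()
    if not key:
        raise RuntimeError("KAEPLER_KEY is not set; export it before using Kaepler.")
    return key
```

Calling such a helper at the top of a script makes a missing key fail early with an explicit message rather than later during a model download.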

Getting started on a task with `pipeline`

If the task you want to perform is already implemented in Kaepler, the easiest way to use a pre-trained model is pipeline().

Out of the box, the following tasks are supported:

360° fisheye camera from the ceiling / person detection
360° fisheye camera from the ceiling / people counting
360° fisheye camera from the ceiling / person tracking
360° fisheye camera from the ceiling / car detection
360° fisheye camera from the ceiling / car tracking

Example: person detection on a 360° video

Let’s see how we would use a person detection model pre-trained on 360° camera footage:

>>> from kaepler.cv import pipeline, VideoLoader
>>> detector = pipeline(input_type="360-deg-fisheye-ceiling",
...                     task="person-detection",
...                     model="DETR",
...                     backbone="resnet50",
...                     backend="pytorch")

The pipeline accepts several parameters for you to adapt it to your needs.

In this case:

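the input comes from a ceiling-mounted 360° fisheye camera (360-deg-fisheye-ceiling)
the task we want to perform is person-detection
the model we want to use is DETR, from End-to-End Object Detection with Transformers[1]
we want the backbone to be a resnet50
and finally we want the pytorch implementation of the model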

When executed for the first time, pipeline downloads the model and saves it to disk for later use.
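
How Kaepler caches the model is not specified here. The following is only an illustration of the general download-once pattern, not Kaepler's implementation; `fetch_once` and its `download` callback are hypothetical names.

```python
from pathlib import Path


def fetch_once(name, cache_dir, download):
    """Return the cached file path, calling `download(name)` only on a cache miss.

    `download` is any callable that returns the file content as bytes.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / name
    if not target.exists():
        # First call: fetch the content and store it for later runs.
        target.write_bytes(download(name))
    return target
```

With this pattern, only the first call pays the download cost; later calls find the file on disk and return immediately.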

You now have a detector ready to run on input data. Let’s load a video.

>>> loader = VideoLoader("cctv_360.mp4")

Your video is now ready to be processed by the detector.

>>> preds = detector(loader)

preds is a Predictions object that implements several helper functions to let you visualize and use the predictions quickly.

For example, let’s say we want to see which people are detected in the first frame:

>>> for prediction in preds.frame[0]:
...    print(f"- {prediction['label']} with score: {prediction['score']}")
- Person with score: 0.9654
- Person with score: 0.9128
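
To turn such a listing into a count, a small helper is enough. The sketch below assumes each prediction behaves like a dict with 'label' and 'score' keys, as in the output above; the function name `count_people` and the `min_score` threshold are hypothetical, and plain dicts stand in for the Predictions object.

```python
def count_people(frame_predictions, min_score=0.5):
    """Count predictions labelled "Person" whose score is at least `min_score`."""
    return sum(
        1
        for p in frame_predictions
        if p["label"] == "Person" and p["score"] >= min_score
    )


# Stand-in for the first frame, using the two detections printed above.
first_frame = [
    {"label": "Person", "score": 0.9654},
    {"label": "Person", "score": 0.9128},
]
print(count_people(first_frame))        # 2
print(count_people(first_frame, 0.95))  # 1
```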

We can also output a video with the overlaid predictions:

>>> preds.output_predictions("processed_cctv_360.mp4")

pipeline lets you start your projects quickly without spending time implementing a data loader or visualization tools. However, you sometimes need customized models and data loaders.

Customize the model

You can change how the model itself is built by using Config objects. The parameters can be passed as arguments or loaded from a YAML file. The example below configures an EfficientDet[2] detector:

>>> from kaepler.cv.models import EfficientDetConfig, EfficientDetDetector
>>> config = EfficientDetConfig(model_type="D0",
...                             backend="tensorflow",
...                             input_size=(512, 512, 1),
...                             quantized=True)
>>> efficientdet = EfficientDetDetector(config).from_pretrained("360-deg-fisheye-ceiling")
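
To illustrate how one config object can accept both keyword arguments and values read from a file, here is a minimal standard-library sketch. `DetectorConfigSketch` and `from_mapping` are hypothetical and do not reproduce the Kaepler Config classes. The mapping is assumed to be what a YAML parser would return, and the sketch does not parse YAML itself.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DetectorConfigSketch:
    """Hypothetical stand-in for a model config; field names mirror the example above."""
    model_type: str = "D0"
    backend: str = "tensorflow"
    input_size: tuple = (512, 512, 1)
    quantized: bool = False

    @classmethod
    def from_mapping(cls, mapping):
        """Build a config from a dict, e.g. one produced by a YAML parser."""
        known = {f.name for f in fields(cls)}
        unknown = set(mapping) - known
        if unknown:
            # Reject misspelled keys instead of silently ignoring them.
            raise ValueError(f"Unknown config keys: {sorted(unknown)}")
        values = dict(mapping)
        if "input_size" in values:
            # YAML sequences arrive as lists; normalize to a tuple.
            values["input_size"] = tuple(values["input_size"])
        return cls(**values)
```

Both construction paths yield equal objects: `DetectorConfigSketch(model_type="D1")` and `DetectorConfigSketch.from_mapping({"model_type": "D1"})` compare equal.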

Each Config object provides all the documentation you need to customize the model. All versions of the models are pre-trained following current best practices to ensure the best possible performance on your predictions.

Retrain easily for your task

The main advantage of using Kaepler is the ability to start from a well-trained model based on a dataset that is very similar to yours. This usually lets you use the models without any modification. If your task is not already implemented in the library, you can freeze the feature extractor and retrain only the head. This considerably reduces the amount of data and training time compared to training from scratch: you can expect a decrease of about 95% in the required amount of annotated data and about 80% in training time.
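
As a rough worked example of what these figures mean in practice, the sketch below applies the two reductions to a from-scratch baseline. The baseline of 10,000 annotated images and 20 training hours is purely hypothetical, and `estimate_retraining_budget` is an illustrative helper, not a Kaepler function.

```python
def estimate_retraining_budget(images_from_scratch, hours_from_scratch,
                               data_reduction=0.95, time_reduction=0.80):
    """Apply the approximate reductions quoted above to a from-scratch baseline."""
    images = round(images_from_scratch * (1 - data_reduction))
    hours = round(hours_from_scratch * (1 - time_reduction), 2)
    return images, hours


# Hypothetical baseline: 10,000 annotated images and 20 hours of training.
print(estimate_retraining_budget(10_000, 20))  # (500, 4.0)
```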

Let’s look at how to retrain an SSD[3] object detection model.

First, let’s load our model with the default parameters:

>>> from kaepler.cv.models import SSDDetector
>>> model = SSDDetector().from_pretrained("360-deg-fisheye-ceiling")

Let’s now freeze the feature extractor to train only the object detection head of our model:

>>> model.feature_extractor.trainable = False

We can now load our training and validation sets:

>>> from kaepler.cv import DataLoader
>>> train_dataset = DataLoader(path="data/train", format="PASCALVOC", shuffle=True)
>>> val_dataset = DataLoader(path="data/validation", format="PASCALVOC")
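
For reference, here is a minimal sketch of reading one annotation file, assuming "PASCALVOC" refers to the usual Pascal VOC XML layout (one file per image, with `object`, `name` and `bndbox` elements). It uses only the standard library; `parse_voc_annotation` is a hypothetical helper and says nothing about how the Kaepler DataLoader reads the files internally.

```python
import xml.etree.ElementTree as ET


def parse_voc_annotation(xml_text):
    """Return a list of (label, xmin, ymin, xmax, ymax) tuples from a VOC XML string."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        # Coordinates are pixel values stored as text; some tools write them as floats.
        coords = tuple(
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((obj.findtext("name"), *coords))
    return boxes


# Hand-written sample annotation with a single person box.
SAMPLE_XML = """
<annotation>
  <filename>frame_0001.jpg</filename>
  <object>
    <name>person</name>
    <bndbox>
      <xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax>
    </bndbox>
  </object>
</annotation>
"""
print(parse_voc_annotation(SAMPLE_XML))  # [('person', 48, 240, 195, 371)]
```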

Now we can load the Trainer object responsible for running the training:

>>> from kaepler.cv import Trainer
>>> trainer = Trainer()

We are now ready to run our training:

>>> trainer.fit(model, train_dataset, val_dataset)

When the training is done, we can generate an HTML report:

>>> trainer.validate(val_dataset)
>>> trainer.report("report_20210911")

References

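[1] End-to-End Object Detection with Transformers, 2020, N. Carion et al.

[2] EfficientDet: Scalable and Efficient Object Detection, 2020, M. Tan et al.

[3] SSD: Single Shot MultiBox Detector, 2016, W. Liu et al.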