Active Learning

Improve model accuracy using new data after you have an initial model version in production.

Active learning is the process of improving your dataset with images gathered in production. The Roboflow Python package has an active learning feature. This feature has conditional upload logic to determine which images should be directly uploaded to your Roboflow workspace.

Use Active Learning

Step #1: Install the Roboflow pip package

First, install the Roboflow pip package. You can use the active_learning function, described below, to implement an active learning pipeline.

Step #2: Write code to use the active_learning function

The active_learning function helps you implement an active learning pipeline. Here is the documentation for the function:

def active_learning(
    self,
    raw_data_location: str = "",
    raw_data_extension: str = "",
    inference_endpoint: list = [],
    upload_destination: str = "",
    conditionals: dict = {},
    use_localhost: bool = False,
) -> str:
    """perform inference on each image in directory and upload based on conditions
    @params:
        raw_data_location: (str) = folder of frames to be processed
        raw_data_extension: (str) = extension of frames to be processed
        inference_endpoint: (List[str, int]) = name of the project
        upload_destination: (str) = name of the upload project
        conditionals: (dict) = dictionary of upload conditions
        use_localhost: (bool) = determines if local http format used or remote endpoint
    """

The conditionals dictionary takes in a series of configuration options. Here are the available configuration options:

{
    "similarity_confidence_threshold": 0-1, #the similarity the next uploaded inference must be less than
    "similarity_timeout_counter: N, #the number of frames to reupload anyways even if similarity reqs are not met
    "required_objects_count": N, #the model must predict more objects than this to upload
    "required_class_count": N, #the number of boxes of a class required to upload
    "target_classes": CLASS_NAME, #the class to target, only images with classes greater than this will be uploaded
    "confidence_interval" [0,1], #the confidence all preds must fall into
    "minimum_size_requirement": N, #minimum number of pixels all pred boxes must have,
    "maximum_size_requirment":N #maximum number of pixels all pred boxes must have,
}

Step #3: Use Active Learning in a Full Script

The script below shows how to use active learning in a full script:

active_learning.py
from roboflow import Roboflow
# obtaining your API key: https://docs.roboflow.com/rest-api#obtaining-your-api-key
rf = Roboflow(api_key="INSERT_PRIVATE_API_KEY")
workspace = rf.workspace()

raw_data_location = "INSERT_PATH_TO_IMAGES"
raw_data_extension = ".jpg"

# replace * with your model version number for inference
inference_endpoint = ["INSERT_MODEL_ID", *]
upload_destination = "INSERT_MODEL_ID"

# set the conditionals values as necessary for your active learning needs
# NOTE - not all conditional fields are required
conditionals = {
    "required_objects_count" : 1,
    "required_class_count": 1,
    "target_classes": [],
    "minimum_size_requirement" : float('-inf'),
    "maximum_size_requirement" : float('inf'),
    "confidence_interval" : [10,90],
}

# filtering out images for upload by similarity is available for paid plans
# contact the Roboflow team for access: https://roboflow.com/sales
# conditionals = {
#     "required_objects_count" : 1,
#     "required_class_count": 1,
#     "target_classes": [],
#     "minimum_size_requirement" : float('-inf'),
#     "maximum_size_requirement" : float('inf'),
#     "confidence_interval" : [10,90],
#     "similarity_confidence_threshold": .3,
#     "similarity_timeout_limit": 3
# }

workspace.active_learning(raw_data_location=raw_data_location, 
    raw_data_extension=raw_data_extension,
    inference_endpoint=inference_endpoint,
    upload_destination=upload_destination,
    conditionals=conditionals)

Reference a Local Inference Server for Active Learning

A local inference server must be running and available for communication before executing the Python script.

Update use_localhost to True , it will then automatically detect and point at the locally running Roboflow inference server.

workspace.active_learning(raw_data_location=raw_data_location, 
    raw_data_extension=raw_data_extension,
    inference_endpoint=inference_endpoint,
    upload_destination=upload_destination,
    conditionals=conditionals,
    use_localhost=True)

See Also

Last updated