Annotate Multimodal Data
If you are labeling a dataset that is part of a Multimodal project, prefixes are used to annotate your images.
A prefix can either be:
An identifier like <PREFIX>, which is used to prompt a model like Florence-2; or
A question like "What is in this image?", ideal for use with general VQA models like GPT-4o.
For Florence-2 fine-tuning, for example, the prefix you choose corresponds to the prefix prompt you give to the model. Florence-2 prefixes should be in the format <PREFIX>, like <TOTAL>.
For GPT-4o, your prefix may be: "What is the total in this receipt?".
You may want to add a different prefix for each feature of an object that you want to identify, like total, subtotal, and tax.
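As a rough illustration of the two prefix styles, the sketch below builds one prefix per receipt feature, first as an identifier and then as a question. The helper names (`identifier_prefix`, `question_prefix`, `is_identifier_prefix`) are hypothetical and not part of any Roboflow API, and the uppercase-only check is an assumed convention based on the <TOTAL> example rather than a documented rule.

```python
import re

# Assumed convention: uppercase letters, digits, or underscores inside angle
# brackets, modeled on the <TOTAL> example. Not a documented requirement.
IDENTIFIER_PATTERN = re.compile(r"<[A-Z][A-Z0-9_]*>")


def identifier_prefix(feature: str) -> str:
    """Build an identifier-style prefix such as <TOTAL> from a feature name."""
    cleaned = re.sub(r"[^A-Za-z0-9]+", "_", feature.strip()).strip("_")
    return "<" + cleaned.upper() + ">"


def question_prefix(feature: str, subject: str = "receipt") -> str:
    """Build a question-style prefix for a general VQA model."""
    return f"What is the {feature.strip()} in this {subject}?"


def is_identifier_prefix(prefix: str) -> bool:
    """Return True if the prefix matches the assumed <PREFIX> shape."""
    return IDENTIFIER_PATTERN.fullmatch(prefix) is not None


features = ["total", "subtotal", "tax"]

# Identifier prefixes, e.g. for Florence-2 fine-tuning.
florence_prefixes = [identifier_prefix(f) for f in features]

# Question prefixes, e.g. for a general VQA model like GPT-4o.
vqa_prefixes = [question_prefix(f) for f in features]

print(florence_prefixes)
print(vqa_prefixes)
```

Whichever style you pick, using a single style across all prefixes in a project keeps the dataset aligned with the model you plan to train.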
To add prefixes, click "Classes & Tags" in the Roboflow sidebar, then click the "Add " button:
Then, enter the prefix. This may be a question like "What is in the image?" or a unique ID like "<RECEIPT>", depending on the model you want to train.
You can add multiple prefixes with the "+" button.
Click “Add Prefixes” to add your prefixes.
Once you have set prefixes, they will be available as questions in your annotation editor.
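One way to picture the result of annotating is as a set of prefix and answer pairs per image. The sketch below is only a mental model: the function `annotation_rows`, the field names (`image`, `prefix`, `answer`), the file name, and the values are all hypothetical, and the actual format Roboflow stores or exports may differ.

```python
import json


def annotation_rows(image_name: str, answers: dict) -> list:
    """Return one row per prefix, pairing each prefix with its typed answer."""
    return [
        {"image": image_name, "prefix": prefix, "answer": answer}
        for prefix, answer in answers.items()
    ]


# Hypothetical answers an annotator might enter for a single receipt image.
rows = annotation_rows(
    "receipt-001.jpg",
    {"<TOTAL>": "12.50", "<SUBTOTAL>": "11.00", "<TAX>": "1.50"},
)

# Serialize as JSON Lines (one object per line) purely for illustration.
jsonl = "\n".join(json.dumps(row) for row in rows)
print(jsonl)
```

Each prefix you add therefore becomes one more question to answer for every image in the project.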