You can use computer vision to identify the types of products – or the absence of products where one is expected – in an image. This could be used in supermarkets or retail stores to identify areas on shelves that are missing products, or combined with a system to validate that each product is in the correct place according to a planogram.
In this guide, we are going to walk through how to use the Roboflow product recognition API, which returns the coordinates of products on a retail shelf. We will then discuss, at a high level, how this API could be used as part of a system that validates whether products are in the correct place on a shelf.
Here is an example of the product recognition system working on an image:
Step #1: Create a Roboflow Account
To use the Roboflow product recognition API, you will need a free Roboflow account. This account will allow you to retrieve an API key you can use to access the API.
Once you have created a Roboflow account, you are ready to run the product recognition API.
The API can detect two classes:
- Product, and;
- An empty space.
Thus, you can use the API to detect both the presence and absence of products.
Without further ado, let’s get started!
Step #2: Run the Product Recognition API
There are two ways you can run the product recognition API:
- In the cloud, or;
- On your own hardware.
In this guide, we will show you how to deploy the model on your own hardware. To get started, you will need to install Roboflow Inference, a system for running computer vision models. To install Inference, run the following command:
pip install inference inference-sdk
Next, you need to set your API key. Refer to the Roboflow documentation to learn how to retrieve your API key. Once you have your API key, set it in an environment variable called
With your API key set, you can run the product recognition API.
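Before loading the model, it can help to confirm that the key is actually visible to Python. The snippet below is a minimal sketch: it assumes the variable is named `ROBOFLOW_API_KEY`, which is the name the Inference documentation uses, and `require_api_key` is a hypothetical helper written for this example, not part of Inference.

```python
import os

# Assumption: Inference reads the key from ROBOFLOW_API_KEY.
API_KEY_VAR = "ROBOFLOW_API_KEY"


def require_api_key(env=None):
    """Return the API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get(API_KEY_VAR, "").strip()
    if not key:
        raise RuntimeError(f"Set {API_KEY_VAR} before loading the model.")
    return key
```

Calling `require_api_key()` at the top of your script fails early with a readable message instead of a less obvious authentication error later on.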
Create a new Python file and add the following code:
import cv2
import supervision as sv
from inference import get_roboflow_model
# path to the image to use for inference
image_file = "image.jpeg"
image = cv2.imread(image_file)
# load the product recognition model
model = get_roboflow_model(model_id="empty-spaces-in-a-supermarket-hanger-1upsp/16")
# run inference on our chosen image, image can be a url, a numpy array, a PIL image, etc.
# infer returns one response per input image, so take the first
results = model.infer(image)[0]
detections = sv.Detections.from_roboflow(results.dict(by_alias=True, exclude_none=True))
# create supervision annotators
bounding_box_annotator = sv.BoundingBoxAnnotator()
# annotate the image with our inference results
annotated_image = bounding_box_annotator.annotate(scene=image, detections=detections)
# optionally, draw class labels on top of the boxes
#label_annotator = sv.LabelAnnotator()
#annotated_image = label_annotator.annotate(scene=annotated_image, detections=detections)
# display the image
sv.plot_image(annotated_image)
In this code, we use Inference to load the product recognition model and run it on an image. We then plot the results on the image.
In the code above, replace “image.jpeg” with the name of the image you want to run through the product recognition API.
Then, run the code.
Here are the results for an example image:
The model successfully identifies both products and empty spaces on the shelves.
The blue boxes indicate the presence of a product. The yellow boxes indicate an empty shelf position.
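Once you have predictions, a natural next step is to count how many shelf positions are filled and how many are empty. The following is a minimal sketch that works on plain dictionaries with the `class` and `confidence` keys found in Roboflow prediction entries. The class names `"product"` and `"empty"` are placeholders (check the labels your model actually returns), the 0.5 threshold is arbitrary, and `summarize_shelf` is a hypothetical helper.

```python
from collections import Counter


def summarize_shelf(predictions, empty_class="empty", min_confidence=0.5):
    """Count detections per class and report the share of empty positions."""
    kept = [p for p in predictions if p["confidence"] >= min_confidence]
    counts = Counter(p["class"] for p in kept)
    total = sum(counts.values())
    empty_ratio = counts[empty_class] / total if total else 0.0
    return dict(counts), empty_ratio


# Hand-written example predictions; the class names are assumptions.
example = [
    {"class": "product", "confidence": 0.91},
    {"class": "product", "confidence": 0.84},
    {"class": "empty", "confidence": 0.77},
    {"class": "empty", "confidence": 0.30},  # dropped: below the threshold
]
counts, empty_ratio = summarize_shelf(example)
```

A ratio like this could feed a simple restocking alert, for instance by notifying staff when more than a chosen share of positions on a shelf is empty.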
Next Steps: Classify Specific Product SKUs
With a system to identify products in place, you can start to build more complex logic.
For example, you could build an automated system that compares the identified products with your planogram for a given part of a store. This will allow you to validate that products are shelved according to the agreements you have in place with vendors.
To build this automated system, you could use the API above to identify the location of products. You would need a sufficiently high-resolution image of a shelf, taken closer up than the image above, so that each product covers enough pixels to be recognized.
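One way to act on the resolution requirement is to skip detections that are too small to classify reliably. The sketch below assumes boxes are given as `(x_min, y_min, x_max, y_max)` pixel coordinates; the 64-pixel minimum is an arbitrary illustrative value, and `usable_boxes` is a hypothetical helper.

```python
def usable_boxes(boxes_xyxy, min_side_px=64):
    """Keep boxes whose width and height are both at least min_side_px."""
    kept = []
    for x_min, y_min, x_max, y_max in boxes_xyxy:
        wide_enough = (x_max - x_min) >= min_side_px
        tall_enough = (y_max - y_min) >= min_side_px
        if wide_enough and tall_enough:
            kept.append((x_min, y_min, x_max, y_max))
    return kept


# The second box is only 30 px wide, so it is filtered out.
boxes = [(0, 0, 100, 120), (10, 10, 40, 200)]
large_enough = usable_boxes(boxes)
```

If most boxes are filtered out, that is a sign the photo was taken from too far away for product-level classification.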
You could then run a zero-shot classification model to validate products according to a reference image you have of each product.
This would work as follows:
- Create a database of reference images that pairs with your planogram.
- Take a photo of a shelf.
- Match the shelf with the planogram for that shelf.
- Compare each product identified on each shelf with the reference image in the planogram. If the product on the shelf is sufficiently dissimilar to your reference image, your system can ask a person to check the shelf.
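The steps above can be sketched as a small comparison loop. This is an illustration under stated assumptions, not a definitive implementation: detected products are ordered left to right by their `x_min` coordinate, the planogram is a list of reference items in the same order, and `similarity` is any function you supply that scores two items between 0 and 1. The names `find_mismatches` and `exact_match` and the 0.8 threshold are hypothetical.

```python
def find_mismatches(shelf_items, planogram, similarity, threshold=0.8):
    """Return planogram positions whose shelf item looks unlike its reference.

    shelf_items: list of (x_min, item) pairs, one per detected product.
    planogram:   list of reference items in left-to-right order.
    similarity:  callable(item, reference) -> score between 0 and 1.
    """
    ordered = [item for _, item in sorted(shelf_items, key=lambda pair: pair[0])]
    flagged = []
    for position, (item, reference) in enumerate(zip(ordered, planogram)):
        if similarity(item, reference) < threshold:
            flagged.append(position)
    # A count mismatch means something is missing or extra, so flag the rest.
    shortest = min(len(ordered), len(planogram))
    longest = max(len(ordered), len(planogram))
    flagged.extend(range(shortest, longest))
    return flagged


# Toy stand-in for an image model: identical labels score 1.0, others 0.0.
def exact_match(item, reference):
    return 1.0 if item == reference else 0.0


shelf = [(120, "cola"), (10, "chips"), (240, "soap")]
needs_review = find_mismatches(shelf, ["chips", "cola", "tea"], exact_match)
```

Here the third position is flagged because the shelf holds "soap" where the planogram expects "tea". In a real system, the items would be cropped product images and the similarity function would come from an image model.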
CLIP is an effective model to use for zero-shot classification. CLIP can take two images and measure their semantic similarity. CLIP also picks up on text that appears in an image, which gives it more signal when judging how similar two items are.
To learn about using CLIP to measure the similarity between two images, refer to the Roboflow CLIP similarity guide.
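Similarity between two CLIP embeddings is commonly measured with cosine similarity. The sketch below shows that calculation in plain Python on toy vectors; it does not call CLIP itself, and the three-dimensional vectors are made-up stand-ins for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy 3-dimensional "embeddings"; real CLIP vectors have hundreds of dimensions.
reference = [0.2, 0.9, 0.1]
same_product = [0.25, 0.85, 0.05]
other_product = [0.9, 0.1, 0.3]

score_same = cosine_similarity(reference, same_product)
score_other = cosine_similarity(reference, other_product)
```

With these toy values, `score_same` is much higher than `score_other`. In practice you would pick a threshold by testing on real photos of your own products.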
The Roboflow product recognition API allows you to identify the location of products, or the absence of products, on retail shelves. You can use this API to ensure shelves are fully stocked.
You can combine the detection capabilities of the product recognition API with a zero-shot classification model such as CLIP to verify that products are in the correct place when compared with the planogram for a shelf.
If you need assistance implementing a product recognition system in your organization, contact the Roboflow sales team.