Computer Vision

What is Optical Character Verification? A Comprehensive Guide

Optical Character Verification, or OCV, is a technology that verifies the accuracy and quality of printed text on manufactured items. Manufacturers and sellers check the accuracy of information on packages,

Camera Calibration in Sports with Keypoints

Camera calibration is important for accurate vision AI systems that analyze sports. It allows movement in a video frame to be mapped to real movement on the field, and

How to Extract Text From Images

Manually working with data in JPG, PNG, or PDF formats can be a hassle, as it takes a lot of time to analyze and these files

Ball Tracking in Sports with Computer Vision

Ball tracking is crucial for AI systems to analyze sports effectively, but it's challenging due to factors like the ball's small size, high velocity, complex backgrounds, similar-looking objects, and varying lighting. This tutorial will teach you how to overcome these challenges.

How to Import Hugging Face Datasets to Roboflow

Learn how to import a Hugging Face dataset into Roboflow for labeling, training, and deployment.

How to Use SAM 2 for Video Segmentation

Segment Anything Model 2 (SAM 2) is a unified video and image segmentation model. Video segmentation presents unique challenges compared to image segmentation. Object motion, deformation, occlusion, lighting changes, and

What is Segment Anything 2 (SAM 2)?

Learn about Meta AI's new Segment Anything 2 model and how you can use it for image and grounded image segmentation.

Evaluating 2024 Euro Cup and COPA America Cup Jersey Color Accessibility

Read how we built a system to evaluate the color contrast of football jerseys used in the 2024 Euro Cup.

Tomato Leaf Disease Detection and Diagnosis using Computer Vision

Learn how to build a tomato leaf disease detection and diagnosis system with computer vision.

Red Zone Monitoring Using Computer Vision

Ensuring the safety of workers is crucial in industrial settings. One effective way to enhance safety is to create a computer vision system to identify “red zones,” where heavy machinery

People Counting Using Computer Vision

Counting and keeping track of a large number of people entering and exiting an event can be challenging, especially when security is a priority. Traditional methods of monitoring people

Top 7 Open-Source Object Tracking Tools [2024]

Object tracking is a computer vision task that can identify various objects and track them through the frames of a video. Knowing where an object

What is the Open Images Dataset? A Deep Dive.

The Open Images Dataset was released by Google in 2016, and it is one of the largest and most diverse collections of labeled images. Since then, Google has regularly updated

How to Train RT-DETR on a Custom Dataset with Transformers

RT-DETR, short for "Real-Time DEtection TRansformer", is a computer vision model developed by Peking University and Baidu. In their paper, "DETRs Beat YOLOs on Real-time Object Detection"

What is Thresholding in Image Processing? A Guide.

Learn what image thresholding is and the thresholding strategies you can use in computer vision applications.

The Guide to AI OCR [2024]

Learn what AI OCR is and how it is used in computer vision.

How to Use Florence-2 for Optical Character Recognition

Learn how to use the Florence-2 model for Optical Character Recognition tasks.

What is Dense Image Captioning?

Learn what dense image captioning is and how to use the MIT-licensed Florence-2 model to generate dense image captions.

What is FPS? A Computer Vision Guide.

Learn what FPS is and what FPS considerations you should keep in mind when working on computer vision projects.

What is 4M? Apple's Massively Multimodal Masked Modeling

4M: Massively Multimodal Masked Modeling, released by Apple in 2024, is a leap forward in the field of multimodal machine learning. This model, building upon the growing capabilities of large

How to use Florence-2 for Instance Segmentation

Florence-2 is a lightweight model licensed under the MIT license. Although it has significantly fewer parameters than competing models like LLaVA 1.5, Florence-2 remains state-of-the-art due to the high-quality

How to Use GPT-4 To Extract Handwritten Text from Images

This guide walks you through the process of building, training, and deploying a custom computer vision workflow using OpenAI and Roboflow. The process is broken down into three steps: Building

How to Monitor Productivity with Eye Tracking

Focusing is hard. In recent years, the number of distractions available to us has been increasing, and we often lose track of how much we are distracted. To help myself

What is F1 Score? A Computer Vision Guide.

Learn what F1 score is, what it is used for, and how to calculate it.

Florence-2: Open Source Vision Foundation Model by Microsoft

Florence-2 is a lightweight vision-language model open-sourced by Microsoft under the MIT license.