Blog

Computer Vision

30 Jul 2024 • 7 min read

What is Segment Anything 2 (SAM 2)?

Learn about Meta AI's new Segment Anything 2 model and how you can use it for image and grounded image segmentation.

19 Jul 2024 • 4 min read

Evaluating 2024 Euro Cup and COPA America Cup Jersey Color Accessibility

Read how we built a system to evaluate the color contrast of football jerseys used in the 2024 Euro Cup.

19 Jul 2024 • 11 min read

Tomato Leaf Disease Detection and Diagnosis using Computer Vision

Learn how to build a tomato leaf disease detection and diagnosis system with computer vision.

19 Jul 2024 • 8 min read

People Counting Using Computer Vision

Counting and keeping track of a large number of people entering and exiting an event can be challenging, especially when security is a priority. Traditional methods of monitoring people make it difficult for security officials to keep track of everyone in real time. However, advancements in AI technologies like computer

17 Jul 2024 • 8 min read

Top 7 Open Source Object Tracking Tools [2026]

Object tracking is a computer vision task that can identify various objects and track them through the frames of a video. Knowing where an object is in a video has many real-life applications, especially in manufacturing and logistics. For example, object tracking can be used

16 Jul 2024 • 8 min read

What is the Open Images Dataset? A Deep Dive.

The Open Images Dataset was released by Google in 2016, and it is one of the largest and most diverse collections of labeled images. Since then, Google has regularly updated and improved it. The latest version of the dataset, Open Images V7, was introduced in 2022. Globally, researchers and developers

11 Jul 2024 • 11 min read

How to Train RT-DETR on a Custom Dataset with Transformers

💡 Looking for RF-DETR, the state-of-the-art real-time object detection model developed by Roboflow? Check out the RF-DETR training guide. RF-DETR runs in real time, is the first model to achieve 60+ on COCO, and is state-of-the-art on the RF100-VL benchmark. RT-DETR, short for "Real-Time DEtection TRansformer", is a computer

10 Jul 2024 • 16 min read

What is Thresholding in Image Processing? A Guide.

Learn what image thresholding is and the thresholding strategies you can use in computer vision applications.

10 Jul 2024 • 6 min read

The Guide to AI OCR

Learn what AI OCR is and how it is used in computer vision.

10 Jul 2024 • 5 min read

How to Use Florence-2 for Optical Character Recognition

Learn how to use the Florence-2 model for Optical Character Recognition tasks.

10 Jul 2024 • 4 min read

What Is Dense Image Captioning?

Learn what dense image captioning is and how to use the MIT-licensed Florence-2 model to generate dense image captions.

10 Jul 2024 • 5 min read

What Is FPS? A Computer Vision Guide

Learn what FPS is and what FPS considerations you should keep in mind when working on computer vision projects.

9 Jul 2024 • 7 min read

What is 4M? Apple's Massively Multimodal Masked Modeling

4M: Massively Multimodal Masked Modeling, released by Apple in 2024, is a leap forward in the field of multimodal machine learning. This model, building upon the growing capabilities of large language models, addresses critical challenges in vision models which have traditionally been highly specialized and limited to a single modality

9 Jul 2024 • 5 min read

How to use Florence-2 for Instance Segmentation

Florence-2 is a lightweight model released under the MIT license. Although it has significantly fewer parameters than competing models like LLaVA 1.5, Florence-2 remains state-of-the-art due to the high-quality data it was trained on. Florence-2 is capable of a variety of tasks, including visual question answering, captioning, image detection,

5 Jul 2024 • 5 min read

How to Use GPT-4 To Extract Handwritten Text from Images

This guide walks you through the process of building, training, and deploying a custom computer vision workflow using OpenAI and Roboflow. The process is broken down into three steps: building the model, connecting the model to a Workflow, and writing code to get the outputs. Through

28 Jun 2024 • 19 min read

How to Monitor Productivity with Eye Tracking

Focusing is hard. In recent years, the number of distractions available to us has been increasing, and we often lose track of how distracted we are. To help myself stay engaged, I created a project that accurately tracks how many times I'm distracted in a certain period

27 Jun 2024 • 8 min read

What is F1 Score? A Computer Vision Guide.

Learn what F1 score is, what it is used for, and how to calculate it.

20 Jun 2024 • 5 min read

Florence-2: Vision-language Model

Florence-2 is a lightweight vision-language model open-sourced by Microsoft under the MIT license.

14 Jun 2024 • 18 min read

Edge Detection in Image Processing: An Introduction

Learn what edge detection is and how to apply common edge detection algorithms to an image.

9 Feb 2024 • 7 min read

Use Cases for Computer Vision in Healthcare

In this guide, we explore use cases for computer vision in healthcare, from pill counting to building automated inventory management systems.

25 Sep 2023 • 6 min read

What is DETR (Detection Transformers)?

In this guide, we discuss what DETR is, how it works, its strengths and weaknesses, and how it performs.

19 Aug 2021 • 5 min read

ImageNet contains naturally occurring NeuralHash collisions

NeuralHash is the perceptual hashing model that backs Apple's new CSAM (child sexual abuse material) reporting mechanism. It's an algorithm that takes an image as input and returns a 96-bit unique identifier (a hash) that should match for two images that are "the
