Posts Written by James Gallagher

James Gallagher

James is a Technical Marketer at Roboflow, working toward democratizing access to computer vision.

Document Understanding with Multimodal Models

Learn how to use the PaliGemma multimodal model to ask questions about the contents of a document.

Visual Question Answering with Multimodal Models

Learn how to use the PaliGemma multimodal model to ask questions about images.

Understand Website Screenshots with a Multimodal Vision Model

Learn how to use the Florence-2 multimodal model to generate rich descriptions of website screenshots.

How to Caption Images with a Multimodal Vision Model

Learn how to caption images using a multimodal vision model.

How to Use Florence-2 for Optical Character Recognition

Learn how to use the Florence-2 model for Optical Character Recognition tasks.

What is Dense Image Captioning?

Learn what dense image captioning is and how to use the MIT-licensed Florence-2 model to generate dense image captions.

Launch: Deploy YOLOv10 Models with Roboflow

Learn how to deploy a YOLOv10 model on Roboflow.

How to Train YOLOv10 Model on a Custom Dataset

Learn how to train a YOLOv10 model using a custom dataset.

Launch: Computer Vision Model Monitoring with Roboflow

Learn how to use Roboflow's Model Monitoring solutions to monitor production vision model deployments at scale.

Launch: Deploy YOLOv9 Models with Roboflow

Learn how to deploy YOLOv9 models in the cloud and on your own hardware with Roboflow.

Launch: Run Vision Models on Multiple Streams

Learn how to deploy computer vision models on multiple streams concurrently with Roboflow Inference.

How to Fine-tune PaliGemma for Object Detection Tasks

Learn how to fine-tune the PaliGemma multimodal model to detect custom objects.

How to Detect Objects with Ultralytics YOLOv5

YOLOv5, released by Ultralytics on June 25th, 2020, is a computer vision model that supports object detection. For example, you can train an object detection model to detect the location

Ultimate Guide to Using CLIP with Intel Gaudi2

Learn how to use CLIP on the Intel Gaudi2 chip. This guide discusses training and deploying a custom CLIP model on Gaudi2.

Launch: YOLO-World Support in Roboflow

Learn how you can use YOLO-World with Roboflow.

Coffee Bean Inspection with Computer Vision

There are several quality checks that need to be run on coffee beans before they are packaged and ready for delivery. A professional taster will “cup” coffee to ensure it

First Impressions with the Claude 3 Opus Vision API

The Roboflow team ran several computer vision tests using the Claude 3 Opus Vision API. Read our results.

How to Use ResNet-50

Learn how to use a ResNet-50 checkpoint to classify images.

Multimodal Video Analysis with CLIP using Intel Gaudi2 HPUs

Learn how to use CLIP and the Intel Gaudi2 chip to run multimodal analyses and classification on videos.

Build an Image Search Engine with CLIP using Intel Gaudi2 HPUs

Learn how to use the Intel Gaudi2 chip to build an image search engine with CLIP embeddings.

How to Become a Computer Vision Engineer

Learn what a computer vision engineer is, the responsibilities computer vision engineers have, the skills you need to become a vision engineer, and more.

How to Train YOLOv9 on a Custom Dataset

Learn how to train a YOLOv9 model on a custom dataset.

Launch: Train and Deploy YOLO-NAS Models on Roboflow

Learn how to train a YOLO-NAS model on Roboflow and host the model on your own hardware.

Build a Juice Box Quality Inspection System

Learn how to build a system that ensures the integrity of straws on juice boxes.

Build Enterprise Datasets with CLIP for Multimodal Model Training Using Intel Gaudi2 HPUs

In this guide, learn how to use CLIP on Intel Gaudi2 HPUs to deduplicate datasets before training large multimodal vision models.