Top 6 Environment Datasets for Computer Vision Projects
Published Nov 29, 2022 • 6 min read

Computer vision is a critical enabling technology for innovation in the environmental sector, with uses ranging from detecting wildfires to sorting plastic by the recycling symbol on its packaging.

Indeed, where a decision can be made based on seeing something in the environment, computer vision may be able to help solve the problem at hand.

In this article, we are going to talk about six open-source computer vision datasets on Roboflow Universe that pertain to the environment. The datasets span several problem areas, so you can see how much data is available and the different categories of solutions that can be built with computer vision.

Without further ado, let's get started!

Plastic Recycle Sign Computer Vision Project

Project Type: Object Detection
Subject: Plastic recycling signs
Classes: HDPE, LDPE, OTHER, PET, PP, PS, PVC
Download Formats: YOLOv5, YOLOv7, MT-YOLOv6, COCO JSON, YOLO Darknet, Pascal VOC XML, TFRecord, CreateML JSON, etc.

The plastic recycle signs dataset contains images of universal recycling symbols, split into 1.1K training, 108 validation, and 55 testing images. The images are resized to 416 * 416 pixels, with preprocessing and augmentations already applied.
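If you download the dataset in one of the YOLO text formats, each label line stores a class index followed by a box as normalized center coordinates and size. The sketch below shows how such a line could be turned into pixel corners for a 416 * 416 image. The function name and the sample line are illustrative assumptions, not part of the dataset.

```python
def yolo_to_pixels(line, img_w=416, img_h=416):
    """Convert one YOLO label line to (class_id, x_min, y_min, x_max, y_max) in pixels.

    A YOLO label line looks like: "<class_id> <x_center> <y_center> <width> <height>",
    with the four box values normalized to the range 0..1.
    """
    parts = line.split()
    class_id = int(parts[0])
    x_center, y_center, width, height = (float(v) for v in parts[1:5])

    x_min = (x_center - width / 2) * img_w
    y_min = (y_center - height / 2) * img_h
    x_max = (x_center + width / 2) * img_w
    y_max = (y_center + height / 2) * img_h
    return class_id, x_min, y_min, x_max, y_max


# Hypothetical label line, made up for illustration.
print(yolo_to_pixels("3 0.5 0.5 0.25 0.5"))
```

Which class name an index such as 3 refers to depends on the class order in the exported dataset's configuration file, so check that file rather than assuming the order shown in this article.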

The different symbol classes in the dataset are used to collect, sort, and clean plastic materials for reuse. With the help of this dataset, developers can train a computer vision model that identifies the type of plastic so it can be disposed of correctly. This model could be used in a waste disposal plant to categorize the plastic that comes in, or at large stores that produce significant amounts of waste for recycling.

To test a model pre-trained on this dataset, call it through Roboflow's API.
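Once you have a response from the API, you will usually want to filter and reshape it. The sketch below assumes the response is JSON with a "predictions" list whose entries carry "x" and "y" (box center), "width", "height", "class", and "confidence". Treat that shape as an assumption and check the actual response you receive. The function name and the sample response are made up for illustration.

```python
def to_corner_boxes(response, min_confidence=0.5):
    """Keep confident predictions and convert center-format boxes to corner format."""
    boxes = []
    for pred in response.get("predictions", []):
        if pred["confidence"] < min_confidence:
            continue  # drop low-confidence detections
        half_w = pred["width"] / 2
        half_h = pred["height"] / 2
        boxes.append({
            "label": pred["class"],
            "confidence": pred["confidence"],
            "x_min": pred["x"] - half_w,
            "y_min": pred["y"] - half_h,
            "x_max": pred["x"] + half_w,
            "y_max": pred["y"] + half_h,
        })
    return boxes


# Made-up response for illustration only; this is not real model output.
sample_response = {
    "predictions": [
        {"x": 208, "y": 208, "width": 100, "height": 50, "class": "PET", "confidence": 0.91},
        {"x": 50, "y": 60, "width": 20, "height": 20, "class": "PVC", "confidence": 0.30},
    ]
}
print(to_corner_boxes(sample_response))
```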

BIRDSAI Computer Vision Project

Project Type: Object Detection
Subject: IR-Objects
Classes: -1: unknown, 0: human, 1: elephant, 2: lion, 3: giraffe, 4: dog, 5: crocodile, 6: hippo, 7: zebra, 8: rhino
Download Formats: YOLOv5, YOLOv7, MT-YOLOv6, COCO JSON, YOLO Darknet, Pascal VOC XML, TFRecord, CreateML JSON, etc.
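Because the BIRDSAI classes are numeric IDs, a small lookup table makes model output easier to read. The mapping below is copied from the class list above; the helper name is a hypothetical choice for this sketch.

```python
from collections import Counter

# Class IDs as listed for the BIRDSAI project above.
BIRDSAI_CLASSES = {
    -1: "unknown",
    0: "human",
    1: "elephant",
    2: "lion",
    3: "giraffe",
    4: "dog",
    5: "crocodile",
    6: "hippo",
    7: "zebra",
    8: "rhino",
}


def count_by_class(class_ids):
    """Count detections per class name; IDs not in the table count as 'unknown'."""
    names = (BIRDSAI_CLASSES.get(class_id, "unknown") for class_id in class_ids)
    return dict(Counter(names))


# Hypothetical detections from a single frame.
print(count_by_class([0, 1, 1, 8, -1]))
```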

BIRDSAI (Benchmarking IR Dataset for Surveillance with Aerial Intelligence) is a dataset of night-time images of humans and animals captured in long-wave thermal infrared, drawn from both real and synthetic videos. It contains 15K training, 4.2K validation, and 2.1K testing images.

This project helps benchmark algorithms that detect and track humans and animals in low-light scenes. It could be used to build wildlife poaching prevention systems, nighttime intruder detection systems, and wildlife monitoring tools, and to study animal behavior patterns during the night.

To test a model pre-trained on this dataset, call it through Roboflow's API.

Cleaned Dataset Computer Vision Project

Project Type: Object Detection
Subject: Detecting whether a piece of trash is recyclable
Classes: cardboard, glass, metal, paper, plastic, trash
Download Formats: YOLOv5, YOLOv7, MT-YOLOv6, COCO JSON, YOLO Darknet, Pascal VOC XML, TFRecord, CreateML JSON, etc.

The cleaned dataset is a collection of trash images intended to help prevent littering and support the cleanup of trash such as plastic, glass, and metal from land, water, and beaches. It contains 2.1K training, 265 validation, and 264 test images, rescaled to 320 * 320 pixels to simplify preprocessing and training.

Using this project, you could build a tool that tells you in what bin a particular piece of rubbish should go. At scale, this could be used to automatically sort rubbish in waste management facilities. This tool could also be used by businesses that produce a large amount of waste – for example, hotels or supermarkets – to improve their recycling procedures.
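A bin-recommendation tool of this kind could be as simple as a lookup from predicted class to bin, with a fallback when the model is unsure. In the sketch below, the class names come from the dataset, while the bin names, the confidence threshold, and the function name are assumptions chosen for illustration; real sorting rules vary by location.

```python
# Hypothetical mapping from dataset class to bin; adjust to local recycling rules.
BIN_FOR_CLASS = {
    "cardboard": "paper recycling",
    "paper": "paper recycling",
    "glass": "glass recycling",
    "metal": "metal recycling",
    "plastic": "plastic recycling",
    "trash": "general waste",
}


def choose_bin(label, confidence, min_confidence=0.6):
    """Return the bin for a predicted class, or 'manual review' when unsure."""
    if confidence < min_confidence or label not in BIN_FOR_CLASS:
        return "manual review"
    return BIN_FOR_CLASS[label]


print(choose_bin("glass", 0.92))    # confident prediction
print(choose_bin("plastic", 0.41))  # below the threshold
```

Routing low-confidence predictions to manual review is one way to keep a wrong guess from contaminating a recycling stream.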

To test a model pre-trained on this dataset, call it through Roboflow's API.

Garbage Detection Computer Vision Project

Project Type: Object Detection
Subject: Detecting trash
Classes: garbage, null
Download Formats: YOLOv5, YOLOv7, MT-YOLOv6, COCO JSON, YOLO Darknet, Pascal VOC XML, TFRecord, CreateML JSON, etc.

The garbage detection dataset contains 3.4K train, 348 valid, and 227 test images of large groupings of garbage where each image is stretched to 640 * 640 pixels.
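Stretching an image to 640 * 640 changes its aspect ratio, so bounding boxes have to be scaled by separate horizontal and vertical factors. The sketch below illustrates that arithmetic; the function name and the example image size are assumptions, not details of this dataset.

```python
def scale_box_for_stretch(box, orig_w, orig_h, new_w=640, new_h=640):
    """Scale an (x_min, y_min, x_max, y_max) pixel box to match a stretched image."""
    scale_x = new_w / orig_w  # horizontal stretch factor
    scale_y = new_h / orig_h  # vertical stretch factor
    x_min, y_min, x_max, y_max = box
    return (x_min * scale_x, y_min * scale_y, x_max * scale_x, y_max * scale_y)


# Hypothetical 1280 x 320 source image: halved horizontally, doubled vertically.
print(scale_box_for_stretch((100, 50, 300, 150), orig_w=1280, orig_h=320))
```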

A model trained on this dataset helps identify garbage pileups at various distances, depths, and environments. You could use it to build an API that continuously monitors garbage buildup through embedded cameras.

Local governments could use this model to understand where more trash receptacles are needed. For instance, if rubbish accumulates in particular areas, that may be a sign another bin needs to be placed on the street. Furthermore, this model could be used by waste collection companies to optimize their pickup schedules. If litter builds up on a street that already has an adequate number of bins, it may be a sign the existing bins are full.
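One way to turn detections into the kind of insight described above is to average the number of garbage detections per location over time and flag the locations that stay high. This is a minimal sketch under assumed inputs: the location names, the counts, and the threshold are all made up.

```python
from collections import defaultdict


def find_hotspots(observations, min_average=3.0):
    """Return locations whose average garbage detections per frame meet the threshold.

    observations: iterable of (location, detection_count) pairs, one per camera frame.
    """
    totals = defaultdict(int)
    frames = defaultdict(int)
    for location, count in observations:
        totals[location] += count
        frames[location] += 1
    return sorted(
        location for location in totals
        if totals[location] / frames[location] >= min_average
    )


# Hypothetical per-frame detection counts from street cameras.
observations = [
    ("Main St", 5), ("Main St", 4),
    ("Oak Ave", 0), ("Oak Ave", 1),
    ("Pine Rd", 3),
]
print(find_hotspots(observations))
```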

To test a model pre-trained on this dataset, call it through Roboflow's API.

Underwater Pipes Original Pictures Computer Vision Project

Project Type: Object Detection
Subject: pipe
Classes: pipe, null
Download Formats: YOLOv5, YOLOv7, MT-YOLOv6, COCO JSON, YOLO Darknet, Pascal VOC XML, TFRecord, CreateML JSON, etc.

The underwater pipes dataset contains 5.6K training, 1.6K validation, and 779 test images of underwater pipes, with auto-orientation and resizing to 416 * 416 pixels already applied.

Using this dataset, you could build a real-time computer vision system for monitoring and controlling underwater pipeline infrastructures. This is an essential tool for companies that manage underwater pipelines.

If a leak is detected in a pipeline, for example, computer vision could be used to identify exactly where the leak is, which could reduce the time it takes to rectify the issue. Further, computer vision could be used to audit older stretches of pipe and notify officials when they show signs that they may be at risk of leaking soon.

To test a model pre-trained on this dataset, call it through Roboflow's API.

Wildfire Smoke Computer Vision Project

Project Type: Object Detection
Subject: Smoke
Classes: smoke, null
Download Formats: YOLOv5, YOLOv7, MT-YOLOv6, COCO JSON, YOLO Darknet, Pascal VOC XML, TFRecord, CreateML JSON, etc.

The wildfire smoke dataset contains 516 training, 147 validation, and 74 test images showing wildfire smoke.
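Those counts work out to roughly a 70/20/10 split, which you can confirm with a few lines of arithmetic. The function name below is an illustrative choice.

```python
def split_fractions(train, valid, test):
    """Return each split's share of the total number of images."""
    total = train + valid + test
    return {
        "train": train / total,
        "valid": valid / total,
        "test": test / total,
    }


# Image counts for the wildfire smoke dataset as given above.
for name, share in split_fractions(516, 147, 74).items():
    print(f"{name}: {share:.1%}")
```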

The dataset can be used to build a computer vision model capable of detecting the first signs of smoke from a forest fire and sending help before it gets out of control. This supports 24/7 real-time monitoring of forests and wild lands, eliminating the need for people to constantly monitor camera feeds.

Wildfire models like this could be deployed by forestry bureaus, fire departments, and local governments as part of an early-warning system for wildfires. The earlier a fire is caught, the easier it is to contain.
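An early-warning system usually should not raise an alarm on a single noisy frame. One simple approach, sketched below, is to alert only after smoke has been detected in several consecutive frames. The threshold, the number of required frames, the function name, and the sample confidences are all assumptions for illustration.

```python
def first_alert_frame(confidences, threshold=0.5, required_frames=3):
    """Return the index of the frame that triggers an alert, or None if none does.

    confidences: the highest smoke confidence per frame, in time order.
    An alert fires once `required_frames` consecutive frames reach the threshold.
    """
    streak = 0
    for index, confidence in enumerate(confidences):
        if confidence >= threshold:
            streak += 1
            if streak >= required_frames:
                return index
        else:
            streak = 0  # a clear frame resets the count
    return None


# Hypothetical per-frame smoke confidences from a camera feed.
print(first_alert_frame([0.1, 0.7, 0.8, 0.2, 0.6, 0.9, 0.95]))
```

Requiring more consecutive frames reduces false alarms but delays the alert, so the right setting depends on the camera frame rate and how costly a false alarm is.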

To test a model pre-trained on this dataset, call it through Roboflow's API.

Using Open-Source Environment Datasets for Computer Vision

There are many environmental challenges that can be tackled, at least in part, by using well-trained computer vision models. Computer vision models can do everything from helping businesses recycle more efficiently to monitoring wide areas of forest for fires.

The datasets we have provided above are a great place to start if you want to experiment with how computer vision could be used to solve environmental problems. You can even make your own models to solve specific problems.

Check out our Roboflow Learn hub to expand your knowledge of computer vision foundations and build the skills you need to frame problems and train your own computer vision models that solve problems pertaining to the environment.

Cite this Post

Use the following entry to cite this post in your research:

Mrinal Walia. (Nov 29, 2022). Top 6 Environment Datasets for Computer Vision Projects. Roboflow Blog: https://blog.roboflow.com/top-environment-datasets-for-computer-vision-projects/

Written by

Mrinal Walia
Passionate Data Scientist with 5+ years of writing experience in data science and its related fields. Published 200+ articles, blogs, guides, use-cases, and tutorials at multiple renowned publications
