This dataset contains annotated images for detecting containers and hands from a first-person (egocentric) view during drinking activities. Annotations are provided in both YOLOv8 and COCO formats.
Please refer to our paper for more details.
Purpose: Training and testing the object detection model.
Content: Videos from Session 1 of Subjects 1-20.
Images: Extracted from the videos of Subjects 1-20 Session 1.
Additional Images:
~500 hand/container images from Roboflow Open Source data.
~1500 null (background) images from the VOC dataset and the MIT Indoor Scene Recognition dataset:
1000 indoor scenes from the MIT Indoor Scene Recognition dataset
400 images of unrelated objects from the VOC dataset
Data Augmentation:
Horizontal flipping
±15% brightness change
±10° rotation
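The three augmentations listed above also change the bounding-box labels, not just the pixels. The following is a minimal sketch of how they might be applied to one YOLO-style box. It is not the pipeline used to build this dataset. The function names, the 50% flip probability, the uniform sampling of brightness and angle, and the rotation sign convention are all assumptions made for illustration.

```python
import math
import random

def hflip_yolo_box(box):
    """Mirror a YOLO box (x_center, y_center, width, height), normalised to [0, 1]."""
    xc, yc, w, h = box
    return (1.0 - xc, yc, w, h)

def scale_brightness(pixel, factor):
    """Scale one 8-bit channel value by `factor` and clamp to [0, 255]."""
    return max(0, min(255, round(pixel * factor)))

def rotate_yolo_box(box, angle_deg):
    """Rotate a box about the image centre and return the enclosing axis-aligned box.

    Assumes a square image (as with the 416x416 images here), so that
    normalised x and y share the same scale.
    """
    xc, yc, w, h = box
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    corners = [
        (xc - w / 2, yc - h / 2), (xc + w / 2, yc - h / 2),
        (xc + w / 2, yc + h / 2), (xc - w / 2, yc + h / 2),
    ]
    xs, ys = [], []
    for x, y in corners:
        dx, dy = x - 0.5, y - 0.5
        xs.append(0.5 + dx * cos_t - dy * sin_t)
        ys.append(0.5 + dx * sin_t + dy * cos_t)
    # Clip to the image so the label stays valid after rotation.
    x0, x1 = max(0.0, min(xs)), min(1.0, max(xs))
    y0, y1 = max(0.0, min(ys)), min(1.0, max(ys))
    return ((x0 + x1) / 2, (y0 + y1) / 2, x1 - x0, y1 - y0)

def sample_augmentation(rng):
    """Draw one random parameter set within the ranges listed above."""
    return {
        "flip": rng.random() < 0.5,             # assumed 50% flip probability
        "brightness": rng.uniform(0.85, 1.15),  # +/-15% brightness
        "angle_deg": rng.uniform(-10.0, 10.0),  # +/-10 degree rotation
    }
```

Note that the enclosing box after rotation is slightly larger than the original one, which is a common side effect of rotating axis-aligned labels.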
Formats Provided:
COCO format
PyTorch YOLOv8 format
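The two formats store the same boxes differently. COCO uses `[x_min, y_min, width, height]` in absolute pixels, while YOLO label files use `x_center y_center width height` normalised by the image size. The sketch below converts between the two. The function names are illustrative, and the default image size of 416 is taken from the image size stated in this description.

```python
IMG_SIZE = 416  # image side length in pixels, per this dataset description

def coco_to_yolo(bbox, img_w=IMG_SIZE, img_h=IMG_SIZE):
    """COCO [x_min, y_min, w, h] in pixels -> YOLO (xc, yc, w, h) normalised."""
    x_min, y_min, w, h = bbox
    return ((x_min + w / 2) / img_w, (y_min + h / 2) / img_h, w / img_w, h / img_h)

def yolo_to_coco(box, img_w=IMG_SIZE, img_h=IMG_SIZE):
    """YOLO (xc, yc, w, h) normalised -> COCO [x_min, y_min, w, h] in pixels."""
    xc, yc, w, h = box
    return ((xc - w / 2) * img_w, (yc - h / 2) * img_h, w * img_w, h * img_h)
```

For example, a 208x104 pixel box with its top-left corner at (104, 52) maps to the YOLO box (0.5, 0.25, 0.5, 0.25).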
Image Size: 416x416 pixels
Total Images: 16,834
Training: 13,862
Validation: 1,975
Testing: 997
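As a quick consistency check, the split sizes above sum to the stated total and correspond to roughly an 82/12/6 split. The variable names in this snippet are illustrative only.

```python
splits = {"train": 13862, "val": 1975, "test": 997}

total = sum(splits.values())  # should match the stated total of 16,834
shares = {name: round(100 * n / total, 1) for name, n in splits.items()}

print(total)   # 16834
print(shares)  # {'train': 82.3, 'val': 11.7, 'test': 5.9}
```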
Instance Numbers:
Containers: Over 10,000
Hands: Over 8,000
Temporal coverage
2 months
Geospatial coverage
BioSignals and Sensors laboratory, Strand, King’s College London