CHiME-5 Position metadata

This repository hosts the position metadata derived from the CHiME-5 video files. Both the raw data extracted from the videos and the processed data are available for download.

Related Work

For details on how this data was generated, see the following papers.

Simulating Realistically-Spatialised Simultaneous Speech Using Video-Driven Speaker Detection and the CHiME-5 Dataset

@inproceedings{Deadman2020,
    author={Jack Deadman and Jon Barker},
    title={{Simulating Realistically-Spatialised Simultaneous Speech Using Video-Driven Speaker Detection and the CHiME-5 Dataset}},
    year=2020,
    booktitle={Proc. Interspeech 2020},
    pages={349--353},
    doi={10.21437/Interspeech.2020-2807},
    url={http://dx.doi.org/10.21437/Interspeech.2020-2807}
}
PDF Download

Improved simulation of realistically-spatialised simultaneous speech using multi-camera analysis in the CHiME-5 dataset

@inproceedings{Deadman2022,
    author={Jack Deadman and Jon Barker},
    title={{Improved simulation of realistically-spatialised simultaneous speech using multi-camera analysis in the CHiME-5 dataset}},
    year=2022,
    booktitle={Proc. ICASSP 2022}
}
PDF Download

Automatic detections

Detections produced automatically with Dlib and OpenPose.

Hand Labels

A subset of the CHiME-5 corpus was annotated by hand using the annotation tool. See the ICASSP 2022 paper for more detail. (Google Drive)