The 20BN-jester Dataset


Introduction

The 20BN-JESTER dataset is a large collection of densely labeled video clips that show humans performing pre-defined hand gestures in front of a laptop camera or webcam. The dataset was created by a large number of crowd workers. It allows for training robust machine learning models to recognize human hand gestures. It is available free of charge for academic research. Commercial licenses are available upon request.

A paper with supplementary material will be published soon on arXiv.

Example clips: Sliding Two Fingers Down, Swiping Left, Thumb Up

Data format

The video data is provided as one large TGZ archive, split into parts of at most 1 GB each. The total download size is 20.8 GB. The archive contains directories numbered from 1 to 148092. Each directory corresponds to one video and contains JPG images with a height of 100 px and variable width. The JPG images were extracted from the original videos at 12 frames per second. The filenames of the JPGs start at 00001.jpg. The number of JPGs per directory varies with the length of the original video.
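The layout described above can be read with a few lines of Python. The following is a minimal sketch, not an official loader: the function names are made up for illustration, and it assumes the archive has already been extracted so that each video directory holds only its zero-padded JPG frames.

```python
from pathlib import Path

FRAME_RATE = 12  # frames per second, as stated in the data format description


def list_frames(video_dir):
    """Return the JPG frames of one video directory in playback order.

    The filenames are zero-padded (00001.jpg, 00002.jpg, ...), so a plain
    lexical sort matches the frame order.
    """
    return sorted(Path(video_dir).glob("*.jpg"))


def clip_duration_seconds(video_dir, frame_rate=FRAME_RATE):
    """Estimate the clip length from the number of extracted frames."""
    return len(list_frames(video_dir)) / frame_rate
```

For example, a directory holding 36 frames would correspond to a clip of about 3 seconds. The path passed in (such as the directory named 1 inside the extracted archive) depends on where the archive was unpacked.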

Terms of use

This dataset is made available under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license (CC BY-NC-ND 4.0). It can be used for academic research free of charge. If you seek to use the data for commercial purposes, please contact us.

Download Dataset

Please register or log in to download the dataset.


20BN-JESTER-DATASET

Total number of videos           148,092
Training Set                     118,562
Validation Set                    14,787
Test Set (w/o labels)             14,743
Labels                                27

Label                             Videos
Swiping Left                       5,160
Swiping Right                      5,066
Swiping Down                       5,303
Swiping Up                         5,240
Pushing Hand Away                  5,434
Pulling Hand In                    5,379
Sliding Two Fingers Left           5,345
Sliding Two Fingers Right          5,244
Sliding Two Fingers Down           5,410
Sliding Two Fingers Up             5,262
Pulling Two Fingers In             5,315
Pushing Two Fingers Away           5,358
Rolling Hand Forward               5,165
Rolling Hand Backward              5,031
Turning Hand Counterclockwise      4,181
Turning Hand Clockwise             3,980
Zooming In With Full Hand          5,307
Zooming Out With Full Hand         5,330
Zooming In With Two Fingers        5,355
Zooming Out With Two Fingers       5,379
Thumb Up                           5,457
Thumb Down                         5,460
Shaking Hand                       5,314
Stop Sign                          5,413
Drumming Fingers                   5,444
No gesture                         5,344
Doing other things                12,416
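
The per-class counts above add up to the stated total of 148,092 videos, and they give two simple reference points for judging a model: uniform random guessing over 27 labels (roughly 3.7 %) and the share of the largest class (about 8.4 % of all counted videos). The sketch below only redoes this arithmetic; the variable names are illustrative, and the class share of the unlabeled test set is not published, so the majority share is an estimate over the listed counts.

```python
# Per-class video counts, copied from the statistics listed above.
CLASS_COUNTS = {
    "Swiping Left": 5160,
    "Swiping Right": 5066,
    "Swiping Down": 5303,
    "Swiping Up": 5240,
    "Pushing Hand Away": 5434,
    "Pulling Hand In": 5379,
    "Sliding Two Fingers Left": 5345,
    "Sliding Two Fingers Right": 5244,
    "Sliding Two Fingers Down": 5410,
    "Sliding Two Fingers Up": 5262,
    "Pulling Two Fingers In": 5315,
    "Pushing Two Fingers Away": 5358,
    "Rolling Hand Forward": 5165,
    "Rolling Hand Backward": 5031,
    "Turning Hand Counterclockwise": 4181,
    "Turning Hand Clockwise": 3980,
    "Zooming In With Full Hand": 5307,
    "Zooming Out With Full Hand": 5330,
    "Zooming In With Two Fingers": 5355,
    "Zooming Out With Two Fingers": 5379,
    "Thumb Up": 5457,
    "Thumb Down": 5460,
    "Shaking Hand": 5314,
    "Stop Sign": 5413,
    "Drumming Fingers": 5444,
    "No gesture": 5344,
    "Doing other things": 12416,
}

total_videos = sum(CLASS_COUNTS.values())

# Expected accuracy when picking one of the labels uniformly at random.
uniform_baseline = 1 / len(CLASS_COUNTS)

# Share of the largest class among all counted videos.
majority_label = max(CLASS_COUNTS, key=CLASS_COUNTS.get)
majority_share = CLASS_COUNTS[majority_label] / total_videos

print(f"{total_videos} videos, {len(CLASS_COUNTS)} labels")
print(f"uniform guessing: {uniform_baseline:.2%}")
print(f"largest class ({majority_label}): {majority_share:.2%}")
```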

Leaderboard

If you have created a model based on the training set and it performs well on the validation set, we encourage you to run it on the test set (which is published without any class labels, as you might have noticed). Please prepare a .csv file with the video's id in the first column and your predicted class label in the second column (as a string matching the wording used in the training and validation sets). Please use a semicolon as the separator. You can then upload your .csv file (user login required) to be ranked in the leaderboard and to benchmark your approach against those of other machine learners. We are looking forward to your submission.
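
A submission file in the format described above could be written as follows. This is a minimal sketch under stated assumptions: the function name is hypothetical, predictions are assumed to be held in a dict mapping video id to label string, and no header row is written because the description does not mention one.

```python
import csv


def write_submission(predictions, path):
    """Write predictions as a semicolon-separated .csv file.

    predictions: dict mapping video id -> predicted label string
    (assumed in-memory format; adapt to however your model returns results).
    """
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter=";")
        # One row per video: id in the first column, label in the second.
        for video_id, label in sorted(predictions.items()):
            writer.writerow([video_id, label])
```

Calling write_submission({1: "Swiping Left", 42: "Thumb Up"}, "submission.csv") produces the two lines 1;Swiping Left and 42;Thumb Up. The label strings must match the wording of the training and validation labels exactly.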

Rank  Name                 Submitted           Approach                               Accuracy
1     Lei Shi              5 days ago          -                                      95.26 %
2     Anonymous            about 1 month ago   TRN (CVPR'18 submission)               94.78 %
3     Eren Gölge           about 2 months ago  Besnet                                 94.23 %
4     Gaurav Kumar Singh   about 1 month ago   Ford's Gesture Recognition System      94.11 %
5     Guillaume Berger     2 months ago        -                                      93.87 %
6     John Emmons          about 1 month ago   VideoLSTM                              85.86 %
7     Ke Yang              4 days ago          2D CNN (RGB)                           84.55 %
8     Damien MENIGAUX      16 days ago         ConvLSTM                               82.76 %
9     Joanna Materzyńska   6 months ago        Twenty Billion Neuron's Jester System  82.34 %
10    Wang Jingyao         23 days ago         3D convolutional neural network        77.85 %
11    Yu Zhu               4 months ago        -                                      10.52 %
12    Konfuzius            6 months ago        Just random guessing…                   3.65 %


