I met Dr. Ren Wu a day after his team at Baidu announced a spectacular result on ImageNet’s ILSVRC 2015 challenge, beating Google and Microsoft by a rather large margin. I was attending the Embedded Vision Summit 2015, where he was a keynote speaker. His talk was both entertaining and inspiring. I was very impressed by the huge strides Baidu was making in Deep Learning. Dr. Wu’s speech conveyed the pride and excitement his team must have been feeling at their latest exploit.
It was therefore a shock to learn today that Baidu has been disqualified from participating in ILSVRC 2015 because they broke the rules and cheated. I sincerely hope that this was not systematically done by the entire group.
ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is by far the most popular machine learning / computer vision competition of all time. If ILSVRC is compared to Olympic track and field events, the classification task is clearly the 100m dash. Using a training set of more than a million hand-labeled images classified into 1000 categories, the objective is to automatically classify more than 100,000 test images. The classification task is where research labs in industry and academia fight tooth and nail to prove their machine learning prowess. The immense popularity of Deep Learning for image recognition tasks is largely attributed to Dr. Geoff Hinton’s ILSVRC 2012 winning entry, which achieved an error rate of 15.315% compared to the closest competitor at 26.172%.
Intense Competition at ILSVRC 2015
This year the competition has been intense.
On Jan 13, 2015, Baidu’s Deep Image team published a paper titled Deep Image: Scaling up Image Recognition, which announced that Baidu’s entry, with an error rate of 5.98%, beat Google’s ILSVRC 2014 winning entry, which had an error rate of 6.66%.
On Feb 6, 2015, a team from Microsoft Research became the first in the world to surpass the human error rate of 5.1% on the classification task. Their architecture, described in the paper Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, achieved an error rate of 4.94%!
Merely five days later, on Feb 11, 2015, a team from Google reported their latest results in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. They achieved an error rate of 4.8%, edging past Microsoft by a mere 0.14 percentage points!
On May 11, 2015 Baidu was back with an incredible error rate of a mere 4.58%.
There was one problem though. Baidu had broken the rules. If you compare the classification task to the 100m dash, then Baidu’s error rate of 4.58% is like Ben Johnson’s 1988 Olympics 100m record of 9.79 seconds — both were on steroids.
What Rules Did Baidu Break?
According to ILSVRC rules, a team is allowed 2 submissions per week.
An announcement posted by ILSVRC on June 2 states that the Baidu team used 30 different accounts to submit at least 200 times!
During the period of November 28th, 2014 to May 13th, 2015, there were at least 30 accounts used by a team from Baidu to submit to the test server at least 200 times, far exceeding the specified limit of two submissions per week. This includes short periods of very high usage, for example with more than 40 submissions over 5 days from March 15th, 2015 to March 19th, 2015.
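To put those numbers in perspective, here is a rough back-of-the-envelope calculation in Python. It is only a sketch: the dates, the limit of two per week, and the figure of 200 come from the announcement above, while the function name `allowed_submissions` and the choice to count partial weeks as full weeks are my own assumptions (the latter is generous to the team). Since the announcement says "at least 200", the real excess could be larger.

```python
import math
from datetime import date

# Figures quoted in the ILSVRC announcement above.
PERIOD_START = date(2014, 11, 28)
PERIOD_END = date(2015, 5, 13)
LIMIT_PER_WEEK = 2          # allowed submissions per team per week
REPORTED_SUBMISSIONS = 200  # "at least 200", so this is a lower bound


def allowed_submissions(start, end, limit_per_week=LIMIT_PER_WEEK):
    """Upper bound on rule-abiding submissions between start and end (inclusive).

    Assumption (mine, not the organizers'): the limit applies per 7-day
    window and a partial week counts as a full week.
    """
    days = (end - start).days + 1      # count both endpoints
    weeks = math.ceil(days / 7)        # round partial weeks up
    return limit_per_week * weeks


if __name__ == "__main__":
    allowed = allowed_submissions(PERIOD_START, PERIOD_END)
    print("allowed over the whole period:", allowed)
    print("reported / allowed:", REPORTED_SUBMISSIONS / allowed)

    # The burst from March 15th to March 19th: more than 40 submissions in 5 days.
    burst_allowed = allowed_submissions(date(2015, 3, 15), date(2015, 3, 19))
    print("allowed during the 5-day burst:", burst_allowed)
```

Under these assumptions a single team could have made about 48 submissions over the whole period, so 200 is more than four times the limit. During the five-day burst in March, the limit would have allowed 2 submissions, not more than 40.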
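Why are more than 2 submissions a week against the rules? The learning architecture and its parameters should be based solely on the training and validation sets. The test set is meant to be an unbiased measure of how well the model generalizes to images it has never seen. Every submission reveals a little about the hidden test labels, so if you cheat and tweak the parameters of your model based on the test set, you can easily get an artificially superior result.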
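The effect is easy to demonstrate with a toy simulation. The sketch below is not a model of what Baidu actually did. It assumes a set of hypothetical model variants that all have exactly the same true accuracy and make independent errors on each test image (real variants of one network would be strongly correlated, so this is a simplification). The function name `best_of_n_test_accuracy` and all parameter values are made up for illustration. The point is that reporting the best score out of many submissions gives a number above the true accuracy, purely through luck.

```python
import random


def best_of_n_test_accuracy(n_submissions, n_test_images, true_accuracy, seed=0):
    """Best test-set accuracy observed across n_submissions.

    Every simulated submission has the same underlying accuracy; each test
    image is classified correctly with probability true_accuracy,
    independently of all other images and submissions (a simplifying
    assumption).
    """
    rng = random.Random(seed)
    best = 0.0
    for _ in range(n_submissions):
        correct = sum(rng.random() < true_accuracy for _ in range(n_test_images))
        best = max(best, correct / n_test_images)
    return best


if __name__ == "__main__":
    true_accuracy = 0.95      # hypothetical: a 5% true error rate
    n_test_images = 100_000   # roughly the size of the ILSVRC test set

    one = best_of_n_test_accuracy(1, n_test_images, true_accuracy)
    many = best_of_n_test_accuracy(200, n_test_images, true_accuracy)
    print(f"error rate reported after   1 submission : {100 * (1 - one):.3f}%")
    print(f"error rate reported after 200 submissions: {100 * (1 - many):.3f}%")
```

With 100,000 test images and a true error rate of 5%, the standard deviation of a single measured error rate is sqrt(0.95 × 0.05 / 100,000), or about 0.07 percentage points. The top contenders this year were separated by margins of only a few tenths of a percentage point, so picking the luckiest of 200 tries can matter even before any deliberate tuning against the test set.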
Ban and Apology
As a result of this misconduct, the Baidu team has been banned for 12 months. I hope they learn from this mistake and come back to contribute to the field. They have issued an apology, but it is sadly short on details.
Dear ILSVRC community,
Recently the ILSVRC organizers contacted the Heterogeneous Computing team to inform us that we exceeded the allowable number of weekly submissions to the ImageNet servers (~ 200 submissions during the lifespan of our project).
We apologize for this mistake and are continuing to review the results. We have added a note to our research paper, Deep Image: Scaling up Image Recognition, and will continue to provide relevant updates as we learn more.
We are staunch supporters of fairness and transparency in the ImageNet Challenge and are committed to the integrity of the scientific process.
Ren Wu – Baidu Heterogeneous Computing Team