
Deep Residual Learning for Image Recognition

Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
Published 2016 · Computer Science

Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers - 8× deeper than VGG nets [40] but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
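The reformulation described in the abstract, where a block learns a residual function F(x) and adds the layer input back, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the two-layer fully connected F, the helper names `relu`, `matvec`, and `residual_block`, and the toy weights are hypothetical, and this is not the authors' implementation.

```python
# Minimal sketch of a residual block: output = F(x) + x.
# The two-layer fully connected F and all names here are illustrative
# assumptions, not the architecture or code used in the paper.

def relu(v):
    """Element-wise rectified linear unit on a list of floats."""
    return [max(0.0, a) for a in v]

def matvec(w, v):
    """Multiply matrix w (a list of rows) by vector v."""
    return [sum(wij * vj for wij, vj in zip(row, v)) for row in w]

def residual_block(x, w1, w2):
    """Return F(x) + x, where F(x) = w2 @ relu(w1 @ x)."""
    fx = matvec(w2, relu(matvec(w1, x)))
    # The shortcut connection adds the block input to the residual F(x).
    return [f + xi for f, xi in zip(fx, x)]

x = [1.0, -2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]

# With all weights at zero, F(x) is zero and the block is an identity mapping.
identity_out = residual_block(x, zeros, zeros)
print(identity_out)
```

With zero weights the residual F(x) vanishes and the block passes its input through unchanged, so in this sketch an added block starts from an identity mapping and only has to learn a correction to it.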
This paper references
A multigrid tutorial
W. Briggs (1987)
Modern Applied Statistics with S-Plus
Karen Kafadar (1999)
Going deeper with convolutions
Christian Szegedy (2015)
Visualizing and Understanding Convolutional Neural Networks
Matthew D. Zeiler (2013)
Very Deep Convolutional Networks for Large-Scale Image Recognition
K. Simonyan (2015)
Deeply-supervised nets
C.-Y. Lee (2014)
Network In Network
M. Lin (2014)
Object Detection via a Multi-region and Semantic Segmentation-Aware CNN Model
Spyros Gidaris (2015)
ImageNet classification with deep convolutional neural networks
A. Krizhevsky (2017)
Deeply-Supervised Nets
Chen-Yu Lee (2015)
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
Shaoqing Ren (2015)
Accelerated Gradient Descent by Factor-Centering Decomposition
N. N. Schraudolph (1998)
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky (2015)
Neural Networks for Pattern Recognition
S. Kothari (1993)
[Et al].
P. Cochat (2012)
Untersuchungen zu dynamischen neuronalen netzen. Diploma thesis
S Hochreiter (1991)
Backpropagation Applied to Handwritten Zip Code Recognition
Y. LeCun (1989)
Fisher Kernels on Visual Vocabularies for Image Categorization
F. Perronnin (2007)
Pattern Recognition and Neural Networks
Y. LeCun (1995)
Fast R-CNN
Ross B. Girshick (2015)
Deep Residual Learning for Image Recognition Supplementary Materials
Kaiming He (2016)
Maxout Networks
Ian J. Goodfellow (2013)
Understanding the difficulty of training deep feedforward neural networks
Xavier Glorot (2010)
A multigrid tutorial, Second Edition
W. Briggs (2000)
Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition
Kaiming He (2015)
Caffe: Convolutional Architecture for Fast Feature Embedding
Y. Jia (2014)
Product Quantization for Nearest Neighbor Search
H. Jégou (2011)
Improving neural networks by preventing co-adaptation of feature detectors
Geoffrey E. Hinton (2012)
Long Short-Term Memory
S. Hochreiter (1997)
Centering Neural Network Gradient Factors
N. N. Schraudolph (2012)
Common objects in context
M. Maire (1998)
The Pascal Visual Object Classes (VOC) Challenge
M. Everingham (2009)
Rectified Linear Units Improve Restricted Boltzmann Machines
V. Nair (2010)
Object Detection Networks on Convolutional Feature Maps
Shaoqing Ren (2017)
Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
Andrew M. Saxe (2014)
Efficient BackProp
Y. LeCun (2012)
Deep Learning Made Easier by Linear Transformations in Perceptrons
T. Raiko (2012)
R Girshick (2015)
A Multigrid Tutorial. SIAM
W L Briggs (2000)
Locally adapted hierarchical basis preconditioning
R. Szeliski (2006)
Highway Networks
R. Srivastava (2015)
Learning long-term dependencies with gradient descent is difficult
Yoshua Bengio (1994)
FitNets: Hints for Thin Deep Nets
A. Romero (2015)
Overfeat: Integrated recognition
P. Sermanet (2014)
Convolutional neural networks at constrained time cost
Kaiming He (2015)
On the Number of Linear Regions of Deep Neural Networks
Guido Montúfar (2014)
Fully Convolutional Networks for Semantic Segmentation
Evan Shelhamer (2017)
Fast Surface Interpolation Using Hierarchical Basis Functions
R. Szeliski (1990)
Aggregating Local Image Descriptors into Compact Codes
H. Jégou (2012)
Untersuchungen zu dynamischen neuronalen Netzen
S. Hochreiter (1991)
Training Very Deep Networks
R. Srivastava (2015)
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
S. Ioffe (2015)
OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks
Pierre Sermanet (2014)
ImageNet Classification with Deep Convolutional Neural Networks
Alex Krizhevsky (2012)
VLFeat: an open and portable library of computer vision algorithms
A. Vedaldi (2010)
Pushing Stochastic Gradient towards Second-Order Methods -- Backpropagation Learning with Transformations in Nonlinearities
Tommi Vatanen (2013)
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
Kaiming He (2015)
Microsoft COCO: Common Objects in Context
Tsung-Yi Lin (2014)
Learning Multiple Layers of Features from Tiny Images
A. Krizhevsky (2009)
Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation
Ross B. Girshick (2014)
Common objects in context
S. Belongie (1998)
The devil is in the details: an evaluation of recent feature encoding methods
K. Chatfield (2011)

This paper is referenced by
Ship Target Detection Algorithm Based on Improved Faster R-CNN
Liang Qi (2019)
Exploring the Vulnerability of Deep Neural Networks: A Study of Parameter Corruption
X. Sun (2020)
Making Sense of CNNs: Interpreting Deep Representations & Their Invariances with INNs
Robin Rombach (2020)
HMCNAS: Neural Architecture Search using Hidden Markov Chains and Bayesian Optimization
Vasco Lopes (2020)
Robust Semantic Segmentation by Redundant Networks With a Layer-Specific Loss Contribution and Majority Vote
Andreas Bär (2020)
Real-time Inferencing and Training of Artificial Neural Network for Adaptive Latency Negation in Distributed Virtual Environments
Gregory Gutmann (2020)
Implicit Regularization in Deep Learning: A View from Function Space
Aristide Baratin (2020)
Stylized Adversarial Defense
Muzammal Naseer (2020)
Deep Convolutional Neural Networks for Shark Behavior Analysis
Wenlu Zhang (2019)
A Computer Vision Framework for Human User Sensing in Public Open Spaces
Peng Sun (2019)
Corner Proposal Network for Anchor-free, Two-stage Object Detection
Kaiwen Duan (2020)
WaveFuse: A Unified Deep Framework for Image Fusion with Wavelet Transform
Shaolei Liu (2020)
HATNet: An End-to-End Holistic Attention Network for Diagnosis of Breast Biopsy Images
Sachin Mehta (2020)
Mixed 2D/3D Convolutional Network for Hyperspectral Image Super-Resolution
Qiang Li (2020)
Learning Geometric Features with Dual–stream CNN for 3D Action Recognition
Thien Huynh-The (2020)
Detecting Multiple Speech Disfluencies Using a Deep Residual Network with Bidirectional Long Short-Term Memory
Tedd Kourkounakis (2020)
Sight to Sound: An End-to-End Approach for Visual Piano Transcription
A. Sophia Koepke (2020)
Long-Term Action Dependence-Based Hierarchical Deep Association for Multi-Athlete Tracking in Sports Videos
Longteng Kong (2020)
Imaging of the fish embryo model and applications to toxicology. (Imagerie du modèle embryon de poisson : application à la toxicologie du développement)
Diane Genest (2019)
Complex ResNet Aided DoA Estimation for Near-Field MIMO Systems
Yashuai Cao (2020)
GarNet++: Improving Fast and Accurate Static3D Cloth Draping by Curvature Loss
Erhan Gundogdu (2020)
XMixup: Efficient Transfer Learning with Auxiliary Samples by Cross-domain Mixup
Xingjian Li (2020)
Adaptive Integration of Multiple Fine-tuning Models in Transfer Learning for Image Classification
Yu WANG (2020)
Reinforced External Guidance for Theorem Provers
Michael Rawson (2020)
Nonlinear Activation in Deep Residual Networks
Qi-jun Zhang (2020)
Label Reuse for Efficient Semi-Supervised Learning
Tsung-Hung Hsieh (2020)
Skin Lesion Segmentation Based on Multi-Scale Attention Convolutional Neural Network
Yun Jiang (2020)
The Implementation of A Crop Diseases APP Based on Deep Transfer Learning
Mengji Yang (2020)
Encoder-Decoder Architecture for Image Caption Generation
Harshit Parikh (2020)
Deep learning based Surgical Workflow Recognition from Laparoscopic Videos
Elizebeth Kurian (2020)
Self-Supervised Nuclei Segmentation in Histopathological Images Using Attention
Mihir Sahasrabudhe (2020)
An Attention-Based Deep Learning Model for Multiple Pedestrian Attributes Recognition
Ehsan Yaghoubi (2020)
Some data provided by Semantic Scholar