PUBLICATIONS

2017

Surface Normals in the Wild
Weifeng Chen, Donglai Xiang, Jia Deng
arXiv:1704.02956
[ paper ]

Forecasting Human Dynamics from Static Images
Yu-Wei Chao, Jimei Yang, Brian Price, Scott Cohen, Jia Deng
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
[ paper ] [ project ]

Temporal Action Localization by Structured Maximal Sums
Zehuan Yuan, Jonathan Stroud, Tong Lu, Jia Deng
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017
[ paper ] [ code ]

Dynamic Deep Neural Networks: Optimizing Accuracy-Efficiency Trade-offs by Selective Execution
Lanlan Liu, Jia Deng
arXiv:1701.00299
[ paper ]

2016

Associative Embedding: End-to-End Learning for Joint Detection and Grouping
Alejandro Newell, Jia Deng
arXiv:1611.05424
[ paper ]

Single-Image Depth Perception in the Wild.
Weifeng Chen, Zhao Fu, Dawei Yang, Jia Deng.
Neural Information Processing Systems (NIPS), 2016.
[ paper ] [ data ] [ code ]

Structured Matching for Phrase Localization
Mingzhe Wang, Mahmoud Azab, Noriyuki Kojima, Rada Mihalcea, Jia Deng
European Conference on Computer Vision (ECCV), 2016
[ paper ] [ code ]

Stacked Hourglass Networks for Human Pose Estimation
Alejandro Newell, Kaiyu Yang, Jia Deng
European Conference on Computer Vision (ECCV), 2016
[ paper ] [ code ]

Leveraging the Wisdom of the Crowd for Fine-Grained Recognition
Jia Deng, Jonathan Krause, Michael Stark, Li Fei-Fei.
IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 2016.
[ paper ]

Learning to Name Objects
Vicente Ordonez, Wei Liu, Jia Deng, Yejin Choi, Alexander C. Berg, Tamara L. Berg
Communications of the ACM, March 2016 (Vol. 59, No. 3).
[ paper ]

2015

HICO: A Benchmark for Recognizing Human-Object Interactions in Images
Yu-Wei Chao, Zhan Wang, Yugeng He, Jiaxuan Wang, Jia Deng
International Conference on Computer Vision (ICCV), 2015
[ paper ] [ data ] [ code ]

Probabilistic Label Relation Graphs with Ising Models
Nan Ding, Jia Deng, Kevin Murphy, Hartmut Neven
International Conference on Computer Vision (ICCV), 2015
[ paper ]

Learning Semantic Relationships for Better Action Retrieval in Images
Vignesh Ramanathan, Congcong Li, Jia Deng, Wei Han, Zhen Li, Kunlong Gu, Yang Song, Samy Bengio, Charles Rosenberg, Fei-Fei Li
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015
[ paper ] [ project ]

Mining Semantic Affordances of Visual Object Categories
Yu-Wei Chao, Zhan Wang, Rada Mihalcea, Jia Deng
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
[ paper ] [ project ] [ data ] [ code ] [ poster ]

ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky*, Jia Deng*, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg and Li Fei-Fei. (* = equal contribution)
International Journal of Computer Vision (IJCV), 2015.
[ paper ] [ bibtex ] [ MIT Technology Review ]

2014

Large-Scale Object Classification Using Label Relation Graphs
Jia Deng, Nan Ding, Yangqing Jia, Andrea Frome, Kevin Murphy, Samy Bengio, Yuan Li, Hartmut Neven, Hartwig Adam.
European Conference on Computer Vision (ECCV), 2014
[ paper ] [ supplemental material ] [ slides ] [ Best Paper Award ]

Scalable Multi-Label Annotation
Jia Deng, Olga Russakovsky, Jonathan Krause, Michael Bernstein, Alexander C. Berg, Li Fei-Fei.
ACM Conference on Human Factors in Computing Systems (CHI), 2014
[ paper ]

2013 and before

3D Object Representations for Fine-Grained Categorization
Jonathan Krause, Michael Stark, Jia Deng, Li Fei-Fei
4th IEEE Workshop on 3D Representation and Recognition, at ICCV 2013 (3dRR-13)
[ paper ]

From Large Scale Image Categorization to Entry-Level Categories
Vicente Ordonez, Jia Deng, Yejin Choi, Alexander C. Berg, Tamara L. Berg.
International Conference on Computer Vision (ICCV), 2013
[ paper ] [ Marr Prize (best paper award) ]

Detecting avocados to zucchinis: what have we done, and where are we going?
Olga Russakovsky, Jia Deng, Zhiheng Huang, Alexander C. Berg, Li Fei-Fei.
International Conference on Computer Vision (ICCV), 2013
[ pdf ] [ supplement ] [ more analysis ] [ bibtex ] [ slides ] [ video ]

Fine-Grained Crowdsourcing for Fine-Grained Recognition
Jia Deng, Jonathan Krause, Li Fei-Fei.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013.
[ paper ] [ bibtex ] [ slides ] [ poster ] [ UI code ] [ oral talk ] [ data ]

Crowdsourcing Annotations for Visual Object Detection
Hao Su, Jia Deng, Li Fei-Fei.
AAAI Human Computation Workshop (HCOMP), 2012
[ paper ] [ bibtex ]

Large Scale Visual Recognition
Jia Deng.
PhD Dissertation, 2012
[ pdf ]

Hedging Your Bets: Optimizing Accuracy-Specificity Trade-offs in Large Scale Visual Recognition
Jia Deng, Jonathan Krause, Alex Berg, Li Fei-Fei
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012.
[ paper ] [ supplementary materials ] [ code ] [ project site ] [ bibtex ]

Fast and Balanced: Efficient Label Tree Learning for Large Scale Object Recognition
Jia Deng, Sanjeev Satheesh, Alex Berg, Li Fei-Fei
Neural Information Processing Systems (NIPS), 2011.
[ paper ] [ supplementary materials ] [ bibtex ]

Hierarchical Semantic Indexing for Large Scale Image Retrieval.
Jia Deng, Alex Berg, Li Fei-Fei
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011.
[ paper ] [ supplementary materials ] [ bibtex ]

What does classifying more than 10,000 image categories tell us?
Jia Deng, Alex Berg, Kai Li, Li Fei-Fei
European Conference on Computer Vision (ECCV), 2010.
[ paper ] [ data ] [ code ] [ bibtex ]

ImageNet: A Large-Scale Hierarchical Image Database.
Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, Li Fei-Fei
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
[ paper ] [ project site ] [ New York Times article ] [ bibtex ]

Towards scalable dataset construction: An active learning approach
Brandon Collins, Jia Deng, Kai Li, Li Fei-Fei.
European Conference on Computer Vision (ECCV), 2008.
[ paper ] [ bibtex ]

Digital Bas-Relief From 3D Scenes
Tim Weyrich, Jia Deng, Connelly Barnes, Szymon Rusinkiewicz, Adam Finkelstein.
ACM Transactions on Graphics (Proc. SIGGRAPH), 26(3), August 2007.
[ paper ] [ project site ] [ bibtex ]

SELECTED PROJECTS

Recognizing Common Human-Object Interactions in Images
The goal of this project is to develop algorithms and datasets for recognizing common human-object interactions such as "ride a bike", "repair a bike", and "wash an apple". The emphasis is on distinguishing the different actions that are possible with the same object.
Human Pose Estimation
Human pose estimation is a key step toward understanding people in images and video. The goal of this project is to develop algorithms to accurately localize the individual body joints of every person in an image, even in the presence of multiple persons, occlusion and clutter.
3D Perception in the Wild
Recovering 3D structure from a single image is a fundamental problem in vision, but existing systems cannot yet operate on unconstrained images. We tackle this problem by integrating crowdsourcing, deep learning, and computer graphics. Our initial focus is on depth and surface normals.
Visual Knowledge Base and Reasoning Engine
This project aims to develop a new large-scale knowledge base called the visual Memex that extracts and stores a vast set of visual relationships between data items, and a reasoning engine to make inferences about visual data on a global scale.
Encyclopedic Recognition
This project seeks to advance the core capabilities of large-scale visual recognition by co-designing visual models and computing infrastructure. The goal is to enable encyclopedic, real-time visual recognition through seamless integration of visual computing on wearable devices and in the cloud.

SOURCE CODE

•   DIW: Code for evaluation and for reproducing the results in the paper Single-Image Depth Perception in the Wild.
•   StructuredMatching: Code for evaluation and for reproducing the results in the paper Structured Matching for Phrase Localization.
•   HG-POSE: State-of-the-art single-person human pose estimator.
•   HICO: Code for evaluation and for reproducing the results in the paper HICO: A Benchmark for Recognizing Human-Object Interactions in Images.
•   SemanticAffordance: Code for evaluation and for reproducing the results in the paper Mining Semantic Affordances of Visual Object Categories.
•   BubbleUI: UI code for the Bubbles game proposed in the paper Fine-Grained Crowdsourcing for Fine-Grained Recognition.
•   DARTS: Implementation of the DARTS algorithm presented in the paper Hedging Your Bets: Optimizing Accuracy-Specificity Trade-offs in Large Scale Visual Recognition.

DATASETS

•   Depth In the Wild: A large dataset with human annotations of relative depth on unconstrained images.
•   HICO: A benchmark for recognizing human-object interactions in images, with 600 diverse categories of human-object interactions.
•   SemanticAffordance: A dataset of human annotations of semantic affordances: whether action X can be performed by object Y.
•   Bubbles: A dataset of "bubbles" provided by humans through playing the Bubbles game presented in the paper Fine-Grained Crowdsourcing for Fine-Grained Recognition.
•   ImageNet: A large-scale image dataset with over 20K visual categories.

ACKNOWLEDGMENTS

We acknowledge the generous support of the National Science Foundation, Google, Intel, KAUST, Procter & Gamble, General Motors, Samsung, Yahoo, Adobe, and Nvidia.