Unless disaster strikes, the course will be taught in person with an optional Zoom mirror.
The Zoom mirror will be live; there will not be any recordings.
We won't enforce strict prerequisites (though help with these topics will be limited).
Auditing is allowed if there is space (no coding or presentations, but in-class participation is required).
Date | Topic | Papers |
Aug 26 | Course Introduction | |
Aug 31 | Convolutional Neural Networks [no code (yet)] |
[S1] Gradient-based learning applied to document recognition, LeCun, Bottou, Bengio, Haffner; 1998
[S1] ImageNet Classification with Deep Convolutional Neural Networks, Krizhevsky, Sutskever, Hinton; 2012
[S1] Network In Network, Lin, Chen, Yan; 2013
[S1] Going Deeper with Convolutions, Szegedy, Liu, Jia, Sermanet, Reed, Anguelov, Erhan, Vanhoucke, Rabinovich; 2014
[S1] Very Deep Convolutional Networks for Large-Scale Image Recognition, Simonyan, Zisserman; 2014
|
Sep 02 | Convolutional Neural Networks [no code (yet)] |
[S1] Deep Residual Learning for Image Recognition, He, Zhang, Ren, Sun; 2015
[S1] Densely Connected Convolutional Networks, Huang, Liu, Maaten, Weinberger; 2016
[S1] MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, Howard, Zhu, Chen, Kalenichenko, Wang, Weyand, Andreetto, Adam; 2017
[S1] MobileNetV2: Inverted Residuals and Linear Bottlenecks, Sandler, Howard, Zhu, Zhmoginov, Chen; 2018
[S1] EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks, Tan, Le; 2019
|
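To make the unit concrete, here is a minimal sketch of the residual block at the heart of He et al. (2015). It assumes PyTorch is available; the channel count and input shape are illustrative choices, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Basic residual block: output = ReLU(F(x) + x)."""
    def __init__(self, channels: int):
        super().__init__()
        # Two 3x3 convolutions; padding=1 preserves the spatial size
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)  # the skip connection

x = torch.randn(2, 64, 32, 32)
print(ResidualBlock(64)(x).shape)  # torch.Size([2, 64, 32, 32])
```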
Sep 07 | Non-linearities (and initialization) [C] |
[S1] [S2] Understanding the difficulty of training deep feedforward neural networks, Glorot, Bengio; 2010 (S1: Serdjan Rolovic, S2: Elias Lampietti, C: Matthew Kelleher, R: Samantha Hay)
[S1] [S2] Deep Sparse Rectifier Neural Networks, Glorot, Bordes, Bengio; 2011 (S1: Ishank Arora, S2: Yeming Wen, C: Ayush Chauhan, R: Christopher Hahn)
[S1] [S2] Maxout Networks, Goodfellow, Warde-Farley, Mirza, Courville, Bengio; 2013 (S1: Zayne Sprague, S2: Zhou Fang, C: Reid Ling Tong Li, R: Jay Whang)
[S1] Exact solutions to the nonlinear dynamics of learning in deep linear neural networks, Saxe, McClelland, Ganguli; 2013 (S1: Liyan Chen, C: Marlan McInnes-Taylor, R: Nilesh Gupta)
[S1] [S2] Rectifier Nonlinearities Improve Neural Network Acoustic Models, Maas, Hannun, Ng; 2013 (S1: Kelsey Ball, S2: Srinath Tankasala, C: Hung-Ting Chen, R: Sai Kiran Maddela)
|
Sep 09 | Non-linearities (and initialization) [C] |
[S1] [S2] Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, He, Zhang, Ren, Sun; 2015 (S1: Tongrui Li, S2: Jordi Ramos Chen)
[S1] [S2] Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs), Clevert, Unterthiner, Hochreiter; 2015 (S1: Joshua Papermaster, S2: Abayomi Adekanmbi, C: Ian Trowbridge, R: Tarannum Khan)
[S2] Searching for Activation Functions, Ramachandran, Zoph, Le; 2017 (S2: Jay Liao, C: Ishan Shah, R: Kiran Raja)
[S1] [S2] Mish: A Self Regularized Non-Monotonic Activation Function, Misra; 2019 (S1: Shivi Agarwal, S2: Atreya Dey, C: Ojas Patel, R: Daniel Almeraz)
[S1] [S2] Gaussian Error Linear Units (GELUs), Hendrycks, Gimpel; 2016 (S1: Marco Bueso, S2: Jose Chavez, C: Cheng-Chun Hsu, R: Shivang Singh)
|
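A small sketch contrasting the activation functions covered this week, along with the two classic initializations (Glorot/Xavier and He). PyTorch is assumed (F.mish requires a recent version); the layer sizes are arbitrary illustrative choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.linspace(-3, 3, steps=7)
print(F.relu(x))               # Glorot et al. 2011: max(0, x)
print(F.leaky_relu(x, 0.01))   # Maas et al. 2013: small slope for x < 0
print(F.elu(x))                # Clevert et al. 2015: smooth negative saturation
print(F.gelu(x))               # Hendrycks, Gimpel 2016: x * Phi(x)
print(F.mish(x))               # Misra 2019: x * tanh(softplus(x))

layer = nn.Linear(256, 256)
nn.init.xavier_uniform_(layer.weight)                       # Glorot, Bengio 2010
nn.init.kaiming_normal_(layer.weight, nonlinearity="relu")  # He et al. 2015
```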
Sep 14 | Optimizers [C] |
[S1] [S2] Large-Scale Machine Learning with Stochastic Gradient Descent, Bottou; 2010 (S1: Elias Lampietti, S2: Zayne Sprague, C: Samantha Hay, R: Ojas Patel)
[S1] [S2] On the importance of initialization and momentum in deep learning, Sutskever, Martens, Dahl, Hinton; 2013 (S1: Zhou Fang, S2: Kelsey Ball, C: Kiran Raja, R: Reid Ling Tong Li)
[S1] [S2] Cyclical Learning Rates for Training Neural Networks, Smith; 2015 (S1: Jay Liao, S2: Liyan Chen, C: Shivang Singh, R: Ian Trowbridge)
[S1] [S2] SGDR: Stochastic Gradient Descent with Warm Restarts, Loshchilov, Hutter; 2016 (S1: Atreya Dey, S2: Tongrui Li, C: Tarannum Khan)
[S1] [S2] Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates, Smith, Topin; 2017 (S1: Jose Chavez, S2: Marco Bueso, R: Marlan McInnes-Taylor)
|
Sep 16 | Optimizers [C] |
[S1] Adaptive Subgradient Methods for Online Learning and Stochastic Optimization, Duchi, Hazan, Singer; 2011 (S1: Yeming Wen, C: Daniel Almeraz, R: Ayush Chauhan)
[S1] [S2] ADADELTA: An Adaptive Learning Rate Method, Zeiler; 2012 (S1: Srinath Tankasala, S2: Shivi Agarwal, C: Nilesh Gupta, R: Matthew Kelleher)
[S1] [S2] Adam: A Method for Stochastic Optimization, Kingma, Ba; 2014 (S1: Jordi Ramos Chen, S2: Ishank Arora, C: Jay Whang, R: Cheng-Chun Hsu)
[S1] [S2] On the Convergence of Adam and Beyond, Reddi, Kale, Kumar; 2019 (S1: Abayomi Adekanmbi, S2: Joshua Papermaster, C: Sai Kiran Maddela, R: Hung-Ting Chen)
[S2] Decoupled Weight Decay Regularization, Loshchilov, Hutter; 2017 (S2: Serdjan Rolovic, C: Christopher Hahn, R: Ishan Shah)
|
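To make the update rules concrete, here is a from-scratch NumPy sketch of a single Adam step (Kingma, Ba; 2014) with the paper's default hyperparameters; in practice one would use torch.optim.Adam, and the toy objective below is an arbitrary choice.

```python
import numpy as np

def adam_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update for parameters w with gradient g at step t (t starts at 1)."""
    m = b1 * m + (1 - b1) * g          # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * g * g      # second-moment (uncentered variance) estimate
    m_hat = m / (1 - b1 ** t)          # bias correction for the zero initialization
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

w, m, v = np.zeros(3), np.zeros(3), np.zeros(3)
for t in range(1, 4):
    g = 2 * w - 1                      # gradient of the toy quadratic f(w) = w^2 - w
    w, m, v = adam_step(w, g, m, v, t)
print(w)
```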
Sep 21 | Normalizations [C] |
[S1] [S2] Dropout: A Simple Way to Prevent Neural Networks from Overfitting, Srivastava, Hinton, Krizhevsky, Sutskever, Salakhutdinov; 2014 (S1: Ojas Patel, S2: Tarannum Khan, C: Marco Bueso)
[S1] Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, Ioffe, Szegedy; 2015 (S1: Ishan Shah, C: Kelsey Ball, R: Elias Lampietti)
[S1] [S2] Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks, Salimans, Kingma; 2016 (S1: Marlan McInnes-Taylor, S2: Samantha Hay, C: Zayne Sprague, R: Jay Liao)
[S1] [S2] Layer Normalization, Ba, Kiros, Hinton; 2016 (S1: Ayush Chauhan, S2: Jay Whang, C: Ishank Arora, R: Atreya Dey)
[S1] [S2] Instance Normalization: The Missing Ingredient for Fast Stylization, Ulyanov, Vedaldi, Lempitsky; 2016 (S1: Reid Ling Tong Li, S2: Kiran Raja, C: Liyan Chen, R: Abayomi Adekanmbi)
|
Sep 23 | Normalizations [C] |
[S1] [S2] Group Normalization, Wu, He; 2018 (S1: Ian Trowbridge, S2: Shivang Singh, C: Shivi Agarwal, R: Srinath Tankasala)
[S1] [S2] High-Performance Large-Scale Image Recognition Without Normalization, Brock, De, Smith, Simonyan; 2021 (S1: Matthew Kelleher, S2: Christopher Hahn, C: Serdjan Rolovic, R: Jose Chavez)
[S2] Micro-Batch Training with Batch-Channel Normalization and Weight Standardization, Qiao, Wang, Liu, Shen, Yuille; 2019 (S2: Nilesh Gupta, C: Joshua Papermaster, R: Jordi Ramos Chen)
[S1] [S2] Understanding Batch Normalization, Bjorck, Gomes, Selman, Weinberger; 2018 (S1: Hung-Ting Chen, S2: Daniel Almeraz, C: Tongrui Li, R: Zhou Fang)
[S1] [S2] Rethinking "Batch" in BatchNorm, Wu, Johnson; 2021 (S1: Cheng-Chun Hsu, S2: Sai Kiran Maddela, R: Yeming Wen)
|
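The normalization papers in this unit differ mainly in which axes the statistics are computed over. A minimal sketch, assuming PyTorch, with the learned scale/shift omitted for brevity; tensor sizes are illustrative.

```python
import torch

x = torch.randn(8, 16, 4, 4)  # (batch N, channels C, height H, width W)

def normalize(t, dims):
    mu = t.mean(dim=dims, keepdim=True)
    var = t.var(dim=dims, keepdim=True, unbiased=False)
    return (t - mu) / torch.sqrt(var + 1e-5)

bn = normalize(x, (0, 2, 3))   # BatchNorm: per channel, statistics over the batch
ln = normalize(x, (1, 2, 3))   # LayerNorm: per sample, over all of its features
inorm = normalize(x, (2, 3))   # InstanceNorm: per sample and per channel
# GroupNorm with 4 groups: reshape the 16 channels into (groups, C // groups)
gn = normalize(x.view(8, 4, 4, 4, 4), (2, 3, 4)).view_as(x)
```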
Sep 28 | Sequence models [C] |
[S1] [S2] Sequence to Sequence Learning with Neural Networks, Sutskever, Vinyals, Le; 2014 (S1: Sai Kiran Maddela, S2: Ian Trowbridge, C: Srinath Tankasala, R: Liyan Chen)
[S1] [S2] Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling, Chung, Gulcehre, Cho, Bengio; 2014 (S1: Christopher Hahn, S2: Reid Ling Tong Li, C: Jay Liao, R: Tongrui Li)
[S1] [S2] Neural Machine Translation by Jointly Learning to Align and Translate, Bahdanau, Cho, Bengio; 2014 (S1: Shivang Singh, S2: Ayush Chauhan, C: Elias Lampietti, R: Serdjan Rolovic)
[S1] [S2] Attention Is All You Need, Vaswani, Shazeer, Parmar, Uszkoreit, Jones, Gomez, Kaiser, Polosukhin; 2017 (S1: Samantha Hay, S2: Ojas Patel, C: Zhou Fang, R: Ishank Arora)
[S1] [S2] End-To-End Memory Networks, Sukhbaatar, Szlam, Weston, Fergus; 2015 (S1: Tarannum Khan, S2: Marlan McInnes-Taylor, R: Zayne Sprague)
|
Sep 30 | Sequence models [C] |
[S1] [S2] Attention is Not All You Need: Pure Attention Loses Rank Doubly Exponentially with Depth, Dong, Cordonnier, Loukas; 2021 (S1: Nilesh Gupta, S2: Ishan Shah, C: Jordi Ramos Chen, R: Kelsey Ball)
[S1] BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, Devlin, Chang, Lee, Toutanova; 2018 (S1: Daniel Almeraz, C: Jose Chavez, R: Joshua Papermaster)
[S2] A Primer in BERTology: What we know about how BERT works, Rogers, Kovaleva, Rumshisky; 2020 (S2: Hung-Ting Chen, C: Atreya Dey, R: Shivi Agarwal)
[S1] [S2] Improving Language Understanding by Generative Pre-Training, Radford, Narasimhan, Salimans, Sutskever; 2018 (S1: Kiran Raja, S2: Cheng-Chun Hsu, C: Yeming Wen, R: Marco Bueso)
[S1] [S2] Language Models are Few-Shot Learners, Brown, Mann, Ryder, Subbiah, Kaplan, Dhariwal, Neelakantan, Shyam, Sastry, Askell, Agarwal, Herbert-Voss, Krueger, Henighan, Child, Ramesh, Ziegler, Wu, Winter, Hesse, Chen, Sigler, Litwin, Gray, Chess, Clark, Berner, McCandlish, Radford, Sutskever, Amodei; 2020 (S1: Jay Whang, S2: Matthew Kelleher, C: Abayomi Adekanmbi)
|
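At the core of several of these papers is the scaled dot-product attention of Vaswani et al. (2017). A minimal single-head, unmasked sketch in PyTorch; the token counts and dimension are toy sizes.

```python
import math
import torch

def attention(q, k, v):
    # softmax(Q K^T / sqrt(d_k)) V
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    return torch.softmax(scores, dim=-1) @ v

q = torch.randn(10, 64)   # 10 query tokens of dimension 64
k = torch.randn(20, 64)   # 20 key/value tokens
v = torch.randn(20, 64)
print(attention(q, k, v).shape)  # torch.Size([10, 64])
```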
Oct 05 | Efficient Transformers [C] |
[S1] [S2] Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context, Dai, Yang, Yang, Carbonell, Le, Salakhutdinov; 2019 (S1: Serdjan Rolovic, S2: Elias Lampietti, C: Matthew Kelleher, R: Samantha Hay)
[S1] [S2] Generating Long Sequences with Sparse Transformers, Child, Gray, Radford, Sutskever; 2019 (S1: Ishank Arora, S2: Yeming Wen, C: Ayush Chauhan, R: Christopher Hahn)
[S1] [S2] Compressive Transformers for Long-Range Sequence Modelling, Rae, Potapenko, Jayakumar, Lillicrap; 2019 (S1: Zayne Sprague, S2: Zhou Fang, C: Reid Ling Tong Li, R: Jay Whang)
[S1] Reformer: The Efficient Transformer, Kitaev, Kaiser, Levskaya; 2020 (S1: Liyan Chen, C: Marlan McInnes-Taylor, R: Nilesh Gupta)
[S1] [S2] Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention, Katharopoulos, Vyas, Pappas, Fleuret; 2020 (S1: Kelsey Ball, S2: Srinath Tankasala, C: Hung-Ting Chen, R: Sai Kiran Maddela)
|
Oct 07 | Efficient Transformers [C] |
[S1] [S2] Linformer: Self-Attention with Linear Complexity, Wang, Li, Khabsa, Fang, Ma; 2020 (S1: Tongrui Li, S2: Jordi Ramos Chen)
[S1] [S2] Rethinking Attention with Performers, Choromanski, Likhosherstov, Dohan, Song, Gane, Sarlos, Hawkins, Davis, Mohiuddin, Kaiser, Belanger, Colwell, Weller; 2020 (S1: Joshua Papermaster, S2: Abayomi Adekanmbi, C: Ian Trowbridge, R: Tarannum Khan)
[S2] Longformer: The Long-Document Transformer, Beltagy, Peters, Cohan; 2020 (S2: Jay Liao, C: Ishan Shah, R: Kiran Raja)
[S1] [S2] Big Bird: Transformers for Longer Sequences, Zaheer, Guruganesh, Dubey, Ainslie, Alberti, Ontanon, Pham, Ravula, Wang, Yang, Ahmed; 2020 (S1: Shivi Agarwal, S2: Atreya Dey, C: Ojas Patel, R: Daniel Almeraz)
[S1] [S2] LambdaNetworks: Modeling Long-Range Interactions Without Attention, Bello; 2021 (S1: Marco Bueso, S2: Jose Chavez, C: Cheng-Chun Hsu, R: Shivang Singh)
|
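As one concrete example of the efficiency tricks in this unit, here is a sketch of the linear attention of Katharopoulos et al. (2020), which replaces the softmax with the kernel feature map elu(x) + 1 so the cost grows linearly in sequence length. This is the non-causal variant, PyTorch assumed, toy sizes.

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v):
    # Feature map phi(x) = elu(x) + 1 (strictly positive), replacing the softmax
    q, k = F.elu(q) + 1, F.elu(k) + 1
    kv = k.transpose(-2, -1) @ v   # (d, d_v) key/value summary, independent of length
    z = q @ k.sum(dim=-2, keepdim=True).transpose(-2, -1)  # per-query normalizer
    return (q @ kv) / z

q, k, v = torch.randn(10, 64), torch.randn(20, 64), torch.randn(20, 64)
print(linear_attention(q, k, v).shape)  # torch.Size([10, 64])
```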
Oct 12 | Vision Transformers [C] |
[S1] [S2] An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, Dosovitskiy, Beyer, Kolesnikov, Weissenborn, Zhai, Unterthiner, Dehghani, Minderer, Heigold, Gelly, Uszkoreit, Houlsby; 2020 (S1: Elias Lampietti, S2: Zayne Sprague, C: Samantha Hay, R: Ojas Patel)
[S1] [S2] Training data-efficient image transformers & distillation through attention, Touvron, Cord, Douze, Massa, Sablayrolles, Jégou; 2020 (S1: Zhou Fang, S2: Kelsey Ball, C: Kiran Raja, R: Reid Ling Tong Li)
[S1] [S2] BEiT: BERT Pre-Training of Image Transformers, Bao, Dong, Wei; 2021 (S1: Jay Liao, S2: Liyan Chen, C: Shivang Singh, R: Ian Trowbridge)
[S1] [S2] LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference, Graham, El-Nouby, Touvron, Stock, Joulin, Jégou, Douze; 2021 (S1: Atreya Dey, S2: Tongrui Li, C: Tarannum Khan)
[S1] [S2] Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions, Wang, Xie, Li, Fan, Song, Liang, Lu, Luo, Shao; 2021 (S1: Jose Chavez, S2: Marco Bueso, R: Marlan McInnes-Taylor)
|
Oct 14 | Vision Transformers [C] |
[S2] Swin Transformer: Hierarchical Vision Transformer using Shifted Windows, Liu, Lin, Cao, Hu, Wei, Zhang, Lin, Guo; 2021 (S1: Yeming Wen, C: Daniel Almeraz, R: Ayush Chauhan)
[S1] [S2] Transformer in Transformer, Han, Xiao, Wu, Guo, Xu, Wang; 2021 (S1: Srinath Tankasala, S2: Shivi Agarwal, C: Nilesh Gupta, R: Matthew Kelleher)
[S1] [S2] Perceiver: General Perception with Iterative Attention, Jaegle, Gimeno, Brock, Zisserman, Vinyals, Carreira; 2021 (S1: Jordi Ramos Chen, S2: Ishank Arora, C: Jay Whang, R: Cheng-Chun Hsu)
[S1] [S2] Perceiver IO: A General Architecture for Structured Inputs & Outputs, Jaegle, Borgeaud, Alayrac, Doersch, Ionescu, Ding, Koppula, Zoran, Brock, Shelhamer, Hénaff, Botvinick, Zisserman, Vinyals, Carreira; 2021 (S1: Abayomi Adekanmbi, S2: Joshua Papermaster, C: Sai Kiran Maddela, R: Hung-Ting Chen)
[S2] MLP-Mixer: An all-MLP Architecture for Vision, Tolstikhin, Houlsby, Kolesnikov, Beyer, Zhai, Unterthiner, Yung, Steiner, Keysers, Uszkoreit, Lucic, Dosovitskiy; 2021 (S2: Serdjan Rolovic, C: Christopher Hahn, R: Ishan Shah)
|
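The step shared by most papers in this unit is turning an image into a token sequence. Here is a sketch of the ViT patch embedding (Dosovitskiy et al., 2020), assuming PyTorch: a single strided convolution is equivalent to slicing 16x16 patches and linearly projecting each one. Sizes follow ViT-Base; the class token and position embeddings are omitted.

```python
import torch
import torch.nn as nn

# One 16x16-stride-16 conv = extract non-overlapping patches + linear projection
patch_embed = nn.Conv2d(3, 768, kernel_size=16, stride=16)

img = torch.randn(1, 3, 224, 224)
tokens = patch_embed(img).flatten(2).transpose(1, 2)
print(tokens.shape)  # torch.Size([1, 196, 768]): 14*14 patches, embedding dim 768
```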
Oct 19 | Implicit functions [C] |
[S1] [S2] DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation, Park, Florence, Straub, Newcombe, Lovegrove; 2019 (S1: Ojas Patel, S2: Tarannum Khan, C: Marco Bueso)
[S1] Occupancy Networks: Learning 3D Reconstruction in Function Space, Mescheder, Oechsle, Niemeyer, Nowozin, Geiger; 2018 (S1: Ishan Shah, C: Kelsey Ball, R: Elias Lampietti)
[S1] [S2] Implicit Geometric Regularization for Learning Shapes, Gropp, Yariv, Haim, Atzmon, Lipman; 2020 (S1: Marlan McInnes-Taylor, S2: Samantha Hay, C: Zayne Sprague, R: Jay Liao)
[S1] [S2] Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains, Tancik, Srinivasan, Mildenhall, Fridovich-Keil, Raghavan, Singhal, Ramamoorthi, Barron, Ng; 2020 (S1: Ayush Chauhan, S2: Jay Whang, C: Ishank Arora, R: Atreya Dey)
[S1] [S2] Implicit Neural Representations with Periodic Activation Functions, Sitzmann, Martel, Bergman, Lindell, Wetzstein; 2020 (S1: Reid Ling Tong Li, S2: Kiran Raja, C: Liyan Chen, R: Abayomi Adekanmbi)
|
Oct 21 | Implicit functions [C] |
[S1] [S2] Learning Continuous Image Representation with Local Implicit Image Function, Chen, Liu, Wang; 2020 (S1: Ian Trowbridge, S2: Shivang Singh, C: Shivi Agarwal, R: Srinath Tankasala)
[S1] [S2] NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis, Mildenhall, Srinivasan, Tancik, Barron, Ramamoorthi, Ng; 2020 (S1: Matthew Kelleher, S2: Christopher Hahn, C: Serdjan Rolovic, R: Jose Chavez)
[S2] NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections, Martin-Brualla, Radwan, Sajjadi, Barron, Dosovitskiy, Duckworth; 2020 (S2: Nilesh Gupta, C: Joshua Papermaster, R: Jordi Ramos Chen)
[S1] [S2] Baking Neural Radiance Fields for Real-Time View Synthesis, Hedman, Srinivasan, Mildenhall, Barron, Debevec; 2021 (S1: Hung-Ting Chen, S2: Daniel Almeraz, C: Tongrui Li, R: Zhou Fang)
[S1] [S2] GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields, Niemeyer, Geiger; 2020 (S1: Cheng-Chun Hsu, S2: Sai Kiran Maddela, R: Yeming Wen)
|
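A recurring ingredient across this unit is the positional encoding of coordinates. Here is a sketch of the random Fourier feature encoding from Tancik et al. (2020), which lifts low-dimensional coordinates to sin/cos features so a plain MLP can fit high-frequency signals. PyTorch is assumed; the projection scale (10.0) and feature count are illustrative hyperparameters.

```python
import math
import torch

B = torch.randn(2, 128) * 10.0          # fixed random projection for (x, y) coordinates

def fourier_features(coords):           # coords: (N, 2), e.g. pixel positions in [0, 1]
    proj = 2.0 * math.pi * coords @ B   # (N, 128)
    return torch.cat([torch.sin(proj), torch.cos(proj)], dim=-1)  # (N, 256)

coords = torch.rand(1024, 2)
print(fourier_features(coords).shape)   # torch.Size([1024, 256])
```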
Oct 26 | 2D recognition [C] |
[S1] [S2] Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, Ren, He, Girshick, Sun; 2015 (S1: Sai Kiran Maddela, S2: Ian Trowbridge, C: Srinath Tankasala, R: Liyan Chen)
[S1] [S2] You Only Look Once: Unified, Real-Time Object Detection, Redmon, Divvala, Girshick, Farhadi; 2015 (S1: Christopher Hahn, S2: Reid Ling Tong Li, C: Jay Liao, R: Tongrui Li)
[S1] [S2] Focal Loss for Dense Object Detection, Lin, Goyal, Girshick, He, Dollár; 2017 (S1: Shivang Singh, S2: Ayush Chauhan, C: Elias Lampietti, R: Serdjan Rolovic)
[S1] [S2] Mask R-CNN, He, Gkioxari, Dollár, Girshick; 2017 (S1: Samantha Hay, S2: Ojas Patel, C: Zhou Fang, R: Ishank Arora)
[S1] [S2] Cascade R-CNN: Delving into High Quality Object Detection, Cai, Vasconcelos; 2017 (S1: Tarannum Khan, S2: Marlan McInnes-Taylor, R: Zayne Sprague)
|
Oct 28 | 2D recognition [C] |
[S1] [S2] Deformable Convolutional Networks, Dai, Qi, Xiong, Li, Zhang, Hu, Wei; 2017 (S1: Nilesh Gupta, S2: Ishan Shah, C: Jordi Ramos Chen, R: Kelsey Ball)
[S1] CornerNet: Detecting Objects as Paired Keypoints, Law, Deng; 2018 (S1: Daniel Almeraz, C: Jose Chavez, R: Joshua Papermaster)
[S2] Objects as Points, Zhou, Wang, Krähenbühl; 2019 (S2: Hung-Ting Chen, C: Atreya Dey, R: Shivi Agarwal)
[S1] [S2] End-to-End Object Detection with Transformers, Carion, Massa, Synnaeve, Usunier, Kirillov, Zagoruyko; 2020 (S1: Kiran Raja, S2: Cheng-Chun Hsu, C: Yeming Wen, R: Marco Bueso)
[S1] [S2] Deformable DETR: Deformable Transformers for End-to-End Object Detection, Zhu, Su, Lu, Li, Wang, Dai; 2020 (S1: Jay Whang, S2: Matthew Kelleher, C: Abayomi Adekanmbi)
|
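One self-contained piece of this unit is the focal loss of Lin et al. (2017), which tames the extreme foreground/background imbalance of dense detectors. A PyTorch sketch; note the paper normalizes by the number of positive anchors, which is simplified to a plain mean here, and the toy inputs are arbitrary.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    p = torch.sigmoid(logits)
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = p * targets + (1 - p) * (1 - targets)              # prob. of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()        # down-weights easy examples

logits = torch.randn(4, 100)                   # e.g. 100 anchor predictions per image
targets = (torch.rand(4, 100) < 0.05).float()  # sparse positives
print(focal_loss(logits, targets))
```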
Nov 02 | 3D recognition [C] |
[S1] [S2] PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation, Qi, Su, Mo, Guibas; 2016 (S1: Serdjan Rolovic, S2: Elias Lampietti, C: Matthew Kelleher, R: Samantha Hay)
[S1] [S2] PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space, Qi, Yi, Su, Guibas; 2017 (S1: Ishank Arora, S2: Yeming Wen, C: Ayush Chauhan, R: Christopher Hahn)
[S1] [S2] Dynamic Graph CNN for Learning on Point Clouds, Wang, Sun, Liu, Sarma, Bronstein, Solomon; 2018 (S1: Zayne Sprague, S2: Zhou Fang, C: Reid Ling Tong Li, R: Jay Whang)
[S1] PointCNN: Convolution On $\mathcal{X}$-Transformed Points, Li, Bu, Sun, Wu, Di, Chen; 2018 (S1: Liyan Chen, C: Marlan McInnes-Taylor, R: Nilesh Gupta)
[S1] [S2] Point Transformer, Zhao, Jiang, Jia, Torr, Koltun; 2020 (S1: Kelsey Ball, S2: Srinath Tankasala, C: Hung-Ting Chen, R: Sai Kiran Maddela)
|
Nov 04 | 3D recognition [C] |
[S1] [S2] VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection, Zhou, Tuzel; 2017 (S1: Tongrui Li, S2: Jordi Ramos Chen)
[S1] [S2] PointPillars: Fast Encoders for Object Detection from Point Clouds, Lang, Vora, Caesar, Zhou, Yang, Beijbom; 2018 (S1: Joshua Papermaster, S2: Abayomi Adekanmbi, C: Ian Trowbridge, R: Tarannum Khan)
[S2] PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud, Shi, Wang, Li; 2018 (S2: Jay Liao, C: Ishan Shah, R: Kiran Raja)
[S1] [S2] Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving, Wang, Chao, Garg, Hariharan, Campbell, Weinberger; 2018 (S1: Shivi Agarwal, S2: Atreya Dey, C: Ojas Patel, R: Daniel Almeraz)
[S1] [S2] Center-based 3D Object Detection and Tracking, Yin, Zhou, Krähenbühl; 2020 (S1: Marco Bueso, S2: Jose Chavez, C: Cheng-Chun Hsu, R: Shivang Singh)
|
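The common core of the point-cloud papers is a permutation-invariant encoder. A toy PyTorch sketch of the PointNet idea (Qi et al., 2016): a shared per-point MLP followed by a symmetric max-pool; the layer sizes and class count are illustrative, and the paper's input/feature transform networks are omitted.

```python
import torch
import torch.nn as nn

class TinyPointNet(nn.Module):
    """Shared per-point MLP + symmetric max-pool => order-invariant global feature."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 256), nn.ReLU(),
        )
        self.head = nn.Linear(256, num_classes)

    def forward(self, pts):                # pts: (batch, num_points, 3)
        feat = self.mlp(pts)               # per-point features, (B, N, 256)
        pooled = feat.max(dim=1).values    # max over points: permutation-invariant
        return self.head(pooled)

pts = torch.randn(4, 1024, 3)
print(TinyPointNet()(pts).shape)  # torch.Size([4, 10])
```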
Nov 09 | Open world perception [C] |
[S1] [S2] Momentum Contrast for Unsupervised Visual Representation Learning, He, Fan, Wu, Xie, Girshick; 2019 (S1: Elias Lampietti, S2: Zayne Sprague, C: Samantha Hay, R: Ojas Patel)
[S1] [S2] A Simple Framework for Contrastive Learning of Visual Representations, Chen, Kornblith, Norouzi, Hinton; 2020 (S1: Zhou Fang, S2: Kiran Raja, C: Kelsey Ball, R: Reid Ling Tong Li)
[S1] VirTex: Learning Visual Representations from Textual Annotations, Desai, Johnson; 2020 (S1: Jay Liao, S2: Liyan Chen, C: Shivang Singh, R: Ian Trowbridge)
[S1] [S2] Contrastive Learning of Medical Visual Representations from Paired Images and Text, Zhang, Jiang, Miura, Manning, Langlotz; 2020 (S1: Atreya Dey, S2: Tongrui Li, C: Marco Bueso)
[S1] [S2] Learning Transferable Visual Models From Natural Language Supervision, Radford, Kim, Hallacy, Ramesh, Goh, Agarwal, Sastry, Askell, Mishkin, Clark, Krueger, Sutskever; 2021 (S1: Jose Chavez, S2: Tarannum Khan, R: Marlan McInnes-Taylor)
|
Nov 11 | Open world perception [C] |
[S1] Towards Open Set Deep Networks, Bendale, Boult; 2015 (S1: Yeming Wen, C: Daniel Almeraz, R: Ayush Chauhan)
[S1] [S2] Large-Scale Long-Tailed Recognition in an Open World, Liu, Miao, Zhan, Wang, Gong, Yu; 2019 (S1: Srinath Tankasala, S2: Shivi Agarwal, C: Nilesh Gupta, R: Matthew Kelleher)
[S1] [S2] Class-Balanced Loss Based on Effective Number of Samples, Cui, Jia, Lin, Song, Belongie; 2019 (S1: Jordi Ramos Chen, S2: Ishank Arora, C: Jay Whang, R: Cheng-Chun Hsu)
[S1] [S2] Decoupling Representation and Classifier for Long-Tailed Recognition, Kang, Xie, Rohrbach, Yan, Gordo, Feng, Kalantidis; 2019 (S1: Abayomi Adekanmbi, S2: Joshua Papermaster, C: Sai Kiran Maddela, R: Hung-Ting Chen)
[S2] Overcoming Classifier Imbalance for Long-tail Object Detection with Balanced Group Softmax, Li, Wang, Kang, Tang, Wang, Li, Feng; 2020 (S2: Serdjan Rolovic, C: Christopher Hahn, R: Ishan Shah)
|
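The contrastive papers in this unit (MoCo, SimCLR, CLIP) all build on an InfoNCE-style objective. A minimal PyTorch sketch, assuming the two batches of embeddings are aligned so that (z1[i], z2[i]) are positive pairs and everything else is a negative; the temperature and sizes are illustrative, and the one-directional loss here omits the symmetric term CLIP uses.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.07):
    # Cosine similarities between all pairs; the diagonal holds the positives
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / tau
    labels = torch.arange(z1.size(0))  # positive for row i is column i
    return F.cross_entropy(logits, labels)

z1, z2 = torch.randn(32, 128), torch.randn(32, 128)  # two views (or two modalities)
print(info_nce(z1, z2))
```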
Nov 16 | Temporal reasoning and Video [no code (yet)] |
[S1] Two-Stream Convolutional Networks for Action Recognition in Videos, Simonyan, Zisserman; 2014
[S1] Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset, Carreira, Zisserman; 2017
[S1] SlowFast Networks for Video Recognition, Feichtenhofer, Fan, Malik, He; 2018
[S1] Is Space-Time Attention All You Need for Video Understanding?, Bertasius, Wang, Torresani; 2021
[S1] Multiscale Vision Transformers, Fan, Xiong, Mangalam, Li, Yan, Malik, Feichtenhofer; 2021
|
Nov 18 | Temporal reasoning and Video [no code (yet)] |
[S1] Online Model Distillation for Efficient Video Inference, Mullapudi, Chen, Zhang, Ramanan, Fatahalian; 2018
[S1] Long-Term Feature Banks for Detailed Video Understanding, Wu, Feichtenhofer, Fan, He, Krähenbühl, Girshick; 2018
[S1] Long Short-Term Transformer for Online Action Detection, Xu, Xiong, Chen, Li, Xia, Tu, Soatto; 2021
[S1] Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling, Lei, Li, Zhou, Gan, Berg, Bansal, Liu; 2021
[S1] CLEVRER: CoLlision Events for Video REpresentation and Reasoning, Yi, Gan, Li, Kohli, Wu, Torralba, Tenenbaum; 2019
|
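A toy sketch of the two-pathway idea from SlowFast (Feichtenhofer et al., 2018): one pathway sees few frames with many channels, the other all frames with few channels. Everything here (channel counts, strides, the missing lateral fusion and prediction head) is a simplification for illustration, PyTorch assumed.

```python
import torch
import torch.nn as nn

class ToySlowFast(nn.Module):
    def __init__(self):
        super().__init__()
        # Slow pathway: high channel capacity, no temporal kernel
        self.slow = nn.Conv3d(3, 64, kernel_size=(1, 7, 7), stride=(1, 2, 2), padding=(0, 3, 3))
        # Fast pathway: few channels, but a temporal kernel over every frame
        self.fast = nn.Conv3d(3, 8, kernel_size=(5, 7, 7), stride=(1, 2, 2), padding=(2, 3, 3))

    def forward(self, clip):                # clip: (batch, rgb, frames, height, width)
        slow = self.slow(clip[:, :, ::4])   # slow pathway sees every 4th frame
        fast = self.fast(clip)              # fast pathway sees all frames
        return slow, fast

clip = torch.randn(1, 3, 16, 64, 64)
slow, fast = ToySlowFast()(clip)
print(slow.shape, fast.shape)  # (1, 64, 4, 32, 32) and (1, 8, 16, 32, 32)
```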
Nov 23 | Final Project Q/A | |
Nov 25 | No class - Thanksgiving | |
Nov 30 | Final Project Presentations | |
Dec 02 | Final Project Presentations | |
Syllabus subject to change.