- Simulation Intelligence: Towards A New Generation Of Scientific Methods: https://arxiv.org/pdf/2112.03235v1.pdf
- Information is Power: Intrinsic Control via Information Capture: https://arxiv.org/pdf/2112.03899v1.pdf
- GLaM: Efficient Scaling of Language Models with Mixture-of-Experts: https://arxiv.org/pdf/2112.06905v1.pdf
- Creating Multimodal Interactive Agents with Imitation and Self-Supervised Learning: https://arxiv.org/pdf/2112.03763v1.pdf
- Efficient Geometry-aware 3D Generative Adversarial Networks: https://arxiv.org/pdf/2112.07945v1.pdf
- GAN-Supervised Dense Visual Alignment: https://arxiv.org/pdf/2112.05143v1.pdf
- BEVT: BERT Pretraining of Video Transformers: https://arxiv.org/pdf/2112.01529v1.pdf
- Optimal Latent Space Forecasting For Large Collections Of Short Time Series Using Temporal Matrix Factorization: https://arxiv.org/pdf/2112.08052v1.pdf
- Uni-Perceiver: Pre-training Unified Architecture for Generic Perception for Zero-shot and Few-shot Tasks: https://arxiv.org/pdf/2112.01522v1.pdf
- Measure and Improve Robustness in NLP Models: A Survey: https://arxiv.org/pdf/2112.08313v1.pdf
- Self-attention Does Not Need O(n²) Memory: https://arxiv.org/pdf/2112.05682v2.pdf
- Exploring the Equivalence of Siamese Self-Supervised Learning via A Unified Gradient Framework: https://arxiv.org/pdf/2112.05141v1.pdf
- Plenoxels: Radiance Fields without Neural Networks: https://arxiv.org/pdf/2112.05131v1.pdf
- Player of Games: https://arxiv.org/pdf/2112.03178v1.pdf
- Improved Multiscale Vision Transformers for Classification and Detection: https://arxiv.org/pdf/2112.01526v1.pdf
- Training Robust Zero-Shot Voice Conversion Models With Self-Supervised Features: https://arxiv.org/pdf/2112.04424v1.pdf
- Grounded Language-Image Pre-training: https://arxiv.org/pdf/2112.03857v1.pdf
- GenIE: Generative Information Extraction: https://arxiv.org/pdf/2112.08340v1.pdf
- Systematic Generalization with Edge Transformers: https://arxiv.org/pdf/2112.00578v1.pdf
- PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers: https://arxiv.org/pdf/2111.12710v1.pdf
- BACON: Band-limited Coordinate Networks for Multiscale Scene Representation: https://arxiv.org/pdf/2112.04645v1.pdf
- Improving language models by retrieving from trillions of tokens: https://arxiv.org/pdf/2112.04426v1.pdf
- CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields: https://arxiv.org/pdf/2112.05139v1.pdf
- Neuro-Symbolic Inductive Logic Programming with Logical Neural Networks: https://arxiv.org/pdf/2112.03324v1.pdf
- Multi-Scale Feature Learning Dynamics: Insights For Double Descent: https://arxiv.org/pdf/2112.03215v1.pdf
- Multipath++: Efficient Information Fusion And Trajectory Aggregation For Behavior Prediction: https://arxiv.org/pdf/2111.14973v2.pdf
- RegNeRF: Regularizing Neural Radiance Fields for View Synthesis from Sparse Inputs: https://arxiv.org/pdf/2112.00724v1.pdf
- 3D Question Answering: https://arxiv.org/pdf/2112.08359v1.pdf
- Causal-based Time Series Domain Generalization for Vehicle Intention Prediction: https://arxiv.org/pdf/2112.02093v1.pdf
- DANETs: Deep Abstract Networks for Tabular Data Classification and Regression: https://arxiv.org/pdf/2112.02962v1.pdf
- Ref-NeRF: Structured View-Dependent Appearance for Neural Radiance Fields: https://arxiv.org/pdf/2112.03907v1.pdf
- FLAVA: A Foundational Language And Vision Alignment Model: https://arxiv.org/pdf/2112.04482v1.pdf
- CaSP: Class-agnostic Semi-Supervised Pretraining for Detection & Segmentation: https://arxiv.org/pdf/2112.04966v1.pdf
- FaceFormer: Speech-Driven 3D Facial Animation with Transformers: https://arxiv.org/pdf/2112.05329v1.pdf
- Uncertainty Estimation via Response Scaling for Pseudo-mask Noise Mitigation in Weakly-supervised Semantic Segmentation: https://arxiv.org/pdf/2112.07431v1.pdf
- Coupling Vision and Proprioception for Navigation of Legged Robots: https://arxiv.org/pdf/2112.02094v1.pdf
- InfoLM: A New Metric to Evaluate Summarization & Data2Text Generation: https://arxiv.org/pdf/2112.01589v2.pdf
- Robustness in Deep Learning for Computer Vision: Mind the gap?: https://arxiv.org/pdf/2112.00639v1.pdf
- Neural Emotion Director: Speech-preserving semantic control of facial expressions in “in-the-wild” videos: https://arxiv.org/pdf/2112.00585v1.pdf