1 | Towards a Unified View of Parameter-Efficient Transfer Learning | 8.00 | 8.67 | 0.67 | | Spotlight |

2 | Self-Supervision Enhanced Feature Selection with Correlated Gates | 8.00 | 8.67 | 0.67 | | Spotlight |

3 | What Happens after SGD Reaches Zero Loss? --A Mathematical Framework | 8.00 | 8.50 | 0.50 | | Spotlight |

4 | Score-Based Generative Modeling with Critically-Damped Langevin Diffusion | 8.00 | 8.50 | 0.50 | | Spotlight |

5 | Sample Efficient Deep Reinforcement Learning via Uncertainty Estimation | 6.00 | 8.50 | 2.50 | | Spotlight |

6 | Scaling Laws for Neural Machine Translation | 7.50 | 8.50 | 1.00 | | Spotlight |

7 | Finding Biological Plausibility for Adversarially Robust Features via Metameric Tasks | 6.00 | 8.00 | 2.00 | | Spotlight |

8 | EViT: Expediting Vision Transformers via Token Reorganizations | 7.00 | 8.00 | 1.00 | | Spotlight |

9 | Programmatic Reinforcement Learning without Oracles | 6.33 | 8.00 | 1.67 | | Spotlight |

10 | AdaRL: What, Where, and How to Adapt in Transfer Reinforcement Learning | 7.50 | 8.00 | 0.50 | | Spotlight |

11 | Iterative Refinement Graph Neural Network for Antibody Sequence-Structure Co-design | 8.00 | 8.00 | 0.00 | | Spotlight |

12 | Assessing Generalization of SGD via Disagreement | 8.00 | 8.00 | 0.00 | | Spotlight |

13 | Independent SE(3)-Equivariant Models for End-to-End Rigid Protein Docking | 6.00 | 8.00 | 2.00 | | Spotlight |

14 | Spike-inspired rank coding for fast and accurate recurrent neural networks | 6.33 | 8.00 | 1.67 | | Spotlight |

15 | MT3: Multi-Task Multitrack Music Transcription | 8.00 | 8.00 | 0.00 | | Spotlight |

16 | The Inductive Bias of In-Context Learning: Rethinking Pretraining Example Design | 6.60 | 8.00 | 1.40 | 8, 8, 5, 6, 6 | 8, 8, 8, 8, 8 | Spotlight |

17 | Possibility Before Utility: Learning And Using Hierarchical Affordances | 8.00 | 8.00 | 0.00 | | Spotlight |

18 | Path Auxiliary Proposal for MCMC in Discrete Space | 5.25 | 8.00 | 2.75 | | Spotlight |

19 | TAMP-S2GCNets: Coupling Time-Aware Multipersistence Knowledge Representation with Spatio-Supra Graph Convolutional Networks for Time-Series Forecasting | 8.00 | 8.00 | 0.00 | | Spotlight |

20 | Understanding Latent Correlation-Based Multiview Learning and Self-Supervision: An Identifiability Perspective | 6.67 | 8.00 | 1.33 | | Spotlight |

21 | Scalable Sampling for Nonsymmetric Determinantal Point Processes | 7.50 | 8.00 | 0.50 | | Spotlight |

22 | Sampling with Mirrored Stein Operators | 8.00 | 8.00 | 0.00 | | Spotlight |

23 | Reinforcement Learning under a Multi-agent Predictive State Representation Model: Method and Theory | 8.00 | 8.00 | 0.00 | | Spotlight |

24 | Learning transferable motor skills with hierarchical latent mixture policies | 6.50 | 8.00 | 1.50 | | Spotlight |

25 | SphereFace2: Binary Classification is All You Need for Deep Face Recognition | 7.00 | 8.00 | 1.00 | | Spotlight |

26 | A General Analysis of Example-Selection for Stochastic Gradient Descent | 8.00 | 8.00 | 0.00 | | Spotlight |

27 | Explanations of Black-Box Models based on Directional Feature Interactions | 6.50 | 8.00 | 1.50 | | Spotlight |

28 | EntQA: Entity Linking as Question Answering | 8.00 | 8.00 | 0.00 | | Spotlight |

29 | Byzantine-Robust Learning on Heterogeneous Datasets via Bucketing | 7.00 | 8.00 | 1.00 | | Spotlight |

30 | Wiring Up Vision: Minimizing Supervised Synaptic Updates Needed to Produce a Primate Ventral Stream | 6.75 | 8.00 | 1.25 | | Spotlight |

31 | RelaxLoss: Defending Membership Inference Attacks without Losing Utility | 7.33 | 8.00 | 0.67 | | Spotlight |

32 | Ab-Initio Potential Energy Surfaces by Pairing GNNs with Neural Wave Functions | 7.33 | 8.00 | 0.67 | | Spotlight |

33 | Tackling the Generative Learning Trilemma with Denoising Diffusion GANs | 7.50 | 8.00 | 0.50 | | Spotlight |

34 | Universal Approximation Under Constraints is Possible with Transformers | 7.00 | 8.00 | 1.00 | | Spotlight |

35 | Learning Strides in Convolutional Neural Networks | 6.75 | 8.00 | 1.25 | | Spotlight |

36 | Progressive Distillation for Fast Sampling of Diffusion Models | 7.00 | 8.00 | 1.00 | | Spotlight |

37 | Probabilistic Implicit Scene Completion | 6.80 | 8.00 | 1.20 | 6, 6, 8, 8, 6 | 8, 8, 8, 8, 8 | Spotlight |

38 | Perceiver IO: A General Architecture for Structured Inputs & Outputs | 7.50 | 8.00 | 0.50 | | Spotlight |

39 | How to Robustify Black-Box ML Models? A Zeroth-Order Optimization Perspective | 6.50 | 8.00 | 1.50 | | Spotlight |

40 | Emergent Communication at Scale | 8.00 | 8.00 | 0.00 | | Spotlight |

41 | RotoGrad: Gradient Homogenization in Multitask Learning | 7.50 | 8.00 | 0.50 | | Spotlight |

42 | Towards Deployment-Efficient Reinforcement Learning: Lower Bound and Optimality | 8.00 | 8.00 | 0.00 | | Spotlight |

43 | Meta Discovery: Learning to Discover Novel Classes given Very Limited Data | 7.50 | 8.00 | 0.50 | | Spotlight |

44 | GNN-LM: Language Modeling based on Global Contexts via GNN | 7.67 | 8.00 | 0.33 | | Spotlight |

45 | On the Connection between Local Attention and Dynamic Depth-wise Convolution | 7.33 | 8.00 | 0.67 | | Spotlight |

46 | SHINE: SHaring the INverse Estimate from the forward pass for bi-level optimization and implicit models | 6.00 | 8.00 | 2.00 | | Spotlight |

47 | On the Optimal Memorization Power of ReLU Neural Networks | 8.00 | 8.00 | 0.00 | | Spotlight |

48 | Task Relatedness-Based Generalization Bounds for Meta Learning | 7.50 | 8.00 | 0.50 | | Spotlight |

49 | Understanding Domain Randomization for Sim-to-real Transfer | 7.25 | 7.75 | 0.50 | | Spotlight |

50 | Planning in Stochastic Environments with a Learned Model | 7.00 | 7.75 | 0.75 | | Spotlight |

51 | Source-Free Adaptation to Measurement Shift via Bottom-Up Feature Restoration | 6.60 | 7.60 | 1.00 | 8, 6, 5, 6, 8 | 8, 8, 6, 8, 8 | Spotlight |

52 | Label Encoding for Regression Networks | 5.50 | 7.50 | 2.00 | | Spotlight |

53 | On the Importance of Firth Bias Reduction in Few-Shot Classification | 7.00 | 7.50 | 0.50 | | Spotlight |

54 | Understanding the Role of Self Attention for Efficient Speech Recognition | 6.75 | 7.50 | 0.75 | | Spotlight |

55 | Latent Variable Sequential Set Transformers for Joint Multi-Agent Motion Prediction | 5.50 | 7.50 | 2.00 | | Spotlight |

56 | Deconstructing the Inductive Biases of Hamiltonian Neural Networks | 7.50 | 7.50 | 0.00 | | Spotlight |

57 | Learning more skills through optimistic exploration | 7.25 | 7.50 | 0.25 | | Spotlight |

58 | Hybrid Local SGD for Federated Learning with Heterogeneous Communications | 5.75 | 7.50 | 1.75 | | Spotlight |

59 | Training invariances and the low-rank phenomenon: beyond linear networks | 6.75 | 7.50 | 0.75 | | Spotlight |

60 | Continuous-Time Meta-Learning with Forward Mode Differentiation | 7.00 | 7.50 | 0.50 | | Spotlight |

61 | Pixelated Butterfly: Simple and Efficient Sparse training for Neural Network Models | 7.00 | 7.50 | 0.50 | | Spotlight |

62 | Continual Learning with Filter Atom Swapping | 7.00 | 7.50 | 0.50 | | Spotlight |

63 | Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers | 7.00 | 7.50 | 0.50 | | Spotlight |

64 | Learning the Dynamics of Physical Systems from Sparse Observations with Finite Element Networks | 7.50 | 7.50 | 0.00 | | Spotlight |

65 | Learnability of convolutional neural networks for infinite dimensional input via mixed and anisotropic smoothness | 8.00 | 7.50 | -0.50 | | Spotlight |

66 | Interpretable Unsupervised Diversity Denoising and Artefact Removal | 7.25 | 7.50 | 0.25 | | Spotlight |

67 | Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy | 5.25 | 7.50 | 2.25 | | Spotlight |

68 | Scalable One-Pass Optimisation of High-Dimensional Weight-Update Hyperparameters by Implicit Differentiation | 5.50 | 7.50 | 2.00 | | Spotlight |

69 | Imbedding Deep Neural Networks | 7.00 | 7.50 | 0.50 | | Spotlight |

70 | Constrained Policy Optimization via Bayesian World Models | 6.75 | 7.50 | 0.75 | | Spotlight |

71 | On Improving Adversarial Transferability of Vision Transformers | 6.00 | 7.50 | 1.50 | | Spotlight |

72 | VAE Approximation Error: ELBO and Exponential Families | 7.00 | 7.50 | 0.50 | | Spotlight |

73 | Unifying Likelihood-free Inference with Black-box Sequence Design and Beyond | 7.00 | 7.50 | 0.50 | | Spotlight |

74 | Omni-Dimensional Dynamic Convolution | 7.00 | 7.50 | 0.50 | | Spotlight |

75 | Generative Planning for Temporally Coordinated Exploration in Reinforcement Learning | 6.25 | 7.50 | 1.25 | | Spotlight |

76 | SOSP: Efficiently Capturing Global Correlations by Second-Order Structured Pruning | 6.25 | 7.50 | 1.25 | | Spotlight |

77 | DEPTS: Deep Expansion Learning for Periodic Time Series Forecasting | 7.50 | 7.50 | 0.00 | | Spotlight |

78 | Exploring the Limits of Large Scale Pre-training | 7.50 | 7.50 | 0.00 | | Spotlight |

79 | Strength of Minibatch Noise in SGD | 7.50 | 7.50 | 0.00 | | Spotlight |

80 | PAC-Bayes Information Bottleneck | 7.50 | 7.50 | 0.00 | | Spotlight |

81 | Policy improvement by planning with Gumbel | 6.25 | 7.50 | 1.25 | | Spotlight |

82 | Controlling Directions Orthogonal to a Classifier | 6.67 | 7.33 | 0.67 | | Spotlight |

83 | Autoregressive Quantile Flows for Predictive Uncertainty Estimation | 7.00 | 7.33 | 0.33 | | Spotlight |

84 | Learning Causal Relationships from Conditional Moment Restrictions by Importance Weighting | 6.67 | 7.33 | 0.67 | | Spotlight |

85 | Distributional Decision Transformer for Hindsight Information Matching | 4.00 | 7.33 | 3.33 | | Spotlight |

86 | Contact Points Discovery for Soft-Body Manipulations with Differentiable Physics | 6.33 | 7.33 | 1.00 | | Spotlight |

87 | Superclass-Conditional Gaussian Mixture Model For Learning Fine-Grained Embeddings | 6.67 | 7.33 | 0.67 | | Spotlight |

88 | Compositional Training for End-to-End Deep AUC Maximization | 7.33 | 7.33 | 0.00 | | Spotlight |

89 | Learning-Augmented k-means Clustering | 6.00 | 7.33 | 1.33 | | Spotlight |

90 | Boosting Randomized Smoothing with Variance Reduced Classifiers | 6.67 | 7.33 | 0.67 | | Spotlight |

91 | Near-Optimal Reward-Free Exploration for Linear Mixture MDPs with Plug-in Solver | 6.33 | 7.33 | 1.00 | | Spotlight |

92 | IntSGD: Adaptive Floatless Compression of Stochastic Gradients | 6.67 | 7.33 | 0.67 | | Spotlight |

93 | Transition to Linearity of Wide Neural Networks is an Emerging Property of Assembling Weak Models | 7.33 | 7.33 | 0.00 | | Spotlight |

94 | On the approximation properties of recurrent encoder-decoder architectures | 7.00 | 7.33 | 0.33 | | Spotlight |

95 | CoBERL: Contrastive BERT for Reinforcement Learning | 6.33 | 7.33 | 1.00 | | Spotlight |

96 | 8-bit Optimizers via Block-wise Quantization | 6.33 | 7.33 | 1.00 | | Spotlight |

97 | On Predicting Generalization using GANs | 6.25 | 7.25 | 1.00 | | Spotlight |

98 | Self-supervised Learning is More Robust to Dataset Imbalance | 7.25 | 7.25 | 0.00 | | Spotlight |

99 | Learning Long-Term Reward Redistribution via Randomized Return Decomposition | 5.33 | 7.25 | 1.92 | | Spotlight |

100 | How Do Vision Transformers Work? | 7.25 | 7.25 | 0.00 | | Spotlight |

101 | Learning Optimal Conformal Classifiers | 6.50 | 7.25 | 0.75 | | Spotlight |

102 | Escaping limit cycles: Global convergence for constrained nonconvex-nonconcave minimax problems | 7.25 | 7.25 | 0.00 | | Spotlight |

103 | Continual Learning with Recursive Gradient Optimization | 6.75 | 7.25 | 0.50 | | Spotlight |

104 | Evaluation Metrics for Graph Generative Models: Problems, Pitfalls, and Practical Solutions | 5.75 | 7.25 | 1.50 | | Spotlight |

105 | POETREE: Interpretable Policy Learning with Adaptive Decision Trees | 5.25 | 7.25 | 2.00 | | Spotlight |

106 | Improving Federated Learning Face Recognition via Privacy-Agnostic Clusters | 5.50 | 7.25 | 1.75 | | Spotlight |

107 | Reinforcement Learning with Sparse Rewards using Guidance from Offline Demonstration | 6.00 | 7.20 | 1.20 | 6, 6, 5, 5, 8 | 8, 6, 6, 8, 8 | Spotlight |

108 | SGD Can Converge to Local Maxima | 6.60 | 7.20 | 0.60 | 8, 6, 8, 8, 3 | 8, 6, 8, 8, 6 | Spotlight |

109 | Responsible Disclosure of Generative Models Using Scalable Fingerprinting | 6.40 | 7.20 | 0.80 | 8, 8, 3, 8, 5 | 8, 8, 6, 8, 6 | Spotlight |

110 | Natural Posterior Network: Deep Bayesian Predictive Uncertainty for Exponential Family Distributions | 5.80 | 7.20 | 1.40 | 5, 6, 6, 6, 6 | 6, 8, 8, 8, 6 | Spotlight |

111 | Fairness in Representation for Multilingual NLP: Insights from Controlled Experiments on Conditional Language Modeling | 4.00 | 7.20 | 3.20 | 3, 3, 6, 5, 3 | 6, 6, 8, 8, 8 | Spotlight |

112 | Context-Aware Sparse Deep Coordination Graphs | 6.25 | 7.00 | 0.75 | | Spotlight |

113 | Multi-Stage Episodic Control for Strategic Exploration in Text Games | 6.25 | 7.00 | 0.75 | | Spotlight |

114 | On Bridging Generic and Personalized Federated Learning for Image Classification | 5.67 | 7.00 | 1.33 | | Spotlight |

115 | Convergent Boosted Smoothing for Modeling Graph Data with Tabular Node Features | 7.00 | 7.00 | 0.00 | | Spotlight |

116 | Revisiting Over-smoothing in BERT from the Perspective of Graph | 6.75 | 7.00 | 0.25 | | Spotlight |

117 | On the Uncomputability of Partition Functions in Energy-Based Sequence Models | 6.75 | 7.00 | 0.25 | | Spotlight |

118 | Analyzing and Improving the Optimization Landscape of Noise-Contrastive Estimation | 5.67 | 7.00 | 1.33 | | Spotlight |

119 | Variational methods for simulation-based inference | 5.50 | 7.00 | 1.50 | | Spotlight |

120 | Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning | 6.75 | 7.00 | 0.25 | | Spotlight |

121 | Value Gradient weighted Model-Based Reinforcement Learning | 6.00 | 7.00 | 1.00 | | Spotlight |

122 | Spanning Tree-based Graph Generation for Molecules | 5.75 | 7.00 | 1.25 | | Spotlight |

123 | COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation | 5.50 | 7.00 | 1.50 | | Spotlight |

124 | Equivariant Subgraph Aggregation Networks | 6.25 | 7.00 | 0.75 | | Spotlight |

125 | Churn Reduction via Distillation | 7.00 | 7.00 | 0.00 | | Spotlight |

126 | Improved deterministic l2 robustness on CIFAR-10 and CIFAR-100 | 6.25 | 7.00 | 0.75 | | Spotlight |

127 | When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations | 5.50 | 7.00 | 1.50 | | Spotlight |

128 | EE-Net: Exploitation-Exploration Neural Networks in Contextual Bandits | 6.25 | 7.00 | 0.75 | | Spotlight |

129 | Scarf: Self-Supervised Contrastive Learning using Random Feature Corruption | 6.25 | 7.00 | 0.75 | | Spotlight |

130 | Message Passing Neural PDE Solvers | 6.25 | 7.00 | 0.75 | | Spotlight |

131 | The MultiBERTs: BERT Reproductions for Robustness Analysis | 7.33 | 7.00 | -0.33 | | Spotlight |

132 | Compositional Attention: Disentangling Search and Retrieval | 5.67 | 7.00 | 1.33 | | Spotlight |

133 | When should agents explore? | 7.00 | 7.00 | 0.00 | | Spotlight |

134 | Contrastive Fine-grained Class Clustering via Generative Adversarial Networks | 6.25 | 7.00 | 0.75 | | Spotlight |

135 | NODE-GAM: Neural Generalized Additive Model for Interpretable Deep Learning | 5.33 | 7.00 | 1.67 | | Spotlight |

136 | Geometric and Physical Quantities improve E(3) Equivariant Message Passing | 6.33 | 7.00 | 0.67 | 10, 6, 6, 6, 5, 5 | 10, 6, 8, 6, 6, 6 | Spotlight |

137 | GreaseLM: Graph REASoning Enhanced Language Models | 6.00 | 7.00 | 1.00 | | Spotlight |

138 | D-CODE: Discovering Closed-form ODEs from Observed Trajectories | 6.50 | 7.00 | 0.50 | | Spotlight |

139 | SO(2)-Equivariant Reinforcement Learning | 6.60 | 7.00 | 0.40 | 5, 6, 6, 8, 8 | 5, 6, 8, 8, 8 | Spotlight |

140 | On Lottery Tickets and Minimal Task Representations in Deep Reinforcement Learning | 7.00 | 7.00 | 0.00 | | Spotlight |

141 | Long Expressive Memory for Sequence Modeling | 6.25 | 7.00 | 0.75 | | Spotlight |

142 | DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization | 5.75 | 7.00 | 1.25 | | Spotlight |

143 | Graph-Augmented Normalizing Flows for Anomaly Detection of Multiple Time Series | 5.50 | 7.00 | 1.50 | | Spotlight |

144 | Online Hyperparameter Meta-Learning with Hypergradient Distillation | 7.00 | 7.00 | 0.00 | | Spotlight |

145 | Learning Hierarchical Structures with Differentiable Nondeterministic Stacks | 6.75 | 7.00 | 0.25 | | Spotlight |

146 | NASPY: Automated Extraction of Automated Machine Learning Models | 7.00 | 7.00 | 0.00 | | Spotlight |

147 | Equivariant Transformers for Neural Network based Molecular Potentials | 6.25 | 7.00 | 0.75 | | Spotlight |

148 | Finite-Time Convergence and Sample Complexity of Multi-Agent Actor-Critic Reinforcement Learning with Average Reward | 5.60 | 6.80 | 1.20 | 6, 5, 6, 6, 5 | 6, 6, 8, 8, 6 | Spotlight |

149 | Revisiting Design Choices in Offline Model Based Reinforcement Learning | 5.40 | 6.80 | 1.40 | 8, 5, 6, 3, 5 | 8, 6, 8, 6, 6 | Spotlight |

150 | Learning Altruistic Behaviours in Reinforcement Learning without External Rewards | 6.00 | 6.80 | 0.80 | 8, 6, 6, 5, 5 | 8, 6, 8, 6, 6 | Spotlight |

151 | Adversarial Support Alignment | 6.00 | 6.75 | 0.75 | | Spotlight |

152 | Amortized Tree Generation for Bottom-up Synthesis Planning and Synthesizable Molecular Design | 4.25 | 6.75 | 2.50 | | Spotlight |

153 | Dynamics-Aware Comparison of Learned Reward Functions | 6.00 | 6.75 | 0.75 | | Spotlight |

154 | Implicit Bias of Projected Subgradient Method Gives Provable Robust Recovery of Subspaces of Unknown Codimension | 6.75 | 6.75 | 0.00 | | Spotlight |

155 | Representation Learning for Online and Offline RL in Low-rank MDPs | 5.50 | 6.75 | 1.25 | | Spotlight |

156 | Leveraging Automated Unit Tests for Unsupervised Code Translation | 6.75 | 6.75 | 0.00 | | Spotlight |

157 | Properties from mechanisms: an equivariance perspective on identifiable representation learning | 6.67 | 6.67 | 0.00 | | Spotlight |

158 | Optimal Transport for Causal Discovery | 6.33 | 6.67 | 0.33 | | Spotlight |

159 | Half-Inverse Gradients for Physical Deep Learning | 6.33 | 6.67 | 0.33 | | Spotlight |

160 | Looking Back on Learned Experiences For Class/task Incremental Learning | 5.67 | 6.67 | 1.00 | | Spotlight |

161 | Learning meta-features for AutoML | 5.00 | 6.60 | 1.60 | 3, 3, 8, 6, 5 | 8, 6, 8, 6, 5 | Spotlight |

162 | On the relation between statistical learning and perceptual distances | 5.50 | 6.50 | 1.00 | | Spotlight |

163 | Tighter Sparse Approximation Bounds for ReLU Neural Networks | 6.50 | 6.50 | 0.00 | | Spotlight |

164 | ViTGAN: Training GANs with Vision Transformers | 5.40 | 6.40 | 1.00 | 5, 5, 5, 6, 6 | 6, 6, 6, 8, 6 | Spotlight |

165 | Learning Pruning-Friendly Networks via Frank-Wolfe: One-Shot, Any-Sparsity, And No Retraining | 6.00 | 6.25 | 0.25 | | Spotlight |

166 | Multitask Prompted Training Enables Zero-Shot Task Generalization | 6.25 | 6.25 | 0.00 | | Spotlight |

167 | Increasing the Cost of Model Extraction with Calibrated Proof of Work | 5.75 | 6.25 | 0.50 | | Spotlight |

168 | Memorizing Transformers | 5.75 | 6.25 | 0.50 | | Spotlight |

169 | Lossless Compression with Probabilistic Circuits | 5.50 | 6.25 | 0.75 | | Spotlight |

170 | Linking Emergent and Natural Languages via Corpus Transfer | 6.25 | 6.25 | 0.00 | | Spotlight |

171 | TRGP: Trust Region Gradient Projection for Continual Learning | 6.00 | 6.25 | 0.25 | | Spotlight |

172 | Relational Multi-Task Learning: Modeling Relations between Data and Tasks | 6.25 | 6.25 | 0.00 | | Spotlight |

173 | Understanding and Preventing Capacity Loss in Reinforcement Learning | 5.50 | 6.25 | 0.75 | | Spotlight |

174 | Learning Multimodal VAEs through Mutual Supervision | 6.00 | 6.25 | 0.25 | | Spotlight |

175 | Towards Understanding the Data Dependency of Mixup-style Training | 5.67 | 5.67 | 0.00 | | Spotlight |

176 | R5: Rule Discovery with Reinforced and Recurrent Relational Reasoning | 5.67 | 5.67 | 0.00 | | Spotlight |