2 | Granger causal inference on DAGs identifies genomic loci regulating transcription | 6.75 | 8.00 | 1.25 | | Poster |

3 | PER-ETD: A Polynomially Efficient Emphatic Temporal Difference Learning Method | 7.00 | 8.00 | 1.00 | | Poster |

4 | Adaptive Control Flow in Transformers Improves Systematic Generalization | 6.67 | 8.00 | 1.33 | | Poster |

5 | DemoDICE: Offline Imitation Learning with Supplementary Imperfect Demonstrations | 7.33 | 8.00 | 0.67 | | Poster |

6 | Evaluating Distributional Distortion in Neural Language Modeling | 6.33 | 8.00 | 1.67 | | Poster |

7 | NeuPL: Neural Population Learning | 6.50 | 8.00 | 1.50 | | Poster |

8 | Convergent Graph Solvers | 7.00 | 8.00 | 1.00 | | Poster |

9 | Neural Deep Equilibrium Solvers | 8.00 | 8.00 | 0.00 | | Poster |

10 | Inductive Relation Prediction Using Analogy Subgraph Embeddings | 5.80 | 8.00 | 2.20 | 6, 5, 6, 6, 6 | 8, 8, 8, 8, 8 |
11 | Fast Differentiable Matrix Square Root | 6.33 | 8.00 | 1.67 | | Poster |

12 | Visual Representation Learning Does Not Generalize Strongly Within the Same Domain | 6.75 | 8.00 | 1.25 | | Poster |

13 | Local Feature Swapping for Generalization in Reinforcement Learning | 5.00 | 7.60 | 2.60 | 5, 3, 6, 5, 6 | 8, 6, 8, 8, 8 |
14 | QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quantization | 6.00 | 7.50 | 1.50 | | Poster |

15 | Learnability Lock: Authorized Learnability Control Through Adversarial Invertible Transformations | 5.50 | 7.50 | 2.00 | | Poster |

16 | Optimization and Adaptive Generalization of Three layer Neural Networks | 7.25 | 7.50 | 0.25 | | Poster |

17 | Approximation and Learning with Deep Convolutional Models: a Kernel Perspective | 7.50 | 7.50 | 0.00 | | Poster |

18 | Case-based Reasoning for Better Generalization in Text-Adventure Games | 5.75 | 7.50 | 1.75 | | Poster |

19 | Conditional Image Generation by Conditioning Variational Auto-Encoders | 6.00 | 7.50 | 1.50 | | Poster |

20 | DiffSkill: Skill Abstraction from Differentiable Physics for Deformable Object Manipulations with Tools | 6.33 | 7.50 | 1.17 | | Poster |

21 | When Can We Learn General-Sum Markov Games with a Large Number of Players Sample-Efficiently? | 8.00 | 7.50 | -0.50 | | Poster |

22 | The Boltzmann Policy Distribution: Accounting for Systematic Suboptimality in Human Models | 6.75 | 7.50 | 0.75 | | Poster |

23 | Accelerated Policy Learning with Parallel Differentiable Simulation | 6.00 | 7.50 | 1.50 | | Poster |

24 | NAS-Bench-Suite: NAS Evaluation is (Now) Surprisingly Easy | 7.50 | 7.50 | 0.00 | | Poster |

25 | Know Your Action Set: Learning Action Relations for Reinforcement Learning | 5.25 | 7.50 | 2.25 | | Poster |

26 | LORD: Lower-Dimensional Embedding of Log-Signature in Neural Rough Differential Equations | 7.25 | 7.50 | 0.25 | | Poster |

27 | StyleNeRF: A Style-based 3D Aware Generator for High-resolution Image Synthesis | 7.50 | 7.50 | 0.00 | | Poster |

28 | Environment Predictive Coding for Visual Navigation | 6.25 | 7.50 | 1.25 | | Poster |

29 | Unsupervised Federated Learning is Possible | 7.00 | 7.50 | 0.50 | | Poster |

30 | Meta-Imitation Learning by Watching Video Demonstrations | 5.25 | 7.50 | 2.25 | | Poster |

31 | Hindsight is 20/20: Leveraging Past Traversals to Aid 3D Perception | 7.50 | 7.50 | 0.00 | | Poster |

32 | Can an Image Classifier Suffice For Action Recognition? | 7.25 | 7.50 | 0.25 | | Poster |

33 | Generative Models as a Data Source for Multiview Representation Learning | 6.25 | 7.50 | 1.25 | | Poster |

34 | CrossBeam: Learning to Search in Bottom-Up Program Synthesis | 7.00 | 7.50 | 0.50 | | Poster |

35 | Information Prioritization through Empowerment in Visual Model-based RL | 5.50 | 7.50 | 2.00 | | Poster |

36 | Revisiting flow generative models for Out-of-distribution detection | 5.75 | 7.50 | 1.75 | | Poster |

37 | HyAR: Addressing Discrete-Continuous Action Reinforcement Learning via Hybrid Action Representation | 6.75 | 7.50 | 0.75 | | Poster |

38 | Mention Memory: incorporating textual knowledge into Transformers through entity mention attention | 6.50 | 7.50 | 1.00 | | Poster |

39 | Vitruvion: A Generative Model of Parametric CAD Sketches | 6.25 | 7.50 | 1.25 | | Poster |

40 | No One Representation to Rule Them All: Overlapping Features of Training Methods | 7.00 | 7.50 | 0.50 | | Poster |

41 | UniFormer: Unified Transformer for Efficient Spatial-Temporal Representation Learning | 7.50 | 7.50 | 0.00 | | Poster |

42 | Relating transformers to models and neural representations of the hippocampal formation | 5.75 | 7.50 | 1.75 | | Poster |

43 | πBO: Augmenting Acquisition Functions with User Beliefs for Bayesian Optimization | 6.25 | 7.50 | 1.25 | | Poster |

44 | TAPEX: Table Pre-training via Learning a Neural SQL Executor | 8.00 | 7.50 | -0.50 | | Poster |

45 | On the Pitfalls of Analyzing Individual Neurons in Language Models | 6.75 | 7.50 | 0.75 | | Poster |

46 | Learning Discrete Structured Variational Auto-Encoder using Natural Evolution Strategies | 6.50 | 7.50 | 1.00 | | Poster |

47 | Creating Training Sets via Weak Indirect Supervision | 6.25 | 7.50 | 1.25 | | Poster |

48 | Decoupled Adaptation for Cross-Domain Object Detection | 6.75 | 7.50 | 0.75 | | Poster |

49 | InfinityGAN: Towards Infinite-Pixel Image Synthesis | 7.25 | 7.50 | 0.25 | | Poster |

50 | Deep Attentive Variational Inference | 5.75 | 7.50 | 1.75 | | Poster |

51 | Evading Adversarial Example Detection Defenses with Orthogonal Projected Gradient Descent | 6.50 | 7.50 | 1.00 | | Poster |

52 | Efficient Sharpness-aware Minimization for Improved Training of Neural Networks | 6.50 | 7.50 | 1.00 | | Poster |

53 | Learning Super-Features for Image Retrieval | 7.25 | 7.50 | 0.25 | | Poster |

54 | How to Inject Backdoors with Better Consistency: Logit Anchoring on Clean Data | 7.50 | 7.50 | 0.00 | | Poster |

55 | Adversarial Robustness Through the Lens of Causality | 6.25 | 7.50 | 1.25 | | Poster |

56 | A Deep Variational Approach to Clustering Survival Data | 7.25 | 7.50 | 0.25 | | Poster |

57 | Denoising Likelihood Score Matching for Conditional Score-based Data Generation | 6.75 | 7.50 | 0.75 | | Poster |

58 | CKConv: Continuous Kernel Convolution For Sequential Data | 6.50 | 7.50 | 1.00 | | Poster |

59 | What’s Wrong with Deep Learning in Tree Search for Combinatorial Optimization | 6.00 | 7.50 | 1.50 | | Poster |

60 | Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation | 6.75 | 7.50 | 0.75 | | Poster |

61 | You Mostly Walk Alone: Analyzing Feature Attribution in Trajectory Prediction | 5.60 | 7.40 | 1.80 | 6, 6, 6, 5, 5 | 8, 10, 6, 8, 5 |
62 | Improving Mutual Information Estimation with Annealed and Energy-Based Bounds | 7.33 | 7.33 | 0.00 | | Poster |

63 | Distribution Compression in Near-Linear Time | 6.67 | 7.33 | 0.67 | | Poster |

64 | Generalization of Neural Combinatorial Solvers Through the Lens of Adversarial Robustness | 7.00 | 7.33 | 0.33 | | Poster |

65 | Constructing a Good Behavior Basis for Transfer using Generalized Policy Updates | 6.00 | 7.33 | 1.33 | | Poster |

66 | ARTEMIS: Attention-based Retrieval with Text-Explicit Matching and Implicit Similarity | 7.00 | 7.33 | 0.33 | | Poster |

67 | Label-Efficient Semantic Segmentation with Diffusion Models | 5.00 | 7.33 | 2.33 | | Poster |

68 | Back2Future: Leveraging Backfill Dynamics for Improving Real-time Predictions in Future | 7.00 | 7.33 | 0.33 | | Poster |

69 | Open-vocabulary Object Detection via Vision and Language Knowledge Distillation | 7.00 | 7.33 | 0.33 | | Poster |

70 | Convergent and Efficient Deep Q Learning Algorithm | 5.33 | 7.33 | 2.00 | | Poster |

71 | Efficient Self-supervised Vision Transformers for Representation Learning | 6.67 | 7.33 | 0.67 | | Poster |

72 | Sound Adversarial Audio-Visual Navigation | 5.67 | 7.33 | 1.67 | | Poster |

73 | Actor-critic is implicitly biased towards high entropy optimal policies | 6.33 | 7.33 | 1.00 | | Poster |

74 | Chunked Autoregressive GAN for Conditional Waveform Synthesis | 7.00 | 7.33 | 0.33 | | Poster |

75 | A Conditional Point Diffusion-Refinement Paradigm for 3D Point Cloud Completion | 7.00 | 7.33 | 0.33 | | Poster |

76 | Training Structured Neural Networks Through Manifold Identification and Variance Reduction | 5.33 | 7.33 | 2.00 | | Poster |

77 | A Johnson-Lindenstrauss Framework for Randomly Initialized CNNs | 6.33 | 7.33 | 1.00 | | Poster |

78 | Self-Supervised Graph Neural Networks for Improved Electroencephalographic Seizure Analysis | 6.67 | 7.33 | 0.67 | | Poster |

79 | Hybrid Random Features | 5.00 | 7.33 | 2.33 | | Poster |

80 | Graphon based Clustering and Testing of Networks: Algorithms and Theory | 5.67 | 7.33 | 1.67 | | Poster |

81 | Training Data Generating Networks: Shape Reconstruction via Bi-level Optimization | 6.67 | 7.33 | 0.67 | | Poster |

82 | Bregman Gradient Policy Optimization | 6.33 | 7.33 | 1.00 | | Poster |

83 | Promoting Saliency From Depth: Deep Unsupervised RGB-D Saliency Detection | 6.33 | 7.33 | 1.00 | | Poster |

84 | Relational Surrogate Loss Learning | 7.33 | 7.33 | 0.00 | | Poster |

85 | Discovering Invariant Rationales for Graph Neural Networks | 6.33 | 7.33 | 1.00 | | Poster |

86 | Causal ImageNet: How to discover spurious features in Deep Learning? | 7.00 | 7.33 | 0.33 | | Poster |

87 | CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation | 5.67 | 7.33 | 1.67 | | Poster |

88 | Fast topological clustering with Wasserstein distance | 5.33 | 7.33 | 2.00 | | Poster |

89 | Critical Points in Quantum Generative Models | 7.00 | 7.33 | 0.33 | | Poster |

90 | Delaunay Component Analysis for Evaluation of Data Representations | 7.00 | 7.33 | 0.33 | | Poster |

91 | An Experimental Design Perspective on Exploration in Reinforcement Learning | 5.75 | 7.25 | 1.50 | | Poster |

92 | Fixed Neural Network Steganography: Train the images, not the network | 6.25 | 7.25 | 1.00 | | Poster |

93 | Recycling Model Updates in Federated Learning: Are Gradient Subspaces Low-Rank? | 6.00 | 7.25 | 1.25 | | Poster |

94 | Continuously Discovering Novel Strategies via Reward-Switching Policy Optimization | 6.25 | 7.25 | 1.00 | | Poster |

95 | Bridging the Gap: Providing Post-Hoc Symbolic Explanations for Sequential Decision-Making Problems with Inscrutable Representations | 6.75 | 7.25 | 0.50 | | Poster |

96 | On the Generalization of Models Trained with SGD: Information-Theoretic Bounds and Implications | 7.25 | 7.25 | 0.00 | | Poster |

97 | Graph-less Neural Networks: Teaching Old MLPs New Tricks Via Distillation | 6.75 | 7.25 | 0.50 | | Poster |

98 | Generating Videos with Dynamics-aware Implicit Generative Adversarial Networks | 5.67 | 7.25 | 1.58 | | Poster |

99 | Hidden Convexity of Wasserstein GANs: Interpretable Generative Models with Closed-Form Solutions | 6.25 | 7.25 | 1.00 | | Poster |

100 | CLEVA-Compass: A Continual Learning Evaluation Assessment Compass to Promote Research Transparency and Comparability | 5.75 | 7.25 | 1.50 | | Poster |

101 | Differentiable Scaffolding Tree for Molecule Optimization | 7.25 | 7.25 | 0.00 | | Poster |

102 | Transformer-based Transform Coding | 7.00 | 7.20 | 0.20 | 8, 5, 6, 8, 8 | 8, 6, 6, 8, 8 |
103 | Dual Lottery Ticket Hypothesis | 5.00 | 7.20 | 2.20 | | Poster |

104 | Pix2seq: A Language Modeling Framework for Object Detection | 6.80 | 7.20 | 0.40 | 8, 6, 6, 6, 8 | 8, 6, 8, 6, 8 |
105 | MetaMorph: Learning Universal Controllers with Transformers | 6.20 | 7.20 | 1.00 | 8, 8, 3, 6, 6 | 8, 8, 6, 6, 8 |
106 | SPIRAL: Self-supervised Perturbation-Invariant Representation Learning for Speech Pre-Training | 6.80 | 7.20 | 0.40 | 6, 6, 6, 8, 8 | 6, 6, 8, 8, 8 |
107 | Contextualized Scene Imagination for Generative Commonsense Reasoning | 5.75 | 7.00 | 1.25 | | Poster |

108 | Phenomenology of Double Descent in Finite-Width Neural Networks | 5.20 | 7.00 | 1.80 | 3, 3, 6, 6, 8 | 3, 8, 8, 8, 8 |
109 | Machine Learning For Elliptic PDEs: Fast Rate Generalization Bound, Neural Scaling Law and Minimax Optimality | 6.25 | 7.00 | 0.75 | | Poster |

110 | On Distributed Adaptive Optimization with Gradient Compression | 7.00 | 7.00 | 0.00 | | Poster |

111 | Patch-Fool: Are Vision Transformers Always Robust Against Adversarial Perturbations? | 6.25 | 7.00 | 0.75 | | Poster |

112 | Robust Learning Meets Generative Models: Can Proxy Distributions Improve Adversarial Robustness? | 6.25 | 7.00 | 0.75 | | Poster |

113 | Leveraging unlabeled data to predict out-of-distribution performance | 6.20 | 7.00 | 0.80 | 6, 8, 6, 5, 6 | 6, 8, 8, 5, 8 |
114 | Fortuitous Forgetting in Connectionist Networks | 6.00 | 7.00 | 1.00 | | Poster |

115 | A Relational Intervention Approach for Unsupervised Dynamics Generalization in Model-Based Reinforcement Learning | 6.50 | 7.00 | 0.50 | | Poster |

116 | Learning Transferable Reward for Query Object Localization with Policy Adaptation | 5.50 | 7.00 | 1.50 | | Poster |

117 | CoordX: Accelerating Implicit Neural Representation with a Split MLP Architecture | 6.25 | 7.00 | 0.75 | | Poster |

118 | The Role of Permutation Invariance in Linear Mode Connectivity of Neural Networks | 5.75 | 7.00 | 1.25 | | Poster |

119 | Should I Run Offline Reinforcement Learning or Behavioral Cloning? | 5.50 | 7.00 | 1.50 | | Poster |

120 | Permutation-Based SGD: Is Random Optimal? | 7.00 | 7.00 | 0.00 | | Poster |

121 | Hindsight: Posterior-guided training of retrievers for improved open-ended generation | 6.25 | 7.00 | 0.75 | | Poster |

122 | Sample and Computation Redistribution for Efficient Face Detection | 7.33 | 7.00 | -0.33 | | Poster |

123 | Chaos is a Ladder: A New Understanding of Contrastive Learning | 5.50 | 7.00 | 1.50 | | Poster |

124 | Rethinking Adversarial Transferability from a Data Distribution Perspective | 6.00 | 7.00 | 1.00 | | Poster |

125 | High Probability Generalization Bounds for Minimax Problems with Fast Rates | 6.25 | 7.00 | 0.75 | | Poster |

126 | Unsupervised Semantic Segmentation by Distilling Feature Correspondences | 6.75 | 7.00 | 0.25 | | Poster |

127 | Is High Variance Unavoidable in RL? A Case Study in Continuous Control | 5.50 | 7.00 | 1.50 | | Poster |

128 | C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks | 6.75 | 7.00 | 0.25 | | Poster |

129 | Divisive Feature Normalization Improves Image Recognition Performance in AlexNet | 6.00 | 7.00 | 1.00 | | Poster |

130 | An Unconstrained Layer-Peeled Perspective on Neural Collapse | 6.50 | 7.00 | 0.50 | | Poster |

131 | Data-Driven Offline Optimization for Architecting Hardware Accelerators | 6.50 | 7.00 | 0.50 | | Poster |

132 | cosFormer: Rethinking Softmax In Attention | 6.25 | 7.00 | 0.75 | | Poster |

133 | Unsupervised Discovery of Object Radiance Fields | 6.33 | 7.00 | 0.67 | | Poster |

134 | MonoDistill: Learning Spatial Features for Monocular 3D Object Detection | 6.40 | 7.00 | 0.60 | 5, 6, 8, 5, 8 | 5, 8, 8, 6, 8 |
135 | Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction | 7.00 | 7.00 | 0.00 | | Poster |

136 | Phase Collapse in Neural Networks | 5.75 | 7.00 | 1.25 | | Poster |

137 | Coherence-based Label Propagation over Time Series for Accelerated Active Learning | 7.00 | 7.00 | 0.00 | | Poster |

138 | Differentially Private Fractional Frequency Moments Estimation with Polylogarithmic Space | 6.50 | 7.00 | 0.50 | | Poster |

139 | MCMC Should Mix: Learning Energy-Based Model with Flow-Based Backbone | 6.00 | 7.00 | 1.00 | | Poster |

140 | Gradient Information Matters in Policy Optimization by Back-propagating through Model | 4.50 | 7.00 | 2.50 | | Poster |

141 | Multi-objective Optimization by Learning Space Partition | 6.75 | 7.00 | 0.25 | | Poster |

142 | Spherical Message Passing for 3D Molecular Graphs | 5.67 | 7.00 | 1.33 | | Poster |

143 | AEVA: Black-box Backdoor Detection Using Adversarial Extreme Value Analysis | 5.75 | 7.00 | 1.25 | | Poster |

144 | PF-GNN: Differentiable particle filtering based approximation of universal graph representations | 6.25 | 7.00 | 0.75 | | Poster |

145 | LoRA: Low-Rank Adaptation of Large Language Models | 6.00 | 7.00 | 1.00 | | Poster |

146 | Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners | 6.50 | 7.00 | 0.50 | | Poster |

147 | Bootstrapping Semantic Segmentation with Regional Contrast | 5.50 | 7.00 | 1.50 | | Poster |

148 | Learning with Noisy Labels Revisited: A Study Using Real-World Human Annotations | 6.00 | 7.00 | 1.00 | | Poster |

149 | Direct then Diffuse: Incremental Unsupervised Skill Discovery for State Covering and Goal Reaching | 6.75 | 7.00 | 0.25 | | Poster |

150 | Efficient Active Search for Combinatorial Optimization Problems | 7.00 | 7.00 | 0.00 | | Poster |

151 | Energy-Based Learning for Cooperative Games, with Applications to Valuation Problems in Machine Learning | 7.00 | 7.00 | 0.00 | | Poster |

152 | Minimax Optimization with Smooth Algorithmic Adversaries | 7.00 | 7.00 | 0.00 | | Poster |

153 | Domain Adversarial Training: A Game Perspective | 7.00 | 7.00 | 0.00 | | Poster |

154 | Conditional Object-Centric Learning from Video | 6.50 | 7.00 | 0.50 | | Poster |

155 | Visual Correspondence Hallucination | 7.00 | 7.00 | 0.00 | | Poster |

156 | Learning Disentangled Representation by Exploiting Pretrained Generative Models: A Contrastive Learning View | 6.25 | 7.00 | 0.75 | | Poster |

157 | Neural Relational Inference with Node-Specific Information | 6.33 | 7.00 | 0.67 | | Poster |

158 | Learned Simulators for Turbulence | 6.00 | 7.00 | 1.00 | | Poster |

159 | Active Hierarchical Exploration with Stable Subgoal Representation Learning | 6.25 | 7.00 | 0.75 | | Poster |

160 | On the Limitations of Multimodal VAEs | 6.25 | 7.00 | 0.75 | | Poster |

161 | Embedded-model flows: Combining the inductive biases of model-free deep learning and explicit probabilistic modeling | 5.50 | 7.00 | 1.50 | | Poster |

162 | Filling the G_ap_s: Multivariate Time Series Imputation by Graph Neural Networks | 6.75 | 7.00 | 0.25 | | Poster |

163 | Shuffle Private Stochastic Convex Optimization | 6.00 | 7.00 | 1.00 | | Poster |

164 | Self-Joint Supervised Learning | 7.00 | 7.00 | 0.00 | | Poster |

165 | Anomaly Detection for Tabular Data with Internal Contrastive Learning | 5.67 | 7.00 | 1.33 | | Poster |

166 | A Reduction-Based Framework for Conservative Bandits and Reinforcement Learning | 5.75 | 7.00 | 1.25 | | Poster |

167 | Procedural generalization by planning with self-supervised world models | 6.75 | 7.00 | 0.25 | | Poster |

168 | Who Is Your Right Mixup Partner in Positive and Unlabeled Learning | 6.75 | 7.00 | 0.25 | | Poster |

169 | Ancestral protein sequence reconstruction using a tree-structured Ornstein-Uhlenbeck variational autoencoder | 6.00 | 7.00 | 1.00 | | Poster |

170 | Learning Towards The Largest Margins | 6.75 | 7.00 | 0.25 | | Poster |

171 | CURVATURE-GUIDED DYNAMIC SCALE NETWORKS FOR MULTI-VIEW STEREO | 5.00 | 7.00 | 2.00 | | Poster |

172 | Stochastic Training is Not Necessary for Generalization | 5.80 | 7.00 | 1.20 | 5, 3, 8, 8, 5 | 6, 5, 8, 10, 6 |
173 | Sqrt(d) Dimension Dependence of Langevin Monte Carlo | 7.00 | 7.00 | 0.00 | | Poster |

174 | The Geometry of Memoryless Stochastic Policy Optimization in Infinite-Horizon POMDPs | 6.50 | 7.00 | 0.50 | | Poster |

175 | GiraffeDet: A Heavy-Neck Paradigm for Object Detection | 6.00 | 7.00 | 1.00 | | Poster |

176 | Joint Shapley values: a measure of joint feature importance | 7.00 | 7.00 | 0.00 | | Poster |

177 | Deep ReLU Networks Preserve Expected Length | 6.25 | 7.00 | 0.75 | | Poster |

178 | Noisy Feature Mixup | 7.00 | 7.00 | 0.00 | | Poster |

179 | Random matrices in service of ML footprint: ternary random features with no performance loss | 6.25 | 7.00 | 0.75 | | Poster |

180 | Distributionally Robust Models with Parametric Likelihood Ratios | 6.50 | 7.00 | 0.50 | | Poster |

181 | You are AllSet: A Multiset Function Framework for Hypergraph Neural Networks | 6.25 | 7.00 | 0.75 | | Poster |

182 | A generalization of the randomized singular value decomposition | 6.33 | 7.00 | 0.67 | | Poster |

183 | Generalization of Overparametrized Deep Neural Network Under Noisy Observations | 6.25 | 7.00 | 0.75 | | Poster |

184 | Chemical-Reaction-Aware Molecule Representation Learning | 6.00 | 7.00 | 1.00 | | Poster |

185 | Offline Reinforcement Learning with Value-based Episodic Memory | 5.25 | 6.83 | 1.58 | 5, 6, 5, 5 | 6, 8, 6, 5, 8, 8 |
186 | How Does SimSiam Avoid Collapse Without Negative Samples? Towards a Unified Understanding of Progress in SSL | 6.20 | 6.80 | 0.60 | 8, 5, 5, 5, 8 | 8, 6, 6, 6, 8 |
187 | Tracking the risk of a deployed model and detecting harmful distribution shifts | 5.80 | 6.80 | 1.00 | 6, 6, 6, 5, 6 | 6, 8, 6, 6, 8 |
188 | Equivariant and Stable Positional Encoding for More Powerful Graph Neural Networks | 6.60 | 6.80 | 0.20 | 8, 8, 6, 6, 5 | 8, 8, 6, 6, 6 |
189 | Latent Image Animator: Learning to animate image via latent space navigation | 6.80 | 6.80 | 0.00 | 8, 6, 6, 6, 8 | 8, 6, 6, 6, 8 |
190 | On the Certified Robustness for Ensemble Models and Beyond | 6.20 | 6.80 | 0.60 | 5, 6, 6, 6, 8 | 6, 8, 6, 6, 8 |
191 | Multi-Critic Actor Learning: Teaching RL Policies to Act with Style | 5.00 | 6.80 | 1.80 | 8, 3, 3, 6, 5 | 8, 6, 6, 8, 6 |
192 | Learning to Generalize across Domains on Single Test Samples | 5.80 | 6.80 | 1.00 | 5, 5, 6, 5, 8 | 5, 8, 8, 5, 8 |
193 | Reinforcement Learning in Presence of Discrete Markovian Context Evolution | 6.40 | 6.80 | 0.40 | 5, 6, 5, 8, 8 | 6, 6, 6, 8, 8 |
194 | GNN is a Counter? Revisiting GNN for Question Answering | 6.25 | 6.75 | 0.50 | | Poster |

195 | Peek-a-Boo: What (More) is Disguised in a Randomly Weighted Neural Network, and How to Find It Efficiently | 6.50 | 6.75 | 0.25 | | Poster |

196 | Pareto Policy Pool for Model-based Offline Reinforcement Learning | 5.25 | 6.75 | 1.50 | | Poster |

197 | Sparsity Winning Twice: Better Robust Generalization from More Efficient Training | 5.75 | 6.75 | 1.00 | | Poster |

198 | Deep AutoAugment | 5.50 | 6.75 | 1.25 | | Poster |

199 | BAM: Bayes Augmented with Memory | 6.50 | 6.75 | 0.25 | | Poster |

200 | Large Learning Rate Tames Homogeneity: Convergence and Balancing Effect | 5.25 | 6.75 | 1.50 | | Poster |

201 | FALCON: Fast Visual Concept Learning by Integrating Images, Linguistic descriptions, and Conceptual Relations | 6.25 | 6.75 | 0.50 | | Poster |

202 | On the Learning of Quasimetrics | 6.25 | 6.75 | 0.50 | | Poster |

203 | Synchromesh: Reliable Code Generation from Pre-trained Language Models | 6.25 | 6.75 | 0.50 | | Poster |

204 | Learning Object-Oriented Dynamics for Planning from Text | 6.75 | 6.75 | 0.00 | | Poster |

205 | How to Train Your MAML to Excel in Few-Shot Classification | 6.25 | 6.75 | 0.50 | | Poster |

206 | A Fine-Tuning Approach to Belief State Modeling | 5.00 | 6.75 | 1.75 | | Poster |

207 | Path Integral Sampler: A Stochastic Control Approach For Sampling | 6.75 | 6.75 | 0.00 | | Poster |

208 | DIVA: Dataset Derivative of a Learning Task | 7.00 | 6.75 | -0.25 | | Poster |

209 | A First-Occupancy Representation for Reinforcement Learning | 6.75 | 6.75 | 0.00 | | Poster |

210 | Towards Unknown-aware Learning with Virtual Outlier Synthesis | 5.75 | 6.75 | 1.00 | | Poster |

211 | Improving Non-Autoregressive Translation Models Without Distillation | 6.25 | 6.75 | 0.50 | | Poster |

212 | Learning Neural Contextual Bandits through Perturbed Rewards | 5.75 | 6.75 | 1.00 | | Poster |

213 | Better Supervisory Signals by Observing Learning Paths | 4.75 | 6.75 | 2.00 | | Poster |

214 | Constrained Graph Mechanics Networks | 5.00 | 6.75 | 1.75 | | Poster |

215 | Model-augmented Prioritized Experience Replay | 6.75 | 6.75 | 0.00 | | Poster |

216 | Enhancing Cross-lingual Transfer by Manifold Mixup | 5.75 | 6.75 | 1.00 | | Poster |

217 | Knowledge Removal in Sampling-based Bayesian Inference | 6.75 | 6.75 | 0.00 | | Poster |

218 | Mapping Language Models to Grounded Conceptual Spaces | 6.75 | 6.75 | 0.00 | | Poster |

219 | A Unified Contrastive Energy-based Model for Understanding the Generative Ability of Adversarial Training | 6.75 | 6.75 | 0.00 | | Poster |

220 | Proving the Lottery Ticket Hypothesis for Convolutional Neural Networks | 5.33 | 6.75 | 1.42 | | Poster |

221 | Online Target Q-learning with Reverse Experience Replay: Efficiently finding the Optimal Policy for Linear MDPs | 6.00 | 6.75 | 0.75 | | Poster |

222 | Global Convergence of Multi-Agent Policy Gradient in Markov Potential Games | 6.75 | 6.75 | 0.00 | | Poster |

223 | Multiset-Equivariant Set Prediction with Approximate Implicit Differentiation | 5.75 | 6.75 | 1.00 | | Poster |

224 | SketchODE: Learning neural sketch representation in continuous time | 6.25 | 6.75 | 0.50 | | Poster |

225 | Sound and Complete Neural Network Repair with Minimality and Locality Guarantees | 6.00 | 6.75 | 0.75 | | Poster |

226 | Scene Transformer: A unified architecture for predicting future trajectories of multiple agents | 6.00 | 6.75 | 0.75 | | Poster |

227 | Learning Efficient Image Super-Resolution Networks via Structure-Regularized Pruning | 6.75 | 6.75 | 0.00 | | Poster |

228 | ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning | 5.75 | 6.75 | 1.00 | | Poster |

229 | Likelihood Training of Schrödinger Bridge using Forward-Backward SDEs Theory | 5.25 | 6.75 | 1.50 | | Poster |

230 | Geometry-Consistent Neural Shape Representation with Implicit Displacement Fields | 6.75 | 6.75 | 0.00 | | Poster |

231 | Unrolling PALM for Sparse Semi-Blind Source Separation | 4.25 | 6.75 | 2.50 | | Poster |

232 | Generalized rectifier wavelet covariance models for texture synthesis | 5.33 | 6.75 | 1.42 | | Poster |

233 | Sparse DETR: Efficient End-to-End Object Detection with Learnable Sparsity | 6.75 | 6.75 | 0.00 | | Poster |

234 | Learning to Remember Patterns: Pattern Matching Memory Networks for Traffic Forecasting | 5.50 | 6.75 | 1.25 | | Poster |

235 | Post-Training Detection of Backdoor Attacks for Two-Class and Multi-Attack Scenarios | 6.50 | 6.75 | 0.25 | | Poster |

236 | Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning | 5.75 | 6.75 | 1.00 | | Poster |

237 | A Loss Curvature Perspective on Training Instabilities of Deep Learning Models | 6.75 | 6.75 | 0.00 | | Poster |

238 | Surreal-GAN: Semi-Supervised Representation Learning via GAN for uncovering heterogeneous disease-related imaging patterns | 6.00 | 6.75 | 0.75 | | Poster |

239 | Actor-Critic Policy Optimization in a Large-Scale Imperfect-Information Game | 4.50 | 6.75 | 2.25 | | Poster |

240 | Adversarially Robust Conformal Prediction | 6.75 | 6.75 | 0.00 | | Poster |

241 | Topological Experience Replay | 5.50 | 6.75 | 1.25 | | Poster |

242 | Learning 3D Representations of Molecular Chirality with Invariance to Bond Rotations | 5.75 | 6.75 | 1.00 | | Poster |

243 | NodePiece: Compositional and Parameter-Efficient Representations of Large Knowledge Graphs | 5.75 | 6.75 | 1.00 | | Poster |

244 | Contrastive Clustering to Mine Pseudo Parallel Data for Unsupervised Translation | 6.75 | 6.75 | 0.00 | | Poster |

245 | Exploring Memorization in Adversarial Training | 6.33 | 6.75 | 0.42 | | Poster |

246 | Learning to Complete Code with Sketches | 6.75 | 6.75 | 0.00 | | Poster |

247 | miniF2F: a cross-system benchmark for formal Olympiad-level mathematics | 6.75 | 6.75 | 0.00 | | Poster |

248 | On Non-Random Missing Labels in Semi-Supervised Learning | 6.67 | 6.67 | 0.00 | | Poster |

249 | Invariant Causal Representation Learning for Out-of-Distribution Generalization | 6.33 | 6.67 | 0.33 | | Poster |

250 | Mind the Gap: Domain Gap Control for Single Shot Domain Adaptation for Generative Adversarial Networks | 6.67 | 6.67 | 0.00 | | Poster |

251 | Provably Robust Adversarial Examples | 5.33 | 6.67 | 1.33 | | Poster |

252 | Image BERT Pre-training with Online Tokenizer | 6.00 | 6.67 | 0.67 | | Poster |

253 | SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations | 5.67 | 6.67 | 1.00 | | Poster |

254 | Solving Inverse Problems in Medical Imaging with Score-Based Generative Models | 5.67 | 6.67 | 1.00 | | Poster |

255 | TRAIL: Near-Optimal Imitation Learning with Suboptimal Data | 5.67 | 6.67 | 1.00 | | Poster |

256 | Automatic Loss Function Search for Predict-Then-Optimize Problems with Strong Ranking Property | 6.00 | 6.67 | 0.67 | | Poster |

257 | Toward Faithful Case-based Reasoning through Learning Prototypes in a Nearest Neighbor-friendly Space. | 6.00 | 6.67 | 0.67 | | Poster |

258 | The Convex Geometry of Backpropagation: Neural Network Gradient Flows Converge to Extreme Points of the Dual Convex Program | 6.33 | 6.67 | 0.33 | | Poster |

259 | Triangle and Four Cycle Counting with Predictions in Graph Streams | 6.00 | 6.67 | 0.67 | | Poster |

260 | Sequence Approximation using Feedforward Spiking Neural Network for Spatiotemporal Learning: Theory and Optimization Methods | 4.67 | 6.67 | 2.00 | | Poster |

261 | RelViT: Concept-guided Vision Transformer for Visual Relational Reasoning | 6.33 | 6.67 | 0.33 | | Poster |

262 | Neural Variational Dropout Processes | 6.67 | 6.67 | 0.00 | | Poster |

263 | Efficient Token Mixing for Transformers via Adaptive Fourier Neural Operators | 5.67 | 6.67 | 1.00 | | Poster |

264 | Safe Neurosymbolic Learning with Differentiable Symbolic Execution | 5.33 | 6.67 | 1.33 | | Poster |

265 | Reverse Engineering of Imperceptible Adversarial Image Perturbations | 5.33 | 6.67 | 1.33 | | Poster |

266 | VC dimension of partially quantized neural networks in the overparametrized regime | 5.67 | 6.67 | 1.00 | | Poster |

267 | Multimeasurement Generative Models | 6.67 | 6.67 | 0.00 | | Poster |

268 | Towards Understanding the Robustness Against Evasion Attack on Categorical Data | 5.00 | 6.67 | 1.67 | | Poster |

269 | Zero Pixel Directional Boundary by Vector Transform | 6.67 | 6.67 | 0.00 | | Poster |

270 | Label Leakage and Protection in Two-party Split Learning | 6.00 | 6.67 | 0.67 | | Poster |

271 | BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis | 5.67 | 6.67 | 1.00 | | Poster |

272 | Rethinking Network Design and Local Geometry in Point Cloud: A Simple Residual MLP Framework | 6.33 | 6.67 | 0.33 | | Poster |

273 | Spatial Graph Attention and Curiosity-driven Policy for Antiviral Drug Discovery | 6.00 | 6.67 | 0.67 | | Poster |

274 | High Probability Bounds for a Class of Nonconvex Algorithms with AdaGrad Stepsize | 6.50 | 6.67 | 0.17 | | Poster |

275 | Node Feature Extraction by Self-Supervised Multi-scale Neighborhood Prediction | 5.67 | 6.67 | 1.00 | | Poster |

276 | Omni-Scale CNNs: a simple and effective kernel size configuration for time series classification | 5.67 | 6.67 | 1.00 | | Poster |

277 | Practical Conditional Neural Process Via Tractable Dependent Predictions | 6.00 | 6.67 | 0.67 | | Poster |

278 | Hybrid Memoised Wake-Sleep: Approximate Inference at the Discrete-Continuous Interface | 6.33 | 6.67 | 0.33 | | Poster |

279 | Dive Deeper Into Integral Pose Regression | 5.67 | 6.67 | 1.00 | | Poster |

280 | Information Bottleneck: Exact Analysis of (Quantized) Neural Networks | 6.33 | 6.67 | 0.33 | | Poster |

281 | A Class of Short-term Recurrence Anderson Mixing Methods and Their Applications | 6.00 | 6.67 | 0.67 | | Poster |

282 | SimVLM: Simple Visual Language Model Pretraining with Weak Supervision | 6.33 | 6.67 | 0.33 | | Poster |

283 | Privacy Implications of Shuffling | 6.67 | 6.67 | 0.00 | | Poster |

284 | End-to-End Learning of Probabilistic Hierarchies on Graphs | 7.00 | 6.67 | -0.33 | | Poster |

285 | GradSign: Model Performance Inference with Theoretical Insights | 6.00 | 6.67 | 0.67 | | Poster |

286 | X-model: Improving Data Efficiency in Deep Learning with A Minimax Model | 6.33 | 6.67 | 0.33 | | Poster |

287 | Learning Versatile Neural Architectures by Propagating Network Codes | 6.67 | 6.67 | 0.00 | | Poster |

288 | Retriever: Learning Content-Style Representation as a Token-Level Bipartite Graph | 6.67 | 6.67 | 0.00 | | Poster |

289 | Entroformer: A Transformer-based Entropy Model for Learned Image Compression | 6.67 | 6.67 | 0.00 | | Poster |

290 | Uncertainty Modeling for Out-of-Distribution Generalization | 6.67 | 6.67 | 0.00 | | Poster |

291 | Online Facility Location with Predictions | 6.17 | 6.67 | 0.50 | 6, 6, 6, 8, 5, 6 | 6, 6, 6, 8, 6, 8 |
292 | PEARL: Data Synthesis via Private Embeddings and Adversarial Reconstruction Learning | 6.33 | 6.67 | 0.33 | | Poster |

293 | Do Not Escape From the Manifold: Discovering the Local Coordinates on the Latent Space of GANs | 5.67 | 6.67 | 1.00 | | Poster |

294 | When, Why, and Which Pretrained GANs Are Useful? | 6.67 | 6.67 | 0.00 | | Poster |

295 | Beyond ImageNet Attack: Towards Crafting Adversarial Examples for Black-box Domains | 5.67 | 6.67 | 1.00 | | Poster |

296 | Inverse Online Learning: Understanding Non-Stationary and Reactionary Policies | 5.67 | 6.67 | 1.00 | | Poster |

297 | Temporal Alignment Prediction for Supervised Representation Learning and Few-Shot Sequence Classification | 5.33 | 6.67 | 1.33 | | Poster |

298 | Steerable Partial Differential Operators for Equivariant Neural Networks | 6.33 | 6.67 | 0.33 | | Poster |

299 | Network Insensitivity to Parameter Noise via Parameter Attack During Training | 6.33 | 6.67 | 0.33 | | Poster |

300 | P-Adapters: Robustly Extracting Factual Information from Language Models with Diverse Prompts | 6.00 | 6.60 | 0.60 | 5, 8, 3, 6, 8 | 6, 8, 5, 6, 8 |
301 | A Unified Wasserstein Distributional Robustness Framework for Adversarial Training | 6.60 | 6.60 | 0.00 | 6, 6, 8, 5, 8 | 6, 6, 8, 5, 8 |
302 | Sample Selection with Uncertainty of Losses for Learning with Noisy Labels | 6.60 | 6.60 | 0.00 | 6, 8, 6, 8, 5 | 6, 8, 6, 8, 5 |
303 | Towards Better Understanding and Better Generalization of Low-shot Classification in Histology Images with Contrastive Learning | 6.40 | 6.60 | 0.20 | 5, 8, 8, 5, 6 | 6, 8, 8, 5, 6 |
304 | Trigger Hunting with a Topological Prior for Trojan Detection | 6.00 | 6.50 | 0.50 | | Poster |

305 | Optimizing Few-Step Diffusion Samplers by Gradient Descent | 5.50 | 6.50 | 1.00 | | Poster |

306 | Fast AdvProp | 6.50 | 6.50 | 0.00 | | Poster |

307 | Learning Temporally Latent Causal Processes from General Temporal Data | 5.33 | 6.50 | 1.17 | | Poster |

308 | Skill-based Meta-Reinforcement Learning | 5.50 | 6.50 | 1.00 | | Poster |

309 | Understanding Intrinsic Robustness Using Label Uncertainty | 6.25 | 6.50 | 0.25 | | Poster |

310 | From Stars to Subgraphs: Uplifting Any GNN with Local Structure Awareness | 5.50 | 6.50 | 1.00 | | Poster |

311 | Efficient Learning of Safe Driving Policy via Human-AI Copilot Optimization | 6.25 | 6.50 | 0.25 | | Poster |

312 | Cross-Domain Imitation Learning via Optimal Transport | 6.25 | 6.50 | 0.25 | | Poster |

313 | Particle Stochastic Dual Coordinate Ascent: Exponential convergent algorithm for mean field neural network optimization | 6.00 | 6.50 | 0.50 | | Poster |

314 | Bi-linear Value Networks for Multi-goal Reinforcement Learning | 5.50 | 6.50 | 1.00 | | Poster |

315 | Explaining Point Processes by Learning Interpretable Temporal Logic Rules | 5.75 | 6.50 | 0.75 | | Poster |

316 | β-Intact-VAE: Identifying and Estimating Causal Effects under Limited Overlap | 6.25 | 6.50 | 0.25 | | Poster |

317 | Shallow and Deep Networks are Near-Optimal Approximators of Korobov Functions | 6.25 | 6.50 | 0.25 | | Poster |

318 | On Evaluation Metrics for Graph Generative Models | 4.75 | 6.50 | 1.75 | | Poster |

319 | How Did the Model Change? Efficiently Assessing Machine Learning API Shifts | 6.50 | 6.50 | 0.00 | | Poster |

320 | Learning Prototype-oriented Set Representations for Meta-Learning | 6.25 | 6.50 | 0.25 | | Poster |

321 | Feature Kernel Distillation | 5.75 | 6.50 | 0.75 | | Poster |

322 | The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models | 5.75 | 6.50 | 0.75 | | Poster |

323 | What Do We Mean by Generalization in Federated Learning? | 5.00 | 6.50 | 1.50 | | Poster |

324 | Learning Curves for Gaussian Process Regression with Power-Law Priors and Targets | 4.75 | 6.50 | 1.75 | | Poster |

325 | Few-shot Learning via Dirichlet Tessellation Ensemble | 6.25 | 6.50 | 0.25 | | Poster |

326 | Is Fairness Only Metric Deep? Evaluating and Addressing Subgroup Gaps in Deep Metric Learning | 6.00 | 6.50 | 0.50 | | Poster |

327 | A Program to Build E(N)-Equivariant Steerable CNNs | 6.00 | 6.50 | 0.50 | | Poster |

328 | Variational Predictive Routing with Nested Subjective Timescales | 5.50 | 6.50 | 1.00 | | Poster |

329 | Eigencurve: Optimal Learning Rate Schedule for SGD on Quadratic Objectives with Skewed Hessian Spectrums | 4.75 | 6.50 | 1.75 | | Poster |

330 | PolyLoss: A Polynomial Expansion Perspective of Classification Loss Functions | 6.00 | 6.50 | 0.50 | | Poster |

331 | Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm | 6.00 | 6.50 | 0.50 | | Poster |

332 | Equivariant Self-Supervised Learning: Encouraging Equivariance in Representations | 5.25 | 6.50 | 1.25 | | Poster |

333 | Map Induction: Compositional spatial submap learning for efficient exploration in novel environments | 5.25 | 6.50 | 1.25 | | Poster |

334 | Capacity of Group-invariant Linear Readouts from Equivariant Representations: How Many Objects can be Linearly Classified Under All Possible Views? | 6.25 | 6.50 | 0.25 | | Poster |

335 | Surrogate Gap Minimization Improves Sharpness-Aware Training | 5.75 | 6.50 | 0.75 | | Poster |

336 | SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation | 6.67 | 6.50 | -0.17 | | Poster |

337 | Efficient and Differentiable Conformal Prediction with General Function Classes | 6.25 | 6.50 | 0.25 | | Poster |

338 | Declarative nets that are equilibrium models | 6.00 | 6.50 | 0.50 | | Poster |

339 | Capturing Structural Locality in Non-parametric Language Models | 5.75 | 6.50 | 0.75 | | Poster |

340 | IFR-Explore: Learning Inter-object Functional Relationships in 3D Indoor Scenes | 6.67 | 6.50 | -0.17 | | Poster |

341 | DEGREE: Decomposition Based Explanation for Graph Neural Networks | 6.00 | 6.50 | 0.50 | | Poster |

342 | Modular Lifelong Reinforcement Learning via Neural Composition | 5.25 | 6.50 | 1.25 | | Poster |

343 | Anisotropic Random Feature Regression in High Dimensions | 5.00 | 6.50 | 1.50 | | Poster |

344 | Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators | 6.17 | 6.50 | 0.33 | 6, 8, 6, 6, 3, 8 | 8, 8, 6, 6, 3, 8 |
345 | Understanding and Improving Graph Injection Attack by Promoting Unnoticeability | 6.25 | 6.50 | 0.25 | | Poster |

346 | Huber Additive Models for Non-stationary Time Series Analysis | 6.00 | 6.50 | 0.50 | | Poster |

347 | What Makes Better Augmentation Strategies? Augment Difficult but Not too Different | 5.75 | 6.50 | 0.75 | | Poster |

348 | Lipschitz-constrained Unsupervised Skill Discovery | 6.25 | 6.50 | 0.25 | | Poster |

349 | Temporal Efficient Training of Spiking Neural Network via Gradient Re-weighting | 5.25 | 6.50 | 1.25 | | Poster |

350 | FlexConv: Continuous Kernel Convolutions With Differentiable Kernel Sizes | 5.75 | 6.50 | 0.75 | | Poster |

351 | Backdoor Defense via Decoupling the Training Process | 6.25 | 6.50 | 0.25 | | Poster |

352 | Bayesian Framework for Gradient Leakage | 5.75 | 6.50 | 0.75 | | Poster |

353 | On the Convergence of the Monte Carlo Exploring Starts Algorithm for Reinforcement Learning | 6.50 | 6.50 | 0.00 | | Poster |

354 | Parallel Training of GRU Networks with a Multi-Grid Solver for Long Sequences | 6.25 | 6.50 | 0.25 | | Poster |

355 | Learning to Annotate Part Segmentation with Gradient Matching | 5.50 | 6.50 | 1.00 | | Poster |

356 | Predicting Physics in Mesh-reduced Space with Temporal Attention | 6.00 | 6.50 | 0.50 | | Poster |

357 | Online Ad Hoc Teamwork under Partial Observability | 6.50 | 6.50 | 0.00 | | Poster |

358 | On Incorporating Inductive Biases into VAEs | 6.25 | 6.50 | 0.25 | | Poster |

359 | Understanding the Variance Collapse of SVGD in High Dimensions | 6.50 | 6.50 | 0.00 | | Poster |

360 | Optimizing Neural Networks with Gradient Lexicase Selection | 5.25 | 6.50 | 1.25 | | Poster |

361 | Confidence Adaptive Anytime Pixel-Level Recognition | 6.00 | 6.50 | 0.50 | | Poster |

362 | How many degrees of freedom do we need to train deep networks: a loss landscape perspective | 6.50 | 6.50 | 0.00 | | Poster |

363 | Differentially Private Fine-tuning of Language Models | 6.00 | 6.50 | 0.50 | | Poster |

364 | Proof Artifact Co-Training for Theorem Proving with Language Models | 6.50 | 6.50 | 0.00 | | Poster |

365 | Frequency-aware SGD for Efficient Embedding Learning with Provable Benefits | 6.25 | 6.50 | 0.25 | | Poster |

366 | Preference Conditioned Neural Multi-objective Combinatorial Optimization | 6.50 | 6.50 | 0.00 | | Poster |

367 | Gradient Step Denoiser for convergent Plug-and-Play | 5.50 | 6.50 | 1.00 | | Poster |

368 | Model-Based Offline Meta-Reinforcement Learning with Regularization | 5.50 | 6.50 | 1.00 | | Poster |

369 | How to deal with missing data in supervised deep learning? | 6.50 | 6.50 | 0.00 | | Poster |

370 | Learning Features with Parameter-Free Layers | 6.25 | 6.50 | 0.25 | | Poster |

371 | FedPara: Low-rank Hadamard Product for Communication-Efficient Federated Learning | 6.00 | 6.50 | 0.50 | | Poster |

372 | Defending Against Image Corruptions Through Adversarial Augmentations | 5.50 | 6.50 | 1.00 | | Poster |

373 | Simple GNN Regularisation for 3D Molecular Property Prediction and Beyond | 6.00 | 6.50 | 0.50 | | Poster |

374 | Trivial or Impossible --- dichotomous data difficulty masks model differences (on ImageNet and beyond) | 6.00 | 6.50 | 0.50 | | Poster |

375 | Learning to Downsample for Segmentation of Ultra-High Resolution Images | 6.25 | 6.50 | 0.25 | | Poster |

376 | Stiffness-aware neural network for learning Hamiltonian systems | 5.75 | 6.50 | 0.75 | | Poster |

377 | GraphENS: Neighbor-Aware Ego Network Synthesis for Class-Imbalanced Node Classification | 5.50 | 6.50 | 1.00 | | Poster |

378 | Effective Model Sparsification by Scheduled Grow-and-Prune Methods | 5.50 | 6.50 | 1.00 | | Poster |

379 | T-WaveNet: A Tree-Structured Wavelet Neural Network for Time Series Signal Analysis | 6.25 | 6.50 | 0.25 | | Poster |

380 | Policy Gradients Incorporating the Future | 6.00 | 6.50 | 0.50 | | Poster |

381 | DeSKO: Stability-Assured Robust Control with a Deep Stochastic Koopman Operator | 6.50 | 6.50 | 0.00 | | Poster |

382 | Interacting Contour Stochastic Gradient Langevin Dynamics | 5.75 | 6.50 | 0.75 | | Poster |

383 | Differentiable Expectation-Maximization for Set Representation Learning | 6.00 | 6.50 | 0.50 | | Poster |

384 | Maximum n-times Coverage for Vaccine Design | 5.50 | 6.50 | 1.00 | | Poster |

385 | Efficient Computation of Deep Nonlinear Infinite-Width Neural Networks that Learn Features | 6.00 | 6.50 | 0.50 | | Poster |

386 | The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training | 5.50 | 6.50 | 1.00 | | Poster |

387 | Discovering Latent Concepts Learned in BERT | 5.00 | 6.50 | 1.50 | | Poster |

388 | Self-Supervised Inference in State-Space Models | 6.00 | 6.50 | 0.50 | | Poster |

389 | Bag of Instances Aggregation Boosts Self-supervised Distillation | 5.75 | 6.50 | 0.75 | | Poster |

390 | Reducing Excessive Margin to Achieve a Better Accuracy vs. Robustness Trade-off | 5.75 | 6.50 | 0.75 | | Poster |

391 | HTLM: Hyper-Text Pre-Training and Prompting of Language Models | 6.25 | 6.50 | 0.25 | | Poster |

392 | Evaluating Model-Based Planning and Planner Amortization for Continuous Control | 6.25 | 6.50 | 0.25 | | Poster |

393 | On the Existence of Universal Lottery Tickets | 5.25 | 6.50 | 1.25 | | Poster |

394 | Reliable Adversarial Distillation with Unreliable Teachers | 6.25 | 6.50 | 0.25 | | Poster |

395 | Spread Spurious Attribute: Improving Worst-group Accuracy with Spurious Attribute Estimation | 6.00 | 6.50 | 0.50 | | Poster |

396 | Optimal ANN-SNN Conversion for High-accuracy and Ultra-low-latency Spiking Neural Networks | 5.50 | 6.50 | 1.00 | | Poster |

397 | Bundle Networks: Fiber Bundles, Local Trivializations, and a Generative Approach to Exploring Many-to-one Maps | 5.50 | 6.50 | 1.00 | | Poster |

398 | No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models | 6.50 | 6.50 | 0.00 | | Poster |

399 | Prototypical Contrastive Predictive Coding | 6.25 | 6.50 | 0.25 | | Poster |

400 | How unlabeled data improve generalization in self-training? A one-hidden-layer theoretical analysis | 5.00 | 6.50 | 1.50 | | Poster |

401 | Effect of scale on catastrophic forgetting in neural networks | 5.00 | 6.50 | 1.50 | | Poster |

402 | Low-Budget Active Learning via Wasserstein Distance: An Integer Programming Approach | 6.50 | 6.50 | 0.00 | | Poster |

403 | Improving the Accuracy of Learning Example Weights for Imbalance Classification | 6.25 | 6.50 | 0.25 | | Poster |

404 | Fast Generic Interaction Detection for Model Interpretability and Compression | 5.75 | 6.50 | 0.75 | | Poster |

405 | AlphaZero-based Proof Cost Network to Aid Game Solving | 5.50 | 6.50 | 1.00 | | Poster |

406 | Implicit Bias of Adversarial Training for Deep Neural Networks | 6.50 | 6.50 | 0.00 | | Poster |

407 | Boosted Curriculum Reinforcement Learning | 6.67 | 6.50 | -0.17 | | Poster |

408 | NASI: Label- and Data-agnostic Neural Architecture Search at Initialization | 5.75 | 6.50 | 0.75 | | Poster |

409 | Gradient Importance Learning for Incomplete Observations | 5.50 | 6.50 | 1.00 | | Poster |

410 | PAC Prediction Sets Under Covariate Shift | 6.50 | 6.50 | 0.00 | | Poster |

411 | Hierarchical Few-Shot Imitation with Skill Transition Models | 6.25 | 6.50 | 0.25 | | Poster |

412 | The Uncanny Similarity of Recurrence and Depth | 5.75 | 6.50 | 0.75 | | Poster |

413 | Objects in Semantic Topology | 5.75 | 6.50 | 0.75 | | Poster |

414 | EigenGame Unloaded: When playing games is better than optimizing | 6.50 | 6.50 | 0.00 | | Poster |

415 | Trust Region Policy Optimisation in Multi-Agent Reinforcement Learning | 6.50 | 6.50 | 0.00 | | Poster |

416 | AdaAug: Learning Class- and Instance-adaptive Data Augmentation Policies | 5.50 | 6.50 | 1.00 | | Poster |

417 | Dealing with Non-Stationarity in MARL via Trust-Region Decomposition | 5.50 | 6.50 | 1.00 | | Poster |

418 | Predictive Modeling in the Presence of Nuisance-Induced Spurious Correlations | 5.50 | 6.40 | 0.90 | | Poster |

419 | GRAND++: Graph Neural Diffusion with A Source Term | 5.40 | 6.40 | 1.00 | 8, 6, 5, 5, 3 | 8, 6, 6, 6, 6 |
420 | On the Role of Neural Collapse in Transfer Learning | 5.80 | 6.40 | 0.60 | 6, 6, 6, 5, 6 | 6, 6, 8, 6, 6 |
421 | Learning to Schedule Learning rate with Graph Neural Networks | 5.60 | 6.40 | 0.80 | 6, 8, 6, 5, 3 | 6, 8, 6, 6, 6 |
422 | It Takes Two to Tango: Mixup for Deep Metric Learning | 6.20 | 6.40 | 0.20 | 6, 5, 6, 6, 8 | 6, 6, 6, 6, 8 |
423 | WeakM3D: Towards Weakly Supervised Monocular 3D Object Detection | 5.20 | 6.40 | 1.20 | 3, 6, 6, 6, 5 | 6, 6, 6, 8, 6 |
424 | Gradient Matching for Domain Generalization | 5.80 | 6.40 | 0.60 | 6, 6, 5, 6, 6 | 6, 6, 6, 8, 6 |
425 | Graph Neural Networks with Learnable Structural and Positional Representations | 5.60 | 6.40 | 0.80 | 5, 8, 5, 5, 5 | 6, 8, 8, 5, 5 |
426 | On the Convergence of Certified Robust Training with Interval Bound Propagation | 5.67 | 6.33 | 0.67 | | Poster |

427 | Learning Distributionally Robust Models at Scale via Composite Optimization | 5.67 | 6.33 | 0.67 | | Poster |

428 | MaGNET: Uniform Sampling from Deep Generative Network Manifolds Without Retraining | 5.33 | 6.33 | 1.00 | | Poster |

429 | Clean Images are Hard to Reblur: Exploiting the Ill-Posed Inverse Task for Dynamic Scene Deblurring | 5.67 | 6.33 | 0.67 | | Poster |

430 | Non-Autoregressive Models are Better Multilingual Translators | 6.33 | 6.33 | 0.00 | | Poster |

431 | Unified Visual Transformer Compression | 5.33 | 6.33 | 1.00 | | Poster |

432 | Bridging Recommendation and Marketing via Recurrent Intensity Modeling | 5.67 | 6.33 | 0.67 | | Poster |

433 | Language-driven Semantic Segmentation | 5.67 | 6.33 | 0.67 | | Poster |

434 | Optimal Representations for Covariate Shift | 6.33 | 6.33 | 0.00 | | Poster |

435 | Eliminating Sharp Minima from SGD with Truncated Heavy-tailed Noise | 5.33 | 6.33 | 1.00 | | Poster |

436 | CrowdPlay: Crowdsourcing human demonstration data for offline learning in Atari games | 6.33 | 6.33 | 0.00 | | Poster |

437 | Which Shortcut Cues Will DNNs Choose? A Study from the Parameter-Space Perspective | 6.00 | 6.33 | 0.33 | | Poster |

438 | Pseudo-Labeled Auto-Curriculum Learning for Semi-Supervised Keypoint Localization | 6.33 | 6.33 | 0.00 | | Poster |

439 | Reversible Instance Normalization for Accurate Time-Series Forecasting against Distribution Shift | 5.33 | 6.33 | 1.00 | | Poster |

440 | Distilling GANs with Style-Mixed Triplets for X2I Translation with Limited Data | 5.67 | 6.33 | 0.67 | | Poster |

441 | Neural Networks as Kernel Learners: The Silent Alignment Effect | 6.00 | 6.33 | 0.33 | | Poster |

442 | Hierarchical Variational Memory for Few-shot Learning Across Domains | 5.67 | 6.33 | 0.67 | | Poster |

443 | Learning to Map for Active Semantic Goal Navigation | 6.00 | 6.33 | 0.33 | | Poster |

444 | Sparse Attention with Learning to Hash | 5.33 | 6.33 | 1.00 | | Poster |

445 | Auto-scaling Vision Transformers without Training | 6.00 | 6.33 | 0.33 | | Poster |

446 | Autonomous Learning of Object-Centric Abstractions for High-Level Planning | 6.33 | 6.33 | 0.00 | | Poster |

447 | Concurrent Adversarial Learning for Large-Batch Training | 6.33 | 6.33 | 0.00 | | Poster |

448 | Fine-grained Differentiable Physics: A Yarn-level Model for Fabrics | 5.83 | 6.33 | 0.50 | 6, 6, 6, 6, 5, 6 | 6, 6, 6, 6, 8, 6 |
449 | Counterfactual Plans under Distributional Ambiguity | 6.00 | 6.33 | 0.33 | | Poster |

450 | Pareto Policy Adaptation | 5.33 | 6.33 | 1.00 | | Poster |

451 | Mapping conditional distributions for domain adaptation under generalized target shift | 6.33 | 6.33 | 0.00 | | Poster |

452 | Anti-Concentrated Confidence Bonuses For Scalable Exploration | 6.33 | 6.33 | 0.00 | | Poster |

453 | ViDT: An Efficient and Effective Fully Transformer-based Object Detector | 6.00 | 6.33 | 0.33 | | Poster |

454 | Information-theoretic Online Memory Selection for Continual Learning | 5.67 | 6.33 | 0.67 | | Poster |

455 | Transformers Can Do Bayesian Inference | 6.33 | 6.33 | 0.00 | | Poster |

456 | Neural Models for Output-Space Invariance in Combinatorial Problems | 6.33 | 6.33 | 0.00 | | Poster |

457 | Neural Solvers for Fast and Accurate Numerical Optimal Control | 5.33 | 6.33 | 1.00 | | Poster |

458 | Doubly Adaptive Scaled Algorithm for Machine Learning Using Second-Order Information | 5.33 | 6.33 | 1.00 | | Poster |

459 | Using Graph Representation Learning with Schema Encoders to Measure the Severity of Depressive Symptoms | 5.33 | 6.33 | 1.00 | | Poster |

460 | Generative Principal Component Analysis | 5.33 | 6.33 | 1.00 | | Poster |

461 | Rethinking Goal-Conditioned Supervised Learning and Its Connection to Offline RL | 6.00 | 6.33 | 0.33 | | Poster |

462 | DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR | 5.33 | 6.33 | 1.00 | | Poster |

463 | MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer | 6.33 | 6.33 | 0.00 | | Poster |

464 | Incremental False Negative Detection for Contrastive Learning | 5.00 | 6.33 | 1.33 | | Poster |

465 | A Neural Tangent Kernel Perspective of Infinite Tree Ensembles | 6.33 | 6.33 | 0.00 | | Poster |

466 | Fairness Guarantees under Demographic Shift | 5.75 | 6.25 | 0.50 | | Poster |

467 | Connectome-constrained Latent Variable Model of Whole-Brain Neural Activity | 5.00 | 6.25 | 1.25 | | Poster |

468 | Automated Self-Supervised Learning for Graphs | 6.00 | 6.25 | 0.25 | | Poster |

469 | Knowledge Infused Decoding | 6.00 | 6.25 | 0.25 | | Poster |

470 | Distributional Reinforcement Learning with Monotonic Splines | 6.00 | 6.25 | 0.25 | | Poster |

471 | AdaMatch: A Unified Approach to Semi-Supervised Learning and Domain Adaptation | 6.25 | 6.25 | 0.00 | | Poster |

472 | Learning Value Functions from Undirected State-only Experience | 6.00 | 6.25 | 0.25 | | Poster |

473 | Finding an Unsupervised Image Segmenter in each of your Deep Generative Models | 6.25 | 6.25 | 0.00 | | Poster |

474 | Neural Processes with Stochastic Attention: Paying more attention to the context dataset | 5.50 | 6.25 | 0.75 | | Poster |

475 | SUMNAS: Supernet with Unbiased Meta-Features for Neural Architecture Search | 5.75 | 6.25 | 0.50 | | Poster |

476 | Semi-relaxed Gromov-Wasserstein divergence and applications on graphs | 6.25 | 6.25 | 0.00 | | Poster |

477 | Implicit Bias of MSE Gradient Optimization in Underparameterized Neural Networks | 5.50 | 6.25 | 0.75 | | Poster |

478 | Neural Link Prediction with Walk Pooling | 5.75 | 6.25 | 0.50 | | Poster |

479 | Online Continual Learning on Class Incremental Blurry Task Configuration with Anytime Inference | 5.00 | 6.25 | 1.25 | | Poster |

480 | Adversarial Retriever-Ranker for Dense Text Retrieval | 6.00 | 6.25 | 0.25 | | Poster |

481 | Provable Learning-based Algorithm For Sparse Recovery | 5.00 | 6.25 | 1.25 | | Poster |

482 | Goal-Directed Planning via Hindsight Experience Replay | 5.50 | 6.25 | 0.75 | | Poster |

483 | GDA-AM: On the Effectiveness of Solving Minimax Optimization via Anderson Mixing | 4.75 | 6.25 | 1.50 | | Poster |

484 | The Essential Elements of Offline RL via Supervised Learning | 4.75 | 6.25 | 1.50 | | Poster |

485 | Near-optimal Offline Reinforcement Learning with Linear Representation: Leveraging Variance Information with Pessimism | 6.25 | 6.25 | 0.00 | | Poster |

486 | Conditional Contrastive Learning with Kernel | 5.50 | 6.25 | 0.75 | | Poster |

487 | Differentiable Gradient Sampling for Learning Implicit 3D Scene Reconstructions from a Single Image | 5.75 | 6.25 | 0.50 | | Poster |

488 | The Three Stages of Learning Dynamics in High-dimensional Kernel Methods | 6.25 | 6.25 | 0.00 | | Poster |

489 | FedBABU: Toward Enhanced Representation for Federated Image Classification | 6.00 | 6.25 | 0.25 | | Poster |

490 | Curriculum learning as a tool to uncover learning principles in the brain | 5.00 | 6.25 | 1.25 | | Poster |

491 | Model Zoo: A Growing Brain That Learns Continually | 6.25 | 6.25 | 0.00 | | Poster |

492 | Heteroscedastic Temporal Variational Autoencoder For Irregularly Sampled Time Series | 5.50 | 6.25 | 0.75 | | Poster |

493 | Fast Model Editing at Scale | 6.33 | 6.25 | -0.08 | | Poster |

494 | TAda! Temporally-Adaptive Convolutions for Video Understanding | 5.50 | 6.25 | 0.75 | | Poster |

495 | Blaschke Product Neural Networks (BPNN): A Physics-Infused Neural Network for Phase Retrieval of Meromorphic Functions | 5.00 | 6.25 | 1.25 | | Poster |

496 | Step-unrolled Denoising Autoencoders for Text Generation | 5.50 | 6.25 | 0.75 | | Poster |

497 | Who Is the Strongest Enemy? Towards Optimal and Efficient Evasion Attacks in Deep RL | 6.25 | 6.25 | 0.00 | | Poster |

498 | Neural Parameter Allocation Search | 5.00 | 6.25 | 1.25 | | Poster |

499 | Generalized Kernel Thinning | 6.25 | 6.25 | 0.00 | | Poster |

500 | Do deep networks transfer invariances across classes? | 5.25 | 6.25 | 1.00 | | Poster |

501 | Transferable Visual Control Policies Through Robot-Awareness | 5.50 | 6.25 | 0.75 | | Poster |

502 | Deep Point Cloud Reconstruction | 6.25 | 6.25 | 0.00 | | Poster |

503 | Learning curves for continual learning in neural networks: Self-knowledge transfer and forgetting | 6.25 | 6.25 | 0.00 | | Poster |

504 | Collapse by Conditioning: Training Class-conditional GANs with Limited Data | 6.00 | 6.25 | 0.25 | | Poster |

505 | Prospect Pruning: Finding Trainable Weights at Initialization using Meta-Gradients | 6.00 | 6.25 | 0.25 | | Poster |

506 | Is Importance Weighting Incompatible with Interpolating Classifiers? | 5.67 | 6.25 | 0.58 | | Poster |

507 | Exposing the Implicit Energy Networks behind Masked Language Models via Metropolis-Hastings | 6.25 | 6.25 | 0.00 | | Poster |

508 | How Much Can CLIP Benefit Vision-and-Language Tasks? | 5.75 | 6.25 | 0.50 | | Poster |

509 | It Takes Four to Tango: Multiagent Self Play for Automatic Curriculum Generation | 5.00 | 6.25 | 1.25 | | Poster |

510 | Large-Scale Representation Learning on Graphs via Bootstrapping | 6.00 | 6.25 | 0.25 | | Poster |

511 | Neural Contextual Bandits with Deep Representation and Shallow Exploration | 6.75 | 6.25 | -0.50 | | Poster |

512 | Robbing the Fed: Directly Obtaining Private Data in Federated Learning with Modified Models | 6.25 | 6.25 | 0.00 | | Poster |

513 | Discriminative Similarity for Data Clustering | 6.25 | 6.25 | 0.00 | | Poster |

514 | CoST: Contrastive Learning of Disentangled Seasonal-Trend Representations for Time Series Forecasting | 6.00 | 6.25 | 0.25 | | Poster |

515 | The Evolution of Uncertainty of Learning in Games | 5.75 | 6.25 | 0.50 | | Poster |

516 | Enabling Arbitrary Translation Objectives with Adaptive Tree Search | 6.00 | 6.25 | 0.25 | | Poster |

517 | CrossFormer: A Versatile Vision Transformer Hinging on Cross-scale Attention | 5.75 | 6.25 | 0.50 | | Poster |

518 | Subspace Regularizers for Few-Shot Class Incremental Learning | 5.75 | 6.25 | 0.50 | | Poster |

519 | Explainable GNN-Based Models over Knowledge Graphs | 5.25 | 6.25 | 1.00 | | Poster |

520 | Evolutionary Diversity Optimization with Clustering-based Selection for Reinforcement Learning | 4.67 | 6.25 | 1.58 | | Poster |

521 | R4D: Utilizing Reference Objects for Long-Range Distance Estimation | 6.25 | 6.25 | 0.00 | | Poster |

522 | A Biologically Interpretable Graph Convolutional Network to Link Genetic Risk Pathways and Imaging Phenotypes of Disease | 5.75 | 6.25 | 0.50 | | Poster |

523 | CADDA: Class-wise Automatic Differentiable Data Augmentation for EEG Signals | 6.00 | 6.25 | 0.25 | | Poster |

524 | How Low Can We Go: Trading Memory for Error in Low-Precision Training | 5.75 | 6.25 | 0.50 | | Poster |

525 | Boosting the Certified Robustness of L-infinity Distance Nets | 5.75 | 6.25 | 0.50 | | Poster |

526 | Memory Augmented Optimizers for Deep Learning | 6.25 | 6.25 | 0.00 | | Poster |

527 | Gaussian Mixture Convolution Networks | 6.33 | 6.25 | -0.08 | | Poster |

528 | Evidential Turing Processes | 5.50 | 6.25 | 0.75 | | Poster |

529 | A global convergence theory for deep ReLU implicit networks via over-parameterization | 6.25 | 6.25 | 0.00 | | Poster |

530 | How Well Does Self-Supervised Pre-Training Perform with Streaming Data? | 6.00 | 6.25 | 0.25 | | Poster |

531 | Scale Efficiently: Insights from Pretraining and Finetuning Transformers | 6.25 | 6.25 | 0.00 | | Poster |

532 | Learning to Extend Molecular Scaffolds with Structural Motifs | 6.25 | 6.25 | 0.00 | | Poster |

533 | Group-based Interleaved Pipeline Parallelism for Large-scale DNN Training | 6.25 | 6.25 | 0.00 | | Poster |

534 | Switch to Generalize: Domain-Switch Learning for Cross-Domain Few-Shot Classification | 5.75 | 6.25 | 0.50 | | Poster |

535 | DriPP: Driven Point Processes to Model Stimuli Induced Patterns in M/EEG Signals | 5.00 | 6.25 | 1.25 | | Poster |

536 | Taming Sparsely Activated Transformer with Stochastic Experts | 5.75 | 6.25 | 0.50 | | Poster |

537 | Quantitative Performance Assessment of CNN Units via Topological Entropy Calculation | 5.50 | 6.25 | 0.75 | | Poster |

538 | Unsupervised Disentanglement with Tensor Product Representations on the Torus | 6.25 | 6.25 | 0.00 | | Poster |

539 | Multi-Agent MDP Homomorphic Networks | 6.00 | 6.25 | 0.25 | | Poster |

540 | DARA: Dynamics-Aware Reward Augmentation in Offline Reinforcement Learning | 6.00 | 6.25 | 0.25 | | Poster |

541 | Online Coreset Selection for Rehearsal-based Continual Learning | 5.75 | 6.25 | 0.50 | | Poster |

542 | Mirror Descent Policy Optimization | 5.75 | 6.25 | 0.50 | | Poster |

543 | On-Policy Model Errors in Reinforcement Learning | 6.00 | 6.25 | 0.25 | | Poster |

544 | In a Nutshell, the Human Asked for This: Latent Goals for Following Temporal Specifications | 4.75 | 6.25 | 1.50 | | Poster |

545 | Multi-Mode Deep Matrix and Tensor Factorization | 6.33 | 6.25 | -0.08 | | Poster |

546 | Pessimistic Model-based Offline Reinforcement Learning under Partial Coverage | 6.25 | 6.25 | 0.00 | | Poster |

547 | Scale Mixtures of Neural Network Gaussian Processes | 6.00 | 6.25 | 0.25 | | Poster |

548 | Monotonic Differentiable Sorting Networks | 6.00 | 6.25 | 0.25 | | Poster |

549 | Target-Side Data Augmentation for Sequence Generation | 4.75 | 6.25 | 1.50 | | Poster |

550 | Quadtree Attention for Vision Transformers | 6.25 | 6.25 | 0.00 | | Poster |

551 | Igeood: An Information Geometry Approach to Out-of-Distribution Detection | 5.00 | 6.25 | 1.25 | | Poster |

552 | Continual Normalization: Rethinking Batch Normalization for Online Continual Learning | 5.50 | 6.25 | 0.75 | | Poster |

553 | On feature learning in shallow and multi-layer neural networks with global convergence guarantees | 5.50 | 6.25 | 0.75 | | Poster |

554 | Resonance in Weight Space: Covariate Shift Can Drive Divergence of SGD with Momentum | 6.25 | 6.25 | 0.00 | | Poster |

555 | Generative Modeling with Optimal Transport Maps | 6.00 | 6.25 | 0.25 | | Poster |

556 | Multi-Task Processes | 6.00 | 6.25 | 0.25 | | Poster |

557 | Expressivity of Emergent Languages is a Trade-off between Contextual Complexity and Unpredictability | 5.50 | 6.25 | 0.75 | | Poster |

558 | Zero-CL: Instance and Feature decorrelation for negative-free symmetric contrastive learning | 6.25 | 6.25 | 0.00 | | Poster |

559 | GATSBI: Generative Adversarial Training for Simulation-Based Inference | 6.00 | 6.25 | 0.25 | | Poster |

560 | Encoding Weights of Irregular Sparsity for Fixed-to-Fixed Model Compression | 6.00 | 6.25 | 0.25 | | Poster |

561 | Rethinking Class-Prior Estimation for Positive-Unlabeled Learning | 6.00 | 6.25 | 0.25 | | Poster |

562 | Top-N: Equivariant Set and Graph Generation without Exchangeability | 5.00 | 6.25 | 1.25 | | Poster |

563 | FastSHAP: Real-Time Shapley Value Estimation | 5.00 | 6.25 | 1.25 | | Poster |

564 | Autoregressive Diffusion Models | 6.25 | 6.25 | 0.00 | | Poster |

565 | Maximum Entropy RL (Provably) Solves Some Robust RL Problems | 5.75 | 6.25 | 0.50 | | Poster |

566 | Constraining Linear-chain CRFs to Regular Languages | 5.75 | 6.25 | 0.50 | | Poster |

567 | Neural Markov Controlled SDE: Stochastic Optimization for Continuous-Time Data | 6.25 | 6.25 | 0.00 | | Poster |

568 | Disentanglement Analysis with Partial Information Decomposition | 5.50 | 6.25 | 0.75 | | Poster |

569 | Hindsight Foresight Relabeling for Meta-Reinforcement Learning | 5.00 | 6.25 | 1.25 | | Poster |

570 | Graph Auto-Encoder via Neighborhood Wasserstein Reconstruction | 6.25 | 6.25 | 0.00 | | Poster |

571 | Self-ensemble Adversarial Training for Improved Robustness | 5.00 | 6.25 | 1.25 | | Poster |

572 | An Autoregressive Flow Model for 3D Molecular Geometry Generation from Scratch | 6.25 | 6.25 | 0.00 | | Poster |

573 | Learning Fast, Learning Slow: A General Continual Learning Method based on Complementary Learning System | 6.25 | 6.25 | 0.00 | | Poster |

574 | Non-Parallel Text Style Transfer with Self-Parallel Supervision | 5.00 | 6.20 | 1.20 | 6, 6, 5, 3, 5 | 8, 6, 8, 3, 6 |
575 | Cross-Domain Lossy Compression as Optimal Transport with an Entropy Bottleneck | 6.20 | 6.20 | 0.00 | 3, 8, 6, 6, 8 | 3, 8, 6, 6, 8 |
576 | NASViT: Neural Architecture Search for Efficient Vision Transformers with Gradient Conflict aware Supernet Training | 6.20 | 6.20 | 0.00 | 6, 5, 6, 8, 6 | 6, 5, 6, 8, 6 |
577 | Policy Smoothing for Provably Robust Reinforcement Learning | 5.40 | 6.20 | 0.80 | 6, 6, 6, 6, 3 | 6, 8, 6, 6, 5 |
578 | The Spectral Bias of Polynomial Neural Networks | 5.40 | 6.20 | 0.80 | 3, 6, 6, 6, 6 | 5, 6, 6, 8, 6 |
579 | Fair Normalizing Flows | 5.00 | 6.20 | 1.20 | 6, 3, 5, 5, 6 | 6, 5, 8, 6, 6 |
580 | Understanding Dimensional Collapse in Contrastive Self-supervised Learning | 5.60 | 6.20 | 0.60 | 6, 3, 8, 6, 5 | 6, 6, 8, 6, 5 |
581 | A Theoretical Analysis on Feature Learning in Neural Networks: Emergence from Inputs and Advantage over Fixed Features | 6.00 | 6.20 | 0.20 | 5, 8, 6, 6, 5 | 5, 8, 6, 6, 6 |
582 | BiBERT: Accurate Fully Binarized BERT | 6.00 | 6.20 | 0.20 | 5, 6, 5, 6, 8 | 6, 6, 5, 6, 8 |
583 | Towards Deepening Graph Neural Networks: A GNTK-based Optimization Perspective | 5.80 | 6.20 | 0.40 | 5, 5, 6, 5, 8 | 6, 5, 6, 6, 8 |
584 | Object Dynamics Distillation for Scene Decomposition and Representation | 6.00 | 6.20 | 0.20 | 5, 5, 8, 6, 6 | 6, 5, 8, 6, 6 |
585 | On Redundancy and Diversity in Cell-based Neural Architecture Search | 6.00 | 6.20 | 0.20 | 5, 5, 8, 6, 6 | 5, 6, 8, 6, 6 |
586 | Efficient Neural Causal Discovery without Acyclicity Constraints | 6.00 | 6.20 | 0.20 | 6, 6, 5, 8, 5 | 6, 6, 5, 8, 6 |
587 | Top-label calibration and multiclass-to-binary reductions | 5.50 | 6.00 | 0.50 | | Poster |

588 | PipeGCN: Efficient Full-Graph Training of Graph Convolutional Networks with Pipelined Feature Communication | 5.75 | 6.00 | 0.25 | | Poster |

589 | Auto-Transfer: Learning to Route Transferable Representations | 5.00 | 6.00 | 1.00 | | Poster |

590 | FILM: Following Instructions in Language with Modular Methods | 6.25 | 6.00 | -0.25 | | Poster |

591 | Deep Learning without Shortcuts: Shaping the Kernel with Tailored Rectifiers | 6.00 | 6.00 | 0.00 | | Poster |

592 | Language model compression with weighted low-rank factorization | 5.33 | 6.00 | 0.67 | | Poster |

593 | The Effects of Invertibility on the Representational Complexity of Encoders in Variational Autoencoders | 4.67 | 6.00 | 1.33 | | Poster |

594 | Prototype memory and attention mechanisms for few shot image generation | 6.00 | 6.00 | 0.00 | | Poster |

595 | Learning Guarantees for Graph Convolutional Networks on the Stochastic Block Model | 5.50 | 6.00 | 0.50 | | Poster |

596 | CrossMatch: Cross-Classifier Consistency Regularization for Open-Set Single Domain Generalization | 5.50 | 6.00 | 0.50 | | Poster |

597 | LIGS: Learnable Intrinsic-Reward Generation Selection for Multi-Agent Learning | 5.25 | 6.00 | 0.75 | | Poster |

598 | Cold Brew: Distilling Graph Node Representations with Incomplete or Missing Neighborhoods | 5.25 | 6.00 | 0.75 | | Poster |

599 | Learning Representation from Neural Fisher Kernel with Low-rank Approximation | 6.00 | 6.00 | 0.00 | | Poster |

600 | Discrete Representations Strengthen Vision Transformer Robustness | 5.33 | 6.00 | 0.67 | | Poster |

601 | Modeling Label Space Interactions in Multi-label Classification using Box Embeddings | 6.00 | 6.00 | 0.00 | | Poster |

602 | Graph-Guided Network for Irregularly Sampled Multivariate Time Series | 5.33 | 6.00 | 0.67 | | Poster |

603 | Learning to Dequantise with Truncated Flows | 5.33 | 6.00 | 0.67 | | Poster |

604 | Axiomatic Explanations for Visual Search, Retrieval, and Similarity Learning | 5.00 | 6.00 | 1.00 | | Poster |

605 | Autonomous Reinforcement Learning: Formalism and Benchmarking | 6.00 | 6.00 | 0.00 | | Poster |

606 | VAT-Mart: Learning Visual Action Trajectory Proposals for Manipulating 3D ARTiculated Objects | 5.00 | 6.00 | 1.00 | | Poster |

607 | An Agnostic Approach to Federated Learning with Class Imbalance | 5.50 | 6.00 | 0.50 | | Poster |

608 | Generalization Through the Lens of Leave-One-Out Error | 4.67 | 6.00 | 1.33 | | Poster |

609 | Complete Verification via Multi-Neuron Relaxation Guided Branch-and-Bound | 4.80 | 6.00 | 1.20 | 5, 5, 5, 6, 3 | 6, 6, 6, 6, 6 |
610 | Augmented Sliced Wasserstein Distances | 6.00 | 6.00 | 0.00 | | Poster |

611 | W-CTC: a Connectionist Temporal Classification Loss with Wild Cards | 5.75 | 6.00 | 0.25 | | Poster |

612 | DictFormer: Tiny Transformer with Shared Dictionary | 5.25 | 6.00 | 0.75 | | Poster |

613 | Nonlinear ICA Using Volume-Preserving Transformations | 5.80 | 6.00 | 0.20 | 6, 6, 6, 6, 5 | 6, 6, 6, 6, 6 |
614 | Post hoc Explanations may be Ineffective for Detecting Unknown Spurious Correlation | 4.67 | 6.00 | 1.33 | | Poster |

615 | PoNet: Pooling Network for Efficient Token Mixing in Long Sequences | 5.75 | 6.00 | 0.25 | | Poster |

616 | DISSECT: Disentangled Simultaneous Explanations via Concept Traversals | 5.75 | 6.00 | 0.25 | | Poster |

617 | Is Homophily a Necessity for Graph Neural Networks? | 5.25 | 6.00 | 0.75 | | Poster |

618 | Query Embedding on Hyper-Relational Knowledge Graphs | 6.00 | 6.00 | 0.00 | 8, 5, 5, 6, 6 | 8, 5, 5, 6, 6 |
619 | Wish you were here: Hindsight Goal Selection for long-horizon dexterous manipulation | 5.00 | 6.00 | 1.00 | | Poster |

620 | Selective Ensembles for Consistent Predictions | 5.50 | 6.00 | 0.50 | | Poster |

621 | Open-World Semi-Supervised Learning | 5.80 | 6.00 | 0.20 | 6, 6, 6, 6, 5 | 6, 6, 6, 6, 6 |
622 | On the benefits of maximum likelihood estimation for Regression and Forecasting | 5.33 | 6.00 | 0.67 | | Poster |

623 | An Explanation of In-context Learning as Implicit Bayesian Inference | 5.50 | 6.00 | 0.50 | | Poster |

624 | Stein Latent Optimization for Generative Adversarial Networks | 5.50 | 6.00 | 0.50 | | Poster |

625 | Pseudo Numerical Methods for Diffusion Models on Manifolds | 6.00 | 6.00 | 0.00 | | Poster |

626 | Discrepancy-Based Active Learning for Domain Adaptation | 5.75 | 6.00 | 0.25 | | Poster |

627 | Adversarial Unlearning of Backdoors via Implicit Hypergradient | 5.25 | 6.00 | 0.75 | | Poster |

628 | Surrogate NAS Benchmarks: Going Beyond the Limited Search Spaces of Tabular NAS Benchmarks | 5.50 | 6.00 | 0.50 | | Poster |

629 | Offline Reinforcement Learning for Large Scale Language Action Spaces | 5.00 | 6.00 | 1.00 | | Poster |

630 | Generalized Natural Gradient Flows in Hidden Convex-Concave Games and GANs | 5.25 | 6.00 | 0.75 | | Poster |

631 | Learning Weakly-supervised Contrastive Representations | 5.50 | 6.00 | 0.50 | | Poster |

632 | Generalizing Few-Shot NAS with Gradient Matching | 5.75 | 6.00 | 0.25 | | Poster |

633 | THOMAS: Trajectory Heatmap Output with learned Multi-Agent Sampling | 5.00 | 6.00 | 1.00 | | Poster |

634 | SURF: Semi-supervised Reward Learning with Data Augmentation for Feedback-efficient Preference-based Reinforcement Learning | 5.33 | 6.00 | 0.67 | | Poster |

635 | Scaling the Depth of Vision Transformers via the Fourier Domain Analysis | 5.33 | 6.00 | 0.67 | | Poster |

636 | Illiterate DALL⋅E Learns to Compose | 5.33 | 6.00 | 0.67 | | Poster |

637 | Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning | 4.75 | 6.00 | 1.25 | | Poster |

638 | Variational autoencoders in the presence of low-dimensional data: landscape and implicit bias | 5.00 | 6.00 | 1.00 | | Poster |

639 | Online Adversarial Attacks | 5.25 | 6.00 | 0.75 | | Poster |

640 | Provably convergent quasistatic dynamics for mean-field two-player zero-sum games | 5.75 | 6.00 | 0.25 | | Poster |

641 | Space-Time Graph Neural Networks | 6.00 | 6.00 | 0.00 | | Poster |

642 | IGLU: Efficient GCN Training via Lazy Updates | 5.67 | 6.00 | 0.33 | | Poster |

643 | On the Pitfalls of Heteroscedastic Uncertainty Estimation with Probabilistic Neural Networks | 4.33 | 6.00 | 1.67 | | Poster |

644 | RegionViT: Regional-to-Local Attention for Vision Transformers | 6.00 | 6.00 | 0.00 | | Poster |

645 | Group equivariant neural posterior estimation | 5.25 | 6.00 | 0.75 | | Poster |

646 | GeneDisco: A Benchmark for Experimental Design in Drug Discovery | 4.67 | 6.00 | 1.33 | | Poster |

647 | One After Another: Learning Incremental Skills for a Changing World | 4.75 | 6.00 | 1.25 | | Poster |

648 | Hidden Parameter Recurrent State Space Models For Changing Dynamics Scenarios | 5.00 | 6.00 | 1.00 | | Poster |

649 | Universalizing Weak Supervision | 5.25 | 6.00 | 0.75 | | Poster |

650 | Toward Efficient Low-Precision Training: Data Format Optimization and Hysteresis Quantization | 4.67 | 6.00 | 1.33 | | Poster |

651 | The Rich Get Richer: Disparate Impact of Semi-Supervised Learning | 5.50 | 6.00 | 0.50 | | Poster |

652 | On the role of population heterogeneity in emergent communication | 5.00 | 6.00 | 1.00 | | Poster |

653 | MoReL: Multi-omics Relational Learning | 6.00 | 6.00 | 0.00 | | Poster |

654 | Topological Graph Neural Networks | 5.75 | 6.00 | 0.25 | | Poster |

655 | Measuring CLEVRness: Black-box Testing of Visual Reasoning Models | 5.67 | 6.00 | 0.33 | | Poster |

656 | TPU-GAN: Learning temporal coherence from dynamic point cloud sequences | 5.80 | 6.00 | 0.20 | 6, 6, 6, 6, 5 | 6, 6, 6, 6, 6 |
657 | OntoProtein: Protein Pretraining With Gene Ontology Embedding | 5.67 | 6.00 | 0.33 | | Poster |

658 | Orchestrated Value Mapping for Reinforcement Learning | 5.67 | 6.00 | 0.33 | | Poster |

659 | Offline Neural Contextual Bandits: Pessimism, Optimization and Generalization | 5.50 | 6.00 | 0.50 | | Poster |

660 | Do Users Benefit From Interpretable Vision? A User Study, Baseline, And Dataset | 5.25 | 6.00 | 0.75 | | Poster |

661 | Communication-Efficient Actor-Critic Methods for Homogeneous Markov Games | 5.25 | 6.00 | 0.75 | | Poster |

662 | Training Transition Policies via Distribution Matching for Complex Tasks | 6.00 | 6.00 | 0.00 | | Poster |

663 | On Robust Prefix-Tuning for Text Classification | 5.50 | 6.00 | 0.50 | | Poster |

664 | The Efficiency Misnomer | 4.75 | 6.00 | 1.25 | | Poster |

665 | Towards Training Billion Parameter Graph Neural Networks for Atomic Simulations | 5.75 | 6.00 | 0.25 | | Poster |

666 | Neural Methods for Logical Reasoning over Knowledge Graphs | 5.25 | 6.00 | 0.75 | | Poster |

667 | Givens Coordinate Descent Methods for Rotation Matrix Learning in Trainable Embedding Indexes | 5.75 | 6.00 | 0.25 | | Poster |

668 | Charformer: Fast Character Transformers via Gradient-based Subword Tokenization | 6.00 | 6.00 | 0.00 | 6, 8, 6, 5, 5 | 6, 8, 6, 5, 5 |
669 | Signing the Supermask: Keep, Hide, Invert | 5.00 | 6.00 | 1.00 | | Poster |

670 | Attention-based Interpretability with Concept Transformers | 5.25 | 6.00 | 0.75 | | Poster |

671 | Normalization of Language Embeddings for Cross-Lingual Alignment | 5.60 | 6.00 | 0.40 | 8, 6, 5, 3, 6 | 8, 6, 5, 3, 8 |
672 | Offline Reinforcement Learning with In-sample Q-Learning | 5.50 | 6.00 | 0.50 | | Poster |

673 | Differentiable DAG Sampling | 6.00 | 6.00 | 0.00 | | Poster |

674 | On the Convergence of mSGD and AdaGrad for Stochastic Optimization | 5.67 | 6.00 | 0.33 | | Poster |

675 | Neural Stochastic Dual Dynamic Programming | 5.75 | 6.00 | 0.25 | | Poster |

676 | ToM2C: Target-oriented Multi-agent Communication and Cooperation with Theory of Mind | 5.33 | 6.00 | 0.67 | | Poster |

677 | Learning Invariant Representations on Multilingual Language Models for Unsupervised Cross-Lingual Transfer | 5.50 | 6.00 | 0.50 | | Poster |

678 | Learning Curves for SGD on Structured Features | 5.75 | 6.00 | 0.25 | | Poster |

679 | Learning Scenario Representation for Solving Two-stage Stochastic Integer Programs | 4.33 | 6.00 | 1.67 | | Poster |

680 | Recursive Disentanglement Network | 5.25 | 6.00 | 0.75 | | Poster |

681 | MAML is a Noisy Contrastive Learner | 5.33 | 6.00 | 0.67 | | Poster |

682 | L0-Sparse Canonical Correlation Analysis | 6.00 | 6.00 | 0.00 | | Poster |

683 | Hot-Refresh Model Upgrades with Regression-Free Compatible Training in Image Retrieval | 5.75 | 6.00 | 0.25 | | Poster |

684 | A Statistical Framework for Efficient Out of Distribution Detection in Deep Neural Networks | 5.50 | 6.00 | 0.50 | | Poster |

685 | Transfer RL across Observation Feature Spaces via Model-Based Regularization | 5.25 | 6.00 | 0.75 | | Poster |

686 | A Theory of Tournament Representations | 5.25 | 6.00 | 0.75 | | Poster |

687 | Graph-Enhanced Exploration for Goal-oriented Reinforcement Learning | 5.50 | 6.00 | 0.50 | | Poster |

688 | Conditioning Sequence-to-sequence Networks with Learned Activations | 5.67 | 6.00 | 0.33 | | Poster |

689 | PSA-GAN: Progressive Self Attention GANs for Synthetic Time Series | 5.50 | 6.00 | 0.50 | | Poster |

690 | Controlling the Complexity and Lipschitz Constant improves Polynomial Nets | 6.00 | 6.00 | 0.00 | | Poster |

691 | Vector-quantized Image Modeling with Improved VQGAN | 5.50 | 6.00 | 0.50 | | Poster |

692 | Sample Efficient Stochastic Policy Extragradient Algorithm for Zero-Sum Markov Game | 5.60 | 6.00 | 0.40 | 5, 6, 6, 6, 5 | 6, 6, 6, 6, 6 |
693 | Optimal Transport for Long-Tailed Recognition with Learnable Cost Matrix | 5.33 | 6.00 | 0.67 | | Poster |

694 | BadPre: Task-agnostic Backdoor Attacks to Pre-trained NLP Foundation Models | 6.00 | 6.00 | 0.00 | | Poster |

695 | Partial Wasserstein Adversarial Network for Non-rigid Point Set Registration | 5.80 | 6.00 | 0.20 | 6, 6, 6, 6, 5 | 6, 6, 6, 6, 6 |
696 | Few-Shot Backdoor Attacks on Visual Object Tracking | 5.33 | 6.00 | 0.67 | | Poster |

697 | Generative Pseudo-Inverse Memory | 6.00 | 6.00 | 0.00 | | Poster |

698 | PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Dependent Adaptive Prior | 5.00 | 6.00 | 1.00 | | Poster |

699 | How Attentive are Graph Attention Networks? | 6.00 | 6.00 | 0.00 | | Poster |

700 | Dropout Q-Functions for Doubly Efficient Reinforcement Learning | 4.67 | 6.00 | 1.33 | | Poster |

701 | Evaluating Disentanglement of Structured Latent Representations | 5.67 | 6.00 | 0.33 | | Poster |

702 | MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts | 5.67 | 6.00 | 0.33 | | Poster |

703 | iFlood: A Stable and Effective Regularizer | 5.25 | 6.00 | 0.75 | | Poster |

704 | An Operator Theoretic View On Pruning Deep Neural Networks | 6.25 | 6.00 | -0.25 | | Poster |

705 | Optimizer Amalgamation | 5.75 | 6.00 | 0.25 | | Poster |

706 | Wisdom of Committees: An Overlooked Approach To Faster and More Accurate Models | 5.33 | 6.00 | 0.67 | | Poster |

707 | Neural graphical modelling in continuous-time: consistency guarantees and algorithms | 6.50 | 6.00 | -0.50 | | Poster |

708 | Adaptive Wavelet Transformer Network for 3D Shape Representation Learning | 5.75 | 6.00 | 0.25 | | Poster |

709 | Transferable Adversarial Attack based on Integrated Gradients | 5.75 | 6.00 | 0.25 | | Poster |

710 | Learning Graphon Mean Field Games and Approximate Nash Equilibria | 6.00 | 6.00 | 0.00 | | Poster |

711 | Benchmarking the Spectrum of Agent Capabilities | 5.75 | 6.00 | 0.25 | | Poster |

712 | Generalisation in Lifelong Reinforcement Learning through Logical Composition | 4.67 | 5.83 | 1.17 | 5, 3, 3, 6, 6, 5 | 5, 5, 5, 8, 6, 6 |
713 | Graph-based Nearest Neighbor Search in Hyperbolic Spaces | 7.00 | 5.80 | -1.20 | | Poster |

714 | Why Propagate Alone? Parallel Use of Labels and Features on Graphs | 5.40 | 5.80 | 0.40 | 5, 5, 3, 6, 8 | 5, 5, 5, 6, 8 |
715 | Symbolic Learning to Optimize: Towards Interpretability and Scalability | 4.80 | 5.80 | 1.00 | 6, 5, 3, 5, 5 | 6, 6, 5, 6, 6 |
716 | Regularized Autoencoders for Isometric Representation Learning | 5.80 | 5.80 | 0.00 | 6, 5, 5, 8, 5 | 6, 5, 5, 8, 5 |
717 | Data Efficient Language-Supervised Zero-Shot Recognition with Optimal Transport Distillation | 5.40 | 5.80 | 0.40 | 5, 5, 6, 5, 6 | 5, 6, 6, 6, 6 |
718 | Relational Learning with Variational Bayes | 5.60 | 5.80 | 0.20 | 5, 6, 5, 6, 6 | 5, 6, 6, 6, 6 |
719 | Amortized Implicit Differentiation for Stochastic Bilevel Optimization | 5.60 | 5.80 | 0.20 | 3, 6, 5, 8, 6 | 3, 6, 6, 8, 6 |
720 | A Generalized Weighted Optimization Method for Computational Learning and Inversion | 5.25 | 5.80 | 0.55 | | Poster |

721 | Towards Empirical Sandwich Bounds on the Rate-Distortion Function | 4.25 | 5.75 | 1.50 | | Poster |

722 | Network Augmentation for Tiny Deep Learning | 5.25 | 5.75 | 0.50 | | Poster |

723 | Query-Efficient Decision-Based Sparse Attacks Against Black-Box Machine Learning Models | 5.75 | 5.75 | 0.00 | | Poster |

724 | Graph Condensation for Graph Neural Networks | 5.25 | 5.75 | 0.50 | | Poster |

725 | A Tale of Two Flows: Cooperative Learning of Langevin Flow and Normalizing Flow Toward Energy-Based Model | 5.50 | 5.75 | 0.25 | | Poster |

726 | An Information Fusion Approach to Learning with Instance-Dependent Label Noise | 5.50 | 5.75 | 0.25 | | Poster |

727 | From Intervention to Domain Transportation: A Novel Perspective to Optimize Recommendation | 5.67 | 5.75 | 0.08 | | Poster |

728 | GradMax: Growing Neural Networks using Gradient Information | 5.00 | 5.75 | 0.75 | | Poster |

729 | Provable Adaptation across Multiway Domains via Representation Learning | 5.25 | 5.75 | 0.50 | | Poster |

730 | Learning Efficient Online 3D Bin Packing on Packing Configuration Trees | 5.25 | 5.75 | 0.50 | | Poster |

731 | Bandit Learning with Joint Effect of Incentivized Sampling, Delayed Sampling Feedback, and Self-Reinforcing User Preferences | 5.00 | 5.75 | 0.75 | | Poster |

732 | A Comparison of Variable Selection Methods for Blockwise Diagonal Designs | 5.50 | 5.75 | 0.25 | | Poster |

733 | A Zest of LIME: Towards Architecture-Independent Model Distances | 5.25 | 5.75 | 0.50 | | Poster |

734 | Learn Locally, Correct Globally: A Distributed Algorithm for Training Graph Neural Networks | 5.50 | 5.75 | 0.25 | | Poster |

735 | Task-Induced Representation Learning | 4.75 | 5.75 | 1.00 | | Poster |

736 | Constructing Orthogonal Convolutions in an Explicit Manner | 5.33 | 5.75 | 0.42 | | Poster |

737 | Structure-Aware Transformer Policy for Inhomogeneous Multi-Task Reinforcement Learning | 5.50 | 5.75 | 0.25 | | Poster |

738 | FP-DETR: Detection Transformer Advanced by Fully Pre-training | 5.50 | 5.75 | 0.25 | | Poster |

739 | Reward Uncertainty for Exploration in Preference-based Reinforcement Learning | 4.00 | 5.75 | 1.75 | | Poster |

740 | Robust Unlearnable Examples: Protecting Data Privacy Against Adversarial Learning | 5.00 | 5.75 | 0.75 | | Poster |

741 | Rethinking Supervised Pre-Training for Better Downstream Transferring | 5.00 | 5.75 | 0.75 | | Poster |

742 | Geometric Transformers for Protein Interface Contact Prediction | 5.00 | 5.75 | 0.75 | | Poster |

743 | Should We Be Pre-training? An Argument for End-task Aware Training as an Alternative | 6.25 | 5.75 | -0.50 | | Poster |

744 | Diverse Client Selection for Federated Learning via Submodular Maximization | 5.75 | 5.75 | 0.00 | | Poster |

745 | Neural Energy Minimization for Molecular Conformation Optimization | 4.25 | 5.75 | 1.50 | | Poster |

746 | Towards Continual Knowledge Learning of Language Models | 5.75 | 5.75 | 0.00 | | Poster |

747 | Learning Generalizable Representations for Reinforcement Learning via Adaptive Meta-learner of Behavioral Similarities | 5.00 | 5.75 | 0.75 | | Poster |

748 | KL Guided Domain Adaptation | 5.25 | 5.75 | 0.50 | | Poster |

749 | CodeTrek: Flexible Modeling of Code using an Extensible Relational Representation | 5.00 | 5.75 | 0.75 | | Poster |

750 | Generalized Demographic Parity for Group Fairness | 4.75 | 5.75 | 1.00 | | Poster |

751 | Evaluating Language-biased image classification based on semantic compositionality | 5.75 | 5.75 | 0.00 | | Poster |

752 | ConFeSS: A Framework for Single Source Cross-Domain Few-Shot Learning | 5.75 | 5.75 | 0.00 | | Poster |

753 | Permutation Compressors for Provably Faster Distributed Nonconvex Optimization | 5.50 | 5.75 | 0.25 | | Poster |

754 | Distributionally Robust Fair Principal Components via Geodesic Descents | 5.75 | 5.75 | 0.00 | | Poster |

755 | DKM: Differentiable k-Means Clustering Layer for Neural Network Compression | 5.25 | 5.75 | 0.50 | | Poster |

756 | Variational Neural Cellular Automata | 4.75 | 5.75 | 1.00 | | Poster |

757 | On Covariate Shift of Latent Confounders in Imitation and Reinforcement Learning | 5.75 | 5.75 | 0.00 | | Poster |

758 | Towards Model Agnostic Federated Learning Using Knowledge Distillation | 5.25 | 5.75 | 0.50 | | Poster |

759 | Towards Building A Group-based Unsupervised Representation Disentanglement Framework | 5.50 | 5.75 | 0.25 | | Poster |

760 | Demystifying Limited Adversarial Transferability in Automatic Speech Recognition Systems | 5.75 | 5.75 | 0.00 | | Poster |

761 | Learning a subspace of policies for online adaptation in Reinforcement Learning | 5.00 | 5.75 | 0.75 | | Poster |

762 | Focus on the Common Good: Group Distributional Robustness Follows | 5.75 | 5.75 | 0.00 | | Poster |

763 | Adaptive Filters for Low-Latency and Memory-Efficient Graph Neural Networks | 5.75 | 5.75 | 0.00 | | Poster |

764 | GLASS: GNN with Labeling Tricks for Subgraph Representation Learning | 5.25 | 5.75 | 0.50 | | Poster |

765 | Data Poisoning Won’t Save You From Facial Recognition | 5.50 | 5.75 | 0.25 | | Poster |

766 | FILIP: Fine-grained Interactive Language-Image Pre-Training | 5.50 | 5.75 | 0.25 | | Poster |

767 | Deep Ensembling with No Overhead for either Training or Testing: The All-Round Blessings of Dynamic Sparsity | 5.25 | 5.75 | 0.50 | | Poster |

768 | Understanding approximate and unrolled dictionary learning for pattern recovery | 4.75 | 5.75 | 1.00 | | Poster |

769 | Variational oracle guiding for reinforcement learning | 5.50 | 5.75 | 0.25 | | Poster |

770 | HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning | 5.50 | 5.75 | 0.25 | | Poster |

771 | Towards Distribution Shift of Node-Level Prediction on Graphs: An Invariance Perspective | 4.75 | 5.75 | 1.00 | | Poster |

772 | Ada-NETS: Face Clustering via Adaptive Neighbour Discovery in the Structure Space | 5.75 | 5.75 | 0.00 | | Poster |

773 | Optimization inspired Multi-Branch Equilibrium Models | 5.50 | 5.75 | 0.25 | | Poster |

774 | Constrained Physical-Statistics Models for Dynamical System Identification and Prediction | 5.50 | 5.75 | 0.25 | | Poster |

775 | Imitation Learning by Reinforcement Learning | 5.75 | 5.75 | 0.00 | | Poster |

776 | Exploring extreme parameter compression for pre-trained language models | 4.75 | 5.75 | 1.00 | | Poster |

777 | Almost Tight L0-norm Certified Robustness of Top-k Predictions against Adversarial Perturbations | 6.50 | 5.75 | -0.75 | | Poster |

778 | On the Importance of Difficulty Calibration in Membership Inference Attacks | 5.75 | 5.75 | 0.00 | | Poster |

779 | Audio Lottery: Speech Recognition Made Ultra-Lightweight, Noise-Robust, and Transferable | 4.67 | 5.75 | 1.08 | | Poster |

780 | Acceleration of Federated Learning with Alleviated Forgetting in Local Training | 5.25 | 5.75 | 0.50 | | Poster |

781 | Learning Synthetic Environments and Reward Networks for Reinforcement Learning | 5.25 | 5.75 | 0.50 | | Poster |

782 | Decentralized Learning for Overparameterized Problems: A Multi-Agent Kernel Approximation Approach | 4.67 | 5.67 | 1.00 | | Poster |

783 | Graph-Relational Domain Adaptation | 5.33 | 5.67 | 0.33 | | Poster |

784 | Imitation Learning from Observations under Transition Model Disparity | 5.00 | 5.67 | 0.67 | | Poster |

785 | Meta Learning Low Rank Covariance Factors for Energy Based Deterministic Uncertainty | 5.00 | 5.67 | 0.67 | | Poster |

786 | ZeroFL: Efficient On-Device Training for Federated Learning with Local Sparsity | 4.33 | 5.67 | 1.33 | | Poster |

787 | EXACT: Scalable Graph Neural Networks Training via Extreme Activation Compression | 5.67 | 5.67 | 0.00 | | Poster |

788 | Iterated Reasoning with Mutual Information in Cooperative and Byzantine Decentralized Teaming | 5.33 | 5.67 | 0.33 | | Poster |

789 | Task Affinity with Maximum Bipartite Matching in Few-Shot Learning | 5.33 | 5.67 | 0.33 | | Poster |

790 | Neural Spectral Marked Point Processes | 5.67 | 5.67 | 0.00 | | Poster |

791 | Exploiting Class Activation Value for Partial-Label Learning | 5.33 | 5.67 | 0.33 | | Poster |

792 | Demystifying Batch Normalization in ReLU Networks: Equivalent Convex Optimization Models and Implicit Regularization | 5.67 | 5.67 | 0.00 | | Poster |

793 | Closed-form Sample Probing for Learning Generative Models in Zero-shot Learning | 5.20 | 5.60 | 0.40 | 6, 5, 5, 5, 5 | 6, 6, 5, 6, 5 |
794 | Graph Neural Network Guided Local Search for the Traveling Salesperson Problem | 5.40 | 5.60 | 0.20 | 3, 8, 5, 3, 8 | 3, 8, 6, 3, 8 |
795 | Plant 'n' Seek: Can You Find the Winning Ticket? | 4.80 | 5.60 | 0.80 | 3, 6, 5, 5, 5 | 5, 6, 6, 5, 6 |
796 | Pretrained Language Model in Continual Learning: A Comparative Study | 5.50 | 5.50 | 0.00 | | Poster |

797 | Pre-training Molecular Graph Representation with 3D Geometry | 5.00 | 5.50 | 0.50 | | Poster |

798 | Measuring the Interpretability of Unsupervised Representations via Quantized Reversed Probing | 5.25 | 5.50 | 0.25 | | Poster |

799 | COPA: Certifying Robust Policies for Offline Reinforcement Learning against Poisoning Attacks | 5.00 | 5.50 | 0.50 | | Poster |

800 | Diurnal or Nocturnal? Federated Learning of Multi-branch Networks from Periodically Shifting Distributions | 5.00 | 5.50 | 0.50 | | Poster |

801 | PI3NN: Out-of-distribution-aware Prediction Intervals from Three Neural Networks | 5.00 | 5.50 | 0.50 | | Poster |

802 | Towards Evaluating the Robustness of Neural Networks Learned by Transduction | 5.25 | 5.50 | 0.25 | | Poster |

803 | Attacking deep networks with surrogate-based adversarial black-box methods is easy | 5.25 | 5.50 | 0.25 | | Poster |

804 | Crystal Diffusion Variational Autoencoder for Periodic Material Generation | 5.50 | 5.50 | 0.00 | | Poster |

805 | New Insights on Reducing Abrupt Representation Change in Online Continual Learning | 5.50 | 5.50 | 0.00 | | Poster |

806 | Object Pursuit: Building a Space of Objects via Discriminative Weight Generation | 5.25 | 5.50 | 0.25 | | Poster |

807 | Learning State Representations via Retracing in Reinforcement Learning | 5.00 | 5.50 | 0.50 | | Poster |

808 | Understanding and Leveraging Overparameterization in Recursive Value Estimation | 4.75 | 5.50 | 0.75 | | Poster |

809 | The Role of Pretrained Representations for the OOD Generalization of RL Agents | 4.50 | 5.50 | 1.00 | | Poster |

810 | Contrastive Learning is Just Meta-Learning | 5.50 | 5.50 | 0.00 | | Poster |

811 | Non-Linear Operator Approximations for Initial Value Problems | 5.00 | 5.50 | 0.50 | | Poster |

812 | Tuformer: Data-Driven Design of Expressive Transformer by Tucker Tensor Representation | 5.25 | 5.50 | 0.25 | | Poster |

813 | Bayesian Modeling and Uncertainty Quantification for Learning to Optimize: What, Why, and How | 4.75 | 5.50 | 0.75 | | Poster |

814 | Reducing the Communication Cost of Federated Learning through Multistage Optimization | 5.75 | 5.50 | -0.25 | | Poster |

815 | Scattering Networks on the Sphere for Scalable and Rotationally Equivariant Spherical CNNs | 5.00 | 5.50 | 0.50 | | Poster |

816 | Trans-Encoder: Unsupervised sentence-pair modelling through self- and mutual-distillations | 5.50 | 5.50 | 0.00 | | Poster |

817 | Causal Contextual Bandits with Targeted Interventions | 5.50 | 5.50 | 0.00 | | Poster |

818 | LFPT5: A Unified Framework for Lifelong Few-shot Language Learning Based on Prompt Tuning of T5 | 5.00 | 5.50 | 0.50 | | Poster |

819 | Stability Regularization for Discrete Representation Learning | 5.50 | 5.50 | 0.00 | | Poster |

820 | Divergence-aware Federated Self-Supervised Learning | 5.00 | 5.50 | 0.50 | | Poster |

821 | Learning to Guide and to be Guided in the Architect-Builder Problem | 5.50 | 5.50 | 0.00 | | Poster |

822 | Dynamic Token Normalization improves Vision Transformers | 5.25 | 5.50 | 0.25 | | Poster |

823 | Associated Learning: an Alternative to End-to-End Backpropagation that Works on CNN, RNN, and Transformer | 5.25 | 5.50 | 0.25 | | Poster |

824 | ADAVI: Automatic Dual Amortized Variational Inference Applied To Pyramidal Bayesian Models | 5.25 | 5.50 | 0.25 | | Poster |

825 | Bayesian Neural Network Priors Revisited | 5.50 | 5.50 | 0.00 | | Poster |

826 | Certified Robustness for Deep Equilibrium Models via Interval Bound Propagation | 6.00 | 5.50 | -0.50 | | Poster |

827 | Representation-Agnostic Shape Fields | 5.50 | 5.50 | 0.00 | | Poster |

828 | Revisit Kernel Pruning with Lottery Regulated Grouped Convolutions | 4.80 | 5.40 | 0.60 | 5, 6, 5, 3, 5 | 5, 6, 5, 5, 6 |
829 | Minimax Optimality (Probably) Doesn't Imply Distribution Learning for GANs | 5.20 | 5.40 | 0.20 | 6, 3, 5, 6, 6 | 6, 3, 6, 6, 6 |
830 | Discovering Nonlinear PDEs from Scarce Data with Physics-encoded Learning | 5.00 | 5.40 | 0.40 | 3, 3, 8, 5, 6 | 5, 5, 6, 5, 6 |
831 | Unraveling Model-Agnostic Meta-Learning via The Adaptation Learning Rate | 5.20 | 5.40 | 0.20 | 6, 5, 5, 5, 5 | 6, 6, 5, 5, 5 |
832 | Missingness Bias in Model Debugging | 5.33 | 5.33 | 0.00 | | Poster |

833 | Unsupervised Learning of Full-Waveform Inversion: Connecting CNN and Partial Differential Equation in a Loop | 5.33 | 5.33 | 0.00 | | Poster |

834 | Fooling Explanations in Text Classifiers | 5.33 | 5.33 | 0.00 | | Poster |

835 | ClimateGAN: Raising Climate Change Awareness by Generating Images of Floods | 4.67 | 5.33 | 0.67 | | Poster |

836 | Robust and Scalable SDE Learning: A Functional Perspective | 5.33 | 5.33 | 0.00 | | Poster |

837 | AS-MLP: An Axial Shifted MLP Architecture for Vision | 5.00 | 5.33 | 0.33 | | Poster |

838 | Zero-Shot Self-Supervised Learning for MRI Reconstruction | 5.33 | 5.33 | 0.00 | | Poster |

839 | Representing Mixtures of Word Embeddings with Mixtures of Topic Embeddings | 5.25 | 5.25 | 0.00 | | Poster |

840 | A fast and accurate splitting method for optimal transport: analysis and implementation | 5.25 | 5.25 | 0.00 | | Poster |

841 | Cross-Trajectory Representation Learning for Zero-Shot Generalization in RL | 5.00 | 5.25 | 0.25 | | Poster |

842 | Visual hyperacuity with moving sensor and recurrent neural computations | 4.75 | 5.25 | 0.50 | | Poster |

843 | Consistent Counterfactuals for Deep Models | 5.00 | 5.25 | 0.25 | | Poster |

844 | Neural Network Approximation based on Hausdorff distance of Zonotopes | 5.25 | 5.25 | 0.00 | | Poster |

845 | Practical Integration via Separable Bijective Networks | 5.00 | 5.25 | 0.25 | | Poster |

846 | VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning | 5.00 | 5.25 | 0.25 | | Poster |

847 | Maximizing Ensemble Diversity in Deep Reinforcement Learning | 5.00 | 5.25 | 0.25 | | Poster |

848 | Memory Replay with Data Compression for Continual Learning | 5.25 | 5.25 | 0.00 | | Poster |

849 | Model Agnostic Interpretability for Multiple Instance Learning | 3.50 | 5.25 | 1.75 | | Poster |

850 | Towards General Function Approximation in Zero-Sum Markov Games | 5.25 | 5.25 | 0.00 | | Poster |

851 | Visual Representation Learning over Latent Domains | 5.25 | 5.25 | 0.00 | | Poster |

852 | Sequential Reptile: Inter-Task Gradient Alignment for Multilingual Learning | 5.25 | 5.25 | 0.00 | | Poster |

853 | Overcoming The Spectral Bias of Neural Value Approximation | 4.00 | 5.00 | 1.00 | | Poster |

854 | FairCal: Fairness Calibration for Face Verification | 4.67 | 5.00 | 0.33 | | Poster |

855 | CROP: Certifying Robust Policies for Reinforcement Learning through Functional Smoothing | 4.25 | 5.00 | 0.75 | | Poster |

856 | Efficient Split-Mix Federated Learning for On-Demand and In-Situ Customization | 5.00 | 5.00 | 0.00 | | Poster |

857 | Information Gain Propagation: a New Way to Graph Active Learning with Soft Labels | 5.50 | 5.00 | -0.50 | | Poster |

858 | CoMPS: Continual Meta Policy Search | 4.80 | 5.00 | 0.20 | 3, 5, 8, 5, 3 | 3, 5, 6, 5, 6 |
859 | Learning Continuous Environment Fields via Implicit Functions | 5.00 | 5.00 | 0.00 | | Poster |

860 | Towards Understanding Generalization via Decomposing Excess Risk Dynamics | 5.00 | 5.00 | 0.00 | | Poster |

861 | ComPhy: Compositional Physical Reasoning of Objects and Events from Videos | 4.75 | 5.00 | 0.25 | | Poster |

862 | Transformer Embeddings of Irregularly Spaced Events and Their Participants | 4.25 | 4.75 | 0.50 | | Poster |

863 | Topologically Regularized Data Embeddings | 4.75 | 4.75 | 0.00 | | Poster |

864 | Neural Program Synthesis with Query | 4.00 | 4.67 | 0.67 | | Poster |

865 | Learning by Directional Gradient Descent | 4.00 | 4.50 | 0.50 | | Poster |