  • Our paper has been accepted for publication in Transactions on Machine Learning Research.

    ■Bibliographic Information
    Bum Jun Kim, Yoshinobu Kawahara, Sang Woo Kim. “Disappearance of Timestep Embedding: A Case Study on Neural ODE and Diffusion Models.” Transactions on Machine Learning Research (TMLR). https://openreview.net/forum?id=bpaLYaf6Dp

    ■Outline
    Dynamical systems are often time-varying, and modeling them requires a function that evolves with respect to time. However, we claim that the architectural choice used to build a time-dependent neural network significantly affects its time-awareness, yet this choice still lacks sufficient validation. In this study, we conduct an in-depth analysis of the architecture of neural ordinary differential equations. Here, we report a vulnerability of vanishing timestep embedding, which disables the time-awareness of a time-dependent neural network. Specifically, we find that the ConcatConv operation, which is widely used in neural ordinary differential equations, causes an additive effect of timestep embedding, which can subsequently be canceled out. This vanishing timestep embedding also arises for group normalization and is analyzed in detail. Furthermore, we find that this vulnerability can also be observed in diffusion models because they employ a similar architecture that incorporates timestep embedding to discriminate between different timesteps during a diffusion process. Our analysis provides a detailed description of this phenomenon as well as several solutions to address the root cause. Through experiments on neural ordinary differential equations and diffusion models, we observed that ensuring alive time-awareness via the proposed solutions boosted their performance, such as classification accuracy, FID, and inception score, which implies that their current implementations lack sufficient time-dependency.
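
    The additive mechanism described above can be sketched in a few lines of numpy. This is an illustrative toy, not the paper's implementation: a 1×1 ConcatConv appends a spatially constant timestep channel, so its contribution collapses into a per-output-channel constant; a normalization layer that computes per-channel statistics over spatial positions (here, group normalization with one channel per group, an assumption made for simplicity) then absorbs that constant into the mean, erasing the timestep information.

    ```python
    import numpy as np

    def concat_conv_1x1(x, t, W, w_t):
        """Toy 1x1 ConcatConv: concatenate a constant timestep channel to x,
        then convolve. Because the t-channel is spatially constant, its
        contribution reduces to an additive per-output-channel constant w_t * t."""
        # x: (C_in, H, W_sp), W: (C_out, C_in), w_t: (C_out,)
        y = np.einsum('oc,chw->ohw', W, x)
        return y + (w_t * t)[:, None, None]

    def channelwise_norm(y, eps=1e-5):
        """Normalization with per-channel statistics over spatial dims
        (group norm with one channel per group). A constant offset added to a
        whole channel shifts only the mean, so it cancels out exactly."""
        mu = y.mean(axis=(1, 2), keepdims=True)
        sigma = y.std(axis=(1, 2), keepdims=True)
        return (y - mu) / (sigma + eps)

    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 8, 8))     # input feature map
    W = rng.normal(size=(6, 4))        # 1x1 conv weights for the x channels
    w_t = rng.normal(size=(6,))        # 1x1 conv weights for the timestep channel

    out_t0 = channelwise_norm(concat_conv_1x1(x, t=0.0, W=W, w_t=w_t))
    out_t9 = channelwise_norm(concat_conv_1x1(x, t=9.0, W=W, w_t=w_t))
    print(np.allclose(out_t0, out_t9))  # True: the timestep embedding has vanished
    ```

    The two outputs are identical despite very different timesteps, which is the "disappearance" the paper analyzes: the downstream network receives no usable time signal through this path.
    
    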