Physics-informed machine learning for audio processing

Many machine learning techniques train a model on a large amount of data to accomplish a specific task. In audio processing, however, collecting that much training data is often difficult. Acoustic phenomena, on the other hand, obey physical laws, and these laws provide useful prior information for machine learning models. The governing equations of sound propagation, such as the wave equation, are the most prominent example. Depending on the target or task, various other constraints based on physical properties can also be considered. We aim to build new audio processing technology by efficiently incorporating such constraints into machine learning models. The resulting technology is expected to require less data than conventional machine learning and to be more flexible than methods based solely on physical models.
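
As a rough illustration of how a governing equation can act as a prior, the sketch below interpolates a single-frequency sound field with kernel ridge regression. The kernel sin(kr)/(kr) satisfies the Helmholtz equation (the wave equation at one frequency), so every estimate the model can produce is a physically valid sound field. This is a generic sketch and not the specific method of the references below. The microphone layout, the 200 Hz plane-wave test field, the regularization value, and all function names are assumptions chosen for illustration.

```python
import numpy as np

C = 343.0  # assumed speed of sound in air [m/s]


def helmholtz_kernel(x, y, k):
    """Kernel j0(k * ||x - y||) = sin(kr) / (kr) between two point sets.

    x: (M, 3) array, y: (N, 3) array, k: wavenumber [rad/m].
    As a function of x, this kernel satisfies the 3D Helmholtz equation.
    """
    r = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
    # np.sinc(t) is sin(pi t) / (pi t), so rescale the argument by pi.
    return np.sinc(k * r / np.pi)


def plane_wave(pos, k, direction):
    """Complex pressure of a unit-amplitude plane wave at positions pos."""
    return np.exp(-1j * k * (pos @ direction))


def fit(mic_pos, pressures, k, reg=1e-6):
    """Solve the regularized kernel system for the interpolation weights."""
    gram = helmholtz_kernel(mic_pos, mic_pos, k)
    return np.linalg.solve(gram + reg * np.eye(len(mic_pos)), pressures)


def estimate(eval_pos, mic_pos, weights, k):
    """Evaluate the interpolated sound field at eval_pos."""
    return helmholtz_kernel(eval_pos, mic_pos, k) @ weights


# Hypothetical setup: 30 microphones scattered in a 0.5 m cube.
rng = np.random.default_rng(0)
k = 2.0 * np.pi * 200.0 / C  # wavenumber at 200 Hz
mic_pos = rng.uniform(-0.25, 0.25, size=(30, 3))
direction = np.array([1.0, 2.0, 2.0]) / 3.0  # unit propagation vector

# "Measure" the field at the microphones, then predict it elsewhere.
weights = fit(mic_pos, plane_wave(mic_pos, k, direction), k)
eval_pos = np.array([[0.05, -0.10, 0.08]])
p_est = estimate(eval_pos, mic_pos, weights, k)
p_true = plane_wave(eval_pos, k, direction)

rel_err = float(np.abs(p_est - p_true)[0] / np.abs(p_true)[0])
print(f"relative error at unseen point: {rel_err:.2e}")
```

A generic kernel such as a Gaussian could fit the same 30 measurements, but its estimates would not be constrained to satisfy the Helmholtz equation. Building the constraint into the kernel is one way a physical prior can stand in for additional training data.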

References
  • S. Koyama, J. G. C. Ribeiro, T. Nakamura, N. Ueno, and M. Pezzoli, “Physics-Informed Machine Learning For Sound Field Estimation: Fundamentals, state of the art, and challenges,” IEEE Signal Process. Mag., vol. 41, no. 6, pp. 60-71, 2025.
  • J. G. C. Ribeiro, S. Koyama, R. Horiuchi, and H. Saruwatari, “Sound Field Estimation Based on Physics-Constrained Kernel Interpolation Adapted to Environment,” IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 32, pp. 4369-4383, 2024.