Hierarchical Regulated Iterative Network for Joint Task of Music Detection and Music Relative Loudness Estimation.

Image credit: Unsplash

Abstract

One practical requirement of the music copyright management is the estimation of music relative loudness, which is mostly ignored in existing music detection works. To solve this problem, we study the joint task of music detection and music relative loudness estimation. To be specific, we observe that the joint task has two characteristics, i.e., temporality and hierarchy, which could facilitate to obtain the solution. For example, a tiny fragment of audio is temporally related to its neighbor fragments because they may all belong to the same event, and the event classes of the fragment in the two tasks have a hierarchical relationship. Based on the above observation, we reformulate the joint task as hierarchical event detection and localization problem. To solve this problem, we further propose Hierarchical Regulated Iterative Networks (HRIN), which includes two variants, termed as HRIN-r and HRIN-cr, which are based on recurrent and convolutional recurrent modules. To enjoy the joint task’s characteristics, our models employ an iterative framework to achieve encouraging capability in temporal modeling while designing three hierarchical violation penalties to regulate hierarchy. Extensive experiments on the currently largest dataset (i.e., OpenBMAT) show that the promising performance of our HRIN in the segment-level and event-level evaluations.

Publication
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Shenglan Yang
Shenglan Yang
M.Sc Student

My research interests include Deep learning

Jiancheng Lv
Jiancheng Lv
Dean and professor of Computer Science of Sichuan University

My research interests include natural language processing, computer vision, industrial intelligence, smart medicine and smart cultural creation.

Related