This paper addresses the low resolution of polyphonic music audio recognition. It proposes a method that recognizes polyphonic music from time-frequency spectrograms, extracting the main melody and the other elements of a polyphonic piece at a pitch resolution well beyond traditional semitone-level identification. First, a Long Short-Term Memory (LSTM) network generates the time-frequency spectrogram of the musical signal. An adaptive edge distortion processing technique then binarizes the spectrogram, reducing distortion at note edges. The binarized spectrogram is analyzed with a sophisticated Simulated Annealing (SA) algorithm that converts the transformation from the discrete to the continuous domain, yielding accurate note placement. Finally, a density-based clustering algorithm (DBSCAN) combined with fundamental frequency extraction recovers the musical information. The results show that the algorithm achieves high resolution in both the time and frequency dimensions for polyphonic music, with an average frequency-domain error below 6 Hz and a temporal error under 80 ms.
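The final clustering stage can be illustrated with a small sketch. It is not the authors' implementation: it is a plain-Python version of standard DBSCAN applied to a toy binarized spectrogram, and every name and value in it (`dbscan`, `summarize_notes`, the `points` list, `eps=1.5`, `min_pts=3`) is an assumption chosen for illustration. Each active cell is a `(frame, frequency_bin)` pair; dense groups of cells become notes, and isolated cells are discarded as noise.

```python
from collections import deque


def region_query(points, i, eps):
    """Indices of all points within distance eps of points[i], including i."""
    xi, yi = points[i]
    return [j for j, (xj, yj) in enumerate(points)
            if (xi - xj) ** 2 + (yi - yj) ** 2 <= eps ** 2]


def dbscan(points, eps, min_pts):
    """Standard DBSCAN. Returns one label per point: 0, 1, ... or -1 for noise."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(neighbors)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # core point: keep growing
                queue.extend(j_neighbors)
    return labels


def summarize_notes(points, labels):
    """Per cluster: (first frame, last frame, mean frequency bin)."""
    groups = {}
    for (frame, fbin), label in zip(points, labels):
        if label != -1:
            groups.setdefault(label, []).append((frame, fbin))
    notes = []
    for label in sorted(groups):
        frames = [p[0] for p in groups[label]]
        bins = [p[1] for p in groups[label]]
        notes.append((min(frames), max(frames), sum(bins) / len(bins)))
    return notes


# Toy binarized spectrogram (hypothetical data): two sustained notes
# and one isolated active cell that should be rejected as noise.
points = (
    [(frame, 3) for frame in range(0, 5)]     # note in bin 3, frames 0-4
    + [(frame, 9) for frame in range(6, 11)]  # note in bin 9, frames 6-10
    + [(13, 6)]                               # isolated speck
)
labels = dbscan(points, eps=1.5, min_pts=3)
notes = summarize_notes(points, labels)
print(notes)
```

Converting the frame and bin indices to seconds and hertz would require the hop size and frequency spacing of the spectrogram, which the abstract does not state, so the sketch leaves the results in index units.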