Author: Jao, Yen-Chang (饒彥章)
Title: Improving Linear Scaling for Query-by-Singing/Humming (改進線性伸縮以用於哼唱選歌)
Advisors: Jang, Jyh-Shing (張智星); Chang, Jason S. (張俊盛)
Oral defense committee: Renyuan Lyu (呂仁園); Jia-Lien Hsu (徐嘉連)
Degree: Master's
Institution: National Tsing Hua University
Department: Department of Computer Science
Student ID: 101062630
Year of publication: ROC 103 (2014)
Academic year of graduation: ROC 102
Language: Chinese
Number of pages: 45
Keywords: music retrieval; query-by-singing/humming; linear scaling; golden section search; sorted error vector
This thesis proposes an integrated framework that improves both the efficiency and the effectiveness of a query-by-singing/humming (QBSH) system. The framework combines three methods. Method 1 uses golden section search to reduce the computation time of the traditional linear scaling (LS) algorithm. Method 2 assigns different weights to rests in the pitch vectors (of both the database songs and the user's queries), so that rests have less effect on the weighted distance. Method 3 uses a sorted error vector (SEV) to ignore the portion of the distance values that is overly large and takes the remaining distance values as the matching distance. This reduces the effect of short-lived pitch deviations caused by the singer being out of tune or by pitch-tracking errors.
The proposed framework improves on the baseline LS-based QBSH system in both computation time (via Method 1) and recognition accuracy (via Methods 2 and 3). In our experiments on the MIR-QBSH dataset and test corpus, it achieves an error reduction rate of 21.4% and a 49.3% decrease in matching time.
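The three methods summarized in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the record gives no algorithmic details, so every name and default here (`ls_distance`, `rest_weight`, `drop_ratio`, rests encoded as `None`, rests filled with the previous note, a median-based key shift, matching anchored at the start of the song) is an assumption made for illustration only.

```python
import math
import statistics

INV_PHI = (math.sqrt(5) - 1) / 2  # 1/phi, about 0.618


def golden_section_min(f, lo, hi, tol=0.01):
    """Minimise a unimodal f on [lo, hi], one new evaluation per step."""
    c = hi - INV_PHI * (hi - lo)
    d = lo + INV_PHI * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:  # minimum lies in [lo, d]; old c becomes the new d
            hi, d, fd = d, c, fc
            c = hi - INV_PHI * (hi - lo)
            fc = f(c)
        else:  # minimum lies in [c, hi]; old d becomes the new c
            lo, c, fc = c, d, fd
            d = lo + INV_PHI * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2


def fill_rests(pitch):
    """Replace rests (None) by the previous note; return (filled, is_rest)."""
    filled, is_rest, last = [], [], None
    for p in pitch:
        is_rest.append(p is None)
        if p is not None:
            last = p
        filled.append(last)
    # Leading rests have no previous note: back-fill with the first note.
    first = next(p for p in filled if p is not None)
    return [first if p is None else p for p in filled], is_rest


def stretch(seq, n):
    """Nearest-neighbour resampling of seq to n frames."""
    return [seq[min(len(seq) - 1, i * len(seq) // n)] for i in range(n)]


def ls_distance(query, song, scale, rest_weight=0.5, drop_ratio=0.1):
    """Distance between the query stretched by `scale` and the song's start.

    rest_weight (hypothetical): weight of frames that were rests (Method 2).
    drop_ratio (hypothetical): share of largest errors ignored (Method 3).
    """
    q, q_rest = fill_rests(query)
    s, s_rest = fill_rests(song)
    n = max(2, min(len(s), round(scale * len(q))))
    q, q_rest = stretch(q, n), stretch(q_rest, n)
    s, s_rest = s[:n], s_rest[:n]
    shift = statistics.median(s) - statistics.median(q)  # key transposition
    errors = sorted(
        (rest_weight if (qr or sr) else 1.0) * abs(qi + shift - si)
        for qi, si, qr, sr in zip(q, s, q_rest, s_rest)
    )
    keep = max(1, n - int(drop_ratio * n))  # trimmed sorted error vector
    return sum(errors[:keep]) / keep
```

Under these assumptions, Method 1 corresponds to a call such as `golden_section_min(lambda s: ls_distance(query, song, s), 0.5, 2.0)`, where the scaling range 0.5 to 2.0 is again only an illustrative choice. Golden section search assumes the distance is unimodal in the scaling factor, which is not guaranteed for real pitch vectors, so the value it returns may be a local rather than a global minimum.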
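Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Topic
1.2 Overview of Related Work
1.3 Research Direction and Contributions of This Thesis
1.4 Chapter Outline
Chapter 2 Background Theory and Knowledge
2.1 Linear Scaling
2.2 Golden Ratio
2.3 Golden Section Search
Chapter 3 Methods
3.1 Speeding Up Linear Scaling with Golden Section Search
3.1.1 GSS over LS
3.1.2 Problems with GSS over LS
3.1.3 GSS Hybrid over LS
3.2 Weighted Distance Computation
3.2.2 Changing the Weight of Rests
3.3 Sorted Error Vector
3.4 Integration of the Methods
Chapter 4 Experimental Results and Analysis
4.1 Experimental Setup
4.2 Test Corpus and Database
4.3 Recognition Rate and Recognition Time of LsGssHybrid with Different Step Sizes
4.4 Speedup of LsGss and LsGssHybrid
4.5 Recognition Rate with Different Rest Weights
4.6 Recognition Rate with Different SEV Bounds
4.7 Recognition Rate and Speedup of the Combined Method
Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References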
[1] SoundHound, http://www.soundhound.com
[2] Shazam, http://www.shazam.com/
[3] Rodger J. McNab, Lloyd A. Smith, Ian H. Witten, Clare L. Henderson, Sally Jo Cunningham, “Towards the Digital Music Library: Tune Retrieval from Acoustic Input,” in Proc. the 1st ACM international conference, pp. 11–18, 1996.
[4] J.-S. Roger Jang and Ming-Yang Gao, “A Query-by-Singing System based on Dynamic Programming,” International Workshop on Intelligent Systems Resolutions (the 8th Bellman Continuum), pp. 85-89, 2000.
[5] J.-S. Roger Jang, Hong-Ru Lee, Ming-Yang Kao, “Content-based Music Retrieval Using Linear Scaling and Branch-and-bound Tree Search,” IEEE International Conference on Multimedia and Expo, pp. 289-292, 2001.
[6] Norman H. Adams, Mark A. Bartsch, Gregory H. Wakefield, “Note Segmentation and Quantization for Music Information Retrieval,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, pp. 131-141, 2006.
[7] M. Ryynänen and A. Klapuri, “Query by Humming of MIDI and Audio Using Locality Sensitive Hashing,” in Proc. 2008 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'08), pp , 2008-2012
[8] L. Wang, S. Huang, S. Hu, J. Liang, B. Xu, “Improving Searching Speed and Accuracy of Query by Humming System Based on Three Methods: Feature Fusion, Candidates Set Reduction and Multiple Similarity Measurement Rescoring,” 9th Annual Conference of the International Speech Communication Association (INTERSPEECH 2008), pp. 2024-2027, 2008.
[9] “Golden section search,” from Wikipedia, http://en.wikipedia.org/wiki/Golden_section_search
[10] J. Kiefer, “Sequential minimax search for a maximum,” Proceedings of the American Mathematical Society, 4(3), pp. 502–506, 1953.
[11] X. Wu, M. Li, J. Liu, J. Yang, Y. Yan, “A top-down approach to melody match in pitch contour for query by humming,” in Proc. International Conference of Chinese Spoken Language Processing, 2006.
[12] D. Ke, B. Xu, “Chinese intonation assessment using SEV features,” in Proc. International Conference on Acoustics, Speech and Signal Processing (ICASSP '09), pp. 4853-4856, 2009.
[13] L. Wang, “MIREX 2012 QBSH Task: YINLONG's Solution,” Extended Abstract in 8th Music Information Retrieval Evaluation eXchange (MIREX '12).
[14] W. H. Press, S. A. Teukolsky, W. T. Vetterling, B. P. Flannery, “Numerical Recipes: The Art of Scientific Computing (3rd ed.),” “Section 10.2. Golden Section Search in One Dimension,” ISBN 978-0-521-88068-8, 2007.
[15] C.-H. Chen, “Speedup Mechanism for Comparison of Query by Singing/Humming over GPUs,” National Tsing Hua University, 2012.