Audio Recognition

Sound Classification

  • urban sound classification
    • Environmental sound classification with convolutional neural networks, 2015 [paper]
    • Deep convolutional neural networks and data augmentation for environmental sound classification, 2017 [paper]
    • Unsupervised feature learning for urban sound classification, 2015 [paper]
  • speaker’s age and gender classification
    • Deep neural network framework and transformed MFCCs for speaker’s age and gender classification, 2017 [paper]
    • Speaker age classification and regression using i-vectors, 2016 [paper]
    • A new pitch-range based feature set for a speaker’s age and gender classification, 2015 [paper]
    • A new approach with score-level fusion for the classification of a speaker age and gender, 2016 [paper]
    • Automatic speaker, age-group and gender identification from children’s speech, 2018 [paper]
    • Speaker age estimation on conversational telephone speech using senone posterior based i-vectors, 2016 [paper]
    • Estimating Age and Gender for Speaker through, 2016 [paper]
  • sound source direction classification

  • data augmentation
    • Exploring data augmentation for improved singing voice detection with neural networks, 2015 [paper]
      • Singing voice detection with deep recurrent neural networks, 2015 [paper]

Voice Activity Detection (VAD) - determining whether a voice is present in an audio recording

  • Feature learning with raw-waveform CLDNNs for Voice Activity Detection, 2016 [paper]
  • Boosting contextual information for deep neural network based voice activity detection, 2016 [paper]
  • Voice Activity Detection: Merging Source and Filter-based Information, 2016 [paper]
  • Features for voice activity detection: a comparative analysis, 2015 [paper]
  • Formant-based robust voice activity detection, 2015 [paper]
  • A robust voice activity detection for real-time automatic speech recognition, 2018 [paper]
  • Ensemble of deep neural networks using acoustic environment classification for statistical model-based voice activity detection, 2016 [paper]
  • Audio-Visual Voice Activity Detection Using Diffusion Maps, 2015 [paper]
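
To make the task concrete, here is a minimal sketch of the simplest possible VAD baseline: flag a frame as "voice" when its mean energy exceeds a fixed threshold. This is not the method of any paper listed above (those use DNNs, CLDNNs, formants, and so on); the function names, the 160-sample frame length, and the 0.01 threshold are all assumptions chosen for illustration.

```python
import math

def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def energy_vad(samples, frame_len=160, threshold=0.01):
    """Return one boolean per frame: True when the frame energy exceeds the threshold.

    frame_len and threshold are illustrative assumptions, not tuned values.
    """
    return [e > threshold for e in frame_energies(samples, frame_len)]

# Synthetic input: 3 frames of silence, then 3 frames of a 200 Hz tone at 8 kHz.
sample_rate = 8000
silence = [0.0] * 480
tone = [0.5 * math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(480)]
flags = energy_vad(silence + tone)
```

A fixed energy threshold breaks down quickly in noisy recordings, which is part of why the papers above turn to learned features and contextual information.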

Sound Source Direction Detection - detecting the direction of a sound source in audio

  • Detection Sound Source Direction in 3D Space Using Convolutional Neural Networks, 2018 [paper]
  • Design of UAV-embedded microphone array system for sound source localization in outdoor environments, 2017 [paper]
    • UAV-embedded (attached to a drone…)
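
As a rough illustration of what direction detection means, the sketch below estimates the arrival angle of a sound from the time difference of arrival (TDOA) between two microphones, using plain cross-correlation. This is a classic textbook approach, not the CNN or UAV microphone-array methods of the papers above. The function names, the two-microphone setup, the 0.1 m spacing, and the 16 kHz sample rate are assumptions for illustration.

```python
import math

def best_lag(a, b, max_lag):
    """Lag (in samples) of signal b relative to a that maximizes cross-correlation."""
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n in range(len(a)):
            m = n + lag
            if 0 <= m < len(b):
                score += a[n] * b[m]
        if score > best_score:
            best, best_score = lag, score
    return best

def arrival_angle_deg(lag, sample_rate, mic_distance, speed_of_sound=343.0):
    """Far-field arrival angle relative to broadside, from a lag in samples."""
    delay = lag / sample_rate
    # Clamp to [-1, 1] so measurement noise cannot push asin out of its domain.
    ratio = max(-1.0, min(1.0, speed_of_sound * delay / mic_distance))
    return math.degrees(math.asin(ratio))

# Synthetic input: the same click reaches microphone B three samples after microphone A.
mic_a = [0.0] * 32
mic_b = [0.0] * 32
mic_a[10] = 1.0
mic_b[13] = 1.0

lag = best_lag(mic_a, mic_b, max_lag=8)
angle = arrival_angle_deg(lag, sample_rate=16000, mic_distance=0.1)
```

Two microphones only resolve an angle in one plane and cannot tell front from back; localization in 3D space, as in the first paper, needs more microphones or a learned model.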

?

  • A low-latency, real-time-capable singing voice detection method with LSTM recurrent neural networks, 2015 [paper]