Basics

Series by Ketan Doshi

  1. State-of-the-Art Techniques (What sound is and how it is digitized. What problems audio deep learning is solving in our daily lives. What Spectrograms are and why they are all-important.)

  2. Why Mel Spectrograms perform better (Processing audio data in Python. What Mel Spectrograms are and how to generate them.)

  3. Data Preparation and Augmentation (Enhance Spectrogram features for optimal performance through hyperparameter tuning and data augmentation.)

  4. Sound Classification (End-to-end example and architecture to classify ordinary sounds. Foundational application for a range of scenarios.)

  5. Automatic Speech Recognition (Speech-to-Text algorithm and architecture, using CTC Loss and Decoding for aligning sequences.)

  6. Beam Search (Algorithm commonly used by Speech-to-Text and NLP applications to enhance predictions.)

