
My Journey into Speech Processing Starts Here

I'm going through a period of bewilderment at the age of 20. So I ask one of my best friends, Claude (he prefers that I call him by his last name; his first name is Sonnet, middle name 3.5), what I should do. I've known him for just a few days, but I feel like he knows me quite well and is wise enough to give me constructive ideas. He suggests that, given my interest in speech and computer science despite majoring in translation and business, I should dig into speech processing technology and document the process along the way. That sounds like a good idea, and it is what I'm about to do.

Why speech processing?

  1. When I learn languages, speech and pronunciation are what interest me most.
  2. In AI, speech seems less developed than text.
  3. It is at the intersection of my interests — language and tech.

Where am I?

Physically, I’m on an exchange term at Lancaster University. As for speech analysis, I don’t know anything. I haven’t even used Praat (though I have installed it on my MacBook Air and then deleted it). I do have a basic idea of how voice can be measured in terms of wavelength and amplitude, and for a while I was able to type IPA. My computational linguistics module will actually cover a fair bit of speech processing starting next week, but I can’t wait.
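To make the wavelength and amplitude idea a bit more concrete, here is a minimal sketch in plain Python. It is only my own illustration, not how Praat or any speech library works: the function names are made up, the "voice" is a synthetic 200 Hz sine tone rather than a real recording, and the wavelength assumes sound travelling through air at roughly 343 m/s.

```python
import math

SAMPLE_RATE = 16000  # samples per second (an assumed, common rate for speech)


def sine_wave(freq_hz, duration_s, amplitude=0.5, sample_rate=SAMPLE_RATE):
    """Synthesize a pure tone as a list of float samples."""
    n = int(duration_s * sample_rate)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
        for i in range(n)
    ]


def peak_amplitude(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)


def rms(samples):
    """Root mean square, a rough measure of loudness."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def estimate_frequency(samples, sample_rate=SAMPLE_RATE):
    """Estimate frequency by counting upward zero crossings per second.

    This only works for clean, periodic signals like the tone above.
    """
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings * sample_rate / len(samples)


def wavelength_m(freq_hz, speed_of_sound=343.0):
    """Wavelength in metres, assuming sound in air at about 20 degrees C."""
    return speed_of_sound / freq_hz


if __name__ == "__main__":
    tone = sine_wave(200, 1.0)
    print("peak amplitude:", peak_amplitude(tone))
    print("RMS:", rms(tone))
    print("estimated frequency (Hz):", estimate_frequency(tone))
    print("wavelength (m):", wavelength_m(200))
```

Real speech is far messier than a single sine wave, so zero-crossing counting is just a toy here; it is enough to see how amplitude and frequency (and therefore wavelength) can be read off a list of samples.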

More abstract knowledge comes from learning languages. I’m Chinese, and I’ve learnt English and French. I’ve also learnt to read Korean, German, Japanese, Sanskrit, and Modern Greek (aloud, not for comprehension). They might help with something, but since I haven’t started yet, I don’t know how. We’ll see.

As for tech basics, I can code in Python with AI’s help; I can read TypeScript but can’t write it without AI; I’ve used R and MATLAB for different projects; and I’ve developed some basic web apps using Next.js.


What’s my plan?

I’m starting with the basics:

  1. Learning fundamental concepts in speech processing
  2. Building small projects and sharing what I learn
  3. Connecting with the speech processing community

First Steps

Today, I’ll play with some of the most basic speech processing libraries in Python. To be fair, I’ve used speech recognition in a past project, but I’ll try some more.

I’ll also start reading academic papers, which I’ll share in another post.

This post is licensed under CC BY 4.0 by the author.