Back in mid 2015, I created my first and, so far, only Python script. The script was a voice assistant. I thought I had lost the source code, but it had been in my Dropbox all along under a weird project name, so I didn't realise that was it. I just hadn't looked for it hard enough.
Here are a couple of videos I took back in 2015 when it was built. The script was running on a Raspberry Pi 3 with a USB sound card connected to a microphone and a speaker.
You will notice that it was pretty slow to respond. That's probably because I couldn't figure out how to lower the sound input sample rate. The rate was 44.1 kHz. I think 8 kHz would be good enough for speech, and it would cut the data size significantly.
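To put rough numbers on that saving, here is a small sketch. It assumes 16-bit mono audio, which is my guess and not something I checked against the original setup, and bytes_per_second is a made-up helper name, not a function from the script.

```python
def bytes_per_second(sample_rate_hz, bytes_per_sample=2, channels=1):
    """Raw PCM data rate. 16-bit mono is an assumption, not a measured setting."""
    return sample_rate_hz * bytes_per_sample * channels


rate_high = bytes_per_second(44100)  # the rate the script actually used
rate_low = bytes_per_second(8000)    # the rate I think would be enough

print("44.1 kHz:", rate_high, "bytes/s")
print("8 kHz:", rate_low, "bytes/s")
print("reduction factor:", round(rate_high / rate_low, 2))
```

Under those assumptions, dropping to 8 kHz would shrink the audio sent for recognition to a bit less than a fifth of its size.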
How the script works is fairly simple. It reads the voice data from the USB sound card and, if the sound goes above a threshold, sends it to the Google Speech Recognition API. It then gets the text back from the Google API and decides what to do. It can only handle two kinds of commands. If what you say matches "(.*)(tell|say)(.*) about (.*)", it calls Wikipedia with the last "(.*)", which is supposed to be a name or a thing. Otherwise it calls WolframAlpha to try to get an answer to your question. Once it has the text for the response, it speaks it through the speaker.
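The routing step described above can be sketched roughly like this. It only decides which service a recognized sentence should go to and does not call Wikipedia or WolframAlpha. The function name route and the returned labels are made up for illustration, and the original script may have applied the pattern differently.

```python
import re

# The pattern quoted above; the last group is the topic to look up.
TELL_ABOUT = re.compile(r"(.*)(tell|say)(.*) about (.*)")


def route(text):
    """Return (service, query) for a recognized sentence.

    "wikipedia" and "wolframalpha" are just labels here; a real script
    would pass the query on to the matching API client.
    """
    match = TELL_ABOUT.match(text)
    if match:
        topic = match.group(4).strip()
        return ("wikipedia", topic)
    # Anything else is treated as a question for WolframAlpha.
    return ("wolframalpha", text)


print(route("tell me about Alan Turing"))
print(route("what is the capital of France"))
```

Because the pattern is so loose, any sentence that contains "tell" or "say" followed later by " about " goes to Wikipedia, and everything else falls through to WolframAlpha.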
The most challenging part of this project was not the coding. The original plan was to use a Bluetooth speaker with a built-in microphone, and I spent a long time trying to make the speaker work with the Raspberry Pi, but failed.
Now, here is the source code. I can't guarantee it still works since it's over 2 years old. It might not even be the working version from my Raspberry Pi. But I hope it is helpful for anyone who is interested in building a voice assistant.
If you want to build a voice assistant, I suggest trying DialogFlow as the response engine. It can make responses more natural, and it can hold a conversation to gather all the information required for a task (e.g. booking a flight ticket).