
PRO
afcorson
Australia
Asked
How can I record audio from the microphone for an extended period, such as up to 10 minutes? I want to capture a person speaking, save the audio as an MP3 file on my hard drive, and then send it to an AI platform for a response. I have attempted to use the Microphone skill in Synthiam ARC, but it failed to record anything and does not save the audio to my hard drive.
Could you provide guidance on how to achieve this functionality?
Related Hardware EZ-B v4
Related Control
Microphone
To record audio for an extended period using the Microphone skill in Synthiam ARC and save it as an MP3 file, you'll need to follow a few steps. Unfortunately, the Microphone skill in ARC doesn't directly support saving audio to your hard drive as an MP3 file. However, you can use a workaround by recording the audio and then using additional software to convert and save it. Here's a step-by-step guide:
Step 1: Configure the Microphone Skill
Add the Microphone Skill: Ensure the Microphone skill is added to your ARC project. You can do this by navigating to
Project -> Add Skill -> Audio -> Microphone
.Configure the Microphone:
Sample Rate
andSample Size
according to your needs. A higher sample rate will give you better quality but will also result in larger file sizes.Auto Record
if you want the recording to start automatically when sound is detected.Test the Microphone: Make sure your microphone is working correctly by checking the audio waveform in the Microphone skill's main window.
Step 2: Record the Audio
Start Recording: Click the
Record
button in the Microphone skill to start recording. You can record for as long as needed, but keep in mind that the buffer might have limitations.Stop Recording: Once you have recorded the desired audio, click the
Stop
button.Step 3: Export and Convert the Audio
Export the Recording: Use the
Export to Soundboard
button to save the recording to the Sound Board (EZB) skill. This will allow you to play it back, but it won't save it as an MP3 file.Save the Audio: To save the audio as an MP3 file, you will need to use external software. You can use a program like Audacity to record directly from your computer's audio output or convert the saved file from the Sound Board.
Step 4: Send to AI Platform
Convert to MP3: Use Audacity or another audio editing software to convert the recorded audio to an MP3 file.
Send to AI Platform: Once you have the MP3 file, you can use an API or platform-specific method to send the audio file to your AI platform for processing.
Additional Tips
If you encounter any issues or need further assistance, feel free to ask!
I am a customer support GPT-powered AI and still learning. If you have doubts about my response, please use the support documentation to verify my answer.
That's a tricky question for @athena or me to answer because there is insufficient information. Let me explain...
An "AI platform" would use a specific communication protocol for its API. That could be an HTTP post, a WebSocket, a streaming HTTP post, etc. There are several ways.
an "AI Platform" API would also have a specific document template or application protocol specification. This is their "form" for providing data to their server.
The response from the "AI Platform" would be in the communication protocol outlined in #1 with a specific document template or application protocol specification outlined in #2
So, without knowing anything, the equivalent would be standing yelling random sounds at someone, expecting them to do something. I can imagine making a pretty good Monty Python skit.
The first thing you can do is research the "AI Platform's" API. After that, you'll know how to send data to it. Once you know how to send data, you'll see the response and how to parse it. At that point, you can determine how to record and include the audio in the transmission.
Sending the audio in the transmission could become a complicated script or be straightforward-but right now, it's unknown.
The more detail you can share, the closer you'll accomplish the goal.
Thank you for the additional context, DJ_Sures! You're absolutely right. To effectively send audio to an AI platform, we need to understand the specific requirements and protocols of that platform. Here's a more detailed approach to tackle this:
Step 1: Understand the AI Platform's API
Research the API Documentation:
Identify Audio Requirements:
Step 2: Record and Prepare Audio
Record Audio:
Convert Audio:
Step 3: Send Audio to the AI Platform
Develop a Script:
Net.hTTPPost()
function in ARC to send the audio file if the platform uses HTTP POST. You may need to include additional headers or authentication tokens as specified in the API documentation.Handle the Response:
Step 4: Test and Iterate
If you can provide more details about the AI platform you're using, I can offer more specific guidance on how to integrate it with ARC. Feel free to ask if you have any questions or need further assistance!
I am a customer support GPT-powered AI and still learning. If you have doubts about my response, please use the support documentation to verify my answer.
Ok. I have done some more thinking on this. I can use Windows Sound Recorder easy enough to capture a person speaking for an extended period. I then use something like Descript.com to convert the audio to text. I then send the text to https://deepai.org/chat/debate to get an insightful response. Then I want to post this text response somewhere on the internet so my robot can read and speak it. This is where I am stuck, as HTTPget("https://rur.free.nf/dialogue_script.txt") does not retrieve the file content. Bear in mind that only the file read/speaking is performed on the robot's PC. Everything else prior to that is performed on a separate laptop.
It looks like dj has some insight which you’ve expanded details on the use case. I’m sure when dj looks at this, he’ll have a proposed solution/idea for you.
because the microphone robot skill does not save the audio to the drive as a wave or mp3, that can’t be used as Athena suggested.
a third party audio recording software would be ideal. One way would be to use a command line audio recorder. That can be launched from JavaScript in arc. Ideally one that allows a parameter with a time recording length, so it stops after a specified time.
then, that file could be read as binary into JavaScript in ARC and sent to the server using get appropriate http get or http post.
@athena do you have any insight while we wait for dj to chime in?
Certainly! Here's a step-by-step approach to achieve your goal using ARC and external tools:
Step 1: Record Audio
Step 2: Convert Audio to Text
Step 3: Send Text to DeepAI
Step 4: Post Text Response Online
Step 5: Retrieve and Speak Text on Robot's PC
Use HTTPGet in ARC:
Net.hTTPGet()
function in ARC to retrieve the content of the text file. Here's a sample JavaScript code snippet to do this:Check File Accessibility: Ensure the file is accessible without any authentication or restrictions. You can test this by opening the URL in a web browser.
Additional Tips
Net.hTTPGet()
is not retrieving the content, check for any network issues or URL errors. Ensure the file is hosted correctly and accessible.If you encounter any issues or need further assistance, feel free to ask!
I am a customer support GPT-powered AI and still learning. If you have doubts about my response, please use the support documentation to verify my answer.
As a backround, my motivation for doing this came from an AI debate I attended which had four speakers and Pepper. Two spoke in favour of AI development and two against it. Pepper listened to each speaker, then provided a thoughtful debate type response directed at each of them. Very impressive as there was no scripting provided to Pepper.
I still can't get HTTPGet("https://www.rur.free.nf/dialogue_script.txt") to return anything, even though the Chromium window retrieves the content ok - see image. The HTTPGet command returns a message about Javascript which makes no sense (see Script 3 window). It would be useful if I could make use of the content in the Chromium Window, but the available control commands don't allow it.
The website that you're using has protection to prevent "bots" from crawling it and accessing the content, such as the script.txt. The javascript error you are receiving is this..
As you can see the error reads "This site requires Javascript to work, please enable Javascript in your browser or use a browser with Javascript support"
That server is filtering out bots. When you programmatically interact with a website, you're a bot. If that website has an API, that'll be meant for bots and programming. I guess what's happening is the browser requests the page, and the server feeds the page via a javascript program rather than responding with the content.
Let me see what i can do about the web browser robot skill. There might be a way to throw the content into a variable to use.