Process audio and generate text output based on instructions
Chat with an AI assistant using text and images