PygmalionAI/pygmalion-6b · Truncated or empty text response?

I have an Oobabooga 1.0.1 Runpod with the API enabled. I am trying to use this pod as a Pygmalion REST API backend for a chat frontend.
If I send a POST request to the pod like this:

curl --request POST \
     --url https://[redacted].proxy.runpod.net/api/v1/generate \
     --header "accept: application/json" \
     --header "content-type: application/json" \
     --data @- <<EOF
{
    "prompt": "Can you tell me a joke?",
    "do_sample": true,
    "max_length": 300,
    "temperature": 0.9
}
EOF

I get a truncated response like this:

{"results": [{"text": "e* a joke or two that you know the significance of, Null?\nYou: You could always recursively call the function to do so!\n"}]}

This response looks truncated (it starts mid-word with "e*") and has little to do with the prompt.

If I follow the prompt suggestion from the Pygmalion 6B documentation:

curl --request POST \
     --url https://[redacted].proxy.runpod.net/api/v1/generate \
     --header "accept: application/json" \
     --header "content-type: application/json" \
     --data @- <<EOF
{
    "prompt": "AI's Persona: AI is a helpful assistant.\n<START>\nYou: Hi!\nAI: Hi! How can I help you?\nYou: What's the color of Apple?\nAI: The color of Apple is red.\nYou: Are you happy?\nAI: Yes I am happy. What about you?\nYou: Can you tell me a joke?\nAI: ",
    "do_sample": true,
    "max_length": 300,
    "temperature": 0.9
}
EOF

The response is either truncated as above or, even worse, an empty text response like this:

{"results": [{"text": ""}]}

Any idea what I am missing here?
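
In case it helps anyone reproduce this: a JSON string cannot contain raw line breaks, so a multi-line prompt has to reach the API with its newlines written as \n. Below is a minimal sketch of building such a request body in plain shell. The helper name json_escape is made up for illustration, it only handles backslashes, double quotes and newlines (not tabs or other control characters, and a trailing newline in the prompt is dropped), and the shortened prompt is just an example. Whether this has anything to do with the truncated or empty responses is an open question.

```shell
#!/bin/sh
# Hypothetical helper: escape a string for use inside a JSON string literal.
# Handles only backslashes, double quotes and newlines.
json_escape() {
  printf '%s' "$1" \
    | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' \
    | awk 'NR > 1 { printf "%s", "\\n" } { printf "%s", $0 }'
}

# Example multi-line prompt in the persona / <START> / dialogue layout.
prompt="AI's Persona: AI is a helpful assistant.
<START>
You: Can you tell me a joke?
AI:"

# Build the request body on a single line, with the prompt escaped.
payload=$(printf '{"prompt": "%s", "do_sample": true, "max_length": 300, "temperature": 0.9}' \
  "$(json_escape "$prompt")")

# Print the body so it can be inspected before sending.
printf '%s\n' "$payload"

# To send it, pipe the body to curl on stdin (URL redacted as above):
# printf '%s' "$payload" | curl --request POST \
#      --url https://[redacted].proxy.runpod.net/api/v1/generate \
#      --header "content-type: application/json" \
#      --data @-
```

Printing the body first makes it easy to confirm that the server receives the persona, the <START> marker and the dialogue turns exactly as intended before looking for the cause elsewhere.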