Compatibility with TGI (Text Generation Inference) framework v1.4.3
#1, opened by hugging-face-infrax
Hello Team, we're currently trying to deploy the jan-hq/komodo-7b-chat-adapter model with TGI, but we're running into an issue.
Server and tool specs
- TGI: v1.4.3
- Azure A100 (80 GB)
How to Reproduce
- Get access to the base model https://huggingface.co./Yellow-AI-NLP/komodo-7b-base so we can load the PEFT model.
- Run the inference framework with our docker-compose.yml file:
services:
  llm:
    image: ghcr.io/huggingface/text-generation-inference:1.4.3
    container_name: llm
    command: >
      --model-id jan-hq/komodo-7b-chat-adapter
      --max-total-tokens 8192
      --max-input-length 4096
      --num-shard 1
      --max-top-n-tokens 1
      --max-best-of 1
      --trust-remote-code
      --disable-custom-kernels
      --max-stop-sequences 1
      --validation-workers 1
      --waiting-served-ratio 0
      --max-batch-total-tokens 8192
      --max-waiting-tokens 4096
      --cuda-memory-fraction 0.8
      --max-concurrent-requests 512
      --max-batch-prefill-tokens 8192
    volumes:
      - ./data:/data
    ports:
      - 8080:80
    shm_size: '1gb'
    environment:
      - "HUGGING_FACE_HUB_TOKEN=${TOKEN}"
    restart: always
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:80/health"]
      interval: 30s
      timeout: 45s
      start_period: 180s
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
- Check the logs. We get a long series of warnings like these:
on.rs:159: Warning: Token 'gubernur' was expected to have ID '34993' but was given ID 'None'
2024-03-20T05:42:17.368924Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.15.2/src/tokenizer/serialization.rs:159: Warning: Token 'seiring' was expected to have ID '34994' but was given ID 'None'
2024-03-20T05:42:17.368927Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.15.2/src/tokenizer/serialization.rs:159: Warning: Token 'Imam' was expected to have ID '34995' but was given ID 'None'
2024-03-20T05:42:17.368929Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.15.2/src/tokenizer/serialization.rs:159: Warning: Token 'pengurus' was expected to have ID '34996' but was given ID 'None'
2024-03-20T05:42:17.368932Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.15.2/src/tokenizer/serialization.rs:159: Warning: Token 'Premier' was expected to have ID '34997' but was given ID 'None'
2024-03-20T05:42:17.368934Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.15.2/src/tokenizer/serialization.rs:159: Warning: Token 'teknik' was expected to have ID '34998' but was given ID 'None'
2024-03-20T05:42:17.368937Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.15.2/src/tokenizer/serialization.rs:159: Warning: Token 'Lombok' was expected to have ID '34999' but was given ID 'None'
2024-03-20T05:42:17.368939Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.15.2/src/tokenizer/serialization.rs:159: Warning: Token 'penerimaan' was expected to have ID '35000' but was given ID 'None'
2024-03-20T05:42:17.368941Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.15.2/src/tokenizer/serialization.rs:159: Warning: Token 'Nah' was expected to have ID '35001' but was given ID 'None'
2024-03-20T05:42:17.368944Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.15.2/src/tokenizer/serialization.rs:159: Warning: Token 'Cabang' was expected to have ID '35002' but was given ID 'None'
2024-03-20T05:42:17.368946Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.15.2/src/tokenizer/serialization.rs:159: Warning: Token 'berikan' was expected to have ID '35003' but was given ID 'None'
2024-03-20T05:42:17.368949Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.15.2/src/tokenizer/serialization.rs:159: Warning: Token 'Perhubungan' was expected to have ID '35004' but was given ID 'None'
2024-03-20T05:42:17.368953Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.15.2/src/tokenizer/serialization.rs:159: Warning: Token 'Gunakan' was expected to have ID '35005' but was given ID 'None'
2024-03-20T05:42:17.368956Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.15.2/src/tokenizer/serialization.rs:159: Warning: Token 'Turki' was expected to have ID '35006' but was given ID 'None'
2024-03-20T05:42:17.368958Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.15.2/src/tokenizer/serialization.rs:159: Warning: Token 'fans' was expected to have ID '35007' but was given ID 'None'
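In case it helps with triage, here is a minimal Python sketch for summarizing these warnings instead of reading them line by line. It is not part of our deployment. The function names `parse_token_warnings` and `summarize` are hypothetical, and the regular expression assumes every warning follows exactly the format shown in the log excerpt above.

```python
import re
from typing import Dict, List, Optional, Tuple

# Assumption: every warning follows the format seen in the TGI log excerpt, e.g.
# "Warning: Token 'fans' was expected to have ID '35007' but was given ID 'None'"
WARNING_RE = re.compile(
    r"Warning: Token '(?P<token>.*?)' was expected to have ID '(?P<expected>\d+)' "
    r"but was given ID '(?P<given>[^']*)'"
)


def parse_token_warnings(log_text: str) -> List[Tuple[str, int, str]]:
    """Return (token, expected_id, given_id) for each matching warning line."""
    results = []
    for line in log_text.splitlines():
        match = WARNING_RE.search(line)
        if match:
            results.append(
                (match["token"], int(match["expected"]), match["given"])
            )
    return results


def summarize(warnings: List[Tuple[str, int, str]]) -> Optional[Dict[str, int]]:
    """Report how many tokens were affected and the range of expected IDs."""
    if not warnings:
        return None
    ids = [expected for _, expected, _ in warnings]
    return {"count": len(ids), "min_id": min(ids), "max_id": max(ids)}


if __name__ == "__main__":
    # Two lines shortened from the log excerpt; in practice, read the full
    # container log from a file or pipe it in.
    sample = (
        "serialization.rs:159: Warning: Token 'gubernur' was expected to have "
        "ID '34993' but was given ID 'None'\n"
        "serialization.rs:159: Warning: Token 'fans' was expected to have "
        "ID '35007' but was given ID 'None'\n"
    )
    print(summarize(parse_token_warnings(sample)))
```

Running this over the full log would show how many tokens are affected and which ID range they cover, which may help narrow down whether the tokenizer files loaded by TGI match the ones the adapter expects.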
Could you help us check whether something is missing from the configuration? Thank you.