Spaces: GameScribes/Multipurpose-AI-Agent-Development

Running on T4

devve1 committed on Aug 5

Commit 3dcb8a9 • 1 Parent(s): 1346d2a

Update app.py

Files changed (1)

app.py

@@ -178,15 +178,15 @@ def load_models_and_documents():
         tokenizer = AutoTokenizer.from_pretrained(model_path)
         llm = vllm.LLM(
-            model_path,
+            model_path,
             tensor_parallel_size=1,
             max_model_len=12288,
             trust_remote_code=True,
             enforce_eager=True,
-            quantization='awq',
+            quantization="bitsandbytes",
             gpu_memory_utilization=0.9,
-            dtype='auto'
-            #load_format='npcache'
+            dtype='auto',
+            load_format="bitsandbytes"
         )
         model = models.VLLM(llm)
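
Note: this commit switches the quantization backend from AWQ to bitsandbytes and sets `load_format` to match. A minimal sketch of keeping those two settings in sync is shown here. The helper `build_llm_kwargs` is hypothetical and not part of app.py, and the assumption that `load_format` should mirror `quantization` only for bitsandbytes is inferred from the new side of this commit, so it may not hold for every vLLM version.

```python
from typing import Any, Dict, Optional


def build_llm_kwargs(quantization: Optional[str] = None) -> Dict[str, Any]:
    """Collect keyword arguments for vllm.LLM (hypothetical helper, not in app.py)."""
    # Settings that are identical on both sides of the commit.
    kwargs: Dict[str, Any] = {
        "tensor_parallel_size": 1,
        "max_model_len": 12288,
        "trust_remote_code": True,
        "enforce_eager": True,
        "gpu_memory_utilization": 0.9,
        "dtype": "auto",
    }
    if quantization is not None:
        kwargs["quantization"] = quantization
    # Assumption: only the bitsandbytes path needs a matching load_format,
    # because the commit sets both values together.
    if quantization == "bitsandbytes":
        kwargs["load_format"] = "bitsandbytes"
    return kwargs


if __name__ == "__main__":
    # Intended use (requires vllm and a GPU, so it is left as a comment):
    #   llm = vllm.LLM(model_path, **build_llm_kwargs("bitsandbytes"))
    print(build_llm_kwargs("bitsandbytes"))
```

Building the arguments in one place means a later backend change touches a single call site instead of two separate keyword arguments.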