Keep getting `model_kwargs` are not used by the model: ['token_type_ids']
#60 opened over 1 year ago
by
uglydumpling
Falcon models slow inference
10
#59 opened over 1 year ago
by
mikeytrw
I need an API of Falcon
8
#56 opened over 1 year ago
by
JustMe4Real
Google Colab for Falcon 40B and 7B with Live Response Streaming
3
#55 opened over 1 year ago
by
gaodrew
Can anyone help me get a prompt template for a question-answering model?
2
#54 opened over 1 year ago
by
Iamexperimenting
Might be interesting to have a thread on people with successful implementations, and on what kind of hardware
1
#53 opened over 1 year ago
by
LinuxMagic
Batch inference seems to be done sequentially
3
#50 opened over 1 year ago
by
yard1
Extracting attention maps
#49 opened over 1 year ago
by
roeehendel
Error with custom inference loop with past_key_values
26
#48 opened over 1 year ago
by
dimaischenko
Fix the kv-cache dimensions
1
#47 opened over 1 year ago
by
cchudant
Multi GPU inference issue
1
#39 opened over 1 year ago
by
eastwind
Is it on purpose? Loss for single-label and multi-label switched.
#36 opened over 1 year ago
by
rhy2023
Fine-tuning on a new language
4
#35 opened over 1 year ago
by
AliMirlou
Flash attention
2
#34 opened over 1 year ago
by
utensil
About evaluating on HumanEval
#33 opened over 1 year ago
by
dongZheX
Finetune on "uncensored" dataset?
1
#32 opened over 1 year ago
by
sivarajan
Tokenizer Details
#31 opened over 1 year ago
by
kye
Import dataset and chat with it
2
#27 opened over 1 year ago
by
phdykd
Working code with full server requirements
2
#24 opened over 1 year ago
by
gmjolt
Bug: Generate method doesn't work for falcon-7b and falcon-40b in int8 mode.
#22 opened over 1 year ago
by
avacaondata
It can run with two 4090s or a single 6000 Ada.
5
#20 opened over 1 year ago
by
znsoft
Finetune with QLoRA please
7
#14 opened over 1 year ago
by
supercharge19
How to set trust_remote_code to true?
13
#9 opened over 1 year ago
by
gmjolt
[Bug] Does not work
58
#3 opened over 1 year ago
by
catid