Any reason why this longer context length wasn't applied to the chat and instruct versions?
It would be super useful to have a sequence length longer than 2048 for chat or instruct.
In the model cards we explain how to take advantage of ALiBi so that you can increase the maximum sequence length during inference:
Although the model was trained with a sequence length of 2048, ALiBi enables users to increase the maximum sequence length during finetuning and/or inference. For example:
import transformers

# load the config and raise the maximum sequence length beyond the 2048 used in training
config = transformers.AutoConfig.from_pretrained(
    'mosaicml/mpt-7b-instruct',
    trust_remote_code=True,
)
config.update({"max_seq_len": 4096})

# load the model with the updated config
model = transformers.AutoModelForCausalLM.from_pretrained(
    'mosaicml/mpt-7b-instruct',
    config=config,
    trust_remote_code=True,
)
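As a quick sanity check, here is a minimal sketch of using the extended window at inference time. It assumes the model fits in memory and uses the EleutherAI/gpt-neox-20b tokenizer noted in the MPT model cards; the prompt text is a placeholder:

# MPT models reuse the EleutherAI/gpt-neox-20b tokenizer
tokenizer = transformers.AutoTokenizer.from_pretrained('EleutherAI/gpt-neox-20b')

# the updated config now reports the longer window
print(model.config.max_seq_len)  # 4096

# hypothetical input that exceeds the original 2048-token window
long_prompt = "word " * 3000
inputs = tokenizer(long_prompt, return_tensors='pt').to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))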
Many thanks @jacobfulano.
Is it possible to directly deploy with that configuration using SageMaker, or would I have to take a route like building a Docker image that includes the config update? Thanks
Hi @RonanMcGovern, I'm not sure what the limitations of SageMaker are, but as long as you can pass in a custom HF config, you can set config.max_seq_len=4096 dynamically at start time. It's just a config arg, no need for custom Docker images or anything.
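For example, here is a minimal sketch assuming your deployment path lets you supply a custom inference script (the model_fn hook below follows the SageMaker Hugging Face inference toolkit convention; adapt it to whatever entry point you actually use):

# inference.py (hypothetical custom entry point)
import transformers

def model_fn(model_dir):
    # model_dir is the directory SageMaker unpacks the artifacts into; a Hub ID would also work
    config = transformers.AutoConfig.from_pretrained(model_dir, trust_remote_code=True)
    config.max_seq_len = 4096  # extend the context window via ALiBi
    model = transformers.AutoModelForCausalLM.from_pretrained(
        model_dir,
        config=config,
        trust_remote_code=True,
    )
    tokenizer = transformers.AutoTokenizer.from_pretrained(model_dir)
    # return whatever object your handler expects; a text-generation pipeline is one option
    return transformers.pipeline('text-generation', model=model, tokenizer=tokenizer)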
Hi @abhi-mosaic, I've managed to get MPT-7B running well on Google Colab with:
# use cache directory while loading the model
self.model = AutoModelForCausalLM.from_pretrained(
    model_name,
    cache_dir=cache_dir,
    torch_dtype=torch_dtype,
    trust_remote_code=trust_remote_code,
    use_auth_token=use_auth_token,
)
I'm then trying to set max_seq_len in the config as well, but I'm not getting any joy:
# Load the configuration
config = transformers.AutoConfig.from_pretrained(
    model_name,
    trust_remote_code=trust_remote_code,
)

# Explicitly set the max_seq_len
config.max_seq_len = 4096

# Load the model with the updated configuration
self.model = transformers.AutoModelForCausalLM.from_pretrained(
    model_name,
    config=config,
    cache_dir=cache_dir,
    torch_dtype=torch_dtype,
    trust_remote_code=trust_remote_code,
    use_auth_token=use_auth_token,
)
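One quick way to see whether the override actually took effect (a minimal sketch, using only the objects created above):

# confirm the value set on the config made it into the loaded model
print(config.max_seq_len)             # 4096, the value set above
print(self.model.config.max_seq_len)  # should match if the custom config was applied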