llm improvements
- document the tokenizer used (https://github.com/huggingface/swift-transformers) - provide a hook for tokenizer configuration, prompt augmentation - this isn't as rich as the python equivalents but it helps a little
This commit is contained in:
@@ -4,9 +4,22 @@ This is a port of several models from:
|
||||
|
||||
- https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/models/
|
||||
|
||||
You can use this to load models from huggingface, e.g.:
|
||||
using the Hugging Face swift transformers package to provide tokenization:
|
||||
|
||||
- https://huggingface.co/mlx-community/Mistral-7B-v0.1-hf-4bit-mlx
|
||||
https://github.com/huggingface/swift-transformers
|
||||
|
||||
The [Models.swift](Models.swift) provides minor overrides and customization --
|
||||
if you require overrides for the tokenizer or prompt customizations they can be
|
||||
added there.
|
||||
|
||||
This is set up to load models from Hugging Face, e.g. https://huggingface.co/mlx-community
|
||||
|
||||
The following models have been tried:
|
||||
|
||||
- mlx-community/Mistral-7B-v0.1-hf-4bit-mlx
|
||||
- mlx-community/CodeLlama-13b-Instruct-hf-4bit-MLX
|
||||
- mlx-community/phi-2-hf-4bit-mlx
|
||||
- mlx-community/quantized-gemma-2b-it
|
||||
|
||||
Currently supported model types are:
|
||||
|
||||
|
||||
Reference in New Issue
Block a user