# Llama

This is a port of several models from:

- https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/models/

using the Hugging Face swift-transformers package to provide tokenization:

https://github.com/huggingface/swift-transformers

[Models.swift](Models.swift) provides minor overrides and customization. If you
require overrides for the tokenizer or prompt customizations, they can be
added there.

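
As a rough illustration of what a per-model hook for tokenizer overrides and prompt customization could look like, here is a minimal sketch. The type `ModelOverrides`, its fields, and the phi-2 style prompt template are all hypothetical and chosen for illustration; the actual declarations in Models.swift may differ.

```swift
// Hypothetical sketch: names and shapes are illustrative, not the actual
// declarations in Models.swift.
struct ModelOverrides {
    /// Hugging Face repository id, e.g. "mlx-community/phi-2-hf-4bit-mlx".
    let id: String
    /// Optional tokenizer class name to use instead of the one the model declares.
    let overrideTokenizer: String?
    /// Optional closure that rewrites the user's prompt before tokenization.
    let preparePrompt: ((String) -> String)?

    init(
        id: String,
        overrideTokenizer: String? = nil,
        preparePrompt: ((String) -> String)? = nil
    ) {
        self.id = id
        self.overrideTokenizer = overrideTokenizer
        self.preparePrompt = preparePrompt
    }

    /// Applies the prompt hook if one is set, otherwise returns the prompt unchanged.
    func prepare(prompt: String) -> String {
        return preparePrompt?(prompt) ?? prompt
    }
}

// A model with a prompt hook (the template here is an assumption).
let phi = ModelOverrides(id: "mlx-community/phi-2-hf-4bit-mlx") { prompt in
    "Instruct: \(prompt)\nOutput: "
}

// A model with no overrides: the prompt passes through untouched.
let mistral = ModelOverrides(id: "mlx-community/Mistral-7B-v0.1-hf-4bit-mlx")

print(phi.prepare(prompt: "Why is the sky blue?"))
print(mistral.prepare(prompt: "Why is the sky blue?"))
```

Keeping the hook as an optional closure means models that need no special handling carry no extra code, while models with unusual prompt formats can be adjusted in one place.
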
This is set up to load models from Hugging Face, e.g. https://huggingface.co/mlx-community

The following models have been tried:

- mlx-community/Mistral-7B-v0.1-hf-4bit-mlx
- mlx-community/CodeLlama-13b-Instruct-hf-4bit-MLX
- mlx-community/phi-2-hf-4bit-mlx
- mlx-community/quantized-gemma-2b-it

Currently supported model types are:

- Llama / Mistral
- Gemma
- Phi

See [Configuration.swift](Configuration.swift) for more info.

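
To show how a model type might be selected, the sketch below reads the `model_type` field from a model's `config.json` and maps it to one of the supported families. This is an assumption-laden illustration: the names `ModelFamily`, `BaseConfig`, and `modelFamily(fromConfigJSON:)` are hypothetical, and the string values `"llama"`, `"mistral"`, `"gemma"`, and `"phi"` are assumed to match what the model repositories declare. The real logic in Configuration.swift may be organized differently.

```swift
import Foundation

// Hypothetical enum; the names used in Configuration.swift may differ.
enum ModelFamily {
    case llama  // assumed to cover Mistral as well, as the list above groups them
    case gemma
    case phi

    /// Maps a `model_type` string to a family, or nil if it is unsupported.
    init?(modelType: String) {
        switch modelType {
        case "llama", "mistral": self = .llama
        case "gemma": self = .gemma
        case "phi": self = .phi
        default: return nil
        }
    }
}

/// Only the field needed for dispatch; other keys in config.json are ignored.
struct BaseConfig: Decodable {
    let modelType: String

    enum CodingKeys: String, CodingKey {
        case modelType = "model_type"
    }
}

/// Decodes config.json data and returns the matching family, if supported.
func modelFamily(fromConfigJSON data: Data) throws -> ModelFamily? {
    let config = try JSONDecoder().decode(BaseConfig.self, from: data)
    return ModelFamily(modelType: config.modelType)
}

// Usage with an inline, made-up config fragment.
let sampleConfig = Data(#"{"model_type": "gemma", "hidden_size": 2048}"#.utf8)
do {
    if let family = try modelFamily(fromConfigJSON: sampleConfig) {
        print("model family: \(family)")
    } else {
        print("unsupported model type")
    }
} catch {
    print("could not parse config: \(error)")
}
```

Decoding only the discriminating field first lets the loader pick a model-specific configuration type before parsing the rest of the file.
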
See [llm-tool](../../Tools/llm-tool) for a command line tool that uses this library to generate text.