Files

David Koski 82f6a969d4 llm improvements

- document the tokenizer used (https://github.com/huggingface/swift-transformers)
- provide a hook for tokenizer configuration, prompt augmentation
	- this isn't as rich as the python equivalents but it helps a little

2024-03-01 14:46:32 -08:00

896 B

Raw Blame History

Llama

This is a port of several models from:

https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/models/

using the Hugging Face swift transformers package to provide tokenization:

https://github.com/huggingface/swift-transformers

The Models.swift provides minor overrides and customization -- if you require overrides for the tokenizer or prompt customizations they can be added there.

This is set up to load models from Hugging Face, e.g. https://huggingface.co/mlx-community

The following models have been tried:

mlx-community/Mistral-7B-v0.1-hf-4bit-mlx
mlx-community/CodeLlama-13b-Instruct-hf-4bit-MLX
mlx-community/phi-2-hf-4bit-mlx
mlx-community/quantized-gemma-2b-it

Currently supported model types are:

Llama / Mistral
Gemma
Phi

See Configuration.swift for more info.

See llm-tool

896 B Raw Blame History

Llama

896 B

Raw Blame History