Latest commit: ac6bdfccec by Anthony, 2024-07-26 13:05:42 -07:00

Add Llama 3.1 (#98)

* Update Mistral 7B config
* Add Mistral NeMo
* Update for Llama 3.1
* Align LlamaConfiguration with Python implementation
* Fix model configuration names
* Refine DynamicNTKScalingRoPE
* compute base only once

Co-authored-by: Awni Hannun <awni@apple.com>

| File | Last commit | Date |
| --- | --- | --- |
| Cohere.swift | implement LoRA / QLoRA (#46) | 2024-04-22 09:30:12 -07:00 |
| Configuration.swift | Add Gemma 2 (#88) | 2024-07-01 09:35:43 -07:00 |
| Evaluate.swift | Fix extra EOS tokens (#91) | 2024-07-03 16:22:29 -07:00 |
| Gemma.swift | Add Gemma 2 (#88) | 2024-07-01 09:35:43 -07:00 |
| Llama.swift | Add Llama 3.1 (#98) | 2024-07-26 13:05:42 -07:00 |
| LLM.h | initial commit | 2024-02-22 10:41:02 -08:00 |
| LLMModel.swift | handle partially quantized models (#76) | 2024-05-28 16:35:11 -07:00 |
| Load.swift | handle partially quantized models (#76) | 2024-05-28 16:35:11 -07:00 |
| Lora.swift | handle partially quantized models (#76) | 2024-05-28 16:35:11 -07:00 |
| Lora+Data.swift | implement LoRA / QLoRA (#46) | 2024-04-22 09:30:12 -07:00 |
| Models.swift | Add Llama 3.1 (#98) | 2024-07-26 13:05:42 -07:00 |
| OpenELM.swift | remove the bias in the ffn module (#68) | 2024-05-08 15:31:28 -07:00 |
| Phi3.swift | phi3 (#54) | 2024-04-24 09:31:01 -07:00 |
| Phi.swift | implement LoRA / QLoRA (#46) | 2024-04-22 09:30:12 -07:00 |
| Qwen2.swift | handle partially quantized models (#76) | 2024-05-28 16:35:11 -07:00 |
| README.md | implement LoRA / QLoRA (#46) | 2024-04-22 09:30:12 -07:00 |
| Starcoder2.swift | implement LoRA / QLoRA (#46) | 2024-04-22 09:30:12 -07:00 |
| Tokenizer.swift | implement LoRA / QLoRA (#46) | 2024-04-22 09:30:12 -07:00 |

README.md

LLM

This is a port of several models from:

https://github.com/ml-explore/mlx-examples/blob/main/llms/mlx_lm/models/

using the Hugging Face swift transformers package to provide tokenization:

https://github.com/huggingface/swift-transformers

Models.swift provides minor per-model overrides and customization. If you need to override the tokenizer or customize the prompt for a model, add that there.
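As a rough illustration of what a per-model override could look like, here is a minimal sketch. The type `ModelOverride`, its fields, the `overrides` registry, and the prompt template are all hypothetical names chosen for this example; they are not taken from Models.swift, so check that file for the actual structure.

```swift
import Foundation

// Hypothetical sketch: these names are illustrative and are not taken from Models.swift.
struct ModelOverride {
    /// Hugging Face repository id, e.g. "mlx-community/phi-2-hf-4bit-mlx".
    let id: String
    /// Optional replacement tokenizer class name; nil keeps the model's default.
    var tokenizerOverride: String? = nil
    /// Optional hook that wraps the raw user prompt in a model-specific template.
    var preparePrompt: ((String) -> String)? = nil

    /// Applies the prompt hook if one is set, otherwise returns the prompt unchanged.
    func prepare(prompt: String) -> String {
        preparePrompt?(prompt) ?? prompt
    }
}

// Registry keyed by repository id; the template below is an assumed example.
let overrides: [String: ModelOverride] = [
    "mlx-community/phi-2-hf-4bit-mlx": ModelOverride(
        id: "mlx-community/phi-2-hf-4bit-mlx",
        preparePrompt: { prompt in "Instruct: \(prompt)\nOutput: " }
    ),
    "mlx-community/Mistral-7B-v0.1-hf-4bit-mlx": ModelOverride(
        id: "mlx-community/Mistral-7B-v0.1-hf-4bit-mlx"
    ),
]

/// Looks up the override for a model id and falls back to the raw prompt
/// when the model has no entry or no prompt hook.
func preparedPrompt(for id: String, prompt: String) -> String {
    overrides[id]?.prepare(prompt: prompt) ?? prompt
}

print(preparedPrompt(for: "mlx-community/phi-2-hf-4bit-mlx", prompt: "Why is the sky blue?"))
```

Keeping the hook optional means a model with no special requirements needs only its id, and callers never have to check whether a customization exists.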

This is set up to load models from Hugging Face, e.g. https://huggingface.co/mlx-community

The following models have been tried:

mlx-community/Mistral-7B-v0.1-hf-4bit-mlx
mlx-community/CodeLlama-13b-Instruct-hf-4bit-MLX
mlx-community/phi-2-hf-4bit-mlx
mlx-community/quantized-gemma-2b-it

Currently supported model types are:

Llama / Mistral
Gemma
Phi

See Configuration.swift for more info.
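One plausible way to select an implementation is to read the `model_type` field from a model's `config.json`. The sketch below assumes that approach; the enum `SketchModelType`, the struct `BaseConfig`, and the function `modelType(fromConfigJSON:)` are hypothetical names for illustration and may not match what Configuration.swift actually does.

```swift
import Foundation

// Hypothetical sketch; the enum and its cases are illustrative,
// not the contents of Configuration.swift.
enum SketchModelType: String, Decodable {
    case llama
    case mistral
    case gemma
    case phi
}

/// Only the field needed for dispatch; other keys in config.json are ignored.
struct BaseConfig: Decodable {
    let modelType: SketchModelType

    enum CodingKeys: String, CodingKey {
        case modelType = "model_type"
    }
}

/// Returns the model type declared in a config.json payload,
/// or nil if the field is missing or names an unsupported type.
func modelType(fromConfigJSON data: Data) -> SketchModelType? {
    (try? JSONDecoder().decode(BaseConfig.self, from: data))?.modelType
}

// Assumed sample payload; a real config.json has many more keys.
let sample = Data(#"{"model_type": "mistral", "hidden_size": 4096}"#.utf8)
if let kind = modelType(fromConfigJSON: sample) {
    print("model type: \(kind.rawValue)")
} else {
    print("unsupported model type")
}
```

Decoding into a string-backed enum makes an unknown `model_type` fail at load time rather than later during inference.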

See llm-tool for examples of how to run these models.

LoRA

Lora.swift contains an implementation of LoRA based on this example:

https://github.com/ml-explore/mlx-examples/tree/main/lora

See llm-tool/LoraCommands.swift for an example of a driver, and llm-tool for examples of how to run it.
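For readers new to LoRA, the core idea is to freeze a layer's weight matrix W and train only a low-rank update, so the layer computes y = W x + scale * B (A x). The sketch below shows that computation on plain Swift arrays. It is not the code in Lora.swift: the names `LoRALinearSketch`, `loraA`, `loraB`, and `scale` are assumptions made for this example, and a real implementation would use MLX arrays.

```swift
import Foundation

// Hypothetical sketch of a LoRA-adapted linear layer on plain arrays;
// this is not the Lora.swift implementation.
typealias Matrix = [[Double]]

/// Multiplies a (rows x cols) matrix by a vector of length cols.
func matVec(_ m: Matrix, _ v: [Double]) -> [Double] {
    m.map { row in
        zip(row, v).reduce(0.0) { acc, pair in acc + pair.0 * pair.1 }
    }
}

struct LoRALinearSketch {
    let weight: Matrix   // frozen base weight, shape (out, in)
    var loraA: Matrix    // trainable, shape (rank, in)
    var loraB: Matrix    // trainable, shape (out, rank), zero at initialization
    let scale: Double    // scaling applied to the low-rank update

    /// Computes y = W x + scale * B (A x).
    func callAsFunction(_ x: [Double]) -> [Double] {
        let base = matVec(weight, x)
        let update = matVec(loraB, matVec(loraA, x))
        return zip(base, update).map { pair in pair.0 + scale * pair.1 }
    }
}

// Rank-1 adapter on a 2x2 identity layer. With B all zeros the adapter
// contributes nothing, so the layer starts out identical to the base model.
let layer = LoRALinearSketch(
    weight: [[1, 0], [0, 1]],
    loraA: [[0.5, 0.5]],
    loraB: [[0], [0]],
    scale: 2.0
)
print(layer([3, 4]))

// Simulate a trained adapter by giving B a nonzero entry.
var tuned = layer
tuned.loraB = [[1], [0]]
print(tuned([3, 4]))
```

Only `loraA` and `loraB` would receive gradients during fine-tuning, which is rank * (in + out) values instead of the out * in values in the full weight matrix.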