Commit Graph

21 Commits

Anthony
ac6bdfccec Add Llama 3.1 (#98)
* Update Mistral 7B config

* Add Mistral NeMo

* Update for Llama 3.1

* Align LlamaConfiguration with Python implementation

* Fix model configuration names

* Refine DynamicNTKScalingRoPE

* compute base only once

---------

Co-authored-by: Awni Hannun <awni@apple.com>
2024-07-26 13:05:42 -07:00
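The DynamicNTKScalingRoPE refinement above follows the common "dynamic NTK" recipe: when the current sequence length exceeds the model's trained context length, the RoPE base is stretched so the rotary frequencies cover the longer context. Because the scaled base depends only on the configuration and the current length, it can be cached, which is what "compute base only once" refers to. A minimal sketch of the formula (plain Python, illustrative names, not the repo's Swift code):

```python
def dynamic_ntk_base(base, dims, scale, max_position_embeddings, sequence_length):
    """Stretch the RoPE base when the context outgrows the trained length.

    Within the trained context the base is returned unchanged; beyond it,
    the base grows so rotary frequencies are rescaled for the longer window.
    """
    if sequence_length <= max_position_embeddings:
        return base
    ratio = scale * sequence_length / max_position_embeddings - (scale - 1)
    return base * ratio ** (dims / (dims - 2))

short = dynamic_ntk_base(10_000, 128, 1.0, 4096, 2048)  # base unchanged
long = dynamic_ntk_base(10_000, 128, 1.0, 4096, 8192)   # base grows
```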
David Koski
9d74afd119 handle partially quantized models (#76)
* handle partially quantized models

- fix for #53 #71 #69 #74
- in order to test the models
	- I added a default prompt of an appropriate form
	- while working on the model configuration also added additional stop tokens (#74)
- fixed the repetitionPenalty code (#71)
2024-05-28 16:35:11 -07:00
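The repetitionPenalty fix (#71) concerns the standard CTRL-style penalty applied to already-generated tokens: positive logits are divided by the penalty and negative logits multiplied by it, so the adjustment always lowers a repeated token's probability when the penalty exceeds 1. A minimal sketch of that scheme (plain Python, not the repo's Swift code; the default of 1.3 is illustrative):

```python
def apply_repetition_penalty(logits, seen_tokens, penalty=1.3):
    """CTRL-style repetition penalty: push logits of seen tokens down.

    Dividing positive logits and multiplying negative ones keeps the
    adjustment a true penalty regardless of the logit's sign.
    """
    out = list(logits)
    for t in set(seen_tokens):
        out[t] = out[t] / penalty if out[t] > 0 else out[t] * penalty
    return out

logits = [2.0, -1.0, 0.5]
penalized = apply_repetition_penalty(logits, seen_tokens=[0, 1])
# token 0: 2.0 -> 2.0/1.3, token 1: -1.0 -> -1.3, token 2 untouched
```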
Awni Hannun
b951b78eb2 phi3 (#54)
* phi3

Co-authored-by: David Koski <dkoski@apple.com>
2024-04-24 09:31:01 -07:00
David Koski
6c0b66f90a implement LoRA / QLoRA (#46)
* implement LoRA / QLoRA

- example of using MLX to fine-tune an LLM with low rank adaptation (LoRA) for a target task
- see also https://arxiv.org/abs/2106.09685
- based on https://github.com/ml-explore/mlx-examples/tree/main/lora

* add some command line flags I found useful during use
- --quiet -- don't print decorator text, just the generated text
- --prompt @/tmp/file.txt -- load prompt from file

* user can specify path to model OR model identifier in huggingface

* update mlx-swift reference

Co-authored-by: Ashraful Islam <ashraful.meche@gmail.com>
Co-authored-by: JustinMeans <46542161+JustinMeans@users.noreply.github.com>
2024-04-22 09:30:12 -07:00
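The LoRA commit above follows the paper it links: instead of updating a frozen weight W, training learns two small matrices A (r x in) and B (out x r), and the adapted layer computes with the effective weight W + (alpha/r) * B @ A. A tiny numeric sketch of that update (plain Python with nested lists, illustrative shapes and values, not the repo's Swift implementation):

```python
def matmul(a, b):
    """Naive matrix multiply over nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_delta(A, B, alpha, r):
    """Low-rank update (alpha/r) * B @ A to be added to the frozen weight."""
    scale = alpha / r
    return [[scale * v for v in row] for row in matmul(B, A)]

W = [[1.0, 2.0], [3.0, 4.0]]   # frozen base weight (out x in)
r, alpha = 1, 2.0
A = [[0.1, -0.2]]              # (r x in), small random init in practice
B = [[0.0], [0.0]]             # (out x r), zero init => delta starts at 0

delta = lora_delta(A, B, alpha, r)
W_eff = [[w + d for w, d in zip(wr, dr)] for wr, dr in zip(W, delta)]
# With B zeroed the effective weight equals W, so fine-tuning starts
# exactly at the base model; only A and B are trained.
```

QLoRA applies the same update on top of a quantized base weight, which is why handling quantized layers matters for the adapter path.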
Ashraful Islam
7e85eb8b88 adds a check before proceeding with generation (#51) 2024-04-12 12:46:29 -07:00
David Koski
96b94b0df6 prepare for lora branch (#47)
- remove async llm generation -- this is just doubling our work
	- and does not match the style used in the example applications
- package generation parameters into a struct
- refactor command line arguments into distinct pieces based on their use
	- this will be reusable in the lora commands
2024-04-10 10:56:18 -07:00
Florent Morin
e48e2ce2c9 Add visionOS support to LLMEval (#43)
* Update `mlx-swift` to last revision

* Add Apple Vision Target

* Update visionOS UI
2024-03-31 20:48:46 -07:00
David Koski
0199407d93 LLMEval performance (#40)
* notes about performance and some performance improvements (don't update the display for every token)

* swift-format

* Update Applications/LLMEval/README.md

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>

* Update Applications/LLMEval/README.md

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>

---------

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
2024-03-28 12:00:52 -07:00
Awni Hannun
15b38cd146 Use fast (#38)
* update to latest mlx swift and use fast norms
* gpu usage -> memory usage
2024-03-27 16:37:35 -07:00
Ashraful Islam
c37018d7d2 feat: adds gpu usages stat in the toolbar (#36)
* feat: adds gpu usages stat in the toolbar
2024-03-25 10:29:54 -07:00
David Koski
452b49aef0 fixed height for the progress view, produce more tokens (#33) 2024-03-19 08:56:37 -07:00
Loc Bui
0588abec77 fix: Tokenizer dependency (#30) 2024-03-18 12:57:04 -07:00
Ashraful Islam
a7b2b54f18 LLMEval UI Improvements (#27)
* Feat: LLMEval UI Improvements

1. adds Markdown rendering in the UI
2. Adds init time and token/second stat
3. Minor UI enhancements

* feat: adds a copy to clipboard button for llm outputs

* adds ScrollViewReader to sync with main

* ran pre-format to resolve formatting issues

* updates the missing dependency in project definition

* feat: switch between plain text and markdown

adds a segmented picker to switch between plain text and markdown
2024-03-18 09:15:50 -07:00
David Koski
a1431e7155 scroll to bottom when text is generated (#24)
- also restore circleci
2024-03-14 13:18:28 -07:00
David Koski
61105bf0c4 use memory limit API (#13)
* add buffer cache limit

* swift-format

* a more reasonable size

* add memory stats to command line tool, update to final api

* add note about changing models
2024-03-05 15:22:12 -08:00
David Koski
fe116f857d swift-format 2024-03-01 23:26:25 -08:00
David Koski
33d4b6f57e make the generated output a little more interesting 2024-03-01 22:56:28 -08:00
David Koski
23fc53c43e allow selection of output 2024-03-01 22:44:33 -08:00
David Koski
0374e4b073 update documentation 2024-03-01 16:33:49 -08:00
David Koski
c49dd73c28 swift-format, circleci setup 2024-03-01 16:10:34 -08:00
David Koski
b41f14fba7 add LLM evaluator example
- runs on iOS and macOS
- downloads a model / tokenizer from hugging face
- evaluates the given prompt
2024-03-01 16:10:00 -08:00