* implement LoRA / QLoRA
- example of using MLX to fine-tune an LLM with low-rank adaptation (LoRA) for a target task (see the sketch after this list)
- see also https://arxiv.org/abs/2106.09685
- based on https://github.com/ml-explore/mlx-examples/tree/main/lora
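For reference, here is a minimal plain-Swift sketch of the low-rank update described in the paper above: the pretrained weight W stays frozen and the adapter contributes a scaled low-rank term, so the adapted output is h = W·x + (alpha/r)·B·A·x. This only illustrates the math; it does not use the MLX API, and the helper names are made up for the example.

```swift
import Foundation

typealias Matrix = [[Float]]
typealias Vector = [Float]

/// Multiply a matrix (rows x cols) by a vector of length cols.
func matVec(_ m: Matrix, _ v: Vector) -> Vector {
    m.map { row in zip(row, v).reduce(0) { $0 + $1.0 * $1.1 } }
}

/// LoRA forward pass: h = W·x + (alpha / r) · B·(A·x)
/// W is the frozen pretrained weight (d x k);
/// A (r x k) and B (d x r) are the small trainable adapter matrices.
func loraForward(W: Matrix, A: Matrix, B: Matrix, x: Vector, alpha: Float) -> Vector {
    let r = Float(A.count)                  // adapter rank
    let base = matVec(W, x)                 // frozen path
    let delta = matVec(B, matVec(A, x))     // low-rank path
    return zip(base, delta).map { $0 + (alpha / r) * $1 }
}
```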
* add some command-line flags I found useful in practice (a sketch of how they might be declared follows below)
- --quiet -- print only the generated text, without the decorator text
- --prompt @/tmp/file.txt -- load the prompt from a file
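A sketch of how these flags might be declared, assuming swift-argument-parser; the command name, help text, and file-loading logic below are illustrative, not the actual implementation:

```swift
import ArgumentParser
import Foundation

struct GenerateCommand: ParsableCommand {
    @Flag(help: "Print only the generated text, without the decorator text")
    var quiet = false

    @Option(help: "The prompt, or @<path> to load the prompt from a file")
    var prompt: String

    mutating func run() throws {
        // Support the `--prompt @/tmp/file.txt` convention: a leading '@'
        // means the rest of the value is a path to read the prompt from.
        let resolvedPrompt: String
        if prompt.hasPrefix("@") {
            let path = String(prompt.dropFirst())
            resolvedPrompt = try String(contentsOfFile: path, encoding: .utf8)
        } else {
            resolvedPrompt = prompt
        }

        if !quiet {
            print("Prompt: \(resolvedPrompt)")
        }
        // ... generation would happen here ...
    }
}

GenerateCommand.main()
```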
* the user can specify either a local path to the model or a Hugging Face model identifier (a resolution sketch follows below)
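One plausible way to resolve such a dual-purpose argument (a sketch, not the repository's actual code; the type and function names here are hypothetical): if the value exists on disk, treat it as a local model directory, otherwise treat it as a Hugging Face repo identifier.

```swift
import Foundation

/// Hypothetical description of where a model comes from.
enum ModelSource {
    case localDirectory(URL)
    case huggingFaceID(String)
}

/// If `model` points at a directory on disk, use it directly;
/// otherwise assume it is a Hugging Face identifier such as
/// "organization/model-name".
func resolveModelSource(_ model: String) -> ModelSource {
    var isDirectory: ObjCBool = false
    if FileManager.default.fileExists(atPath: model, isDirectory: &isDirectory),
       isDirectory.boolValue {
        return .localDirectory(URL(fileURLWithPath: model))
    }
    return .huggingFaceID(model)
}
```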
* update mlx-swift reference
Co-authored-by: Ashraful Islam <ashraful.meche@gmail.com>
Co-authored-by: JustinMeans <46542161+JustinMeans@users.noreply.github.com>
* switch swift-tokenizers to main, remove some workarounds
- swift-tokenizers is getting a lot of updates and fixes, so let's track main for now
- remove some workarounds that are no longer needed
- https://github.com/huggingface/swift-transformers/issues/63
- document the tokenizer used (https://github.com/huggingface/swift-transformers)
- provide a hook for tokenizer configuration and prompt augmentation (see the sketch below)
- this isn't as rich as the Python equivalents, but it helps a little
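A minimal sketch of what such a hook could look like, assuming a closure that rewrites the raw prompt before it reaches the tokenizer; the type and property names are illustrative, not the actual API:

```swift
/// Hypothetical configuration for preparing a prompt before tokenization.
struct PromptConfiguration {
    /// Called with the user's raw prompt; returns the text that is
    /// actually tokenized, e.g. wrapped in an instruct template.
    var augment: (String) -> String = { $0 }
}

// Example: wrap the prompt in a simple [INST] ... [/INST] instruct template.
var config = PromptConfiguration()
config.augment = { prompt in
    "[INST] \(prompt) [/INST]"
}

let userPrompt = "Write a haiku about the ocean."
let augmented = config.augment(userPrompt)
// `augmented` would then be handed to the tokenizer from swift-transformers.
```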