Commit Graph

5 Commits

Author SHA1 Message Date
Awni Hannun
15b38cd146 Use fast (#38)
* update to latest mlx swift and use fast norms
* gpu usage -> memory usage
2024-03-27 16:37:35 -07:00
David Koski
ac273a14ea fix float types in Phi (use float16) (#25)
- per suggestions in #23 ensure that the values that go into the cache are float16
2024-03-14 13:18:40 -07:00
David Koski
0fb74cbfdc adopt MLXFast.scaledDotProductAttention (#23) 2024-03-12 14:04:43 -07:00
David Koski
bb7bacc077 fix for #2 -- CodeLlama crashes
- add replacement tokenizer class for unknown tokenizers
- fix quantization for models that don't have lm_head quantized

Requires https://github.com/ml-explore/mlx-swift/pull/28
2024-02-26 10:38:05 -08:00
David Koski
b6d1e14465 initial commit 2024-02-22 10:41:02 -08:00