LLMEval performance (#40)

* notes about performance and some performance improvements (don't update the display for every token)

* swift-format

* Update Applications/LLMEval/README.md

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>

* Update Applications/LLMEval/README.md

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>

---------

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
David Koski
2024-03-28 12:00:52 -07:00
committed by GitHub
parent 15b38cd146
commit 0199407d93
2 changed files with 28 additions and 3 deletions

@@ -50,3 +50,9 @@ Building in Release / optimizations will remove a lot of tail calls in the C++
layer. These lead to the stack overflows.
See discussion here: https://github.com/ml-explore/mlx-swift-examples/issues/3
### Performance
Different models have different performance characteristics. For example, Gemma 2B may outperform Phi-2 in terms of tokens / second.
You may also find that running outside the debugger boosts performance. You can do this in Xcode by pressing cmd-opt-r and unchecking "Debug Executable".
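
The commit message also mentions not updating the display for every token. The following is a minimal sketch of that idea, not the code from this commit: `DisplayThrottle`, its `interval` parameter, and the sample tokens are hypothetical names chosen for illustration. It buffers generated tokens and only hands back text for display every `interval` tokens, plus once more at the end so trailing tokens are not lost.

```swift
/// Hypothetical helper (an assumption, not part of LLMEval): buffers
/// generated tokens and signals when the display should be refreshed.
struct DisplayThrottle {
    let interval: Int
    private(set) var tokens: [String] = []

    init(interval: Int) {
        precondition(interval > 0, "interval must be positive")
        self.interval = interval
    }

    /// Records a token. Returns the full text so far when it is time to
    /// refresh the display (every `interval` tokens), otherwise nil.
    mutating func append(_ token: String) -> String? {
        tokens.append(token)
        return tokens.count % interval == 0 ? tokens.joined() : nil
    }

    /// Full text to show once generation finishes.
    func finish() -> String {
        tokens.joined()
    }
}

// Usage: 10 tokens with an interval of 4 refresh the display at tokens 4 and 8.
var throttle = DisplayThrottle(interval: 4)
var refreshes = 0
for token in ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"] {
    if throttle.append(token) != nil {
        // In an app, this is where the returned text would be assigned to UI state.
        refreshes += 1
    }
}
// A final update picks up the tokens generated after the last refresh.
let finalText = throttle.finish()
```

A larger `interval` means fewer display updates at the cost of choppier output; the right value is a judgment call that depends on the model's token rate.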