LLMEval performance (#40)

* notes about performance and some performance improvements (don't update the display for every token)
* swift-format
* Update Applications/LLMEval/README.md

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>

* Update Applications/LLMEval/README.md

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>

---------

Co-authored-by: Awni Hannun <awni.hannun@gmail.com>
2024-03-28 12:00:52 -07:00
parent 15b38cd146
commit 0199407d93
2 changed files with 28 additions and 3 deletions
--- a/Applications/LLMEval/README.md
+++ b/Applications/LLMEval/README.md
@@ -50,3 +50,9 @@ Building in Release / optimizations will remove a lot of tail calls in the C++
 layer.  These lead to the stack overflows.

 See discussion here: https://github.com/ml-explore/mlx-swift-examples/issues/3
+
+### Performance
+
+Different models have different performance characteristics. For example, Gemma 2B may outperform Phi-2 in terms of tokens / second.
+
+You may also find that running outside the debugger boosts performance.  You can do this in Xcode by pressing cmd-opt-r and unchecking "Debug Executable".