Commit Graph

7 Commits

Author SHA1 Message Date
David Koski
9d74afd119 handle partially quantized models (#76)
* handle partially quantized models

- fix for #53 #71 #69 #74
- in order to test the models
	- I added a default prompt of an appropriate form
	- while working on the model configuration also added additional stop tokens (#74)
- fixed the repetitionPenalty code (#71)
2024-05-28 16:35:11 -07:00
Awni Hannun
b951b78eb2 phi3 (#54)
* phi3

Co-authored-by: David Koski <dkoski@apple.com>
2024-04-24 09:31:01 -07:00
David Koski
96b94b0df6 prepare for lora branch (#47)
- remove async llm generation -- this is just doubling our work
	- and does not match the style used in the example applications
- package generation parameters into a struct
- refactor command line arguments into distinct pieces based on their use
	- this will be reusable in the lora commands
2024-04-10 10:56:18 -07:00
Anchen
c27208812d chore: add repetition_penalty example (#45) 2024-04-04 15:15:50 -07:00
Anchen
2d0fdfe3a9 chore(llm-tool): add the top_p option in the llm-tool (#41)
* chore: add top p option in llm-tool
* chore: wire up the top p with async generate
2024-04-03 07:54:54 -07:00
Anchen
3314e20a24 chore: add top_p sampling example (#34) 2024-03-26 12:44:13 -07:00
David Koski
c86d1c195e partial fix for #1
- handle loading models with different names for the safetensors files (gemma)
- handle merge tokens that can't be split
- organize code into Load/Evaluate
2024-02-26 13:23:21 -08:00