handle partially quantized models (#76)

* handle partially quantized models

- fixes for #53, #69, #71, #74
- in order to test the models:
	- added a default prompt of an appropriate form
	- while working on the model configuration, also added additional stop tokens (#74)
- fixed the repetitionPenalty code (#71); a sketch of both generation changes follows this list
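The two generation-side changes can be illustrated with a small, self-contained sketch. This is not the code from the commit: the function names and signatures below are illustrative assumptions, and the penalty rule shown is the common CTRL-style formulation.

```swift
import Foundation

/// A token ends generation if it is the tokenizer's EOS token or one of the
/// extra stop tokens supplied by the model configuration (#74).
/// (Hypothetical helper; names are illustrative.)
func isStopToken(_ token: Int, eosTokenId: Int, additionalEOSTokenIds: Set<Int>) -> Bool {
    token == eosTokenId || additionalEOSTokenIds.contains(token)
}

/// Standard repetition penalty (#71): tokens that already appeared in the
/// context are made less likely by dividing positive logits by the penalty
/// and multiplying negative logits by it.
/// (Hypothetical helper; not the exact fix in this commit.)
func applyRepetitionPenalty(to logits: inout [Float], previousTokens: [Int], penalty: Float) {
    guard penalty != 1 else { return }  // a penalty of 1 leaves the logits unchanged
    for token in Set(previousTokens) where logits.indices.contains(token) {
        logits[token] = logits[token] > 0 ? logits[token] / penalty : logits[token] * penalty
    }
}
```

In a generation loop, the penalty would be applied to the model's logits before sampling, and the stop check would terminate the loop once a matching token is produced.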
David Koski
2024-05-28 16:35:11 -07:00
committed by GitHub
parent 65f4968e5f
commit 9d74afd119
12 changed files with 139 additions and 67 deletions


@@ -377,7 +377,7 @@ public enum LoRATrain {
 /// - training with ``train(model:train:validate:optimizer:loss:tokenizer:parameters:progress:)``
 /// - loss evaluation with ``evaluate(model:dataset:loss:tokenizer:batchSize:batchCount:)``
 /// - fusing with ``fuse(model:layers:deQuantize:)``
-/// - text generation with ``generate(promptTokens:parameters:model:tokenizer:didGenerate:)``
+/// - text generation with ``generate(promptTokens:parameters:model:tokenizer:additionalEOSTokens:didGenerate:)``
 /// - note that this is just using normal model text generation
 ///
 /// - Parameters: