handle partially quantized models (#76)
* handle partially quantized models; fixes #53 #71 #69 #74
* to make the models easy to test, added a default prompt of an appropriate form to each model configuration (see the sketch below)
* while working on the model configuration, also added additional stop tokens (#74)
* fixed the repetitionPenalty code (#71); a sketch of the standard rule follows the diff
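For context, here is a minimal sketch of what a per-model configuration might look like after this change. Only the `extraEOSTokens` property is confirmed by the diff below; the `ModelConfiguration` shape, the `defaultPrompt` name, and the example values are assumptions for illustration, not the repository's exact types.

```swift
// A sketch of a per-model configuration, assuming the shape suggested by
// the commit message. Only `extraEOSTokens` appears in the diff below;
// `defaultPrompt` and the rest of this struct are hypothetical.
struct ModelConfiguration {
    /// Identifier of the model, e.g. a Hugging Face repo id.
    let id: String

    /// A prompt of an appropriate form for this model, used when testing
    /// without a user-supplied prompt (hypothetical name).
    let defaultPrompt: String

    /// Additional stop tokens beyond the tokenizer's EOS token, e.g.
    /// the end-of-turn marker of a chat-tuned model (#74).
    let extraEOSTokens: [String]
}

// Example registration for a partially quantized chat model
// (all values illustrative).
let example = ModelConfiguration(
    id: "mlx-community/example-model-4bit",
    defaultPrompt: "why is the sky blue?",
    extraEOSTokens: ["<|im_end|>"]
)
```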
```diff
@@ -266,6 +266,7 @@ class LoRAEvaluator {
         let result = await LLM.generate(
             promptTokens: promptTokens, parameters: generateParameters, model: model,
             tokenizer: tokenizer,
+            extraEOSTokens: modelConfiguration.extraEOSTokens,
             didGenerate: { tokens in
                 if tokens.count % evaluateShowEvery == 0 {
                     let fullOutput = tokenizer.decode(tokens: tokens)
```
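The commit message also mentions fixing the repetitionPenalty code (#71), a change not shown in the hunk above. Below is a sketch of the conventional repetition-penalty rule (as in the CTRL paper): positive logits are divided by the penalty and negative logits multiplied by it, so repeated tokens become less likely. The function name, the plain `[Float]` representation, and the example values are assumptions; the repository operates on its own tensor types.

```swift
// Conventional repetition penalty: scale the logit of every token that
// has already been generated so it is less likely to be sampled again.
// A penalty of 1.0 is a no-op. Operates on [Float] for illustration only.
func applyRepetitionPenalty(
    logits: inout [Float], previousTokens: [Int], penalty: Float
) {
    guard penalty != 1.0 else { return }
    // Penalize each distinct previously seen token exactly once.
    for token in Set(previousTokens) where logits.indices.contains(token) {
        let value = logits[token]
        logits[token] = value > 0 ? value / penalty : value * penalty
    }
}

// Usage (illustrative values): penalize already-generated tokens 0 and 3
// before sampling the next token.
var logits: [Float] = [2.0, -1.5, 0.5, 3.0]
applyRepetitionPenalty(logits: &logits, previousTokens: [0, 3], penalty: 1.3)
// logits is now roughly [1.54, -1.5, 0.5, 2.31]
```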