* Update Mistral 7B config
* Add Mistral NeMo
* Update for Llama 3.1
* Align LlamaConfiguration with Python implementation
* Fix model configuration names
* Refine DynamicNTKScalingRoPE
* compute base only once

---------

Co-authored-by: Awni Hannun <awni@apple.com>
8.4 KiB