* add buffer cache limit
* swift-format
* use a more reasonable size for the buffer cache limit
* add memory stats to command line tool, update to final API
* add note about changing models
- document the tokenizer used (https://github.com/huggingface/swift-transformers)
- provide a hook for tokenizer configuration and prompt augmentation
- this isn't as rich as the Python equivalents, but it helps a little