Commit
gemma : use more bits for the token_embd.weight tensor (ggerganov#5650)
* gemma : use Q8_0 for the token_embd.weight tensor
* llama : quantize token_embd.weight using output type

(cherry picked from commit 96633ee)
Signed-off-by: Jared Van Bortel <[email protected]>