
Quantized models crash/fail to load on iOS #191

Open
MalikHarrisAhm opened this issue Sep 24, 2024 · 0 comments

I've run the quantization described in the README on ggml_weights.bin for q4_0 and q8_0. The original ggml_weights.bin works in Xcode and when I run it on my device; however, if I switch to any of the quantized versions, the app crashes on launch. I've added print statements to the bark_model_load function in the package, and this is the output:

Entering bark_model_load function
Model hyperparameters:
n_layer: 12
n_head: 12
n_embd: 768
block_size: 1024
bias: 0
n_in_vocab: 129600
n_out_vocab: 10048
n_lm_heads: 1
n_wtes: 1
ftype: 2007
qntvr: 2
Weight type: 8
Estimated buffer size: 6547975168 bytes
Estimated number of tensors: 78
Creating ggml context with size: 27456
ggml context created successfully
Initializing backend
Using CPU backend
CPU backend initialized successfully
Allocating weights buffer of size 6547975168
Weights buffer allocated successfully
Memory prepared for weights
Preparing key + value memory
n_mem: 12288, n_elements: 9437184
Allocating KV cache for text and coarse encoder
Memory size for KV cache: 72.00 MB
KV cache allocated successfully
Key + value memory prepared
Loading 76 tensors
Loading tensor 'model/wte/0'
Dimensions: 2, Elements: 99532800, Type: 8
Tensor shape: [768, 129600]
Tensor elements: 99532800
Tensor size: 105753600 bytes
Allocator address: 0x0
Reading 105753600 bytes from file
Tensor 'model/wte/0' has null data pointer
bark_load_model_from_file: invalid model file '/private/var/containers/Bundle/Application/6A2F4C4C-D922-4F7B-8B50-5EFBB3017A61/Throwaway.app/ggml_weights_q4_0.bin' (bad text)
bark_load_model: failed to load model weights from '/private/var/containers/Bundle/Application/6A2F4C4C-D922-4F7B-8B50-5EFBB3017A61/Throwaway.app/ggml_weights_q4_0.bin'
Couldn't load model at /private/var/containers/Bundle/Application/6A2F4C4C-D922-4F7B-8B50-5EFBB3017A61/Throwaway.app/ggml_weights_q4_0.bin
The operation couldn’t be completed. (Throwaway.BarkError error 0.)
Can't find or decode reasons
Failed to get or decode unavailable reasons
Can't find or decode disallowed use cases
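For reference, here is a rough sketch of the kind of loading loop my print statements trace (helper names and the struct layout are my approximation in the style of ggml loaders, not the package's actual code). The failure point is the null data pointer check: the tensor is found in the model's map, but no memory from the weights buffer has been bound to it, so there is nowhere to read the file bytes into.

```cpp
// Simplified sketch of a ggml-style tensor loading loop (illustrative only).
#include <cstdio>
#include <cstdint>
#include <map>
#include <string>

#include "ggml.h"

// Hypothetical model state for the sketch: tensors created in a ggml context,
// keyed by the names stored in the weights file.
struct gpt_model_sketch {
    std::map<std::string, struct ggml_tensor *> tensors;
};

static bool load_tensors_sketch(FILE * fin, gpt_model_sketch & model) {
    while (true) {
        int32_t n_dims = 0, name_len = 0, ttype = 0;
        if (fread(&n_dims, sizeof(n_dims), 1, fin) != 1) break; // EOF: done
        fread(&name_len, sizeof(name_len), 1, fin);
        fread(&ttype,    sizeof(ttype),    1, fin);

        int64_t ne[4] = {1, 1, 1, 1};
        for (int i = 0; i < n_dims; ++i) fread(&ne[i], sizeof(int64_t), 1, fin);

        std::string name(name_len, '\0');
        fread(&name[0], 1, name_len, fin);

        auto it = model.tensors.find(name);
        if (it == model.tensors.end()) {
            fprintf(stderr, "unknown tensor '%s'\n", name.c_str());
            return false;
        }
        struct ggml_tensor * tensor = it->second;

        const size_t nbytes = ggml_nbytes(tensor);
        printf("Loading tensor '%s', shape [%lld, %lld], %zu bytes, data=%p\n",
               name.c_str(), (long long) ne[0], (long long) ne[1],
               nbytes, tensor->data);

        // This is the condition my log hits: data == nullptr means the
        // weights buffer was never attached to this tensor, so reading the
        // file contents into it is impossible and the loader reports an
        // invalid model file.
        if (tensor->data == nullptr) {
            fprintf(stderr, "tensor '%s' has null data pointer\n", name.c_str());
            return false;
        }
        if (fread(tensor->data, 1, nbytes, fin) != nbytes) {
            fprintf(stderr, "short read for tensor '%s'\n", name.c_str());
            return false;
        }
    }
    return true;
}
```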

Additional notes:

  • The quantized models generate .wav files without issue when run from the terminal, so the files themselves are not corrupted.
@MalikHarrisAhm MalikHarrisAhm changed the title Quantized models crash on iOS Quantized models crash/fail to load on iOS Sep 24, 2024