kompute : disable GPU offload for Mixtral
We haven't implemented the necessary GPU kernels yet.

Fixes this crash:

ggml_vk_graph_compute: error: unsupported op 'ARGSORT'
GGML_ASSERT: /home/jared/src/forks/gpt4all/gpt4all-backend/llama.cpp-mainline/ggml-kompute.cpp:1508: !"unsupported op"

Signed-off-by: Jared Van Bortel <[email protected]>
cebtenzzre committed Feb 5, 2024
1 parent 06ba998 commit 315102f
Showing 1 changed file with 1 addition and 0 deletions: llama.cpp
@@ -4138,6 +4138,7 @@ static int llama_model_load(const std::string & fname, llama_model & model, llam
 #ifdef GGML_USE_KOMPUTE
     if (params.n_gpu_layers > 0 && (
         !(model.arch == LLM_ARCH_LLAMA || model.arch == LLM_ARCH_FALCON)
+        || model.hparams.n_expert > 0
         || !(
             model.ftype == LLAMA_FTYPE_ALL_F32 ||
             model.ftype == LLAMA_FTYPE_MOSTLY_F16 ||
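The added condition can be sketched as a standalone predicate. This is a simplified illustration of the guard's logic, not the actual code path in llama.cpp: the names `kompute_unsupported` and `model_info`, and the reduced enum, are hypothetical, and the real check also inspects `model.ftype` as shown in the diff above.

```cpp
// Sketch of the Kompute offload guard (illustrative names, assumptions noted above).
enum llm_arch { LLM_ARCH_LLAMA, LLM_ARCH_FALCON, LLM_ARCH_OTHER };

struct model_info {
    llm_arch arch;
    int      n_expert;  // > 0 for MoE models such as Mixtral
};

// Returns true when GPU offload must be disabled for the Kompute backend.
// MoE models route tokens to experts via an ARGSORT op, which the Kompute
// backend has no kernel for yet, so n_expert > 0 forces the CPU path.
bool kompute_unsupported(const model_info & m, int n_gpu_layers) {
    return n_gpu_layers > 0 && (
        !(m.arch == LLM_ARCH_LLAMA || m.arch == LLM_ARCH_FALCON)
        || m.n_expert > 0
    );
}
```

With this shape of check, a Mixtral-style model (Llama architecture, 8 experts) is rejected even though its architecture is otherwise supported, while a dense Llama model still offloads.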
