Error from load_quant
#25
Labels: bug, good first issue, help wanted
I am using an AWS p3.8xlarge instance. When I try to run your code, I get the following error:
Loading model Models/vicuna-7B-1.1-GPTQ-4bit-128g checkpoint Models/vicuna-7B-1.1-GPTQ-4bit-128g/vicuna-7B-1.1-GPTQ-4bit-128g.safetensors
Loading model ...
Found 3 unique KN Linear values.
Warming up autotune cache ...
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████
Found 1 unique fused mlp KN values.
Warming up autotune cache ...
0%| python3: project/lib/Analysis/Allocation.cpp:42: std::pair<llvm::SmallVector, llvm::SmallVector > mlir::triton::getCvtOrder(const mlir::Attribute&, const mlir::
Aborted
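For anyone trying to reproduce this, the abort happens inside Triton while the autotune cache is warming up, so the installed package versions and the GPU's compute capability are probably worth attaching to the report. The snippet below is only a sketch for gathering that information: the helper name `collect_env_info` is made up for illustration and is not part of this repository. It assumes nothing beyond a Python environment, and reports `None` for any package that is not installed.

```python
import platform
from importlib import metadata


def collect_env_info():
    """Gather Python, torch, and triton versions plus visible CUDA devices.

    Hypothetical helper for bug reports; not part of this project.
    """
    info = {
        "python": platform.python_version(),
        "torch": None,
        "triton": None,
        "gpus": [],
    }

    # Read installed versions from package metadata without importing the packages.
    for name in ("torch", "triton"):
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # package not installed; leave the entry as None

    # Query CUDA devices only if torch can actually be imported.
    try:
        import torch
    except ImportError:
        return info

    if torch.cuda.is_available():
        for index in range(torch.cuda.device_count()):
            major, minor = torch.cuda.get_device_capability(index)
            info["gpus"].append(
                {
                    "index": index,
                    "name": torch.cuda.get_device_name(index),
                    "compute_capability": f"{major}.{minor}",
                }
            )
    return info


if __name__ == "__main__":
    for key, value in collect_env_info().items():
        print(f"{key}: {value}")
```

Running it on the affected instance and pasting the output alongside the log above should make it easier to tell whether the failure is tied to a particular Triton version or GPU generation.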