issue with IPEX v2.3.110 #718

byclear · 2024-10-04T17:05:02Z

Describe the bug

C:\Users\clear\miniconda3\envs\310\lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension: 'Could not find module 'C:\Users\clear\miniconda3\envs\310\Lib\site-packages\torchvision\image.pyd' (or one of its dependencies). Try using the full path with constructor syntax.'If you don't plan on using image functionality from torchvision.io, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have libjpeg or libpng installed before building torchvision from source?
warn(
Collecting environment information...
Traceback (most recent call last):
File "C:\pythonxm\测试ARC显卡\collect_env.py", line 517, in <module>
main()
File "C:\pythonxm\测试ARC显卡\collect_env.py", line 512, in main
output = get_pretty_env_info()
File "C:\pythonxm\测试ARC显卡\collect_env.py", line 507, in get_pretty_env_info
return pretty_str(get_env_info())
File "C:\pythonxm\测试ARC显卡\collect_env.py", line 368, in get_env_info
xpu_available_str = str(torch.xpu.is_available())
File "C:\Users\clear\miniconda3\envs\310\lib\site-packages\torch\xpu\__init__.py", line 63, in is_available
return device_count() > 0
File "C:\Users\clear\miniconda3\envs\310\lib\site-packages\torch\xpu\__init__.py", line 57, in device_count
return torch._C._xpu_getDeviceCount()
RuntimeError: Can't add devices across platforms to a single context. -33 (PI_ERROR_INVALID_DEVICE)

conda install libuv
python -m pip install torch==2.3.1+cxx11.abi torchvision==0.18.1+cxx11.abi torchaudio==2.3.1+cxx11.abi intel-extension-for-pytorch==2.3.110+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/lnl/cn/

Versions

C:\Users\clear\miniconda3\envs\310\lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension: 'Could not find module 'C:\Users\clear\miniconda3\envs\310\Lib\site-packages\torchvision\image.pyd' (or one of its dependencies). Try using the full path with constructor syntax.'If you don't plan on using image functionality from torchvision.io, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have libjpeg or libpng installed before building torchvision from source?
warn(
Collecting environment information...
Traceback (most recent call last):
File "C:\pythonxm\测试ARC显卡\collect_env.py", line 517, in <module>
main()
File "C:\pythonxm\测试ARC显卡\collect_env.py", line 512, in main
output = get_pretty_env_info()
File "C:\pythonxm\测试ARC显卡\collect_env.py", line 507, in get_pretty_env_info
return pretty_str(get_env_info())
File "C:\pythonxm\测试ARC显卡\collect_env.py", line 368, in get_env_info
xpu_available_str = str(torch.xpu.is_available())
File "C:\Users\clear\miniconda3\envs\310\lib\site-packages\torch\xpu\__init__.py", line 63, in is_available
return device_count() > 0
File "C:\Users\clear\miniconda3\envs\310\lib\site-packages\torch\xpu\__init__.py", line 57, in device_count
return torch._C._xpu_getDeviceCount()
RuntimeError: Can't add devices across platforms to a single context. -33 (PI_ERROR_INVALID_DEVICE)


ZailiWang · 2024-10-05T02:10:59Z

Hi, let me check and get back to you. Thanks.

byclear · 2024-10-05T08:11:48Z

OK. Please take a look. Thanks.

ZailiWang · 2024-10-09T06:41:43Z

Hi, can you confirm whether you are using an integrated Arc GPU in a Meteor Lake or Lunar Lake processor, or a discrete Arc device? Please be aware that the pip package URLs in the installation guide differ between these devices.

byclear · 2024-10-09T07:13:22Z

I am certain that the device I am using is the Intel Arc A770M. 2.1.40 worked without any problem, but 2.3.110 gives an error. The machine is a NUC12 (Viper Canyon).

byclear · 2024-10-09T07:14:41Z

Motherboard: Intel Corporation
Model: NUC12SNKi72
Version: M45201-502
Operating system: Microsoft Windows 11 Enterprise LTSC (64-bit)
Version: 24H2 (10.0.26100)
Version (build number)

Devices and drivers
Processor: 12th Gen Intel® Core™ i7-12700H

Graphics
Intel® Iris® Xe Graphics
Intel® Arc™ A770M Graphics

ZailiWang · 2024-10-09T07:34:03Z

The --extra-index-url argument differs between discrete and integrated Arc graphics. In your 2.3.110 installation step you used the URL for Lunar Lake, which was not available when 2.1.40 was released, so it's a bit confusing.

A NUC12 should have a discrete A770M card in it, so please try changing the URL to

--extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/
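
The choice described here can be sketched as a small lookup. This is only an illustration, not part of the IPEX tooling: the two index URLs and the package pins are the ones quoted in this thread, while the keys `discrete_arc` and `lunar_lake` and the helper `pip_install_args` are hypothetical names made up for the sketch.

```python
# Index URLs quoted in this thread. The dictionary keys are labels
# invented for this sketch, not official device identifiers.
INDEX_URLS = {
    "discrete_arc": "https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/",
    "lunar_lake": "https://pytorch-extension.intel.com/release-whl/stable/lnl/cn/",
}

# Package pins from the 2.3.110 install command in the original post.
PACKAGES = [
    "torch==2.3.1+cxx11.abi",
    "torchvision==0.18.1+cxx11.abi",
    "torchaudio==2.3.1+cxx11.abi",
    "intel-extension-for-pytorch==2.3.110+xpu",
]


def pip_install_args(device_kind: str) -> list[str]:
    """Return the arguments to pass to `python` for the given device kind."""
    if device_kind not in INDEX_URLS:
        raise ValueError(f"unknown device kind: {device_kind!r}")
    return ["-m", "pip", "install", *PACKAGES,
            "--extra-index-url", INDEX_URLS[device_kind]]


if __name__ == "__main__":
    # Print the command for a discrete Arc card such as the A770M.
    print(" ".join(["python", *pip_install_args("discrete_arc")]))
```

Keeping the URL in one table makes it harder to paste the Lunar Lake index into an install meant for a discrete card.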

byclear · 2024-10-09T07:43:02Z

Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\clear\miniconda3\envs\torch\lib\site-packages\torch\__init__.py", line 143, in <module>
raise err
OSError: [WinError 126] The specified module could not be found. Error loading "C:\Users\clear\miniconda3\envs\torch\lib\site-packages\torch\lib\backend_with_compiler.dll" or one of its dependencies.

Do I have to go through the 2.1.40 routine again and install oneAPI? I would still prefer to use the prebuilt packages and see whether they are more stable and faster... oneAPI is not even maintained any more.

ZailiWang · 2024-10-09T07:55:31Z

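Ah, no. Let me switch to Chinese as well. Just keep following the 2.3.110 installation guide. The only change is that when you run python -m pip install to install IPEX, replace the final --extra-index-url argument with the one I gave above (not the one ending in /lnl/cn/ from your original post).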

byclear · 2024-10-09T07:57:31Z

python -m pip install torch==2.3.1+cxx11.abi torchvision==0.18.1+cxx11.abi torchaudio==2.3.1+cxx11.abi intel-extension-for-pytorch==2.3.110+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/

Could you post the complete command, please? The one above is what I ran.

byclear · 2024-10-09T08:00:10Z

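After installing, I get the missing-DLL error. I had reinstalled my operating system. On the previous system I had oneAPI installed, and it did work. Now it keeps failing and seems unable to find the GPU. Installing according to the documentation, I hit RuntimeError: Can't add devices across platforms to a single context. -33 (PI_ERROR_INVALID_DEVICE). Installing with the URL you gave, I hit the missing-DLL error OSError: [WinError 126] The specified module could not be found. Error loading "C:\Users\clear\miniconda3\envs\torch\lib\site-packages\torch\lib\backend_with_compiler.dll" or one of its dependencies.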

byclear · 2024-10-09T08:02:14Z

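It seems the only way to solve this is to install oneAPI and add the oneAPI DLLs to the environment variables. But I would like to use the prebuilt packages, if there are any, to see whether they are faster.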

ZailiWang · 2024-10-09T08:05:18Z

Do you mean that, after reinstalling the OS, installing 2.3.110 without separately installing oneAPI still fails, even after changing the --extra-index-url?

jingxu10 · 2024-10-09T08:06:16Z

Did you run conda install libuv?

byclear · 2024-10-09T08:21:26Z

Oh, sorry, I indeed had not run conda install libuv. After running it, the problem is still the same.
C:\Users\clear\miniconda3\envs\pytorch\lib\site-packages\intel_extension_for_pytorch\llm\__init__.py:9: UserWarning: failed to use huggingface generation fuctions due to: No module named 'transformers'.
warnings.warn(f"failed to use huggingface generation fuctions due to: {e}.")
2.3.1+cxx11.abi
2.3.110+xpu
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\clear\miniconda3\envs\pytorch\lib\site-packages\torch\xpu\__init__.py", line 57, in device_count
return torch._C._xpu_getDeviceCount()
RuntimeError: Can't add devices across platforms to a single context. -33 (PI_ERROR_INVALID_DEVICE)
My commands:
conda install libuv
python -m pip install torch==2.3.1+cxx11.abi torchvision==0.18.1+cxx11.abi torchaudio==2.3.1+cxx11.abi intel-extension-for-pytorch==2.3.110+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/

byclear · 2024-10-09T08:57:02Z

Collecting environment information...
PyTorch version: 2.1.0.post3+cxx11.abi
PyTorch CXX11 ABI: No
IPEX version: 2.1.40+xpu
IPEX commit: 80ed476
Build type: Release

OS: N/A
GCC version: N/A
Clang version: N/A
IGC version: N/A
CMake version: N/A
Libc version: N/A

Python version: 3.10.15 | packaged by Anaconda, Inc. | (main, Oct 3 2024, 07:22:19) [MSC v.1929 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.26100-SP0
Is XPU available: True
DPCPP runtime version: N/A
MKL version: N/A
GPU models and configuration:
[0] _DeviceProperties(name='Intel(R) Iris(R) Xe Graphics', platform_name='Intel(R) Level-Zero', dev_type='gpu', driver_version='1.3.30714', has_fp64=0, total_memory=30149MB, max_compute_units=96, gpu_eu_count=96)
[1] _DeviceProperties(name='Intel(R) Arc(TM) A770M Graphics', platform_name='Intel(R) Level-Zero', dev_type='gpu', driver_version='1.3.30714', has_fp64=0, total_memory=15930MB, max_compute_units=512, gpu_eu_count=512)
Intel OpenCL ICD version: N/A
Level Zero version: N/A

CPU:
'wmic' is not recognized as an internal or external command,
operable program or batch file.

Versions of relevant libraries:
[pip3] intel_extension_for_pytorch==2.1.40+xpu
[pip3] numpy==1.26.4
[pip3] torch==2.1.0.post3+cxx11.abi
[pip3] torchaudio==2.1.0.post3+cxx11.abi
[pip3] torchvision==0.16.0.post3+cxx11.abi
[conda] intel-extension-for-pytorch 2.1.40+xpu pypi_0 pypi
[conda] mkl 2024.2.1 pypi_0 pypi
[conda] mkl-dpcpp 2024.2.1 pypi_0 pypi
[conda] numpy 1.26.4 pypi_0 pypi
[conda] onemkl-sycl-blas 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-datafitting 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-dft 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-lapack 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-rng 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-sparse 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-stats 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-vm 2024.2.1 pypi_0 pypi
[conda] torch 2.1.0.post3+cxx11.abi pypi_0 pypi
[conda] torchaudio 2.1.0.post3+cxx11.abi pypi_0 pypi
[conda] torchvision 0.16.0.post3+cxx11.abi pypi_0 pypi
When I downgrade to 2.1.40, the XPU is detected.

ZailiWang · 2024-10-09T08:59:38Z

OK, we will take another look at this issue.

ZailiWang · 2024-10-10T07:30:48Z

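Hi @byclear, just to double-check: in the driver installation step, did you install the one in the red box?

Also, when you tried rolling back to 2.1.40 yesterday, you did not reinstall the driver while switching IPEX versions, right?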

byclear · 2024-10-10T07:41:39Z

Yes, I have not reinstalled the drivers. I'm using 32.0.101.6079 WHQL. I'll test it by installing the specified version 32.0.101.5768 now. Reply later with the results. Thanks for the help. @ZailiWang

byclear · 2024-10-10T08:31:53Z

I uninstalled 32.0.101.6079 and rebooted, then installed 32.0.101.5768 and rebooted again. Then I ran the following commands:
conda install libuv
python -m pip install torch==2.3.1+cxx11.abi torchvision==0.18.1+cxx11.abi torchaudio==2.3.1+cxx11.abi intel-extension-for-pytorch==2.3.110+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/

The error is still:
Collecting environment information...
Traceback (most recent call last):
File "C:\pythonxm\测试ARC显卡\collect_env.py", line 516, in <module>
main()
File "C:\pythonxm\测试ARC显卡\collect_env.py", line 511, in main
output = get_pretty_env_info()
File "C:\pythonxm\测试ARC显卡\collect_env.py", line 506, in get_pretty_env_info
return pretty_str(get_env_info())
File "C:\pythonxm\测试ARC显卡\collect_env.py", line 367, in get_env_info
xpu_available_str = str(torch.xpu.is_available())
File "C:\Users\clear\miniconda3\envs\pytorch\lib\site-packages\torch\xpu\__init__.py", line 63, in is_available
return device_count() > 0
File "C:\Users\clear\miniconda3\envs\pytorch\lib\site-packages\torch\xpu\__init__.py", line 57, in device_count
return torch._C._xpu_getDeviceCount()
RuntimeError: Can't add devices across platforms to a single context. -33 (PI_ERROR_INVALID_DEVICE)

@ZailiWang

ZailiWang · 2024-10-10T08:33:21Z

OK. Could you also confirm whether rolling back to 2.1.40 works correctly now?

byclear · 2024-10-10T08:35:21Z

Collecting environment information...
PyTorch version: 2.1.0.post3+cxx11.abi
PyTorch CXX11 ABI: No
IPEX version: 2.1.40+xpu
IPEX commit: 80ed476
Build type: Release

OS: N/A
GCC version: N/A
Clang version: N/A
IGC version: N/A
CMake version: N/A
Libc version: N/A

Python version: 3.10.15 | packaged by Anaconda, Inc. | (main, Oct 3 2024, 07:22:19) [MSC v.1929 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.26100-SP0
Is XPU available: True
DPCPP runtime version: N/A
MKL version: N/A
GPU models and configuration:
[0] _DeviceProperties(name='Intel(R) Iris(R) Xe Graphics', platform_name='Intel(R) Level-Zero', dev_type='gpu', driver_version='1.3.29803', has_fp64=0, total_memory=30149MB, max_compute_units=96, gpu_eu_count=96)
[1] _DeviceProperties(name='Intel(R) Arc(TM) A770M Graphics', platform_name='Intel(R) Level-Zero', dev_type='gpu', driver_version='1.3.29803', has_fp64=0, total_memory=15930MB, max_compute_units=512, gpu_eu_count=512)
Intel OpenCL ICD version: N/A
Level Zero version: N/A

CPU:
'wmic' is not recognized as an internal or external command,
operable program or batch file.

Versions of relevant libraries:
[pip3] intel_extension_for_pytorch==2.1.40+xpu
[pip3] numpy==2.1.2
[pip3] torch==2.1.0.post3+cxx11.abi
[pip3] torchaudio==2.1.0.post3+cxx11.abi
[pip3] torchvision==0.16.0.post3+cxx11.abi
[conda] intel-extension-for-pytorch 2.1.40+xpu pypi_0 pypi
[conda] mkl 2024.2.1 pypi_0 pypi
[conda] mkl-dpcpp 2024.2.1 pypi_0 pypi
[conda] numpy 2.1.2 pypi_0 pypi
[conda] onemkl-sycl-blas 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-datafitting 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-dft 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-lapack 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-rng 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-sparse 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-stats 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-vm 2024.2.1 pypi_0 pypi
[conda] torch 2.1.0.post3+cxx11.abi pypi_0 pypi
[conda] torchaudio 2.1.0.post3+cxx11.abi pypi_0 pypi
[conda] torchvision 0.16.0.post3+cxx11.abi pypi_0 pypi

It runs without errors. This is the detailed report. @ZailiWang

ZailiWang · 2024-10-10T08:50:33Z

Thanks. I will ask around internally.

ZailiWang · 2024-10-11T04:01:51Z

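Could you please also test what happens if you disable the integrated GPU in Device Manager, that is, this one:

[0] _DeviceProperties(name='Intel(R) Iris(R) Xe Graphics'

and check whether 2.3.110 then runs on the Arc A770M without the error? Finding the integrated GPU in Device Manager and choosing right-click -> Disable should be enough.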

ZailiWang · 2024-10-11T04:03:35Z

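If the Arc A770 works once the integrated GPU is disabled, please use that as a temporary workaround. We will work on fixing this bug as quickly as we can. Sorry for the inconvenience.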

byclear · 2024-10-11T04:07:44Z

Collecting environment information...
PyTorch version: 2.3.1+cxx11.abi
PyTorch CXX11 ABI: No
IPEX version: 2.3.110+xpu
IPEX commit: 95c9459
Build type: Release

OS: N/A
GCC version: N/A
Clang version: N/A
IGC version: N/A
CMake version: N/A
Libc version: N/A

Python version: 3.10.15 | packaged by Anaconda, Inc. | (main, Oct 3 2024, 07:22:19) [MSC v.1929 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.26100-SP0
Is XPU available: True
DPCPP runtime version: N/A
MKL version: N/A
GPU models and configuration:
[0] _XpuDeviceProperties(name='Intel(R) Arc(TM) A770M Graphics', platform_name='Intel(R) Level-Zero', type='gpu', driver_version='1.3.29803', total_memory=15930MB, max_compute_units=512, gpu_eu_count=512, gpu_subslice_count=64, max_work_group_size=1024, max_num_sub_groups=128, sub_group_sizes=[8 16 32], has_fp16=1, has_fp64=0, has_atomic64=1)
Intel OpenCL ICD version: N/A
Level Zero version: N/A

CPU:
'wmic' is not recognized as an internal or external command,
operable program or batch file.

Versions of relevant libraries:
[pip3] intel_extension_for_pytorch==2.3.110+xpu
[pip3] numpy==2.1.2
[pip3] torch==2.3.1+cxx11.abi
[pip3] torchaudio==2.3.1+cxx11.abi
[pip3] torchvision==0.18.1+cxx11.abi
[conda] intel-extension-for-pytorch 2.3.110+xpu pypi_0 pypi
[conda] mkl 2024.2.1 pypi_0 pypi
[conda] mkl-dpcpp 2024.2.1 pypi_0 pypi
[conda] numpy 2.1.2 pypi_0 pypi
[conda] onemkl-sycl-blas 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-datafitting 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-dft 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-lapack 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-rng 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-sparse 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-stats 2024.2.1 pypi_0 pypi
[conda] onemkl-sycl-vm 2024.2.1 pypi_0 pypi
[conda] torch 2.3.1+cxx11.abi pypi_0 pypi
[conda] torchaudio 2.3.1+cxx11.abi pypi_0 pypi
[conda] torchvision 0.18.1+cxx11.abi pypi_0 pypi

Indeed. With the integrated GPU disabled, it works.

ZurrTum · 2024-10-12T08:00:30Z

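@ZailiWang With this approach, some displays may not be compatible with the discrete GPU, so only the Microsoft Basic Display Adapter driver can be used.
Disabling the integrated GPU is not suitable in the long term: when the frame rate output by the GPU is out of sync with the display's refresh rate, the picture tears.

Processor: 13th Gen Intel(R) Core(TM) i9-13900H, 2600 MHz, 14 cores, 20 logical processors
Adapter type: Intel(R) Iris(R) Xe Graphics Family, Intel Corporation compatible
Adapter type: Intel(R) Arc(TM) A370M Graphics Family, Intel Corporation compatible

The integrated GPU supports 2560x1440 at 100 Hz; the discrete GPU only reaches 2560x1440 at 64 Hz
Monitor: Integrated Monitor (LEN-A570-A-C)
System SKU: LENOVO_MT_F0GQ_BU_Lenovo_FM_XiaoXinPro 27-IRH

Python 3.12.4 | packaged by Anaconda, Inc.
intel_extension_for_pytorch 2.3.110+gitccf9c15
torch 2.3.0a0+git63d5e92
Driver version: 32.0.101.6083_101.5736 (Latest 10/7/2024)

Version 2.3.110 fails with the -33 (PI_ERROR_INVALID_DEVICE) error; disabling the integrated GPU or rolling back to 2.1.40+xpu works.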

ZailiWang · 2024-10-12T08:04:22Z

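This has been confirmed as a bug and will be fixed. Disabling the integrated GPU is only a temporary workaround until the fix is available.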

DurianyDoriana · 2024-10-12T22:34:30Z

Is this the reason IntelAI version 1.21b has problems with an Arc GPU and an iGPU?

Does it also use 2.3.110+xpu?

I filed the following bug report:
intel/AI-Playground#76

ZailiWang · 2024-10-29T06:46:36Z

Hi, the issue has been fixed. Could you retry by reinstalling with

python -m pip install torch==2.3.1+cxx11.abi torchvision==0.18.1+cxx11.abi torchaudio==2.3.1+cxx11.abi intel-extension-for-pytorch==2.3.110+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/ --force-reinstall

and check whether the issue is resolved on your side? Thanks!

byclear · 2024-10-30T02:29:10Z

Collecting environment information...
Traceback (most recent call last):
File "C:\pythonxm\测试ARC显卡\collect_env.py", line 516, in <module>
main()
File "C:\pythonxm\测试ARC显卡\collect_env.py", line 511, in main
output = get_pretty_env_info()
File "C:\pythonxm\测试ARC显卡\collect_env.py", line 506, in get_pretty_env_info
return pretty_str(get_env_info())
File "C:\pythonxm\测试ARC显卡\collect_env.py", line 367, in get_env_info
xpu_available_str = str(torch.xpu.is_available())
File "C:\Users\clear\miniconda3\envs\xpu\lib\site-packages\torch\xpu\__init__.py", line 63, in is_available
return device_count() > 0
File "C:\Users\clear\miniconda3\envs\xpu\lib\site-packages\torch\xpu\__init__.py", line 57, in device_count
return torch._C._xpu_getDeviceCount()
RuntimeError: Can't add devices across platforms to a single context. -33 (PI_ERROR_INVALID_DEVICE)

Same result. Did you paste the wrong command?

byclear · 2024-10-30T02:30:59Z

annotated-types 0.7.0 pypi_0 pypi
bzip2 1.0.8 h2bbff1b_6
ca-certificates 2024.9.24 haa95532_0
dpcpp-cpp-rt 2024.2.1 pypi_0 pypi
filelock 3.16.1 pypi_0 pypi
fsspec 2024.10.0 pypi_0 pypi
intel-cmplr-lib-rt 2024.2.1 pypi_0 pypi
intel-cmplr-lib-ur 2024.2.1 pypi_0 pypi
intel-cmplr-lic-rt 2024.2.1 pypi_0 pypi
intel-extension-for-pytorch 2.3.110+xpu pypi_0 pypi
intel-opencl-rt 2024.2.1 pypi_0 pypi
intel-openmp 2024.2.1 pypi_0 pypi
intel-sycl-rt 2024.2.1 pypi_0 pypi
jinja2 3.1.4 pypi_0 pypi
libffi 3.4.4 hd77b12b_1
libuv 1.48.0 h827c3e9_0
markupsafe 3.0.2 pypi_0 pypi
mkl 2024.2.1 pypi_0 pypi
mkl-dpcpp 2024.2.1 pypi_0 pypi
mpmath 1.3.0 pypi_0 pypi
networkx 3.4.2 pypi_0 pypi
numpy 2.1.2 pypi_0 pypi
onemkl-sycl-blas 2024.2.1 pypi_0 pypi
onemkl-sycl-datafitting 2024.2.1 pypi_0 pypi
onemkl-sycl-dft 2024.2.1 pypi_0 pypi
onemkl-sycl-lapack 2024.2.1 pypi_0 pypi
onemkl-sycl-rng 2024.2.1 pypi_0 pypi
onemkl-sycl-sparse 2024.2.1 pypi_0 pypi
onemkl-sycl-stats 2024.2.1 pypi_0 pypi
onemkl-sycl-vm 2024.2.1 pypi_0 pypi
openssl 3.0.15 h827c3e9_0
packaging 24.1 pypi_0 pypi
pillow 11.0.0 pypi_0 pypi
pip 24.2 py310haa95532_0
psutil 6.1.0 pypi_0 pypi
pydantic 2.9.2 pypi_0 pypi
pydantic-core 2.23.4 pypi_0 pypi
python 3.10.15 h4607a30_1
ruamel-yaml 0.18.6 pypi_0 pypi
ruamel-yaml-clib 0.2.12 pypi_0 pypi
setuptools 75.1.0 py310haa95532_0
sqlite 3.45.3 h2bbff1b_0
sympy 1.13.3 pypi_0 pypi
tbb 2021.13.1 pypi_0 pypi
tk 8.6.14 h0416ee5_0
torch 2.3.1+cxx11.abi pypi_0 pypi
torchaudio 2.3.1+cxx11.abi pypi_0 pypi
torchvision 0.18.1+cxx11.abi pypi_0 pypi
typing-extensions 4.12.2 pypi_0 pypi
tzdata 2024b h04d1e81_0
vc 14.40 h2eaa2aa_1
vs2015_runtime 14.40.33807 h98bb1dd_1
wheel 0.44.0 py310haa95532_0
xz 5.4.6 h8cc25b3_1
zlib 1.2.13 h8cc25b3_1

It fails as soon as I run it.

ZailiWang · 2024-10-30T03:07:50Z

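Sorry, the information I received was wrong. The fix for this issue has not yet been officially rolled into the current packages. I will let you know once the fixed release is confirmed to be out.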

Nuullll · 2024-11-04T08:23:26Z

@ZailiWang Did you post the wrong command accidentally? I tried the latest v2.3.110 hotfix (post0), it worked.

python -m pip install torch==2.3.1.post0+cxx11.abi torchvision==0.18.1.post0+cxx11.abi torchaudio==2.3.1.post0+cxx11.abi intel-extension-for-pytorch==2.3.110.post0+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/cn/

python -c "import torch; import intel_extension_for_pytorch as ipex; print(torch.__version__); print(ipex.__version__); [print(f'[{i}]: {torch.xpu.get_device_properties(i)}') for i in range(torch.xpu.device_count())];"
C:\Users\vfirs\miniforge3\envs\arc311\Lib\site-packages\torchvision\io\image.py:13: UserWarning: Failed to load image Python extension: 'Could not find module 'C:\Users\vfirs\miniforge3\envs\arc311\Lib\site-packages\torchvision\image.pyd' (or one of its dependencies). Try using the full path with constructor syntax.'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?
  warn(
C:\Users\vfirs\miniforge3\envs\arc311\Lib\site-packages\intel_extension_for_pytorch\llm\__init__.py:9: UserWarning: failed to use huggingface generation fuctions due to: No module named 'transformers'.
  warnings.warn(f"failed to use huggingface generation fuctions due to: {e}.")
2.3.1.post0+cxx11.abi
2.3.110.post0+xpu
[0]: _XpuDeviceProperties(name='Intel(R) UHD Graphics 770', platform_name='Intel(R) Level-Zero', type='gpu', driver_version='1.3.30398', total_memory=14829MB, max_compute_units=32, gpu_eu_count=32, gpu_subslice_count=4, max_work_group_size=512, max_num_sub_groups=64, sub_group_sizes=[8 16 32], has_fp16=1, has_fp64=0, has_atomic64=1)
[1]: _XpuDeviceProperties(name='Intel(R) Arc(TM) A770 Graphics', platform_name='Intel(R) Level-Zero', type='gpu', driver_version='1.3.30398', total_memory=15930MB, max_compute_units=512, gpu_eu_count=512, gpu_subslice_count=64, max_work_group_size=1024, max_num_sub_groups=128, sub_group_sizes=[8 16 32], has_fp16=1, has_fp64=0, has_atomic64=1)
[2]: _XpuDeviceProperties(name='Intel(R) Arc(TM) A750 Graphics', platform_name='Intel(R) Level-Zero', type='gpu', driver_version='1.3.30398', total_memory=7934MB, max_compute_units=448, gpu_eu_count=448, gpu_subslice_count=56, max_work_group_size=1024, max_num_sub_groups=128, sub_group_sizes=[8 16 32], has_fp16=1, has_fp64=0, has_atomic64=1)
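
The one-line check above can be expanded into a small script that also reports a failure instead of stopping at the traceback. This is a rough sketch: it only uses `torch.xpu.device_count()` and `torch.xpu.get_device_properties()` as they appear in the one-liner, and the function name `describe_xpu_devices` is a hypothetical helper, not an IPEX API.

```python
def describe_xpu_devices() -> list[str]:
    """Return one line per visible XPU device, or a single line describing the failure."""
    try:
        import torch
        # Imported for its side effect, as in the one-liner above.
        import intel_extension_for_pytorch  # noqa: F401
    except (ImportError, OSError) as exc:
        # OSError covers a missing DLL such as the WinError 126 seen earlier in the thread.
        return [f"import failed: {exc}"]
    try:
        count = torch.xpu.device_count()
        return [f"[{i}]: {torch.xpu.get_device_properties(i)}" for i in range(count)]
    except (AttributeError, RuntimeError) as exc:
        # RuntimeError is how the -33 (PI_ERROR_INVALID_DEVICE) error surfaced in this issue.
        return [f"device query failed: {exc}"]


if __name__ == "__main__":
    for line in describe_xpu_devices():
        print(line)
```

An empty result means the imports succeeded but no XPU device was reported.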

ZailiWang · 2024-11-04T08:27:21Z

Yeah, the hotfix release for this issue has just been released. Thanks for the confirmation. @byclear Please check again by reinstalling the packages; this time it should work.

byclear · 2024-11-06T03:11:33Z

2.3.1.post0+cxx11.abi
2.3.110.post0+xpu
Traceback (most recent call last):
File "C:\pythonxm\测试ARC显卡\test.py", line 38, in <module>
print(torch.xpu.empty_cache())
File "C:\Users\clear\miniconda3\envs\xpu\lib\site-packages\intel_extension_for_pytorch\xpu\memory.py", line 21, in empty_cache
intel_extension_for_pytorch._C._emptyCache()
RuntimeError: Queue cannot be constructed with the given context and device since the device is neither a member of the context nor a descendant of its member. -33 (PI_ERROR_INVALID_DEVICE)

The related functions still fail, especially when I want to select a specific GPU:
t = torch.Tensor([1., 2.])
print(t.to("xpu:1"))
Traceback (most recent call last):
File "C:\pythonxm\测试ARC显卡\test.py", line 41, in <module>
print(t.to("xpu:1"))
RuntimeError: Queue cannot be constructed with the given context and device since the device is neither a member of the context nor a descendant of its member. -33 (PI_ERROR_INVALID_DEVICE)
Please fix this as soon as possible.

ZailiWang · 2024-11-06T03:30:12Z

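Sorry, it looks like this issue is still not really resolved. I have reported it, and work on the fix will continue as soon as possible.
For now, if you don't want to disable the iGPU in Device Manager, you can set this environment variable before running your IPEX program:

ONEAPI_DEVICE_SELECTOR=*:1

That should also work around the problem for the time being.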

byclear · 2024-11-06T03:39:22Z

import os
os.environ["ONEAPI_DEVICE_SELECTOR"] = "*:1"

OK, thanks. It runs successfully now.
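
A slightly fuller sketch of this workaround follows. It assumes the selector has to be in the environment before torch and IPEX are imported, so that it is already set when devices are first enumerated; the thread only confirms that setting it at the top of the script worked. The helper names `apply_device_selector` and `visible_xpu_names` are hypothetical.

```python
import os


def apply_device_selector(selector: str = "*:1") -> str:
    """Set ONEAPI_DEVICE_SELECTOR unless it is already set; return the value in effect."""
    return os.environ.setdefault("ONEAPI_DEVICE_SELECTOR", selector)


def visible_xpu_names() -> list[str]:
    """Names of the XPU devices this process can see (empty if torch/IPEX is unusable)."""
    try:
        import torch
        import intel_extension_for_pytorch  # noqa: F401
    except (ImportError, OSError):
        return []
    try:
        return [torch.xpu.get_device_properties(i).name
                for i in range(torch.xpu.device_count())]
    except (AttributeError, RuntimeError):
        return []


# Order matters in this sketch: the selector is set first, and torch/IPEX
# are imported only inside visible_xpu_names(), which is called afterwards.
effective = apply_device_selector("*:1")
print("ONEAPI_DEVICE_SELECTOR =", effective)
print("visible XPU devices:", visible_xpu_names())
```

Using `setdefault` keeps any value already exported in the shell, so the script does not silently override a user's own selector. With `*:1`, index 1 was the Arc A770M in the earlier device listings of this thread; on other machines the index may differ.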

ZailiWang self-assigned this Oct 5, 2024

jingxu10 added the ARC (ARC GPU), Crash (Execution crashes), and Windows labels Oct 10, 2024

ZailiWang added the Bug (Something isn't working) label Oct 11, 2024

ZailiWang added the Escalate label Oct 11, 2024

Nuullll mentioned this issue Oct 23, 2024

Need to disable integrated GPU when using A770 intel/AI-Playground#79

Closed
