First bunch of opencl 46 suggestions
- preferences
- setting up
jenshannoschwalm committed Dec 6, 2023
1 parent 0130657 commit fec9488
Showing 4 changed files with 29 additions and 41 deletions.
14 changes: 8 additions & 6 deletions content/preferences-settings/processing.md
@@ -73,10 +73,12 @@ OpenCL scheduling profile
: - _very fast GPU_: both pixelpipes are processed sequentially on the GPU.
: - _multiple GPUs_: both pixelpipes are processed in parallel on different GPUs -- see the [multiple devices](../special-topics/opencl/multiple-devices.md) section for more information.

tune OpenCL performance
: Defines how darktable will attempt to tune OpenCL performance for your system. The following options are provided (default _nothing_):
: - _nothing_: do not attempt to tune OpenCL performance.
: - _memory size_: this parameter currently (by default) applies a fixed 400MB headroom to all devices and assumes the remainder (total device memory less 400MB) is available for OpenCL module processing. You can also choose to amend this value or have darktable attempt to auto-detect available memory by changing a parameter in your `darktablerc` file. Please see the [memory & performance tuning](../special-topics/mem-performance.md#id-specific-opencl-configuration) section for more details. If you choose to enable auto-detection, switching this parameter off and on again will force a re-detection at the next pipe run.
: - _memory transfer_: when darktable needs more memory than it has available, it breaks your images into tiles, which are processed separately. When tiling, darktable frequently needs to transfer data between system and GPU memory. This option tells darktable to use a special copy mode (pinned memory transfer), which can be faster, but can also require more memory on some devices. On other devices it might degrade performance. There is no safe general way to predict how this option will function on a given device so you will have to test it for yourself. If you have multiple devices, you can switch pinned memory transfer on or off on a "per device" basis by directly editing your darktablerc file.
: - _memory size and transfer_: use both tuning mechanisms.
use all device memory
: Enable this option to allow darktable to use all OpenCL memory on all devices, except for a safety margin (headroom). The default headroom is 600MB but it can be specified per device.

OpenCL drivers
: In most cases darktable detects the correct driver setup, but in some situations you may have installed several OpenCL drivers for the same hardware device. Examples include AMD cards with both the vendor-provided driver and rusticl, or Windows systems with a vendor driver plus OpenCLon12.
: Most drivers have their own toggle switch. On more exotic hardware, such as ARM boards, you have to switch on the fallback "other platforms".
: Select the drivers you want to use from the list. If you suspect that a driver is malfunctioning, you can switch it off here.

: See the [memory & performance tuning](../special-topics/mem-performance.md) section for more information.
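
The headroom rule described under "use all device memory" is simple arithmetic: usable memory is the total device memory less the headroom. The following is a minimal Python sketch of that calculation, not darktable's actual code. The function name is hypothetical, and only the 600MB default is taken from the text above.

```python
def usable_opencl_memory_mb(total_mb, headroom_mb=600):
    """Return the memory (in MB) left for OpenCL processing after headroom.

    Hypothetical helper for illustration only; the 600 MB default mirrors
    the documented headroom default, everything else is an assumption.
    """
    if total_mb < 0 or headroom_mb < 0:
        raise ValueError("memory sizes must be non-negative")
    # Never report a negative amount if the headroom exceeds the device memory.
    return max(total_mb - headroom_mb, 0)


if __name__ == "__main__":
    for total in (2048, 8192):
        print(total, "MB total ->", usable_opencl_memory_mb(total), "MB usable")
```

A larger per-device headroom can be modelled by passing a different `headroom_mb` value.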
6 changes: 3 additions & 3 deletions content/special-topics/opencl/activate-opencl.md
@@ -6,14 +6,14 @@ draft: false
author: "people"
---

Using OpenCL in darktable requires that your PC is equipped with a suitable graphics card and that it has the required libraries in place. Most modern graphics cards from NVIDIA and AMD come with full OpenCL support. The OpenCL compiler is normally shipped as part of the proprietary graphics driver and is used as a dynamic library called `libOpenCL.so`. This library must be in a folder where it can be found by your system's dynamic linker.
Using OpenCL in darktable requires that your PC is equipped with a suitable graphics card and that it has the required libraries in place. Most modern graphics cards from NVIDIA, Intel or AMD come with full OpenCL support. The OpenCL compiler is normally shipped as part of the proprietary graphics driver and is used as a dynamic library called `libOpenCL.so`. This library must be in a folder where it can be found by your system's dynamic linker.

When darktable starts, it will first try to find and load `libOpenCL.so` and, on success, check if the available graphics card comes with OpenCL support. A sufficient amount of graphics memory (1GB+) needs to be available for darktable to take advantage of the GPU. If that check passes, darktable tries to set up its OpenCL environment: a processing context needs to be initialized, a calculation pipeline to be started, OpenCL source code files (extension is `.cl`) need to be read and compiled and the included routines (OpenCL kernels) need to be prepared for darktable's modules. If all of that completes successfully, the preparation is complete.
When darktable starts, it will first try to find and load `libOpenCL.so` and, on success, check if the available graphics card comes with OpenCL support. A minimally sufficient amount of graphics memory (1GB+) needs to be available for darktable to take advantage of the GPU. If that check passes, darktable tries to set up its OpenCL environment: a processing context needs to be initialized, a calculation pipeline to be started, OpenCL source code files (extension is `.cl`) need to be read and compiled and the included routines (OpenCL kernels) need to be prepared for darktable's modules. If all of that completes successfully, the preparation is complete.
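
The first startup step, locating and loading the OpenCL library, can be approximated outside darktable. The sketch below uses Python's standard `ctypes` module and is only an illustration: darktable performs its own lookup, and the helper names here are hypothetical. Note that `ctypes.util.find_library` relies on system tools such as `ldconfig` on Linux and may return `None` even when the library exists in an unusual location.

```python
import ctypes
import ctypes.util


def find_opencl_library():
    """Return the library name the system resolves for OpenCL, or None."""
    return ctypes.util.find_library("OpenCL")


def can_load_opencl():
    """Return True if the OpenCL library can be found and loaded."""
    name = find_opencl_library()
    if name is None:
        return False
    try:
        # Loading the library only proves the dynamic linker can find it;
        # it says nothing about whether a usable device is present.
        ctypes.CDLL(name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    print("OpenCL library:", find_opencl_library())
    print("loadable:", can_load_opencl())
```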

By default, OpenCL support is activated in darktable if all the above steps were successful. If you want to de-activate it you can do so in [preferences > processing > cpu/gpu/memory](../../preferences-settings/processing#cpu--gpu--memory). This configuration parameter is grayed out if the OpenCL initialization failed.

You can switch OpenCL support off and on at any time without requiring a restart. Depending on the type of modules you are using, you will notice the effect as a general speed-up during interactive work and export. Most modules in darktable can take advantage of OpenCL but not all modules are demanding enough to make a noticeable difference. In order to feel a real difference, use modules like [_diffuse or sharpen_](../../module-reference/processing-modules/diffuse.md) and [_denoise (profiled)_](../../module-reference/processing-modules/denoise-profiled.md).

If you are interested in profiling statistics, you can start darktable with command line parameters `-d opencl -d perf`. After each run of the pixelpipe you will be shown details of processing time for each module plus an even more fine-grained profile for all used OpenCL kernels.
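
If you want to start a profiling session from a script, a minimal Python sketch could look like the following. Only the `-d opencl -d perf` parameters come from the text above. The helper name is hypothetical, and the sketch assumes the executable is called `darktable` and is available on your `PATH`.

```python
import shutil
import subprocess


def profiling_command(executable="darktable"):
    """Build the command line that enables OpenCL and performance debug output."""
    return [executable, "-d", "opencl", "-d", "perf"]


if __name__ == "__main__":
    cmd = profiling_command()
    if shutil.which(cmd[0]) is None:
        # Assumption: darktable is expected on PATH; otherwise just show the command.
        print("darktable not found on PATH; would run:", " ".join(cmd))
    else:
        # The profiling details are printed to the terminal after each pixelpipe run.
        subprocess.run(cmd, check=False)
```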

Apart from the speed-up you should not see any difference in the results between CPU and GPU processing. Except for some rounding errors, the results are designed to be identical. If, for some reason, darktable fails to properly finish a GPU calculation, it will normally detect the failure and automatically (and transparently) fall back to CPU processing.
Apart from the speed-up you should not see any difference in the results between CPU and GPU processing. Except for some rounding errors, the results are designed to be identical. If, for some reason, darktable fails to properly finish a GPU calculation, it will normally detect the failure and automatically (and transparently) fall back to CPU processing.
3 changes: 2 additions & 1 deletion content/special-topics/opencl/problems-solutions.md
@@ -26,4 +26,5 @@ Here are a few cases that have been observed in the past:

- darktable fails to compile its OpenCL source files at run-time. In this case you will see a number of error messages looking like typical compiler errors. This could indicate an incompatibility between your OpenCL implementation and darktable's interpretation of the standard. In that case please raise an issue on [github](https://github.com/darktable-org/darktable/issues/new/choose) and we will try to assist. Please also report if you see significant differences between CPU and GPU processing of an image.

A few on-CPU implementations of OpenCL also exist, coming as drivers provided by Intel or AMD. We have observed that they do not provide any speed gain versus our hand-optimized CPU code. Therefore darktable simply discards these devices by default. This behavior can be changed by setting the configuration variable `opencl_use_cpu_devices` (in `$HOME/.config/darktablerc`) to `TRUE`.
- you have installed several OpenCL drivers meant for the same hardware. This will always lead to severe problems and must be strictly avoided. On Windows systems the `Microsoft OpenCLon12` driver is often installed. Inspect the list of drivers in [preferences > processing > cpu/gpu/memory](../../preferences-settings/processing#cpu--gpu--memory).

A few on-CPU implementations of OpenCL also exist, coming as drivers provided by Intel or AMD. We have observed that they do not provide any speed gain versus our hand-optimized CPU code. Therefore darktable simply discards these drivers / devices by default.
47 changes: 16 additions & 31 deletions content/special-topics/opencl/setting-up.md
@@ -6,7 +6,7 @@ draft: false
author: "people"
---

The huge diversity of systems and the marked differences between OpenCL vendors and driver versions make it impossible to give a comprehensive overview of how to set up OpenCL. We can only give you an example, in this case for NVIDIA driver version 331.89 on Ubuntu 14.04. We hope that this will serve as a basic introduction and will help you to solve any problems specific to your setup.
The huge diversity of systems and the marked differences between OpenCL vendors and driver versions make it impossible to give a comprehensive overview of how to set up OpenCL. We can only give you an example, in this case for NVIDIA driver version 545.29.06 on Fedora 39. We hope that this will serve as a basic introduction and will help you to solve any problems specific to your setup.

The principal OpenCL function flow is like this:

@@ -16,7 +16,7 @@ The principal OpenCL function flow is like this:

- `libOpenCL.so` reads the vendor-specific information file (`/etc/OpenCL/vendors/nvidia.icd`) to find the library that contains the vendor-specific OpenCL implementation.

- The vendor-specific OpenCL implementation comes as a library `libnvidia-opencl.so.1` (which in our case is a symbolic link to `libnvidia-opencl.so.331.89`).
- The vendor-specific OpenCL implementation comes as a library `libnvidia-opencl.so.1` (which in our case is a symbolic link to `libnvidia-opencl.so.545.29.06`).

- `libnvidia-opencl.so.1` needs to talk to the vendor-specific kernel modules `nvidia` and `nvidia_uvm` via device special files `/dev/nvidia0`, `/dev/nvidiactl`, and `/dev/nvidia-uvm`.

@@ -27,34 +27,19 @@ A user account that needs to make use of OpenCL from within darktable must have
To summarise, the packages that needed to be installed in this specific case were:

```
nvidia-331 (331.89-0ubuntu1~xedgers14.04.2)
nvidia-331-dev (331.89-0ubuntu1~xedgers14.04.2)
nvidia-331-uvm (331.89-0ubuntu1~xedgers14.04.2)
nvidia-libopencl1-331 (331.89-0ubuntu1~xedgers14.04.2)
nvidia-modprobe (340.24-1)
nvidia-opencl-dev:amd64 (5.5.22-3ubuntu1)
nvidia-opencl-icd-331 (331.89-0ubuntu1~xedgers14.04.2)
nvidia-settings (340.24-0ubuntu1~xedgers14.04.1)
nvidia-settings-304 (340.24-0ubuntu1~xedgers14.04.1)
nvidia-libopencl1-331 (331.89-0ubuntu1~xedgers14.04.2)
nvidia-opencl-dev:amd64 (5.5.22-3ubuntu1)
nvidia-opencl-icd-331 (331.89-0ubuntu1~xedgers14.04.2)
opencl-headers (1.2-2013.10.23-1)
xorg-x11-drv-nvidia
xorg-x11-drv-nvidia-libs
xorg-x11-drv-nvidia-cuda
xorg-x11-drv-nvidia-cuda-libs
xorg-x11-drv-nvidia-power
akmod-nvidia
nvidia-settings
nvidia-modprobe
nvidia-persistenced
opencl-headers
opencl-filesystem
ocl-icd
ocl-icd-devel
```

The list of NVIDIA related kernel modules as reported by lsmod was:

```
nvidia
nvidia_uvm
```

The list of NVIDIA related device special files (`ls -l /dev/nvidia*`) should read like:

```
crw-rw-rw- 1 root root 195, 0 Jul 28 21:13 /dev/nvidia0
crw-rw-rw- 1 root root 195, 255 Jul 28 21:13 /dev/nvidiactl
crw-rw-rw- 1 root root 250, 0 Jul 28 21:13 /dev/nvidia-uvm
```

Beware that the major/minor numbers (e.g. `250/0` for `/dev/nvidia-uvm` in this example) may vary depending on your system.
On Linux systems you might also want to install the `clinfo` package, which gives you a lot of information about your OpenCL system and settings.
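
The lookup chain described in this section (vendor information file, vendor library, device special files) can also be checked by hand. The following Python sketch is one way to do that, under stated assumptions: the helper names are hypothetical, the default directory `/etc/OpenCL/vendors` and the NVIDIA device files are taken from the example above, and other vendors or distributions may use different paths.

```python
from pathlib import Path


def list_icd_libraries(vendors_dir="/etc/OpenCL/vendors"):
    """Map each .icd file name to the library name(s) listed inside it."""
    result = {}
    directory = Path(vendors_dir)
    if not directory.is_dir():
        return result
    for icd_file in sorted(directory.glob("*.icd")):
        lines = [line.strip() for line in icd_file.read_text().splitlines()]
        # Each non-empty line names a vendor-specific OpenCL library.
        result[icd_file.name] = [line for line in lines if line]
    return result


def missing_device_files(
    paths=("/dev/nvidia0", "/dev/nvidiactl", "/dev/nvidia-uvm"),
):
    """Return the device special files from `paths` that do not exist."""
    return [p for p in paths if not Path(p).exists()]


if __name__ == "__main__":
    for icd, libraries in list_icd_libraries().items():
        print(icd, "->", ", ".join(libraries))
    for path in missing_device_files():
        print("missing device file:", path)
```

An empty result from `list_icd_libraries()` suggests that no vendor information file is installed where the loader expects it. A non-empty result from `missing_device_files()` hints that the kernel modules are not loaded.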
