Commit

Merge branch '370-add-small-examples-to-api-documentation' into 'develop_stream'

Resolve "Add small examples to API documentation"

Closes ROCm#370

See merge request amd/libraries/rocRAND!351
matyas-streamhpc authored and Naraenda committed Oct 28, 2024
2 parents e2a914d + e18e5bc commit c43bec1
Showing 1 changed file with 124 additions and 0 deletions.
124 changes: 124 additions & 0 deletions docs/api-reference/cpp-api.rst
@@ -17,12 +17,136 @@ To search an API, refer to the API :ref:`genindex`.

Device functions
================

To use the device API, include ``rocrand_kernel.h`` in every source file that defines kernels calling rocRAND device functions. Inside the kernel definition, device functions are typically used as follows:

1. Create a new generator state object of the desired generator type.

2. Initialize the generator state parameters using ``rocrand_init``.

3. Generate random numbers by calling a generation function on the generator state.

4. Use the results.

Because the rocRAND device functions are called from inside the user kernel, the generated numbers can be used immediately in the kernel, without first copying them to host memory.
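
As a CPU-only illustration of the four steps above, the following sketch implements Marsaglia's xorwow recurrence in standard C++. It is not rocRAND's implementation: the ``toy_`` names and the seeding scheme are assumptions made for this example, the streams do not carry the independence guarantees of rocRAND subsequences, and the output will not match ``rocrand``.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU stand-in for a per-thread generator state (step 1).
struct toy_xorwow_state
{
    std::array<std::uint32_t, 5> x;
    std::uint32_t                counter;
};

// One splitmix64 step; used here only to spread the seed bits over the state.
// This seeding scheme is an assumption of the sketch, not what rocrand_init does.
static std::uint64_t splitmix64(std::uint64_t& s)
{
    s += 0x9E3779B97F4A7C15ULL;
    std::uint64_t z = s;
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    return z ^ (z >> 31);
}

// Step 2: initialize the state from a seed and a stream identifier.
void toy_init(std::uint64_t seed, std::uint64_t stream, toy_xorwow_state* state)
{
    assert(state != nullptr);
    std::uint64_t s = seed ^ (stream * 0xD1B54A32D192ED03ULL);
    for(auto& word : state->x)
        word = static_cast<std::uint32_t>(splitmix64(s) >> 32);
    state->x[0] |= 1u; // the xorshift words must not all be zero
    state->counter = 0;
}

// Step 3: advance the state and return the next 32-bit value (Marsaglia's xorwow).
std::uint32_t toy_next(toy_xorwow_state* state)
{
    assert(state != nullptr);
    std::uint32_t       t = state->x[4];
    const std::uint32_t s = state->x[0];
    state->x[4] = state->x[3];
    state->x[3] = state->x[2];
    state->x[2] = state->x[1];
    state->x[1] = s;
    t ^= t >> 2;
    t ^= t << 1;
    t ^= s ^ (s << 4);
    state->x[0] = t;
    state->counter += 362437;
    return t + state->counter;
}

// Steps 1 to 3 for one "thread"; step 4 (using the results) is left to the caller.
std::vector<std::uint32_t> toy_generate(std::uint64_t seed, std::uint64_t stream, std::size_t count)
{
    toy_xorwow_state state;
    toy_init(seed, stream, &state);
    std::vector<std::uint32_t> out(count);
    for(auto& value : out)
        value = toy_next(&state);
    return out;
}
```

Calling ``toy_generate(123, tid, 3)`` for several values of ``tid`` mirrors the one-state-per-thread layout of a device kernel, where each thread passes its own index as the stream identifier.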

The following example generates random numbers on the device with the XORWOW generator.

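.. code-block:: cpp

    #include <hip/hip_runtime.h>
    #include <rocrand/rocrand_kernel.h>

    #include <cstdio>

    __global__
    void test()
    {
        // Global thread index, passed as the subsequence so each thread gets its own stream.
        const unsigned int tid = blockDim.x * blockIdx.x + threadIdx.x;
        rocrand_state_xorwow state;
        // Arguments: seed, subsequence, offset, state.
        rocrand_init(123, tid, 0, &state);
        for(int i = 0; i < 3; ++i)
        {
            const auto value = rocrand(&state);
            printf("thread %u, index %d: %u\n", tid, i, value);
        }
    }

    int main()
    {
        // Launch one block of 32 threads and wait for the kernel to finish.
        test<<<dim3(1), dim3(32)>>>();
        hipDeviceSynchronize();
    }

.. doxygengroup:: rocranddevice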

C host API
==========

The C host API encapsulates the internal generator state. Random numbers can be produced either on the host CPU or on the device GPU, depending on which type of generator object was created. For GPU generation, the typical sequence of operations is:

1. Allocate memory on the device with ``hipMalloc``.

2. Create a new generator of the desired type with ``rocrand_create_generator``.

3. Set the generator options; for example, use ``rocrand_set_seed`` to set the seed.

4. Generate random numbers with ``rocrand_generate`` or another generation function.

5. Use the results.

6. Clean up with ``rocrand_destroy_generator`` and ``hipFree``.

To generate random numbers on the host CPU instead, allocate the output buffer in step 1 with a host memory allocation call, call ``rocrand_create_generator_host`` instead of ``rocrand_create_generator`` in step 2, and in the last step release the host buffer with the matching deallocation call in addition to calling ``rocrand_destroy_generator``. All other calls work identically whether the numbers are generated on the device or on the host CPU.

In the example below, the C host API is used to generate 10 random floats on the GPU.

.. code-block:: c

    #include <hip/hip_runtime.h>
    #include <rocrand/rocrand.h>

    #include <stdio.h>
    #include <stdlib.h>

    int main()
    {
        size_t n = 10;
        rocrand_generator gen;
        float *d_rand, *h_rand;

        // Host buffer for the results and device buffer for the generator output.
        h_rand = (float*)malloc(sizeof(float) * n);
        hipMalloc((void**)&d_rand, n * sizeof(float));

        rocrand_create_generator(&gen, ROCRAND_RNG_PSEUDO_DEFAULT);
        rocrand_set_seed(gen, 123);
        // Fill the device buffer with n uniformly distributed floats.
        rocrand_generate_uniform(gen, d_rand, n);

        hipMemcpy(h_rand, d_rand, n * sizeof(float), hipMemcpyDeviceToHost);
        for(size_t i = 0; i < n; i++)
        {
            printf("%f\n", h_rand[i]);
        }

        rocrand_destroy_generator(gen);
        hipFree(d_rand);
        free(h_rand);
        return 0;
    }

.. doxygengroup:: rocrandhost

C++ host API wrapper
====================

The C++ host API wrapper provides resource management and an object-oriented interface to the random number generation facilities.
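
The wrapper's split into an engine object and a distribution object resembles the design of the standard ``<random>`` header. The sketch below shows that pattern with standard-library types only, as a CPU analogy; ``fill_normal`` is a hypothetical helper, and none of the names in it are part of rocRAND.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// CPU-only analogy built on the C++ standard library, not on rocRAND.
// The engine owns the generator state; the distribution shapes its raw output.
std::vector<float> fill_normal(std::size_t n, unsigned int seed, float mean, float stddev)
{
    assert(stddev > 0.0f); // std::normal_distribution requires a positive stddev

    std::mt19937                    engine(seed);       // engine: holds and advances the state
    std::normal_distribution<float> dist(mean, stddev); // distribution: maps raw bits to N(mean, stddev)

    std::vector<float> out(n);
    for(float& value : out)
        value = dist(engine); // the distribution draws from the engine
    return out;
}
```

Here the distribution is invoked once per value, whereas in the wrapper example the distribution call takes the engine, an output pointer, and a count, and fills the whole buffer in one call.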

In the example below, the C++ host API wrapper is used to produce a single normally distributed random number using the default generation parameters.

.. code-block:: cpp

    #include <hip/hip_runtime.h>
    #include <rocrand/rocrand.hpp>

    #include <iostream>

    int main()
    {
        float* d_rand;
        float  h_rand;
        hipMalloc((void**)&d_rand, sizeof(float));

        // Default-constructed engine and distribution.
        rocrand_cpp::xorwow                gen;
        rocrand_cpp::normal_distribution<> dist;
        // Generate one value into device memory.
        dist(gen, d_rand, 1);

        hipMemcpy(&h_rand, d_rand, sizeof(float), hipMemcpyDeviceToHost);
        std::cout << h_rand << std::endl;

        hipFree(d_rand);
        return 0;
    }

.. doxygengroup:: rocrandhostcpp
