Interoperability with OpenCL {#interop_opencl}
========

Although ArrayFire is quite extensive, there remain many cases in which you
may want to write custom kernels in OpenCL or [CUDA](\ref interop_cuda).
For example, you may wish to add ArrayFire to an existing code base to increase
your productivity, or you may need to supplement ArrayFire's functionality
with your own custom implementation of specific algorithms.

ArrayFire manages its own context, queue, and memory, and creates custom IDs
for devices. As such, most of the interoperability functions focus on reducing
potential synchronization conflicts between ArrayFire and OpenCL.

# Basics

It is fairly straightforward to interface ArrayFire with your own custom OpenCL
code. ArrayFire provides several functions to ease this process, including:

| Function              | Purpose                                                                  |
|-----------------------|--------------------------------------------------------------------------|
| afcl::array(...)      | Construct an ArrayFire array from cl_mem references or cl::Buffer objects |
| af::array.device()    | Obtain a pointer to the cl_mem reference (implies lock())                |
| af::array.lock()      | Remove ArrayFire's control of a cl_mem buffer                            |
| af::array.unlock()    | Restore ArrayFire's control over a cl_mem buffer                         |
| afcl::getPlatform()   | Get ArrayFire's current cl_platform                                      |
| af::getDevice()       | Get the current ArrayFire device ID                                      |
| afcl::getDeviceId()   | Get ArrayFire's current cl_device_id                                     |
| af::setDevice()       | Set ArrayFire's device from an ArrayFire device ID                       |
| afcl::setDeviceId()   | Set ArrayFire's device from a cl_device_id                               |
| afcl::setDevice()     | Set ArrayFire's device from a cl_device_id and cl_context                |
| afcl::getContext()    | Get ArrayFire's current cl_context                                       |
| afcl::getQueue()      | Get ArrayFire's current cl_command_queue                                 |
| afcl::getDeviceType() | Get the current afcl_device_type                                         |

Additionally, the OpenCL backend permits the programmer to add and remove custom
devices from the ArrayFire device manager. This permits you to attach ArrayFire
directly to the OpenCL queue used by other portions of your application.

| Function             | Purpose                                         |
|----------------------|-------------------------------------------------|
| afcl::addDevice()    | Add a new device to ArrayFire's device manager  |
| afcl::deleteDevice() | Remove a device from ArrayFire's device manager |

Below we provide two worked examples of how ArrayFire can be integrated
into new and existing projects.

# Adding custom OpenCL kernels to an existing ArrayFire application

By default, ArrayFire manages its own context, queue, and memory, and creates
custom IDs for devices. Thus there is some bookkeeping that needs to be done to
integrate your custom OpenCL kernel.

If your kernels can operate in the same queue as ArrayFire, you should:

1. Add an include for `af/opencl.h` to your project
2. Obtain the OpenCL context, device, and queue used by ArrayFire
3. Obtain cl_mem references to af::array objects
4. Load, build, and use your kernels
5. Return control of af::array memory to ArrayFire

Note that ArrayFire uses an in-order queue; thus, when ArrayFire and your
kernels are operating in the same queue, there is no need to perform any
synchronization operations.

This process is best illustrated with a fully worked example:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{.cpp}
#include <arrayfire.h>
// 1. Add the af/opencl.h include to your project
#include <af/opencl.h>

int main() {
    size_t length = 10;

    // Create ArrayFire array objects:
    af::array A = af::randu(length, f32);
    af::array B = af::constant(0, length, f32);

    // ... additional ArrayFire operations here

    // 2. Obtain the device, context, and queue used by ArrayFire
    static cl_context af_context = afcl::getContext();
    static cl_device_id af_device_id = afcl::getDeviceId();
    static cl_command_queue af_queue = afcl::getQueue();

    // 3. Obtain cl_mem references to af::array objects
    cl_mem * d_A = A.device<cl_mem>();
    cl_mem * d_B = B.device<cl_mem>();

    // 4. Load, build, and use your kernels.
    //    For the sake of readability, we have omitted error checking.
    cl_int status = CL_SUCCESS;

    // A simple copy kernel, uses C++11 syntax for multi-line strings.
    const char * kernel_name = "copy_kernel";
    const char * source = R"(
        void __kernel
        copy_kernel(__global float * gA, __global float * gB)
        {
            int id = get_global_id(0);
            gB[id] = gA[id];
        }
    )";

    // Create the program, build the executable, and extract the entry point
    // for the kernel.
    cl_program program = clCreateProgramWithSource(af_context, 1, &source, NULL, &status);
    status = clBuildProgram(program, 1, &af_device_id, NULL, NULL, NULL);
    cl_kernel kernel = clCreateKernel(program, kernel_name, &status);

    // Set arguments and launch your kernels
    clSetKernelArg(kernel, 0, sizeof(cl_mem), d_A);
    clSetKernelArg(kernel, 1, sizeof(cl_mem), d_B);
    clEnqueueNDRangeKernel(af_queue, kernel, 1, NULL, &length, NULL, 0, NULL, NULL);

    // 5. Return control of af::array memory to ArrayFire
    A.unlock();
    B.unlock();

    // ... resume ArrayFire operations

    // Because the device pointers, d_A and d_B, were returned to ArrayFire's
    // control by the unlock function, there is no need to free them using
    // clReleaseMemObject()

    return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

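Returning control to ArrayFire (step 5) is easy to miss on early-return or
exception paths. One possible way to guard against that is a small RAII
wrapper. The sketch below is only an illustration: `ScopedUnlock` is a
hypothetical helper, not part of ArrayFire, and it assumes nothing about the
wrapped type beyond a callable `unlock()` member.

```cpp
// Hypothetical helper, not part of ArrayFire: calls unlock() on the wrapped
// object when the guard goes out of scope.
template <typename Array>
class ScopedUnlock {
public:
    explicit ScopedUnlock(Array &arr) : arr_(arr) {}
    ~ScopedUnlock() { arr_.unlock(); }

    // Non-copyable so that unlock() runs exactly once per guard.
    ScopedUnlock(const ScopedUnlock &) = delete;
    ScopedUnlock &operator=(const ScopedUnlock &) = delete;

private:
    Array &arr_;  // the guard does not own the array, it only unlocks it
};
```

With such a guard, writing `ScopedUnlock<af::array> guard_A(A);` right after
`A.device<cl_mem>()` would take the place of the explicit `A.unlock()` call.
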
If your kernels need to operate in their own OpenCL queue, the process is
essentially identical, except that you need to instruct ArrayFire to complete
its computations using the af::sync() function prior to launching your
own kernel, and ensure your kernels are complete using `clFinish`
(or similar) commands prior to returning control of the memory to ArrayFire:

1. Add an include for `af/opencl.h` to your project
2. Obtain the OpenCL context, device, and queue used by ArrayFire
3. Obtain cl_mem references to af::array objects
4. Instruct ArrayFire to finish operations using af::sync()
5. Load, build, and use your kernels
6. Instruct OpenCL to finish operations using clFinish() or similar commands
7. Return control of af::array memory to ArrayFire

# Adding ArrayFire to an existing OpenCL application

Adding ArrayFire to an existing OpenCL application is slightly more involved
and can be somewhat tricky due to several optimizations we implement. The
most important are as follows:

* ArrayFire assumes control of all memory provided to it.
* ArrayFire does not (in general) support in-place memory transactions.

We will discuss the implications of these items below. To add ArrayFire
to existing code you need to:

1. Add includes
2. Instruct OpenCL to complete its operations using clFinish (or similar)
3. Instruct ArrayFire to use the user-created OpenCL context
4. Create ArrayFire arrays from OpenCL memory objects
5. Perform ArrayFire operations on the arrays
6. Instruct ArrayFire to finish operations using af::sync()
7. Obtain cl_mem references for important memory
8. Continue your OpenCL application

To create the af::array objects, you should use one of the following
`afcl::array` functions:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{.cpp}
// 1D - 4D afcl::array functions (declared in af/opencl.h, namespace afcl)
static af::array array (dim_t dim0, cl_mem buf, af::dtype type, bool retain=false)
static af::array array (dim_t dim0, dim_t dim1, cl_mem buf, af::dtype type, bool retain=false)
static af::array array (dim_t dim0, dim_t dim1, dim_t dim2, cl_mem buf, af::dtype type, bool retain=false)
static af::array array (dim_t dim0, dim_t dim1, dim_t dim2, dim_t dim3, cl_mem buf, af::dtype type, bool retain=false)

// afcl::array function using a dim4 object
static af::array array (af::dim4 idims, cl_mem buf, af::dtype type, bool retain=false)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

*NOTE*: With all of these functions, ArrayFire's memory manager automatically
assumes responsibility for any memory provided to it. If you are creating
an array from a `cl::Buffer`, you should specify `retain=true` to ensure your
memory is not deallocated when your `cl::Buffer` goes out of scope.
We use this technique in the example below.
If you do not wish for ArrayFire to manage your memory, you may call the
`array::unlock()` function and manage the memory yourself; however, if you do
so, please be careful not to call `clReleaseMemObject` on a `cl_mem` while
ArrayFire might be using it!

The eight steps above are best illustrated using a fully worked example. Below we
use the OpenCL 2.0 C++ API and omit error checking to keep the code readable.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~{.cpp}
#include <CL/cl2.hpp>

// 1. Add arrayfire.h and af/opencl.h to your application
#include "arrayfire.h"
#include "af/opencl.h"

#include <cstdio>
#include <vector>

int main() {

    // Set up the OpenCL context, device, and queues
    cl::Context context(CL_DEVICE_TYPE_ALL);
    std::vector<cl::Device> devices = context.getInfo<CL_CONTEXT_DEVICES>();
    cl::Device device = devices[0];
    cl::CommandQueue queue(context, device);

    // Create a buffer of size 10 filled with ones, copy it to the device.
    // CL_MEM_COPY_HOST_PTR is required whenever a host pointer is supplied
    // for initialization.
    int length = 10;
    std::vector<float> h_A(length, 1);
    cl::Buffer cl_A(context, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR,
                    length * sizeof(float), h_A.data());

    // 2. Instruct OpenCL to complete its operations using clFinish (or similar)
    queue.finish();

    // 3. Instruct ArrayFire to use the user-created context
    //    First, create a device from the current OpenCL device + context + queue
    afcl::addDevice(device(), context(), queue());
    //    Next switch ArrayFire to the device using the device and context as
    //    identifiers:
    afcl::setDevice(device(), context());

    // 4. Create ArrayFire arrays from OpenCL memory objects
    af::array af_A = afcl::array(length, cl_A(), f32, true);

    // 5. Perform ArrayFire operations on the Arrays
    af_A = af_A + af::randu(length);

    // NOTE: ArrayFire does not perform the above transaction using in-place
    // memory, thus the underlying OpenCL buffer backing af_A has probably
    // changed

    // 6. Instruct ArrayFire to finish operations using af::sync
    af::sync();

    // 7. Obtain cl_mem references for important memory
    cl_A = *af_A.device<cl_mem>();

    // 8. Continue your OpenCL application

    // ...

    return 0;
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

# Using multiple devices

If you are using ArrayFire and OpenCL with multiple devices, be sure to use
`afcl::addDevice` to add your custom context + device + queue to ArrayFire's
device manager. This will let you switch ArrayFire devices using your current
`cl_device_id` and `cl_context`.