/build/arrayfire/src/arrayfire-full-3.6.1/docs/pages/release_notes.md
1 Release Notes {#releasenotes}
2 ==============
3 
4 v3.6.1
5 ======
6 
7 Improvements
8 ------------
9 - FreeImage is now a run-time dependency [#2164]
10 - Reduced binary size by setting the symbol visibility to hidden [#2168]
11 - Add memory manager logging using the AF_TRACE=mem environment variable [#2169]
12 - Improved CPU Anisotropic Diffusion performance [#2174]
13 - Perform normalization after FFT for improved accuracy [#2185][#2192]
14 - Updated CLBlast to v1.4.0 [#2178]
15 - Added additional validation when using af::seq for indexing [#2153]
16 - Perform checks for unsupported cards by the CUDA implementation [#2182]
17 
18 Bug Fixes
19 ---------
20 - Fixed region when all pixels were the foreground or background [#2152]
21 - Fixed several memory leaks [#2202][#2201][#2180][#2179][#2177][#2175]
22 - Fixed bug in setDevice which didn't allow you to select the last device [#2189]
23 - Fixed bug in min/max where the first element of the array was a NaN value [#2155]
24 - Fixed window cell indexing for graphics [#2207]
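
The min/max fix above concerns a NaN in the first position. The following is a minimal standalone sketch, not ArrayFire's implementation, of why that position is special in a naive reduction. It assumes the intended behaviour is that NaN values are ignored, which the note itself does not spell out; the function names `naiveMin` and `nanSkippingMin` are hypothetical.

```cpp
#include <cmath>
#include <limits>
#include <vector>

// Naive reduction: seeds the accumulator with the first element.
// If that element is NaN, every comparison `x < acc` is false,
// so the NaN is never replaced and is returned as the "minimum".
// Precondition: v is not empty.
double naiveMin(const std::vector<double>& v) {
    double acc = v.front();
    for (double x : v) {
        if (x < acc) acc = x;
    }
    return acc;
}

// NaN-skipping reduction (assumed desired behaviour): seeds with
// +infinity and ignores NaN inputs, so the position of a NaN no
// longer affects the result. Returns +infinity if every input is NaN.
double nanSkippingMin(const std::vector<double>& v) {
    double acc = std::numeric_limits<double>::infinity();
    for (double x : v) {
        if (!std::isnan(x) && x < acc) acc = x;
    }
    return acc;
}
```

With the input `{NaN, 3, 1, 2}`, the naive version returns NaN while the NaN-skipping version returns 1; with the NaN moved to any later position both versions agree.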
25 
26 v3.6.0
27 ======
28 
29 The source code with submodules can be downloaded directly from the following link:
30 http://arrayfire.com/arrayfire_source/arrayfire-full-3.6.0.tar.bz2
31 
32 Major Updates
33 -------------
34 
35 - Added the `topk()` function
36  [Documentation](http://arrayfire.org/docs/group__stat__func__topk.htm).
37  <sup>[1](https://github.com/arrayfire/arrayfire/pull/2061)</sup>
38 - Added batched matrix multiply support.
39  <sup>[2](https://github.com/arrayfire/arrayfire/pull/1898)</sup>
40  <sup>[3](https://github.com/arrayfire/arrayfire/pull/2059)</sup>
41 - Added anisotropic diffusion, `anisotropicDiffusion()`.
42  [Documentation](http://arrayfire.org/docs/group__image__func__anisotropic__diffusion.htm)
43  <sup>[4](https://github.com/arrayfire/arrayfire/pull/1850)</sup>.
44 
45 Features
46 --------
47 
48 - Added support for batched matrix multiply.
49  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1898)</sup>
50  <sup>[2](https://github.com/arrayfire/arrayfire/pull/2059)</sup>
51 - New anisotropic diffusion function, `anisotropicDiffusion()`.
52  [Documentation](http://arrayfire.org/docs/group__image__func__anisotropic__diffusion.htm)
53  <sup>[3](https://github.com/arrayfire/arrayfire/pull/1850)</sup>.
54 - New `topk()` function, which returns the top k elements along a given
55  dimension of the input.
56  [Documentation](http://arrayfire.org/docs/group__stat__func__topk.htm).
57  <sup>[4](https://github.com/arrayfire/arrayfire/pull/2061)</sup>
58 - New gradient diffusion
59  [example](https://github.com/arrayfire/arrayfire/blob/master/examples/image_processing/gradient_diffusion.cpp).
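
As an illustration of what a top-k selection computes, here is a minimal host-side sketch in plain C++. It is not the ArrayFire `topk()` implementation or its signature; the helper name `topK` is hypothetical, and it only handles a single 1-D sequence of values, returning the largest ones in descending order.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Returns the k largest values of `v` in descending order.
// If k exceeds the number of elements, all elements are returned.
std::vector<double> topK(std::vector<double> v, std::size_t k) {
    k = std::min(k, v.size());
    // partial_sort places the k largest elements (under std::greater)
    // in sorted order at the front; the rest are left unordered.
    std::partial_sort(v.begin(),
                      v.begin() + static_cast<std::ptrdiff_t>(k),
                      v.end(),
                      std::greater<double>());
    v.resize(k);
    return v;
}
```

A partial sort avoids ordering the whole input, which is the usual reason to offer top-k as a separate operation rather than sorting and slicing.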
60 
61 Improvements
62 ------------
63 
64 - JITted `select()` and `shift()` functions for CUDA and OpenCL backends.
65  <sup>[1](https://github.com/arrayfire/arrayfire/pull/2047)</sup>
66 - Significant CMake improvements.
67  <sup>[2](https://github.com/arrayfire/arrayfire/pull/1861)</sup>
68  <sup>[3](https://github.com/arrayfire/arrayfire/pull/2070)</sup>
69  <sup>[4](https://github.com/arrayfire/arrayfire/pull/2018)</sup>
70 - Improved the quality of the random number generator, thanks to Ralf Stubner.
71  <sup>[5](https://github.com/arrayfire/arrayfire/pull/2122)</sup>
72 - Modified `af_colormap` struct to match forge's definition.
73  <sup>[6](https://github.com/arrayfire/arrayfire/pull/2082)</sup>
74 - Improved Black Scholes example.
75  <sup>[7](https://github.com/arrayfire/arrayfire/pull/2079)</sup>
76 - Using CPack to generate installers.
77  <sup>[8](https://github.com/arrayfire/arrayfire/pull/1861)</sup>
78 - Refactored
79  [black_scholes_options](https://github.com/arrayfire/arrayfire/blob/master/examples/financial/black_scholes_options.cpp)
80  example to use built-in `af::erfc` function for cumulative normal
81  distribution.<sup>[9](https://github.com/arrayfire/arrayfire/pull/2079)</sup>
82 - Reduced the scope of mutexes in memory manager
83  <sup>[10](https://github.com/arrayfire/arrayfire/pull/2125)</sup>
84 - Official installers do not require the CUDA toolkit to be installed
85 - Significant CMake improvements have been made. Using CPack to generate
86  installers. <sup>[11](https://github.com/arrayfire/arrayfire/pull/1861)</sup>
87  <sup>[12](https://github.com/arrayfire/arrayfire/pull/2070)</sup>
88  <sup>[13](https://github.com/arrayfire/arrayfire/pull/2018)</sup>
89 - Corrected assert function calls in select() tests.
90  <sup>[14](https://github.com/arrayfire/arrayfire/pull/2058)</sup>
91 
92 Bug fixes
93 -----------
94 
95 - Fixed `shfl_down()` warnings with CUDA 9.
96  <sup>[1](https://github.com/arrayfire/arrayfire/pull/2040)</sup>
97 - Disabled CUDA JIT debug flags on ARM
98  architecture.<sup>[2](https://github.com/arrayfire/arrayfire/pull/2037)</sup>
99 - Fixed CLBlast install lib dir for Linux platforms where the `lib` directory has
100  an arch (64) suffix.<sup>[3](https://github.com/arrayfire/arrayfire/pull/2094)</sup>
101 - Fixed assert condition in 3D morph OpenCL
102  kernel.<sup>[4](https://github.com/arrayfire/arrayfire/pull/2033)</sup>
103 - Fixed JIT errors with large non-linear
104  kernels.<sup>[5](https://github.com/arrayfire/arrayfire/pull/2127)</sup>
105 - Fixed bug in CPU JIT after `moddims` was called.
106  <sup>[5](https://github.com/arrayfire/arrayfire/pull/2127)</sup>
107 - Fixed deadlock caused by calls to from the worker thread
108  <sup>[6](https://github.com/arrayfire/arrayfire/pull/2124)</sup>
109 
110 Documentation
111 -------------
112 
113 - Fixed variable name typo in `vectorization.md`.
114  <sup>[1](https://github.com/arrayfire/arrayfire/pull/2032)</sup>
115 - Fixed `AF_API_VERSION` value in Doxygen config file.
116  <sup>[2](https://github.com/arrayfire/arrayfire/pull/2053)</sup>
117 
118 Known issues
119 ------------
120 
121 - Several OpenCL tests failing on OSX:
122  - `canny_opencl, fft_opencl, gen_assign_opencl, homography_opencl,
123  reduce_opencl, scan_by_key_opencl, solve_dense_opencl,
124  sparse_arith_opencl, sparse_convert_opencl, where_opencl`
125 
126 Community contributions
127 -----------------------
128 
129 Special thanks to our contributors:
130 [Adrien F. Vincent](https://github.com/afvincent), [Cedric
131 Nugteren](https://github.com/CNugteren),
132 [Felix](https://github.com/fzimmermann89), [Filip
133 Matzner](https://github.com/FloopCZ),
134 [HoneyPatouceul](https://github.com/HoneyPatouceul), [Patrick
135 Lavin](https://github.com/plavin), [Ralf Stubner](https://github.com/rstub),
136 [William Tambellini](https://github.com/WilliamTambellini)
137 
138 
139 v3.5.1
140 ======
141 
142 The source code with submodules can be downloaded directly from the following
143 link: http://arrayfire.com/arrayfire_source/arrayfire-full-3.5.1.tar.bz2
144 
145 Installer CUDA Version: 8.0 (Required) Installer OpenCL Version: 1.2 (Minimum)
146 
147 Improvements
148 ------------
149 - Relaxed `af::unwrap()` function's arguments.
150  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1853)</sup>
151 - Changed behavior of af::array::allocated() to specify memory allocated.
152  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1877)</sup>
153 - Removed restriction on the number of bins for `af::histogram()` on CUDA and
154  OpenCL kernels. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1895)</sup>
155 
156 
157 Performance
158 -----------
159 
160 - Improved JIT performance.
161  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1864)</sup>
162 - Improved CPU element-wise operation performance.
163  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1890)</sup>
164 - Improved regions performance using texture objects. <sup>
165  [1](https://github.com/arrayfire/arrayfire/pull/1903)</sup>
166 
167 
168 Bug fixes
169 ---------
170 - Fixed overflow issues in mean.
171  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1849)</sup>
172 - Fixed memory leak when chaining indexing operations.
173  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1879)</sup>
174 - Fixed bug in array assignment when using an empty array to index.
175  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1897)</sup>
176 - Fixed bug with `af::matmul()` which occurred when its RHS argument was an
177  indexed vector.
178  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1883)</sup>
179 - Fixed deadlock bug when a sparse array was used with a JIT array.
180  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1889)</sup>
181 - Fixed pixel tests for FAST kernels.
182  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1891)</sup>
183 - Fixed `af::replace` so that it is now copy-on-write.
184  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1892)</sup>
185 - Fixed launch configuration issues in CUDA JIT.
186  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1893)</sup>
187 - Fixed segfaults and "Pure Virtual Call" error warnings when exiting on
188  Windows. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1899)
189  [2](https://github.com/arrayfire/arrayfire/pull/1924)</sup>
190 - Workaround for `clEnqueueReadBuffer` bug on OSX.
191  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1888)</sup>
192 
193 Build
194 -----
195 
196 - Fixed issues when compiling with GCC 7.1.
197  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1872)</sup>
198  <sup>[2](https://github.com/arrayfire/arrayfire/pull/1876)</sup>
199 - Eliminated unnecessary Boost dependency from CPU and CUDA backends.
200  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1857)</sup>
201 
202 Misc
203 ----
204 
205 - Updated support links to point to Slack instead of Gitter.
206  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1905)</sup>
207 
208 
209 
210 v3.5.0
211 ==============
212 
213 Major Updates
214 -------------
215 
216 * ArrayFire now supports threaded applications.
217  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1706)</sup>
218 * Added Canny edge detector.
219  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1743)</sup>
220 * Added Sparse-Dense arithmetic operations.
221  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1696)</sup>
222 
223 Features
224 --------
225 
226 * ArrayFire Threading
227  * \ref af::array can be read by multiple threads
228  * All ArrayFire functions can be executed concurrently by multiple threads
229  * Threads can operate on different devices to simplify multi-device workloads
230 * New Canny edge detector function, \ref af::canny().
231  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1743)</sup>
232  * Can automatically calculate high threshold with `AF_CANNY_THRESHOLD_AUTO_OTSU`
233  * Supports both L1 and L2 Norms to calculate gradients
234 * New tuned OpenCL BLAS backend,
235  [CLBlast](https://github.com/arrayfire/arrayfire/pull/1727).
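
The Canny entry above mentions L1 and L2 norms for gradients. As a small sketch of the difference, assuming `gx` and `gy` are the horizontal and vertical derivative values at one pixel (how ArrayFire computes and uses them internally is not described here), the two magnitudes can be written as follows. The function names are hypothetical.

```cpp
#include <cmath>

// L1 norm of the gradient: cheaper, sum of absolute components.
double gradientMagnitudeL1(double gx, double gy) {
    return std::fabs(gx) + std::fabs(gy);
}

// L2 norm of the gradient: Euclidean length of the gradient vector.
double gradientMagnitudeL2(double gx, double gy) {
    return std::hypot(gx, gy);
}
```

The L1 value is never smaller than the L2 value for the same gradient, so thresholds tuned for one norm generally need adjusting for the other.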
236 
237 Improvements
238 ------------
239 
240 * Converted CUDA JIT to use
241  [NVRTC](http://docs.nvidia.com/cuda/nvrtc/index.html) instead of
242  [NVVM](http://docs.nvidia.com/cuda/nvvm-ir-spec/index.html).
243 * Performance improvements in \ref af::reorder().
244  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1766)</sup>
245 * Performance improvements in \ref af::array::scalar<T>().
246  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1809)</sup>
247 * Improved unified backend performance.
248  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1770)</sup>
249 * ArrayFire now depends on Forge
250  v1.0. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1800)</sup>
251 * Can now specify the FFT plan cache size using the
252  \ref af::setFFTPlanCacheSize() function.
253 * Get the number of physical bytes allocated by the memory manager
254  \ref af_get_allocated_bytes(). <sup>[1](https://github.com/arrayfire/arrayfire/pull/1630)</sup>
255 * \ref af::dot() can now return a scalar value to the
256  host. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1628)</sup>
257 
258 Bug Fixes
259 ---------
260 
261 * Fixed improper release of default Mersenne random
262  engine. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1716)</sup>
263 * Fixed \ref af::randu() and \ref af::randn() ranges for floating point
264  types. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1784)</sup>
265 * Fixed assignment bug in CPU
266  backend. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1765)</sup>
267 * Fixed complex (`c32`,`c64`) multiplication in OpenCL convolution
268  kernels. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1816)</sup>
269 * Fixed inconsistent behavior with \ref af::replace() and \ref
270  af_replace_scalar(). <sup>[1](https://github.com/arrayfire/arrayfire/pull/1773)</sup>
271 * Fixed memory leak in \ref
272  af_fir(). <sup>[1](https://github.com/arrayfire/arrayfire/pull/1765)</sup>
273 * Fixed memory leaks in \ref af_cast for sparse arrays.
274  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1826)</sup>
275 * Fixed correctness of \ref af_pow for complex numbers by using Cartesian
276  form. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1765)</sup>
277 * Corrected \ref af::select() with indexing in CUDA and OpenCL
278  backends. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1731)</sup>
279 * Workaround for VS2015 compiler ternary
280  bug. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1771)</sup>
281 * Fixed memory corruption in
282  `cuda::findPlan()`. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1793)</sup>
283 * Argument checks in \ref af_create_sparse_array avoid inputs of type
284  int64. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1747)</sup>
285 * Fixed issue with indexing an array with a step size != 1. <sup>[1](https://github.com/arrayfire/arrayfire/issues/1846)</sup>
286 
287 Build fixes
288 -----------
289 
290 * On OSX, utilize new GLFW package from the brew package
291  manager. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1720)</sup>
292  <sup>[2](https://github.com/arrayfire/arrayfire/pull/1775)</sup>
293 * Fixed CUDA PTX names generated by CMake
294  v3.7. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1689)</sup>
295 * Support `gcc` > 5.x for
296  CUDA. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1708)</sup>
297 
298 Examples
299 --------
300 
301 * New genetic algorithm example.
302  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1695)</sup>
303 
304 Documentation
305 -------------
306 
307 * Updated `README.md` to improve readability and
308  formatting. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1726)</sup>
309 * Updated `README.md` to mention Julia and Nim
310  wrappers. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1714)</sup>
311 * Improved installation instructions -
312  `docs/pages/install.md`. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1740)</sup>
313 
314 Miscellaneous
315 -------------
316 
317 * A few improvements for ROCm
318  support. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1710)</sup>
319 * Removed CUDA 6.5 support.
320  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1687)</sup>
321 
322 Known issues
323 ------------
324 
325 * Windows
326  * The Windows NVIDIA driver version `37x.xx` contains a bug which causes
327  `fftconvolve_opencl` to fail. Upgrade or downgrade to a different version of
328  the driver to avoid this failure.
329  * The following tests fail on Windows with NVIDIA hardware:
330  `threading_cuda`,`qr_dense_opencl`, `solve_dense_opencl`.
331 * macOS
332  * The Accelerate framework, used by the CPU backend on macOS, leverages Intel
333  graphics cards (Iris) when there are no discrete GPUs available. This OpenCL
334  implementation is known to give incorrect results on the following tests:
335  `lu_dense_{cpu,opencl}`, `solve_dense_{cpu,opencl}`,
336  `inverse_dense_{cpu,opencl}`.
337  * Certain tests intermittently fail on macOS with NVIDIA GPUs apparently due
338  to inconsistent driver behavior: `fft_large_cuda` and `svd_dense_cuda`.
339  * The following tests are currently failing on macOS with AMD GPUs:
340  `cholesky_dense_opencl` and `scan_by_key_opencl`.
341 
342 
343 v3.4.2
344 ==============
345 
346 Deprecation Announcement
347 ------------------------
348 
349 This release supports CUDA 6.5 and higher. The next ArrayFire release will
350 support CUDA 7.0 and higher, dropping support for CUDA 6.5. Reasons for no
351 longer supporting CUDA 6.5 include:
352 
353 * CUDA 7.0 NVCC supports the C++11 standard (whereas CUDA 6.5 does not), which
354  is used by ArrayFire's CPU and OpenCL backends.
355 * Very few ArrayFire users still use CUDA 6.5.
356 
357 As a result, the older Jetson TK1 / Tegra K1 will no longer be supported in
358 the next ArrayFire release. The newer Jetson TX1 / Tegra X1 will continue to
359 have full capability with ArrayFire.
360 
361 Docker
362 ------
363 * [ArrayFire has been Dockerized](https://github.com/arrayfire/arrayfire-docker).
364 
365 Improvements
366 ------------
367 * Implemented sparse storage format conversions between \ref AF_STORAGE_CSR
368  and \ref AF_STORAGE_COO.
369  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1642)</sup>
370  * Directly convert between \ref AF_STORAGE_COO <--> \ref AF_STORAGE_CSR
371  using the af::sparseConvertTo() function.
372  * af::sparseConvertTo() now also supports converting to dense.
373 * Added cast support for [sparse arrays](\ref sparse_func).
374  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1653)</sup>
375  * Casting only changes the values array and the type. The row and column
376  index arrays are not changed.
377 * Reintroduced automated computation of chart axes limits for graphics functions.
378  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1639)</sup>
379  * The axes limits will always be the minimum/maximum of the current and new
380  limit.
381  * The user can still set limits from API calls. If the user sets a limit
382  from the API call, then the automatic limit setting will be disabled.
383 * Using `boost::scoped_array` instead of `boost::scoped_ptr` when managing
384  array resources.
385  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1637)</sup>
386 * Internal performance improvements to getInfo() by using `const` references
387  to avoid unnecessary copying of `ArrayInfo` objects.
388  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1665)</sup>
389 * Added support for scalar af::array inputs for af::convolve() and
390  [set functions](\ref set_mat).
391  <sup>[1](https://github.com/arrayfire/arrayfire/issues/1660)</sup>
392  <sup>[2](https://github.com/arrayfire/arrayfire/issues/1675)</sup>
393  <sup>[3](https://github.com/arrayfire/arrayfire/pull/1668)</sup>
394 * Performance fixes in af::fftConvolve() kernels.
395  <sup>[1](https://github.com/arrayfire/arrayfire/issues/1679)</sup>
396  <sup>[2](https://github.com/arrayfire/arrayfire/pull/1680)</sup>
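
To make the COO and CSR storage formats mentioned in this list concrete, here is a minimal host-side sketch of a COO to CSR conversion in plain C++. It is an illustration under stated assumptions, not the code behind af::sparseConvertTo(): the `Coo` and `Csr` structs and the `cooToCsr` helper are hypothetical, indices are zero-based, and the input is assumed to contain no duplicate entries.

```cpp
#include <cstddef>
#include <vector>

// Coordinate format: one (row, column, value) triple per non-zero.
struct Coo {
    int rows;
    std::vector<int> rowIdx, colIdx;
    std::vector<double> values;
};

// Compressed sparse row: rowPtr[r]..rowPtr[r+1] delimits row r.
struct Csr {
    int rows;
    std::vector<int> rowPtr, colIdx;
    std::vector<double> values;
};

Csr cooToCsr(const Coo& coo) {
    Csr csr;
    csr.rows = coo.rows;
    csr.rowPtr.assign(static_cast<std::size_t>(coo.rows) + 1, 0);
    const std::size_t nnz = coo.values.size();
    csr.colIdx.resize(nnz);
    csr.values.resize(nnz);

    // 1. Count the non-zeros in each row.
    for (int r : coo.rowIdx) ++csr.rowPtr[r + 1];
    // 2. Prefix-sum the counts to obtain row offsets.
    for (int r = 0; r < coo.rows; ++r) csr.rowPtr[r + 1] += csr.rowPtr[r];
    // 3. Scatter each entry into the next free slot of its row.
    std::vector<int> next(csr.rowPtr.begin(), csr.rowPtr.end() - 1);
    for (std::size_t i = 0; i < nnz; ++i) {
        const int dst = next[coo.rowIdx[i]]++;
        csr.colIdx[dst] = coo.colIdx[i];
        csr.values[dst] = coo.values[i];
    }
    return csr;
}
```

The conversion only rearranges index and value data; the numeric type of the values is untouched, which is the counterpart of the cast behaviour described above, where only the values array changes and the index arrays stay as they are.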
397 
398 Build
399 -----
400 * Support for Visual Studio 2015 compilation.
401  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1632)</sup>
402  <sup>[2](https://github.com/arrayfire/arrayfire/pull/1640)</sup>
403 * Fixed `FindCBLAS.cmake` when PkgConfig is used.
404  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1657)</sup>
405 
406 Bug fixes
407 ---------
408 * Fixes to JIT when tree is large.
409  <sup>[1](https://github.com/arrayfire/arrayfire/issues/1646)</sup>
410  <sup>[2](https://github.com/arrayfire/arrayfire/pull/1638)</sup>
411 * Fixed indexing bug when converting dense to sparse af::array as \ref
412  AF_STORAGE_COO.
413  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1642)</sup>
414 * Fixed af::bilateral() OpenCL kernel compilation on OS X.
415  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1638)</sup>
416 * Fixed memory leak in af::regions() (CPU) and af::rgb2ycbcr().
417  <sup>[1](https://github.com/arrayfire/arrayfire/issues/1664)</sup>
418  <sup>[2](https://github.com/arrayfire/arrayfire/issues/1664)</sup>
419  <sup>[3](https://github.com/arrayfire/arrayfire/pull/1666)</sup>
420 
421 Installers
422 ----------
423 * Major OS X installer fixes.
424  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1629)</sup>
425  * Fixed installation scripts.
426  * Fixed installation symlinks for libraries.
427 * Windows installer now ships with more pre-built examples.
428 
429 Examples
430 --------
431 * Added af::choleskyInPlace() calls to `cholesky.cpp` example.
432  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1671)</sup>
433 
434 Documentation
435 -------------
436 * Added `u8` as supported data type in `getting_started.md`.
437  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1661)</sup>
438 * Fixed typos.
439  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1652)</sup>
440 
441 CUDA 8 on OSX
442 -------------
443 * [CUDA 8.0.55](https://developer.nvidia.com/cuda-toolkit) supports Xcode 8.
444  <sup>[1](https://github.com/arrayfire/arrayfire/issues/1664)</sup>
445 
446 Known Issues
447 ------------
448 * Known failures with CUDA 6.5. These include all functions that use
449  sorting. As a result, sparse storage format conversion between \ref
450  AF_STORAGE_COO and \ref AF_STORAGE_CSR has been disabled for CUDA 6.5.
451 
452 v3.4.1
453 ==============
454 
455 Installers
456 ----------
457 * Installers for Linux, OS X and Windows
458  * CUDA backend now uses [CUDA 8.0](https://developer.nvidia.com/cuda-toolkit).
459  * Uses [Intel MKL 2017](https://software.intel.com/en-us/intel-mkl).
460  * CUDA Compute 2.x (Fermi) is no longer compiled into the library.
461 * Installer for OS X
462  * The libraries shipping in the OS X Installer are now compiled with Apple
463  Clang v7.3.1 (previously v6.1.0).
464  * The OS X version used is 10.11.6 (previously 10.10.5).
465 * Installer for Jetson TX1 / Tegra X1
466  * Requires [JetPack for L4T 2.3](https://developer.nvidia.com/embedded/jetpack)
467  (containing Linux for Tegra r24.2 for TX1).
468  * CUDA backend now uses [CUDA 8.0](https://developer.nvidia.com/cuda-toolkit) 64-bit.
469  * Using CUDA's cusolver instead of CPU fallback.
470  * Uses OpenBLAS for CPU BLAS.
471  * All ArrayFire libraries are now 64-bit.
472 
473 Improvements
474 ------------
475 * Add [sparse array](\ref sparse_func) support to \ref af::eval().
476  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1598)</sup>
477 * Add OpenCL-CPU fallback support for sparse \ref af::matmul() when running on
478  a unified memory device. Uses MKL Sparse BLAS.
479 * When using CUDA libdevice, pick the correct compute version based on device.
480  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1612)</sup>
481 * OpenCL FFT now also supports prime factors 7, 11 and 13.
482  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1383)</sup>
483  <sup>[2](https://github.com/arrayfire/arrayfire/pull/1619)</sup>
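
For the FFT entry above, a quick way to see which transform lengths are covered is to check that a length has no prime factors outside the supported set. The sketch below assumes the full set is 2, 3, 5, 7, 11 and 13 (the note only names the three new factors, so 2, 3 and 5 are an assumption), and the helper name is hypothetical. What the library does with other lengths is not stated in these notes.

```cpp
#include <initializer_list>

// Returns true if n > 0 and n has no prime factor other than
// 2, 3, 5, 7, 11 or 13 (assumed set of supported radices).
bool hasOnlySupportedRadices(unsigned long long n) {
    if (n == 0) return false;
    for (unsigned long long p : {2ULL, 3ULL, 5ULL, 7ULL, 11ULL, 13ULL}) {
        // Divide out every power of p.
        while (n % p == 0) n /= p;
    }
    // Anything left over is a product of unsupported primes.
    return n == 1;
}
```

For example, 1001 (7 x 11 x 13) passes this check, while 34 (2 x 17) does not.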
484 
485 Bug Fixes
486 ---------
487 * Allow CUDA libdevice to be detected from custom directory.
488 * Fix `aarch64` detection on Jetson TX1 64-bit OS.
489  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1593)</sup>
490 * Add missing definition of `af_set_fft_plan_cache_size` in unified backend.
491  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1591)</sup>
492 * Fix initial values for \ref af::min() and \ref af::max() operations.
493  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1594)</sup>
494  <sup>[2](https://github.com/arrayfire/arrayfire/pull/1595)</sup>
495 * Fix distance calculation in \ref af::nearestNeighbour for CUDA and OpenCL backend.
496  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1596)</sup>
497  <sup>[2](https://github.com/arrayfire/arrayfire/pull/1595)</sup>
498 * Fix OpenCL bug where scalars were passed incorrectly to compile options.
499  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1595)</sup>
500 * Fix bug in \ref af::Window::surface() with respect to dimensions and ranges.
501  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1604)</sup>
502 * Fix possible double free corruption in \ref af_assign_seq().
503  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1605)</sup>
504 * Add missing eval for key in \ref af::scanByKey in CPU backend.
505  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1605)</sup>
506 * Fixed creation of sparse values array using \ref AF_STORAGE_COO.
507  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1620)</sup>
508  <sup>[2](https://github.com/arrayfire/arrayfire/pull/1621)</sup>
509 
510 Examples
511 --------
512 * Add a [Conjugate Gradient solver example](\ref benchmarks/cg.cpp)
513  to demonstrate sparse and dense matrix operations.
514  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1599)</sup>
515 
516 CUDA Backend
517 ------------
518 * When using [CUDA 8.0](https://developer.nvidia.com/cuda-toolkit),
519  compute 2.x are no longer in default compute list.
520  * This follows [CUDA 8.0](https://developer.nvidia.com/cuda-toolkit)
521  deprecating computes 2.x.
522  * Default computes for CUDA 8.0 will be 30, 50, 60.
523 * When using CUDA pre-8.0, the default selection remains 20, 30, 50.
524 * CUDA backend now uses `-arch=sm_30` for PTX compilation as default.
525  * Unless compute 2.0 is enabled.
526 
527 Known Issues
528 ------------
529 * \ref af::lu() on CPU is known to give incorrect results when built against
530  the Accelerate Framework and run on OS X 10.11 or 10.12.
531  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1617)</sup>
532  * Since the OS X Installer libraries use MKL rather than Accelerate
533  Framework, this issue does not affect those libraries.
534 
535 
536 v3.4.0
537 ==============
538 
539 Major Updates
540 -------------
541 * [Sparse Matrix and BLAS](\ref sparse_func). <sup>[1](https://github.com/arrayfire/arrayfire/issues/821)
542  [2](https://github.com/arrayfire/arrayfire/pull/1319)</sup>
543 * Faster JIT for CUDA and OpenCL. <sup>[1](https://github.com/arrayfire/arrayfire/issues/1472)
544  [2](https://github.com/arrayfire/arrayfire/pull/1462)</sup>
545 * Support for [random number generator engines](\ref af::randomEngine).
546  <sup>[1](https://github.com/arrayfire/arrayfire/issues/868)
547  [2](https://github.com/arrayfire/arrayfire/pull/1551)</sup>
548 * Improvements to graphics. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1555)
549  [2](https://github.com/arrayfire/arrayfire/pull/1566)</sup>
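
As background for the sparse BLAS support listed above, the following is a minimal sketch of a sparse matrix times dense vector product over CSR data in plain C++. It is not ArrayFire code and does not use its API; the function name `csrMatVec` and the zero-based `rowPtr`/`colIdx`/`values` layout are assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

// Computes y = A * x for a matrix A stored in CSR form.
// Precondition: rowPtr has (number of rows + 1) entries and every
// column index in colIdx is a valid index into x.
std::vector<double> csrMatVec(const std::vector<int>& rowPtr,
                              const std::vector<int>& colIdx,
                              const std::vector<double>& values,
                              const std::vector<double>& x) {
    const std::size_t rows = rowPtr.size() - 1;
    std::vector<double> y(rows, 0.0);
    for (std::size_t r = 0; r < rows; ++r) {
        // Only the stored non-zeros of row r contribute to y[r].
        for (int k = rowPtr[r]; k < rowPtr[r + 1]; ++k) {
            y[r] += values[k] * x[colIdx[k]];
        }
    }
    return y;
}
```

The work is proportional to the number of stored non-zeros rather than to rows times columns, which is the main motivation for a dedicated sparse multiply.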
550 
551 Features
552 ----------
553 * **[Sparse Matrix and BLAS](\ref sparse_func)** <sup>[1](https://github.com/arrayfire/arrayfire/issues/821)
554 [2](https://github.com/arrayfire/arrayfire/pull/1319)</sup>
555  * Support for [CSR](\ref AF_STORAGE_CSR) and [COO](\ref AF_STORAGE_COO)
556  [storage types](\ref af_storage).
557  * Sparse-Dense Matrix Multiplication and Matrix-Vector Multiplication as a
558  part of af::matmul() using \ref AF_STORAGE_CSR format for sparse.
559  * Conversion to and from [dense](\ref AF_STORAGE_DENSE) matrix to [CSR](\ref AF_STORAGE_CSR)
560  and [COO](\ref AF_STORAGE_COO) [storage types](\ref af_storage).
561 * **Faster JIT** <sup>[1](https://github.com/arrayfire/arrayfire/issues/1472)
562  [2](https://github.com/arrayfire/arrayfire/pull/1462)</sup>
563  * Performance improvements for CUDA and OpenCL JIT functions.
564  * Support for evaluating multiple outputs in a single kernel. See af::array::eval() for more.
565 * **[Random Number Generation](\ref af::randomEngine)**
566  <sup>[1](https://github.com/arrayfire/arrayfire/issues/868)
567  [2](https://github.com/arrayfire/arrayfire/pull/1551)</sup>
568  * af::randomEngine(): A random engine class to handle setting the [type](af_random_type) and seed
569  for random number generator engines.
570  * Supported engine types are (\ref af_random_engine_type):
571  * [Philox](http://www.thesalmons.org/john/random123/)
572  * [Threefry](http://www.thesalmons.org/john/random123/)
573  * [Mersenne Twister](http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/MTGP/)
574 * **Graphics** <sup>[1](https://github.com/arrayfire/arrayfire/pull/1555)
575  [2](https://github.com/arrayfire/arrayfire/pull/1566)</sup>
576  * Using [Forge v0.9.0](https://github.com/arrayfire/forge/releases/tag/v0.9.0)
577  * [Vector Field](\ref af::Window::vectorField) plotting functionality.
578  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1566)</sup>
579  * Removed [GLEW](http://glew.sourceforge.net/) and replaced with [glbinding](https://github.com/cginternals/glbinding).
580  * Removed usage of GLEW after support for MX (multithreaded) was dropped in v2.0.
581  <sup>[1](https://github.com/arrayfire/arrayfire/issues/1540)</sup>
582  * Multiple overlays on the same window are now possible.
583  * Overlays support for same type of object (2D/3D)
584  * Supported by af::Window::plot, af::Window::hist, af::Window::surface,
585  af::Window::vectorField.
586  * New API to set axes limits for graphs.
587  * Draw calls do not automatically compute the limits. This is now under user control.
588  * af::Window::setAxesLimits can be used to set axes limits automatically or manually.
589  * af::Window::setAxesTitles can be used to set axes titles.
590  * New API for plot and scatter:
591  * af::Window::plot() and af::Window::scatter() now can handle 2D and 3D and determine appropriate order.
592  * af_draw_plot_nd()
593  * af_draw_plot_2d()
594  * af_draw_plot_3d()
595  * af_draw_scatter_nd()
596  * af_draw_scatter_2d()
597  * af_draw_scatter_3d()
598 * **New [interpolation methods](\ref af_interp_type)**
599 <sup>[1](https://github.com/arrayfire/arrayfire/issues/1562)</sup>
600  * Applies to
601  * \ref af::resize()
602  * \ref af::transform()
603  * \ref af::approx1()
604  * \ref af::approx2()
605 * **Support for [complex mathematical functions](\ref mathfunc_mat)**
606  <sup>[1](https://github.com/arrayfire/arrayfire/issues/1507)</sup>
607  * Add complex support for \ref trig_mat, \ref af::sqrt(), \ref af::log().
608 * **af::medfilt1(): Median filter for 1-d signals** <sup>[1](https://github.com/arrayfire/arrayfire/pull/1479)</sup>
609 * <b>Generalized scan functions: \ref scan_func_scan and \ref scan_func_scanbykey</b>
610  * Now supports inclusive or exclusive scans
611  * Supports binary operations defined by \ref af_binary_op.
612  <sup>[1](https://github.com/arrayfire/arrayfire/issues/388)</sup>
613 * **[Image Moments](\ref moments_mat) functions**
614  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1453)</sup>
615 * <b>Add af::getSizeOf() function for \ref af_dtype</b>
616  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1404)</sup>
617 * <b>Explicitly instantiate \ref af::array::device() for `void *`</b>
618  <sup>[1](https://github.com/arrayfire/arrayfire/issues/1503)</sup>
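
The generalized scan entry in the list above distinguishes inclusive from exclusive scans and allows different binary operations. A minimal sequential sketch of that distinction in plain C++ follows; it is not the ArrayFire implementation or API, and the `scan` template with its `identity` parameter is a hypothetical helper for illustration.

```cpp
#include <cstddef>
#include <vector>

// Sequential scan over `in` with a user-supplied binary operation.
// inclusive: out[i] combines in[0..i].
// exclusive: out[i] combines in[0..i-1], and out[0] is `identity`.
template <typename T, typename Op>
std::vector<T> scan(const std::vector<T>& in, Op op, T identity, bool inclusive) {
    std::vector<T> out(in.size());
    T acc = identity;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (inclusive) {
            acc = op(acc, in[i]);
            out[i] = acc;
        } else {
            out[i] = acc;
            acc = op(acc, in[i]);
        }
    }
    return out;
}
```

For the input `{1, 2, 3, 4}` with addition and identity 0, the inclusive scan gives `{1, 3, 6, 10}` and the exclusive scan gives `{0, 1, 3, 6}`. Passing a different operation, such as a maximum with a suitably small identity, yields a running maximum instead.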
619 
620 Bug Fixes
621 --------------
622 * Fixes to edge-cases in \ref morph_mat. <sup>[1](https://github.com/arrayfire/arrayfire/issues/1564)</sup>
623 * Makes JIT tree size consistent between devices. <sup>[1](https://github.com/arrayfire/arrayfire/issues/1457)</sup>
624 * Delegate higher-dimension in \ref convolve_mat to correct dimensions. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1445)</sup>
625 * Indexing fixes with C++11. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1426) [2](https://github.com/arrayfire/arrayfire/pull/1426)</sup>
626 * Handle empty arrays as inputs in various functions. <sup>[1](https://github.com/arrayfire/arrayfire/issues/799)</sup>
627 * Fix bug with single-element input to af::median. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1423)</sup>
628 * Fix bug in calculation of time from af::timeit(). <sup>[1](https://github.com/arrayfire/arrayfire/pull/1414)</sup>
629 * Fix bug in floating point numbers in af::seq. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1404)</sup>
630 * Fixes for OpenCL graphics interop on NVIDIA devices.
631  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1408/commits/e1f16e6)</sup>
632 * Fix bug when compiling large kernels for AMD devices.
633  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1465)</sup>
634 * Fix bug in af::bilateral when shared memory is over the limit.
635  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1478)</sup>
636 * Fix bug in kernel header compilation tool `bin2cpp`.
637  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1544)</sup>
638 * Fix initial values for \ref morph_mat functions.
639  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1547)</sup>
640 * Fix bugs in af::homography() CPU and OpenCL kernels.
641  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1584)</sup>
642 * Fix bug in CPU TNJ.
643  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1587)</sup>
644 
645 
646 Improvements
647 ------------
648 * CUDA 8 and compute 6.x (Pascal) support; the current installer ships with CUDA 7.5. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1432) [2](https://github.com/arrayfire/arrayfire/pull/1487) [3](https://github.com/arrayfire/arrayfire/pull/1539)</sup>
649 * User controlled FFT plan caching. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1448)</sup>
650 * CUDA performance improvements for \ref image_func_wrap, \ref image_func_unwrap and \ref approx_mat.
651  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1411)</sup>
652 * Fallback for CUDA-OpenGL interop when no device supports OpenGL.
653  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1415)</sup>
654 * Additional forms of batching with the \ref transform_func_transform functions.
655  [New behavior defined here](https://github.com/arrayfire/arrayfire/pull/1412).
656  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1412)</sup>
657 * Update to OpenCL2 headers. <sup>[1](https://github.com/arrayfire/arrayfire/issues/1344)</sup>
658 * Support for integration with external OpenCL contexts. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1140)</sup>
659 * Performance improvements to internal copy in CPU Backend.
660  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1440)</sup>
661 * Performance improvements to af::select and af::replace CUDA kernels.
662  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1587)</sup>
663 * Enable OpenCL-CPU offload by default for devices with Unified Host Memory.
664  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1521)</sup>
665  * To disable, use the environment variable `AF_OPENCL_CPU_OFFLOAD=0`.
666 
667 Build
668 ------
669 * Compilation speedups. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1526)</sup>
670 * Build fixes with MKL. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1526)</sup>
671 * Error message when CMake CUDA Compute Detection fails. <sup>[1](https://github.com/arrayfire/arrayfire/issues/1535)</sup>
672 * Several CMake build issues with Xcode generator fixed.
673  <sup>[1](https://github.com/arrayfire/arrayfire/pull/1493) [2](https://github.com/arrayfire/arrayfire/pull/1499)</sup>
674 * Fix multiple OpenCL definitions at link time. <sup>[1](https://github.com/arrayfire/arrayfire/issues/1429)</sup>
675 * Fix lapacke detection in CMake. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1423)</sup>
676 * Update build tags of
677  * [clBLAS](https://github.com/clMathLibraries/clBLAS)
678  * [clFFT](https://github.com/clMathLibraries/clFFT)
679  * [Boost.Compute](https://github.com/boostorg/compute)
680  * [Forge](https://github.com/arrayfire/forge)
681  * [glbinding](https://github.com/cginternals/glbinding)
682 * Fix builds with GCC 6.1.1 and GCC 5.3.0. <sup>[1](https://github.com/arrayfire/arrayfire/pull/1409)</sup>
683 
684 Installers
685 ----------
686 * All installers now ship with ArrayFire libraries built with MKL 2016.
687 * All installers now ship with Forge development files and examples included.
688 * CUDA Compute 2.0 has been removed from the installers. Please contact us
689  directly if you have a special need.
690 
691 Examples
692 -------------
693 * Added [example simulating gravity](\ref graphics/field.cpp) for
694  demonstration of vector field.
695 * Improvements to \ref financial/black_scholes_options.cpp example.
696 * Improvements to \ref graphics/gravity_sim.cpp example.
697 * Fix graphics examples to use af::Window::setAxesLimits and
698  af::Window::setAxesTitles functions.
699 
700 Documentation & Licensing
701 -------------------------
702 * [ArrayFire copyright and trademark policy](http://arrayfire.com/trademark-policy)
703 * Fixed grammar in license.
704 * Add license information for glbinding.
705 * Remove license information for GLEW.
706 * Random123 now applies to all backends.
707 * Random number functions are now under \ref random_mat.
708 
709 Deprecations
710 ------------
711 The following functions have been deprecated and may be modified or removed
712 permanently from future versions of ArrayFire.
713 * \ref af::Window::plot3(): Use \ref af::Window::plot instead.
714 * \ref af_draw_plot(): Use \ref af_draw_plot_nd or \ref af_draw_plot_2d instead.
715 * \ref af_draw_plot3(): Use \ref af_draw_plot_nd or \ref af_draw_plot_3d instead.
716 * \ref af::Window::scatter3(): Use \ref af::Window::scatter instead.
717 * \ref af_draw_scatter(): Use \ref af_draw_scatter_nd or \ref af_draw_scatter_2d instead.
718 * \ref af_draw_scatter3(): Use \ref af_draw_scatter_nd or \ref af_draw_scatter_3d instead.
719 
720 Known Issues
721 -------------
722 Certain CUDA functions are known to be broken on Tegra K1. The following ArrayFire tests are currently failing:
723 * assign_cuda
724 * harris_cuda
725 * homography_cuda
726 * median_cuda
727 * orb_cudasort_cuda
728 * sort_by_key_cuda
729 * sort_index_cuda
730 
731 
732 v3.3.2
733 ==============
734 
735 Improvements
736 ------------
737 * Family of [Sort](\ref sort_mat) functions now support
738  [higher order dimensions](https://github.com/arrayfire/arrayfire/pull/1373).
739 * Improved performance of batched sort on dim 0 for all [Sort](\ref sort_mat) functions.
740 * [Median](\ref stat_func_median) now also supports higher order dimensions.
741 
742 Bug Fixes
743 --------------
744 
745 * Fixes to [error handling](https://github.com/arrayfire/arrayfire/issues/1352) in C++ API for binary functions.
746 * Fixes to [external OpenCL context management](https://github.com/arrayfire/arrayfire/issues/1350).
747 * Fixes to [JPEG_GREYSCALE](https://github.com/arrayfire/arrayfire/issues/1360) for FreeImage versions <= 3.154.
748 * Fixes for [non-float inputs](https://github.com/arrayfire/arrayfire/issues/1386) to \ref af::rgb2gray().
749 
750 Build
751 ------
752 * [Disable CPU Async](https://github.com/arrayfire/arrayfire/issues/1378) when building with GCC < 4.8.4.
753 * Add option to [disable CPUID](https://github.com/arrayfire/arrayfire/issues/1369) from CMake.
754 * More verbose message when [CUDA Compute Detection fails](https://github.com/arrayfire/arrayfire/issues/1362).
755 * Print message to use [CUDA library stub](https://github.com/arrayfire/arrayfire/issues/1363)
756  from CUDA Toolkit if CUDA Library is not found from default paths.
757 * [Build Fixes](https://github.com/arrayfire/arrayfire/pull/1385) on Windows.
758  * For compiling tests out of source.
759  * For compiling ArrayFire with static MKL.
760 * [Exclude <sys/sysctl.h>](https://github.com/arrayfire/arrayfire/pull/1368) when building on GNU Hurd.
761 * Add [manual CMake options](https://github.com/arrayfire/arrayfire/pull/1389) to build DEB and RPM packages.
762 
763 Documentation
764 -------------
765 * Fixed documentation for \ref af::replace().
766 * Fixed images in [Using on OSX](\ref using_on_osx) page.
767 
768 Installer
769 ---------
770 * Linux x64 installers will now be compiled with GCC 4.9.2.
771 * OSX installer gives better error messages on brew failures and
772  now includes a link to [Fixing OS X Installer Failures](https://github.com/arrayfire/arrayfire/wiki/Fixing-Common-OS-X-Installer-Failures)
773  for brew installation failures.
774 
775 v3.3.1
776 ==============
777 
778 Bug Fixes
779 --------------
780 
781 * Fixes to \ref af::array::device()
782  * CPU Backend: [evaluate arrays](https://github.com/arrayfire/arrayfire/issues/1316)
783  before returning pointer with asynchronous calls in CPU backend.
784  * OpenCL Backend: [fix segfaults](https://github.com/arrayfire/arrayfire/issues/1324)
785  when device pointers are requested for empty arrays.
786 * Fixed \ref af::array::operator%() to use [mod instead of rem](https://github.com/arrayfire/arrayfire/issues/1318).
787 * Fixed [array destruction](https://github.com/arrayfire/arrayfire/issues/1321)
788  when backends are switched in Unified API.
789 * Fixed [indexing](https://github.com/arrayfire/arrayfire/issues/1331) after
790  \ref af::moddims() is called.
791 * Fixes FFT calls for CUDA and OpenCL backends when used on
792  [multiple devices](https://github.com/arrayfire/arrayfire/issues/1332).
793 * Fixed [unresolved external](https://github.com/arrayfire/arrayfire/commit/32965ef)
794  for some functions from \ref af::array::array_proxy class.
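
The notes above do not spell out how "rem" and "mod" differ. As a hedged illustration only, the sketch below assumes the distinction resembles the one between the C++ standard library functions `std::remainder` and `std::fmod`; this mapping is an assumption and is not confirmed by the release notes. The wrapper names `ieee_rem` and `trunc_mod` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// IEEE-style remainder: a - n*b, where n is a/b rounded to the nearest integer.
// The result can be negative even when both operands are positive.
double ieee_rem(double a, double b) {
    return std::remainder(a, b);
}

// Truncated modulus: a - n*b, where n is a/b truncated toward zero.
// The result has the same sign as the dividend a.
double trunc_mod(double a, double b) {
    return std::fmod(a, b);
}
```

With `a = 7` and `b = 4`, `ieee_rem` gives `-1` (7/4 rounds to 2) while `trunc_mod` gives `3` (7/4 truncates to 1), which shows why the choice matters for code that uses `%`.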
795 
796 Build
797 ------
798 * CMake compiles files in alphabetical order.
799 * CMake fixes for BLAS and LAPACK on some Linux distributions.
800 
801 Improvements
802 ------------
803 * Fixed [OpenCL FFT performance](https://github.com/arrayfire/arrayfire/issues/1323) regression.
804 * \ref af::array::device() on OpenCL backend [returns](https://github.com/arrayfire/arrayfire/issues/1311)
805  `cl_mem` instead of `(void*)cl::Buffer*`.
806 * In Unified backend, [load versioned libraries](https://github.com/arrayfire/arrayfire/issues/1312)
807  at runtime.
808 
809 Documentation
810 ------
811 * Reorganized, cleaner README file.
812 * Replaced non-free lena image in assets with free-to-distribute lena image.
813 
814 v3.3.0
815 ==============
816 
817 Major Updates
818 -------------
819 
820 * CPU backend supports asynchronous execution.
821 * Performance improvements to OpenCL BLAS and FFT functions.
822 * Improved performance of memory manager.
823 * Improvements to visualization functions.
824 * Improved sorted order for OpenCL devices.
825 * Integration with external OpenCL projects.
826 
827 Features
828 ----------
829 
830 * \ref af::getActiveBackend(): Returns the current backend being used.
831 * [Scatter plot](https://github.com/arrayfire/arrayfire/pull/1116) added to graphics.
832 * \ref af::transform() now supports perspective transformation matrices.
833 * \ref af::infoString(): Returns `af::info()` as a string.
834 * \ref af::printMemInfo(): Prints a table showing information about buffers from the memory manager.
835  * The \ref AF_MEM_INFO macro prints numbers and total sizes of all buffers (requires including af/macros.h)
836 * \ref af::allocHost(): Allocates memory on host.
837 * \ref af::freeHost(): Frees host side memory allocated by arrayfire.
838 * OpenCL functions can now use CPU implementation.
839  * Currently limited to Unified Memory devices (CPU and On-board Graphics).
840  * Functions: af::matmul() and all [LAPACK](\ref linalg_mat) functions.
841  * Takes advantage of optimized libraries such as MKL without doing memory copies.
842  * Use the environment variable `AF_OPENCL_CPU_OFFLOAD=1` to take advantage of this feature.
843 * Functions specific to OpenCL backend.
844  * \ref afcl::addDevice(): Adds an external device and context to ArrayFire's device manager.
845  * \ref afcl::deleteDevice(): Removes an external device and context from ArrayFire's device manager.
846  * \ref afcl::setDevice(): Sets an external device and context from ArrayFire's device manager.
847  * \ref afcl::getDeviceType(): Gets the device type of the current device.
848  * \ref afcl::getPlatform(): Gets the platform of the current device.
849 * \ref af::createStridedArray() allows [array creation with user-defined strides](https://github.com/arrayfire/arrayfire/issues/1177) and device pointer.
850 * [Expose functions](https://github.com/arrayfire/arrayfire/issues/1131) that provide information
851  about memory layout of Arrays.
852  * \ref af::getStrides(): Gets the strides for each dimension of the array.
853  * \ref af::getOffset(): Gets the offsets for each dimension of the array.
854  * \ref af::getRawPtr(): Gets raw pointer to the location of the array on device.
855  * \ref af::isLinear(): Returns true if all elements in the array are contiguous.
856  * \ref af::isOwner(): Returns true if the array owns the raw pointer, false if it is a sub-array.
857  * \ref af::getStrides(): Gets the strides of the array.
859 * \ref af::getDeviceId(): Gets the device id on which the array resides.
860 * \ref af::isImageIOAvailable(): Returns true if ArrayFire was compiled with FreeImage enabled.
861 * \ref af::isLAPACKAvailable(): Returns true if ArrayFire was compiled with LAPACK functions enabled.
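
The memory-layout queries above (strides, offset, linearity) can be illustrated with a small standalone sketch. It does not call ArrayFire; `Dim4`, `packed_strides`, `linear_index` and `is_linear` are hypothetical names. It assumes column-major storage (the first dimension varies fastest) and, as a simplification, a single scalar element offset.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Dim4 = std::array<long long, 4>;

// Strides of a densely packed column-major array with the given dimensions.
Dim4 packed_strides(const Dim4& dims) {
    Dim4 s{1, 1, 1, 1};
    for (std::size_t i = 1; i < 4; ++i) s[i] = s[i - 1] * dims[i - 1];
    return s;
}

// Position of element idx in the underlying buffer, given strides and a base offset.
long long linear_index(const Dim4& idx, const Dim4& strides, long long offset) {
    long long pos = offset;
    for (std::size_t i = 0; i < 4; ++i) pos += idx[i] * strides[i];
    return pos;
}

// A view is treated as "linear" here when its strides match the packed strides
// for its own dimensions, i.e. its elements are contiguous.
bool is_linear(const Dim4& dims, const Dim4& strides) {
    return strides == packed_strides(dims);
}
```

A 4x3 array has packed strides `{1, 4, 12, 12}`. A view of its rows 1..2 keeps those strides but has dimensions `{2, 3, 1, 1}` and offset 1, so under this definition it is not linear.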
862 
863 Bug Fixes
864 --------------
865 
866 * Fixed [errors when using 3D / 4D arrays](https://github.com/arrayfire/arrayfire/pull/1251) in select and replace
867 * Fixed [JIT errors on AMD devices](https://github.com/arrayfire/arrayfire/pull/1238) for OpenCL backend.
868 * Fixed [imageio bugs](https://github.com/arrayfire/arrayfire/pull/1229) for 16 bit images.
869 * Fixed [bugs when loading and storing images](https://github.com/arrayfire/arrayfire/pull/1228) natively.
870 * Fixed [bug in FFT for NVIDIA GPUs](https://github.com/arrayfire/arrayfire/issues/615) when using OpenCL backend.
871 * Fixed [bug when using external context](https://github.com/arrayfire/arrayfire/pull/1241) with OpenCL backend.
872 * Fixed [memory leak](https://github.com/arrayfire/arrayfire/issues/1269) in \ref af_median_all().
873 * Fixed [memory leaks and performance](https://github.com/arrayfire/arrayfire/pull/1274) in graphics functions.
874 * Fixed [bugs when indexing followed by moddims](https://github.com/arrayfire/arrayfire/issues/1275).
875 * \ref af_get_revision() now returns actual commit rather than AF_REVISION.
876 * Fixed [releasing arrays](https://github.com/arrayfire/arrayfire/issues/1282) when using different backends.
877 * OS X OpenCL: [LAPACK functions](\ref linalg_mat) on CPU devices use OpenCL offload (previously threw errors).
878 * [Add support for 32-bit integer image types](https://github.com/arrayfire/arrayfire/pull/1287) in Image IO.
879 * Fixed [set operations for row vectors](https://github.com/arrayfire/arrayfire/issues/1300)
880 * Fixed [bugs](https://github.com/arrayfire/arrayfire/issues/1243) in \ref af::meanShift() and af::orb().
881 
882 Improvements
883 --------------
884 
885 * Optionally [offload BLAS and LAPACK](https://github.com/arrayfire/arrayfire/pull/1221) functions to CPU implementations to improve performance.
886 * Performance improvements to the memory manager.
887 * Error messages are now more detailed.
888 * Improved sorted order for OpenCL devices.
889 * JIT heuristics can now be tweaked using environment variables. See
890  [Environment Variables](\ref configuring_environment) tutorial.
891 * Add `BUILD_<BACKEND>` [options to examples and tests](https://github.com/arrayfire/arrayfire/issues/1286)
892  to toggle backends when compiling independently.
893 
894 Examples
895 ----------
896 
897 * New visualization [example simulating gravity](\ref graphics/gravity_sim.cpp).
898 
899 Build
900 ----------
901 
902 * Support for Intel `icc` compiler
903 * Support to compile with Intel MKL as a BLAS and LAPACK provider
904 * Tests are now available for building as standalone (like examples)
905 * Tests can now be built as a single file for each backend
906 * Better handling of NONFREE build options
907 * [Searching for GLEW in CMake default paths](https://github.com/arrayfire/arrayfire/pull/1292)
908 * Fixes for compiling with MKL on OSX.
909 
910 Installers
911 ----------
912 * Improvements to OSX Installer
913  * CMake config files are now installed with libraries
914  * Independent options for installing examples and documentation components
915 
916 Deprecations
917 -----------
918 
919 * `af_lock_device_arr` is now deprecated and will be removed in v4.0.0. Use \ref af_lock_array() instead.
920 * `af_unlock_device_arr` is now deprecated and will be removed in v4.0.0. Use \ref af_unlock_array() instead.
921 
922 Documentation
923 --------------
924 
925 * Fixes to documentation for \ref matchTemplate().
926 * Improved documentation for deviceInfo.
927 * Fixes to documentation for \ref exp().
928 
929 Known Issues
930 ------------
931 
932 * [Solve OpenCL fails on NVIDIA Maxwell devices](https://github.com/arrayfire/arrayfire/issues/1246)
933  for f32 and c32 when M > N and K % 4 is 1 or 2.
934 
935 
936 v3.2.2
937 ==============
938 
939 Bug Fixes
940 --------------
941 
942 * Fixed [memory leak](https://github.com/arrayfire/arrayfire/pull/1145) in
943  CUDA Random number generators
944 * Fixed [bug](https://github.com/arrayfire/arrayfire/issues/1157) in
945  af::select() and af::replace() tests
946 * Fixed [exception](https://github.com/arrayfire/arrayfire/issues/1164)
947  thrown when printing empty arrays with af::print()
948 * Fixed [bug](https://github.com/arrayfire/arrayfire/issues/1170) in CPU
949  random number generation. Changed the generator to
950  [mt19937](http://en.cppreference.com/w/cpp/numeric/random)
951 * Fixed exception handling (internal)
952  * [Exceptions](https://github.com/arrayfire/arrayfire/issues/1188)
953  now show function, short file name and line number
954  * Added [AF_RETURN_ERROR](https://github.com/arrayfire/arrayfire/issues/1186)
955  macro to handle returning errors.
956  * Removed THROW macro, and renamed AF_THROW_MSG to AF_THROW_ERR.
957 * Fixed [bug](https://github.com/arrayfire/arrayfire/commit/9459c6)
958  in \ref af::identity() that may have affected CUDA Compute 5.2 cards
959 
960 
961 Build
962 ------
963 * Added a [MIN_BUILD_TIME](https://github.com/arrayfire/arrayfire/issues/1193)
964  option to build with minimum optimization compiler flags resulting in faster
965  compile times
966 * Fixed [issue](https://github.com/arrayfire/arrayfire/issues/1143) in CBLAS
967  detection by CMake
968 * Fixed tests failing for builds without optional components
969  [FreeImage](https://github.com/arrayfire/arrayfire/issues/1143) and
970  [LAPACK](https://github.com/arrayfire/arrayfire/issues/1167)
971 * Added a [test](https://github.com/arrayfire/arrayfire/issues/1192)
972  for unified backend
973 * Only [info and backend tests](https://github.com/arrayfire/arrayfire/issues/1192)
974  are now built for unified backend
975 * [Sort tests](https://github.com/arrayfire/arrayfire/issues/1199)
976  alphabetically for execution
977 * Fixed compilation flags and errors in tests and examples
978 * [Moved AF_REVISION and AF_COMPILER_STR](https://github.com/arrayfire/arrayfire/commit/2287c5)
979  into src/backend. Because the revision changes with every commit, the old
980  layout forced all of ArrayFire to be rebuilt each time.
981  * v3.3 will add an af_get_revision() function to get the revision string.
982 * [Clean up examples](https://github.com/arrayfire/arrayfire/pull/1158)
983  * Remove getchar for Windows (this will be handled by the installer)
984  * Other miscellaneous code cleanup
985  * Fixed bug in [plot3.cpp](\ref graphics/plot3.cpp) example
986 * [Rename](https://github.com/arrayfire/arrayfire/commit/35f0fc2) clBLAS/clFFT
987  external project suffix from external -> ext
988 * [Add OpenBLAS](https://github.com/arrayfire/arrayfire/pull/1197) as a
989  lapack/lapacke alternative
990 
991 Improvements
992 ------------
993 * Added \ref AF_MEM_INFO macro to print memory info from ArrayFire's memory
994  manager ([cross issue](https://github.com/arrayfire/arrayfire/issues/1172))
995 * Added [additional paths](https://github.com/arrayfire/arrayfire/issues/1184)
996  for searching for `libaf*` for Unified backend on unix-style OS.
997  * Note: This still requires dependencies such as forge, CUDA, NVVM etc to be
998  in `LD_LIBRARY_PATH` as described in [Unified Backend](\ref unifiedbackend)
999 * [Create streams](https://github.com/arrayfire/arrayfire/commit/ed0373f)
1000  for devices only when required in CUDA Backend
1001 
1002 Documentation
1003 ------
1004 * [Hide scrollbars](https://github.com/arrayfire/arrayfire/commit/9d218a5)
1005  appearing for pre and code styles
1006 * Fix [documentation](https://github.com/arrayfire/arrayfire/commit/ac09f91) for af::replace
1007 * Add [code sample](https://github.com/arrayfire/arrayfire/commit/4e06483)
1008  for converting the output of af::getAvailableBackends() into bools
1009 * Minor fixes in documentation
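
Converting the bit mask returned by af::getAvailableBackends() into booleans, as mentioned in the code-sample item above, might look like the following sketch. It is standalone and does not include ArrayFire headers: the constants are assumed to mirror the `af_backend` enum values (CPU = 1, CUDA = 2, OpenCL = 4), and `AvailableBackends` and `decode_backends` are hypothetical names.

```cpp
#include <cassert>

// Assumed to match ArrayFire's af_backend flags; verify against af/defines.h.
constexpr int kBackendCpu    = 1;
constexpr int kBackendCuda   = 2;
constexpr int kBackendOpenCL = 4;

struct AvailableBackends {
    bool cpu;
    bool cuda;
    bool opencl;
};

// Test each flag bit in the mask and report it as a boolean.
AvailableBackends decode_backends(int mask) {
    return AvailableBackends{
        (mask & kBackendCpu) != 0,
        (mask & kBackendCuda) != 0,
        (mask & kBackendOpenCL) != 0,
    };
}
```

In a program linked against ArrayFire, the mask would come from `af::getAvailableBackends()` instead of a literal; a mask of `5` would decode to CPU and OpenCL being available.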
1010 
1011 v3.2.1
1012 ==============
1013 
1014 Bug Fixes
1015 --------------
1016 
1017 * Fixed [bug](https://github.com/arrayfire/arrayfire/pull/1136) in homography()
1018 * Fixed [bug](https://github.com/arrayfire/arrayfire/issues/1135) in behavior
1019  of af::array::device()
1020 * Fixed [bug](https://github.com/arrayfire/arrayfire/issues/1129) when
1021  indexing with span along trailing dimension
1022 * Fixed [bug](https://github.com/arrayfire/arrayfire/issues/1127) when
1023  indexing in [GFor](\ref gfor)
1024 * Fixed [bug](https://github.com/arrayfire/arrayfire/issues/1122) in CPU
1025  information fetching
1026 * Fixed compilation [bug](https://github.com/arrayfire/arrayfire/issues/1117)
1027  in unified backend caused by missing link library
1028 * Add [missing symbol](https://github.com/arrayfire/arrayfire/pull/1114) for
1029  af_draw_surface()
1030 
1031 Build
1032 ------
1033 * Tests can now be used as a [standalone project](https://github.com/arrayfire/arrayfire/pull/1120)
1034  * Tests can now be built using pre-compiled libraries
1035  * Similar to how the examples are built
1036 * The install target now installs the examples source irrespective of the
1037  BUILD_EXAMPLES value
1038  * Examples are not built if BUILD_EXAMPLES is off
1039 
1040 Documentation
1041 ------
1042 * HTML documentation is now [built and installed](https://github.com/arrayfire/arrayfire/pull/1109)
1043  in docs/html
1044 * Added documentation for \ref af::seq class
1045 * Updated [Matrix Manipulation](\ref matrixmanipulation) tutorial
1046 * Examples list is now generated by CMake
1047  * <a href="examples.htm">Examples</a> are now listed as dir/example.cpp
1048 * Removed dummy groups used for indexing documentation (affected doxygen < 1.8.9)
1049 
1050 v3.2.0
1051 =================
1052 
1053 Major Updates
1054 -------------
1055 
1056 * Added Unified backend
1057  * Allows switching backends at runtime
1058  * Read [Unified Backend](\ref unifiedbackend) for more.
1059 * Support for 16-bit integers (\ref s16 and \ref u16)
1060  * All functions that support 32-bit integer types (\ref s32, \ref u32),
1061  now also support 16-bit integer types
1062 
1063 Function Additions
1064 ------------------
1065 * Unified Backend
1066  * \ref setBackend() - Sets a backend as active
1067  * \ref getBackendCount() - Gets the number of backends available for use
1068  * \ref getAvailableBackends() - Returns information about available backends
1069  * \ref getBackendId() - Gets the backend enum for an array
1070 
1071 * Vision
1072  * \ref homography() - Homography estimation
1073  * \ref gloh() - GLOH Descriptor for SIFT
1074 
1075 * Image Processing
1076  * \ref loadImageNative() - Load an image as native data without modification
1077  * \ref saveImageNative() - Save an image without modifying data or type
1078 
1079 * Graphics
1080  * \ref af::Window::plot3() - 3-dimensional line plot
1081  * \ref af::Window::surface() - 3-dimensional curve plot
1082 
1083 * Indexing
1084  * \ref af_create_indexers()
1085  * \ref af_set_array_indexer()
1086  * \ref af_set_seq_indexer()
1087  * \ref af_set_seq_param_indexer()
1088  * \ref af_release_indexers()
1089 
1090 * CUDA Backend Specific
1091  * \ref setNativeId() - Set the CUDA device with given native id as active
1092  * ArrayFire uses a modified order for devices. The native id for a
1093  device can be retrieved using `nvidia-smi`
1094 
1095 * OpenCL Backend Specific
1096  * \ref setDeviceId() - Set the OpenCL device using the `clDeviceId`
1097 
1098 Other Improvements
1099 ------------------------
1100 * Added \ref c32 and \ref c64 support for \ref isNaN(), \ref isInf() and \ref iszero()
1101 * Added CPU information for `x86` and `x86_64` architectures in CPU backend's \ref info()
1102 * Batch support for \ref approx1() and \ref approx2()
1103  * Now can be used with gfor as well
1104 * Added \ref s64 and \ref u64 support to:
1105  * \ref sort() (along with sort index and sort by key)
1106  * \ref setUnique(), \ref setUnion(), \ref setIntersect()
1107  * \ref convolve() and \ref fftConvolve()
1108  * \ref histogram() and \ref histEqual()
1109  * \ref lookup()
1110  * \ref mean()
1111 * Added \ref AF_MSG macro
1112 
1113 Build Improvements
1114 ------------------
1115 * Submodules update is now automatically called if not cloned recursively
1116 * [Fixes for compilation](https://github.com/arrayfire/arrayfire/issues/766) on Visual Studio 2015
1117 * Option to use [fallback to CPU LAPACK](https://github.com/arrayfire/arrayfire/pull/1053)
1118  for linear algebra functions in case of CUDA 6.5 or older versions.
1119 
1120 Bug Fixes
1121 --------------
1122 * Fixed [memory leak](https://github.com/arrayfire/arrayfire/pull/1096) in \ref susan()
1123 * Fixed [failing test](https://github.com/arrayfire/arrayfire/commit/144a2db)
1124  in \ref lower() and \ref upper() for CUDA compute 53
1125 * Fixed [bug](https://github.com/arrayfire/arrayfire/issues/1092) in CUDA for indexing out of bounds
1126 * Fixed [dims check](https://github.com/arrayfire/arrayfire/commit/6975da8) in \ref iota()
1127 * Fixed [out-of-bounds access](https://github.com/arrayfire/arrayfire/commit/7fc3856) in \ref sift()
1128 * Fixed [memory allocation](https://github.com/arrayfire/arrayfire/commit/5e88e4a) in \ref fast() OpenCL
1129 * Fixed [memory leak](https://github.com/arrayfire/arrayfire/pull/994) in image I/O functions
1130 * \ref dog() now returns floating-point type arrays
1131 
1132 Documentation Updates
1133 ---------------------
1134 * Improved tutorials documentation
1135  * More detailed Using on [Linux](\ref using_on_linux), [OSX](\ref using_on_osx),
1136  [Windows](\ref using_on_windows) pages.
1137 * Added return type information for functions that return different type
1138  arrays
1139 
1140 New Examples
1141 ------------
1142 * Graphics
1143  * [Plot3](\ref graphics/plot3.cpp)
1144  * [Surface](\ref graphics/surface.cpp)
1145 * [Shallow Water Equation](\ref pde/swe.cpp)
1146 * [Basic](\ref unified/basic.cpp) as a Unified backend example
1147 
1148 Installers
1149 -----------
1150 * All installers now include the Unified backend and corresponding CMake files
1151 * Visual Studio projects include Unified in the Platform Configurations
1152 * Added installer for Jetson TX1
1153 * SIFT and GLOH do not ship with the installers as SIFT is protected by
1154  patents that do not allow commercial distribution without licensing.
1155 
1156 v3.1.3
1157 ==============
1158 
1159 Bug Fixes
1160 ---------
1161 
1162 * Fixed [bugs](https://github.com/arrayfire/arrayfire/issues/1042) in various OpenCL kernels without offset additions
1163 * Remove ARCH_32 and ARCH_64 flags
1164 * Fix [missing symbols](https://github.com/arrayfire/arrayfire/issues/1040) when freeimage is not found
1165 * Use CUDA driver version for Windows
1166 * Improvements to SIFT
1167 * Fixed [memory leak](https://github.com/arrayfire/arrayfire/issues/1045) in median
1168 * Fixes for Windows compilation when not using MKL [#1047](https://github.com/arrayfire/arrayfire/issues/1047)
1169 * Fixes for building without LAPACK
1170 
1171 Other
1172 -------
1173 
1174 * Documentation: Fixed documentation for select and replace
1175 * Documentation: Fixed documentation for af_isnan
1176 
1177 v3.1.2
1178 ==============
1179 
1180 Bug Fixes
1181 ---------
1182 
1183 * Fixed [bug](https://github.com/arrayfire/arrayfire/commit/4698f12) in assign that was causing test to fail
1184 * Fixed bug in convolve. Frequency condition now depends on kernel size only
1185 * Fixed [bug](https://github.com/arrayfire/arrayfire/issues/1005) in indexed reductions for complex type in OpenCL backend
1186 * Fixed [bug](https://github.com/arrayfire/arrayfire/issues/1006) in kernel name generation in ireduce for OpenCL backend
1187 * Fixed non-linear to linear indices in ireduce
1188 * Fixed [bug](https://github.com/arrayfire/arrayfire/issues/1011) in reductions for small arrays
1189 * Fixed [bug](https://github.com/arrayfire/arrayfire/issues/1010) in histogram for indexed arrays
1190 * Fixed [compiler error](https://github.com/arrayfire/arrayfire/issues/1015) CPUID for non-compliant devices
1191 * Fixed [failing tests](https://github.com/arrayfire/arrayfire/issues/1008) on i386 platforms
1192 * Add missing AFAPI
1193 
1194 Other
1195 -------
1196 
1197 * Documentation: Added missing examples and other corrections
1198 * Documentation: Fixed warnings in documentation building
1199 * Installers: Send error messages to log file in OSX Installer
1200 
1201 v3.1.1
1202 ==============
1203 
1204 Installers
1205 -----------
1206 
1207 * CUDA backend now depends on CUDA 7.5 toolkit
1208 * OpenCL backend now requires OpenCL 1.2 or greater
1209 
1210 Bug Fixes
1211 --------------
1212 
1213 * Fixed [bug](https://github.com/arrayfire/arrayfire/issues/981) in reductions after indexing
1214 * Fixed [bug](https://github.com/arrayfire/arrayfire/issues/976) in indexing when using reverse indices
1215 
1216 Build
1217 ------
1218 
1219 * `cmake` now includes `PKG_CONFIG` in the search path for CBLAS and LAPACKE libraries
1220 * [heston_model.cpp](\ref financial/heston_model.cpp) example now builds with the default ArrayFire cmake files after installation
1221 
1222 Other
1223 ------
1224 
1225 * Fixed bug in [image_editing.cpp](\ref image_processing/image_editing.cpp)
1226 
1227 v3.1.0
1228 ==============
1229 
1230 Function Additions
1231 ------------------
1232 * Computer Vision Functions
1233  * \ref nearestNeighbour() - Nearest Neighbour with SAD, SSD and SHD distances
1234  * \ref harris() - Harris Corner Detector
1235  * \ref susan() - Susan Corner Detector
1236  * \ref sift() - Scale Invariant Feature Transform (SIFT)
1237  * "Method and apparatus for identifying scale invariant features
1238  in an image and use of same for locating an object in an image," David
1239  G. Lowe, US Patent 6,711,293 (March 23, 2004). Provisional application
1240  filed March 8, 1999. Assignee: The University of British Columbia. For
1241  further details, contact David Lowe (lowe@cs.ubc.ca) or the
1242  University-Industry Liaison Office of the University of British
1243  Columbia.
1244  * SIFT is available for compiling but does not ship with ArrayFire
1245  hosted installers/pre-built libraries
1246  * \ref dog() - Difference of Gaussians
1247 
1248 * Image Processing Functions
1249  * \ref ycbcr2rgb() and \ref rgb2ycbcr() - RGB <-> YCbCr color space conversion
1250  * \ref wrap() and \ref unwrap() - Wrap and Unwrap
1251  * \ref sat() - Summed Area Tables
1252  * \ref loadImageMem() and \ref saveImageMem() - Load and Save images to/from memory
1253  * \ref af_image_format - Added imageFormat (af_image_format) enum
1254 
1255 * Array & Data Handling
1256  * \ref copy() - Copy
1257  * array::lock() and array::unlock() - Lock and Unlock
1258  * \ref select() and \ref replace() - Select and Replace
1259  * Get array reference count (af_get_data_ref_count)
1260 
1261 * Signal Processing
1262  * \ref fftInPlace() - 1D in place FFT
1263  * \ref fft2InPlace() - 2D in place FFT
1264  * \ref fft3InPlace() - 3D in place FFT
1265  * \ref ifftInPlace() - 1D in place Inverse FFT
1266  * \ref ifft2InPlace() - 2D in place Inverse FFT
1267  * \ref ifft3InPlace() - 3D in place Inverse FFT
1268  * \ref fftR2C() - Real to complex FFT
1269  * \ref fftC2R() - Complex to Real FFT
1270 
1271 * Linear Algebra
1272  * \ref svd() and \ref svdInPlace() - Singular Value Decomposition
1273 
1274 * Other operations
1275  * \ref sigmoid() - Sigmoid
1276  * Sum (with option to replace NaN values)
1277  * Product (with option to replace NaN values)
1278 
1279 * Graphics
1280  * Window::setSize() - Window resizing using Forge API
1281 
1282 * Utility
1283  * Allow users to set print precision (print, af_print_array_gen)
1284  * \ref saveArray() and \ref readArray() - Stream arrays to binary files
1285  * \ref toString() - toString function returns the array and data as a string
1286 
1287 * CUDA specific functionality
1288  * \ref getStream() - Returns default CUDA stream ArrayFire uses for the current device
1289  * \ref getNativeId() - Returns native id of the CUDA device
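
The "with option to replace NaN values" entries for Sum and Product above can be illustrated with a small, self-contained sketch. It does not use ArrayFire; `sum_replacing_nan` and `product_replacing_nan` are hypothetical helper names chosen for this illustration, and the sketch only shows the semantics of substituting a caller-supplied value for each NaN before reducing, not how the library implements it.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper (not an ArrayFire function): sums `values`,
// substituting `nan_value` for every NaN element before accumulating.
double sum_replacing_nan(const std::vector<double>& values, double nan_value) {
    double total = 0.0;
    for (double v : values) {
        total += std::isnan(v) ? nan_value : v;
    }
    return total;
}

// Hypothetical helper (not an ArrayFire function): multiplies `values`,
// substituting `nan_value` for every NaN element before accumulating.
double product_replacing_nan(const std::vector<double>& values, double nan_value) {
    double total = 1.0;
    for (double v : values) {
        total *= std::isnan(v) ? nan_value : v;
    }
    return total;
}
```

A replacement of 0 for a sum, or 1 for a product, makes NaN elements neutral, so the reduction effectively skips them.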
1290 
1291 Improvements
1292 ------------
1293 * dot
1294  * Allow complex inputs with conjugate option
1295 * AF_INTERP_LOWER interpolation
1296  * For resize, rotate and transform based functions
1297 * 64-bit integer support
1298  * For reductions, random, iota, range, diff1, diff2, accum, join, shift
1299  and tile
1300 * convolve
1301  * Support for non-overlapping batched convolutions
1302 * Complex Arrays
1303  * Fix binary ops on complex inputs of mixed types
1304  * Complex type support for exp
1305 * tile
1306  * Performance improvements by using JIT when possible.
1307 * Add AF_API_VERSION macro
1308  * Allows disabling of API to maintain consistency with previous versions
1309 * Other Performance Improvements
1310  * Use reference counting to reduce unnecessary copies
1311 * CPU Backend
1312  * Device properties for CPU
1313  * Improved performance when all buffers are indexed linearly
1314 * CUDA Backend
1315  * Use streams in CUDA (no longer using default stream)
1316  * Using async cudaMem ops
1317  * Add 64-bit integer support for JIT functions
1318  * Performance improvements for CUDA JIT for non-linear 3D and 4D arrays
1319 * OpenCL Backend
1320  * Improve compilation times for OpenCL backend
1321  * Performance improvements for non-linear JIT kernels on OpenCL
1322  * Improved shared memory load/store in many OpenCL kernels (PR 933)
1323  * Using cl.hpp v1.2.7
1324 
1325 Bug Fixes
1326 ---------
1327 * Common
1328  * Fix compatibility of c32/c64 arrays when operating with scalars
1329  * Fix median for all values of an array
1330  * Fix double free issue when indexing (30cbbc7)
1331  * Fix [bug](https://github.com/arrayfire/arrayfire/issues/901) in rank
1332  * Fix default values for scale throwing exception
1333  * Fix conjg raising exception on real input
1334  * Fix bug when using conjugate transpose for vector input
1335  * Fix issue with const input for array_proxy::get()
1336 * CPU Backend
1337  * Fix randn generating same sequence for multiple calls
1338  * Fix setSeed for randu
1339  * Fix casting to and from complex
1340  * Check NULL values when allocating memory
1341  * Fix [offset issue](https://github.com/arrayfire/arrayfire/issues/923) for CPU element-wise operations
1342 
1343 New Examples
1344 ------------
1345 * Match Template
1346 * Susan
1347 * Heston Model (contributed by Michael Nowotny)
1348 
1349 Installer
1350 ----------
1351 * Fixed bug in automatic detection of ArrayFire when used with CMake on Windows
1352 * The Linux libraries are now compiled with a static version of FreeImage
1353 
1354 Known Issues
1355 ------------
1356 * OpenBLAS can cause issues with QR factorization in CPU backend
1357 * FreeImage older than 3.10 can cause issues with loadImageMem and
1358  saveImageMem
1359 * OpenCL backend issues on OSX
1360  * AMD GPUs not supported because of driver issues
1361  * Intel CPUs not supported
1362  * Linear algebra functions do not work on Intel GPUs.
1363 * Stability and correctness issues with open source OpenCL implementations such as Beignet and GalliumCompute.
1364 
1365 v3.0.2
1366 ==============
1367 
1368 Bug Fixes
1369 --------------
1370 
1371 * Added missing symbols from the compatible API
1372 * Fixed a bug affecting corner rows and elements in \ref grad()
1373 * Fixed linear interpolation bugs affecting large images in the following:
1374  - \ref approx1()
1375  - \ref approx2()
1376  - \ref resize()
1377  - \ref rotate()
1378  - \ref scale()
1379  - \ref skew()
1380  - \ref transform()
1381 
1382 Documentation
1383 -----------------
1384 
1385 * Added missing documentation for \ref constant()
1386 * Added missing documentation for `array::scalar()`
1387 * Added supported input types for functions in `arith.h`
1388 
1389 v3.0.1
1390 ==============
1391 
1392 Bug Fixes
1393 --------------
1394 
1395 * Fixed header to work in Visual Studio 2015
1396 * Fixed a bug in batched mode for FFT based convolutions
1397 * Fixed graphics issues on OSX
1398 * Fixed various bugs in visualization functions
1399 
1400 Other improvements
1401 ---------------
1402 
1403 * Improved fractal example
1404 * New OSX installer
1405 * Improved Windows installer
1406  * Default install path has been changed
1407 * Fixed bug in machine learning examples
1408 
1409 <br>
1410 
1411 v3.0.0
1412 =================
1413 
1414 Major Updates
1415 -------------
1416 
1417 * ArrayFire is now open source
1418 * Major changes to the visualization library
1419 * Introducing handle based C API
1420 * New backend: CPU fallback available for systems without GPUs
1421 * Dense linear algebra functions available for all backends
1422 * Support for 64 bit integers
1423 
1424 Function Additions
1425 ------------------
1426 * Data generation functions
1427  * range()
1428  * iota()
1429 
1430 * Computer Vision Algorithms
1431  * features()
1432  * A data structure to hold features
1433  * fast()
1434  * FAST feature detector
1435  * orb()
1436  * ORB feature descriptor extractor
1437 
1438 * Image Processing
1439  * convolve1(), convolve2(), convolve3()
1440  * Specialized versions of convolve() to enable better batch support
1441  * fftconvolve1(), fftconvolve2(), fftconvolve3()
1442  * Convolutions in frequency domain to support larger kernel sizes
1443  * dft(), idft()
1444  * Unified functions for calling multi-dimensional FFTs.
1445  * matchTemplate()
1446  * Match a kernel in an image
1447  * sobel()
1448  * Get Sobel gradients of an image
1449  * rgb2hsv(), hsv2rgb(), rgb2gray(), gray2rgb()
1450  * Explicit function calls to colorspace conversions
1451  * erode3d(), dilate3d()
1452  * Explicit erode and dilate calls for image morphing
1453 
1454 * Linear Algebra
1455  * matmulNT(), matmulTN(), matmulTT()
1456  * Specialized versions of matmul() for transposed inputs
1457  * luInPlace(), choleskyInPlace(), qrInPlace()
1458  * In place factorizations to improve memory requirements
1459  * solveLU()
1460  * Specialized solve routines to improve performance
1461  * OpenCL backend now supports Linear Algebra functions
1462 
1463 * Other functions
1464  * lookup() - lookup indices from a table
1465  * batchFunc() - helper function to perform batch operations
1466 
1467 * Visualization functions
1468  * Support for multiple windows
1469  * window.hist()
1470  * Visualize the output of the histogram
1471 
1472 * C API
1473  * Removed old pointer based C API
1474  * Introducing handle based C API
1475  * Just In Time compilation available in C API
1476  * C API has feature parity with C++ API
1477  * bessel functions removed
1478  * cross product functions removed
1479  * Kronecker product functions removed
1480 
1481 Performance Improvements
1482 ------------------------
1483 * Improvements across the board for OpenCL backend
1484 
1485 API Changes
1486 ---------------------
1487 * `print` is now af_print()
1488 * seq(): The step parameter is now the third input
1489  * seq(start, step, end) changed to seq(start, end, step)
1490 * gfor(): The iterator now needs to be seq()
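
The seq() argument reordering listed above is easy to miss, because a call written for the old order still compiles under the new one. The sketch below is a self-contained illustration that does not use ArrayFire: `expand_seq` is a hypothetical helper that takes its arguments in the new (start, end, step) order and lists the values such a sequence would cover. Treating the end value as inclusive is an assumption of this sketch.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in (not ArrayFire's seq class): expands a sequence
// given in the new argument order (start, end, step).
// Assumption of this sketch: `end` is inclusive.
std::vector<double> expand_seq(double start, double end, double step = 1.0) {
    std::vector<double> out;
    if (step == 0.0) {
        return out;  // a zero step would never advance
    }
    const double span = (end - start) / step;
    if (span < 0.0) {
        return out;  // step points away from `end`
    }
    // Compute each element from its index to avoid accumulating rounding error.
    const std::size_t count = static_cast<std::size_t>(std::floor(span)) + 1;
    for (std::size_t i = 0; i < count; ++i) {
        out.push_back(start + static_cast<double>(i) * step);
    }
    return out;
}
```

With this helper, `expand_seq(0, 9, 2)` yields 0, 2, 4, 6, 8. Passing the old-order literals unchanged, `expand_seq(0, 2, 9)`, is read as start 0, end 2, step 9 and yields only 0, which is why calls written against the old signature need their last two arguments swapped.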
1491 
1492 Deprecated Function APIs
1493 ------------------------
1494 Deprecated APIs are in af/compatible.h
1495 
1496 * devicecount() changed to getDeviceCount()
1497 * deviceset() changed to setDevice()
1498 * deviceget() changed to getDevice()
1499 * loadimage() changed to loadImage()
1500 * saveimage() changed to saveImage()
1501 * gaussiankernel() changed to gaussianKernel()
1502 * alltrue() changed to allTrue()
1503 * anytrue() changed to anyTrue()
1504 * setunique() changed to setUnique()
1505 * setunion() changed to setUnion()
1506 * setintersect() changed to setIntersect()
1507 * histequal() changed to histEqual()
1508 * colorspace() changed to colorSpace()
1509 * filter() deprecated. Use convolve1() and convolve2()
1510 * mul() changed to product()
1511 * deviceprop() changed to deviceProp()
1512 
1513 Known Issues
1514 ----------------------
1515 * OpenCL backend issues on OSX
1516  * AMD GPUs not supported because of driver issues
1517  * Intel CPUs not supported
1518  * Linear algebra functions do not work on Intel GPUs.
1519 * Stability and correctness issues with open source OpenCL implementations such as Beignet and GalliumCompute.