I'm writing an OpenCL 1.2 backend for TensorFlow at https://github.com/hughperkins/tensorflow-cl
This fork of TensorFlow for OpenCL has the following characteristics:
- it targets any/all OpenCL 1.2 devices. It doesn't need OpenCL 2.0, doesn't need SPIR-V or SPIR, doesn't need Shared Virtual Memory, and so on
- it's based on an underlying library called 'cuda-on-cl', https://github.com/hughperkins/cuda-on-cl
- cuda-on-cl aims to take any NVIDIA® CUDA™ source code and compile it for OpenCL 1.2 devices. It's a very general goal, and a very general compiler
- for now, the following functionalities are implemented (see the usage sketches after this list):
- per-element operations, using Eigen over OpenCL (more info at https://bitbucket.org/hughperkins/eigen/src/eigen-cl/unsupported/test/cuda-on-cl/?at=eigen-cl )
- BLAS / matrix multiplication, using Cedric Nugteren's CLBlast https://github.com/cnugteren/CLBlast
- reductions, argmin, argmax, again using Eigen, as per earlier info and links
- learning, trainers, gradients: at least the StochasticGradientDescent trainer is working; the others are committed, but not yet tested
- it is developed on Ubuntu 16.04 (using an Intel HD 5500 and NVIDIA GPUs) and macOS Sierra (using an Intel HD 530 and a Radeon Pro 450)
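
To give a concrete feel for what "implemented" means here, below is a minimal sketch exercising the per-element, matmul, and reduction paths through the ordinary TensorFlow Python API. This is an illustration rather than code from the repo's test suite; in particular, the `/gpu:0` device string is an assumption that the fork exposes the OpenCL device under TensorFlow's usual GPU device name, and the shapes are arbitrary.

```python
import numpy as np
import tensorflow as tf

# Assumption: this fork maps the OpenCL device to '/gpu:0'.
with tf.device('/gpu:0'):
    a = tf.placeholder(tf.float32, shape=[4, 4])
    b = tf.placeholder(tf.float32, shape=[4, 4])
    c = a + b * 2.0       # per-element ops (Eigen over OpenCL)
    d = tf.matmul(a, b)   # matrix multiplication (CLBlast)
    s = tf.reduce_sum(d)  # reduction (Eigen)
    m = tf.argmax(d, 1)   # argmax (Eigen)

with tf.Session() as sess:
    feed = {a: np.random.randn(4, 4).astype(np.float32),
            b: np.random.randn(4, 4).astype(np.float32)}
    print(sess.run([c, d, s, m], feed_dict=feed))
```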
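Similarly, here is a minimal sketch of a training run using the one trainer reported working, stochastic gradient descent, which I'm taking to be `tf.train.GradientDescentOptimizer` here. Again, the device mapping and the toy linear model are assumptions for illustration only.

```python
import numpy as np
import tensorflow as tf

# Assumption: this fork maps the OpenCL device to '/gpu:0'.
with tf.device('/gpu:0'):
    x = tf.placeholder(tf.float32, shape=[None, 1])
    y = tf.placeholder(tf.float32, shape=[None, 1])
    w = tf.Variable(tf.zeros([1, 1]))
    b = tf.Variable(tf.zeros([1]))
    pred = tf.matmul(x, w) + b                     # matmul + per-element add
    loss = tf.reduce_mean(tf.square(pred - y))     # per-element ops + reduction
    train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

# Toy data for y = 3x + 1
xs = np.random.randn(64, 1).astype(np.float32)
ys = 3.0 * xs + 1.0

with tf.Session() as sess:
    # On older TensorFlow versions this is tf.initialize_all_variables()
    sess.run(tf.global_variables_initializer())
    for step in range(100):
        _, l = sess.run([train_op, loss], feed_dict={x: xs, y: ys})
    print('final loss:', l)
```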
This is not the only OpenCL fork of TensorFlow available. There is also a fork being developed by Codeplay (https://www.codeplay.com), using ComputeCpp (https://www.codeplay.com/products/computesuite/computecpp). As far as I know, their fork has stronger requirements than my own in terms of which specific GPU devices it works on; you would need to check the Platform Support Notes (at the bottom of the ComputeCpp page) to determine whether your device is supported. The Codeplay fork is actually an official Google fork, which is here: https://github.com/benoitsteiner/tensorflow-opencl