To enable Intel TBB, CUDA, and CPU specific compiler optimizations... one will almost certainly need to re-build the library, and customize your application build.
Some tasks degrade in performance on a GPU, and others are 740 times faster... ymmv. =3