GPU WU's

[VENETO] boboviz
Message 105834 - Posted: 5 Apr 2022, 6:43:22 UTC - in response to Message 105285.  

Orochi

Orochi is a library that loads the HIP and CUDA APIs dynamically, allowing the user to switch between APIs at runtime, so you don't need to compile two separate implementations for each API. This lets you compile and maintain a single binary that can run on both AMD and NVIDIA GPUs. Unlike HIP, which is bound to hipamd or CUDA at compile time, Orochi dynamically loads the corresponding HIP/CUDA shared libraries depending on your platform. In other words, it combines the functionality offered by HIPEW and CUEW in a single library.
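
To make the idea concrete, here is a rough sketch of the dynamic-loading technique, not Orochi's actual API: probe for the HIP runtime first, fall back to the CUDA runtime, and resolve the device-count entry point from whichever library loaded. The Linux library names and the choice of runtime entry points are assumptions.

    // sketch.cpp -- build with: g++ sketch.cpp -ldl
    // Illustrates runtime backend selection in the spirit of Orochi; NOT the Orochi API itself.
    #include <dlfcn.h>
    #include <cstdio>

    int main() {
        // Try the HIP runtime first, then the CUDA runtime (Linux library names assumed).
        void* lib = dlopen("libamdhip64.so", RTLD_NOW);
        const char* symbol  = "hipGetDeviceCount";
        const char* backend = "HIP";
        if (!lib) {
            lib = dlopen("libcudart.so", RTLD_NOW);
            symbol  = "cudaGetDeviceCount";
            backend = "CUDA";
        }
        if (!lib) { std::fprintf(stderr, "no GPU runtime found\n"); return 1; }

        // hipError_t / cudaError_t are plain enums, so an int return type is ABI-compatible here.
        using CountFn = int (*)(int*);
        auto getCount = reinterpret_cast<CountFn>(dlsym(lib, symbol));
        if (!getCount) { std::fprintf(stderr, "%s not found\n", symbol); return 1; }

        int n = 0;
        if (getCount(&n) == 0)   // 0 means success for both runtimes
            std::printf("%s backend: %d device(s)\n", backend, n);
        return 0;
    }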

[VENETO] boboviz
Message 107904 - Posted: 29 Dec 2022, 15:37:13 UTC

[VENETO] boboviz
Message 108089 - Posted: 19 Feb 2023, 22:43:16 UTC

Intel has open-sourced its CPU OpenCL runtime.
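
For anyone who wants to check whether a CPU OpenCL runtime (Intel's or any other) is visible on their machine, a small query like the one below does it. Minimal sketch using only the standard OpenCL C API, nothing Intel-specific.

    // list_cpu_cl.cpp -- build with: g++ list_cpu_cl.cpp -lOpenCL
    #define CL_TARGET_OPENCL_VERSION 120
    #include <CL/cl.h>
    #include <cstdio>
    #include <vector>

    int main() {
        cl_uint nplat = 0;
        clGetPlatformIDs(0, nullptr, &nplat);
        if (nplat == 0) { std::printf("no OpenCL platforms installed\n"); return 0; }
        std::vector<cl_platform_id> plats(nplat);
        clGetPlatformIDs(nplat, plats.data(), nullptr);

        for (cl_platform_id p : plats) {
            char pname[256] = {0};
            clGetPlatformInfo(p, CL_PLATFORM_NAME, sizeof(pname), pname, nullptr);

            cl_uint ndev = 0;
            if (clGetDeviceIDs(p, CL_DEVICE_TYPE_CPU, 0, nullptr, &ndev) != CL_SUCCESS || ndev == 0)
                continue;   // this platform exposes no CPU device

            std::vector<cl_device_id> devs(ndev);
            clGetDeviceIDs(p, CL_DEVICE_TYPE_CPU, ndev, devs.data(), nullptr);
            for (cl_device_id d : devs) {
                char dname[256] = {0};
                clGetDeviceInfo(d, CL_DEVICE_NAME, sizeof(dname), dname, nullptr);
                std::printf("platform \"%s\": CPU device \"%s\"\n", pname, dname);
            }
        }
        return 0;
    }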

[VENETO] boboviz
Message 108145 - Posted: 4 Mar 2023, 19:24:33 UTC - in response to Message 108089.  

[VENETO] boboviz
Message 108606 - Posted: 21 Sep 2023, 9:20:54 UTC
Last modified: 21 Sep 2023, 9:21:59 UTC

hipSYCL/Open SYCL has been renamed to AdaptiveCpp.

AdaptiveCpp is the independent, community-driven modern platform for C++-based heterogeneous programming models targeting CPUs and GPUs from all major vendors. AdaptiveCpp lets applications adapt themselves to all the hardware found in the system. This includes use cases where a single binary needs to be able to target all supported hardware, or utilize hardware from different vendors simultaneously.

It currently supports the following programming models:

SYCL: At its core is a SYCL implementation that supports many use cases and approaches of implementing SYCL.
C++ standard parallelism: Additionally, AdaptiveCpp features experimental support for offloading C++ algorithms from the parallel STL; see the AdaptiveCpp documentation for details on which algorithms can be offloaded. AdaptiveCpp is currently the only solution that can offload C++ standard parallelism constructs to GPUs from Intel, NVIDIA and AMD -- even from a single binary.

AdaptiveCpp has been repeatedly shown to deliver very competitive performance compared to other SYCL implementations or proprietary solutions like CUDA.
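
As a taste of the C++ standard parallelism path mentioned above, the application side is plain ISO C++, with no SYCL or vendor API in sight. This is only a sketch: whether the loop actually runs on a GPU depends on how AdaptiveCpp is built and whether its (experimental) stdpar offload is enabled; otherwise it simply runs multithreaded on the CPU.

    // stdpar.cpp -- plain parallel STL; with GCC/libstdc++ you may also need -ltbb
    #include <algorithm>
    #include <execution>
    #include <numeric>
    #include <vector>
    #include <cstdio>

    int main() {
        std::vector<float> x(1 << 20);
        std::iota(x.begin(), x.end(), 0.0f);

        // A standard parallel algorithm; AdaptiveCpp's stdpar support can offload
        // calls like this one, otherwise the policy just means "run in parallel on the CPU".
        std::transform(std::execution::par_unseq, x.begin(), x.end(), x.begin(),
                       [](float v) { return v * v + 1.0f; });

        std::printf("x[3] = %f\n", x[3]);   // expect 10 = 3*3 + 1
        return 0;
    }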

[VENETO] boboviz
Message 108979 - Posted: 12 Mar 2024, 9:35:38 UTC - in response to Message 108606.  
Last modified: 12 Mar 2024, 9:37:54 UTC

AdaptiveCpp has been repeatedly shown to deliver very competitive performance compared to other SYCL implementations or proprietary solutions like CUDA.


The new version increases performance!
And...
No targets specification needed anymore! AdaptiveCpp now by default compiles with --acpp-targets=generic. This means that a simple compiler invocation such as acpp -o test -O3 test.cpp will create a binary that can run on Intel, NVIDIA and AMD GPUs. AdaptiveCpp 24.02 is the world's only SYCL compiler that does not require specifying compilation targets to generate a binary that can run "everywhere".
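
For reference, a minimal SYCL source of the kind that invocation could compile. Just a sketch: the file name test.cpp comes from the quoted release note, everything else (names, sizes) is illustrative, and the device used is whatever the runtime picks at execution time.

    // test.cpp -- e.g.: acpp -o test -O3 test.cpp (per the release note above)
    #include <sycl/sycl.hpp>
    #include <cstdio>

    int main() {
        sycl::queue q;   // default device selection
        std::printf("running on: %s\n",
                    q.get_device().get_info<sycl::info::device::name>().c_str());

        constexpr size_t N = 1 << 20;
        float* data = sycl::malloc_shared<float>(N, q);
        for (size_t i = 0; i < N; ++i) data[i] = float(i);

        // Square every element on the selected device.
        q.parallel_for(sycl::range<1>{N}, [=](sycl::id<1> i) {
            data[i] = data[i] * data[i];
        }).wait();

        std::printf("data[4] = %f\n", data[4]);   // expect 16
        sycl::free(data, q);
        return 0;
    }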

[VENETO] boboviz
Message 109233 - Posted: 6 May 2024, 10:04:24 UTC - in response to Message 108979.  

EuroHack24

EuroHack is a unique opportunity for current or prospective user groups of large hybrid CPU-GPU systems to (1) port their (potentially) scalable application to GPU accelerators, (2) optimize an existing GPU-enabled application on a state-of-the-art GPU system, or (3) optimize for multicore CPUs. In every case the focus is the parallelism of the application. The goal is that the development teams leave at the end of the week with applications that execute faster, or at least with a clear roadmap of how to get there.
