Wenpeng Ma, Wu Yuan and Xiazhen Liu
Incomplete Sparse Approximate Inverses (ISAI) has shown some advantages over sparse triangular solves on GPUs when it is used for the incomplete-LU-based preconditioner. In this paper, we extend the single-GPU method for Block-ISAI to multiple GPUs algor...
Muhammad Ali Shafique, Arslan Munir and Joonho Kong
Deep learning is employed in many applications, such as computer vision, natural language processing, robotics, and recommender systems. Large and complex neural networks lead to high accuracy; however, they adversely affect many aspects of deep learning...
Mariano Ruiz, Julián Nieto, Víctor Costa, Teddy Craciunescu, Emmanuele Peluso, Jesús Vega, Andrea Murari and JET Contributors
In recent years, a new tomographic inversion method based on the Maximum Likelihood (ML) approach has been adapted to JET bolometry. Apart from its accuracy and reliability, the key advantage is its ability to provide reliable estimates of the uncertaint...
Chao Zhou and Tao Zhang
In real applications, massive data with graph structures are often incomplete due to various restrictions. Therefore, graph data imputation algorithms have been widely used in the fields of social networks, sensor networks, and MRI to solve the graph dat...
Chinthakindi Balaram Murthy, Mohammad Farukh Hashmi, Neeraj Dhanraj Bokde and Zong Woo Geem
In recent years there has been remarkable progress in one computer vision application area: object detection. One of the most challenging and fundamental problems in object detection is locating a specific object from the multiple objects present in a sc...
David Goz, Georgios Ieronymakis, Vassilis Papaefstathiou, Nikolaos Dimou, Sara Bertocco, Francesco Simula, Antonio Ragagnin, Luca Tornatore, Igor Coretti and Giuliano Taffoni
New challenges in Astronomy and Astrophysics (AA) are urging the need for many exceptionally computationally intensive simulations. "Exascale" (and beyond) computational facilities are mandatory to address the size of theoretical problems and data coming...
Thaha Muhammed, Rashid Mehmood, Aiiad Albeshri and Iyad Katib
Sparse matrix-vector (SpMV) multiplication is a vital building block for numerous scientific and engineering applications. This paper proposes SURAA (which translates to "speed" in Arabic), a novel method for SpMV computations on graphics processing units (GPUs)...
Da Xu and Tao Zhang
Radio-frequency (RF) tomographic imaging is a promising technique for inferring multi-dimensional physical space by processing RF signals traversed across a region of interest. Tensor-based approaches for tomographic imaging are superior at detecting the...
Jiri Jaros, Filip Vaverka and Bradley E. Treeby
pp. 40-55
The simulation of ultrasound wave propagation through biological tissue has a wide range of practical applications. However, large grid sizes are generally needed to capture the phenomena of interest. Here, a novel approach to reduce the computational co...
Jan Masek, Radim Burget, Lukas Povoda and Malay Kishore Dutta
pp. 101-107
Modern Graphics Processing Units (GPUs) have become very useful for computing complex and time-consuming processes. GPUs provide high-performance computation capabilities at a good price. This paper deals with a multi-GPU OpenCL and CUDA implementatio...
Guiming Zhang and Jin Xu
Kernel density estimation (KDE) is a commonly used method for spatial point pattern analysis, but it is computationally demanding when analyzing large datasets. GPU-based parallel computing has been adopted to address such computational challenges. The e...
Michael Knobloch and Bernd Mohr
pp. 91-111
General purpose GPUs are now ubiquitous in high-end supercomputing. All but one (the Japanese Fugaku system, which is based on ARM processors) of the announced (pre-)exascale systems contain vast amounts of GPUs that deliver the majority of the performan...
Daniel Molinero-Hernández, Sergio R. Galván-González, Nicolás D. Herrera-Sandoval, Pablo Guzman-Avalos, J. Jesús Pacheco-Ibarra and Francisco J. Domínguez-Mota
Driven by the emergence of Graphics Processing Units (GPUs), the solution of increasingly large and intricate numerical problems has become feasible. Yet, the integration of GPUs into Computational Fluid Dynamics (CFD) codes still presents a significant ...
Federico Piscaglia and Federico Ghioldi
We introduce algorithmic advancements designed to expedite simulations in OpenFOAM using GPUs. These developments include the following. (a) The amgx4Foam library, which connects the open-source AmgX library from NVIDIA to OpenFOAM. Matrix generation, in...
Eduardo Medeiros, Leonel Corado, Luís Rato, Paulo Quaresma and Pedro Salgueiro
Automatic speech recognition (ASR), commonly known as speech-to-text, is the process of transcribing audio recordings into text, i.e., transforming speech into the respective sequence of words. This paper presents a deep learning ASR system optimization ...
Dominic Windisch, Christian Kaever, Guido Juckeland and André Bieberle
In this article, we introduce a parallel algorithm for connected-component analysis (CCA) on GPUs which drastically reduces the volume of data to transfer from GPU to the host. CCA algorithms targeting GPUs typically store the extracted features in array...
David Rohr, Gvozden Neskovic and Volker Lindenstruth
pp. 41-48
The L-CSC (Lattice Computer for Scientific Computing) is a general purpose compute cluster built with commodity hardware installed at GSI. Its main operational purpose is Lattice QCD (LQCD) calculations for physics simulations. Quantum Chromo Dynamics (Q...
Md Momin Al Aziz, Md Toufique Morshed Tamal and Noman Mohammed
Fully homomorphic encryption (FHE) cryptographic systems enable limitless computations over encrypted data, providing solutions to many of today's data security problems. While effective FHE platforms can address modern data security concerns in unsecure...
Rina Komatsu and Tad Gonsalves
In CycleGAN, an image-to-image translation architecture was established without the use of paired datasets by employing both adversarial and cycle consistency loss. The success of CycleGAN was followed by numerous studies that proposed new translation mo...
Lei Zhang and Xiaoli Zhi
Convolutional neural networks (CNNs) have made great progress in face detection. They mostly use computation-intensive networks as the backbone in order to obtain high precision, and they cannot achieve a good detection speed without the support of...