Results
Found 26 publication records. Showing 26 according to the selection in the facets
| Hits | Authors | Title | Venue | Year | Author keywords |
| --- | --- | --- | --- | --- | --- |
| 77 | John A. Gunnels, Fred G. Gustavson, Keshav Pingali, Kamen Yotov | Is Cache-Oblivious DGEMM Viable? | PARA. In: Applied Parallel Computing. State of the Art in Scientific Computing, 8th International Workshop, PARA 2006, Umeå, Sweden, June 18-21, 2006, Revised Selected Papers, pp. 919-928, 2006, Springer, 978-3-540-75754-2. | 2006 | |
| 70 | David S. Wise, Jeremy D. Frens, Yuhong Gu, Gregory A. Alexander | Language support for Morton-order matrices. | PPoPP. In: Proceedings of the 2001 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP'01), Snowbird, Utah, USA, June 18-20, 2001, pp. 24-33, 2001, ACM, 1-58113-346-4. | 2001 | paging, quadtrees |
| 63 | David Rohr, Matthias Bach, Matthias Kretz, Volker Lindenstruth | Multi-GPU DGEMM and High Performance Linpack on Highly Energy-Efficient Clusters. | IEEE Micro. In: IEEE Micro 31(5), pp. 18-27, 2011. | 2011 | multi-GPU, HPL, High Performance Linpack, DGEMM, double-precision general matrix multiply, GPGPU, system architecture, Green IT, heterogeneous (hybrid) systems |
| 56 | Rolf Rabenseifner, Sunil R. Tiyyagura, Matthias S. Müller | Network Bandwidth Measurements and Ratio Analysis with the HPC Challenge Benchmark Suite (HPCC). | PVM/MPI. In: Recent Advances in Parallel Virtual Machine and Message Passing Interface, 12th European PVM/MPI Users' Group Meeting, Sorrento, Italy, September 18-21, 2005, Proceedings, pp. 368-378, 2005, Springer, 3-540-29009-5. | 2005 | HPCC, HPL, DGEMM, PTRANS, FFTE, benchmarking, STREAM, latency, effective bandwidth, network bandwidth, Linpack |
| 54 | Daniel Hackenberg, Robert Schöne, Wolfgang E. Nagel, Stefan Pflüger | Optimizing OpenMP Parallelized DGEMM Calls on SGI Altix 3700. | Euro-Par. In: Euro-Par 2006, Parallel Processing, 12th International Euro-Par Conference, Dresden, Germany, August 28 - September 1, 2006, Proceedings, pp. 145-154, 2006, Springer, 3-540-37783-2. | 2006 | |
| 47 | Stéphane Zuckerman, Marc Pérache, William Jalby | Fine Tuning Matrix Multiplications on Multicore. | HiPC. In: High Performance Computing - HiPC 2008, 15th International Conference, Bangalore, India, December 17-20, 2008. Proceedings, pp. 30-41, 2008, Springer, 978-3-540-89893-1. | 2008 | multicore, cache coherency, BLAS |
| 30 | Hiroyuki Ootomo, Katsuhisa Ozaki, Rio Yokota | DGEMM on Integer Matrix Multiplication Unit. | CoRR. In: CoRR abs/2306.11975, 2023. | 2023 | |
| 30 | Pedro Valero-Lara, Ian Jorquera, Frank Liu 0001, Jeffrey S. Vetter | Mixed-Precision S/DGEMM Using the TF32 and TF64 Frameworks on Low-Precision AI Tensor Cores. | SC Workshops. In: Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, SC-W 2023, Denver, CO, USA, November 12-17, 2023, pp. 177-186, 2023, ACM. | 2023 | |
| 30 | Jialin Li, Huang Ye, Shaobo Tian, Xinyuan Li, Jian Zhang 0070 | A Fine-grained Prefetching Scheme for DGEMM Kernels on GPU with Auto-tuning Compatibility. | IPDPS. In: 2022 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2022, Lyon, France, May 30 - June 3, 2022, pp. 863-874, 2022, IEEE, 978-1-6654-8106-9. | 2022 | |
| 30 | Yi Wei, Lin Deng, Sizheng Sun, Sisi Li, Li Shen 0007 | DGEMM Optimization Oriented to ARM SVE Instruction Set Architecture. | ICPADS. In: 28th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2022, Nanjing, China, January 10-12, 2023, pp. 514-521, 2022, IEEE, 978-1-6654-7315-6. | 2022 | |
| 30 | Daichi Mukunoki, Katsuhisa Ozaki, Takeshi Ogita, Toshiyuki Imamura | DGEMM Using Tensor Cores, and Its Accurate and Reproducible Versions. | ISC. In: High Performance Computing - 35th International Conference, ISC High Performance 2020, Frankfurt/Main, Germany, June 22-25, 2020, Proceedings, pp. 230-248, 2020, Springer, 978-3-030-50742-8. | 2020 | |
| 30 | Tom Cornebize, Arnaud Legrand | DGEMM performance is data-dependent. | CoRR. In: CoRR abs/1912.05381, 2019. | 2019 | |
| 30 | Pedro Valero-Lara, Ivan Martínez-Pérez, Sergi Mateo, Raül Sirvent, Vicenç Beltran 0001, Xavier Martorell, Jesús Labarta | Variable Batched DGEMM. | PDP. In: 26th Euromicro International Conference on Parallel, Distributed and Network-based Processing, PDP 2018, Cambridge, United Kingdom, March 21-23, 2018, pp. 363-367, 2018, IEEE Computer Society, 978-1-5386-4975-6. | 2018 | |
| 30 | John D. McCalpin | HPL and DGEMM performance variability on the Xeon Platinum 8160 processor. | SC. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2018, Dallas, TX, USA, November 11-16, 2018, pp. 18:1-18:13, 2018, IEEE / ACM. | 2018 | |
| 30 | Lijuan Jiang, Chao Yang 0002, Yulong Ao, Wanwang Yin, Wenjing Ma, Qiao Sun, Fangfang Liu, Rongfen Lin, Peng Zhang | Towards Highly Efficient DGEMM on the Emerging SW26010 Many-Core Processor. | ICPP. In: 46th International Conference on Parallel Processing, ICPP 2017, Bristol, United Kingdom, August 14-17, 2017, pp. 422-431, 2017, IEEE Computer Society, 978-1-5386-1042-8. | 2017 | |
| 30 | David Rohr, Volker Lindenstruth | A Flexible and Portable Large-Scale DGEMM Library for Linpack on Next-Generation Multi-GPU Systems. | PDP. In: 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, PDP 2015, Turku, Finland, March 4-6, 2015, pp. 664-668, 2015, IEEE Computer Society, 978-1-4799-8491-6. | 2015 | |
| 30 | Hao Jiang 0001, Feng Wang, Kuan Li, Canqun Yang, Kejia Zhao, Chun Huang | Implementation of an Accurate and Efficient Compensated DGEMM for 64-bit ARMv8 Multi-Core Processors. | ICPADS. In: 21st IEEE International Conference on Parallel and Distributed Systems, ICPADS 2015, Melbourne, Australia, December 14-17, 2015, pp. 491-498, 2015, IEEE Computer Society, 978-0-7695-5785-4. | 2015 | |
| 30 | Feng Wang, Hao Jiang 0001, Ke Zuo, Xing Su, Jingling Xue, Canqun Yang | Design and Implementation of a Highly Efficient DGEMM for 64-Bit ARMv8 Multi-core Processors. | ICPP. In: 44th International Conference on Parallel Processing, ICPP 2015, Beijing, China, September 1-4, 2015, pp. 200-209, 2015, IEEE Computer Society, 978-1-4673-7587-0. | 2015 | |
| 30 | Pawel Gepner, Victor Gamayunov, David L. Fraser, Eric Houdard, Ludovic Sauge, Damien Déclat, Mathieu Dubois | Evaluation of DGEMM Implementation on Intel Xeon Phi Coprocessor. | J. Comput. In: J. Comput. 9(7), pp. 1566-1571, 2014. | 2014 | |
| 30 | Pawel Gepner, Victor Gamayunov, David L. Fraser | Effective Implementation of DGEMM on Modern Multicore CPU. | ICCS. In: Proceedings of the International Conference on Computational Science, ICCS 2012, Omaha, Nebraska, USA, 4-6 June, 2012, pp. 126-135, 2012, Elsevier. | 2012 | |
| 30 | Gideon Nimako, Ekow J. Otoo, Daniel Ohene-Kwofie | Cache-sensitive MapReduce DGEMM algorithms for shared memory architectures. | SAICSIT. In: 2012 South African Institute of Computer Scientists and Information Technologists Conference, SAICSIT '12, Pretoria, South Africa, October 1-3, 2012, pp. 100-110, 2012, ACM, 978-1-4503-1308-7. | 2012 | |
| 30 | Jiajia Li 0001, Xingjian Li 0002, Guangming Tan, Mingyu Chen 0001, Ninghui Sun | An optimized large-scale hybrid DGEMM design for CPUs and ATI GPUs. | ICS. In: International Conference on Supercomputing, ICS'12, Venice, Italy, June 25-29, 2012, pp. 377-386, 2012, ACM, 978-1-4503-1316-2. | 2012 | |
| 30 | Guangming Tan, Linchuan Li, Sean Triechle, Everett H. Phillips, Yungang Bao, Ninghui Sun | Fast implementation of DGEMM on Fermi GPU. | SC. In: Conference on High Performance Computing Networking, Storage and Analysis, SC 2011, Seattle, WA, USA, November 12-18, 2011, pp. 35:1-35:11, 2011, ACM, 978-1-4503-0771-0. | 2011 | |
| 23 | Akira Nukada, Satoshi Matsuoka | Auto-tuning 3-D FFT library for CUDA GPUs. | SC. In: Proceedings of the ACM/IEEE Conference on High Performance Computing, SC 2009, November 14-20, 2009, Portland, Oregon, USA, 2009, ACM, 978-1-60558-744-8. | 2009 | |
| 23 | Massimiliano Fatica | Accelerating linpack with CUDA on heterogenous clusters. | GPGPU. In: Proceedings of 2nd Workshop on General Purpose Processing on Graphics Processing Units, GPGPU 2009, Washington, DC, USA, March 8, 2009, pp. 46-51, 2009, ACM, 978-1-60558-517-8. | 2009 | |
| 23 | Fred G. Gustavson, Isak Jonsson | High Performance Cholesky Factorization via Blocking and Recursion That Uses Minimal Storage. | PARA. In: Applied Parallel Computing, New Paradigms for HPC in Industry and Academia, 5th International Workshop, PARA 2000 Bergen, Norway, June 18-20, 2000 Proceedings, pp. 82-91, 2000, Springer, 3-540-41729-X. | 2000 | packed format, level 3 BLAS parallelism, recursive algorithm, Cholesky factorization, recursive data structure |