Kazuya MATSUMOTO

Kazuya Matsumoto
Ph.D. in Computer Science and Engineering
Associate Professor,
Department of Computer Science and Engineering, School of Computer Science and Engineering
/ Distributed Parallel Processing Laboratory, Division of Computer Engineering,
The University of Aizu

Publication List

Journals/Transactions

  1. Kazuya Matsumoto, Yasuhiro Idomura, Takuya Ina, Akie Mayumi, Susumu Yamada, "Implementation and performance evaluation of a communication-avoiding GMRES method for stencil-based code on GPU cluster," The Journal of Supercomputing, Vol. 75, Springer, pp. 8115-8146, September 2019. DOI:10.1007/s11227-019-02983-7 [full paper, refereed]
  2. Kazuya Matsumoto, Toshihiro Hanawa, Yuetsu Kodama, Hisafumi Fujii, Taisuke Boku, "Implementation and performance evaluation of collective communication with proprietary interconnect TCA for direct communication between GPUs," IPSJ Transactions on Advanced Computing Systems, Vol. 8, No. 4, IPSJ, pp. 36-49, November 2015. [regular paper in Japanese, refereed]
  3. Kazuya Matsumoto, Naohito Nakasato, Stanislav G. Sedukhin, "Blocked united algorithm for the all-pairs shortest paths problem on hybrid CPU-GPU systems," IEICE Transactions on Information and Systems, Special Section on Parallel and Distributed Computing and Networking, E95-D, No. 12, IEICE, pp. 2759-2768, December 2012. DOI:10.1587/transinf.E95.D.2759 [full paper, refereed] pdf (copyright(c)2012 IEICE)
  4. Kazuya Matsumoto, Stanislav G. Sedukhin, "A solution of the all-pairs shortest paths problem on the Cell Broadband Engine processor," IEICE Transactions on Information and Systems, E92-D, No. 6, IEICE, pp. 1225-1231, June 2009. DOI:10.1587/transinf.E92.D.1225 [full paper, refereed] pdf (copyright(c)2009 IEICE)

International Conferences/Symposiums/Workshops

  1. Kazuya Matsumoto, Yoichi Tomioka, Stanislav Sedukhin, "High performance software systolic array computing of multi-channel convolution on a GPU," In Proceedings of the 22nd International Conference on Computational Science and Its Applications (ICCSA 2022), LNCS 13375, pp. 298-309, Springer International Publishing, The University of Malaga, Malaga, Spain, July 4 - 7, 2022. DOI:10.1007/978-3-031-10522-7_21 [full paper, refereed]
  2. Kazuya Matsumoto, Naohito Nakasato, Toshiaki Hishinuma, "Effectiveness of performance tuning techniques for general matrix multiplication on the PEZY-SC2," In Proceedings of the 10th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies (HEART 2019), No. 8, 6 pages, Nagasaki Prefectural Art Museum, Nagasaki, Japan, June 6 - 7, 2019. DOI:10.1145/3337801.3337817 [regular paper, refereed]
  3. Kazuya Matsumoto, Norihisa Fujita, Toshihiro Hanawa, Taisuke Boku, "Implementation and Evaluation of NAS Parallel CG Benchmark on GPU Cluster with Proprietary Interconnect TCA," In the 12th International Meeting on High Performance Computing for Computational Science (VECPAR 2016), Porto, Portugal, June 30 - July 1, 2016. [regular paper, refereed]
  4. Toshihiro Hanawa, Hisafumi Fujii, Norihisa Fujita, Tetsuya Odajima, Kazuya Matsumoto, Taisuke Boku, "Evaluation of FFT for GPU cluster using Tightly Coupled Accelerators architecture," In the 4th Workshop on Heterogeneous and Unconventional Cluster Architectures and Applications (HUCAA 2015) - Proceedings of the IEEE Cluster 2015, IEEE, pp. 635-641, Chicago, Illinois, USA, September 8-11, 2015. DOI:10.1109/CLUSTER.2015.113 [regular paper, refereed]
  5. Toshihiro Hanawa, Hisafumi Fujii, Norihisa Fujita, Tetsuya Odajima, Kazuya Matsumoto, Yuetsu Kodama, Taisuke Boku, "Improving Strong-Scaling on GPU Cluster Based on Tightly Coupled Accelerators Architecture," In Proceedings of the IEEE Cluster 2015, IEEE, pp. 88-91, Chicago, Illinois, USA, September 8-11, 2015. DOI:10.1109/CLUSTER.2015.154 [short paper, refereed]
  6. Kazuya Matsumoto, Toshihiro Hanawa, Yuetsu Kodama, Hisafumi Fujii, Taisuke Boku, "Implementation of CG method on GPU cluster with proprietary interconnect TCA for GPU direct communication," In the Fifth International Workshop on Accelerators and Hybrid Exascale Systems (AsHES 2015) - Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW 2015), IEEE, pp. 647-655, Hyderabad International Convention Centre, Hyderabad, India, May 25, 2015. DOI:10.1109/IPDPSW.2015.102 [regular paper, refereed]
  7. Kazuya Matsumoto, Naohito Nakasato, Stanislav G. Sedukhin, "Performance tuning of matrix multiplication in OpenCL on different GPUs and CPUs," In the 3rd International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS12) - Proceedings of the 2012 SC Companion: High Performance Computing, Networking, Storage and Analysis (SCC), IEEE CS's Conference Publishing Service, pp. 396-405, Salt Palace Convention Center, Salt Lake City, Utah, USA, November 12, 2012. DOI:10.1109/SC.Companion.2012.59 [full paper, refereed]
  8. Kazuya Matsumoto, Naohito Nakasato, Stanislav G. Sedukhin, "Implementing a code generator for fast matrix multiplication in OpenCL on the GPU," In Special Session: Auto-Tuning for Multicore and GPU (ATMG) - Proceedings of the IEEE 6th International Symposium on Embedded Multicore SoCs (MCSoC-12), IEEE Computer Society, pp. 198-204, University of Aizu, Aizu-Wakamatsu City, Fukushima, Japan, September 20-22, 2012. DOI:10.1109/MCSoC.2012.30 [regular paper, refereed] Manuscript is available (ftp) as Technical Report 2012-002, The University of Aizu.
  9. Kazuya Matsumoto, "The algebraic path problem on hybrid CPU-GPU systems," SC11 Early Adopters Ph.D. Workshop: Building the Next Generation of Application Scientists, The Grand Hyatt Seattle, Seattle, Washington, USA, November 14, 2011. [poster]
  10. Kazuya Matsumoto, Naohito Nakasato, Stanislav G. Sedukhin, "Blocked all-pairs shortest paths algorithm for hybrid CPU-GPU system," In Proceedings of the 13th IEEE International Conference on High Performance Computing and Communications (HPCC-2011), IEEE Computer Society Press, pp. 145-152, The Banff Center, Banff, Alberta, Canada, September 2-4, 2011. DOI:10.1109/HPCC.2011.28 [regular paper, refereed (acceptance rate: 21.7%, 59/271)]
  11. Kazuya Matsumoto, Naohito Nakasato, Tomoya Sakai, Hideki Yahagi, Stanislav G. Sedukhin, "Multi-level optimization of matrix multiplication for GPU-equipped systems," Procedia Computer Science: In Proceedings of the 11th International Conference on Computational Science (ICCS 2011), Volume 4, Elsevier B.V., pp. 342-351, Nanyang Technological University, Singapore, June 1-3, 2011. DOI:10.1016/j.procs.2011.04.036 [full paper, refereed] slides
  12. Kazuya Matsumoto, Stanislav G. Sedukhin, "Matrix multiply-add in min-plus algebra on a short-vector SIMD processor of Cell/B.E.," In the International Workshop on Advances in Networking and Computing (WANC) - Proceedings of the First International Conference on Networking and Computing (ICNC'10), IEEE CS's Conference Publishing Service, pp. 272-274, Hiroshima University, Higashi Hiroshima, Japan, November 17-19, 2010. DOI:10.1109/IC-NC.2010.29 [short paper, refereed]
  13. Shodai Yokoyama, Kazuya Matsumoto, Stanislav G. Sedukhin, "Matrix inversion on the Cell/B.E. processor," In Proceedings of the 11th IEEE International Conference on High Performance Computing and Communications (HPCC-09), IEEE Computer Society Press, pp. 148-153, Korea University, Seoul, Korea, June 25-27, 2009. DOI:10.1109/HPCC.2009.78 [regular paper, refereed (acceptance rate: 21.6%, 54/249)]
  14. Kazuya Matsumoto, Dmitry Vazhenin, Stanislav G. Sedukhin, "Transitive closure on the PlayStation 3," In Proceedings of the 2nd International Workshop on Automatic Performance Tuning (iWAPT 2007), p. 33, University of Tokyo, Tokyo, Japan, September 20-21, 2007. [poster paper]

Japan Domestic Conferences/Symposiums/Workshops

(Publications with Japanese titles are listed only for consistency)
  1. Kazuya Matsumoto, Yuuichi Asahi, Takuya Ina, Yasuhiro Idomura, "High Performance Implementation of Nuclear Fusion Simulation Code on GPU Cluster," In 2016 Fall meeting of the Atomic Energy Society of Japan, 1 page, Kurume City Plaza, Kurume, Fukuoka, September 7-9, 2016. [conference abstract in Japanese]
  2. Kenta Sato, Norihisa Fujita, Toshihiro Hanawa, Kazuya Matsumoto, Taisuke Boku, Khaled Ibrahim, "Implementation and Evaluation of GPU-aware GASNet by Tightly Coupled Accelerators," In Proceedings of HPCS2016, Sakura Hall, Tohoku University, Sendai, Miyagi, June 6-7, 2016. [regular paper in Japanese]
  3. In Proceedings of HPCS2015, pp. 120-128, May 19-20, 2015. [regular paper, refereed]
  4. Kazuya Matsumoto, Toshihiro Hanawa, Yuetsu Kodama, Hisafumi Fujii, Taisuke Boku, "Implementing CG method on GPU cluster with proprietary interconnect TCA for GPU direct communication," Annual Meeting on Advanced Computing System and Infrastructure (ACSI2015), International Congress Center EPOCHAL TSUKUBA, Tsukuba, Japan, January 26-28, 2015. [regular paper (not publicly available), refereed]
  5. IPSJ Research Report, Vol. 2014-HPC-147, No. 23, 10 pages, HOKKE-22, December 9-10, 2014. [research report in Japanese]
  6. IPSJ Research Report, Vol. 2014-HPC-144, No. 12, 9 pages, HPC144, May 26-27, 2014. [research report in Japanese]
  7. IPSJ Research Report, Vol. 2012-HPC-135, No. 39, 8 pages, SWoPP2012, August 1-3, 2012. [research report in Japanese]
  8. Tomoya Sakai, Kazuya Matsumoto, Naohito Nakasato, Stanislav G. Sedukhin, "LU factorization on Cypress GPU," In Proceedings of the 73rd National Convention of IPSJ, Vol. 1, pp. 205-206, Tokyo Institute of Technology, Tokyo, March 2-4, 2011. [short paper]
  9. Kazuya Matsumoto, Naohito Nakasato, Tomoya Sakai, Stanislav G. Sedukhin, "Optimized matrix multiplication on GPU," In Proceedings of the 10th High Performance Computing Symposium (HPCS2011), p. 76, AIST, Tsukuba, January 18-19, 2011. [poster paper]
  10. Kazuya Matsumoto, Naohito Nakasato, Tomoya Sakai, Hideki Yahagi, Stanislav G. Sedukhin, "Performance evaluation of matrix multiplication on GPU," Next Generation Supercomputing Symposium 2010, Nichii Gakkan, Kobe, January 17, 2011. [poster]
  11. Kazuya Matsumoto, Stanislav G. Sedukhin, In Proceedings of the 8th Symposium on Advanced Computing Systems and Infrastructures (SACSIS 2010), pp. 129-130, Nara Prefecture New Public Hall, Nara, May 27-28, 2010. [poster paper]
  12. Shodai Yokoyama, Kazuya Matsumoto, Stanislav G. Sedukhin, "Blocked matrix inversion on PlayStation3," In Proceedings of the 7th Symposium on Advanced Computing Systems and Infrastructures (SACSIS 2009), pp. 175-176, International Conference Center Hiroshima, Hiroshima, May 28-29, 2009. [poster paper]
  13. In Proceedings of the 6th Symposium on Advanced Computing Systems and Infrastructures (SACSIS 2008), pp. 15-16, June 11-13, 2008. [poster paper in Japanese]
  14. Kazuya Matsumoto, Stanislav Sedukhin, "All-pairs shortest path problem on the PLAYSTATION 3," The 19th Computer System Symposium (ComSys 2007), 2 pages, Tokyo Fashion Town Building, Tokyo, November 27-28, 2007. [poster paper in Japanese]

Theses

Others

(Documents with Japanese titles are listed only for consistency)
  1. Kazuya Matsumoto, Naohito Nakasato, Stanislav G. Sedukhin, "Implementing Level-3 BLAS Routines in OpenCL on Different Processing Units," Technical Report 2014-001, The University of Aizu, October 2014. ftp
  2. Kazuya Matsumoto, Naohito Nakasato, Stanislav G. Sedukhin, "Different matrix multiplication routines in OpenCL," Presented at SC'13 Exhibits, Colorado Convention Center, Denver, Colorado, USA, November 18-21, 2013. [poster]
  3. Kazuya Matsumoto, Tomoya Sakai, "Fast GEMM implementation on heterogeneous CPU-GPU systems," Presented at AMD Fusion Developer Summit 2012 (AFDS 12), Hyatt Regency Bellevue, Bellevue, Washington, USA, June 11-14, 2012.
  4. Kazuya Matsumoto, Tomoya Sakai, Naohito Nakasato, Stanislav G. Sedukhin, "Optimization of matrix multiplication for CPU-GPU systems," Presented at SC'11 Exhibits, November 14-17, 2011. [poster]
  5. Kazuya Matsumoto, Naohito Nakasato, Stanislav G. Sedukhin, "Blocked all-pairs shortest paths algorithm for hybrid CPU-GPU system," Presented at SC'11 Exhibits, November 14-17, 2011. [poster]
  6. Kazuya Matsumoto, Stanislav G. Sedukhin, "The algebraic path problem on the Cell/B.E. processor," Technical Report 2010-002, The University of Aizu, November 2010. ftp
  7. Kazuya Matsumoto, Public Document for Cell Speed Challenge 2008, 4 pages, July 2008. [document in Japanese]

Profile

Work Experience

Education

Memberships

  1. IPSJ (Information Processing Society of Japan) [SIGHPC], 2008 - present
  2. IEEE (Institute of Electrical and Electronics Engineers), 2009 - present
  3. AESJ (Atomic Energy Society of Japan) [Computational Science and Engineering Division], 2016 - present

Contacts