Xevolver: A research project in the CREST area “Development of System Software Technologies for Post-Peta Scale High Performance Computing”

Publication List

FY2017

Invited Talks

須田礼仁、「複合的自動チューニングのための数理ライブラリの構築」、計算工学講演会論文集 vol. 22, May 31st, 2017, 大宮市ソニックシティ.

Oral Presentations

山本堅太郎、松本正晴、須田礼仁、「遺伝的アルゴリズムに基づく疎行列解法のパラメタに関するオンライン自動チューニング」(Online Autotuning of Parameters of Genetic Algorithm-based Sparse Linear Solver), 情報処理学会第159回 HPC 研究会、April 17, 2017, 東京大学柏の葉キャンパス.

FY2016

Original Papers

Cong Li, “Communication-Avoiding Conjugate Gradient Method for Next Generation Supercomputing Systems,” ISC High Performance (ISC 2016) PhD Forum, June 20, 2016.

Daisuke Takahashi, “Implementation of Multiple-Precision Floating-Point Arithmetic on Intel Xeon Phi Coprocessors,” Proc. 16th International Conference on Computational Science and Its Applications (ICCSA 2016), Part II, Lecture Notes in Computer Science, Vol. 9787, pp. 60–70, Springer International Publishing (2016).

Hiroshi Maeda and Daisuke Takahashi, “Parallel Sparse Matrix-Vector Multiplication Using Accelerators,” Proc. 16th International Conference on Computational Science and Its Applications (ICCSA 2016), Part II, Lecture Notes in Computer Science, Vol. 9787, pp. 3–18, Springer International Publishing (2016). (NVIDIA Best Paper Award)

Takuya Ikuzawa, Fumihiko Ino, and Kenichi Hagihara, “Reducing Memory Usage by the Lifting-based Discrete Wavelet Transform with a Unified Buffer on a GPU,” Journal of Parallel and Distributed Computing, Vol. 93/94, pp. 44–55, (2016-07).

Kazuhiko Komatsu, Ryusuke Egawa, Shoichi Hirasawa, Hiroyuki Takizawa, Ken’ichi Itakura, Hiroaki Kobayashi, “Translation of Large-Scale Simulation Codes for an OpenACC Platform Using the Xevolver Framework,” International Journal of Networking and Computing (special issue on CANDAR’16), Vol. 6, No. 2, pp. 167-180, Aug. 2016.

Raghunandan Mathur, Hiroshi Matsuoka, Osamu Watanabe, Akihiro Musa, Ryusuke Egawa and Hiroaki Kobayashi, “A Memory-Efficient Implementation of a Plasmonics Simulation Application on SX-ACE,” International Journal of Networking and Computing (special issue on CANDAR’16), Vol. 6, No. 2, pp. 243-262, Aug. 2016.

Reiji Suda, Hiroyuki Takizawa, Shoichi Hirasawa, “Xevtgen: Fortran code transformer generator for high performance scientific codes,” International Journal of Networking and Computing (special issue on CANDAR’16), Vol. 6, No. 2, pp. 263-289, Aug. 2016.

Daisuke Takahashi, “Automatic Tuning of Computation-Communication Overlap for Parallel 1-D FFT (SP),” 19th IEEE International Conference on Computational Science and Engineering (CSE 2016), Paris, France, August 24-26, 2016.

Toshiaki Hishinuma, Takuma Sakakibara, Akihiro Fujii, Teruo Tanaka, Shoichi Hirasawa, “Xev-GMP: Automatic code generation for GMP multiple-precision code from C code,” 19th IEEE International Conference on Computational Science and Engineering (CSE 2016), Paris, France, August 24-26, 2016.

Cui Hang, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi, “A Code Selection Mechanism Using Deep Learning,” IEEE 10th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC-16), Lyon, France, September 21-23, 2016.

Xiong Xiao, Shoichi Hirasawa, Hiroyuki Takizawa, and Hiroaki Kobayashi, “The Importance of Dynamic Load Balancing among OpenMP Thread Teams for Irregular Workloads,” The Fourth International Symposium on Computing and Networking, Hiroshima, Japan, November 22-25, pp. 529-535, 2016.

Yasuharu Hayashi, Hiroyuki Takizawa and Hiroaki Kobayashi, “A User-Defined Code Transformation Approach to Overlapping MPI Communication with Computation,” The Fourth International Symposium on Computing and Networking, Hiroshima, Japan, November 22-25, pp. 508-514, 2016.

Reiji Suda and Hiroyuki Takizawa, “A software system supporting XML-based source-to-source code transformations on Fortran programs,” The Fourth International Symposium on Computing and Networking, Hiroshima, Japan, November 22-25, pp. 522-528, 2016.

Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, and Hiroaki Kobayashi, “A Directive Generation Approach Using User-defined Rules,” The Fourth International Symposium on Computing and Networking, Hiroshima, Japan, November 22-25, pp. 515-521, 2016.

Y. Sakaguchi, K. Kataumi, H. Matsuoka, O. Watanabe, A. Musa, K. Komatsu, R. Egawa, H. Kobayashi, S. Yamamoto, “Performance Optimization of Numerical Turbine for Supercomputer SX-ACE,” the 28th International Conference on Parallel Computational Fluid Dynamics, May 9-12, 2016.

角川拓也, 平澤将一, 滝沢寛之, 小林広明, “ディレクティブに基づくステンシル計算の性能パラメータ自動設定”, 情報処理学会論文誌コンピューティングシステム(ACS), Vol. 9, No. 4, pp. 25-37, 2016.

Nobuhiro Miki, Fumihiko Ino, and Kenichi Hagihara, “An Extension of OpenACC Directives for Out-of-Core Stencil Computation with Temporal Blocking,” In Proceedings of the 3rd Workshop on Accelerator Programming Using Directives (WACCPD 2016), pp. 36–45, Salt Lake City, UT, USA, (2016-11).

Ryotaro Sakai, Fumihiko Ino, and Kenichi Hagihara, “Towards Automating Multi-dimensional Data Decomposition for Executing a Single-GPU Code on a Multi-GPU System,” In Proceedings of the 4th International Symposium on Networking and Computing (CANDAR 2016), pp. 408–414, Hiroshima, Japan, (2016-11). Presented at the 4th International Workshop on Computer Systems and Architectures (CSA 2016).

Yuki Takeuchi, Yoshihide Yoshimoto, and Reiji Suda, “Second order accuracy finite difference methods for space-fractional partial differential equations,” Journal of Computational and Applied Mathematics, Vol. 320, pp. 101-119, 2017.

Ryusuke Egawa, Kazuhiko Komatsu, Shintaro Momose, Yoko Isobe, Akihiro Musa, Hiroyuki Takizawa, Hiroaki Kobayashi, “Potential of a Modern Vector Supercomputer for Practical Applications – Performance Evaluation of SX-ACE –,” Journal of Supercomputing, pp. 1–29, 2017, DOI: 10.1007/s11227-017-1993-y.

西尾渉, 小寺紗千子, 平田晃正, 佐々木大輔, 山下毅, 江川隆輔, 小林広明, 曽根秀昭, “太陽光および暑熱同時ばく露に対する熱中症リスク評価シミュレータの開発,” 電子情報通信学会論文誌C, pp. 1–8, 2017 (to appear).

Yuta Sakaguchi, Kenryo Kataumi, Hiroshi Matsuoka, Osamu Watanabe, Akihiro Musa, Kazuhiko Komatsu, Ryusuke Egawa, Hiroaki Kobayashi, Satoru Yamamoto, “A Case Study of Performance Optimization on Numerical Turbine for Supercomputer SX-ACE,” Computers & Fluids, 2017 (to appear).

Publications (Reviews, Commentaries, Books)

Hiroyuki Takizawa, Takeshi Yamada, Shoichi Hirasawa, and Reiji Suda, “A Use Case of a Code Transformation Rule Generator for Data Layout Optimization,” Sustained Simulation Performance 2016, Springer-Verlag, pp. 21-30, 2016.

Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, and Hiroaki Kobayashi, “Directive Translation for Various HPC Systems Using the Xevolver Framework,” Sustained Simulation Performance 2016, Springer-Verlag, pp. 109-117, 2016.

Shoichi Hirasawa, Hiroyuki Takizawa, and Hiroaki Kobayashi, “An Automatic Performance Tracking System for Large-scale Numerical Applications,” Sustained Simulation Performance 2016, Springer-Verlag, pp. 119-127, 2016.

Invited Talks

小林昇平, “Improvement and evaluation of RDFT, no-pivoting LU decomposition by DFT matrix”, Sapporo Summer HPC Seminar 2016.

Daisuke Takahashi, “Implementation of Parallel FFTs on Knights Landing Cluster,” SIAM Conference on Computational Science and Engineering (CSE17), February 28, 2017.

Daisuke Takahashi, “Automatic Tuning for Parallel FFTs on Cluster of Intel Xeon Phi Processors,” 2017 Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing (ATAT in HPSC 2017), March 11, 2017.

Kazuhiko Komatsu, “Directive Translation Approach in Keeping a Code Clean,” 2017 Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing (ATAT in HPSC 2017), March 11, 2017.

Ryusuke Egawa, “An HPC Refactoring Catalog – Accumulating Know-Hows of System Specific Optimization and its Practical Usage,” 2017 Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing (ATAT in HPSC 2017), March 12, 2017.

Reiji Suda, “Generation of Math Library for Multi-Parameter Autotuning,” 2017 Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing (ATAT in HPSC 2017), March 12, 2017.

Hiroyuki Takizawa, “Combining Autotuning and Code Transformations,” 2017 Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing (ATAT in HPSC 2017), March 12, 2017.

滝沢寛之, “Xevolverプロジェクト: 計算科学と計算機科学をつなぐ架け橋を目指して,” 平成28年度高速化ワークショップ, March 24, 2017.

Oral Presentations

Kobayashi Shohei, “Numerical Unstability and Improvement of a No-Pivoting LU Decomposition Algorithm by a Discrete Fourier Matrix,” Information Processing Society of Japan, SIG Technical Reports, 2016-HPC-154, 8 pages, April 2016.

五味歩武, 高橋大介, “最適化手法を自動化するXevolverフレームワーク用定義ファイルの実装”, 情報処理学会研究報告, 2016-HPC-155, 6 pages, 8月, 2016.

酒井亮太郎, 伊野文彦, 萩原兼一, “単一GPUコードをマルチGPU環境で実行するための多次元データ分割手法の検討,” 情報処理学会研究報告, 2016-HPC-155, (2016-08). 7 pages.

須田礼仁, “一般化菱形行列冪カーネルのための領域分割アルゴリズム”, 情報処理学会研究報告, 2016-HPC-155, 9 pages, 8月, 2016.

三木脩弘, 伊野文彦, 萩原兼一, “アウトオブコア・ステンシル計算に対する自動テンポラルブロッキングのためのアクセラレータ向けディレクティブ”, 情報処理学会研究報告, 2016-HPC-155, 7 pages, 8月, 2016.

須田礼仁, “複合的自動チューニングのための数理とソフトウェア”, 計算工学講演会論文集，Vol. 21，F-1-4, 2016.

川原畑勇希, 平澤将一, 滝沢寛之, 小林広明, “機械学習を用いたコード変換に関する研究”, 平成28年度電気関係学会東北支部連合大会, 8月30日-31日, 2016.

菱沼利彰, 藤井昭宏, 田中輝雄, 平澤将一, “GMPを用いた混合精度型プログラムの自動生成機構の提案”, 日本応用数理学会 2016年度年会, 9月12日-14日, 2016.

斯波柾, 菱沼利彰, 藤井昭宏, 田中輝雄, 平澤将一, “多倍長精度プログラムの自動生成機構Xev-GMPにおける混合精度プログラムの生成と評価”, 情報処理学会第157回ハイパフォーマンスコンピューティング研究発表会, 12月, 2016.

Reiji Suda, “Diamond Tiling Extended to General Sparse Matrix Powers Kernel”, First International Workshop on Deepening Performance Models for Automatic Tuning (DPMAT), Sep. 7th, 2016, Nagoya University.

Hiroyuki Takizawa, “Autotuning meets Code Transformations – A case study of Xevolver framework –,” The 24th Workshop on Sustained Simulation Performance, Stuttgart, December 6, 2016.

山下毅, 山崎国人, 江川隆輔, 吉岡匠哉, 土浦宏紀, 小林広明, 曽根秀昭, “『2 バンドモデルに対する揺らぎ交換近似』コードのSX-ACE 向け最適化,” 大学ICT推進協議会年次大会HPCテクノロジーセッション, 2016年12月14日.

Ryusuke Egawa, Yoko Isobe, Soya Fujimoto, “Power and Performance Analysis of SX-ACE,” The 24th Workshop on Sustained Simulation Performance, Stuttgart, December 6, 2016.

Kazuhiko Komatsu, “A Directive Generation Using A Code Translation Framework,” The 24th Workshop on Sustained Simulation Performance, Stuttgart, December 6, 2016.

滝沢寛之, 須田礼仁, 高橋大介, 江川隆輔, “Xevolverプロジェクトの概要,” ポストペタワークショップ, 12月15日, 2016.

Hirokazu Honda, Yoshinori Tamada, Reiji Suda, “Efficient Parallel Algorithm for Optimal DAG Structure Search on Parallel Computer with Torus Network”, Proc. ICA3PP 2016: Algorithms and Architectures for Parallel Processing, Dec. 14-16, 2016, Granada, Spain, LNCS 10048, pp. 483-502, DOI:10.1007/978-3-319-49583-5_37, Dec. 2016

Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, “User-defined Directive Translation using the Xevolver Framework,” 2017 SIAM Conference on Computational Science and Engineering (CSE17), Hilton Atlanta, Atlanta, USA, February 27 – March 3, 2017.

Hiroyuki Takizawa, “Performance Tuning with Machine Learning,” The 25th Workshop on Sustained Simulation Performance, Sendai, March 13, 2017.

Poster Presentations

小林英敏, 平澤将一, 滝沢寛之, 小林広明, “プロファイラと連携する自動性能追跡システム”, 2016年ハイパフォーマンスコンピューティングと計算科学シンポジウム(HPCS2016), 2016. (ポスター)

三木脩弘, “アウトオブコア・ステンシル計算に対する自動テンポラルブロッキングのためのアクセラレータ向けディレクティブPACC”, GTC Japan 2016.

Hiroyuki Takizawa, Daichi Sato, Shoichi Hirasawa, and Hiroaki Kobayashi, “Making a Legacy Code Auto-tunable without Messing It Up,” ACM/IEEE Supercomputing Conference 2016 (SC16), 2016. (poster)

Keiichiro Fukazawa, Ryusuke Egawa, Yoko Isobe and Ikuo Miyoshi, “Performance Evaluation of MHD Simulation Code on SX-ACE and FX100,” Poster presentation at International Symposium on High-Performance Parallel and Distributed Computing (HPDC 2016), Kyoto, Japan, June 2016. (abstract review)

FY2015

Original Papers

Alfian Amrizal, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi, “Automatic Parameter Tuning of Hierarchical Incremental Checkpointing,” High Performance Computing for Computational Science — VECPAR 2014, Lecture Notes in Computer Science, Vol. 8969, pp. 298-309, 2015.

Hiroyuki Takizawa, Shoichi Hirasawa, Makoto Sugawara, Isaac Gelado, Hiroaki Kobayashi and Wen-mei W. Hwu, “Optimized Data Transfers Based on the OpenCL Event Management Mechanism,” Scientific Programming, vol. 2015, Article ID 576498, 16 pages, 2015. doi:10.1155/2015/576498.

Shoichi Hirasawa, Hiroyuki Takizawa and Hiroaki Kobayashi, “A Light-weight Rollback Mechanism for Testing Kernel Variants in Auto-tuning,” IEICE Transactions on Information and Systems, Vol.E98-D, No.12, pp.2178-2186, Dec. 2015.

Takeshi Yamada, Shoichi Hirasawa, Hiroyuki Takizawa and Hiroaki Kobayashi, “A Case Study of User-Defined Code Transformations for Data Layout Optimizations,” The Third International Symposium on Computing and Networking — Across Practical Development and Theoretical Research —, Sapporo, Hokkaido, Japan, December 8-11, 2015.

Kazuhiko Komatsu, Ryusuke Egawa, Shoichi Hirasawa, Hiroyuki Takizawa, Ken’Ichi Itakura and Hiroaki Kobayashi, “Migration of an Atmospheric Simulation Code to an OpenACC Platform Using the Xevolver Framework,” The Third International Symposium on Computing and Networking — Across Practical Development and Theoretical Research —, Sapporo, Hokkaido, Japan, December 8-11, 2015.

Raghunandan Mathur, Hiroshi Matsuoka, Osamu Watanabe, Akihiro Musa, Ryusuke Egawa and Hiroaki Kobayashi, “A Case Study of Memory Optimization for Migration of a Plasmonics Simulation Application to SX-ACE,” The Third International Symposium on Computing and Networking — Across Practical Development and Theoretical Research —, Sapporo, Hokkaido, Japan, December 8-11, 2015.

Reiji Suda, Hiroyuki Takizawa and Shoichi Hirasawa, “Xevtgen: Fortran code transformer generator for high performance scientific codes,” The Third International Symposium on Computing and Networking — Across Practical Development and Theoretical Research —, Sapporo, Hokkaido, Japan, December 8-11, 2015.

Shoichi Hirasawa, Hiroyuki Takizawa and Hiroaki Kobayashi, “A Verification Framework for Streamlining Empirical Auto-tuning,” The Third International Symposium on Computing and Networking — Across Practical Development and Theoretical Research —, Sapporo, Hokkaido, Japan, December 8-11, 2015.

Kei Ikeda, Fumihiko Ino, and Kenichi Hagihara, “An OpenACC Optimizer for Accelerating Histogram Computation on a GPU,” Proceedings of the 24th Euromicro International Conference on Parallel, Distributed and Network-Based Computing (PDP 2016), pp.466–477, Heraklion, Greece, Feb. 17, 2016.

Nobuhiro Miki, Fumihiko Ino, and Kenichi Hagihara, “Applying Temporal Blocking to Out-of-Core Stencil Computation with OpenACC,” Proceedings of the Work in Progress Session held in connection with the 24th Euromicro International Conference on Parallel, Distributed and Network-Based Computing (PDP 2016), Heraklion, Greece, 2 pages, Feb. 19, 2016.

Publications (Reviews, Commentaries, Books)

平澤将一, 肖熊, 滝沢寛之, 小林広明, “Xevolverを用いた自動チューニング”, 日本計算工学会誌「計算工学」, Vol.20 No.2, pp. 14-17, 2015.

Hiroyuki Takizawa, Daichi Sato, Shoichi Hirasawa, and Hiroaki Kobayashi, “A High-Level Interface of Xevolver for Composing Loop Transformations,” Sustained Simulation Performance 2015, pp. 137-145, 2015.

Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, and Hiroaki Kobayashi, “Performance Evaluation of Compiler-Assisted OpenMP Codes on Various HPC Systems,” Sustained Simulation Performance 2015, pp. 147-157, 2015.

Ryusuke Egawa, Kazuhiko Komatsu, and Hiroaki Kobayashi, “Code Optimization Activities Toward a High Sustained Simulation Performance,” Sustained Simulation Performance 2015, pp. 159-168, 2015.

小松一彦, 江川隆輔, 磯部洋子, 緒方隆盛, 滝沢寛之, 小林広明, “SX-ACEにおけるHPCG ベンチマークの性能評価,” 大規模科学計算システム広報 SENAC, Vol. 48, No. 3, pp. 14-19.

江川隆輔, 小林広明, 小松一彦, 岡部公起, 大泉健治, 小野敏, 山下毅, 佐々木大輔, 森谷友映, 齋藤敦子, 撫佐昭裕, 松岡浩司, 渡部修, 曽我隆, 山口健太, “ベクトルコンピュータにおける高速化,” SENAC, Vol. 48, No. 3, pp. 20–51, 2015.

Invited Talks

須田礼仁, 李聡, 渡邉大地, 熊谷洋佑, 藤井昭宏, 田中輝雄, “通信削減 CG 法:エクサスケールに向けたクリロフ部分空間法の新展開”, RIMS研究集会：現象解明に向けた数値解析学の新展開, 11月, 2015年.

Kazuhiko Komatsu, “Migration of an HPC Code to an OpenACC Platform Using a Code Translation Framework,” 2016 Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing (ATAT in HPSC 2016), Feb. 2016.

Daisuke Takahashi, “Automatic Tuning for Parallel FFTs on Intel Xeon Phi Clusters,” 2016 Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing (ATAT in HPSC 2016), Feb. 2016.

Reiji Suda, “Semi-Automatic Construction of Performance Modeling Software for Autotuning,” 2016 Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing (ATAT in HPSC 2016), Feb. 2016.

Hiroyuki Takizawa, Takeshi Yamada, Shoichi Hirasawa, and Hiroaki Kobayashi, “Data Layout Optimization Using User-Defined Code Transformations,” 2016 Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing (ATAT in HPSC 2016), Feb. 2016.

Shoichi Hirasawa, Hiroyuki Takizawa and Hiroaki Kobayashi, “Streamlining Empirical Tuning of Large-scale HPC Applications,” 2016 Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing (ATAT in HPSC 2016), Feb. 2016.

Oral Presentations

Reiji Suda, “Saving Collective Communications in Conjugate Gradient Method for Very Large Supercomputers,” 3rd TWSIAM Annual meeting, May 31, 2015.

藤井昭宏, 野村卓矢, 田中輝雄, “代数的多重格子法の粗格子集約パラメタの最適化”, 計算工学講演会論文集, Vol. 20, E-1-1, 6月, 2015.

渡邉大地, 須田礼仁, “通信削減共役勾配法における基底ベクトル拡大数の選択”, 計算工学講演会論文集, Vol. 20, E-1-2, 6月, 2015.

高橋大介, “Xeon Phiクラスタにおける並列FFTの自動チューニング”, 計算工学講演会論文集, Vol. 20, E-2-2, 6月, 2015.

平澤将一, 滝沢寛之, 小林広明, “Xevolverによる実アプリケーションの性能と保守性の両立”, 計算工学講演会論文集, Vol. 20, E-2-3, 6月, 2015.

三木脩弘, 伊野文彦, 萩原兼一, “OpenACCを用いたアウトオブコア・ステンシル計算に対するテンポラルブロッキングの適用”, 情報処理学会研究報告, 2015-HPC-150, 8 pages, 8月, 2015.

池田圭, 伊野文彦, 萩原兼一, “ヒストグラム生成を高速化するためのOpenACCオプティマイザの検討”, 情報処理学会研究報告, 2015-HPC-150, 9 pages, 8月, 2015.

高橋大介, “Xeon Phiにおける並列FFTの実現と評価”, 日本応用数理学会2015年度年会講演予稿集, 9月, 2015.

高橋大介, “Xeon Phiにおける多倍長精度浮動小数点演算の実現と評価”, 日本応用数理学会2015年度年会講演予稿集, 9月, 2015.

須田礼仁, “一般の行列冪カーネルにむけて”, 日本応用数理学会2015年度年会講演予稿集, 9月, 2015.

Hiroyuki Takizawa, Shoichi Hirasawa, Kazuhiko Komatsu, Ryusuke Egawa and Hiroaki Kobayashi, “Expressing system-awareness as code transformations for performance portability across diverse HPC systems,” Workshop on Portability Among HPC Architectures for Scientific Applications, Nov. 2015.

石田翔太郎, 須田礼仁, “Thomaの浮動小数点数一様乱数の問題点とその修正”, 情報処理学会研究報告ハイパフォーマンスコンピューティング（HPC）, 2015-HPC-152(5), pp. 1-18, 12月, 2015年.

榊原巧磨, 佐々木信一, 菱沼利彰, 藤井昭宏, 田中輝雄, 平澤将一, “GMPライブラリを用いた任意多倍長プログラムへの自動変換機構の提案”, 情報処理学会研究報告ハイパフォーマンスコンピューティング（HPC）, 2015-HPC-152(6), pp. 1-8, 12月, 2015年.

須田礼仁, “次世代並列計算機のための通信を削減した疎行列計算アルゴリズム”, 日本応用数理学会三部会連携「応用数理セミナー」, 12月, 2015年.

須田礼仁, “複合的・階層的な自動チューニングのための数理基盤手法”, 自動チューニング研究会第7回自動チューニング技術の現状と応用に関するシンポジウム(ATTA2015), 12月, 2015年.

藤井昭宏, 野村直也, 田中輝雄, “代数的マルチグリッド法のパラメタ探索空間について”, 自動チューニング研究会第7回自動チューニング技術の現状と応用に関するシンポジウム(ATTA2015), 12月, 2015年.

滝沢寛之, “進化的アプローチによる超並列複合システム向け開発環境の創出”, 自動チューニング研究会第7回自動チューニング技術の現状と応用に関するシンポジウム(ATTA2015), 12月, 2015年.

石田翔太郎，須田礼仁, “丸め関数を用いて浮動小数点数へと離散化された実数一様乱数”, 情報処理学会第153回ハイパフォーマンスコンピューティング研究発表会, 3月, 2016年.

熊谷洋佑, 野地優希, 藤井昭宏, 田中輝雄, 須田礼仁, “通信削減Jacobi法を前処理とした共役勾配法の性能評価”, 情報処理学会第153回ハイパフォーマンスコンピューティング研究発表会, 3月, 2016年.

田口悠太, 金子重郎, 野村直也, 藤井昭宏, 田中輝雄, “時間発展非線形偏微分方程式へのMultigrid Reduction in Timeの適用における特性評価”, 情報処理学会第153回ハイパフォーマンスコンピューティング研究発表会, 3月, 2016年.

金子重郎, 田口悠太, 野村直也, 藤井昭宏, 田中輝雄, “時間方向のマルチグリッド法におけるレベル間自由度に関する考察”, 情報処理学会第78回全国大会，No.2G-06, 3月, 2016年.

根本望, 野村直也, 藤井昭宏, 田中輝雄, “極大独立集合問題における並列性と解の精度”, 情報処理学会第78回全国大会，No.4H-02, 3月, 2016年.

Hiroyuki Takizawa, Takeshi Yamada, Takuya Tsunogawa, Shoichi Hirasawa, and Hiroaki Kobayashi, “Performance Engineering of HPC Applications Based on Pattern Matching,” The 23rd Workshop on Sustained Simulation Performance, Mar. 16-17, 2016.

Shoichi Hirasawa, Hiroyuki Takizawa, and Hiroaki Kobayashi, “A Correctness Verification Framework for Empirically Tuning Large-scale HPC Applications,” The 23rd Workshop on Sustained Simulation Performance, Mar. 16-17, 2016.

Poster Presentations

Kazuhiko Komatsu, Ryusuke Egawa, Yoko Isobe, Ryusei Ogata, Hiroyuki Takizawa and Hiroaki Kobayashi, “An Approach to the Highest Efficiency of the HPCG Benchmark on the SX-ACE Supercomputer,” in the International Conference for High Performance Computing, Networking, Storage and Analysis (SC15), Nov. 2015. (Poster)

FY2014

Original Papers

安藤翔平, 伊野文彦, 藤原融, 萩原兼一, “結合重み分布を高速に計算するための並列手法”, 電子情報通信学会論文誌, Vol. J97-D, No. 9, pp. 1471-1480, 2014.

Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, “A Compiler-Assisted OpenMP Migration Method Based on Automatic Parallelizing Information,” ISC’14, Germany, 2014/6/25.

Daichi Mukunoki and Daisuke Takahashi, “Using Quadruple Precision Arithmetic to Accelerate Krylov Subspace Methods on GPUs,” Proc. 10th International Conference on Parallel Processing and Applied Mathematics (PPAM 2013), Part I, Workshop on Numerical Algorithms on Hybrid Architectures, Lecture Notes in Computer Science, Vol. 8384, pp. 632-642, 2014. (DOI: 10.1007/978-3-642-55224-3_59)

Chunyan Wang, Shoichi Hirasawa, Hiroyuki Takizawa and Hiroaki Kobayashi, “Platform-Specific Code Smell Alert System for High Performance Computing Applications,” The 16th Workshop on Advances on Parallel and Distributed Processing Symposium (APDCM 2014), 2014.

Alfian Amrizal, Shoichi Hirasawa, Hiroyuki Takizawa, and Hiroaki Kobayashi, “Automatic Parameter Tuning of Hierarchical Incremental Checkpointing,” The 9th International Workshop on Automatic Performance Tuning (iWAPT2014), 2014.

Xiong Xiao, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi, “An Approach to Customization of Compiler Directives for Application-Specific Code Transformations,” IEEE 8th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC-14), Sep., 2014.

Akihiro Fujii and Osni Marques, “Axis Communication Method for Algebraic Multigrid Solver,” IEICE Transactions on Information and Systems, Vol.E97-D, No.11, pp. 2955-2958, 2014. (DOI: 10.1587/transinf.2014EDL8052)

Yuki Sumiyoshi, Akihiro Fujii, Akira Nukada and Teruo Tanaka, “Mixed-Precision AMG method for Many Core Accelerators,” Proc. EuroMPI/ASIA ’14, International Workshop on Enhancing Parallel Scientific Applications with Accelerated HPC (ESAA 2014), p. 127, 2014. (DOI:10.1145/2642769.2642794)

Yuki Sugimoto, Fumihiko Ino, Kenichi Hagihara, “Improving Cache Locality for GPU-based Volume Rendering,” Parallel Computing, Vol. 40, No. 5/6, pp.59-69, 2014. (DOI:10.1016/j.parco.2014.03.013)

Kei Ikeda, Fumihiko Ino, Kenichi Hagihara, “Efficient Acceleration of Mutual Information Computation for Nonrigid Registration Using CUDA,” IEEE Journal of Biomedical and Health Informatics, Vol. 18, No. 3, pp.956-968, 2014. (DOI: 10.1109/JBHI.2014.2310745)

Shohei Ando, Fumihiko Ino, Toru Fujiwara, and Kenichi Hagihara, “A Parallel Algorithm for Enumerating Joint Weight of a Binary Linear Code in Network Coding,” Proceedings of the 2nd International Symposium on Networking and Computing, pp.xx–xx, (2014-12).

Hiroyuki Takizawa, Shoichi Hirasawa, Yasuharu Hayashi, Ryusuke Egawa, Hiroaki Kobayashi, “Xevolver: An XML-based Code Translation Framework for Supporting HPC Application Migration,” IEEE International Conference on High Performance Computing (HiPC), pages 1-11, Dec. 2014.

Chunyan Wang, Shoichi Hirasawa, Hiroyuki Takizawa, and Hiroaki Kobayashi, “Identification and Elimination of Platform-Specific Code Smells in High Performance Computing Applications,” International Journal of Networking and Computing, Volume 5, Number 1, pages 180–199, January 2015

Chunyan Wang, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi, “Combining Code Refactoring and Auto-tuning to Improve Performance Portability of High-Performance Computing Applications,” The Sixth International Conference on Computational Logics, Algebras, Programming, Tools, and Benchmarking (COMPUTATION TOOLS 2015), Mar. 2015.

Publications (Reviews, Commentaries, Books)

Ryusuke Egawa, Kazuhiko Komatsu, Hiroaki Kobayashi, “Designing an HPC Refactoring Catalog Toward the Exa-scale Computing Era,” Sustained Simulation Performance 2014, pp. 91-98, 2014.

Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, “Performance Evaluation of an OpenMP Parallelization by Using Automatic Parallelization Information,” Sustained Simulation Performance 2014, pp. 119-126, 2014.

Invited Talks

Hiroyuki Takizawa, “Evolutionary Adaptation of HPC Applications to Revolutionary System Changes,” ISC’14, Germany, 2014/6/23.

Ryusuke Egawa, “System Design Strategies for Disaster-prevention Applications,” EUROMPI/ASIA 2014 WORKSHOP: CHALLENGES IN DATA-CENTRIC COMPUTING (BIGDATACOMPUTING’2014), 10 Sep.2014, Kyoto, Japan.

Hiroyuki Takizawa, “Autotuning with User-defined Code Transformations,” 2015 Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing, February 27-28, 2015.

Shoichi Hirasawa, “A Correctness Checking Framework for Empirical Auto-tuning,” 2015 Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing, February 27-28, 2015.

Daisuke Takahashi, “Automatic Tuning for Parallel FFTs on GPU Clusters,” 2015 Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing, February 27-28, 2015.

Reiji Suda, “Noise-reducing Collective Communication Algorithms,” 2015 Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing, February 27-28, 2015.

Ryusuke Egawa, “Overcoming Performance Portability Issues on Modern HPC Systems,” 2015 Conference on Advanced Topics and Auto Tuning in High-Performance Scientific Computing, February 27-28, 2015.

Oral Presentations

須田礼仁, 李聡, 島根浩平, “数値的に安定な通信削減クリロフ部分空間法,” 第19回計算工学講演会, 2014/6/12.

高橋大介, “GPUクラスタにおける並列FFTの自動チューニング,” 第19回計算工学講演会, 2014/6/12.

前田広志, 高橋大介, “GPU/MICクラスタにおける疎行列ベクトル積の性能評価,” 情報処理学会研究報告, Vol. 2014-HPC-144, No. 4, 2014.

三谷康晃, 伊野文彦, 萩原兼一, “GPU向けの反復型グラフ処理フレームワークにおけるトポロジ変更の実現,” 情報処理学会研究報告, 2014-HPC-145, 新潟, 2014/7/30.

Fumihiko Ino, Akihito Nakano, Kenichi Hagihara, “An Extension of OpenACC for Pipelined Processing of Large Data on a GPU,” Legacy HPC Application Migration 2014, 2014/9/23.

Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, “OpenMP Parallelization Method using Compiler Information of Automatic Optimization,” Legacy HPC Application Migration 2014, 2014/9/23.

Reiji Suda, Shoichi Hirasawa, Hiroyuki Takizawa, “User-defined Source-to-source Code Transformation Tools using Xevolver,” Legacy HPC Application Migration 2014, 2014/9/24.

Akihiro Fujii, Takuya Nomura, Teruo Tanaka, “Communication Optimization Technique of Algebraic multi-grid solver to Each Computing System,” Legacy HPC Application Migration 2014, 2014/9/24.

Hiroyuki Takizawa, “An Evolutionary Approach to Construction of a Software Development Environment for Massively-Parallel Heterogeneous Systems,” 2014 ATIP Workshop: Japanese Research Toward Next-Generation Extreme Computing, Nov. 17, 2014.

李聡, 須田礼仁, “Numerically Stable Communication Avoiding Block Krylov Subspace Method”, 日本応用数理学会環瀬戸内応用数理研究部会第18回シンポジウム, Dec., 2014

竹内裕貴, 須田礼仁, “非整数階常微分方程式の陽的数値計算法”, 日本応用数理学会環瀬戸内応用数理研究部会第18回シンポジウム, Dec., 2014

Reiji Suda, “Developments and experiences in Xevolver, an extensible code transformation system for supporting software evolution,” JST CREST International Symposium on Post Petascale System Software, ISP2S2, Dec., 2014.

Fumihiko Ino, “An extension of OpenACC for pipelined execution of large datasets,” JST CREST International Symposium on Post Petascale System Software, ISP2S2, Dec., 2014.

Kazuhiko Komatsu, “High-productive OpenMP migration using compile information,” JST CREST International Symposium on Post Petascale System Software, ISP2S2, Dec., 2014.

Ryusuke Egawa, “Code Optimization Activities toward Sustained Simulation Performance,” 20th Workshop on Sustained Simulation Performance, Dec. 15-16, 2014.

Hiroyuki Takizawa, “Xevolver: an extensible framework for user-defined code transformation,” 20th Workshop on Sustained Simulation Performance, Dec. 15-16, 2014.

Kazuhiko Komatsu, “High-productive OpenMP migration using Automatic Parallelizing Information,” 20th Workshop on Sustained Simulation Performance, Dec. 15-16, 2014.

滝沢寛之, “進化的アプローチによる超並列複合システム向け開発環境の創出”, 第6回自動チューニング技術の現状と応用に関するシンポジウム(ATTA2014), Dec. 2014.

Akihiro Fujii, Takuya Nomura, Teruo Tanaka, Osni Marques, “AMGS: Algebraic Multigrid Solver with Coarse Grid Aggregation,” Annual Meeting on Advanced Computing System and Infrastructure (ACSI) 2015.

Hiroyuki Takizawa, “What can we do to fight with system diversity?,” 21st Workshop on Sustained Simulation Performance, Feb. 18-19, 2015.

Ryusuke Egawa, “Green HPC System Design with Innovative Technologies,” 21st Workshop on Sustained Simulation Performance, Feb. 18-19, 2015.

江川隆輔, “ベクトル型スーパーコンピュータの現状と将来,” Cyber HPC Symposium パネルディスカッション, 2015.

Chunyan Wang, Shoichi Hirasawa, Hiroyuki Takizawa, Hiroaki Kobayashi, “Refactoring of HPC Applications with User Knowledge,” 第13回情報シナジー研究会, 東北大学サイバーサイエンスセンター, 3月2日, 2015

生澤拓也, 伊野文彦, 萩原兼一, “離散ウェーブレット変換のGPU実装における入力の上書きによるメモリ使用量削減”, 情報処理学会研究報告, 2015-HPC-148, (2015-03).

荒谷祐紀，藤井昭宏，田中輝雄, “GPUクラスタ上のAMG法の高速化”, 情報処理学会研究報告, 2015-HPC-148, (2015-03).

竹内裕貴, “非整数階常微分方程式に対する高精度陽的数値計算法”, 日本応用数理学会 2015年研究部会連合発表会, 3月6日-7日, 2015

丸地賢, 佐々木信一, 菱沼利彰, 藤井昭宏, 田中輝雄, 平澤将一, “Xevolverを用いたGMPコードへの自動変換機能の実装”, 情報処理学会第77回全国大会, Mar., 2015.

Hiroyuki Takizawa, Shoichi Hirasawa, Hiroaki Kobayashi, “A Framework for Separation of Concerns Between Application Requirements and System Requirements,” 2015 SIAM Conference on Computational Science and Engineering (CSE15), Salt Palace Convention Center, Salt Lake City, Utah, USA, March 18, 2015.

Daisuke Takahashi, “Automatic Tuning for Parallel FFTs on GPU Clusters,” 2015 SIAM Conference on Computational Science and Engineering (CSE15), Salt Palace Convention Center, Salt Lake City, Utah, USA, March 18, 2015.

Hiroshi Maeda and Daisuke Takahashi, “Performance Evaluation of Sparse Matrix-Vector Multiplication Using GPU/MIC Cluster,” 2015 SIAM Conference on Computational Science and Engineering (CSE15), Salt Palace Convention Center, Salt Lake City, Utah, USA, March 14, 2015.

Poster Presentations

Tomochika Kato, Fumihiko Ino, and Kenichi Hagihara. “PACC: An Extension of OpenACC for Pipelined Processing of Large Data on a GPU,” Poster in the 27th International Conference for High Performance Computing, Networking, Storage and Analysis, (2014-11).

Ryusuke Egawa, Shintaro Momose, Kazuhiko Komatsu, Yoko Isobe, Hiroyuki Takizawa, Akihiro Musa, Hiroaki Kobayashi, “Early Evaluation of the SX-ACE Processor,” Poster in the 27th International Conference for High Performance Computing, Networking, Storage and Analysis, (2014-11).

Shoichi Hirasawa, “HPC Refactoring and Code Transformation toward Next-generation Extreme Computing,” 2014 ATIP Workshop: Japanese Research Toward Next-Generation Extreme Computing, Nov.17, 2014.

Shoichi Hirasawa, Tohoku Univ./JST CREST, “Enhancing Performance Portability of Real Applications Using Xevolver,” JST CREST International Symposium on Post Petascale System Software, ISP2S2, Dec., 2014.

Takahashi Daisuke, “Parallel Numerical Libraries with Xevolver towards Exa-Scale Systems,” JST CREST International Symposium on Post Petascale System Software, ISP2S2, Dec., 2014.

Ken’ichi Itakura, “Designing an HPC Refactoring Catalog toward Post Peta-scale Computing Era,” JST CREST International Symposium on Post Petascale System Software, ISP2S2, Dec., 2014.

Reiji Suda, “Tools for Exa-Scale Computational Science Codes based on Xevolver,” JST CREST International Symposium on Post Petascale System Software, ISP2S2, Dec., 2014.

FY2013

Original Papers

伊野文彦, 萩原兼一, “GPUアクセラレータとその研究動向”, Medical Imaging Technology, Vol.31, No.3, pp.147-152, 2013. (DOI: 10.11409/mit.31.147)

平澤将一, 滝沢寛之, 小林広明, “ソフトウェア進化のための自動性能追跡システム”, 情報処理学会論文誌コンピューティングシステム(ACS), Vol.6 No.4, pp.96-104, 2013. (NAID: 110009616697)

Fumihiko Ino, Kentaro Shigeoka, Tomohiro Okuyama, Masaya Motokubota, and Kenichi Hagihara, “A Parallel Scheme for Accelerating Parameter Sweep Applications on a GPU,” Concurrency and Computation: Practice and Experience, Vol.26, No.2, pp.516-531, 2014. (DOI: 10.1002/cpe.3016)

Hiroyuki Takizawa, Makoto Sugawara, Shoichi Hirasawa, Isaac Gelado, Hiroaki Kobayashi, and Wen-mei W. Hwu, “clMPI: An OpenCL Extension for Interoperation with the Message Passing Interface,” the IEEE 27th International Symposium on Parallel & Distributed Processing Workshops(IPDPSW2013), pp.1138-1148, 2013. (DOI: 10.1109/IPDPSW.2013.183)

Makoto Sugawara, Shoichi Hirasawa, Kazuhiko Komatsu, Hiroyuki Takizawa and Hiroaki Kobayashi, “A Comparison of Performance Tunabilities between OpenCL and OpenACC,” the IEEE 7th International Symposium on Embedded Multicore SoCs (MCSoC-13), pp. 147-152, 2013. (DOI: 10.1109/MCSoC.2013.31)

Fumihiko Ino, Shinta Nakagawa, and Kenichi Hagihara, “GPU-Chariot: A Programming Framework for Stream Applications Running on Multi-GPU Systems,” IEICE Transactions on Information and Systems, Vol.96-D, No.12, pp.2604-2613, 2013. (DOI: 10.1587/transinf.E96.D.2604)

Daichi Mukunoki and Daisuke Takahashi, “Optimization of Sparse Matrix-vector Multiplication for CRS Format on NVIDIA Kepler Architecture GPUs,” Proc. 13th International Conference on Computational Science and Its Applications (ICCSA 2013), Part V, Lecture Notes in Computer Science, Vol. 7975, pp.211-223, 2013. (DOI: 10.1007/978-3-642-39640-3_15)

Daisuke Takahashi, “Implementation of Parallel 1-D FFT on GPU Clusters,” Proc. 2013 IEEE 16th International Conference on Computational Science and Engineering(CSE 2013) , pp.174-180, 2013. (DOI: 10.1109/CSE.2013.36)

Takaaki Hiragushi and Daisuke Takahashi, “Efficient Hybrid Breadth-First Search on GPUs,” LNCS Algorithms and Architectures for Parallel Processing (ICA3PP 2013), Vol. 8286, pp. 40-50, 2013. (DOI: 10.1007/978-3-319-03889-6_5)

Ayumu Tomiyama, Reiji Suda, “Automatic Parameter Optimization for Edit Distance Algorithm on GPU,” LNCS High Performance Computing for Computational Science – VECPAR 2012, Vol. 7851, pp.420-434, 2013. (DOI: 10.1007/978-3-642-38718-0_38)

Kamil Rocki, Reiji Suda, “High Performance GPU Accelerated Local Optimization in TSP,” Third Workshop on Parallel Computing and Optimization (PCO’13) in conjunction with 27th IEEE International Parallel & Distributed Processing Symposium (IPDPS), pp. 1788-1796, 2013. (DOI: 10.1109/IPDPSW.2013.227)

Cheng Luo and Reiji Suda, “An Efficient Task Partitioning and Scheduling Method for Symmetric Multiple GPU Architecture,” the 11th International Symposium on Parallel and Distributed Processing with Applications(ISPA2013), pp.1133-1142, 2013. (DOI: 10.1109/TrustCom.2013.137)

Tian Xiaochen, Kamil Rocki, Reiji Suda, “Register Level Sort Algorithm on Multi-Core SIMD Processors,” IA^3 Workshop on Irregular Applications: Architectures & Algorithms, The International Conference for High-Performance Computing, Networking, Storage, and Analysis (SC13), pp. 9:1-9:8, 2013. (DOI: 10.1145/2535753.2535762)

Kamil Rocki, Martin Burtscher, and Reiji Suda, “The Future of Accelerator Programming: Abstraction, Performance or Can We Have Both?,” the 29th Symposium on Applied Computing, Gyeongju (SAC 2014), 2014. (to appear)

著作物（総説、解説、著書）

須田礼仁, “並列処理とコンピュータの進歩”, 週刊金融財政事情, 第64巻, 第27号, pp.60-61, 2013.

須田礼仁，”自動チューニング：数理的手法によるソフトウェア高性能化”, 次世代計算科学の基盤技術とその展開，京都大学数理解析研究所講究録, 1848(RIMS Kokyuroku 1848), pp. 1-14, 2013.

Kamil Rocki, Reiji Suda (須田礼仁), “Large-scale Parallel Iterated Local Search Algorithm for Travelling Salesman Problem (巡回セールスマン問題に対する反復局所探索の大規模並列アルゴリズム)”, TSUBAME e-Science Journal, Vol.10, pp.13-17, pp. 30-34, 2013.

須田礼仁, “「並列計算」(parallel processing)”, 応用数理ハンドブック, pp.402-405, 2013.

Kazuhiko Komatsu, Toshihide Sasaki, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, “Analysing the Performance Improvements of Optimizations on Modern HPC Systems,” Sustained Simulation Performance 2013, Springer Berlin Heidelberg, pp. 13-25, 2013.

招待講演

滝沢寛之, “XMLを用いたツール間連携に向けて”, 第1回 XcalableMP Workshop, 東京, 11月1日, 2013.

滝沢寛之, “HPCシステム多様化・複雑化時代のアプリケーション開発環境”, JACORN2013 Winter – 次世代 RHW 創造研究会, 沖縄, 12月26日-27日, 2013.

江川隆輔, 小松一彦, 小林広明, “科学技術アプリケーションの進化を支えるHPCリファクタリングの実現に向けて”, 第17回計算工学講演会, 京都, 6月11日-13日, 2013.

Shoichi Hirasawa, “An Automatic Performance Tracking System for Software Evolution of Large Scale Vector Applications,” Xev CREST Project Open Seminar, Tokyo, May 28, 2013.

Hiroyuki Takizawa, “An extensible programming framework for custom code transformations,” 2014 Conference on Advanced Topics and Auto Tuning in High Performance Scientific Computing, Taipei, Mar. 14-15, 2014.

Shoichi Hirasawa, “A Light-weight Rollback Mechanism for Testing Code Variants in Auto-tuning,” 2014 Conference on Advanced Topics and Auto Tuning in High Performance Scientific Computing, Taipei, Mar. 14-15, 2014.

Daisuke Takahashi, “Implementation of Parallel FFTs on GPU Clusters,” 2014 Conference on Advanced Topics and Auto Tuning in High Performance Scientific Computing, Taipei, Mar. 14-15, 2014.

Reiji Suda, “Autotuning with a Nuisance Parameter: A Case Study for Power Optimization,” 2014 Conference on Advanced Topics and Auto Tuning in High Performance Scientific Computing, Taipei, Mar. 14-15, 2014.

口頭講演

Chunyan Wang，平澤将一，滝沢寛之，小林広明, “Code Refactoring for High Performance Computing Applications”, 平成25年度電気関係学会東北支部連合大会, 会津若松, 8月22日, 2013.

肖熊，平澤将一，滝沢寛之，小林広明, “A Case Study of Performance Tuning with the POET Framework”, 平成25年度電気関係学会東北支部連合大会, 会津若松, 8月22日, 2013.

滝沢寛之, 平澤将一, 小松一彦, 小林広明, “OpenACCにおける性能チューニングとその効果”, 日本応用数理学会2013年度年会, 福岡, 9月9日-11日, 2013.

生澤拓也, 伊野文彦, 萩原兼一, “CUDAにおける離散ウェーブレット変換のIn-place処理のためのデータ並べ替え手法”, 平成25年度情報処理学会関西支部支部大会, 大阪, 9月25日, 2013.

重岡謙太朗, 伊野文彦, 萩原兼一, “GPUを用いた分枝限定法におけるメモリ参照効率を高めるための配列パッキング手法”, 情報処理学会ハイパフォーマンスコンピューティング研究会, 沖縄, 9月30日, 2013.

滝沢寛之, “進化的アプローチによる超並列複合システム向け開発環境の創出”, 2013年 AT研究会 μワークショップ, 下呂, 10月30日, 2013.

平澤将一, “性能可搬性向上のためのアプリケーションコード部分発見”, 2013年 AT研究会 μワークショップ, 下呂, 10月30日, 2013.

中野瑛仁, 伊野文彦, 萩原兼一, “アクセラレータのメモリ容量を超えるデータをパイプライン処理するためのディレクティブ”, 情報処理学会ハイパフォーマンスコンピューティング研究会, 札幌, 12月17日, 2013.

滝沢寛之, “進化的アプローチによる超並列複合システム向け開発環境の構築,” 第5回自動チューニング技術の現状と応用に関するシンポジウム, 東京, 12月25日, 2013.

平井亮太, 平澤将一, 滝沢寛之, 小林広明, “アクセラレータのためのプログラム最適化とその性能評価”, 第12回情報シナジー研究会, 仙台, 2月24日, 2014.

平櫛貴章，高橋大介, “GPUクラスタにおける幅優先探索の高速化”，情報処理学会第139回ハイパフォーマンスコンピューティング研究発表会，情報処理学会研究報告，2013-HPC-139，No. 12, 柏, 5月30日, 2013.

椋木大地，高橋大介, “GPUにおける4倍精度浮動小数点演算を用いたクリロフ部分空間法の高速化”，情報処理学会第140回ハイパフォーマンスコンピューティング研究発表会，情報処理学会研究報告，2013-HPC-140，No. 35, 北九州, 7月24日-8月2日, 2013.

竹内裕貴, “非整数階拡散方程式に対する2次精度有限差分法の安定性解析”, 第42回数値解析シンポジウム, 松山, 6月14日, 2013.

須田礼仁, 小山雄佑, “部分行列性能を用いた疎行列格納形式選択の自動チューニング”, 計算工学講演会論文集, vol.18, 東京, 6月19日-21日, 2013.

Hongzhi Chen, Reiji Suda, “Evaluation of Impact of Noise on Collective Algorithms in Repeated Computation Cycle”, 情報処理学会ハイパフォーマンスコンピューティング研究会, Vol. 2013-HPC-141, No.16, 沖縄, 9月30日-10月1日, 2013.

Kamil Rocki, Martin Burtscher, Reiji Suda, “The Future of Accelerator Programming: Abstraction, Performance or Can We Have Both?”, 情報処理学会プログラミング研究会, 東京, 11月11日-12日, 2013.

小松一彦, 佐々木俊英, 江川隆輔, 滝沢寛之, 小林広明, “マルチプラットフォームにおける最適化手法の効果に関する一検討”, 並列/分散/協調処理に関するサマーワークショップ(SWoPP2013), 北九州, 7月24日-8月2日, 2013.

Azmir Ridzuan bin Azlan, Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, “OpenMP Parallelization using Compile Log of Automatic Parallelization”, 第12回情報シナジー研究会, 仙台, 2月24日, 2014.

Hiroyuki Takizawa, “Towards an Extensible Programming Environment for Software Evolution,” Special Session: Legacy HPC Application Migration 2013 (LHAM) (held in conjunction with IEEE MCSoC-13), Tokyo, Sep. 27, 2013.

Daisuke Takahashi, “Experience of Implementing Parallel FFTs on GPU Clusters,” Special Session: Legacy HPC Application Migration 2013 (LHAM) (held in conjunction with IEEE MCSoC-13), Tokyo, Sep. 27, 2013.

Akihiro Fujii, Takuya Nomura, Teruo Tanaka, and Osni Marques, “Dynamic Parallel Algebraic Multigrid Coarsening for Strong Scaling,” MS50 “Auto-tuning Technologies for Extreme-Scale Solvers” – Part III, SIAM Conference on Parallel Processing for Scientific Computing (PP14), Portland (USA), Feb. 20, 2014.

Kamil Rocki, “OpenCL-based Approach to Heterogeneous Parallel TSP Optimization,” IWOCL 2013, International Workshop on OpenCL, the Georgia Institute of Technology, Atlanta(USA), May 13-14, 2013.

Yuki Takeuchi, “Second order accuracy finite difference methods for fractional diffusion equations,” ASME 2013 International Design Engineering Technical Conferences (IDETC) and Computers and Information in Engineering Conference (CIE), Portland(USA), Aug. 4-7, 2013. (abstract review)

Yuki Takeuchi, “Approximate solutions of fractional differential equations with Riesz fractional derivatives in a finite domain,” International Conference on Scientific Computation and Differential Equations(SciCADE 2013), Valladolid(Spain), Sep. 16-20, 2013.

Kamil Rocki, “The Future of Accelerator Programming: Abstraction, Performance or Can We Have Both?,” Special Session: Legacy HPC Application Migration 2013 (LHAM) (held in conjunction with IEEE MCSoC-13), Tokyo, Sep. 27, 2013.

Cong Li, Reiji Suda, Kohei Shimane, and Hongzhi Chen, “BCBCG: Iterative Solver with Less Number of Global Communications,” MS42 Auto-tuning Technologies for Extreme-Scale Solvers – Part II of III (Feb 20), SIAM PP14, Portland(USA), Feb. 18-21, 2014.

Jiahong Chen, Ray-Bing Chen, Akihiro Fujii, Reiji Suda, Weichung Wang, “Timing Performance Surrogates in Auto-Tuning for Qualitative and Quantitative Factors,” CP16 Performance Optimization (Feb 19), SIAM PP14, Portland(USA), Feb. 18-21, 2014.

Ryusuke Egawa, “An HPC Refactoring Catalog; Guidelines to Bridge The Gap between HPC Systems,” Special Session: Legacy HPC Application Migration 2013 (LHAM) (held in conjunction with IEEE MCSoC-13), Tokyo, Sep. 27, 2013.

Ryusuke Egawa, “Designing an HPC Refactoring Catalog toward the Exa-scale Computing Era,” 18th Workshop on Sustained Simulation Performance(WSSP18), Stuttgart(Germany), Oct. 28-29, 2013.

Kazuhiko Komatsu, “Performance evaluation of auto-parallelized codes on various supercomputing systems,” 18th Workshop on Sustained Simulation Performance(WSSP18), Stuttgart(Germany), Oct. 28-29, 2013.

Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, Takashi Soga, Akihiro Musa, Hiroaki Kobayashi, “Design of the Next-Generation Vector Architecture for Postpeta-Scale CFD,” International Conference on Fluid Dynamics(ICFD2013), Sendai, Nov. 25-27, 2013.

Kazuhiko Komatsu, “Performance Comparison of Auto-parallelized Codes and OpenMP Codes on Various Supercomputing Systems,” 19th Workshop on Sustained Simulation Performance(WSSP19), Sendai, Mar. 27-28, 2014.

ポスター発表

Hiroyuki Takizawa, Xiong Xiao, Shoichi Hirasawa, Hiroaki Kobayashi, “An XML-based Programming Framework for User-defined Code Transformations,” The 4th AICS International Symposium, Kobe, Dec. 2-3, 2013.

Hiroyuki Takizawa, Shoichi Hirasawa, and Hiroaki Kobayashi, “Xevolver : an XML-based Programming Framework for Software Evolution,” poster presentation at Supercomputing Conference 2013 (SC13), Denver(USA), 2013. (abstract review)

Daichi Mukunoki and Daisuke Takahashi, “Linear Algebra Operations using Quadruple-Precision Arithmetic on GPU,” GPU Technology Conference (GTC 2013), San Jose(USA), Mar. 24-27, 2013.

Kamil Rocki, Martin Burtscher, Reiji Suda, “The Future of Accelerator Programming: Abstraction, Performance or Can We Have Both?,” The 19th IEEE International Conference on Parallel and Distributed Systems(ICPADS2013), Seoul(Korea), Dec. 15-18, 2013.

2012年度

原著論文

Fumihiko Ino, Yuma Munekawa, and Kenichi Hagihara, “Sequence Homology Search Using Fine Grained Cycle Sharing of Idle GPUs,” IEEE Transactions on Parallel and Distributed Systems, Vol.23, No.4, pp.751-759, April 2012. (DOI: 10.1109/TPDS.2011.239)

Tomohiro Okuyama, Fumihiko Ino, and Kenichi Hagihara, “A Task Parallel Algorithm for Finding All-Pairs Shortest Paths Using the GPU,” International Journal of High Performance Computing and Networking, Vol.7, No.2, pp.87-98, April 2012. (DOI: 10.1504/IJHPCN.2012.046384)

Alfian Amrizal, Shoichi Hirasawa, Kazuhiko Komatsu, Hiroyuki Takizawa, and Hiroaki Kobayashi, “Improving the Scalability of Transparent Checkpointing for GPU Computing Systems,” IEEE Region 10 Conference (TENCON 2012), 2012.

Kamil Rocki, Reiji Suda, “Accelerating 2-opt and 3-opt local search using GPU in the Travelling Salesman Problem”, The 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID 2012), Ottawa, Canada, 13-16 May 2012

Yuki Takeuchi and Reiji Suda, “New numerical computation formula and error analysis of some existing formulae in fractional derivatives and integrals,” The 5th IFAC Symposium on Fractional Differentiation and its Applications (FDA’12), Hohai University, Nanjing, China, May 14-17, 2012.

Kei Ikeda, Fumihiko Ino, and Kenichi Hagihara, “Accelerating Joint Histogram Computation for Image Registration on the GPU,” In Proceedings of Computer Assisted Radiology and Surgery: 26th International Congress and Exhibition (CARS 2012), pp.S72-S73, June 2012.

Muhammad Ismail Faruqi, Fumihiko Ino, and Kenichi Hagihara, “Acceleration of Variance of Color Differences-Based Demosaicing Using CUDA,” In Proceedings of the 10th International Conference on High Performance Computing and Simulation (HPCS 2012), pp.503-510, July 2012.

Kamil Rocki, Reiji Suda, “An efficient GPU implementation of the iterative hill climbing based TSP solver for large problem instances”, ACM/SIGEVO GECCO 2012: Genetic and Evolutionary Computation Conference, Philadelphia, USA, July 07 – 11, 2012

Ayumu Tomiyama, Reiji Suda, “Automatic Parameter Optimization for Edit Distance Algorithm on GPU”, the seventh international Workshop on Automatic Performance Tuning (iWAPT 2012) / VECPAR 2012, RIKEN Advanced Institute for Computational Science, Kobe, July 17th, 2012.

Hiroki Yoshizawa and Daisuke Takahashi: Automatic Tuning of Sparse Matrix-Vector Multiplication for CRS format on GPUs, Proc. 2012 IEEE 15th International Conference on Computational Science and Engineering (CSE 2012), pp. 130–136 (2012). (DOI: 10.1109/ICCSE.2012.28)

Daichi Mukunoki and Daisuke Takahashi: Implementation and Evaluation of Triple Precision BLAS Subroutines on GPUs, Proc. 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW 2012), The 13th Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC-12), pp. 1378–1386 (2012). (DOI : 10.1109/IPDPSW.2012.175)

Kohei Shimane, Reiji Suda, “A Fast Tour Construction Algorithm for ACOTSP”, The 4th International Conference on Metaheuristics and Nature Inspired Computing (META’2012), Port El-Kantaoui (Sousse, Tunisia), Oct 27-31, 2012.

Kamil Rocki, Reiji Suda, “High Performance GPU Accelerated TSP Solver” (Electronic Poster), The International Conference for High-Performance Computing, Networking, Storage, and Analysis (SC12), 10-16 November 2012, Salt Lake City, USA.

平澤将一, 滝沢寛之, 小林広明, “ソフトウェア進化のための自動性能追跡システム”, 2013年ハイパフォーマンスコンピューティングと計算科学シンポジウム, 2013年 1月.

高橋光佑，藤井昭宏，田中輝雄：マルチカラー法を用いたマルチGPU上でのAMG法，電子情報通信学会論文誌 D, Vol. J96-D, No.3, pp.452–460 (2013).

椋木大地，高橋大介：GPUにおける3倍・4倍精度浮動小数点演算の実現と性能評価，情報処理学会論文誌コンピューティングシステム, Vol. 6, No. 1, pp. 66–77 (2013).

著作物（総説、解説、著書）

須田礼仁，「GPUとGPGPUの概要」，映像情報メディア学会誌，Vol. 66, No.10, pp.808-812, 2012. 月間ベストオーサー賞

Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, Shun Takahashi, Daisuke Sasaki, and Kazuhiro Nakahashi, “Performance Evaluation of BCM on Various Supercomputing Systems,” In 24th International Conference on Parallel Computational Fluid Dynamics, 2012.

小松一彦, 曽我隆, 江川隆輔, 滝沢寛之, 小林広明. 大規模計算システムにおけるBCMの性能評価．In 東北大学サイバーサイエンスセンター大規模科学計算システム広報 SENAC, ISSN 0286-7419, Vol.45, No.3, July 2012.

Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, Hiroaki Kobayashi, “Performance evaluation of a next-generation CFD on various supercomputing systems,” High Performance Computing on Vector Systems, 2012．

佐々木大輔，山下毅，小野敏，大泉建治，江川隆輔，小林広明, “東北大学サイバーサイエンスセンターにおけるユーザコードの高速化支援活動,” 第34回全国共同利用情報基盤センター　研究開発論文集，(2012), 21-26.

江川隆輔, 岡部公起, 伊藤英一, 小野敏, 山下毅, 撫佐昭裕, 神山典, 小久保達信, 金野浩伸, 曽我隆, 塩田和永，並列処理，東北大学情報サイバーサイエンスセンター大規模科学計算機システム広報 SENAC 45, 1 (2012), 17-41．

江川隆輔, 岡部公起, 伊藤英一, 小野敏, 山下毅, 撫佐昭裕, 神山典, 小久保達信, 吉村健二, 遠藤清, 小沢実希, 坂本英顕, 金野浩伸, 坂口祐太, 曽我隆, “スーパーコンピュータSX-9の高速化,” 東北大学情報サイバーサイエンスセンター大規模科学計算機システム広報 SENAC 45, 2 (2012), 25-60.

招待講演

Reiji Suda, “HPC, PARALLEL, AT”, NII Shonan Meeting on “Bridging the theory of staged programming languages and the practice of high-performance computing”, May 19-22, 2012.

平澤将一, “HPCアプリケーションのヘテロ対応リファクタリングを支援する開発ツールに向けて”, 自動チューニング研究会オープンアカデミックセッション, 2012年6月.

Kazuhiko Komatsu, Takashi Soga, Ryusuke Egawa, Hiroyuki Takizawa, and Hiroaki Kobayashi, “Performance Evaluation of a CFD using Cartesian Meshes on Various Supercomputing Systems,” In NUG XXIV, June 2012.

Ryusuke Egawa, “Introduction to SIMD, Vector, and Parallel Supercomputing,” SICE2012 Tutorial II, Akita, 2012.

Kazuhiko Komatsu, “Introduction to GPU Computing,” SICE2012 Tutorial II, Akita, 2012.

Kamil Rocki, “Accelerating Parallel Monte Carlo Tree Search using CUDA”, GTC Japan 2012, 2012年7月26日, 東京ミッドタウンホール＆カンファレンス.

伊野文彦. “CUDAプログラミング入門”. 第2回ユニットコム×NVIDIA CUDAトレーニング, (2012-09).

Hiroyuki Takizawa, “Software Evolution for System Architecture Revolution,” IEEE International Symposium on Embedded Multicore SoCs, September 21, 2012.

Daisuke Takahashi: Automatic Tuning for Parallel FFTs on Clusters of Multi-Core Processors, Special Session: Auto-Tuning for Multicore and GPU (ATMG) (held in conjunction with IEEE MCSoC-12), The University of Aizu, Aizu, Japan, September 22, 2012.

Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, and Hiroaki Kobayashi. Performance of Practical Applications on Modern Supercomputing Systems. In SC12 NEC booth presentation, Nov 2012.

Kazuhiko Komatsu, Ryusuke Egawa, Hiroyuki Takizawa, and Hiroaki Kobayashi. Toward High Performance-Portabilities on Modern HPC Systems. In 16th Workshop on Sustained Simulation Performance, Dec. 2012.

Hiroyuki Takizawa, “A new research project for enabling evolution of legacy code into massively-parallel heterogeneous computing applications.”, The 14th Teraflop Workshop, Stuttgart, Dec. 5, 2012.

滝沢寛之， “GPU向けプログラミング環境の現状と将来展望” シミュレーション科学セミナー, (2013-01).

Hiroyuki Takizawa, “Autotuning for Improving the Fault Tolerance of Large-scale Simulations,” Conference on Advanced Topics and Auto Tuning in High Performance Scientific Computing, March 27-29, 2013

Shoichi Hirasawa, “An Automatic Performance Tracking System for Scientific Software Evolution,” Conference on Advanced Topics and Auto Tuning in High Performance Scientific Computing, March 27-29, 2013

Daisuke Takahashi, “Automatic Tuning for Parallel FFTs”, 2013 Conference on Advanced Topics and Auto Tuning in High Performance Scientific Computing (2013@^2HPSC), National Taiwan University, March 28, 2013.

Reiji Suda, “Performance Correlations for Autotuning Efficiency,” Conference on Advanced Topics and Auto Tuning in High Performance Scientific Computing, March 27-29, 2013.

Akihiro Fujii and Teruo Tanaka, “Online Auto-Tuning Technique for Algebraic Multi-Grid Solver”, 2013 Conference on Advanced Topics and Auto Tuning in High Performance Scientific Computing (2013@^2HPSC), National Taiwan University, March 28, 2013.

口頭講演

滝沢寛之, 佐藤功人, 小松一彦, 小林広明, “OpenCLアプリケーションの実行時自動チューニング”, 計算工学講演会, 5月30日, 2012年.

高橋大介：並列FFTにおける自動チューニング，第17回計算工学講演会，計算工学講演会論文集，Vol. 17，E-7-2，京都教育文化センター，京都市，5月30日, 2012年．

須田礼仁，「選択肢変更コストをともなうオンライン自動チューニング」，計算工学会，May 30th, 2012.

加藤誠也，須田礼仁，玉田嘉紀，「GPU におけるダイバージェンス削減による高速化手法」，情報処理学会研究報告 2012-HPC-134, No. 5, 電気通信大学，Jun 1st, 2012.

須田礼仁，「相関を利用した自動チューニング数理手法」，情報処理学会研究報告, Vol. 2012-HPC-134, No.10, 第134回 HPC 研究会＠電気通信大学，Jun 1st, 2012.

竹内裕貴，「分数階微積分の差分的数値計算法の提案と誤差解析」，第41回数値解析シンポジウム予稿集，6月6日～6月8日，伊香保温泉旅館よろこびの宿しん喜，2012.

Reiji Suda, “4DAC and One-Step Approximation: Mathematical Formulation and Algorithm for Automatic Tuning”, EASIAM, Jun 27th, 2012.

須田礼仁，「自動チューニングのための相関モデル：行列積における行列サイズパラメタ」，情報処理学会研究報告，Vol. 2012-HPC-135, No. 21, 第135回 HPC 研究会，Aug. 2nd, 2012.

吉澤大樹，高橋大介：GPUにおけるCRS形式疎行列ベクトル積の自動チューニング，2012年並列／分散／協調処理に関する『鳥取』サマー・ワークショップ（SWoPP鳥取2012），情報処理学会研究報告Vol. 2012-HPC-135，No. 31，とりぎん文化会館，鳥取市，2012年8月3日．

中野瑛仁, 伊野文彦, 萩原兼一. “マルチGPU環境におけるストリーム処理を高速化するタスクスケジューラ”. 情報処理学会研究報告, 2012-HPC-135, (2012-08). 7 pages.

高橋大介：ポストペタスケール計算環境に向けた並列FFTの自動チューニング，日本応用数理学会2012年度年会，日本応用数理学会2012年度年会講演予稿集，pp. 285-286，稚内全日空ホテル，稚内市，2012年8月31日．

須田礼仁，「自動チューニングにおける選択肢絞り込み」，日本応用数理学会 2012 年度年会，予稿集 271-272, 稚内全日空ホテル，Aug. 31st, 2012.

小松一彦, 江川隆輔, 安田一平, 撫佐昭裕, 松岡浩司, 小林広明. HPCシステムにおける最適化手法の性能可搬性に関する一検討．In 第7回次世代CFD研究会, Sep. 2012.

重岡謙太朗, 奥山倫弘, 伊野文彦, 萩原兼一. “GPUにおいてパラメータスイープを高速化するための並列方式”. 情報処理学会研究報告, 2012-HPC-136, (2012-10). 8 pages.

菅原誠, 小松一彦, 平澤将一, 滝沢寛之, 小林広明, “ナノ粒子群形成アプリケーションのOpenACC による実装と性能評価,” 第136回ハイパフォーマンスコンピューティング研究会, 2012年10月3日.

平澤将一, 滝沢寛之, 小林広明, “統合開発環境と連携するポータブルなビルドシステム”, 情報処理学会第136回ハイパフォーマンスコンピューティング研究会, 2012年 10月.

小松一彦, 江川隆輔, 安田一平, 撫佐昭裕, 松岡浩司, 小林広明, “HPCアプリケーションの性能可搬性に関する一検討,” 第136回HPC研究会, Oct. 2012.

Alfian Amrizal, S. Hirasawa, K. Komatsu, H. Takizawa, H. Kobayashi, “A Multi Level Checkpointing Approach for Heterogeneous Computing Systems.”, ITRCセミナー/INI仙台2012秋・第3回先進的情報通信工学研究会合同ワークショップ, 12月, 2012年

椋木大地，高橋大介：GPUにおける4倍精度演算を用いた疎行列反復解法の実装と評価，情報処理学会第194回計算機アーキテクチャ・第137回ハイパフォーマンスコンピューティング合同研究発表会（HOKKE-20），情報処理学会研究報告Vol. 2012-ARC-202，Vol. 2012-HPC-137，No. 37，北海道大学，札幌市，2012年12月14日．

高橋光佑，藤井昭宏，田中輝雄：GPUのダイレクト通信を用いたAMG法, 情報処理学会第194回計算機アーキテクチャ・第137回ハイパフォーマンスコンピューティング合同研究発表会（HOKKE-20）, 情報処理学会研究報告, Vol. 2012-ARC-202，Vol.2012-HPC-137 No.30，北海道大学，札幌市，2012年12月14日.

荒谷祐紀，藤井昭宏，田中輝雄：GPU上でのAMG法におけるChebyshev多項式緩和法, 情報処理学会第194回計算機アーキテクチャ・第137回ハイパフォーマンスコンピューティング合同研究発表会（HOKKE-20）, 情報処理学会研究報告, Vol. 2012-ARC-202，Vol.2012-HPC-137 No.36，北海道大学，札幌市，2012年12月14日.

松本英樹，須田礼仁, 「ジッタの影響を緩和する集団通信アルゴリズム」，情報処理学会研究報告, Vol. 2012-HPC-137, No.19, 第137回 HPC 研究会，Dec. 14, 2012.

安田一平, 小松一彦, 江川隆輔, 小林広明, “大規模並列システムのノード間通信を考慮した性能モデルに関する一検討,”第194回計算機アーキテクチャ・第137回ハイパフォーマンスコンピューティング合同研究発表会（HOKKE-20）

菅原誠, 平澤将一, 小松一彦, 滝沢寛之, 小林広明, “ナノ粒子群形成アプリケーションのOpenACCによる実装と性能評価”, 数値流体力学シンポジウム(CFD2012), 代々木, December 18-20, 2012.

小松一彦, 曽我隆, 江川隆輔, 滝沢寛之, 小林広明. 大規模計算システムにおけるBuilding Cube Methodの性能評価．In 第26回数値流体力学シンポジウムCFD2012, Dec. 2012.

安藤翔平, 伊野文彦, 藤原融, 萩原兼一. “GPUによる高速な結合重み分布生成の検討”. 第13回ハイパフォーマンスコンピューティングと計算科学シンポジウム論文集（HPCS 2013）, p. 80, (2013-01).

生澤拓也, 伊野文彦, 萩原兼一. “In-place処理に基づく離散ウェーブレット変換のCUDAによる高速化の検討”. 第13回ハイパフォーマンスコンピューティングと計算科学シンポジウム論文集（HPCS 2013）, p. 69, (2013-01).

Toru Motoya and Reiji Suda, “Conjugate Gradient Methods Relieved for Inner Product Communication Latencies”, International workshop on HPC, Krylov Subspace method and its applications, Jan 13-14, 2013, Beppu B-con Plaza.

安田一平, 小松一彦, 江川隆輔, 滝沢寛之, 小林広明, “メモリバンド幅および通信バンド幅に着目した大規模並列システムの性能モデルに関する一検討,” 第11回情報シナジー研究会, 2013

Yuki Sugimoto, Fumihiko Ino, and Kenichi Hagihara, “An Acceleration Method for GPU-Based Volume Rendering by Localizing Texture Memory Reference,” 情報処理学会研究報告, 2012-HPC-138, (2013-02). 7 pages.

岡陽介, 伊野文彦, 萩原兼一. “協調マルチタスキングを用いて短い遊休時間を活用するGPUグリッドシステムの提案”. 情報処理学会研究報告, 2012-HPC-138, (2013-02). 6 pages.

池田圭, 伊野文彦, 萩原兼一. “医用画像位置合わせを対象にした結合ヒストグラム生成のGPUによる高速化”. 情報処理学会研究報告, 2012-HPC-138, (2013-02). 6 pages.

椋木大地，高橋大介：GPUにおける高速なCRS形式疎行列ベクトル積の実装，情報処理学会第138回ハイパフォーマンスコンピューティング研究発表会，情報処理学会研究報告Vol. 2013-HPC-138，No. 5，芦原温泉清風荘，あわら市，2013年2月21日．

Shoichi Hirasawa, Hiroyuki Takizawa, and Hiroaki Kobayashi, “An IDE Integrated Cross-Platform Build System for Scientific Applications,” SIAM CSE2013 Minisymposium on Auto-tuning Technologies for Tools and Development Environment in Extreme-Scale Scientific Computing, February 2013

Vivek S Nittoor and Reiji Suda, “Balanced Tanner Units And Their Properties”, To Appear, Indo-Slovenia Conference on Graph Theory and Applications (Indo-Slov-2013), Feb 22-24, 2013, India.

Vivek S Nittoor and Reiji Suda, “Partition Parameters for Girth Maximum BTUs”, To Appear, Indo-Slovenia Conference on Graph Theory and Applications (Indo-Slov-2013), Feb 22-24, 2013, India.

Daichi Mukunoki and Daisuke Takahashi: Iterative Method for Sparse Linear Systems using Quadruple Precision Operations on GPUs, 2013 SIAM Conference on Computational Science and Engineering (CSE13), The Westin Boston Waterfront, Boston, Massachusetts, USA, February 28, 2013

Reiji Suda, “Toward Tunable Multi-Scheme Parallelization”, 2013 SIAM Conference on Computational Science and Engineering (CSE13), The Westin Boston Waterfront, Boston, Massachusetts, USA, February 28, 2013.

ポスター発表

Kei Ikeda, Fumihiko Ino, and Kenichi Hagihara, “Accelerating Mutual Information Computation for Nonrigid Registration on the GPU,” Poster in the 3rd GPU Technology Conference (GTC 2012), May 2012.

須田礼仁，本谷徹，「チェビシェフ基底共役勾配法」，2013年ハイパフォーマンスコンピューティングと計算科学シンポジウム (HPCS 2013), 2013年1月15日(火)-16日(水), 東京工業大学蔵前会館くらまえホール, ポスター発表．

松本英樹，須田礼仁, バタフライの中間に冗長なデータ交換を行いジッタの影響を緩和する集団通信アルゴリズム, 2013年ハイパフォーマンスコンピューティングと計算科学シンポジウム (HPCS 2013), 2013年1月15日(火)-16日(水), 東京工業大学蔵前会館くらまえホール, ポスター発表．

Vivek S Nittoor and Reiji Suda, “Search for Optimal Graphs”, Poster Presentation at Extremal Combinatorics Conference at Illinois, Urbana-Champaign, IL, 14-16 Mar 2013.

Daichi Mukunoki and Daisuke Takahashi, “Linear Algebra Operations using Quadruple-Precision Arithmetic on GPU,” GPU Technology Conference (GTC 2013), San Jose(USA), Mar. 24-27, 2013.

Fumihiko Ino and Kenichi Hagihara, “Fine-Grained Cycle Sharing of Idle GPUs for Homology Search,” In Poster in the 4th GPU Technology Conference (GTC 2013), San Jose, CA, USA, March 2013.

2011年度

原著論文

Kosuke Takahashi, Akihiro Fujii, Teruo Tanaka, “GPGPU-based Algebraic Multigrid Method”, Proc. 23rd IASTED International Conference on Parallel and Distributed Computing and Systems (PDCS 2011), pp. 93–99, 2011. (DOI: 10.2316/P.2011.757-061)

佐藤功人, 小松一彦, 滝沢寛之, 小林広明, “OpenCLにおけるタスク並列化支援のための実行時依存関係解析手法,” 情報処理学会論文誌コンピューティングシステム(ACS), Vol.5 No.1, pp.53-67, 2012.

Daichi Mukunoki and Daisuke Takahashi, “Implementation and Evaluation of Triple Precision BLAS Subroutines on GPUs”, Proc. 13th IEEE International Workshop on Parallel and Distributed Scientific and Engineering Computing (PDSEC-12).

著作物（総説、解説、著書）

招待講演

Hiroyuki Takizawa, “How can we help software evolution for post-Peta scale computing and beyond?,” The 2nd AICS symposium, Kobe, Mar. 2, 2012.

玉田嘉紀, スーパーコンピュータによる大規模遺伝子ネットワーク推定, 情報処理学会第74回全国大会，名古屋工業大学，2012年3月8日．

滝沢寛之, “GPUコンピューティング～複雑なシステムを使いこなす～,” 熊本大学プロジェクトゼミナール（柔構造コンピューティングの創成と展開ゼミナール）、熊本、2012年3月16日.

Ryusuke Egawa, “Designing a Refactoring Catalog for HPC,” The 15th Workshop on Sustained Simulation Performance, Sendai, Mar. 23, 2012

口頭講演

Ryusuke Egawa, “Evolutionary Creation of Programming Environments for Massively-parallel Heterogeneous Computing Systems,” APES Project Seminar, Aachen, Germany, Oct. 4, 2011.

吉本芳英，”平面波基底第一原理計算プログラムにおけるアクセラレータの活用”，大阪大学産業技術研究所学内共同研究研究会，メープル有馬，2012年2月23～24日．

Yuki Sugimoto, Fumihiko Ino, Kenichi Hagihara, “Improving Cache Locality for Ray Casting with CUDA,” The 3rd Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures, Munich, Feb. 29, 2012.

Muhammad Alfian Amrizal, Katsuto Sato, Kazuhiko Komatsu, Hiroyuki Takizawa, and Hiroaki Kobayashi. “Evaluation of a Scalable Checkpointing Mechanism for Heterogeneous Computing Systems.” 平成23年度第7回情報処理学会東北支部研究会. 仙台, 2012年3月2日.

杉野透, “QR 分解のアップデートアルゴリズムに関する誤差の研究”,日本応用数理学会 2012年研究部会連合発表会行列・固有値問題の解法とその応用, 九州大学伊都キャンパス，2012年3月8日．

竹内裕貴, “分数階微積分における数値計算法の提案と誤差解析”, 日本応用数理学会 2012年研究部会連合発表会行列・固有値問題の解法とその応用, 九州大学伊都キャンパス，2012年3月8日．

金沢隆史, “曲げエネルギー最小の可展面による紙の曲がり方のシミュレーション”, 日本応用数理学会 2012年研究部会連合発表会折紙工学研究部会, 九州大学伊都キャンパス，2012年3月9日．

神田裕士, 奥山倫弘, 伊野文彦, 萩原兼一. “CUDAプログラムにおけるメモリ参照効率を解析するための実行履歴生成手法.” 第133回情報処理学会ハイパフォーマンスコンピューティング研究会. 神戸, 2012年3月26日.

Cong LI，Reiji SUDA, “A Three-Step Performance Automatic Tuning Strategy using Statistical Model for OpenCL Implementation of Krylov Subspace Methods”,第133回 HPC 研究会，神戸，2012年3月26日．

高橋光佑，藤井昭宏，田中輝雄，”マルチGPUを用いたAMG法”，情報処理学会第133回ハイパフォーマンスコンピューティング研究会，Vol.2012-HPC-133，No.29，神戸，2012年3月27日.

Reiji Suda and Vivek S. Nittoor, “Efficient Monte Carlo Optimization with ATMathCoreLib,” 第133回 HPC 研究会，神戸，2012年3月27日．

本谷徹，須田礼仁, “k段飛ばし共役勾配法:通信を回避することで大規模並列計算で有効な対称正定値疎行列連立1次方程式の反復解法”, 第133回 HPC 研究会，神戸，2012年3月27日．

ポスター発表

Vivek S Nittoor and Reiji Suda, “A High Performance Computing Approach For Finding and Decoding Optimal Codes on Graphs”, HiPC 2011 at Bangalore, India, Dec. 18, 2011.

玉田嘉紀, 島村徹平, 山口類, 新井田厚司, 斉藤あゆむ, 長崎正朗, 井元清哉, 宮野悟, “SiGN-BN: ベイジアンネットワークによる大規模遺伝子ネットワーク推定プログラム”，ISLiM 成果報告会 2011，東京大学武田ホール，2011年12月21日～22日．

中野瑛仁, 伊野文彦, 萩原兼一. “CUDA互換GPUにおける高速なストリーム処理のためのタスクスケジューリングアルゴリズムの検討.” 第12回ハイパフォーマンスコンピューティングと計算科学シンポジウム. 名古屋, 2012年1月25日.

玉田嘉紀, 島村徹平, 山口類, 新井田厚司, 斉藤あゆむ, 長崎正朗, 井元清哉, 宮野悟, “SiGN-BN: ベイジアンネットワークによる大規模遺伝子ネットワーク推定プログラム”, 文部科学省「革新的ハイパフォーマンス・コンピューティング・インフラ（HPCI）の構築」・次世代ナノ統合シミュレーションソフトウェアの研究開発（ナノ）・次世代生命体統合シミュレーションソフトウェアの研究開発（ライフ）公開シンポジウム，ニチイ学館，2012年3月5～6日．

吉本芳英，”GPUによる交換相互作用の計算：平面波基底第一原理計算プログラムxTAPPへの実装”，日本物理学会第67回年次大会，関西学院大学，2012年3月24～27日．