!!Torsten Hoefler - Selected Publications
\\
All publications and bibliometric indices can be found at [https://scholar.google.com/citations?hl=en&user=DdBvcBEAAAAJ].\\
\\
1) Slim fly: A cost effective low-diameter network topology (2014) - establishing a lower bound and optimal construction for high-performance topologies\\
\\
2) Scientific benchmarking of parallel computing systems: twelve ways to tell the masses when reporting performance results - establishing theoretical and practical methods for benchmarking computing systems (2015), basis for the 2020 BenchCouncil Rising Star award\\
\\
3) Using advanced MPI: Modern features of the message-passing interface (2014) - the MPI-3 book; Torsten contributed major parts of the Message Passing Interface Standard version 3, the de facto programming model of HPC\\
\\
4) Dare: High-performance state machine replication on RDMA networks (2015) - introduced protocols for implementing replicated state machines on datacenter RDMA networks, has been implemented in practice\\
\\
5) The Portals 4.2 Network Programming Interface - contribution to the design and specification of the Portals networking interface, which was the basis of libfabric (OFA) and Intel's HPC networking products, 2018\\
\\
6) Stateful dataflow multigraphs: A data-centric model for performance portability on heterogeneous architectures - a fundamentally new programming model (result of an ERC starting grant) based on graphical performance tuning to replace MPI, 2019\\
\\
7) Neural code comprehension: A learnable representation of code semantics - a novel deep learning method to understand program code properties, 2018\\
\\
8) A data-centric approach to extreme-scale ab initio dissipative quantum transport simulations - using data-centric techniques to optimize a full application for quantum nanotransport simulation, 2019\\
\\
9) sPIN: High-performance streaming Processing in the Network - a novel network acceleration interface for offloading data processing to the network, 2017\\
\\
10) To push or to pull: On reducing communication and synchronization in graph computations - fundamental analysis of parallel processing techniques for graph computations, 2017