Toolchain for Programming, Simulating and Studying the XMT Many-Core Architecture
Title | Toolchain for Programming, Simulating and Studying the XMT Many-Core Architecture |
Publication Type | Conference Papers |
Year of Publication | 2011 |
Authors | Keceli F, Tzannes A, Caragea GC, Barua R, Vishkin U |
Conference Name | 2011 IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum (IPDPSW) |
Date Published | 2011/05 |
Keywords | PRAM algorithm; XMT many-core architecture; XMT tool chain; cycle-accurate simulator; ease-of-programming; explicit multithreading; general-purpose many-core computing; optimizing compiler; programmer workflow; programming component; shared memory XMT hardware; shared memory many-core; concurrency theory; multi-threading; optimising compilers; parallel architectures; shared memory systems |
Abstract | The Explicit Multi-Threading (XMT) is a general-purpose many-core computing platform, with the vision of a 1000-core chip that is easy to program but does not compromise on performance. This paper presents a publicly available tool chain for XMT, complete with a highly configurable cycle-accurate simulator and an optimizing compiler. The XMT tool chain has matured and has been validated to a point where its description merits publication. In particular, research and experimentation enabled by the tool chain played a central role in supporting the ease-of-programming and performance aspects of the XMT architecture. The compiler and the simulator are also important milestones for an efficient programmer's workflow from PRAM algorithms to programs that run on the shared memory XMT hardware. This workflow is a key component in accomplishing the dual goal of ease-of-programming and performance. The applicability of our tool chain extends beyond specific XMT choices. It can be used to explore the much greater design space of shared memory many-cores by system researchers or by programmers. As the tool chain can practically run on any computer, it provides a supportive environment for teaching parallel algorithmic thinking with a programming component. Unobstructed by techniques such as decomposition-first and programming for locality, this environment may be useful in deferring the teaching of these techniques, when desired, to more advanced or platform-specific courses. |
DOI | 10.1109/IPDPS.2011.270 |