Professor Murray Cole
Personal Chair of Patterned Parallel Computing

- School of Informatics
Contact details
Address
- Street: Room 1.18 Informatics Forum, 10 Crichton Street
- City: Edinburgh
- Post code: EH8 9AB
Background
I'm fascinated by patterns, skeletons and other structures in parallel algorithms and software, and what we can do to exploit them for performance or clarity. I'm a member of the Compiler and Architecture Design Group, within the Institute for Computing Systems Architecture (ICSA). I am Co-Director of the Centre for Doctoral Training in Pervasive Parallelism.
You can find my Publications and Projects in the Edinburgh Research Explorer Archive.
You may come across me as one of the School's PGR Personal Tutors, offering advice and support to research students who are having problems which affect their study. I have many years of experience working with postgrads, and my own memories of just how hard it can be as a PhD student (I nearly gave up, I changed my supervisor, I'm still researching nearly 40 years later!). Drop by my office or e-mail me (or one of the other PGR tutors, Elizabeth Polgreen or John Longley) if you'd like to chat.
I'm also the School's Academic Misconduct Officer (i.e. the plagiarism person). Do get in touch if you have questions.
Postgraduate teaching
Parallel Programming Languages and Systems
Open to PhD supervision enquiries?
Yes
Research summary
My research interests are in parallel programming models, emphasising approaches which exploit skeletons to package and optimize well-known patterns of computation and interaction as parallel programming abstractions.
Many parallel programs can be expressed as instances of more generic patterns of parallelism, such as pipelines, stencils, wavefronts and divide-and-conquer. Providing a skeleton API simplifies programming: the programmer only has to write code which customizes selected skeletons to the application. This also makes the resulting programs more performance portable: the compiler and/or run-time can exploit structural information provided by the skeleton to choose the best implementation strategy for a range of underlying architectures, from GPU, through manycore, and on to large heterogeneous clusters.
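To make the idea concrete, here is a minimal sketch of a divide-and-conquer skeleton in Python. It is illustrative only and not taken from any particular skeleton library: the names `divide_and_conquer`, `parallel_depth`, `merge` and `merge_sort` are hypothetical, and it assumes `divide` always returns at least one subproblem. The programmer supplies four small customizing functions, while the skeleton owns the recursion and decides how to run the subproblems.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, TypeVar

P = TypeVar("P")  # problem type
S = TypeVar("S")  # solution type


def divide_and_conquer(
    problem: P,
    is_trivial: Callable[[P], bool],
    solve: Callable[[P], S],
    divide: Callable[[P], List[P]],
    combine: Callable[[List[S]], S],
    parallel_depth: int = 0,
) -> S:
    """Hypothetical divide-and-conquer skeleton.

    The top `parallel_depth` levels of the recursion run their subproblems
    in a thread pool; deeper levels run sequentially. Assumes `divide`
    returns a non-empty list.
    """
    if is_trivial(problem):
        return solve(problem)
    subproblems = divide(problem)

    def recurse(sub: P, depth: int) -> S:
        return divide_and_conquer(sub, is_trivial, solve, divide, combine, depth)

    if parallel_depth > 0:
        # One fresh pool per level, sized to the number of subproblems.
        with ThreadPoolExecutor(max_workers=len(subproblems)) as pool:
            results = list(
                pool.map(lambda s: recurse(s, parallel_depth - 1), subproblems)
            )
    else:
        results = [recurse(s, 0) for s in subproblems]
    return combine(results)


def merge(parts: List[list]) -> list:
    """Merge two sorted lists into one sorted list."""
    left, right = parts
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    return out + left[i:] + right[j:]


def merge_sort(xs: list, parallel_depth: int = 0) -> list:
    """Merge sort written purely as a customization of the skeleton."""
    return divide_and_conquer(
        xs,
        is_trivial=lambda p: len(p) <= 1,
        solve=lambda p: list(p),
        divide=lambda p: [p[: len(p) // 2], p[len(p) // 2:]],
        combine=merge,
        parallel_depth=parallel_depth,
    )


print(merge_sort([5, 2, 9, 1, 5, 6], parallel_depth=2))
```

The application code in `merge_sort` never mentions threads. Switching between sequential and parallel execution only changes an argument to the skeleton, which is the kind of structural information a real implementation could use to pick a strategy for a given architecture. Threads are used here only to show the structure; in CPython they would not be expected to speed up CPU-bound work like this.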
Opportunities for research in this area include the full integration of skeletons into the language and compilation process, dynamic optimization of skeletons for diverse heterogeneous systems, the extension of skeleton approaches to applications which are not quite skeleton instances, the automatic discovery of new (and old) skeletons in existing applications, and the design and implementation of skeleton languages in domain-specific contexts.
Project activity
See Publications and Projects in the Edinburgh Research Explorer Archive for Project/Grant information.
- CoSPARSE: A Software and Hardware Reconfigurable SpMV Framework for Graph Analytics
  (6 pages)
  DOI: https://doi.org/10.1109/DAC18074.2021.9586114
  Research output: Contribution to Conference › Conference contribution (Published)
- Device-Hopping: Transparent Mid-Kernel Runtime Switching for Heterogeneous Systems
  (26 pages)
  In: ACM Transactions on Architecture and Code Optimization, vol. 18
  DOI: https://doi.org/10.1145/3471909
  Research output: Contribution to Journal › Article (Published)
- HyFM: Function Merging for Free
  (12 pages)
  DOI: https://doi.org/10.1145/3461648.3463852
  Research output: Contribution to Conference › Conference contribution (Published)
- Loop Parallelization using Dynamic Commutativity Analysis
  (12 pages)
  DOI: https://doi.org/10.1109/CGO51591.2021.9370319
  Research output: Contribution to Symposium › Conference contribution (Published)
- Modernizing Parallel Code with Pattern Analysis
  (13 pages)
  DOI: https://doi.org/10.1145/3437801.3441603
  Research output: Contribution to Conference › Conference contribution (Published)
- HETSIM: Simulating Large-Scale Heterogeneous Systems using a Trace-driven, Synchronization and Dependency-Aware Framework
  (12 pages)
  DOI: https://doi.org/10.1109/IISWC50251.2020.00011
  Research output: Contribution to Symposium › Conference contribution (Published)
- Parallelizing Parallel Programs: A Dynamic Pattern Analysis for Modernization of Legacy Parallel Code
  (2 pages)
  DOI: https://doi.org/10.1145/3410463.3414663
  Research output: Contribution to Conference › Conference contribution (Published)
- Transmuter: Bridging the Efficiency Gap using Memory and Dataflow Reconfiguration
  (16 pages)
  DOI: https://doi.org/10.1145/3410463.3414627
  Research output: Contribution to Conference › Conference contribution (Published)
- DelayRepay: Delayed Execution for Kernel Fusion in Python
  (14 pages)
  Research output: Contribution to Conference › Conference contribution (Accepted/In press)
- HETSIM: Simulating Large-Scale Heterogeneous Systems using a Trace-driven, Synchronization and Dependency-Aware Framework
  (1 page)
  Research output: Contribution to Workshop › Paper (Published)
- Effective Function Merging in the SSA Form
  (15 pages)
  DOI: https://doi.org/10.1145/3385412.3386030
  Research output: Contribution to Conference › Conference contribution (Published)
- Enforcing Deadlines for Skeleton-based Parallel Programming
  (12 pages)
  DOI: https://doi.org/10.1109/RTAS48715.2020.000-7
  Research output: Contribution to Seminar › Conference contribution (Published)
- A Hybrid Approach to Parallel Pattern Discovery in C++
  (5 pages)
  DOI: https://doi.org/10.1109/PDP50117.2020.00035
  Research output: Contribution to Conference › Conference contribution (Published)
- Vectorization-Aware Loop Unrolling with Seed Forwarding
  (13 pages)
  DOI: https://doi.org/10.1145/3377555.3377890
  Research output: Contribution to Conference › Conference contribution (Published)
- “It Looks Like You’re Writing a Parallel Loop”: A Machine Learning Based Parallelization Assistant
  (10 pages)
  DOI: https://doi.org/10.1145/3358500.3361567
  Research output: Contribution to Conference › Conference contribution (Published)
- Function Merging by Sequence Alignment
  (15 pages)
  DOI: https://doi.org/10.1109/CGO.2019.8661174
  Research output: Contribution to Conference › Conference contribution (Published)
- Towards a Compiler Analysis for Parallel Algorithmic Skeletons
  (11 pages)
  DOI: https://doi.org/10.1145/3178372.3179513
  Research output: Contribution to Conference › Conference contribution (Published)
- NUMA Optimizations for Algorithmic Skeletons
  (12 pages)
  DOI: https://doi.org/10.1007/978-3-319-96983-1_42
  Research output: Contribution to Conference › Conference contribution (Published)
- Compositional Compilation for Sparse, Irregular Data Parallelism
  (8 pages)
  Research output: Contribution to Conference › Conference contribution (Published)
- Helium: a transparent inter-kernel optimizer for OpenCL
  (11 pages)
  DOI: https://doi.org/10.1145/2716282.2716284
  Research output: › Conference contribution (Published)