The Potential for Thread-Level Data Speculation in Tightly-Coupled Multiprocessors

J. Gregory Steffan and Todd C. Mowry

Technical Report CSRI-TR-350
Department of Electrical and Computer Engineering
University of Toronto
Toronto, Ontario, Canada M5S 3G4
{steffan,tcm}@eecg.toronto.edu

To fully exploit the potential of single-chip multiprocessors, we must find a way to parallelize non-numeric applications. However, compilers have had little success in parallelizing non-numeric codes due to their complex data access patterns. This paper explores the potential for using thread-level data speculation (TLDS) to overcome this limitation by allowing the compiler to view parallelization solely as a cost/benefit tradeoff, rather than something which may violate program correctness. Experimental results demonstrate that TLDS can offer significant program speedups. We also demonstrate that through modest hardware extensions, a standard single-chip multiprocessor could support TLDS by augmenting the cache coherence scheme to detect dependence violations, and by using the primary data caches to buffer speculative state. We quantify the impact of this implementation on performance, and we also evaluate the compiler support necessary to exploit TLDS.
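
As an illustration of the idea summarized above, the following is a minimal software sketch of data speculation across units of work, called "epochs" here. It is not the hardware scheme evaluated in this report: the names `run_epoch` and `run_tlds`, the dictionary-based memory, and the two-phase execution model are all assumptions made for illustration. Each epoch buffers its writes privately (standing in for the primary data cache) and records the locations it reads. Epochs commit in original program order, and an epoch that read a location later written by an earlier epoch is squashed and re-executed.

```python
def run_epoch(body, committed):
    """Execute one epoch speculatively against the committed memory state."""
    buffer = {}       # speculative writes, private until the epoch commits
    read_set = set()  # locations read from committed memory (exposed reads)

    def load(addr):
        if addr in buffer:        # the epoch sees its own earlier writes
            return buffer[addr]
        read_set.add(addr)
        return committed[addr]

    def store(addr, value):
        buffer[addr] = value

    body(load, store)
    return buffer, read_set


def run_tlds(bodies, memory):
    """Run epochs speculatively, then commit them in program order."""
    committed = dict(memory)
    # Phase 1: every epoch runs against the same starting memory,
    # modelling epochs that execute in parallel.
    results = [run_epoch(body, committed) for body in bodies]

    squashes = 0
    written = set()  # locations committed by earlier epochs
    # Phase 2: commit in original program order, checking for violations.
    for body, (buffer, read_set) in zip(bodies, results):
        if read_set & written:
            # Read-after-write dependence violated: the epoch read a stale
            # value, so discard its buffer and re-execute it.
            squashes += 1
            buffer, read_set = run_epoch(body, committed)
        committed.update(buffer)
        written.update(buffer)
    return committed, squashes


# Hypothetical example: the second epoch depends on the first through "x".
def epoch0(load, store):
    store("x", load("a") + 10)

def epoch1(load, store):
    store("b", load("x") + 1)

def epoch2(load, store):
    store("a", load("a") * 2)

final, squashes = run_tlds([epoch0, epoch1, epoch2], {"x": 0, "a": 1, "b": 2})
print(final, squashes)
```

In this sketch only the second epoch is squashed, because it reads `x` before the first epoch's write to `x` is committed. The third epoch commits without re-execution, and the final memory matches what sequential execution of the three epochs would produce.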