TY - JOUR
T1 - Improving processor performance by simplifying and bypassing trivial computations
AU - Yi, Joshua J.
AU - Lilja, David J.
PY - 2002
Y1 - 2002
N2 - During the course of a program's execution, a processor performs many trivial computations; that is, computations that can be simplified or where the result is zero, one, or equal to one of the input operands. This paper shows that, despite compiling a program with aggressive optimizations (-O3), approximately 30% of all arithmetic instructions, which account for 12% of all dynamic instructions, are trivial computations. The amount of trivial computation is not heavily dependent on the program's specific input values. Our results show that eliminating trivial computations dynamically at run-time yields an average speedup of 8% for a typical processor. Even for a very aggressive processor (i.e. one with no functional unit constraints), the average speedup is still 6%. It also is important to note that the area cost (i.e. hardware) required to dynamically detect and eliminate these trivial computations is very low, consisting of only a few comparators and multiplexers.
AB - During the course of a program's execution, a processor performs many trivial computations; that is, computations that can be simplified or where the result is zero, one, or equal to one of the input operands. This paper shows that, despite compiling a program with aggressive optimizations (-O3), approximately 30% of all arithmetic instructions, which account for 12% of all dynamic instructions, are trivial computations. The amount of trivial computation is not heavily dependent on the program's specific input values. Our results show that eliminating trivial computations dynamically at run-time yields an average speedup of 8% for a typical processor. Even for a very aggressive processor (i.e. one with no functional unit constraints), the average speedup is still 6%. It also is important to note that the area cost (i.e. hardware) required to dynamically detect and eliminate these trivial computations is very low, consisting of only a few comparators and multiplexers.
UR - http://www.scopus.com/inward/record.url?scp=0036396926&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0036396926&partnerID=8YFLogxK
U2 - 10.1109/ICCD.2002.1106814
DO - 10.1109/ICCD.2002.1106814
M3 - Article
AN - SCOPUS:0036396926
SN - 1063-6404
SP - 462
EP - 465
JO - Proceedings-IEEE International Conference on Computer Design: VLSI in Computers and Processors
JF - Proceedings-IEEE International Conference on Computer Design: VLSI in Computers and Processors
M1 - 77
ER -