# SESSION 19 – TAPA I: Power-Aware Processing

Friday, June 20, 1:30 p.m. Chairpersons: V. De, Intel Corporation; M. Hariyama, Tohoku University

## 19.1 – 1:30 p.m.

**A Powerful Yet Ecological Parallel Processing System Using Execution-Based Adaptive Power-Down Control And Compact Quadruple-Precision Assist FPUs,** H. Aoki, T. Kawahara, M. Yamaoka, C. Yoshimura, Y. Nagasaka, K. Takayama, N. Sukegawa, Y. Fukumura, M. Nakahata, H. Sawamoto, M. Odaka, T. Sakurai\*, K. Kasai, Hitachi Ltd., \*University of Tokyo, Japan

We have developed a general-purpose parallel processing system with fine-grained, execution-based adaptive power-down control and compact quadruple-precision (QP) assist floating-point execution units (FPUs), and demonstrated it in a test chip. With these techniques, power could be reduced adaptively down to 52%, and almost twice the performance was achieved in QP addition with only 22% more transistors than conventional double-precision fused multiply-add FPUs use.

## 19.2 – 1:55 p.m.

**The Phoenix Processor: A 30pW Platform for Sensor Applications,** M. Seok, S. Hanson, Y.-S. Lin, Z. Foo, D. Kim, Y. Lee, N. Liu, D. Sylvester, D. Blaauw, University of Michigan, USA

An integrated platform for sensor applications, called the Phoenix Processor, is implemented in a carefully selected 0.18µm process with an area of 915×915µm², making on-die battery integration feasible. Phoenix uses a comprehensive sleep strategy with a unique power gating approach, an event-driven CPU with compact ISA, data memory compression, a custom low leakage memory cell, and adaptive leakage management in data memory. Measurements show that Phoenix consumes 29.6pW in sleep mode and 2.8pJ/cycle in active mode.

## 19.3 – 2:20 p.m.

**An Asynchronous Power Aware and Adaptive NoC Based Circuit,** E. Beigne, F. Clermidy, J. Durupt, H. Lhermet, S. Miermont, Y. Thonnart, T. Tran-Xuan, A. Valentian, D. Varreau, P. Vivet, CEA-LETI MINATEC, France

A fully power aware GALS NoC circuit is presented in this paper. The circuit is arranged around an asynchronous NoC providing a 17Gbits/s throughput and automatically reducing its power consumption by activity detection. Both dynamic and static power consumption are globally reduced using adaptive design techniques applied locally to each NoC unit. The dynamic power consumption can be reduced by up to a factor of 8, while the static power consumption is reduced by 2 decades in stand-by mode.

## 19.4 – 2:45 p.m.

**DSP Architecture Optimization in Matlab/Simulink Environment,** R. Nanda, C.-H. Yang, D. Markovic, University of California, Los Angeles, USA

An automated architecture optimization for DSP algorithms within the graphical Matlab/Simulink environment is proposed. The optimization uses Integer Linear Programming for scheduling and retiming of hardware blocks. The high-level, block-diagram-based Simulink model maps to FPGA or ASIC. Users can control the tuning range of architecture parameters and select solutions from the energy-area-performance tradeoff space. The hierarchical method produces optimal architectures with energy efficiency of 5GOPS/mW in a 90nm CMOS technology.