# EMPLOYING TRANSPRECISION COMPUTING TECHNIQUES ON JPEG COMPRESSION SYSTEM

Tuba Ayhan 4.9.2019 ayhant@mef.edu.tr

NiPS Summer School 2019, Perugia, Italy Architectures and Algorithms for Energy-Efficient IoT and HPC Applications

## JPEG COMPRESSION SYSTEM



# **APPROXIMATION LEVELS**

System level



Block level – Compiler level

Circuit level



### DISCRETE COSINE TRANSFORM

$$X_k = \sum_{n=0}^{N-1} x_n \cos \left[ \frac{\pi}{N} \left( n + \frac{1}{2} \right) k \right] \qquad k=0,\ldots,N-1$$
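As a reference point, the transform above can be evaluated directly from its definition. The sketch below is a naive, unnormalized DCT-II in plain Python; it does not reproduce the 16-addition flow graph used in the slides, and the input row `block_row` is a hypothetical set of pixel values chosen only for illustration.

```python
import math


def dct_ii(x):
    """Unnormalized DCT-II: X_k = sum_n x_n * cos(pi/N * (n + 1/2) * k)."""
    n_points = len(x)
    return [
        sum(
            x[n] * math.cos(math.pi / n_points * (n + 0.5) * k)
            for n in range(n_points)
        )
        for k in range(n_points)
    ]


# Hypothetical 8-sample row of pixel values (not taken from the slides).
block_row = [52, 55, 61, 66, 70, 61, 64, 73]
coeffs = dct_ii(block_row)

# For k = 0 every cosine equals 1, so X_0 is simply the sum of the inputs.
print(coeffs[0])
```

A direct evaluation like this costs on the order of N² multiply-adds, which is why hardware implementations usually rely on a factored flow graph instead.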







#### 16 Additions

### FIND OPTIMUM ADDERS

Minimize:

$$\sum_{i=1}^{N} P(\hat{X}_i)$$



Total power consumption of arithmetic computing units will be minimized.

$$\hat{X} = [X_1 \dots X_{16}] \rightarrow \text{Error of adder}$$



Power consumption of an adder:  $P(X_i) = a \times X_i + b$ 
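The linear power model and the objective it feeds can be sketched in a few lines. The coefficient values below are hypothetical placeholders, not the fitted values behind the slides, and the negative slope encodes an assumption that an adder consumes less power as its tolerated error grows.

```python
def adder_power(error, a, b):
    """Linear power model of one adder: P(X_i) = a * X_i + b."""
    return a * error + b


def total_power(errors, a, b):
    """Objective of the optimizer: sum of P(X_i) over all adders."""
    return sum(adder_power(e, a, b) for e in errors)


# Hypothetical coefficients (assumptions for illustration only).
A_SLOPE = -0.01   # power change per unit of adder error
B_OFFSET = 0.1    # power of an exact adder (error = 0)

exact_design = [0.0] * 16    # 16 exact adders
relaxed_design = [2.0] * 16  # 16 identical approximate adders

print(total_power(exact_design, A_SLOPE, B_OFFSET))
print(total_power(relaxed_design, A_SLOPE, B_OFFSET))
```

Because the model is linear in each error, the total depends only on the sum of the errors when all adders share the same coefficients.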

# FIND OPTIMUM ADDERS

Minimize:

$$\sum_{i=1}^{N} P(\hat{X}_{i})$$

Subject to:

$$\begin{array}{l} B\hat{X} \leq b \\ \hat{X} \geq LB \\ \hat{X} \leq UB \\ \hat{X}_{i} \in \mathbb{R} \ , \ 1 \leq i \leq N \end{array}$$

The desired performance will be maintained.

JPEG Compression $\rightarrow$ PSNR

 $b \rightarrow$ MSE



 $B \rightarrow$  Connections within the blocks

 $UB \rightarrow 7.2$  $LB \rightarrow 0$ 
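The constraint set can be made concrete with a small feasibility check. This is a minimal sketch, not the optimizer used in the slides: the matrix `B_TOY` and budget vector `b_toy` are hypothetical and describe a toy problem with four adders rather than sixteen. Only the bounds 0 and 7.2 come from the slide.

```python
LB = 0.0
UB = 7.2


def is_feasible(x, B, b, lb=LB, ub=UB):
    """Return True if lb <= x_i <= ub for all i and every row of B x <= b."""
    within_bounds = all(lb <= xi <= ub for xi in x)
    rows_ok = all(
        sum(bij * xj for bij, xj in zip(row, x)) <= bi
        for row, bi in zip(B, b)
    )
    return within_bounds and rows_ok


# Hypothetical connection matrix: each row sums the errors of the adders
# that feed one output, and b_toy caps the error allowed at that output.
B_TOY = [
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
]
b_toy = [3.0, 5.0]

print(is_feasible([1.0, 2.0, 2.0, 3.0], B_TOY, b_toy))  # meets both budgets
print(is_feasible([2.0, 2.0, 2.0, 3.0], B_TOY, b_toy))  # first row exceeds 3.0
print(is_feasible([0.0, 0.0, 0.0, 8.0], B_TOY, b_toy))  # violates the upper bound
```

An optimizer would search among the assignments that pass this check for the one with the lowest total power.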



# **USE CASES**

- Low PSNR requirement: Energy saving is very important, e.g. the compressed image is transmitted from a battery-powered node in an IoT application.
  - All identical approximate adders.
  - Optimizer is used: 2 dB PSNR loss is allowed.
- Medium PSNR requirement: Energy saving is important, e.g. the compressed image is transmitted for entertainment in a social media application.
  - All identical approximate adders.
  - Optimizer is used: 0.5 dB PSNR loss is allowed.
- High PSNR requirement: Baseline.
  - Exact adders are used.
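The PSNR budgets in these use cases can be checked with the standard definition PSNR = 10 log10(MAX² / MSE). The sketch below assumes 8-bit images (MAX = 255). The helper `within_loss_budget` and the numeric values are hypothetical illustrations, not measurements from the slides.

```python
import math


def psnr_db(mse, max_value=255.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def within_loss_budget(psnr_exact, psnr_approx, allowed_loss_db):
    """True if the approximate design loses at most allowed_loss_db."""
    return psnr_exact - psnr_approx <= allowed_loss_db


# Hypothetical PSNR values for an exact and an approximate design.
psnr_exact = 30.0
psnr_approx = 28.5

print(within_loss_budget(psnr_exact, psnr_approx, 2.0))  # low-PSNR budget
print(within_loss_budget(psnr_exact, psnr_approx, 0.5))  # medium-PSNR budget
```

In this example a loss of 1.5 dB passes the 2 dB budget of the low-PSNR use case but fails the 0.5 dB budget of the medium-PSNR use case.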

# RESULTS 1/2

| Use case    | Adders         | Power [mW] | Power saving [%] | PSNR [dB] | PSNR loss [%] |
|-------------|----------------|------------|------------------|-----------|---------------|
| Low PSNR    | Identical app4 | 0.6575     | 48.0939          | 23.1716   | 19.5125       |
| Low PSNR    | OptimizerL     | 0.9475     | 25.1211          | 27.9481   | 4.1274        |
| Medium PSNR | Identical app1 | 1.1650     | 7.8969           | 28.8644   | 0.9841        |
| Medium PSNR | OptimizerM     | 1.1175     | 11.6688          | 28.2875   | 2.9631        |
| High PSNR   | Exact          | 1.3671     | -                | 29.1513   | -             |

# RESULTS 2/2

[Figure: result panels labeled App1, OptimizerM, and OptimizerL]

# CONCLUSION

- 2 of 3 levels of approximation are used: block (compiler) level and circuit level.
- The adders are not optimized for FPGA implementation. Results can be improved by designing the circuits for the target platform.
- Algorithm-level approximation (compression ratio) gives the best saving, as expected.
- The objective function of the optimizer can be linear or non-linear; a non-linear objective function did not change the adder combination for this problem.

Thank you.

ayhant@mef.edu.tr