Image Coding

Unit 3: Image coding and compression

1. Introduction.
2. Image characteristics.
   - Image capture and digitization.
   - Image types.
3. Image compression. Spatial redundancy.
   - JPEG standard
   - Wavelets
     - EZW
     - JPEG 2000
     - LTW
4. Conclusions.

Bibliography
[FLU95] Fluckiger, "Understanding networked multimedia"
[TSU99] "Introduction to video coding standards for multimedia communication"
[JPEG2000] "JPEG 2000 performance evaluation and assessment"
[GEO99] "Wavelet-based Image Coding: An Overview"
[EZW] "Embedded image coding using zerotrees of wavelet coefficients"
[LTW] "Fast and efficient spatial scalable image compression using wavelet lower trees"

Network architectures for content distribution
1. Introduction

- The images we perceive are made of electromagnetic waves (wavelength λ: 250 nm - 780 nm).
  - Different wavelengths produce different color sensations.
- The eye is more sensitive to some colors than to others.
  - Given three light sources of the same intensity and different color (one red, one green and one blue), the eye perceives the green one as twice as intense as the red one and six times as intense as the blue one.
- Human visual perception mechanisms are less sensitive and less strict than auditory ones.
  - E.g.: frequency variations, dropped images, etc.
  - The eye integrates the information it receives.
  - By mixing 3 colors (RGB) we can obtain another color.
2.1 Image capture and digitization (I)

- Digital images are made of pixels (picture elements).
- A digital camera uses a CCD (charge-coupled device) to perform the analog acquisition.
  - The CCD has an array of small light-sensitive diodes that convert light into electrical charge (that is, photons into electrons).
  - Each diode of the CCD captures one pixel of the image being acquired.
- A lens is used to direct each pixel of the image (incoming light) onto its corresponding CCD diode.
  - The lens also makes optical zoom possible (not to be confused with digital zoom).
2.1 Image capture and digitization (II)

- The color problem:
  - If the CCD captures the light coming directly from the lens, we only get the light intensity, not its color.
  - A filter (R, G or B) is added to each pixel, so that some pixels receive only red light, others only green and others only blue.
  - The number of pixels that receive green light equals the sum of those that receive red and blue light.
- The color information that was not captured at a given pixel is interpolated from its neighbors, using a DSP.
2.1 Image capture and digitization (III)

- The CCD is an analog device.
  - An analog-to-digital converter (ADC) is needed to obtain the digital representation of each pixel from the electrical signal generated by each diode.
- A digital camera needs a DSP (Digital Signal Processor) to manage the operation of the camera.
  - It handles access to and storage of photos in memory, compression, color interpolation, menu management, etc.
  - One of the most widely used, the TMS320DSC24 from Texas Instruments, runs at 80 MHz and is used by Kodak in its products.
2.1 Image capture and digitization (IV)

Coding and recoding.
- Each RGB sample is coded with a number of bits per color component (e.g., 8 bits/component → 24 bits/sample).
- It is sometimes useful to code the brightness level of a sample (luminance, or Y component) and the color differences (blue, red and green chrominance, or components Cb, Cr, Cg).
- The conversion from RGB to YCbCr (YUV) is done with an (approximate) conversion matrix:

  Y = 0.3R + 0.6G + 0.1B   (brightness level or luminance)
  U = B - Y                (blue color difference) (equiv. Cb = U/2 + 128)
  V = R - Y                (red color difference) (equiv. Cr = V/1.6 + 128)

- Each component is coded with 8 bits.
  - Y (8 bits): range 16-235
  - Cb (8 bits) and Cr (8 bits): range 16-240
- The green color difference (Cg) is redundant and is not stored, since it can be obtained from Y, Cb and Cr.
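The approximate conversion above can be sketched in a few lines. This is only an illustration of the slide's formulas; the function name `rgb_to_ycbcr` is an assumption, and no clamping to the 16-235 / 16-240 ranges is attempted.

```python
def rgb_to_ycbcr(r, g, b):
    """Approximate RGB -> YCbCr conversion using the slide's coefficients."""
    y = 0.3 * r + 0.6 * g + 0.1 * b   # luminance
    u = b - y                          # blue color difference
    v = r - y                          # red color difference
    cb = u / 2 + 128                   # offset so that "no color" sits at 128
    cr = v / 1.6 + 128
    return y, cb, cr


# A neutral gray has no color difference: Cb and Cr stay at the 128 midpoint.
y, cb, cr = rgb_to_ycbcr(128, 128, 128)
```

Because the three luminance weights add up to 1, any gray input (R = G = B) maps to Y equal to that gray level and to Cb = Cr = 128.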
2.1 Image capture and digitization (V)

Subsampling: the eye is more sensitive to luminance information than to chrominance information.

Plane sizes for each format (width x height; 480 or 576 lines):

Format   Y                   Cb and Cr
4:4:4    720 x 480 (or 576)  720 x 480 (or 576)
4:2:2    720 x 480 (or 576)  360 x 480 (or 576)
4:2:0    720 x 480 (or 576)  360 x 240 (or 288)
4:1:1    720 x 480 (or 576)  180 x 480 (or 576)

[Figure: positions of the Y samples and of the Cr + Cb samples in each format]
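The plane sizes of each subsampling format follow from two division factors, one horizontal and one vertical. The sketch below is illustrative only; the table `SUBSAMPLING` and both function names are assumptions introduced here.

```python
# (horizontal divisor, vertical divisor) applied to the chroma planes
SUBSAMPLING = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:2:0": (2, 2),
    "4:1:1": (4, 1),
}


def chroma_size(width, height, fmt):
    """Size of each chroma plane (Cb or Cr) for a given luminance size."""
    h_div, v_div = SUBSAMPLING[fmt]
    return width // h_div, height // v_div


def average_bits_per_pixel(fmt, bits_per_sample=8):
    """One Y sample per pixel plus two subsampled chroma planes."""
    h_div, v_div = SUBSAMPLING[fmt]
    return bits_per_sample * (1 + 2 / (h_div * v_div))
```

For example, `chroma_size(720, 576, "4:2:0")` gives the 360 x 288 chroma planes listed above, and 4:2:0 averages 12 bits per pixel instead of the 24 of 4:4:4.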
2.2 Coding: examples
[Figures: the same image shown as RGB; as YCbCr; with the Y component subsampled (original, x2, x4, x8); and with the Cb and Cr components subsampled (original, x2, x4, x8)]
2.3 Image types (by resolution)

- The resolution of an image is measured as the number of pixels per side (width x height).
- In digital cameras it is usually measured in megapixels (millions of pixels per image).
  1) Common Intermediate Format (CIF) (352x288): commonly used in videoconferencing (together with Quarter CIF).
  2) VGA (640x480): used by low-quality cameras.
  3) n megapixels: offered by higher-quality cameras.
- Sometimes the real resolution of a digital camera does not match that of its CCD.
  - E.g., a 3.3 MP camera offers a resolution of 2048x1536.
  - Part of the CCD circuitry that carries data to the ADC sits on certain diodes, which therefore cannot be used.
3. Image compression.

- An image usually shows spatial redundancy:
  - Images contain redundant information that can be removed or reduced (for example, the color of the sky in a photo is usually uniform and blue).
- The image compression process consists of:
  1) Removing as much spatial redundancy as possible using source encoding techniques (normally a mathematical transform).
  2) Coding the data obtained in the previous step using entropy encoding (which removes even more redundancy).
- To reach higher compression ratios, this process is lossy (quantization of the data).
3.1 Spatial redundancy: JPEG

- It is an ISO standard ('91) originating from the JPEG group (Joint Photographic Experts Group).
  - It codes continuous-tone images.
  - It has four operating modes (including lossless coding).
  - A set of parameters is defined so that images can be coded at a wide range of compression qualities.
  - The compression factor is around 20:1.
  - It is a symmetric coding system.
  - It is part of other standards for compressing video sequences (MPEG and H.26x).
JPEG coding (steps)

JPEG coding in lossy sequential mode:
- Source coding: original image (RGB) → block preparation → DCT → quantization (with table)
- Entropy coding: run-length → Huffman (with table) → coded image

Step 1: Image preparation.
- JPEG does not define the format of the original image.
  - It could be RGB, YUV, YIQ, YCrCb, etc.
- The image is converted to YCbCr with 4:2:0 color reduction (sub-sampling).
  - E.g.: RGB 640x480 (VGA) → Y (640x480), Cb and Cr (320x240)
- The image is divided into blocks of 8x8 elements.
  - Previous example: 4800 Y blocks, 1200 Cb blocks and 1200 Cr blocks.
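The block counts of the VGA example can be checked with a short calculation. The helper name `count_blocks` is an assumption; rounding up covers image sizes that are not multiples of 8, a case the slide does not discuss.

```python
import math


def count_blocks(width, height, block=8):
    """Number of block x block tiles needed to cover a plane."""
    return math.ceil(width / block) * math.ceil(height / block)


y_blocks = count_blocks(640, 480)    # luminance plane of a VGA image
cb_blocks = count_blocks(320, 240)   # each 4:2:0 chroma plane
```

With 4:2:0 subsampling the two chroma planes together need only half as many blocks as the luminance plane.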
JPEG coding: DCT transform

Step 2: Discrete cosine transform (DCT).
- It transforms from the amplitude domain to the frequency domain.
  - The highest frequency components are candidates for removal (visual perception).
- The transform is applied to each 8x8 block, giving the associated matrix of DCT coefficients.
  - Component (0,0): the DC level of the block (mean).

[Figure: a block of amplitudes over (x, y) is mapped by the DCT to a block of DCT coefficients over (Fx, Fy)]
JPEG coding: DCT transform (II)

DCT-1D (vector of N elements)

Forward transform:
$$S(u) = \sqrt{\frac{2}{N}}\; C(u) \sum_{x=0}^{N-1} s(x)\, \cos\left[\frac{(2x+1)\,u\,\pi}{2N}\right]$$

Inverse transform:
$$s(x) = \sqrt{\frac{2}{N}} \sum_{u=0}^{N-1} C(u)\, S(u)\, \cos\left[\frac{(2x+1)\,u\,\pi}{2N}\right]$$

with $C(u) = 1/\sqrt{2}$ if $u = 0$, and $C(u) = 1$ if $u \geq 1$.

DCT-2D (matrix of NxN elements)

Forward transform:
$$DCT(i,j) = \frac{1}{\sqrt{2N}}\; C(i)\, C(j) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} pixel(x,y)\, \cos\left[\frac{(2x+1)\,i\,\pi}{2N}\right] \cos\left[\frac{(2y+1)\,j\,\pi}{2N}\right]$$

Inverse transform:
$$pixel(x,y) = \frac{1}{\sqrt{2N}} \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} C(i)\, C(j)\, DCT(i,j)\, \cos\left[\frac{(2x+1)\,i\,\pi}{2N}\right] \cos\left[\frac{(2y+1)\,j\,\pi}{2N}\right]$$
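A direct, unoptimized sketch of the 2D transform pair may help when reading the formulas. It is written for clarity rather than speed and assumes N = 8, where the factor 1/sqrt(2N) equals 1/4 and the pair is exactly invertible; the names `c`, `dct2` and `idct2` are illustrative.

```python
import math


def c(k):
    """Normalization term: 1/sqrt(2) for the zero frequency, 1 otherwise."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def dct2(block):
    """Forward 2D DCT of an N x N block (intended for N = 8)."""
    n = len(block)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * i * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * j * math.pi / (2 * n)))
            out[i][j] = c(i) * c(j) * total / math.sqrt(2 * n)
    return out


def idct2(coeffs):
    """Inverse 2D DCT of an N x N block of coefficients (intended for N = 8)."""
    n = len(coeffs)
    out = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            total = 0.0
            for i in range(n):
                for j in range(n):
                    total += (c(i) * c(j) * coeffs[i][j]
                              * math.cos((2 * x + 1) * i * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * j * math.pi / (2 * n)))
            out[x][y] = total / math.sqrt(2 * n)
    return out
```

For a flat 8x8 block with every sample equal to 10, only the DC coefficient is non-zero and it equals 80, that is, 8 times the block mean.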
JPEG coding: quantization

Step 3: Quantization.
- The least representative DCT coefficients are eliminated (lossy transformation).
- Each coefficient of the 8x8 matrix is divided by a value stored in a table (quantization table).
  - The standard suggests two tables: one for the Y component and another for the Cb and Cr components.
  - These tables can be scaled with another parameter Q, which lets us adjust the required compression ratio.

DCT coefficients:
150   70   38   16    4    0    1    0
 88   56   22    9    2    0    0    0
 21   34   12    4    0    0    0    0
  4    6    3    7    0    1    0    0
  1    0    5    0    2    0    0    0
  0    1    0    0    0    0    0    0
  0    0    0    0    0    0    0    0
  0    0    0    0    0    0    0    0

Quantization table:
  1    1    2    4    8   16   32   64
  1    1    2    4    8   16   32   64
  2    2    2    4    8   16   32   64
  4    4    4    4    8   16   32   64
  8    8    8    8    8   16   32   64
 16   16   16   16   16   16   32   64
 32   32   32   32   32   32   32   64
 64   64   64   64   64   64   64   64

Quantized DCT coefficients:
150   70   19    4    0    0    0    0
 88   56   11    8    0    0    0    0
 10   17    6    1    0    0    0    0
  1    1    1    2    0    0    0    0
  0    0    0    0    0    0    0    0
  0    0    0    0    0    0    0    0
  0    0    0    0    0    0    0    0
  0    0    0    0    0    0    0    0
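The division step can be sketched as follows. The slide does not state which rounding rule it uses; truncation toward zero is an assumption made here, chosen because it reproduces the first eight coefficients of the example (150, 88, 21, 4, 1, 0, 0, 0 with steps 1, 1, 2, 4, 8, 16, 32, 64). The function names are illustrative.

```python
def quantize(coeffs, steps):
    """Divide each coefficient by its step, truncating toward zero (assumed rule)."""
    return [int(value / step) for value, step in zip(coeffs, steps)]


def dequantize(levels, steps):
    """Decoder side: multiply back. The discarded remainder is lost for good."""
    return [level * step for level, step in zip(levels, steps)]


coeffs = [150, 88, 21, 4, 1, 0, 0, 0]
steps = [1, 1, 2, 4, 8, 16, 32, 64]
levels = quantize(coeffs, steps)
restored = dequantize(levels, steps)
```

Here 21 becomes 10 and is restored as 20, while the coefficient 1 (step 8) becomes 0 and disappears: this is where the loss of the lossy mode comes from.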
JPEG coding: entropy coding

Step 4: DPCM coding of the DC component of each block.
- Successive blocks have a very similar mean value.

Step 5: Run-length coding of all the components of a block.
- A zig-zag scan is used in order to group all the null components together.

Quantized DCT coefficients:
150   70   19    4    0    0    0    0
 88   56   11    8    0    0    0    0
 10   17    6    1    0    0    0    0
  1    1    1    2    0    0    0    0
  0    0    0    0    0    0    0    0
  0    0    0    0    0    0    0    0
  0    0    0    0    0    0    0    0
  0    0    0    0    0    0    0    0

Zig-zag scan:
150-70-88-10-56-19-4-11-17-1-0-1-6-8-0-0-0-1-1-0-0-0-0-0-2-..(39 0's)

Run-length coded (A<value>/<count> stands for a run of <count> equal values):
150-70-88-10-56-19-4-11-17-1-0-1-6-8-A0/3-A1/2-A0/5-2-A0/39
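The run-length notation of this example can be reproduced with a small helper. This is a sketch of the slide's illustrative notation (runs of two or more equal values written as A<value>/<count>), not of the actual JPEG run-length format; `run_length` and `min_run` are names introduced here.

```python
def run_length(seq, min_run=2):
    """Replace every run of at least min_run equal values by A<value>/<count>."""
    out = []
    i = 0
    while i < len(seq):
        j = i
        while j < len(seq) and seq[j] == seq[i]:
            j += 1                      # extend the run of equal values
        run = j - i
        if run >= min_run:
            out.append(f"A{seq[i]}/{run}")
        else:
            out.append(str(seq[i]))
        i = j
    return "-".join(out)


# Zig-zag sequence of the example: 25 explicit values followed by 39 zeros.
zigzag = [150, 70, 88, 10, 56, 19, 4, 11, 17, 1, 0, 1, 6, 8,
          0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 2] + [0] * 39
encoded = run_length(zigzag)
```

The 64 values shrink to 19 tokens, most of the gain coming from the final run of 39 zeros that the zig-zag scan gathers at the end.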
JPEG coding: entropy coding (II)

Step 6: Statistical VLC coding: Huffman.
- The Huffman algorithm is applied to the output of the previous step to compress the information even further.
- The result of this step is what has to be sent or stored.

JPEG decoding
- It consists of performing the inverse process:
  110001110011100010... → Huffman decoder → run-length decoder → zig-zag ordering → inverse quantization → inverse DCT
JPEG coding: real example (quantization)

Block of samples (pixels):
 40   44   47   40   40   55   79   75
 44   52   40   47   40   48   67   79
 52   55   36   67   63   62   52   72
 68   45   56   60   52   55   36   60
 62   48   56   48   40   36   47   62
 47   67   40   55   55   40   36   62
 36   56   23   67   62   44   49   47
 48   55   36   55   52   47   47   36

DCT → block of transformed samples:
411  -18   14   -8   24  -10  -14  -18
 20  -34   27   -9  -11   11   14    7
-11  -23   -1    5  -19    4  -20   -2
 -8   -5   14  -14   -8   -3   -3    9
 -3    9    7    2  -10   17   18   16
  3   -2  -17    8    7   -3    1   -8
  8    1   -2    3   -2   -7   -1   -2
  1   -8   -4    2    2    3   -7    2

Quantization → block of quantized samples:
102   -2    1    0    1    0    0   -1
  2   -4    2    0    0    0    0    0
 -1   -2    0    0   -1    0   -1    0
  0    0    1   -1    0    0    0    0
  0    0    0    0    0    0    0    0
  0    0   -1    0    0    0    0    0
  0    0    0    0    0    0    0    0
  0    0    0    0    0    0    0    0

Inverse quantization + IDCT → recovered block of samples:
 39   49   38   47   45   55   70   76
 50   49   38   46   45   57   65   72
 54   46   43   55   54   60   53   60
 58   47   50   59   55   57   43   55
 64   52   51   54   45   48   41   60
 56   52   51   55   44   45   42   55
 43   50   47   58   51   49   43   42
 42   50   40   53   50   51   48   39
JPEG coding: real example I (RLE+VLC)

RLE+VLC coding of the quantized coefficients (the block of quantized samples of the previous example).

Table for the DC:
Number of bits   Code
0                100
1                00
2                01
...
5                1110
6                1111 0
7                1111 10
8                1111 110
9                1111 1110
10               1111 1111 0
11               1111 1111 1

Step 1: The DC is coded using differential coding (DPCM).
- If the DC of the previous block is 98 → code 102 - 98 = 4.
- It is coded as: number of bits needed (VLC table) + value + sign.
- The DC is coded as: 101 100 0
JPEG coding: real example II (RLE+VLC)

Step 2: The block is scanned in zig-zag order and coded as pairs <run (count of zeros), coefficient>.

First pairs of the block:
Run (number of zeros)    0    0    0    0    0    1   ...
Value                   -2    2   -1   -4    1    2   ...

Table for pairs <Run, Level> (s is the sign bit):
Run      Level   Code
EOB      -       10
0        1       11s
0        2       0100 s
0        3       0010 1s
0        4       0000 110s
0        5       0010 0110 s
...
1        1       011s
1        2       0001 10s
1        3       0010 0101 s
1        4       0000 0011 00s
...
2        1       0101 s
2        2       0000 100s
...
Escape   -       0000 01

There is an escape code:
- 0000 01 RRRR RR NNNN NNNN

Part of the block coded with VLC:
Run    Value   VLC code
0      -2      0100 1
0       2      0100 0
0      -1      111
0      -4      0000 1101
0       1      110
1       2      0001 100
JPEG coding: real example III (RLE+VLC)

Final result of the RLE+VLC coding of the quantized block:

Run (number of zeros)   Value   VLC code
N/A (DC)                 4      101 100 0
0                       -2      0100 1
0                        2      0100 0
0                       -1      111
0                       -4      0000 1101
0                        1      110
1                        2      0001 100
0                       -2      0100 1
5                        1      0001 110
3                        1      0011 10
5                       -1      0001 111
0                       -1      111
2                       -1      0101 1
4                       -1      0011 01
7                       -1      0001 001
EOB                             10

Compression rate:
- Final stream: 1011000010010100011100001101 ... 000100110 (85 bits)
- Bits per pixel: (number of bits / number of pixels) = 85/64 = 1.33 bpp
- Compression factor:
  - compressed size : original size = 85 : (8*8*8) = 85 : 512
  - normalized to 1 : (original size / compressed size) = 1 : (512/85) ≈ 1:6
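The <run, value> pairs listed for this block can be derived mechanically from the quantized block with a zig-zag scan. The sketch below assumes the usual zig-zag order (first step to the right, then down-left along each anti-diagonal); the names `zigzag_order` and `run_level_pairs` are illustrative, and the VLC table lookup itself is not reproduced.

```python
QUANTIZED = [
    [102, -2,  1,  0,  1, 0,  0, -1],
    [  2, -4,  2,  0,  0, 0,  0,  0],
    [ -1, -2,  0,  0, -1, 0, -1,  0],
    [  0,  0,  1, -1,  0, 0,  0,  0],
    [  0,  0,  0,  0,  0, 0,  0,  0],
    [  0,  0, -1,  0,  0, 0,  0,  0],
    [  0,  0,  0,  0,  0, 0,  0,  0],
    [  0,  0,  0,  0,  0, 0,  0,  0],
]


def zigzag_order(n=8):
    """(row, col) positions sorted by anti-diagonal, alternating direction."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    return sorted(cells, key=lambda rc: (rc[0] + rc[1],
                                         rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))


def run_level_pairs(block):
    """AC coefficients as (number of preceding zeros, value), ending with EOB."""
    pairs = []
    run = 0
    for r, c in zigzag_order(len(block))[1:]:   # position 0 is the DC, coded apart
        value = block[r][c]
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")                         # trailing zeros are implied
    return pairs


pairs = run_level_pairs(QUANTIZED)
```

The 63 AC coefficients reduce to 14 pairs plus the end-of-block marker, which is why so few bits are needed for the whole block.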
JPEG coding: real example IV (quality)

The original block of samples (pixels) is compared with the recovered block of samples from the quantization example.

Objective measure of the error: MSE (Mean Square Error)
$$MSE = \frac{1}{N} \sum_{i=1}^{N} \left( s_i - \hat{s}_i \right)^2$$

Objective measure of the quality: PSNR (Peak SNR)
$$PSNR = 10 \log_{10} \frac{255^2}{MSE}$$

Values for the example:
- MSE = 49.53
- PSNR = 31.18 dB
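Both measures are short enough to write out. This is a minimal sketch for 8-bit samples (peak value 255); the function names are assumptions.

```python
import math


def mse(original, recovered):
    """Mean square error between two equally long sequences of samples."""
    return sum((a - b) ** 2 for a, b in zip(original, recovered)) / len(original)


def psnr(mse_value, peak=255):
    """Peak signal-to-noise ratio in dB for a given MSE."""
    return 10 * math.log10(peak ** 2 / mse_value)


quality = psnr(49.53)   # MSE quoted in the example
```

Feeding the MSE of the example (49.53) into `psnr` gives the 31.18 dB quoted above. For a whole 8x8 block, the two blocks would first be flattened into lists of 64 samples.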
3.1 Spatial redundancy: Wavelets.

- Problems with the DCT:
  - Edge effects at block boundaries.
  - Fixed orthonormal basis (cosines at different frequencies).
- Wavelets are based on a set of basis functions derived from a prototype ("mother") function.
  - These functions are scaled and shifted versions of the prototype function.
  - They make it possible to analyze regions of the signal at different levels of detail (resolution).
  - The low and high frequencies of the signal are analyzed independently (in practice the wavelet acts as a filter).
Wavelet decomposition: example.

- Two functions:
  - The generating function of the wavelet space (ψ).
  - The scaling function (φ), which defines the signal in the original space at different scales.

Orthogonal version of the Haar wavelet:

s  = [3, -5, 2, 6]
s1 = [-1, 4] (approximation) + [4, -2] (detail)
s2 = [1.5] (approximation) + [-2.5] (detail, level 2) + [4, -2] (detail, level 1)
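The numbers of this example come from repeated pairwise averages and half-differences, which can be sketched as follows. The averaging form (division by 2 rather than by the square root of 2) is the one that matches the values on the slide; the function names are illustrative.

```python
def haar_step(signal):
    """One level: pairwise averages (approximation) and half-differences (detail)."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_decompose(signal, levels):
    """Apply haar_step repeatedly to the approximation; details listed coarsest first."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.insert(0, detail)
    return approx, details


approx, details = haar_decompose([3, -5, 2, 6], levels=2)
```

Each original sample can be recovered exactly: for a pair with average a and detail d, the two samples are a + d and a - d.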
A family of wavelets: the Haar wavelet (I-III)

Scaling function:
$$\varphi(t) = \begin{cases} 1, & 0 \leq t < 1 \\ 0, & \text{otherwise} \end{cases}$$

Mother wavelet:
$$\psi(t) = \begin{cases} 1, & 0 \leq t < 0.5 \\ -1, & 0.5 \leq t < 1 \\ 0, & \text{otherwise} \end{cases}$$

$$\psi(t) = \varphi(2t) - \varphi(2t - 1)$$

Scaled and shifted versions:
$$\psi_{1,0}(t) = \psi(2t), \qquad \psi_{1,1}(t) = \psi(2t - 1)$$

$$\psi_{n,k}(t) = \sqrt{2^n}\; \psi(2^n t - k), \qquad 0 \leq k \leq 2^n - 1$$

n: scale
k: shift
(The factor in front of ψ in the general expression is a normalization constant; the two first-level functions above are written without it.)
Multiresolution analysis (Mallat)

- It relates the wavelet functions of different levels:

$$\varphi(t) = \sum_k h_k \sqrt{2}\; \varphi(2t - k)$$

$$\psi(t) = \sum_k (-1)^k\, h_{1-k} \sqrt{2}\; \varphi(2t - k)$$

- It makes it possible to define families of wavelet functions from the coefficients {h_k} (used as filters).
- E.g.: Daubechies 4

$$h_0 = \frac{1 + \sqrt{3}}{4\sqrt{2}}, \quad h_1 = \frac{3 + \sqrt{3}}{4\sqrt{2}}, \quad h_2 = \frac{3 - \sqrt{3}}{4\sqrt{2}}, \quad h_3 = \frac{1 - \sqrt{3}}{4\sqrt{2}}$$
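A quick numerical check of the Daubechies 4 coefficients shows why they work as a filter pair. The high-pass filter below is built with the usual finite-length form of the alternating-sign rule, g_k = (-1)^k h_(L-1-k), which is an assumption about indexing rather than something stated on the slide.

```python
import math

SQRT2 = math.sqrt(2)
SQRT3 = math.sqrt(3)

# Low-pass (scaling) coefficients h0..h3 of Daubechies 4
H = [
    (1 + SQRT3) / (4 * SQRT2),
    (3 + SQRT3) / (4 * SQRT2),
    (3 - SQRT3) / (4 * SQRT2),
    (1 - SQRT3) / (4 * SQRT2),
]

# High-pass (wavelet) coefficients: reversed order with alternating signs
G = [(-1) ** k * H[len(H) - 1 - k] for k in range(len(H))]

sum_h = sum(H)                          # sqrt(2): the low-pass filter keeps the mean
energy_h = sum(h * h for h in H)        # 1: unit energy
shift_orthogonality = H[0] * H[2] + H[1] * H[3]   # 0: orthogonal to its shift by 2
sum_g = sum(G)                          # 0: the high-pass filter removes the mean
```

These four identities are what make the filter bank orthogonal: the approximation keeps the low frequencies and the detail keeps what the approximation misses.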
Applying wavelets to 1D signals (I-IV)

[Figures: filter-bank decomposition of a signal s]
- Level 1: s is filtered with H and downsampled by 2 to give the approximation r1, and filtered with G and downsampled by 2 to give the detail d1.
- Level 2: the same pair of filters is applied to r1, giving r2 and d2; d1 is kept.
- Level 3: the same pair of filters is applied to r2, giving r3 and d3; d2 and d1 are kept.
Applying wavelets to images (I-IV)

[Figures: one level of decomposition of an N x M image]
- The rows are filtered with Hf and Gf and downsampled by 2, giving a Low and a High image of N x M/2 each.
- The columns of each of them are filtered with Hc and Gc and downsampled by 2, giving four subbands of N/2 x M/2: LL and LH (from Low), HL and HH (from High).
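One level of this row/column scheme can be sketched with the Haar averaging filters used in the earlier 1D example. The choice of Haar and all function names are assumptions made for illustration; the subband naming follows the figure (LL and LH from the Low image, HL and HH from the High image).

```python
def haar_step(values):
    """Low-pass (averages) and high-pass (half-differences), downsampled by 2."""
    low = [(values[i] + values[i + 1]) / 2 for i in range(0, len(values), 2)]
    high = [(values[i] - values[i + 1]) / 2 for i in range(0, len(values), 2)]
    return low, high


def filter_rows(matrix):
    """Filter every row; returns the Low and High images (half the width)."""
    pairs = [haar_step(row) for row in matrix]
    return [p[0] for p in pairs], [p[1] for p in pairs]


def filter_columns(matrix):
    """Filter every column by transposing, filtering rows and transposing back."""
    transposed = [list(col) for col in zip(*matrix)]
    low_t, high_t = filter_rows(transposed)
    return [list(r) for r in zip(*low_t)], [list(r) for r in zip(*high_t)]


def haar_2d(image):
    """One decomposition level: returns the LL, LH, HL and HH subbands."""
    low, high = filter_rows(image)
    ll, lh = filter_columns(low)
    hl, hh = filter_columns(high)
    return ll, lh, hl, hh


stripes = [[1, 3, 1, 3] for _ in range(4)]   # purely horizontal variation
ll, lh, hl, hh = haar_2d(stripes)
```

For the striped image all the detail ends up in HL, while LL holds a half-size version of the image; applying `haar_2d` again to LL gives the next decomposition level.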
Applying wavelets to images, examples I-IV
[Figures: successive decomposition levels of a sample image]

Wavelets: structure of a codec.
[Figure: block diagram of a wavelet codec]
Wavelets: quantization

- Once the image has been decomposed into subbands, the wavelet coefficients are quantized.
  - Most of the energy is concentrated in the lowest-frequency bands.
  - There is a clear relationship between the coefficients at the same spatial position in the different bands.
- Proposed algorithms:
  - Embedded Zero-tree Wavelet (EZW)
  - Set Partitioning In Hierarchical Trees (SPIHT)
  - Stack-Run
  - Joint quantization: spatial and in frequency.
  - Lower-Tree Wavelet (LTW)
Wavelets: EZW algorithm

- It is based on the definition of coefficient trees.
  - There are coefficients in different subbands that represent the same spatial position in the image.
- In natural images, most of the energy is concentrated in the lowest-frequency bands.
  - The closer a coefficient is to the root node, the larger its magnitude.
  - If a node is below a threshold, its descendants probably are too.
- The algorithm performs successive approximations (bit planes), with two passes per plane.
  - It uses an arithmetic coder.
  - 4 symbols: sp, sn, zr, iz

INITIALIZATION:
- $n = \lfloor \log_2 c_{max} \rfloor$, where $c_{max}$ is the largest coefficient magnitude.
- n identifies the most significant bit plane needed to code the largest coefficient (bit planes are numbered from 0).

DOMINANT PASS (the coefficient map is stored)
1) Coefficients whose most significant bit lies in plane n are labelled as significant (positive, sp, or negative, sn), are added to a list of significant coefficients and are not processed again in later dominant passes.
2) For the remaining coefficients (those that fit in fewer bits):
   - If all their descendants also fit in fewer bits, the coefficient is labelled as the root of a zerotree (zr), which covers the whole tree.
   - Otherwise, the coefficient is labelled as an isolated zero (iz).

SUBORDINATE PASS (the bits of the coefficients are stored)
- Bit n of the coefficients in the list of significant coefficients is coded.

If the desired bit rate has not been reached:
- n is decremented by one (we move on to the next, less significant bit).
- A new iteration is performed (dominant and subordinate passes).
Wavelets: EZW algorithm. Example (I-II)

Wavelet coefficients:
 11    4    4    2
 -6    3    3    0
 -1    2    7    1
  2    1    2    2

Magnitudes in binary:
1011  0100  0100  0010
0110  0011  0011  0000
0001  0010  0111  0001
0010  0001  0010  0010

Iteration 1 (n=3)
- Symbols = {sp,zr,zr,zr}
- Bits = {}
- Significant list = {}
- Number of bits: 4*2+0 = 8

Iteration 2 (n=2)
- Symbols = {sp,zr,zr,zr, sp,sn,iz, zr,zr,zr,zr, sp,zr,zr,zr, sp,zr,zr,zr}
- Bits = {0}
- Significant list = {1011}
- Number of bits: 19*2+1 = 39

Iteration 3 (n=1)
- Symbols = {sp,zr,zr,zr, sp,sn,iz, zr,zr,zr,zr, sp,zr,zr,zr, sp,zr,zr,zr, sp,zr,sp,sp,zr, sp,sp,zr, zr,sp,sp}
- Bits = {0, 1, 0, 1, 0, 1}
- Significant list = {1011, 0100, 0110, 0100, 0111}
- Number of bits: 30*2+6 = 66

(Each symbol takes 2 bits; the second term counts the refinement bits sent so far.)
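The starting bit plane and the coefficients that turn significant at each plane can be computed directly. The sketch below covers only that part of EZW: it does not build the zerotrees or emit the zr/iz symbols. The 4x4 layout of `COEFFS` is read from the example figure and is an assumption; only the set of values matters for the results shown.

```python
COEFFS = [
    [11, 4, 4, 2],
    [-6, 3, 3, 0],
    [-1, 2, 7, 1],
    [2, 1, 2, 2],
]


def initial_plane(coeffs):
    """Index of the most significant bit of the largest magnitude."""
    c_max = max(abs(c) for row in coeffs for c in row)
    return c_max.bit_length() - 1       # floor(log2(c_max)) for c_max >= 1


def newly_significant(coeffs, n):
    """Coefficients whose most significant bit is bit n: 2^n <= |c| < 2^(n+1)."""
    return [c for row in coeffs for c in row if 2 ** n <= abs(c) < 2 ** (n + 1)]


n0 = initial_plane(COEFFS)
first = newly_significant(COEFFS, 3)
second = newly_significant(COEFFS, 2)
third = newly_significant(COEFFS, 1)
```

Plane 3 finds only 11 and plane 2 adds 4, 4, -6 and 7, which matches the five entries of the significant list in the third iteration (1011, 0100, 0110, 0100, 0111). Plane 1 then finds seven more coefficients, one per new sp symbol of that iteration.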
Wavelets: EZW algorithm. When do we stop coding?

- When the bit budget is exhausted.
- There are several options for the decoder to fill in the bits that were not transmitted:
  - Zero fill (e.g., 101xxx → 101000)
  - One fill (e.g., 101xxx → 101111)
  - Reduce the error interval (e.g., 101xxx → 101100)
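The three decoder options differ only in how the missing low-order bits are filled. A minimal sketch, with hypothetical names (`reconstruct`, `strategy`):

```python
def reconstruct(prefix_bits, total_bits, strategy):
    """Complete a partially received magnitude given as a string of bits."""
    missing = total_bits - len(prefix_bits)
    if strategy == "zeros":
        tail = "0" * missing
    elif strategy == "ones":
        tail = "1" * missing
    elif strategy == "midpoint":
        # A 1 followed by zeros lands in the middle of the uncertainty interval.
        tail = ("1" + "0" * (missing - 1)) if missing > 0 else ""
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return int(prefix_bits + tail, 2)


low = reconstruct("101", 6, "zeros")
high = reconstruct("101", 6, "ones")
middle = reconstruct("101", 6, "midpoint")
```

With the prefix 101 and three missing bits, the true value lies between 40 and 47; choosing 44 bounds the reconstruction error to 4, while either extreme can be off by 7.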
Tuning and Optimizing an EZW codec [OLI01]

- We have implemented the EZW algorithm in order to tune it.
- Here we present different implementation alternatives, some of them mentioned by Shapiro and others not.
- All these alternatives can be grouped in four categories:
  - Choosing the best filter
  - Coefficient preprocessing
  - Improvements on the EZW algorithm
  - Improvements related to the adaptive arithmetic encoder
EZW opt&tun: Choosing the best filter

- Choosing a good filter bank is very important to achieve good compaction of the image into the LL band.
- Our implementation shows results similar to the original when using the same image and filter → this validates our implementation.
- Evaluated filters:
  - Daubechies 4-tap
  - Biorthogonal 9/7
  - Villasenor 10/18
  - Adelson 9-QMF (the original one used by Shapiro)

PSNR, Lena image:
Bit rate   Orig    Adel    Vil     B9/7    D4
2          N/A     44.03   44.05   44.18   43.90
1          39.55   39.53   39.63   39.64   39.17
0.5        36.28   36.28   36.49   36.59   35.54
0.25       33.17   33.18   33.43   33.50   32.23

[Figures: results per filter for the Lena and Baboon images]
EZW opt&tun: Coefficient preprocessing

- An important effect appears in all Rate/Distortion curves → a scalloped shape with several peaks.
  - The peaks correspond to the end of a full EZW iteration.
  - EZW performs best when the algorithm exhausts its bit budget at the end of a subordinate pass.

[Figures: Rate/Distortion curves for Lena and Baboon]

EZW opt&tun: Coefficient pre-quantization

- At a given bit rate, performance peaks appear depending on the pre-quantization factor.
  - These peaks shift to the right as the bit rate decreases.
  - With a suitable quantization factor and filter bank, performance can be improved by up to 0.8 dB.
EZW opt&tun: Improvements on the EZW (I)

- Some options can be tuned in the algorithm:
  - No swap: the subordinate and dominant passes have different gradients → bits from the subordinate pass are more valuable.
  - Swap: the subordinate pass is performed first, and then the dominant pass.
  - Reduce & no swap: the decoder must predict the bits that the coder could not send. In order to reduce the uncertainty interval, it predicts that the most significant missing bit is 1 and the rest are 0.
  - Reduce & swap: performing the swap is not actually significant when the uncertainty interval is reduced.

EZW opt&tun: Improvements on the EZW (II)

- Other options:
  - Scan order vs. Morton order: Morton order scans the coefficients in small groups, improving the adaptivity achieved by the arithmetic encoder.
  - No_skip_first_bit_1 vs. Skip: the MSB of a significant coefficient is always 1 and may be skipped.
  - Reorder vs. No reorder: Shapiro proposes sorting the coefficients in the subordinate pass according to their magnitude.
    - Poor benefit.
    - The encoder and decoder become much slower.
EZW opt&tun: Improvements on the arithmetic coder

- Use four histograms vs. do not use four histograms
  - Little improvement from using four histograms.
- Restart vs. do not restart
  - All histograms can be restarted at the end of a full pass, to achieve better adaptivity.
- Use three symbols vs. use four
  - The last subbands do not have offspring, so a three-symbol alphabet without the isolated-zero symbol can be used in these bands (no improvement).

EZW opt&tun: Conclusions

- We have evaluated 15 alternatives in different coder stages.
- We have shown that it is possible to get better results than those published by Shapiro:
  - Around 0.4 dB under the same conditions.
  - Up to 0.8 dB more by using more efficient filter banks.
3.1 Spatial redundancy: JPEG 2000

- ISO/ITU standard ('01) that improves on the earlier JPEG.
  - It improves compression → it uses the wavelet transform.
  - There are lossy and lossless versions.
  - It supports multiresolution (thanks to wavelet analysis).
  - It is embedded:
    - Progressive decoding
    - SNR scalable
  - It allows the image to be divided into blocks of any size (not only 8x8), even a single block.
  - It allows Regions of Interest (ROI) to be coded.
  - It is more robust against errors.
JPEG 2000: Motivation

- Drawbacks of EZW:
  - Because of the tree structures, errors propagate from some subbands to others → lower robustness against errors.
  - Because of the progressive coding by bit planes, there is no spatial scalability (only quality scalability → SNR).
- JPEG 2000 corrects these drawbacks:
  - It codes the different subbands by blocks and independently, without using trees (Embedded Block Coding with Optimized Truncation algorithm, EBCOT).
    - A new algorithm is used to make up for the loss of efficiency caused by not exploiting the redundancy between subbands.
  - A coefficient reordering stage is introduced, which allows every kind of scalability (since there are no trees, the coefficients are reordered more easily).
JPEG 2000: Structure

- Structure of a JPEG 2000 encoder:
  - An image can be divided into blocks called tiles (normally there is only one).
  - The JPEG 2000 algorithm is applied to each tile:
    - Its wavelet transform is computed.
    - The coefficients are quantized.
    - The EBCOT algorithm is applied by blocks (of 32x32 or 64x64 coefficients) (tier 1 coding).
    - The bit stream is organized (tier 2 coding) according to the desired scalability.
      - Spatial scalability: the lowest-frequency subbands are placed first.
      - SNR or quality scalability: the bit planes of the largest-magnitude coefficients are coded first.
JPEG 2000: Structure (II)

[Figure: code-blocks 1 to 5, each coded from bit-plane 5 down to bit-plane 1 (LSB), with a significance propagation pass, a magnitude refinement pass and a clean up pass per bit-plane; a dotted line marks the part of the encoded bit-stream kept for a desired target bit rate]

Fig. 4.3: Example of block coding in JPEG2000. In tier 1 coding, each code-block is completely encoded bit-plane by bit-plane, with three passes per bit-plane (namely significance propagation, magnitude refinement and clean up passes). Only part of each code-block is included in the final bitstream. In this figure, the truncation point for each code-block is pointed out with a dotted line. These truncation points are computed with an optimization algorithm in tier 2 coding, in order to match the desired bit rate with the lowest distortion.
JPEG 2000: JPEG vs. JPEG 2000 comparison

Objective comparison
[Figure: objective comparison of JPEG and JPEG 2000]

Subjective comparison
[Figures: the same image in three versions]
- Original: 700 kbytes
- Compressed with JPEG: 6.1 kbytes (1:115)
- Compressed with JPEG 2000: 6.1 kbytes (1:115)
LTW: A fast Multiresolution Image Coding Algorithm

- Our proposal:
  - Coefficients are encoded without performing one loop scan per bit plane.
  - Quantization is performed following two strategies:
    - Finer quantization: scalar uniform quantization of the wavelet coefficients.
    - Coarser quantization: removing bit planes from all the coefficients.
      - rplanes: number of least significant bits to be removed.
  - A coefficient is significant if it is different from 0 after discarding the first rplanes bits ( |c| ≥ 2^rplanes ).
LTW: Algorithm I

Scan all the subbands.
For each coefficient in the subband:
- Get the number of bits needed to represent it (nbits).
- If it is significant (nbits > rplanes):
  - Arithmetic_output nbits
  - Output significant bits (quantized value) and sign
- Else (it is not significant):
  - Arithmetic_output LOWER
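LTW: Example of Algorithm I

Input → quantized wavelet coefficients, with rplanes=2
[Figure: a subband whose significant coefficients are 47 (6 bits), 4 and -8; the remaining coefficients need at most 2 bits]

Output → encoded bitstream:
6,+,{011}   L   L   3,+,{}   L   L   4,-,{0}   L   L   L   L   L

For instance, 47 is 101111 in binary (6 bits): dropping the 2 least significant bits leaves 1011, and since the leading 1 is implied by the nbits symbol, only 011 is output.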
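Algorithm I maps each coefficient to either a LOWER symbol or an (nbits, sign, bits) triple. The sketch below shows that mapping only; arithmetic coding of the symbols is left out, and the input list is illustrative (the significant values 47, 4 and -8 come from the example, the insignificant ones and their order are assumptions).

```python
def ltw_symbols(coeffs, rplanes):
    """Symbols of LTW Algorithm I for a sequence of quantized coefficients."""
    out = []
    for value in coeffs:
        nbits = abs(value).bit_length()          # bits needed for the magnitude
        if nbits > rplanes:                      # significant coefficient
            kept = abs(value) >> rplanes         # drop the rplanes lowest bits
            bits = format(kept, "b")[1:]         # leading 1 is implied by nbits
            sign = "+" if value >= 0 else "-"
            out.append((nbits, sign, bits))
        else:
            out.append("L")                      # LOWER
    return out


symbols = ltw_symbols([47, 1, -1, 4, 2, 3, -8, 0], rplanes=2)
```

The three significant coefficients produce the triples (6, +, 011), (3, +, empty) and (4, -, 0), as in the encoded bitstream of the example; every other coefficient costs a single LOWER symbol.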
LTW: Characteristics of Algorithm I

- The encoding process is spatially scalable (Mallat decomposition) but it is not SNR scalable.
- No inter-subband dependency:
  - Robustness is possible by using synchronization marks.
  - Lower compression performance.
- One arithmetic symbol is always encoded for every wavelet coefficient:
  - Three quarters of the execution time is spent in the arithmetic encoder stage.
  - So, we propose an evolution of this algorithm that works with coefficient trees and avoids some of these drawbacks.
LTW: Two-pass block coding using lower-trees

- We use coefficient trees as a fast and efficient method of grouping coefficients (quad-tree).
- The quantization method is the same as in Algorithm I.
- A coefficient is a lower-tree root if this coefficient and all its descendants are not significant (i.e., |c| < 2^rplanes).
  - The root coefficient is labeled as LOWER (L).
  - The rest of the coefficients in the tree are LOWER_COMPONENTS (*) and are never stored.
- If a coefficient is not significant, but any descendant is significant, that coefficient is ISOLATED_LOWER (I) and does not form a tree.
LTW: Algorithm II

- It is a two-pass algorithm.
- First pass: wavelet coefficients are labeled according to their significance.
  - Subbands are scanned in 2x2 blocks.
  - If none of the coefficients in the block is significant → they are labeled (*).
  - If any coefficient is significant → they are labeled as (L), (I) or with (nbits) according to their significance.
- Second pass: coefficient values are coded in a similar way to Algorithm I.
LTW: Algorithm II example

We want to encode an 8x8 example image

The wavelet coefficient matrix is obtained (from DWT)

rplanes=2, coefficients within interval ] 22, -(22)[ are
absolutely quantized
We focus on the first
level subbands
51 42 -9
In this block “-4” is
significant (needs 3 bits
to be represented)


The rest are not
significant (LOWER)

2
4
4
0
-1
25 17 10 11
3
1
0
2
12 3
3
-2
2
-2
-5
3
-9
-3
3
-3
0
3
-1
2
-4
1
1
2
0
2
1
3
2
-3
0
2
1
-1
-1
-2
1
3
2
1
1
2
-3
1
-2
-3
3
-12 2
0
2
1
83
These blocks are not significant and their
coefficients are labeled as LOWER_COMPONENT

51 42 -9
2
4
4
0
-1
25 17 10 11
3
1
0
2
12 3
3
-2
2
-2
-5
3
-9
-3
3
-3
0
3
-1
2
3
L
L
L
1
2
0
2
1
3
0
2
1
-1
-1
-2
1
3
2
1
1
2
-3
1
-2
-3
3
-12 2
0
2
1
84

In this block we have a significant coefficient
51 42 -9
2
4
4
0
-1
25 17 10 11
3
1
0
2
12 3
3
-2
2
-2
-5
3
-9
-3
3
-3
0
3
-1
2
3
L
L
L
*
*
0
2
1
3
*
*
1
-1
-1
-2
*
*
2
1
1
2
-3
1
*
*
3
-12 2
0
2
1
85
All these blocks in this subband
are not significant

51 42 -9
2
4
4
0
-1
25 17 10 11
3
1
0
2
12 3
3
-2
2
-2
-5
3
-9
-3
3
-3
0
3
-1
2
3
L
L
L
*
*
0
2
1
3
*
*
1
-1
-1
-2
*
*
2
-3
1
*
L
4
1
*
L
L
2
0
2
1
86
These subbands have significant
coefficients

51 42 -9
2
4
4
0
-1
25 17 10 11
3
1
0
2
12 3
3
-2
2
-2
-5
3
-9
-3
3
-3
0
3
-1
2
3
L
L
L
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
L
4
*
*
L
L
*
*
*
*
87
These subbands have
significant coefficients
While these subbands
are not significant


3
3
0
-1
25 17 10 11
L
L
0
2
12 3
3
-2
2
-2
-9
-3
3
-3
0
3
3
L
L
L
3
L
L
L
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
L
4
*
*
L
L
*
*
*
*
51 42 -9
2
88
Now we focus on the
following subband level

This coefficient and
all its descendants are
not significant: LOWER
symbol

3
3
*
*
25 17 10 11
L
L
*
*
12 3
3
-2
*
*
-9
-3
3
-3
*
*
3
L
L
L
3
L
L
L
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
L
4
*
*
L
L
*
*
*
*
51 42 -9
2
89
This coefficient is not significant, but there is a
significant descendant: ISOLATED_LOWER symbol

3
3
*
*
25 17 10 11
L
L
*
*
12 L
3
-2
*
*
-9
-3
3
-3
*
*
3
L
L
L
3
L
L
L
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
*
L
4
*
*
L
L
*
*
*
*
51 42 -9
2
90
This coefficient is significant: we use a symbol indicating the number of bits needed to represent it, an nbits symbol.
[Figure: coefficient matrix with the labels assigned so far]
This coefficient is significant, but its descendants are
not significant: a special nbits symbol indicates that it is
the root of a special lower-tree (the root is significant)

[Figure: coefficient matrix with the labels assigned so far]
This block and all its descendants are not significant, so the tree can continue growing.
[Figure: coefficient matrix with the labels assigned so far]
We can finish this first step by calculating the rest of the symbols in the same way.
[Figure: coefficient matrix with the symbols computed in the first step]
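The cases walked through in this first step (LOWER, ISOLATED_LOWER, nbits, and the special nbits symbol for a significant lower-tree root) can be summarized in a short sketch. This is an illustration under stated assumptions, not the authors' implementation: the threshold `RPLANES = 2`, the function names, the `"4L"`-style spelling of the special symbol, and the sample blocks are all hypothetical.

```python
from typing import List

RPLANES = 2  # assumed significance threshold in bit planes (hypothetical value)

def nbits(coeff: int) -> int:
    """Bits needed to represent |coeff|."""
    return abs(coeff).bit_length()

def classify(coeff: int, subtree_insignificant: bool) -> str:
    """Symbol for one coefficient, given whether all its descendants are insignificant."""
    if nbits(coeff) > RPLANES:
        # Significant: plain nbits symbol, or the special variant marking a lower-tree root.
        return f"{nbits(coeff)}L" if subtree_insignificant else str(nbits(coeff))
    return "LOWER" if subtree_insignificant else "ISOLATED_LOWER"

def classify_block(coeffs: List[int], subtree_insignificant: List[bool]) -> List[str]:
    """Symbols for a block; a fully insignificant block lets the tree keep growing."""
    if all(subtree_insignificant) and all(nbits(c) <= RPLANES for c in coeffs):
        return ["LOWER_COMPONENT"] * len(coeffs)
    return [classify(c, s) for c, s in zip(coeffs, subtree_insignificant)]

# Illustrative blocks (values and descendant flags are made up for the example).
print(classify_block([12, 3, 3, -2], [False, False, True, True]))
print(classify_block([3, -3, 0, 3], [True, True, True, True]))
```

The first block mixes a significant coefficient with insignificant ones, so each coefficient gets its own symbol. The second block is insignificant together with all its descendants, so it is absorbed into a growing lower-tree and needs no symbols of its own.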
In a second pass, all the symbols except (*) are encoded as in algorithm I.
This is the subband scan order for the second pass.
[Figure: final symbol map showing the subband scan order for the second pass]
LTW: Comparison with other wavelet image coders
We use real implementations in this evaluation.
All the implementations are in standard C++ (no assembly language), compiled using the same optimization level:
  SPIHT: Source code available from the authors
  JPEG 2000: Jasper v1.6 (ISO/IEC 15444-5 reference software)
  LTW: Implementation available at GATCOM web site
Selected images:
  Standard Lena (8 bpp, 512x512): We can compare our algorithm with practically all the published algorithms
  Café image from the JPEG 2000 test set: A typical highly detailed picture taken with a high-quality 5-megapixel digital camera
LTW: Lena Rate/PSNR comparison
Our Lower-Tree Wavelet (LTW) encoder shows the best results of all the coders compared, including the recently released JPEG 2000 standard.

Codec\rate (bpp)   EZW     SPIHT   Stack Run   Embedded Run-length   JPEG 2000   LTW
2                  n/a     45.07   n/a         n/a                   44.62       45.46
1                  39.55   40.41   n/a         40.28                 40.31       40.50
0.5                36.28   37.21   36.79       37.09                 37.22       37.35
0.25               33.17   34.11   33.63       34.01                 34.04       34.31
0.125              30.23   31.10   n/a         n/a                   30.84       31.27
PSNR values in dB.
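The slides report rate and PSNR without defining them. As a reminder, the sketch below uses the usual definitions: PSNR as 10·log10(peak² / MSE) with a peak of 255 for 8-bit images such as Lena, and rate as compressed bits divided by the number of pixels. The function names and the flat-list image representation are assumptions made for the example, not part of the evaluated coders.

```python
import math
from typing import Sequence

def psnr(original: Sequence[int], decoded: Sequence[int], peak: int = 255) -> float:
    """PSNR in dB between two equally sized sequences of pixel values."""
    if len(original) != len(decoded) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)

def bits_per_pixel(compressed_bytes: int, width: int, height: int) -> float:
    """Rate in bits per pixel for a compressed file of the given size."""
    return 8 * compressed_bytes / (width * height)

# A 512x512 image compressed to 32768 bytes corresponds to a rate of 1 bpp.
print(bits_per_pixel(32768, 512, 512))
# Every pixel off by one grey level gives an MSE of 1.
print(round(psnr([10, 20, 30, 40], [11, 21, 31, 41]), 2))
```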
LTW: Café Rate/PSNR comparison
In high-detail images, the advantage over JPEG 2000 is smaller: more detail leads to fewer insignificant coefficients in the first levels, and thus fewer lower-trees.

Codec\rate (bpp)   SPIHT   JPEG 2000   LTW
2                  38.91   39.09       39.11
1                  31.74   32.04       32.03
0.5                26.49   26.80       26.85
0.25               23.03   23.12       23.24
0.125              20.67   20.74       20.76
PSNR values in dB.
LTW: Café Execution time comparison
Our encoder:
  Up to 8 times faster than J2K, and 2.5 times faster than SPIHT
Our decoder:
  Up to 2.5 times faster than J2K, and 7 times faster than SPIHT
Similar results with Lena.

              café coding                      café decoding
codec\rate    SPIHT    JPEG 2000   LTW         SPIHT    JPEG 2000   LTW
2             4368.7   7393.1      1663.5      5375.4   2373.0      1505.6
1             2400.1   6907.6      1203.7      3514.0   1475.2      911.6
0.5           1399.3   6543.9      938.6       2619.7   991.8       569.5
0.25          889.8    6246.2      782.8       2193.0   763.6       366.6
Time in millions of CPU cycles (coding/decoding only, wavelet transform not included)
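The speedup figures can be related to the table with a few lines of arithmetic. The sketch below copies the Café timings listed above and divides each reference codec's time by the LTW time at the same rate. The dictionary layout and the `speedups` helper are illustrative. Only the four rates shown in the table are covered, so the maxima computed here need not coincide with every "up to" figure quoted on the slide.

```python
# Times in millions of CPU cycles, copied from the Café table (rates 2, 1, 0.5, 0.25).
CODING = {
    "SPIHT": [4368.7, 2400.1, 1399.3, 889.8],
    "JPEG 2000": [7393.1, 6907.6, 6543.9, 6246.2],
    "LTW": [1663.5, 1203.7, 938.6, 782.8],
}
DECODING = {
    "SPIHT": [5375.4, 3514.0, 2619.7, 2193.0],
    "JPEG 2000": [2373.0, 1475.2, 991.8, 763.6],
    "LTW": [1505.6, 911.6, 569.5, 366.6],
}

def speedups(times: dict, baseline: str, target: str = "LTW") -> list:
    """Per-rate ratio baseline_time / target_time (values above 1 mean target is faster)."""
    return [b / t for b, t in zip(times[baseline], times[target])]

for stage, table in (("coding", CODING), ("decoding", DECODING)):
    for codec in ("SPIHT", "JPEG 2000"):
        ratios = [round(r, 2) for r in speedups(table, codec)]
        print(f"{stage}: LTW vs {codec}: {ratios}")
```

For the rates listed, the largest coding speedup over JPEG 2000 comes out just under 8, at the lowest rate in the table.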
LTW: Conclusions
A new wavelet image coder based on the construction and efficient coding of wavelet lower-trees (LTW) has been presented.
Its compression performance is within the state of the art:
  It improves on SPIHT by 0.2-0.4 dB, and on J2K with Lena by 0.35 dB on average.
The main contribution is its lower complexity:
  It encodes an image up to 8 times faster than J2K and 2.5 times faster than SPIHT.
  There is no memory overhead (in-place processing of the wavelet coefficients): it only needs memory to store the source image, with no extra lists.
4. Conclusions
Images are captured by the CCDs in digital cameras.
We use different types of YCbCr coding with subsampling of the color information.
Image compression removes spatial redundancy:
  JPEG: Uses the DCT, quantizes the resulting coefficients, and stores them using RLE + Huffman.
  Wavelets: More efficient. They allow the basis function to be chosen (not necessarily cosines) and the whole image to be decomposed.
    EZW: Stores the wavelet coefficients using tree structures and successive approximations.
    JPEG 2000: Stores the wavelet coefficients in blocks using the EBCOT algorithm, achieving spatial and SNR scalability and greater robustness to errors.
    LTW: Efficient, with low resource consumption (memory and CPU cycles).
