32 - SASE

Electrocomponentes S.A.
SASE 2013
SAM4/Cortex-M4 from ATMEL: the solution for digital signal processing
Agenda
• Microcontrollers/microprocessors vs. DSPs.
• Typical algorithms.
• Applications.
• Architecture comparison.
– Cortex-M3 architecture.
– Cortex-M4 architecture.
• Product line.
• Programming tools and development environments.
• Demonstrations.
What is a microprocessor?
• A microprocessor (uP) is an integrated circuit that contains a central processing unit (CPU) and the logic needed to connect it to other devices, such as memories and input/output (I/O) ports. It is an "OPEN" type of system.
What is a microcontroller?
• A microcontroller (uC) is an integrated circuit that offers the capabilities of a small computer: it contains a processor, memories, and several peripherals (I/O ports, A/D converters, D/A converters, etc.).
What is a DSP?
• A digital signal processor (DSP) is a system based on a microprocessor or microcontroller whose instruction set and hardware are optimized for applications that require numerical operations at very high speed.
• Harvard architecture
• Instruction cache
• Direct memory access (DMA)
• Data address generators (DAG)
• Multiply-accumulate unit (MAC)
• Floating-point unit (FPU)
Cortex-M4 / SAM4
ARM Cortex-M4 Features
ISA Support: Thumb® / Thumb-2
DSP Extensions: single-cycle 16, 32-bit MAC; single-cycle dual 16-bit MAC; 8, 16-bit SIMD arithmetic; hardware divide (2-12 cycles)
Floating Point Unit: single-precision floating point unit, IEEE 754 compliant
Pipeline: 3-stage + branch speculation
Performance Efficiency: 3.40 CoreMark/MHz*; without FPU: 1.25 / 1.52 / 1.91 DMIPS/MHz**; with FPU: 1.27 / 1.55 / 1.95 DMIPS/MHz**
Memory Protection: optional 8-region MPU with sub-regions and background region
Interrupts: Non-maskable Interrupt (NMI) + 1 to 240 physical interrupts
Interrupt Priority Levels: 8 to 256 priority levels
Wake-up Interrupt Controller: up to 240 wake-up interrupts
Sleep Modes: integrated WFI and WFE instructions and Sleep On Exit capability; Sleep & Deep Sleep signals; optional Retention Mode with ARM Power Management Kit
Bit Manipulation: integrated instructions & bit banding
Debug: optional JTAG & Serial-Wire Debug ports; up to 8 breakpoints and 4 watchpoints
Trace: optional Instruction Trace (ETM), Data Trace (DWT), and Instrumentation Trace (ITM)
Algorithms
• DSP operations: the MAC is the key operation
– Most operations are dominated by MACs
– These can be on 8-, 16- or 32-bit operands
• FIR filters
– Data communications
– Smoothing data
– Echo cancellation (adaptive versions)
$y[n] = \sum_{k=0}^{N-1} h[k]\, x[n-k]$
• IIR filters
– Audio equalization
– Motor control
$y[n] = b_0 x[n] + b_1 x[n-1] + b_2 x[n-2] + a_1 y[n-1] + a_2 y[n-2]$
• FFT
– Audio compression
– Noise removal
– Spread spectrum communication
$Y[k_1] = X[k_1] + X[k_2]\, e^{-j\omega}$
$Y[k_2] = X[k_1] - X[k_2]\, e^{-j\omega}$
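The three recurrences on this slide can be written out directly to show where the multiply-accumulate operations come from. The following is a minimal reference sketch in plain Python, not optimized code for the Cortex-M4; the function names `fir`, `biquad` and `butterfly` are illustrative, and samples before the start of the input are assumed to be zero.

```python
import cmath


def fir(h, x):
    """FIR filter: y[n] = sum over k of h[k] * x[n-k], with x[m] = 0 for m < 0."""
    y = []
    for n in range(len(x)):
        acc = 0  # accumulator: each tap contributes one multiply-accumulate
        for k in range(len(h)):
            if n - k >= 0:
                acc += h[k] * x[n - k]
        y.append(acc)
    return y


def biquad(b, a, x):
    """Second-order IIR section, using the sign convention shown on the slide:
    y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] + a1*y[n-1] + a2*y[n-2]."""
    b0, b1, b2 = b
    a1, a2 = a
    x1 = x2 = y1 = y2 = 0  # delayed inputs and outputs, initially zero
    y = []
    for xn in x:
        yn = b0 * xn + b1 * x1 + b2 * x2 + a1 * y1 + a2 * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def butterfly(xk1, xk2, omega):
    """Radix-2 FFT butterfly: returns (X[k1] + X[k2]*w, X[k1] - X[k2]*w), w = e^{-j*omega}."""
    w = cmath.exp(-1j * omega)
    return xk1 + xk2 * w, xk1 - xk2 * w
```

Feeding a unit impulse into `fir` returns the coefficients themselves, which is a quick sanity check for any filter implementation.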
Applications
• Audio processing
– High-fidelity music playback.
– Telecommunications/voice.
– Speech synthesis and recognition.
• Signal analysis
– Vibration measurement.
– Seismic signal measurement.
– ECG and EEG.
• Control
– Motor control.
– PID controllers.
– Robotics.
• Image processing
– Radar.
– Medical imaging.
ARM7 vs Cortex-M3
Feature | ARM7TDMI | Cortex-M3
Architecture | ARMv4T (von Neumann) | ARMv7-M (Harvard)
ISA Support | Thumb / ARM | Thumb / Thumb-2
Pipeline | 3-stage | 3-stage + branch speculation
Interrupts | FIQ / IRQ | NMI + 1 to 240 interrupts
Interrupt Latency | 24-42 cycles | 12 cycles (6 when tail chaining)
Sleep Modes | None | Integrated (3)
Memory Protection | None | 8-region MPU
Dhrystone | 0.95 DMIPS/MHz (ARM) | 1.25 DMIPS/MHz
Power Consumption | 0.28 mW/MHz | 0.19 mW/MHz
Cortex-M4
• Designed for applications requiring more computational performance
• Cortex-M4 frees CPU resources when digital signal processing tasks are used (fewer active cycles are needed)
• Cortex-M4 features:
– A single-cycle multiply-accumulate (MAC) unit
– Optimized single instruction multiple data (SIMD) instructions and saturating arithmetic instructions
– Optional single-precision floating-point unit (FPU)
[Figure: the Cortex-M4 combines Cortex-M3 strengths (ease of use, C programming, interrupt handling, ultra low power) with DSP features (Harvard architecture, single-cycle MAC, floating point, barrel shifter).]
Cortex-M3 vs Cortex-M4
Feature | Cortex-M3 | Cortex-M4
Architecture | ARMv7-M (Harvard) | ARMv7-M (Harvard)
ISA Support | Thumb / Thumb-2 | Thumb / Thumb-2
DSP Extensions | NA | Single-cycle 16, 32-bit MAC; single-cycle dual 16-bit MAC; 8, 16-bit SIMD arithmetic; hardware divide (2-12 cycles)
Optional Floating Point Unit | NA | Single-precision floating point unit, IEEE 754 compliant
Pipeline | 3-stage + branch speculation | 3-stage + branch speculation
Interrupts | NMI + 1 to 240 interrupts | NMI + 1 to 240 interrupts
Interrupt Latency | 12 cycles (6 when tail chaining) | 12 cycles (6 when tail chaining)
Sleep Modes | Integrated (3) | Integrated (3)
Memory Protection | 8-region MPU | 8-region MPU
Dhrystone | 1.25 DMIPS/MHz | 1.25 DMIPS/MHz
Cortex-M3 vs Cortex-M4
Instruction set
Performance
• Cortex™-M4 (SIMD + FPU) vs. Cortex-M3 processors
– Fixed point: 2x faster
– Floating point: 10x faster
Single Cycle Multiply Accumulate Instructions
• Cortex-M4 features a 32-bit hardware multiply-accumulate (MAC) unit
– Makes digital signal processing more efficient and greatly reduces the consumption of CPU resources
– Capable of performing one operation of up to 32×32+64->64, or two 16×16 operations, in a single cycle
• Main features:
– Wide range of multiply-accumulate instructions
– Choice of 16- or 32-bit multiply and 32- or 64-bit accumulate
– All instructions execute in a single cycle
MAC Instructions
Operation | Instructions
16 x 16 = 32 | SMULBB, SMULBT, SMULTB, SMULTT
16 x 16 + 32 = 32 | SMLABB, SMLABT, SMLATB, SMLATT
16 x 16 + 64 = 64 | SMLALBB, SMLALBT, SMLALTB, SMLALTT
16 x 32 = 32 | SMULWB, SMULWT
(16 x 32) + 32 = 32 | SMLAWB, SMLAWT
(16 x 16) ± (16 x 16) = 32 | SMUAD, SMUADX, SMUSD, SMUSDX
(16 x 16) ± (16 x 16) + 32 = 32 | SMLAD, SMLADX, SMLSD, SMLSDX
(16 x 16) ± (16 x 16) + 64 = 64 | SMLALD, SMLALDX, SMLSLD, SMLSLDX
32 x 32 = 32 | MUL
32 ± (32 x 32) = 32 | MLA, MLS
32 x 32 = 64 | SMULL, UMULL
(32 x 32) + 64 = 64 | SMLAL, UMLAL
(32 x 32) + 32 + 32 = 64 | UMAAL
32 ± (32 x 32) = 32 (upper) | SMMLA, SMMLAR, SMMLS, SMMLSR
(32 x 32) = 32 (upper) | SMMUL, SMMULR
Single Instruction Multiple Data (SIMD)
• Several instructions operate on "packed" data types
– Byte or halfword quantities packed into words
– Allows more efficient access to packed structure types
• SIMD instructions can act on packed data:
– Quad (4 parallel) 8-bit adds or subtracts
– Dual (2 parallel) 16-bit adds or subtracts
• All instructions execute in a single cycle
• SIMD extensions perform multiple operations in one cycle
Sum = Sum + (A x C) + (B x D)
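The expression Sum = Sum + (A x C) + (B x D) corresponds to a dual 16-bit multiply-accumulate on two packed words, the operation class listed as (16 x 16) ± (16 x 16) + 32 = 32. A small bit-level model in Python makes the packing explicit. This is only a behavioral sketch: the helper names are illustrative, the result simply wraps to 32 bits, and the overflow (Q) flag that the real instruction can set is not modeled.

```python
def to_s16(v):
    """Interpret the low 16 bits of v as a signed halfword."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v


def pack16(hi, lo):
    """Pack two signed 16-bit values into one 32-bit word (hi in bits 31..16)."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


def dual_mac(rn, rm, acc):
    """Model of a dual 16x16 MAC: acc + lo(rn)*lo(rm) + hi(rn)*hi(rm), wrapped to 32 bits."""
    lo = to_s16(rn) * to_s16(rm)
    hi = to_s16(rn >> 16) * to_s16(rm >> 16)
    return (acc + lo + hi) & 0xFFFFFFFF


# A = 3, B = -2 packed in one word; C = 4, D = 5 packed in the other.
total = dual_mac(pack16(3, -2), pack16(4, 5), 10)  # 10 + 3*4 + (-2)*5
```

With packed operands, one instruction replaces two multiplies and two additions, which is where the throughput gain for 16-bit filters comes from.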
SIMD Arithmetic Operations
Operation | S (signed) | Q (signed saturating) | SH (signed halving) | U (unsigned) | UQ (unsigned saturating) | UH (unsigned halving)
ADD8 | SADD8 | QADD8 | SHADD8 | UADD8 | UQADD8 | UHADD8
SUB8 | SSUB8 | QSUB8 | SHSUB8 | USUB8 | UQSUB8 | UHSUB8
ADD16 | SADD16 | QADD16 | SHADD16 | UADD16 | UQADD16 | UHADD16
SUB16 | SSUB16 | QSUB16 | SHSUB16 | USUB16 | UQSUB16 | UHSUB16
ASX | SASX | QASX | SHASX | UASX | UQASX | UHASX
SAX | SSAX | QSAX | SHSAX | USAX | UQSAX | UHSAX
ASX: 1. Exchanges halfwords of the second operand register. 2. Adds top halfwords and subtracts bottom halfwords.
SAX: 1. Exchanges halfwords of the second operand register. 2. Subtracts top halfwords and adds bottom halfwords.
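The prefixes in this table differ only in how each 16-bit lane handles overflow and in whether the second operand's halfwords are exchanged first. The sketch below models four signed variants on packed 32-bit words in plain Python; the function names mirror the mnemonics for readability, but this is a behavioral model written for illustration (status flags are not modeled), not the instructions themselves.

```python
def _s16(v):
    """Interpret the low 16 bits of v as a signed halfword."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v


def _lanes(w):
    """Split a 32-bit word into signed (top, bottom) halfwords."""
    return _s16(w >> 16), _s16(w)


def _pack(top, bottom):
    """Pack two halfwords into a word, keeping only 16 bits of each (wraps)."""
    return ((top & 0xFFFF) << 16) | (bottom & 0xFFFF)


def _sat16(v):
    """Clamp v to the signed 16-bit range."""
    return max(-32768, min(32767, v))


def sadd16(a, b):
    """Signed add per lane; each lane wraps on overflow."""
    (at, ab), (bt, bb) = _lanes(a), _lanes(b)
    return _pack(at + bt, ab + bb)


def qadd16(a, b):
    """Signed saturating add per lane; each lane clamps instead of wrapping."""
    (at, ab), (bt, bb) = _lanes(a), _lanes(b)
    return _pack(_sat16(at + bt), _sat16(ab + bb))


def shadd16(a, b):
    """Signed halving add per lane: (x + y) / 2, so the sum cannot overflow."""
    (at, ab), (bt, bb) = _lanes(a), _lanes(b)
    return _pack((at + bt) >> 1, (ab + bb) >> 1)


def sasx(a, b):
    """Exchange halfwords of b, then add the top lanes and subtract the bottom lanes."""
    (at, ab), (bt, bb) = _lanes(a), _lanes(b)
    return _pack(at + bb, ab - bt)


def ssax(a, b):
    """Exchange halfwords of b, then subtract the top lanes and add the bottom lanes."""
    (at, ab), (bt, bb) = _lanes(a), _lanes(b)
    return _pack(at - bb, ab + bt)
```

Adding 0x7FFF0001 and 0x00010002 shows the difference: the wrapping add turns the top lane negative (0x8000), the saturating add pins it at 0x7FFF, and the halving add returns 0x4000.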
Optional Floating Point Unit (FPU)
• IEEE 754 standard compliance
• Single-precision floating-point math is key to some algorithms
• Add, subtract, multiply, divide, MAC and square root
• Fused MAC: provides higher precision in single-precision floating-point math, because the product is not rounded before the addition
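The precision benefit of a fused MAC comes from rounding once instead of twice. The sketch below emulates single precision in Python to compare the two behaviors; it is an illustration under stated assumptions, not a model of the FPU hardware. The helper `f32` rounds through `struct`, the fused path computes the exact result with `fractions.Fraction` before a final rounding (which passes through a double first, fine for this example but not a fully general emulation), and the operand values are chosen purely to expose the effect.

```python
import struct
from fractions import Fraction


def f32(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def mac_unfused(a, b, c):
    """Multiply, round to single precision, then add and round again."""
    return f32(f32(a * b) + c)


def mac_fused(a, b, c):
    """Compute a*b + c exactly, then round once at the end."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return f32(float(exact))


# All three operands are exactly representable in single precision.
a = 1 + 2**-12
b = 1 + 2**-13
c = -(1 + 2**-12 + 2**-13)

unfused = mac_unfused(a, b, c)  # the 2**-25 term of a*b is lost when the product is rounded
fused = mac_fused(a, b, c)      # the 2**-25 term survives
```

In this case the unfused sequence returns exactly zero while the fused one keeps the small residual, which matters in accumulations where large terms cancel.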
Cortex Microcontroller Software Interface Standard (CMSIS)
• Abstraction layer for all Cortex-M processor-based devices
• Developed in conjunction with silicon, tools and middleware partners
• Benefits to the embedded developer
– Consistent software interfaces for silicon and middleware vendors
– Simplifies re-use across Cortex-M processor-based devices
– Reduces software development cost and time-to-market
– Reduces learning curve for new Cortex microcontroller developers
CMSIS-DSP Library
• C source code optimized for Cortex-M4
• For CMSIS-compliant C compilers (MDK-ARM, IAR and GCC)
• Basic Math Functions
– Vector Absolute Value
– Vector Addition
– Vector Dot Product
– Vector Multiplication
– Vector Negate
– Vector Offset
– Vector Scale
– Vector Shift
– Vector Subtraction
• Fast Math Functions
– Cosine
– Sine
– Square Root
• Complex Math Functions
– Complex Conjugate
– Complex Dot Product
– Complex Magnitude
– Complex Magnitude Squared
– Complex-by-Complex Multiplication
– Complex-by-Real Multiplication
• Filtering Functions
– Biquad Cascade IIR Filters Using Direct Form I Structure
– Biquad Cascade IIR Filters Using a Direct Form II Transposed Structure
– High Precision Q31 Biquad Cascade Filter
– Convolution
– Partial Convolution
– Correlation
– Finite Impulse Response (FIR) Decimator
– Finite Impulse Response (FIR) Filters
– Finite Impulse Response (FIR) Lattice Filters
– Finite Impulse Response (FIR) Sparse Filters
– Infinite Impulse Response (IIR) Lattice Filters
– Least Mean Square (LMS) Filters
– Normalized LMS Filters
– Finite Impulse Response (FIR) Interpolator
• Matrix Functions
– Matrix Addition
– Matrix Initialization
– Matrix Inverse
– Matrix Multiplication
– Matrix Scale
– Matrix Subtraction
– Matrix Transpose
• Transform Functions
– Complex FFT Functions
– DCT Type IV Functions
– Real FFT Functions
• Controller Functions
– Sine Cosine
– PID Motor Control
– Vector Clarke Transform
– Vector Inverse Clarke Transform
– Vector Park Transform
– Vector Inverse Park Transform
• Statistics Functions
– Maximum
– Mean
– Minimum
– Power
– Root mean square (RMS)
– Standard deviation
– Variance
• Support Functions
– Convert 16-bit Integer value
– Convert 32-bit Integer value
– Convert 32-bit floating point value
– Convert 8-bit Integer value
– Vector Copy
– Vector Fill
• Interpolation Functions
– Linear Interpolation
– Bilinear Interpolation
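To give a feel for what the controller functions in this list compute, the Clarke and Park transforms used in motor control can be written in a few lines. The sketch below is a plain floating-point model in Python of the textbook equations, assuming a balanced three-phase system (ia + ib + ic = 0); it is not the library's source code, and the function names are illustrative.

```python
import math


def clarke(ia, ib):
    """Clarke transform: two phase currents -> stationary (alpha, beta) frame."""
    i_alpha = ia
    i_beta = (ia + 2.0 * ib) / math.sqrt(3.0)
    return i_alpha, i_beta


def park(i_alpha, i_beta, theta):
    """Park transform: stationary frame -> rotating (d, q) frame at rotor angle theta."""
    s, c = math.sin(theta), math.cos(theta)
    i_d = i_alpha * c + i_beta * s
    i_q = -i_alpha * s + i_beta * c
    return i_d, i_q


def inverse_park(i_d, i_q, theta):
    """Inverse Park transform: rotating frame -> stationary frame."""
    s, c = math.sin(theta), math.cos(theta)
    return i_d * c - i_q * s, i_d * s + i_q * c


# Balanced unit-amplitude currents at an electrical angle of 0.7 rad.
theta = 0.7
ia = math.cos(theta)
ib = math.cos(theta - 2.0 * math.pi / 3.0)
i_alpha, i_beta = clarke(ia, ib)
i_d, i_q = park(i_alpha, i_beta, theta)
```

For balanced sinusoidal currents aligned with the rotor angle, the rotating-frame result is a constant (d close to 1, q close to 0), which is what lets a PID loop regulate it as a DC quantity.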
ARM Architectures
• ARM has a large number of architectures; the most widespread are:
– ARMv4T (ARM7TDMI and ARM9T)
– ARMv5TEJ (ARM926EJ and ARM7EJ)
– ARMv6 (ARM11)
– ARMv6-M (Cortex-M0)
– ARMv7
• M profile: designed for microcontroller applications, where efficient processing is as important as low power consumption and low cost.
• R profile: designed for high-performance embedded applications that require real-time performance.
• A profile: designed to run operating systems such as Linux or Windows CE.
ATMEL ARM Solutions
SAM4N
• Entry-level Cortex-M4 flash-based MCUs
– Extends the SAM3N offering up to 1 MByte Flash and 80 KB SRAM
– Entry-level Cortex-M4 MCU running up to 80 MHz
– Simplified PCB design through on-chip termination resistors
– Extended supply 1.62-3.6 V
– Backup mode down to 1.8 µA
– High data rate serial communication, including SPI up to 38 Mbps
– Native capacitive touch support
– 1 SPI, 3 I2C, 7 UART
– No USB
– 12-bit ADC and 10-bit DAC
– Packages: QFP, QFN, BGA, from 48 to 100 pins
– Pin-to-pin compatible with SAM7S/SAM3S/SAM4S
SAM3N vs SAM4N
Feature | SAM3N | SAM4N
Core | Cortex-M3 | Cortex-M4
Frequency | 48 MHz | 80 MHz
Flash | 16K-256K, single bank | 512KB/1MB, single bank
SRAM | Up to 24K | 80K
ADC | 10b, down to 2.4 V | 12b (at 24Ksps), down to 1.62 V
Communication | UART: 4, SPI: 1, I2C: 2 | UART: 7, SPI: 1, I2C: 3
Package Options | 48, 64, 100; QFN, QFP, BGA | 48, 64, 100; QFN, QFP, BGA, VFBGA
Status | Production | March 2013
SAM3N/4N Applications
[Figure: example applications, each annotated with the relevant features: high-speed serial communication, RS485/Modbus stack, ADC/DAC, low power, QTouch, safety features, 48-pin package with 256 KB of flash, down to 16 KB of flash, up to 4 UARTs.]
SAM3/4S
• General-purpose Cortex-M3/M4 flash-based MCUs
– From 64 KB to 2 MB Flash memory
– Single and dual bank
– From 8 KB to 160 KB SRAM
– Cortex-M3 MCU running from 64 MHz to 100 MHz
– Cortex-M4 MCU running up to 120 MHz
– Simplified PCB design through on-chip termination resistors
– Extended supply 1.62-3.6 V
– Backup mode down to 1.8 µA
– Native capacitive touch support
– Up to 3 SPI, 2 I2C, 5 UART, I2S
– SDIO/SD/MMC
– FS USB 2.0 device
– 12-bit ADC and 12-bit DAC
– Packages: QFP, QFN, BGA, from 48 to 100 pins
SAM3S vs SAM4S
Feature | SAM3S/SD | SAM4S/SD | SAM4SA
Core | Cortex-M3 | Cortex-M4 | Cortex-M4
Frequency | 64 MHz | 120 MHz | 120 MHz
Cache | - | - / 2KB | 2KB
Flash | 64K-512K, single/dual bank | 512KB-1MB / 1-2MB, single/dual bank | 1MB, single bank
SRAM | Up to 64K | 128K / 160K | 160K
Package Options | 48, 64, 100; QFN, QFP, BGA | 64, 100; QFN, QFP, BGA, VFBGA, WLCSP | 64, 100; QFN, QFP, BGA, VFBGA
Status | Production | Production | Production Feb. 13
SAM3S/4S Applications
[Figure: example applications, each annotated with the relevant features: high-speed serial communication, RS485/Modbus, up to 1 MB of flash, down to 16 KB of flash, dual-bank flash, 12-bit ADC/DAC, low power, QTouch, safety features, Manchester codec, 802.15.4-based wireless support, EBI, FS USB, HSMCI, CMOS sensor interface, up to 5 UARTs.]
Form, Fit & Function Compatibility
• Seamless Migration Path Between SAM7S, SAM3N, SAM3S,
SAM4N and SAM4S Devices
– SAM7S to SAM3S Migration Guide Available
Dual Bank Flash
Available on the S, U, X and A series.
What problem does it solve?
• Dual Bank Flash enables fault-tolerant self-programming
– Provides a fail-safe method of upgrading firmware on remote units
– Enables background firmware upgrades without halting application execution
How does it work?
• Safe and secure remote update:
– Normal operation from Bank 1 while Bank 2 is simultaneously programmed remotely over a wired or wireless stream
– A power or communications failure causes the Bank 2 programming to fail, while Bank 1 continues to operate and requests retransmission
– Reprogramming successful: the device now executes from Bank 2, and Bank 1 is available for the next update
[Figure: three panels, one per step, each showing the reset vector and the two flash banks.]
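The update sequence can be summarized as a small state machine: write the idle bank, verify it, and only then switch the boot bank. The following Python sketch models that logic under stated assumptions; it is not Atmel's bootloader or any device API. The class `DualBankFlash`, the helper `interrupted`, and the use of CRC-32 as the integrity check are all hypothetical choices made for illustration.

```python
import zlib


class DualBankFlash:
    """Toy model of a two-bank flash with a switchable boot bank (hypothetical names)."""

    def __init__(self, image):
        self.banks = [bytes(image), b""]
        self.active = 0  # index of the bank the reset vector currently selects

    def running_image(self):
        return self.banks[self.active]

    def update(self, chunks, expected_crc):
        """Program the idle bank from an iterable of byte chunks.

        Returns True and switches banks only if the whole image arrived and
        its CRC-32 matches; otherwise the active bank is left untouched.
        """
        spare = 1 - self.active
        self.banks[spare] = b""  # erase the idle bank before programming
        staged = bytearray()
        try:
            for chunk in chunks:
                staged += chunk
                self.banks[spare] = bytes(staged)  # idle bank is partly written
        except OSError:
            return False  # power/comms failure: the caller should request retransmission
        if zlib.crc32(self.banks[spare]) != expected_crc:
            return False
        self.active = spare  # stands in for repointing the reset vector
        return True


def interrupted(chunks, after):
    """Hypothetical helper: yield `after` chunks, then simulate a dropped link."""
    for i, chunk in enumerate(chunks):
        if i == after:
            raise OSError("link lost")
        yield chunk
```

The key design point is that the bank switch is the last step and happens only after verification, so a failure at any earlier point leaves the running firmware intact.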
SAM4L
• Low-power Cortex-M4 flash-based MCUs
– Up to 256 KB Flash
– Up to 32 KB SRAM
– Cortex-M4 MCU running up to 48 MHz
– Extended supply 1.62-3.6 V
– Up to SPI, 4 I2C, 4 UART
– FS USB 2.0 (Device and Host)
– LCD controller up to 4x40 segments
– Hardware QTouch acquisition, up to 32 channels
– 12-bit ADC and 10-bit DAC
– Packages: QFP, QFN, BGA, from 48 to 100 pins
Pico Power Technology
Cortex™-M4 processor-based devices, delivering the lowest power in active mode (down to 90 µA/MHz) as well as in sleep mode with full RAM retention (1.5 µA), with the shortest wake-up time (down to 1.5 µs)
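These figures translate into an average current once a duty cycle is assumed. The sketch below is a back-of-the-envelope estimate in Python using the slide's best-case numbers (90 µA/MHz active, 1.5 µA sleep) as defaults; the 12 MHz clock and 1% duty cycle in the example are arbitrary illustrative values, and wake-up transitions and peripheral currents are ignored.

```python
def average_current_ua(freq_mhz, duty, active_ua_per_mhz=90.0, sleep_ua=1.5):
    """Estimate average supply current in microamps.

    freq_mhz: CPU clock while active.
    duty: fraction of time spent in active mode (0.0 to 1.0).
    The default current figures are the best-case values quoted on the slide.
    """
    if not 0.0 <= duty <= 1.0:
        raise ValueError("duty must be between 0 and 1")
    active_ua = active_ua_per_mhz * freq_mhz
    return duty * active_ua + (1.0 - duty) * sleep_ua


# Illustrative case: active 1% of the time at 12 MHz, asleep otherwise.
estimate = average_current_ua(12, 0.01)  # 0.01 * 1080 + 0.99 * 1.5
```

Under these assumptions the average is about 12.3 µA, dominated by the short active bursts rather than by the sleep current.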
SAM4L Family Overview
Feature | SAM4LC 100-pin | SAM4LC 64-pin | SAM4LC 48-pin | SAM4LS 100-pin | SAM4LS 64-pin | SAM4LS 48-pin
LCD | 4x40 | 4x23 | 4x13 | No | No | No
Hardware Crypto | Yes | Yes | Yes | No | No | No
USB | Host and Device | Host and Device | Host and Device | Device | Device | Device
GPIO | 75 | 43 | 27 | 80 | 48 | 32
I2C | 2 Master + 2 Master/Slave | 2 Master/Slave | 1 Master/Slave | 2 Master + 2 Master/Slave | 2 Master/Slave | 1 Master/Slave
SAM4L Applications
• Industrial
– Process transmitters
– Sensors & detectors
– Sub-meters
– Sensor hub
• Healthcare
– Glucose meters
– Pulse oximetry
– Human fall detection
– Blood pressure
• Consumer
– Sport watches
– Remote control
– Toys
– Sensor hub
SAM4E
• Industrial Cortex-M4 flash-based MCUs
– Up to 1 MB Flash
– Up to 128 KB SRAM
– Cortex-M4 MCU running up to 120 MHz
– FPU (floating-point unit)
– 2 KB cache
– Extended supply 1.62-3.6 V
– Up to 3 SPI, 2 I2C, 4 UART
– SDIO/SD/MMC
– Dual CAN
– 10/100 Ethernet controller
– Full-speed USB 2.0 device
– 16-bit ADC and 12-bit DAC
– Crypto AES
– Packages: QFP, QFN, BGA, from 100 to 144 pins
SAM4E Applications
• Industrial Automation and M2M
– Programmable logic controller
– Drive control
– Robotics
• Building and Home Control
– Gateway
– Access control
– Control panels
– Room control unit
• Energy Management
– Power supplies communication
– Switch breakers communication
– Inverters communication
• Automotive Aftermarket
– Fleet management
– Telematics
Atmel Cortex Portfolio
SAM3/4 Technical Highlights
Families compared: SAM3/4N, SAM3/4S, SAM4E, SAM4L, SAM3U, SAM3A, SAM3X
Features compared: Dual Bank Flash, EBI scrambling, USB HS PHY, On-Die Termination, PIO capture mode (camera interface), ECC on embedded Flash, Event System, Sleep Walking, Cache, active power consumption (per MHz), 2 MB of Flash
Active power consumption (per MHz): SAM3/4N 200 µA | SAM3/4S 200 µA | SAM4E 200 µA | SAM4L down to 90 µA | SAM3U 350 µA | SAM3A 350 µA | SAM3X 350 µA
[Table: per-family availability marks (√ / -) for the remaining features.]
Atmel SAM3/4 Ecosystem
Atmel Studio 6 - Two Architectures, One Studio
• Integrating ARM and AVR design
• Free, professional IDE with integrated GNU C/C++ compiler
• 300 Atmel ARM and AVR MCUs
• QTouch Composer
• Intelligent editor
• New Project Wizard with over 1,000 project examples
• Seamless connection to all in-system debuggers
– SAM-ICE in-circuit debugger
– SAM3 and SAM4 evaluation kits
• Cycle-accurate chip and peripheral simulator
• Easy 8/32-bit migration
– Single IDE for ARM and AVR
– Atmel Software Framework
Atmel Studio 6 – Atmel Software Framework
• Software Library
– Peripheral drivers
– Hardware abstraction
– Communication
– Graphics
– Up to 50% of a new project
• Standard APIs
– Easy code migration
– Support for 300 ARM + AVR MCUs
– Common 8/32-bit platform
• ASF Explorer
– Manage ASF components
– Trace driver dependencies
– Easy access to documentation
Atmel Studio 6 – QTouch Composer
• QTouch Project Wizard
– Configure QTouch project
– Optimized QTouch library code
– Automatic power management
• Touch Wizard
– Automatic performance tests
– Optimal design recommendations
• Power Analyzer
– Real-time monitoring of MCU power consumption
– Profiling and visualization: time spent on touch sensing, time spent on user code, time spent in power down
Development kits and Xplained boards
