32 - SASE
Electrocomponentes S.A.
SASE 2013
SAM4/Cortex-M4 from Atmel: the solution for digital signal processing

Agenda
• Microcontrollers/microprocessors vs. DSPs.
• Typical algorithms.
• Applications.
• Architecture comparison.
  – Cortex-M3 architecture.
  – Cortex-M4 architecture.
• Product line.
• Programming tools and development environments.
• Demonstrations.

What is a microprocessor?
• A microprocessor (µP) is an integrated circuit that contains a central processing unit (CPU) together with the logic needed to connect it to other devices, such as memories and input/output (I/O) ports. It is an "open" system.

What is a microcontroller?
• A microcontroller (µC) is an integrated circuit that offers the capabilities of a small computer: it contains a processor, memories, and several peripherals (I/O ports, A/D converters, D/A converters, etc.).

What is a DSP?
• A digital signal processor (DSP) is a microprocessor- or microcontroller-based system whose instruction set and hardware are optimized for applications that require numerical operations at very high speed.
• Harvard architecture
• Instruction cache
• Direct memory access (DMA)
• Data address generators (DAG)
• Multiply-accumulate unit (MAC)
• Floating-point unit (FPU)

Cortex M4 / SAM4

ARM Cortex-M4 Features
ISA Support: Thumb / Thumb-2
DSP Extensions: single-cycle 16, 32-bit MAC; single-cycle dual 16-bit MAC; 8, 16-bit SIMD arithmetic; hardware divide (2-12 cycles)
Floating Point Unit: single-precision floating point unit, IEEE 754 compliant
Pipeline: 3-stage + branch speculation
Performance Efficiency: 3.40 CoreMark/MHz*
Performance Efficiency: without FPU: 1.25 / 1.52 / 1.91 DMIPS/MHz**; with FPU: 1.27 / 1.55 / 1.95 DMIPS/MHz**
Memory Protection: optional 8-region MPU with sub-regions and background region
Interrupts: non-maskable interrupt (NMI) + 1 to 240 physical interrupts
Interrupt Priority Levels: 8 to 256 priority levels
Wake-up Interrupt Controller: up to 240 wake-up interrupts
Sleep Modes: integrated WFI and WFE instructions and Sleep On Exit capability; Sleep and Deep Sleep signals; optional retention mode with ARM Power Management Kit
Bit Manipulation: integrated instructions and bit banding
Debug: optional JTAG and Serial-Wire Debug ports; up to 8 breakpoints and 4 watchpoints
Trace: optional instruction trace (ETM), data trace (DWT), and instrumentation trace (ITM)

Typical DSP Algorithms
• DSP operations: the MAC is the key operation
  – Most operations are dominated by MACs
  – These can be on 8, 16 or 32 bit operands
• FIR filters: $y[n] = \sum_{k=0}^{N-1} h[k]\, x[n-k]$
  – Data communications
  – Echo cancellation (adaptive versions)
  – Smoothing data
• IIR filters: $y[n] = b_0 x[n] + b_1 x[n-1] + b_2 x[n-2] + a_1 y[n-1] + a_2 y[n-2]$
  – Audio equalization
  – Motor control
• FFT: $Y[k_1] = X[k_1] + X[k_2]\, e^{-j\omega}$, $Y[k_2] = X[k_1] - X[k_2]\, e^{-j\omega}$
  – Audio compression
  – Spread spectrum communication
  – Noise removal

Applications
• Audio processing
  – High-fidelity music playback.
  – Telecommunications/voice.
  – Speech synthesis and recognition.
• Signal analysis
  – Vibration measurement.
  – Seismic signal measurement.
  – ECG, EEG.
• Control
  – Motor control.
  – PID controllers.
  – Robotics.
• Image processing
  – Radar.
  – Diagnostic imaging.
  – Image processing.
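As a rough illustration of the FIR, IIR, and FFT butterfly formulas on the Typical DSP Algorithms slide, the following Python sketch evaluates each one directly. The function names (`fir`, `biquad`, `butterfly`) are hypothetical and chosen only for this example; the code assumes zero initial conditions and keeps the slide's sign convention, in which the feedback terms $a_1$ and $a_2$ are added. It is meant to show where the multiply-accumulate operations occur, not how an optimized Cortex-M4 implementation would be written.

```python
import cmath


def fir(h, x):
    """FIR filter: y[n] = sum_{k=0}^{N-1} h[k] * x[n-k], with x[m] = 0 for m < 0."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k in range(len(h)):
            if n - k >= 0:
                acc += h[k] * x[n - k]  # one multiply-accumulate (MAC) per tap
        y.append(acc)
    return y


def biquad(b, a, x):
    """Second-order IIR section using the slide's convention:
    y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] + a1*y[n-1] + a2*y[n-2]."""
    b0, b1, b2 = b
    a1, a2 = a
    x1 = x2 = y1 = y2 = 0.0  # assumed zero initial state
    y = []
    for xn in x:
        yn = b0 * xn + b1 * x1 + b2 * x2 + a1 * y1 + a2 * y2  # five MACs per sample
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y


def butterfly(xk1, xk2, omega):
    """Radix-2 butterfly: Y[k1] = X[k1] + X[k2]*e^{-j*omega}, Y[k2] = X[k1] - X[k2]*e^{-j*omega}."""
    w = cmath.exp(-1j * omega)
    return xk1 + xk2 * w, xk1 - xk2 * w
```

For example, `fir([0.5, 0.5], [1, 1, 1, 1])` applies a two-tap moving average, and `biquad((1, 0, 0), (0.5, 0), [1, 0, 0])` gives the decaying impulse response of a one-pole section. Every output sample is a chain of MACs, which is why a single-cycle MAC unit matters for these workloads.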
ARM7 vs Cortex-M3
Feature | ARM7TDMI | Cortex-M3
Architecture | ARMv4T (von Neumann) | ARMv7-M (Harvard)
ISA Support | Thumb / ARM | Thumb / Thumb-2
Pipeline | 3-stage | 3-stage + branch speculation
Interrupts | FIQ / IRQ | NMI + 1 to 240 interrupts
Interrupt Latency | 24-42 cycles | 12 cycles (6 when tail chaining)
Sleep Modes | None | Integrated (3)
Memory Protection | None | 8-region MPU
Dhrystone | 0.95 DMIPS/MHz (ARM) | 1.25 DMIPS/MHz
Power Consumption | 0.28 mW/MHz | 0.19 mW/MHz

Cortex-M4
• Designed for applications requiring more computational performance
• Cortex-M4 frees CPU resources when digital signal processing tasks are used (fewer active cycles are needed)
• Cortex-M4 features:
  – A single-cycle multiply-accumulate (MAC) unit
  – Optimized single instruction multiple data (SIMD) instructions and saturating arithmetic instructions
  – Optional single-precision floating-point unit (FPU)

[Figure: diagram relating Cortex-M3, DSP, and Cortex-M4. Labels: ease of use, C programming, interrupt handling, ultra low power, Harvard architecture, single-cycle MAC, floating point, barrel shifter.]

Cortex-M3 vs Cortex-M4
Feature | Cortex-M3 | Cortex-M4
Architecture | ARMv7-M (Harvard) | ARMv7E-M (Harvard)
ISA Support | Thumb / Thumb-2 | Thumb / Thumb-2
DSP Extensions | NA | Single-cycle 16, 32-bit MAC; single-cycle dual 16-bit MAC; 8, 16-bit SIMD arithmetic; hardware divide (2-12 cycles)
Optional Floating Point Unit | NA | Single-precision floating point unit, IEEE 754 compliant
Pipeline | 3-stage + branch speculation | 3-stage + branch speculation
Interrupts | NMI + 1 to 240 interrupts | NMI + 1 to 240 interrupts
Interrupt Latency | 12 cycles (6 when tail chaining) | 12 cycles (6 when tail chaining)
Sleep Modes | Integrated (3) | Integrated (3)
Memory Protection | 8-region MPU | 8-region MPU
Dhrystone | 1.25 DMIPS/MHz | 1.25 DMIPS/MHz

Cortex-M3 vs Cortex-M4: Instruction Set

Performance
• Cortex-M4 (SIMD+FPU) vs. Cortex-M3 processors
  – Fixed point: 2x faster
  – Floating point: 10x faster

Single Cycle Multiply Accumulate Instructions
• Cortex-M4 features a 32-bit hardware multiply-accumulate (MAC) unit
• It makes digital signal processing more efficient and greatly reduces the consumption of CPU resources
• It can perform one operation of up to 32×32+64->64, or two 16×16 operations, in a single cycle
• Main features:
  – Wide range of multiply-accumulate instructions
  – Choice of 16 or 32 bit multiply and 32 or 64 bit accumulate
  – All instructions execute in a single cycle

MAC Instructions
Operation | Instructions
16 x 16 = 32 | SMULBB, SMULBT, SMULTB, SMULTT
(16 x 16) + 32 = 32 | SMLABB, SMLABT, SMLATB, SMLATT
(16 x 16) + 64 = 64 | SMLALBB, SMLALBT, SMLALTB, SMLALTT
16 x 32 = 32 | SMULWB, SMULWT
(16 x 32) + 32 = 32 | SMLAWB, SMLAWT
(16 x 16) ± (16 x 16) = 32 | SMUAD, SMUADX, SMUSD, SMUSDX
(16 x 16) ± (16 x 16) + 32 = 32 | SMLAD, SMLADX, SMLSD, SMLSDX
(16 x 16) ± (16 x 16) + 64 = 64 | SMLALD, SMLALDX, SMLSLD, SMLSLDX
32 x 32 = 32 | MUL
32 ± (32 x 32) = 32 | MLA, MLS
32 x 32 = 64 | SMULL, UMULL
(32 x 32) + 64 = 64 | SMLAL, UMLAL
(32 x 32) + 32 + 32 = 64 | UMAAL
32 ± (32 x 32) = 32 (upper) | SMMLA, SMMLAR, SMMLS, SMMLSR
(32 x 32) = 32 (upper) | SMMUL, SMMULR

Single Instruction Multiple Data (SIMD)
• Several instructions operate on "packed" data types
  – Byte or halfword quantities packed into words
  – Allows more efficient access to packed structure types
• SIMD instructions can act on packed data:
  – Quad (4 parallel) 8-bit adds or subtracts
  – Dual (2 parallel) 16-bit adds or subtracts
• All instructions execute in a single cycle

SIMD extensions perform multiple operations in one cycle
Sum = Sum + (A x C) + (B x D)

SIMD Arithmetic Operations
Prefix | S | Q | SH | U | UQ | UH
Meaning | Signed | Signed Saturating | Signed Halving | Unsigned | Unsigned Saturating | Unsigned Halving
ADD8 | SADD8 | QADD8 | SHADD8 | UADD8 | UQADD8 | UHADD8
SUB8 | SSUB8 | QSUB8 | SHSUB8 | USUB8 | UQSUB8 | UHSUB8
ADD16 | SADD16 | QADD16 | SHADD16 | UADD16 | UQADD16 | UHADD16
SUB16 | SSUB16 | QSUB16 | SHSUB16 | USUB16 | UQSUB16 | UHSUB16
ASX | SASX | QASX | SHASX | UASX | UQASX | UHASX
SAX | SSAX | QSAX | SHSAX | USAX | UQSAX | UHSAX

• ASX: 1. Exchanges the halfwords of the second operand register. 2. Adds the top halfwords and subtracts the bottom halfwords.
• SAX: 1. Exchanges the halfwords of the second operand register. 2. Subtracts the top halfwords and adds the bottom halfwords.

Optional Floating Point Unit (FPU)
• IEEE 754 standard compliance
• Single-precision floating point math is key to some algorithms
• Add, subtract, multiply, divide, MAC and square root
• Fused MAC provides higher precision

Cortex Microcontroller Software Interface Standard (CMSIS)
• Abstraction layer for all Cortex-M processor based devices
• Developed in conjunction with silicon, tools and middleware partners
• Benefits to the embedded developer
  – Consistent software interfaces for silicon and middleware vendors
  – Simplifies re-use across Cortex-M processor-based devices
  – Reduces software development cost and time-to-market
  – Reduces the learning curve for new Cortex microcontroller developers

CMSIS-DSP Library
• C source code optimized for Cortex-M4
• For CMSIS compliant C compilers (MDK ARM, IAR and GCC)
• Basic Math Functions
  – Vector Absolute Value
  – Vector Addition
  – Vector Dot Product
  – Vector Multiplication
  – Vector Negate
  – Vector Offset
  – Vector Scale
  – Vector Shift
  – Vector Subtraction
• Fast Math Functions
  – Cosine
  – Sine
  – Square Root
• Complex Math Functions
  – Complex Conjugate
  – Complex Dot Product
  – Complex Magnitude
  – Complex Magnitude Squared
  – Complex-by-Complex Multiplication
  – Complex-by-Real Multiplication
• Filtering Functions
  – Biquad Cascade IIR Filters Using Direct Form I Structure
  – Biquad Cascade IIR Filters Using a Direct Form II Transposed Structure
  – High Precision Q31 Biquad Cascade Filter
  – Convolution
  – Partial Convolution
  – Correlation
  – Finite Impulse Response (FIR) Decimator
  – Finite Impulse Response (FIR) Filters
  – Finite Impulse Response (FIR) Lattice Filters
  – Finite Impulse Response (FIR) Sparse Filters
  – Infinite Impulse Response (IIR) Lattice Filters
  – Least Mean Square (LMS) Filters
  – Normalized LMS Filters
  – Finite Impulse Response (FIR) Interpolator
• Matrix Functions
  – Matrix Addition
  – Matrix Initialization
  – Matrix Inverse
  – Matrix Multiplication
  – Matrix Scale
  – Matrix Subtraction
  – Matrix Transpose
• Transform Functions
  – Complex FFT Functions
  – DCT Type IV Functions
  – Real FFT Functions
• Controller Functions
  – Sine Cosine
  – PID Motor Control
  – Vector Clarke Transform
  – Vector Inverse Clarke Transform
  – Vector Park Transform
  – Vector Inverse Park Transform
• Statistics Functions
  – Maximum
  – Mean
  – Minimum
  – Power
  – Root mean square (RMS)
  – Standard deviation
  – Variance
• Support Functions
  – Convert 16-bit Integer value
  – Convert 32-bit Integer value
  – Convert 32-bit floating point value
  – Convert 8-bit Integer value
  – Vector Copy
  – Vector Fill
• Interpolation Functions
  – Linear Interpolation
  – Bilinear Interpolation

ARM Architectures
• ARM has a large number of architectures; the most widespread are:
  – ARMv4T (ARM7TDMI and ARM9T)
  – ARMv5TEJ (ARM926EJ and ARM7EJ)
  – ARMv6 (ARM11)
  – ARMv6-M (Cortex-M0)
  – ARMv7
    • M profile: designed for microcontroller applications, where efficient processing is as important as low power consumption and low cost.
    • R profile: designed for high-performance embedded applications in which real-time behavior is required.
    • A profile: designed to run operating systems such as Linux or Windows CE.
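To make the packed-data SIMD and dual 16-bit MAC ideas from the MAC Instructions and SIMD slides more concrete, here is a small Python model of two operations: a dual 16x16 multiply with 32-bit accumulate, i.e. Sum = Sum + (A x C) + (B x D), as SMLAD computes it, and a dual 16-bit saturating add in the style of QADD16. All function names (`pack16`, `unpack16`, `smlad_model`, `qadd16_model`) are hypothetical helpers invented for this sketch. The model only reproduces the arithmetic result; it does not model the Q flag, cycle counts, or any other processor state.

```python
def _s16(value):
    """Interpret the low 16 bits of value as a signed 16-bit integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def pack16(hi, lo):
    """Pack two signed 16-bit halfwords into one 32-bit word."""
    return ((hi & 0xFFFF) << 16) | (lo & 0xFFFF)


def unpack16(word):
    """Split a 32-bit word into (top, bottom) signed halfwords."""
    return _s16(word >> 16), _s16(word)


def smlad_model(x, y, acc):
    """Dual 16x16 MAC: acc + bottom(x)*bottom(y) + top(x)*top(y),
    wrapped to a signed 32-bit result (overflow flag not modeled)."""
    xh, xl = unpack16(x)
    yh, yl = unpack16(y)
    total = (acc + xl * yl + xh * yh) & 0xFFFFFFFF
    return total - 0x100000000 if total & 0x80000000 else total


def qadd16_model(x, y):
    """Dual 16-bit signed saturating add on packed halfwords."""
    def saturate(v):
        return max(-32768, min(32767, v))

    xh, xl = unpack16(x)
    yh, yl = unpack16(y)
    return pack16(saturate(xh + yh), saturate(xl + yl))
```

With A = 3, B = 2 packed in one word and C = 5, D = 4 in another, `smlad_model(pack16(3, 2), pack16(5, 4), 10)` returns 33, which is the Sum = Sum + (A x C) + (B x D) pattern from the slide done as one operation. `qadd16_model` clamps each lane independently, so adding 30000 + 10000 in the top lane yields 32767 instead of wrapping around.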
ARM Solutions from Atmel

SAM4N
• Entry-level Cortex-M4 flash-based MCUs
  – Extends the SAM3N offering up to 1 MByte flash and 80 KB SRAM
  – Entry-level Cortex-M4 MCU running up to 80 MHz
  – Simplified PCB design through on-chip termination resistors
  – Extended supply 1.62-3.6 V
  – Backup mode down to 1.8 µA
  – High data rate serial communication, including SPI up to 38 Mbps
  – Native capacitive touch support
  – 1 SPI, 3 I2C, 7 UART
  – No USB
  – 12-bit ADC and 10-bit DAC
  – Packages: QFP, QFN, BGA, from 48 to 100 pins
  – Pin-to-pin compatible with SAM7S/SAM3S/SAM4S

SAM3N vs SAM4N
Feature | SAM3N | SAM4N
Core | Cortex-M3 | Cortex-M4
Frequency | 48 MHz | 80 MHz
Flash | 16K-256K, single bank | 512KB/1MB, single bank
SRAM | Up to 24K | 80K
ADC | 10b, down to 2.4 V | 12b (at 24Ksps), down to 1.62 V
Communication | UART: 4, SPI: 1, I2C: 2 | UART: 7, SPI: 1, I2C: 3
Package Options | 48, 64, 100; QFN, QFP, BGA | 48, 64, 100; QFN, QFP, BGA, VFBGA
Status | Production | March 2013

SAM3N/4N Applications
[Figure: application examples annotated with the features they use: high-speed serial communication, RS485/Modbus stack, ADC/DAC, low power, QTouch, safety features, 48-pin package with 256 KB flash, down to 16 KB of flash, up to 4 UARTs.]

SAM3/4S
• General purpose Cortex-M3/M4 flash-based MCUs
  – From 64 KB to 2 MB flash memory
  – Single and dual bank
  – From 8 KB to 160 KB SRAM
  – Cortex-M3 MCU running from 64 MHz to 100 MHz
  – Cortex-M4 MCU running up to 120 MHz
  – Simplified PCB design through on-chip termination resistors
  – Extended supply 1.62-3.6 V
  – Backup mode down to 1.8 µA
  – Native capacitive touch support
  – Up to 3 SPI, 2 I2C, 5 UART, I2S
  – SDIO/SD/MMC
  – FS USB 2.0 device
  – 12-bit ADC and 12-bit DAC
  – Packages: QFP, QFN, BGA, from 48 to 100 pins

SAM3S vs SAM4S
Feature | SAM3S/SD | SAM4S/SD | SAM4SA
Core | Cortex-M3 | Cortex-M4 | Cortex-M4
Cache | - | - / 2KB | 2KB
Frequency | 64 MHz | 120 MHz | 120 MHz
Flash | 64K-512K, single/dual bank | 512KB-1MB / 1-2MB, single/dual bank | 1MB, single bank
SRAM | Up to 64K | 128K / 160K | 160K
Package Options | 48, 64, 100; QFN, QFP, BGA | 64, 100; QFN, QFP, BGA, VFBGA, WLCSP | 64, 100; QFN, QFP, BGA, VFBGA
Status | Production | Production | Production Feb. 13

SAM3S/4S Applications
[Figure: application examples annotated with the features they use: high-speed serial communication, RS485/Modbus, up to 1 MB flash, dual bank flash, down to 16 KB of flash, 12b ADC/DAC, low power, QTouch, safety features, Manchester codec, 802.15.4-based wireless support, EBI, FS USB, HSMCI, CMOS sensor interface, up to 5 UARTs.]

Form, Fit & Function Compatibility
• Seamless migration path between SAM7S, SAM3N, SAM3S, SAM4N and SAM4S devices
  – SAM7S to SAM3S migration guide available

Dual Bank Flash
Available on the S, U, X and A series.

What problem does it solve?
Dual bank flash enables fault-tolerant self-programming:
– Provides a fail-safe method of upgrading firmware on remote units
– Enables background firmware upgrades without halting application execution

How does it work?
– Safe and secure remote update:
  • Normal operation from Bank 1 while Bank 2 is simultaneously programmed remotely over a wired or wireless stream.
  • A power or communications failure causes Bank 2 programming to fail, while Bank 1 continues to operate and requests retransmission.
  • Reprogramming successful: the device now executes from Bank 2, and Bank 1 is available for the next update.
[Figure: three panels, each labeled RESET VECTOR, illustrating the steps above.]

SAM4L
• Low-power Cortex-M4 flash-based MCUs
  – Up to 256 KB flash
  – Up to 32 KB SRAM
  – Cortex-M4 MCU running up to 48 MHz
  – Extended supply 1.62-3.6 V
  – Up to SPI, 4 I2C, 4 UART
  – FS USB 2.0 (device and host)
  – LCD controller, up to 4x40 segments
  – Hardware QTouch acquisition, up to 32 channels
  – 12-bit ADC and 10-bit DAC
  – Packages: QFP, QFN, BGA, from 48 to 100 pins

Pico Power Technology
Cortex-M4 processor-based devices delivering the lowest power in active mode (down to 90 µA/MHz) as well as in sleep mode with full RAM retention (1.5 µA), with the shortest wake-up time (down to 1.5 µs).

SAM4L Family Overview
Feature | SAM4LC 100-pin | SAM4LC 64-pin | SAM4LC 48-pin | SAM4LS 100-pin | SAM4LS 64-pin | SAM4LS 48-pin
LCD | 4x40 | 4x23 | 4x13 | No | No | No
Hardware Crypto | Yes | Yes | Yes | No | No | No
USB | Host and Device | Host and Device | Host and Device | Device | Device | Device
GPIO | 75 | 43 | 27 | 80 | 48 | 32
I2C | 2 Master + 2 Master/Slave | 2 Master/Slave | 1 Master/Slave | 2 Master + 2 Master/Slave | 2 Master/Slave | 1 Master/Slave

SAM4L Applications
• Industrial
  – Process transmitters
  – Sensors & detectors
  – Sub-meters
  – Sensor hub
• Healthcare
  – Glucose meters
  – Pulse oximetry
  – Human fall detection
  – Blood pressure
• Consumer
  – Sport watches
  – Remote control
  – Toys
  – Sensor hub

SAM4E
• Industrial Cortex-M4 flash-based MCUs
  – Up to 1 MB flash
  – Up to 128 KB SRAM
  – Cortex-M4 MCU running up to 120 MHz
  – FPU (floating point unit)
  – 2 KB cache
  – Extended supply 1.62-3.6 V
  – Up to 3 SPI, 2 I2C, 4 UART
  – SDIO/SD/MMC
  – Dual CAN
  – 10/100 Ethernet controller
  – Full Speed USB 2.0 device
  – 16-bit ADC and 12-bit DAC
  – Crypto AES
  – Packages: QFP, QFN, BGA, from 100 to 144 pins

SAM4E Applications
• Industrial Automation and M2M
  – Programmable logic controller
  – Drive control
  – Robotics
• Building and Home Control
  – Gateway
  – Access control
  – Control panels
  – Room control unit
• Energy Management
  – Power supplies communication
  – Switch breakers communication
  – Inverters communication
• Automotive Aftermarket
  – Fleet management
  – Telematics

Atmel Cortex Portfolio

SAM3/4 Technical Highlights
[Table: availability (√ / -) of the following features across SAM3/4N, SAM3/4S, SAM4E, SAM4L, SAM3U, SAM3A and SAM3X: dual bank flash, EBI scrambling, USB HS PHY, on-die termination, PIO capture mode (camera interface), ECC on embedded flash, Event System, Sleep Walking, cache, 2 MB of flash (SAM4S). The per-device marks are not recoverable from the extracted text.]
Active power consumption (/MHz): SAM3/4N 200 µA | SAM3/4S 200 µA | SAM4E 200 µA | SAM4L down to 90 µA | SAM3U 350 µA | SAM3A 350 µA | SAM3X 350 µA

Atmel SAM3/4 Ecosystem

Atmel Studio 6: Two Architectures, One Studio
• Integrating ARM and AVR design
• Free, professional IDE with integrated GNU C/C++ compiler
• 300 Atmel ARM and AVR MCUs
• QTouch Composer
• Intelligent editor
• New Project Wizard with over 1,000 project examples
• Seamless connection to all in-system debuggers
  – SAM-ICE in-circuit debugger
  – SAM3 and SAM4 evaluation kits
• Cycle-accurate chip and peripheral simulator
• Easy 8/32-bit migration
  – Single IDE for ARM and AVR
  – Atmel Software Framework

Atmel Studio 6: Atmel Software Framework
• Software library
  – Peripheral drivers
  – Hardware abstraction
  – Communication
  – Graphics
  – Up to 50% of a new project
• Standard APIs
  – Easy code migration
  – Support for 300 ARM + AVR MCUs
  – Common 8/32-bit platform
• ASF Explorer
  – Manage ASF components
  – Trace driver dependencies
  – Easy access to documentation

Atmel Studio 6: QTouch Composer
• QTouch Project Wizard
  – Configure QTouch project
  – Optimized QTouch library code
  – Automatic power management
• Touch Wizard
  – Automatic performance tests
  – Optimal design recommendations
• Power Analyzer
  – Real-time monitoring of MCU power consumption
  – Profiling and visualization
    • Time spent on touch sensing
    • Time spent on user code
    • Time spent in power down

Development kits and Xplained boards