UNA ARQUITECTURA DISTRIBUIDA PARA EL
CONTROL DE ROBOTS AUTÓNOMOS MÓVILES
UN ENFOQUE APLICADO A LA CONSTRUCCIÓN DE LA PLATAFORMA
QUAKY-ANT
Dept. Ingeniería de la Información y las Comunicaciones
Universidad de Murcia
Presentada por:
Humberto Martínez Barberá
Dirigida por:
Antonio Fernando Gómez Skarmeta
Enero 2001
A DISTRIBUTED ARCHITECTURE FOR INTELLIGENT
CONTROL IN AUTONOMOUS MOBILE ROBOTS
AN APPLIED APPROACH TO THE DEVELOPMENT OF THE QUAKY-ANT
PLATFORM
Dept. of Communications and Information Engineering
University of Murcia
Presented by:
Humberto Martínez Barberá
Supervised by:
Antonio Fernando Gómez Skarmeta
January 2001
Antonio Fernando Gómez Skarmeta, Associate Professor (Profesor Titular de Universidad) in the area of Telematics Engineering at the Department of Information and Communications Engineering, AUTHORISES:
The presentation of the doctoral thesis entitled "Una Arquitectura Distribuida para el Control de Robots Autónomos Móviles: un Enfoque Aplicado a la Construcción de la Plataforma Quaky-Ant", carried out by Humberto Martínez Barberá under my direct guidance and supervision in the Department of Information and Communications Engineering, and submitted for the degree of Doctor from the University of Murcia.
Murcia, 5 February 2001
To Esther.
Acknowledgements
No self-respecting thesis lacks an acknowledgements page, and this one is no exception. To begin with, none of this work would have been possible without the invaluable help of several colleagues: Miguel Zamora and Benito Úbeda (Quaky-Ant hardware), Eduardo Martínez (multicast video transmission), Juan Botía and Mercedes Valdés (intelligent techniques for sensor fusion), Jesús Ibáñez (genetic algorithm for behaviour fusion), and of course the whole Intelligent Systems group at the University of Murcia, especially the subgroup that inhabits the basement and calls itself the ANTS group. They have contributed to several complementary, and no less important, facets of daily life, in particular the Thursday football match (go red team!).
I would also like to express my thanks to the people at the Centre for Applied Autonomous Sensor Systems, University of Örebro, Sweden. They made my work very pleasant during my stay in the white hell. I would especially like to thank Dr. Alessandro Saffiotti for his interest and valuable help.
Of course, much of the credit for the completion of this thesis belongs to my supervisor, Dr. Antonio Gómez, who knew how to give me enough freedom to explore different paths, to advise me when decisions had to be made, and also to make me carry out the most thankless part of the thesis work.
I must also thank my family, because much, if not all, of what I am I owe to them. Yes, all the gadgets I built and the things I broke turned out to be good for something after all. Finally, I cannot but thank Esther for her understanding and support during these years, which have certainly helped me to switch gears and reset my head, and in any case to renew the enthusiasm needed to finish this work. Thank you, truly.
Contents
PART I
SUMMARY
GENERAL DESCRIPTION
0.1 INTRODUCTION
0.2 THE QUAKY-ANT PLATFORM
0.3 SENSOR FUSION
0.4 NAVIGATION
0.5 CONTROL ARCHITECTURE
0.6 BEHAVIOUR FUSION LEARNING
0.7 A WEB-BASED TELEOPERATION APPLICATION
0.8 CONCLUSIONS
0.9 FUTURE WORK
PART II
ABSTRACT
INTRODUCTION
1.1 OBJECTIVES AND CONTENTS
1.2 SOME BACKGROUND ON MOBILE ROBOTICS
1.2.1 Mobile robotics
1.2.2 History of mobile robotics
1.2.3 Robot intelligence
1.2.4 Soft Computing
1.3 CONTRIBUTIONS OF THIS THESIS
HARDWARE ARCHITECTURE
2.1 INTRODUCTION
2.2 MADCAR MICRO-ROBOT
2.2.1 Description
2.2.2 Sonar controller
2.2.3 Shaft encoder
2.2.4 Electronic compass
2.2.5 Conclusions
2.3 LA-GARRA ROBOT
2.3.1 Description
2.3.2 Conclusions
2.4 QUAKY-ANT ROBOT
2.4.1 Description
2.4.2 Modular architecture and interconnection
2.4.3 External video and data communication
2.4.4 Sonar and infrared module
2.4.5 Motor control module
2.4.6 Compass module
2.4.7 Bus master module
2.4.8 Conclusions
SENSOR FUSION
3.1 INTRODUCTION
3.2 SENSOR MODELS
3.2.1 Ultrasonic sonar range finders
3.2.2 Infrared sensors
3.3 SENSOR FUSION USING NEURAL NETWORKS
3.3.1 Experiments and results
3.4 SENSOR FUSION USING FUZZY RULES
3.4.1 Some background on clustering methods
3.4.2 Fuzzy tabu clustering
3.4.3 Adaptive network based fuzzy inference system (ANFIS)
3.4.4 Experiments and results
3.5 CONCLUSIONS
ROBOT NAVIGATION
4.1 INTRODUCTION
4.1.1 Map representation methods
4.1.2 Path planning
4.1.3 Localisation
4.3 MAP BUILDING AND PATH PLANNING
4.3.1 Fuzzy grid map
4.3.2 A*-based planning
4.3.3 Experiments and results
4.4 MAP BUILDING AND LOCALISATION
4.4.1 Fuzzy segments map
4.4.2 Fuzzy segments localisation
4.4.3 Experiments and results
4.5 CONCLUSIONS
CONTROL ARCHITECTURE
5.1 INTRODUCTION
5.1.1 Deliberative and reactive control architectures
5.1.2 Hybrid control architectures
5.1.3 Behaviour definition
5.1.4 Behaviour coordination
5.1.5 Multiagent systems
5.2 THE BGA CONTROL ARCHITECTURE
5.2.1 Antecedents: MadCar's architecture
5.2.2 BGA: a multiagent architecture
5.2.3 Inter-agent communication in BGA
5.2.4 Conclusions
5.3 THE BG PROGRAMMING LANGUAGE
5.3.1 Antecedents
5.3.2 BG syntax and semantics
5.3.3 The BGen development environment
5.3.4 Conclusions
A SAMPLE AGENT ARCHITECTURE
6.1 INTRODUCTION
6.2 SENSOR PROCESSING AND ACTUATOR CONTROL
6.3 REACTIVE CONTROL
6.3.1 Common block
6.3.2 Obstacle avoidance behaviours
6.3.3 Goal reaching behaviour
6.3.4 Alignment behaviours
6.3.5 Behaviour blending
6.4 NAVIGATION
6.5 PLANNING
6.6 EXPERIMENTS AND RESULTS
6.6.1 Point-to-point navigation experiment (nav-A)
6.6.2 Point-to-point navigation experiment (nav-B)
6.6.3 Point-to-point navigation experiment (nav-C)
6.6.4 Map-building experiment (map-A)
6.7 CONCLUSIONS
BEHAVIOUR FUSION LEARNING
7.1 INTRODUCTION
7.2 BEHAVIOUR FUSION USING GENETIC-BASED ALGORITHMS
7.2.1 Introduction
7.2.2 Representation and initial population
7.2.3 Genetic operators and parameters
7.2.4 Fitness function
7.3 EXPERIMENTAL RESULTS
7.4 CONCLUSIONS
A WEB-BASED TELEOPERATION APPLICATION
8.1 INTRODUCTION
8.2 WEB ROBOTICS TECHNOLOGIES
8.2.1 The UWA Telerobot
8.2.2 EPFL KhepOnTheWeb
8.2.3 CMU Xavier
8.2.4 NASA WITS
8.3 THE BGEN-WEB SYSTEM
8.3.1 Objectives
8.3.2 BGen-Web architecture
8.3.3 Experiments and results
8.4 CONCLUSIONS
CONCLUSIONS AND FUTURE WORK
9.1 CONCLUSIONS
9.1.1 The Quaky-Ant mobile robot
9.1.2 Sensor fusion
9.1.3 Navigation
9.1.4 Control architecture and programming language
9.1.5 Web robotics application
9.2 FUTURE WORK
COMMERCIAL MOBILE ROBOTS: A SURVEY
A.1 IS ROBOTICS
A.2 ACTIVMEDIA ROBOTICS
A.3 NOMADICS TECHNOLOGIES
A.4 K-TEAM
A.5 ROBOSOFT SERVICE ROBOTICS
A.6 APPLIED AI SYSTEMS
A.7 PROBOTICS
OBTAINING THE BGEN SOFTWARE
REFERENCES
PART I
"... they could never use words or other signs, putting them together as we do to declare our thoughts to others. But one can conceive of a machine that utters words and even utters some responses to bodily actions performed on it that cause changes in its organs ..."
(René Descartes, Discourse on the Method)
“I wish to work miracles”
(Leonardo da Vinci)
Summary
This thesis presents a series of works related to mobile robotics, covering the different levels needed for a mobile robot to carry out tasks independently in a previously unknown environment. Soft Computing techniques (neural networks, genetic algorithms, and especially fuzzy logic) are employed in the development of the different levels.
It begins by describing the hardware architecture of the Quaky-Ant platform, an autonomous mobile robot that serves as the basis for most of the real-world tests of the different algorithms. Building on this platform, a solution is presented to the sensor fusion problem that does not rely on a model of the environment. This fusion is first solved with a simple neural network, and later with a hybrid system that generates fuzzy rules, based on the ANFIS network and several clustering algorithms, one of which, fuzzy clustering with tabu search, was developed within the scope of this work. Once improved sensors are assumed (after the sensor fusion process), the navigation problem is addressed. It is solved in two stages, by applying two navigation techniques. In the first stage, a fuzzy grid map is applied to the map-building problem, together with an A* search for the path-planning problem. In the second stage, a fuzzy segments map is applied to the map-building problem, together with a localisation system based on the fuzzy segments map.
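As a minimal illustration of the first navigation stage, the sketch below runs an A* search over a toy fuzzy grid map in Python. The thesis implementation belongs to the Java-based BGen environment; the grid values, the 0.5 occupancy threshold, and the 4-connected unit-cost moves are illustrative assumptions, not parameters from the thesis.

```python
import heapq
from itertools import count

def a_star(grid, start, goal):
    """A* over a fuzzy occupancy grid: cells whose membership in
    'occupied' exceeds 0.5 are treated as obstacles, moves are
    4-connected with unit cost, and the Manhattan distance is the
    (admissible) heuristic."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    tick = count()                       # tiebreaker for heap ordering
    frontier = [(h(start), 0, next(tick), start, None)]
    came = {}                            # node -> parent, also the closed set
    best_g = {start: 0}
    while frontier:
        _, g, _, node, parent = heapq.heappop(frontier)
        if node in came:
            continue
        came[node] = parent
        if node == goal:                 # walk parents back to the start
            path = []
            while node is not None:
                path.append(node)
                node = came[node]
            return path[::-1]
        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] <= 0.5:
                if g + 1 < best_g.get(nxt, float('inf')):
                    best_g[nxt] = g + 1
                    heapq.heappush(frontier, (g + 1 + h(nxt), g + 1,
                                              next(tick), nxt, node))
    return None                          # goal unreachable

# A small fuzzy grid: the 0.9 cells form a wall with a gap at the bottom.
grid = [[0.1, 0.9, 0.1, 0.1],
        [0.1, 0.9, 0.1, 0.1],
        [0.1, 0.1, 0.1, 0.1]]
path = a_star(grid, (0, 0), (0, 3))
```

The planner detours through the gap in the bottom row rather than crossing the high-membership cells, which is the essential behaviour of planning over a thresholded fuzzy map.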
Once the basic navigation and environment-perception capabilities are in place, a distributed control architecture (BGA) is defined that serves as the basis for the development of the different control elements. This architecture distinguishes between deliberative processes and reactive behaviours. The different elements of the architecture communicate through a KQML-based protocol over UDP. To ease the development of control programs with the BGA architecture, a language (BG) has been defined that uses fuzzy logic to specify and define the different control modules. The BGA architecture and the BG language have been integrated into a programming and simulation environment (BGen), developed entirely in Java so that it can be integrated into different mobile robots. To solve the behaviour fusion problem, a genetic algorithm is used that generates and tunes the set of fuzzy fusion rules, evaluating the candidate solutions in the BGen environment. Finally, an application of the Quaky-Ant platform and the BGen environment in a web setting (BGen-Web) is shown, which allows the mobile robot to be teleoperated in exploration tasks and offers the user a high degree of interactivity thanks to the use of multicast video transmission.
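To make the communication layer concrete, the following Python sketch serialises a KQML-style performative and ships it over a loopback UDP socket. The agent names (`reactive`, `pilot`) and the message content are hypothetical examples for illustration only; the actual BGA message set is defined in chapter 5.

```python
import socket

def kqml(performative, **params):
    """Serialise a KQML-style message as an s-expression string.
    Parameters are emitted in sorted order as :name value pairs."""
    fields = ' '.join(f':{k} {v}' for k, v in sorted(params.items()))
    return f'({performative} {fields})'

# One agent tells another a blended speed command; UDP keeps the
# exchange connectionless, in line with the distributed layout.
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.bind(('127.0.0.1', 0))               # port 0: the OS picks a free port
port = rx.getsockname()[1]

tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
msg = kqml('tell', sender='reactive', receiver='pilot',
           content='(speed 0.3)')
tx.sendto(msg.encode(), ('127.0.0.1', port))

data, _ = rx.recvfrom(1024)
received = data.decode()
tx.close()
rx.close()
```

Binding the receiver to an ephemeral port keeps the example self-contained; a real deployment would use fixed, agreed-upon ports per agent.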
Chapter 0
General Description
0.1 Introduction
Mobile robotics is a relatively new research field concerned with the control of autonomous or semi-autonomous vehicles. Mobile robots are not just a collection of algorithms, but also physical systems that operate in real environments. Thus, mobile robots provide a real-world framework in which theoretical concepts and algorithms can be tested. Although most mobile robots are experimental, there are emerging applications of industrial or commercial importance.
This thesis addresses the problem of making a mobile robot perform tasks with a certain degree of reliability in completely unknown environments. This topic is quite general, and is thus related to the specification of mobile robot tasks, the development of mobile robot behaviours, the interpretation of the environment, and the validation of the final system. Many families of techniques are available to attack the different subproblems. This thesis uses and develops intelligent techniques from the field known as soft computing (Zadeh, 1994) or computational intelligence (Schult et al., 2000), making extensive use of fuzzy logic, genetic algorithms, and neural networks. An important point is the use of learning techniques to assist in the development of both the robot behaviours and the interpretation of the environment.
Given the above robotics problems, the main objective of this thesis is the use, development and integration of techniques that do not rely on a priori models of the environment. In addition, a development environment has been created to make the robotic system robust and to help in the programming of control software. This development environment includes modules for the specification, learning and validation of mobile robot behaviours, and integrates a hybrid (reactive-deliberative) architecture that runs in a distributed environment. The different works of this thesis can be grouped into the following points:
• The development of a control architecture based on the use of intelligent techniques.
• The specification and development of the platform and of the communication model between the different elements of the system.
• The use and development of learning techniques for the definition and fusion of behaviours, as well as for the interpretation of the environment.
• The use and development of navigation techniques in unknown environments: map building, path planning and localisation.
0.2 La plataforma Quaky-Ant
Durante la preparación de esta tesis se han desarrollado, construido y probado distintas
plataformas, desde el muy simple MadCar hasta el robot final Quaky-Ant (ver capítulo 1). El
desarrollo de los distintos robots estaba condicionado por diferentes factores, siendo los más
importantes las dimensiones físicas de los laboratorios y pasillos adyacentes, y el coste total.
Las dimensiones físicas limitaban el tamaño máximo a 50 cm de diámetro, y el coste estaba
limitado
por
un
presupuesto
de
600.000
ptas.
Adicionalmente,
se
impusieron
otras
restricciones al desarrollo final:
• La plataforma debía ser suficientemente flexible para probar los distintos algoritmos y
arquitecturas presentadas en esta tesis.
• La plataforma debía ser robusta y fácilmente actualizable, haciendo uso de distintos
sensores (ultrasonidos e infrarrojos).
• La plataforma debía ser capaz de desarrollar satisfactoriamente tareas complejas,
operando en entornos tanto interiores como exteriores no abruptos.
El resultado fue el robot Quaky-Ant que, curiosamente, se parece a un robot Pioneer de
ActivMedia Robotics (ver apéndice A), pero sólo es una coincidencia estética. Este robot
HUMBERTO MARTÍNEZ BARBERÁ
incorpora una arquitectura distribuida organizada en dos capas, la de bajo nivel basada en
microcontroladores y la de alto nivel basada en microprocesador. El robot incluye 2 motores
de 12 V DC que atacan las ruedas de la dirección diferencial y el soporte de una pequeña
rueda “loca”. Esta es la típica configuración en triángulo para obtener un buen equilibrio.
Además, la distancia al suelo es pequeña para permitir un centro de gravedad bajo. El robot
(Figura 0.1) (Figura 0.2) dispone de distintos sensores, que se describen a continuación:
• Diez sensores de ultrasonidos que se usan para medir distancias a paredes y objetos
lejanos.
• Siete sensores de infrarrojos que se usan para medir distancias a paredes y objetos
cercanos.
• Dos encoders de alta resolución montados sobre los ejes de las ruedas motrices, y que se
usan para los cálculos odométricos.
• Un compás electrónico que mide la orientación del robot respecto al polo norte magnético
de la Tierra. Se utiliza para compensar los errores de odometría en entornos interiores.
• Un receptor GPS para posicionamiento absoluto, que se utiliza para compensar los errores
de odometría en entornos exteriores.
• Una cámara RGB que se utiliza como realimentación visual en tareas de teleoperación.
Figura 0.1: Robot Quaky-Ant.
Figura 0.2: Componentes de Quaky-Ant.
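A modo de ilustración del uso de los encoders para los cálculos odométricos, el siguiente esbozo implementa la actualización clásica de la pose en una dirección diferencial. Los valores de `ticks_por_metro` y `distancia_ruedas` son hipotéticos y no corresponden a los parámetros reales de Quaky-Ant.

```python
import math

def actualizar_odometria(x, y, theta, ticks_izq, ticks_der,
                         ticks_por_metro=20000.0, distancia_ruedas=0.35):
    """Actualiza la pose (x, y, theta) a partir de los incrementos de los
    encoders de las dos ruedas motrices (modelo diferencial clasico)."""
    d_izq = ticks_izq / ticks_por_metro           # avance rueda izquierda (m)
    d_der = ticks_der / ticks_por_metro           # avance rueda derecha (m)
    d = (d_izq + d_der) / 2.0                     # avance del centro del robot
    d_theta = (d_der - d_izq) / distancia_ruedas  # giro realizado (rad)
    # Aproximacion de arco corto: se avanza segun la orientacion media
    x += d * math.cos(theta + d_theta / 2.0)
    y += d * math.sin(theta + d_theta / 2.0)
    theta = (theta + d_theta) % (2.0 * math.pi)
    return x, y, theta
```

Con esta aproximación, el compás y el GPS mencionados arriba sirven para corregir la deriva que inevitablemente acumula este cálculo.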
Teniendo en cuenta el coste, el rendimiento y la capacidad de ampliación, el robot
Quaky-Ant se compara favorablemente con distintos robots comerciales (ver apéndice A). Las
principales características de la plataforma se pueden resumir en:
• Exhibe una arquitectura abierta. Dos puntos clave son el uso del bus I2C para la
interconexión de los distintos módulos y la definición de un protocolo abierto genérico.
• Una consecuencia del diseño de la arquitectura es la robustez y capacidad de ampliación
(todos los módulos se acceden de la misma forma independientemente de su naturaleza).
• La plataforma no está ligada a ninguna arquitectura de control o sistema de desarrollo,
aunque ha servido de base para el desarrollo de una.
• El robot opera suave y correctamente en entornos interiores y es capaz también de operar
en exteriores no abruptos. En ambos casos, no exhibe una alta velocidad máxima (en torno
a los 0.45 m/s) ya que esta no era fundamental en el diseño. En contraste, exhibe una
buena maniobrabilidad y precisión.
0.3 Fusión sensorial
Uno de los objetivos de la tesis es que el robot Quaky-Ant realice tareas en entornos de
oficina, sin conocimiento previo del entorno, y sin ninguna modificación del entorno (ver
capítulo 3). Este objetivo, entre otras cosas, dicta la necesidad de tener sensores fiables tanto
para construir mapas como para controlar el robot. El robot Quaky-Ant sólo dispone de
sensores de bajo coste que dan una distancia como salida: ultrasonidos e infrarrojos. Estos dos
tipos de sensores pueden ser complementarios (cada uno tiene sus propias fuentes de error), y
una forma natural de manejar la incertidumbre es mediante la realización de fusión sensorial
al nivel de la señal. Así, la salida de cada par de sensores apuntando a la misma zona es
combinada en lo que se llama un sensor virtual. Puesto que el robot construye su propio
modelo del mundo, técnicas que necesitan previamente un modelo del entorno (como ocurre
con los filtros de Kalman) no son aplicables. Las técnicas de aprendizaje supervisado han sido
previamente utilizadas para fusión sensorial, y se usan también en esta tesis, en particular una
red neuronal y un sistema de reglas difusas.
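La idea del sensor virtual se entiende bien con el método de reglas simples de Flynn citado más adelante, que se puede esbozar así. El umbral de 0.8 m procede del alcance útil del infrarrojo indicado en el texto; la función es ilustrativa, no el código de la tesis.

```python
def sensor_virtual(d_sonar, d_ir, umbral_ir=0.8):
    """Fusion por reglas simples (en la linea de Flynn, 1988): el infrarrojo
    domina a corta distancia y el ultrasonido a larga distancia."""
    if d_ir < umbral_ir:
        return d_ir    # objeto cercano: el infrarrojo es mas preciso
    return d_sonar     # objeto lejano: fuera del alcance util del infrarrojo
```

La red neuronal y las reglas difusas de esta tesis sustituyen esta decisión abrupta por una combinación aprendida de ambas lecturas.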
Uno de los problemas más cruciales del aprendizaje supervisado para la fusión sensorial
es la adquisición de datos. Esta adquisición no es simple porque se necesitan suficientes
medidas representativas: tanto las lecturas de los sensores como la distancia real para diferentes
ángulos de incidencia, superficies de objetos, y distancias. En este caso, la preparación de los
experimentos en un entorno real es muy difícil. Se encuentran disponibles en la literatura
buenos modelos de los sensores, y por ello se utiliza la simulación para la adquisición de
datos. El método de fusión es entrenado con estos datos, y el modelo de fusión resultante se
valida en un entorno real siguiendo un proceso iterativo (Figura 0.3).
Figura 0.3: Procedimiento de construcción del método de fusión.
Como modelo de los ultrasonidos se ha utilizado el método del sonar-tracing (Gallardo,
1999). Este método produce buenos resultados porque puede modelar las distintas causas de
error de los ultrasonidos, incluidos los dobles rebotes. Sin embargo, no modela
completamente el efecto del lóbulo principal. Así, cuando el rayo de distancia mínima está
alejado del centro del lóbulo, el tiempo de vuelo resultante es muy conservativo y produce
distancias menores que las reales. El método ha sido modificado para tener en cuenta también
este efecto, introduciendo el efecto de la distancia angular del rayo al centro del lóbulo. Como
no se encuentran en la literatura muchos modelos de infrarrojos, se ha usado directamente una
linealización de la respuesta empírica del sensor, basada en la información del fabricante.
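Una forma habitual de linealizar la respuesta empírica de un infrarrojo es interpolar sobre una tabla de calibración tomada de la curva del fabricante. La tabla siguiente es hipotética (no son los valores reales del sensor de Quaky-Ant); sólo ilustra el procedimiento.

```python
# Tabla (tension en V, distancia en m) hipotetica, tomada de la curva
# tension-distancia del fabricante; la tension decrece con la distancia
TABLA_IR = [(2.6, 0.10), (2.0, 0.15), (1.4, 0.25), (0.9, 0.45), (0.6, 0.80)]

def ir_a_distancia(v):
    """Linealizacion a tramos: interpola linealmente la distancia entre los
    dos puntos de calibracion que encierran la tension medida."""
    if v >= TABLA_IR[0][0]:
        return TABLA_IR[0][1]       # saturacion a corta distancia
    if v <= TABLA_IR[-1][0]:
        return TABLA_IR[-1][1]      # fuera del alcance util
    for (v1, d1), (v2, d2) in zip(TABLA_IR, TABLA_IR[1:]):
        if v2 <= v <= v1:
            t = (v - v2) / (v1 - v2)
            return d2 + t * (d1 - d2)
```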
Se han utilizado dos entornos distintos para la adquisición de datos para el
entrenamiento de la red neuronal, y su posterior validación. El primero está inspirado en la
planta del sótano de la Facultad de Informática (donde está el laboratorio del autor), con unas
dimensiones de 17 x 22 m, pasillos de 2 m y puertas de 80 cm (Figura 0.4). El segundo está
basado en la planta del Trinity College Fire Fighting Contest, con unas dimensiones de 3.5 x
3.5 m, y pasillos y puertas de 65 cm (Figura 0.5).
Figura 0.4: Entorno sótano.
Figura 0.5: Entorno trinity.
Para seleccionar los valores apropiados para los parámetros de la red se sigue un
proceso iterativo (Figura 0.6). En lugar de entrenar una única red, se entrenan varias redes con
distintos parámetros, y la mejor red, aquella con el menor error sobre ejemplos de prueba, se
utiliza como salida del proceso de aprendizaje. Para acelerar la ejecución del proceso de
aprendizaje se utilizan simultáneamente cuatro equipos que ejecutan un algoritmo de
backpropagation. Una vez que el proceso ha terminado el robot se simula en ambos
escenarios sobre distintos caminos predefinidos, sobre los que se realizan medidas de ambos
tipos de sensores. Las medidas se fusionan y se compara el resultado con la medida real para
así obtener una medida de error. El propósito de este paso es medir la capacidad de
generalización de la red neuronal.
Figura 0.6: Proceso de entrenamiento de la red.
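El proceso de la Figura 0.6 se puede esquematizar como un bucle de selección de modelos: se entrena una red por configuración de parámetros y se conserva la de menor error sobre los ejemplos de prueba. Las funciones `entrenar` y `evaluar` son hipotéticas (en la tesis corresponden al backpropagation y al RMSE sobre el conjunto de prueba).

```python
def seleccionar_mejor_red(configuraciones, entrenar, evaluar):
    """Entrena varias redes con distintos parametros y devuelve la de menor
    error sobre los ejemplos de prueba, como salida del aprendizaje."""
    mejor_red, mejor_error = None, float("inf")
    for config in configuraciones:
        red = entrenar(config)    # p. ej. backpropagation con esa config
        error = evaluar(red)      # p. ej. RMSE sobre el conjunto de prueba
        if error < mejor_error:
            mejor_red, mejor_error = red, error
    return mejor_red, mejor_error
```

En la tesis este bucle se reparte entre cuatro equipos para acelerar el entrenamiento.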
Los resultados se tabulan haciendo uso del error medio (Avg), varianza (Var), y error
cuadrático medio (RMSE). Los resultados de la red neuronal se comparan con los de los
sensores individuales y con el método de fusión de reglas simples de Flynn (Flynn, 1988).
Cabe destacar que la red produce mejores resultados (Tabla 3.1) (Tabla 3.2) en todos los
casos salvo en la media del escenario trinity, debido a que las paredes están siempre muy
cerca y se pueden utilizar fiablemente los infrarrojos, que es lo que hace siempre el método de
Flynn.
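Las tres medidas de error usadas en las tablas se calculan de forma directa sobre las diferencias entre distancia fusionada y distancia real:

```python
import math

def metricas_error(estimadas, reales):
    """Error medio (Avg), varianza (Var) y error cuadratico medio (RMSE)
    entre las distancias fusionadas y las distancias reales."""
    errores = [e - r for e, r in zip(estimadas, reales)]
    n = len(errores)
    avg = sum(errores) / n
    var = sum((e - avg) ** 2 for e in errores) / n
    rmse = math.sqrt(sum(e ** 2 for e in errores) / n)
    return avg, var, rmse
```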
Además de este método de fusión simple, se ha desarrollado uno más complejo basado
en el uso de un sistema de reglas difusas. En este caso las reglas se obtienen como resultado
de un proceso de aprendizaje híbrido. Este proceso está compuesto de tres técnicas, cada una
aplicada a una fase del modelado difuso. Primero se utiliza una técnica que encuentra el
número de reglas que componen el sistema. Esto se realiza por medio del mountain clustering
(Yager y Filev, 1994) que produce como salida el número inicial de centroides que
representan los distintos comportamientos de los datos. Con el número de centroides se aplica
un segundo algoritmo de clustering para obtener unos centroides mejorados que
posteriormente constituirán los parámetros iniciales de las reglas. En esta etapa se utilizan los
métodos k-means, fuzzy k-means y fuzzy tabu clustering. Finalmente, se aplica una red
neuronal adaptativa ANFIS (Jang, 1993) que se encarga del ajuste final de las reglas.
El fuzzy tabu clustering es un nuevo algoritmo que se ha desarrollado para tratar la
existencia de múltiples mínimos locales en tareas de clustering. Está basado en la técnica de
búsqueda tabú y en algunas ideas del algoritmo fuzzy c-means. La búsqueda tabú en tareas de
clustering (Al-Sultan, 1995) es diferente de la búsqueda local basada en ascensión de colinas
en el sentido de que no se queda atrapada en mínimos locales, como es el caso del fuzzy c-means. El nuevo algoritmo es, en esencia, una búsqueda tabú que tiene en cuenta los grados
de pertenencia de los datos a los distintos centroides. Así, el objetivo es dividir los datos en un
número conocido de grupos de forma que se minimice una función objetivo similar a la de
fuzzy c-means, que está basada en los grados de pertenencia a los distintos centroides. Como
medida de error de los datos a los centroides se utiliza la distancia euclídea, aunque se podría
haber utilizado cualquier otra. Se han realizado distintos experimentos que muestran como
con esta técnica se pueden obtener soluciones óptimas o casi óptimas globalmente más
fácilmente que con técnicas de clustering tradicionales. Para ello se han utilizado dos
conjuntos de datos: un conjunto en que los clusters están claramente diferenciados, y otro en
que los clusters no están claramente diferenciados (el famoso conjunto de los IRIS de
Anderson).
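La función objetivo al estilo fuzzy c-means que la búsqueda tabú intenta minimizar se puede esbozar así: los grados de pertenencia de cada dato se derivan de sus distancias euclídeas al cuadrado a los centroides, y la suma ponderada de esas distancias es el valor a minimizar. Es un esquema ilustrativo, no el código del algoritmo de la tesis.

```python
def objetivo_difuso(datos, centroides, m=2.0):
    """Evalua la funcion objetivo tipo fuzzy c-means para una particion:
    J = sum_i sum_j u_ij^m * ||x_i - c_j||^2, con u_ij derivados de las
    distancias (m es el exponente de borrosidad, tipicamente 2)."""
    total = 0.0
    for x in datos:
        d2 = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centroides]
        for j, dj in enumerate(d2):
            if dj == 0.0:                 # el punto coincide con un centroide
                u = [0.0] * len(d2)
                u[j] = 1.0
                break
        else:
            # u_ij = 1 / sum_k (d_ij^2 / d_ik^2)^(1/(m-1))
            u = [1.0 / sum((dj / dk) ** (1.0 / (m - 1)) for dk in d2)
                 for dj in d2]
        total += sum((uj ** m) * dj for uj, dj in zip(u, d2))
    return total
```

La búsqueda tabú explora desplazamientos de los centroides, aceptando también movimientos que empeoran temporalmente este valor para escapar de mínimos locales.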
De forma similar a como se hizo con la red neuronal, se han utilizado dos entornos
distintos para la adquisición de datos para el aprendizaje de las reglas difusas, y su posterior
validación. El primero es similar a un pasillo al que están conectadas diversas salas, con unas
dimensiones de 5.2 x 4.9 m, pasillos de 1.2 m y puertas de 80 cm (Figura 0.7). El segundo
está basado en la planta del Trinity College Fire Fighting Contest, con unas dimensiones de 3.5
x 3.5 m, y pasillos y puertas de 65 cm (Figura 0.8). Para generar las reglas difusas se han
utilizado distintas combinaciones de métodos de clustering, de las que destacan tres por sus
resultados. En todas se utiliza un mountain clustering para obtener el número de clusters
(Tabla 3.9), un método adicional de clustering para obtener los centroides, y una red ANFIS
(Tabla 3.11) para generar las reglas a partir de los centroides. Como métodos de clustering
para obtener los centroides se han utilizado el fuzzy k-means (Tabla 3.9), el k-means, y el
fuzzy tabu (Tabla 3.10).
Figura 0.7: Entorno pasillo.
Figura 0.8: Entorno trinity.
Una vez que el proceso de aprendizaje ha terminado el robot se simula en ambos
escenarios sobre los caminos utilizados para la adquisición de datos (Figura 0.7) (Figura 0.8).
El propósito de este paso es comprobar las características de aprendizaje de las bases de
reglas difusas. Los resultados se tabulan igualmente haciendo uso del error medio (Avg),
varianza (Var), y error cuadrático medio (RMSE). Los resultados de las reglas difusas se
comparan con los de los sensores individuales, con la red neuronal anteriormente generada, y
con el método de fusión de reglas de Flynn. Cabe destacar que la combinación basada en la
búsqueda tabú produce los mejores resultados (Tabla 3.12) (Tabla 3.13) en todos los casos,
tanto en Avg como en RMSE. Las reglas difusas comparadas con las reglas de Flynn obtienen
mejores resultados en el escenario del pasillo porque sus proporciones son similares a un
entorno real, ya que el escenario trinity tiene pasillos excesivamente estrechos para el tamaño
del robot. Finalmente, el robot se simula en ambos escenarios sobre caminos alternativos
(Figura 0.9) (Figura 0.10), distintos a los utilizados para la adquisición de datos. El propósito
de este paso es comprobar las características de generalización de las bases de reglas difusas.
Cabe destacar de nuevo que la combinación basada en la búsqueda tabú produce los mejores
resultados (Tabla 3.14) (Tabla 3.15) en el escenario del pasillo, aunque en el escenario trinity
las reglas de Flynn y la combinación basada en k-means producen resultados ligeramente
mejores. El motivo es que el escenario trinity puede ser navegado haciendo uso exclusivo de
los infrarrojos, y el recorrido de validación sigue una trayectoria muy complicada. En
cualquier caso, las reglas obtenidas por la combinación basada en el fuzzy tabu clustering
obtienen los mejores resultados de forma general.
Figura 0.9: Validación en trinity.
Figura 0.10: Validación en pasillo.
0.4 Navegación
Para un robot móvil la habilidad de navegar es una de las más importantes. La
navegación se puede definir como la combinación de tres habilidades fundamentales: la
construcción de mapas, que es el proceso de realizar mapas a partir de las lecturas de los
sensores en diferentes posiciones, la localización, que es el proceso de conocer la posición
actual del robot a partir de las lecturas de los sensores y el mapa actual, y la generación de
trayectorias, que es el proceso de obtener una trayectoria factible y segura desde una
determinada posición a otra objetivo a partir del mapa actual.
La plataforma Quaky-Ant ha sido dotada de distintos algoritmos y métodos para atacar
el problema de la navegación (ver capítulo 4). Inicialmente, el sistema asumía una odometría
fiable para operaciones de corto tiempo y distancia. Teniendo en cuenta esta asunción,
incorporaba un método de construcción de mapas y otro de generación de trayectorias. Para
solventar el problema de la fuerte restricción de la odometría, se dotó al sistema de un método
de localización basado en otro modelo de mapa. La combinación final de ambos métodos
resultó funcionar bastante bien.
El primer modelo de navegación usa un mapa de celdillas difusas y un algoritmo basado
en el A* para generar las trayectorias. De forma distinta a otras soluciones, el robot construye
los mapas mientras está en movimiento en vez de detenerse constantemente para actualizar el
mapa. La idea es que el robot exhiba cierto grado de reactividad similar al utilizado por
animales y personas. El proceso de navegación es como sigue. Cada vez que un sensor
produce un valor nuevo el mapa se actualiza. Este paso no consume mucho tiempo y se
realiza a una alta frecuencia (cada 300 ms aproximadamente). Simultáneamente el robot se va
desplazando y cada cierto tiempo se genera una nueva trayectoria teniendo en cuenta el mapa
actual. Este paso se realiza a una frecuencia menor (cada 1200 ms aproximadamente) debido
a que la generación de trayectorias consume comparativamente más tiempo.
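El bucle a dos frecuencias descrito arriba se puede esquematizar así; los periodos de 300 y 1200 ms son los aproximados que cita el texto, y el resto del esqueleto es hipotético.

```python
def ciclo_navegacion(pasos, actualizar_mapa, generar_trayectoria,
                     t_mapa=300, t_camino=1200):
    """Esqueleto del bucle de navegacion: el mapa se actualiza con alta
    frecuencia (~300 ms) y la trayectoria se regenera con menor frecuencia
    (~1200 ms) por su mayor coste computacional."""
    for paso in range(pasos):
        t = paso * t_mapa              # reloj simulado en milisegundos
        actualizar_mapa(t)             # barato: se ejecuta en cada ciclo
        if t % t_camino == 0:
            generar_trayectoria(t)     # caro: solo cada cuarto ciclo
```

Así el robot mantiene la reactividad (mapa siempre al día) sin pagar el coste de replanificar en cada ciclo.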
El método de representación de mapas está basado en los mapas de celdas difusa
(Oriolo et al., 1998). Con este método se usan y actualizan tres mapas: el de las celdas que
están posiblemente vacías (Figura 0.11), que tienen un grado de pertenencia alto al conjunto
difuso celda vacía, el de las celdas que están posiblemente ocupadas (Figura 0.12), que tienen
un grado de pertenencia alto al conjunto difuso celda ocupada, y el de las celdas que son
seguras para navegación (Figura 0.13), que se obtiene combinando los otros dos de forma que
las celdas tienen simultáneamente un grado de pertenencia alto al conjunto difuso celda vacía
y un grado de pertenencia bajo al conjunto difuso celda ocupada.
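La combinación de los dos primeros mapas en el mapa de navegación segura se puede esbozar celda a celda con operadores difusos. Se usa aquí la conjunción min(vacía, 1 − ocupada) como elección ilustrativa; la tesis no fija en este resumen el operador concreto.

```python
def mapa_navegacion(vacias, ocupadas):
    """Combina el mapa de celdas posiblemente vacias y el de celdas
    posiblemente ocupadas: una celda es segura si pertenece mucho al
    conjunto difuso 'celda vacia' y poco al conjunto 'celda ocupada'."""
    return [[min(v, 1.0 - o) for v, o in zip(fila_v, fila_o)]
            for fila_v, fila_o in zip(vacias, ocupadas)]
```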
Figura 0.11: Mapa de celdas vacías.
Figura 0.12: Mapa de celdas ocupadas.
Figura 0.13: Mapa de navegación.
Un punto importante de la solución adoptada es que el método original sólo tiene en
cuenta medidas de sensores de ultrasonidos, modelando su incertidumbre directamente sobre
el mapa. En este caso se utilizan los sensores virtuales que producen como resultado la fusión
de los ultrasonidos e infrarrojos para obtener una medida más fiable. Como el infrarrojo es
dominante sólo para distancias menores de 80 cm, la apertura del haz del mismo se aproxima
bastante a la del sonar si se proyecta sobre las celdas del mapa, ya que estas tienen
habitualmente un tamaño de 10 a 20 cm. De esta forma el modelo del mapa es también válido.
Otra diferencia importante estriba en la inicialización del mapa. El método original considera
una aproximación tipo sense-map-plan-move y modela el área no explorada como ambigua o
ocupada. Como la aproximación de este trabajo es distinta, el método de generación de
trayectorias requiere que el área no modelada se represente como vacía, de forma que las
trayectorias posibles crucen el área no explorada si es necesario. Finalmente, las celdas bajo
la posición del robot se consideran como no ocupadas y vacías. Trivialmente, si el robot se
encuentra en esa posición es que no hay ningún obstáculo debajo. Esta estrategia produce
muy buenos resultados, no solo en la calidad del mapa, sino también a la hora de generar
trayectorias.
La forma más directa para generar trayectorias en un mapa de celdillas es considerar
cada celda como un nodo de un grafo y posteriormente aplicar un A*. Este algoritmo (Hart et
al., 1968) permite el uso de información heurística cuando está disponible, resultando en una
búsqueda eficiente. A partir del mapa de navegación segura se puede aplicar este método. Hay
en la literatura aproximaciones para resolver este problema teniendo en cuenta el ruido de los
sensores, el tamaño del robot, y recorridos de tamaño mínimo (Oriolo et al., 1998), pero no
simultáneamente. En este trabajo se ha optado por una solución que encuentra caminos de
tamaño mínimo a la vez que produce caminos seguros y alejados de obstáculos. La cota del
coste mínimo de la función de evaluación del A* se calcula en dos pasos: en primer lugar el
mapa difuso es dilatado localmente alrededor de la celda actual, para posteriormente aumentar
exponencialmente la función de coste (Figura 0.14).
Figura 0.14: Función de evaluación.
El paso de dilatación sirve para rellenar pequeños huecos en el mapa de forma que los
caminos tentativos no crucen las paredes por pequeñas aberturas. Un parámetro permite
especificar el diámetro de dilatación de forma que se pueden tener en cuenta robots de distinto
tamaño. El paso de aumentación expande exponencialmente el grado de certeza de que las
celdas están ocupadas, de forma que valores pequeños den resultados similares y que valores
grandes den resultados muy distintos. La idea es evitar el uso de las celdas ocupadas para la
generación de trayectorias. El paso de dilatación es aplicado localmente para evitar aplicar la
transformación al mapa completo porque la generación de trayectorias típicamente involucra
a pequeñas porciones del mapa y, además, es un proceso costoso en tiempo. Usando este
esquema basado en A*, la estrategia de navegación usa el camino obtenido para establecer un
punto de look-ahead que es usado para la navegación local. Este punto se sitúa sobre el
camino a una cierta distancia del robot, que es función de la velocidad media del robot y de la
frecuencia del ciclo de control. Se han realizado diversos experimentos de la estrategia de
navegación (Figura 0.15) (Figura 0.16), en entornos tanto simulados como reales, que
muestran su eficacia.
Figura 0.15: Estrategia de navegación simulada.
Figura 0.16: Estrategia de navegación en entorno real.
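Los dos pasos de la función de evaluación (dilatación local y aumento exponencial) admiten un esbozo sencillo; la ganancia `k` y el radio de dilatación son parámetros hipotéticos, no los valores usados en la tesis.

```python
import math

def dilatar_local(mapa, fila, col, radio=1):
    """Dilatacion local: la celda toma el maximo grado de ocupacion de su
    vecindad, rellenando pequenos huecos (p. ej. aberturas en paredes)."""
    filas, cols = len(mapa), len(mapa[0])
    vecinos = [mapa[f][c]
               for f in range(max(0, fila - radio), min(filas, fila + radio + 1))
               for c in range(max(0, col - radio), min(cols, col + radio + 1))]
    return max(vecinos)

def coste_celda(mapa, fila, col, k=5.0):
    """Coste de atravesar una celda en la cota del A*: el grado de ocupacion
    dilatado se expande exponencialmente, de modo que valores pequenos apenas
    encarecen el paso y valores grandes lo hacen prohibitivo."""
    return math.exp(k * dilatar_local(mapa, fila, col))
```

Aplicar la dilatación sólo alrededor de la celda en expansión evita transformar el mapa completo en cada replanificación.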
El primer modelo de navegación asumía una odometría fiable para corto tiempo y
distancia. Esto es claramente una restricción importante en entornos grandes o en operaciones
que requieran mucho tiempo. Para solventar esta limitación, el método anterior se
complementa con un método de localización basado en mapas de segmentos difusos. Los
mapas globales de celdillas no son válidos para localización, mientras que los mapas locales
de celdillas requieren una referencia precisa de un compás para obtener resultados fiables
(Duckett y Nehmzov, 1998). Por otro lado, el método anteriormente descrito, si se provee con
una localización fiable produce caminos seguros de una forma rápida. Así, el segundo modelo
de navegación está basado en el primero con la adición de una técnica de localización que no
necesita un modelo a priori del entorno. El nuevo modelo funciona como sigue. El mapa de
celdas difusas y el de generación de trayectorias se ejecutan de igual forma que en el caso
anterior, pero en lugar de utilizar la posición obtenida por odometría usan la posición
corregida por el método de localización. Simultáneamente, un mapa local de segmentos
difusos se actualiza cada vez que se llena un buffer de ultrasonidos (cada 1200 ms
aproximadamente). Este mapa se compara (cada 2500 ms aproximadamente) con un mapa
global de segmentos difusos para obtener las diferencias y calcular el error actual de la
odometría, para posteriormente corregir la posición. Ambos mapas se construyen sin tener
ningún conocimiento previo del entorno.
El método del mapa de segmentos difusos (Gasós y Martín, 1996b) (Gasós, 2000) es
una representación geométrica formada únicamente por segmentos de línea. Manejando
cuidadosamente la incertidumbre relativa a la posición del robot y el ruido de los sensores, el
método intenta solucionar los problemas típicamente asociados a los modelos geométricos. En
este caso, el mapa reduce en gran medida la falta de estabilidad disminuyendo la precisión de los
ajustes de las líneas (haciéndolas más gruesas), manteniendo la incertidumbre durante todo el
proceso. El modelo original genera los segmentos de línea considerando cada sensor
individualmente a lo largo del tiempo. Las medidas de los sensores son preprocesadas para
eliminar errores claramente detectables y agrupar medidas correlativas. Finalmente, estos
grupos son aproximados por una recta para así generar un segmento difuso. Esta aproximación
genera mapas aceptables, pero no tiene la habilidad de combinar medidas de distintos
sensores al nivel sensorial. Esto hace que el proceso de generación de los mapas sea lento,
porque se necesita bastante tiempo para obtener buenas medidas que puedan ser ajustadas por
una recta. Se ha desarrollado un nuevo método de generación de segmentos para solventar
este problema. Así, este método construye y mantiene de forma heurística un buffer circular
que almacena las últimas medidas de todos los sensores. Cada vez que un nuevo conjunto de
medidas está disponible (un ciclo de disparo de los sensores), las que son menores de una
determinada longitud se almacenan secuencialmente en el buffer sobreescribiendo los valores
menos recientes. Este buffer se utiliza para extraer las agrupaciones de puntos que
posteriormente se ajustarán con una recta para dar lugar a un segmento difuso (Figura 0.17).
Este método obtiene heurísticamente segmentos más consistentes (Figura 0.19) que en el
modelo original (Figura 0.18), a la vez que se necesitan menos movimientos. Este método ha
sido probado en entornos simulados y reales con resultados satisfactorios.
Figura 0.17: Buffer de sensores y generación de segmentos.
Figura 0.18: Generación original.
Figura 0.19: Generación modificada.
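El buffer circular del método propuesto se puede esbozar con una cola de tamaño fijo: cada ciclo de disparo aporta las medidas cortas de todos los sensores, y las más antiguas se sobreescriben solas. La capacidad y la longitud máxima son parámetros hipotéticos.

```python
from collections import deque

class BufferSensores:
    """Buffer circular que conserva los ultimos puntos medidos por todos los
    sensores; de el se extraen las agrupaciones que luego se ajustan con una
    recta para generar segmentos difusos."""
    def __init__(self, capacidad=64, longitud_maxima=1.0):
        self.puntos = deque(maxlen=capacidad)   # sobreescritura automatica
        self.longitud_maxima = longitud_maxima

    def anadir_ciclo(self, medidas):
        """medidas: lista de (x, y, distancia) de un ciclo de disparo; solo
        se almacenan las medidas menores que la longitud maxima."""
        for x, y, d in medidas:
            if d < self.longitud_maxima:
                self.puntos.append((x, y))
```

Al mezclar medidas de todos los sensores en un mismo buffer se obtienen antes grupos de puntos ajustables por una recta, que es justo la mejora frente al modelo original.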
Para la localización (Gasós y Martín, 1996a) se utilizan mapas de segmentos difusos
obtenidos de la forma anterior. Mientras que el robot navega por el entorno para realizar una
tarea, las observaciones de los sensores se utilizan para construir un mapa parcial de
segmentos difusos que incluye el entorno cercano del robot. Este mapa es comparado con el
mapa global que se ha construido previamente (se ha podido actualizar en el instante
anterior). Cuando la superposición de ambos mapas produce suficiente evidencia de
coincidencia, la localización del robot se actualiza basada en las diferencias entre los mapas.
La localización se ejecuta como un proceso continuo durante todo el tiempo que el robot se
está desplazando. Un punto importante en el proceso de localización es la correcta
coincidencia entre el mapa local y el global. Si no hay suficientes segmentos perpendiculares,
la posible coincidencia no se tiene en cuenta. Si sólo coinciden segmentos paralelos no se
puede tener suficiente certeza de la posición en la dirección de las paralelas. Claramente,
cuanto mayor sea el número de segmentos correctos que se han generado mayor es la
posibilidad de alcanzar una localización satisfactoria. Este es el mayor beneficio que se
obtiene del método de generación de segmentos propuesto anteriormente. Se han realizado
diferentes pruebas de localización, tanto en simulación (Figura 0.20) como en entorno real
(Figura 0.21), mostrando unos buenos resultados.
Figura 0.20: Localización en entorno simulado.
Figura 0.21: Localización en entorno real.
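La condición de aceptar una coincidencia sólo cuando hay segmentos perpendiculares (con sólo paralelos la posición queda indeterminada a lo largo de su dirección) se puede comprobar así; la tolerancia angular es una elección ilustrativa.

```python
import math

def coincidencia_fiable(orientaciones, tolerancia=math.radians(15)):
    """Devuelve True si entre las orientaciones de los segmentos emparejados
    hay al menos un par aproximadamente perpendicular, condicion necesaria
    para fijar la posicion en las dos direcciones del plano."""
    for i, a in enumerate(orientaciones):
        for b in orientaciones[i + 1:]:
            # diferencia entre direcciones de recta (modulo pi)
            dif = abs(a - b) % math.pi
            if abs(dif - math.pi / 2) < tolerancia:
                return True
    return False
```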
0.5 Arquitectura de control
Una vez que la plataforma Quaky-Ant ha sido dotada de unas habilidades básicas
(fusión sensorial y navegación), el siguiente problema que surge es como estructurar y
organizar los distintos elementos de un sistema de control. En este contexto se ha definido una
arquitectura, llamada BGA (ver capítulo 5), que está basada en un modelo multi-agente
(Hendler, 1999) para combinar agentes reactivos y deliberativos. Estos agentes juegan un
papel importante como elementos estructurales de la arquitectura, y se comunican mediante
un esquema basado en una pizarra (Hayes-Roth, 1985). Estos agentes se pueden organizar en
distintos niveles y distribuir en función de sus funcionalidades. En concreto se ha utilizado
una división en niveles similar a las arquitecturas híbridas de tres niveles (Gat, 1998). Como
estos agentes pueden incorporar comportamientos reactivos, la fusión de sus salidas se realiza
mediante un método basado en el context-dependent blending (Saffiotti, 1997b). El diseño de
la arquitectura se ha basado en los siguientes requisitos:
• Integración de módulos reactivos (ej. evitación de obstáculos) y deliberativos (ej.
generación de trayectorias), de forma que cada módulo sea una entidad semi-autónoma e
independiente, encargada de una determinada tarea. Así se consigue la encapsulación,
aislamiento, y control local.
• No se hace ninguna asunción acerca del conocimiento que tiene cada subsistema ni de
los métodos de resolución que implementa. Además, se debe hacer un uso extensivo de la
lógica difusa.
• Organización flexible y dinámica, que permita la distribución de módulos dependiendo de
sus necesidades de procesado y la habilitación y deshabilitación de dichos módulos en
ejecución.
• Adecuación para la incorporación de técnicas de aprendizaje en los distintos elementos de
la arquitectura.
En el contexto de la arquitectura BGA se definen tres unidades de computación básicas:
• Behaviour: es la unidad computacional más simple de BGA. Los comportamientos son
procesos concurrentes que tienen un tiempo de ejecución con cota casi constante. Estos
comportamientos tienen una finalidad reactiva.
• Blender: cuando se tienen uno o más comportamientos el blender se encarga de combinar
los distintos comportamientos, tanto la fusión de comportamientos como la de comandos.
No es necesario que haya ningún blender, pero sí es recomendable.
• Task: es la unidad computacional más compleja de BGA. Las tareas son procesos
asíncronos que no tienen un tiempo de ejecución con cota constante. Estas tareas tienen
una finalidad deliberativa.
Usando estas unidades básicas, la arquitectura BGA (Figura 0.22) está compuesta por
elementos de los siguientes tipos:
• Agent: un agente es un mecanismo que sirve para agrupar behaviours, blenders y tasks. Un
agente no puede contener simultáneamente comportamientos y tareas. Esto da lugar a dos
tipos de agentes: los agentes reactivos, que son aquellos que sólo contienen
comportamientos, y los deliberativos, que son aquellos que sólo contienen tareas. Los
agentes se ejecutan de una forma asíncrona y distribuida.
• Blackboard: la pizarra sirve como una infraestructura de comunicación que enlaza los
distintos agentes del sistema. Todos los agentes utilizan un protocolo común, basado en
KQML, para comunicarse con la pizarra. Esta comunicación se puede realizar usando una
red o memoria compartida.
• Sensor processing: las unidades de procesamiento sensorial están a cargo de acceder a las
diferentes señales de los sensores para filtrarlas, procesarlas, y posteriormente enviar la
información a la pizarra. Típicamente sólo hay uno de estos módulos en la arquitectura.
• Actuator control: las unidades de control de los actuadores están a cargo de acceder a la
pizarra para leer comandos motores y posteriormente enviarlos a los distintos actuadores.
Típicamente sólo hay uno de estos módulos en la arquitectura.
Figura 0.22: La arquitectura BGA.
Una idea clave en BGA es que cada agente reactivo se descompone en comportamientos
muy simples que se ejecutan concurrentemente y acceden a la pizarra a través de variables de
entrada y salida. Un agente reactivo en BGA (Figura 0.23) está compuesto por tres tipos de
módulos que se ejecutan secuencialmente:
• Un módulo common que se ejecuta antes de cualquier comportamiento y que construye o
actualiza el espacio perceptual local (LPS) del agente (Saffiotti, 1997b). El LPS es una
estructura que almacena información perceptual, interpretaciones de estos datos, y
representaciones de subobjetivos creados por el planificador.
• Un conjunto de comportamientos que se ejecutan concurrentemente. Estos son módulos
reactivos estándar que enlazan directamente entradas (de los sensores o del LPS) a
acciones.
• Un módulo blender que especifica qué comportamientos están activos y cómo se fusionan
las salidas de estos.
Figura 0.23: Agente reactivo BGA.
Puesto que dos comportamientos pueden modificar la misma variable de salida, se
necesita un mecanismo de coordinación de comportamientos. En este caso, los blenders de
BGA utilizan un método basado en el context-dependent blending (Saffiotti, 1997b). El
método en BGA funciona como sigue. En tiempo de desarrollo, a cada comportamiento se le
asigna un peso que es fijo durante toda la ejecución, y determina cuán importante es ese
comportamiento (y por tanto se utiliza para pesar las distintas salidas). Cada comportamiento,
basado en su información sensorial (e.d. sus variables de entrada), produce una serie de
acciones (e.d. sus variables de salida) que han de ser introducidas en la pizarra. En lugar de
trabajar directamente sobre la pizarra, cada comportamiento utiliza una copia local de las
trabajar directamente sobre la pizarra, cada comportamiento utiliza una copia local de las
variables de salida. El blender implementa una base de meta-reglas difusas, donde las reglas
tienen entradas de información sensorial o de contexto y las salidas representan los grados de
activación de los distintos comportamientos. Un mecanismo de inferencia difuso obtiene
finalmente los grados de activación que se van a aplicar. Estos grados definen el esquema de
arbitrio. Cuando un comportamiento tiene grado de activación cero, este es desactivado. Si
tiene otro grado, este se activa parcialmente. Finalmente las copias de las variables locales se
combinan teniendo en cuenta tanto los pesos de los comportamientos como sus grados de
activación. La ventaja de utilizar estos pesos reside en su uso al aplicar técnicas de
aprendizaje para obtener la base de reglas de fusión (ver capítulo 7). En este caso, estos pesos
indican el conocimiento que tiene el usuario acerca de la importancia de cada
comportamiento.
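La combinación final de las copias locales se puede esbozar como una media ponderada por el peso fijo de cada comportamiento y por su grado de activación (obtenido de las meta-reglas difusas). Es un esquema ilustrativo del blender, no su implementación real.

```python
def fusionar_salidas(comportamientos):
    """Fusiona las copias locales de una variable de salida: cada entrada es
    (salida, peso_fijo, grado_de_activacion). Un comportamiento con grado de
    activacion cero queda desactivado de forma natural."""
    numerador, denominador = 0.0, 0.0
    for salida, peso, activacion in comportamientos:
        numerador += salida * peso * activacion
        denominador += peso * activacion
    return numerador / denominador if denominador > 0.0 else 0.0
```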
Figure 0.24: BGA deliberative agent.
A deliberative agent in BGA (Figure 0.24) is composed of two types of modules that are executed sequentially:
• A common module that runs before any task and builds or updates the agent's extended perceptual space (EPS). The EPS is a structure that stores global, high-level information, interpretations of these data, and representations of goals and subgoals.
• A set of tasks that are executed sequentially. These are standard deliberative modules that do not link inputs directly to actions, but instead model the world (through the EPS) or reason about it.
An important point that distinguishes reactive agents from deliberative ones is that tasks have no explicit arbitration schemes. It is the designer's responsibility to avoid conflicts between the tasks. This is not restrictive: if tasks do conflict, a further task can be implemented to resolve the conflicts between them.
In the BGA architecture, the communication services offered by the blackboard are a key point. Agents communicate by sending data to the blackboard and receiving data from it. For this task BGA uses KQML (Labrou et al., 1999), an agent communication language. Typical KQML implementations run over TCP, a robust protocol designed to provide reliable end-to-end byte streams over unreliable networks. This reliability comes at the cost of high latency compared with UDP, a connectionless transport protocol that provides unreliable end-to-end byte transmission. The BGA architecture is intended for robotic applications in which the agents typically run on the same network and response times are critical. Since reducing latency reduces response time, the BGA communication service has been implemented over UDP.
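As a rough illustration of this design choice, the sketch below ships a KQML-style performative string over UDP using standard Java sockets. The message text and loopback round-trip are hypothetical; real KQML messages carry more fields, and UDP delivery is not guaranteed in general.

```java
import java.net.*;
import java.nio.charset.StandardCharsets;

// Sketch of sending a KQML-style performative over UDP, as BGA does to cut
// latency versus TCP. Loopback is used so the example is self-contained.
public class UdpKqml {
    public static String roundTrip(String performative) {
        try (DatagramSocket server = new DatagramSocket();
             DatagramSocket client = new DatagramSocket()) {
            byte[] out = performative.getBytes(StandardCharsets.UTF_8);
            client.send(new DatagramPacket(out, out.length,
                    InetAddress.getLoopbackAddress(), server.getLocalPort()));
            byte[] in = new byte[1024];
            DatagramPacket p = new DatagramPacket(in, in.length);
            server.receive(p); // unreliable over real networks; fine on loopback
            return new String(p.getData(), 0, p.getLength(), StandardCharsets.UTF_8);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("(tell :sender reactive :content (speed 0.4))"));
    }
}
```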
To ease program development within the BGA architecture, a programming language called BG has been defined, and its corresponding translator has been developed. The language fully supports all the elements of the architecture and lets the user build BGA-based control systems in a way that is both friendly and powerful. In addition, behavior modeling is supported through fuzzy logic. A BG program consists of the definition of a series of BGA agents and their interfaces with the blackboard. A BG program is intended to run as a single process (which may contain several threads of execution). Thus, a fully distributed BGA architecture is composed of several BG programs, each running on a different processor. In its current version, the BG language includes many of the elements and control structures of classical imperative languages:
• Local and global variables.
• Expressions and assignments.
• Arithmetic, logical and relational operators.
• Conditionals.
• C-like functions.
In addition, BG supports two data types: real numbers and fuzzy sets with standard shapes. BG also includes advanced control structures:
• Deterministic finite automata.
• Fuzzy rule bases.
• Support for approximate reasoning.
• Support for robot navigation.
• Support for agents, behaviors, tasks and blenders.
BG supports only real numbers and fuzzy sets (Listing 5.9). One design criterion for BG was to keep it as simple as possible, supporting only a minimal instruction set without losing generality. In this sense, integers, booleans, etc. can be seen as particular cases of the reals. Real variables can be accessed both for reading and writing, and are only available within the scope of a BG program. Modifiers can be used to access variables outside a program's scope (in general, to access the blackboard). These modifiers restrict access to such variables to read or write mode, reducing the overhead of external access. Three modifiers are supported: sensor, for read-only access; effector, for write-only access; and status, for read-write access. Five kinds of fuzzy sets are allowed: trapezoidal, triangular, crisp, sigmoid, and bell functions. In addition, a special data type allows TSK-style consequents to be defined. Real variables can be operated on with the standard arithmetic, logical and relational operators (Listing 5.10), and fuzzy variables with the t-norm, t-conorm, inverse, and membership-degree operators (Listing 5.11).
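A minimal sketch of the fuzzy operations just listed (trapezoidal membership, the default minimum t-norm and maximum t-conorm, and the complement); the shapes and names are illustrative, not the BG runtime.

```java
// Toy versions of the fuzzy primitives mentioned above.
public class Fuzzy {
    // Membership of x in a trapezoid (a <= b <= c <= d): 0 outside (a, d), 1 on [b, c].
    public static double trap(double x, double a, double b, double c, double d) {
        if (x <= a || x >= d) return 0.0;
        if (x < b)  return (x - a) / (b - a);
        if (x <= c) return 1.0;
        return (d - x) / (d - c);
    }
    public static double tnorm(double u, double v)   { return Math.min(u, v); } // default AND
    public static double tconorm(double u, double v) { return Math.max(u, v); } // default OR
    public static double inverse(double u)           { return 1.0 - u; }        // complement

    public static void main(String[] args) {
        double near = trap(0.5, 0.0, 0.0, 0.4, 1.0); // "near" membership of 0.5 m
        double slow = trap(0.5, 0.0, 0.0, 0.3, 0.6); // "slow" membership of 0.5 m/s
        System.out.println(tnorm(near, slow));       // degree of "near AND slow"
    }
}
```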
Similarly to C, the BG language supports the classical control structures and assignments, as well as fuzzy rule bases and navigation-support methods. These structures are called commands and can be grouped into blocks, which are lists of commands executed sequentially (Listing 5.12). BG also supports user-defined functions (Listing 5.13), which are global to all modules defined within a BG program. The implicit return type is real, and function parameters are also implicitly real numbers. Functions without a return command implicitly return zero.
Deterministic finite automata (Listing 5.14) let the programmer sequence tasks and actions, and define the conditions for switching between them. In its current version, the BG language has no loop constructs (while, for, or repeat-until), although loops can be modeled as automata. The main reason for omitting loops is to keep the execution cycle of the different commands under control (so that an error cannot block it). In the future they will be incorporated as in COLBERT (Konolige, 1997), where they are ultimately decomposed into automata. Another use of automata is the implementation of simple planners that merely chain actions.
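The automaton-based sequencing idea can be sketched as a tiny state machine; the states and transitions below are hypothetical, standing in for a BG automaton whose states carry goal coordinates.

```java
// Sketch of a BG-style automaton sequencing the legs of a mission: each
// state represents a goal, and reaching it fires the transition.
public class Sequencer {
    public enum State { GO_TO_GOAL, RETURN_HOME, DONE }

    private State state = State.GO_TO_GOAL;
    public State state() { return state; }

    // Called each control cycle with "was the current state's goal reached?".
    public void step(boolean goalReached) {
        if (!goalReached) return;
        switch (state) {
            case GO_TO_GOAL:  state = State.RETURN_HOME; break; // first leg done
            case RETURN_HOME: state = State.DONE;        break; // task complete
            default: break;                                     // DONE is absorbing
        }
    }

    public static void main(String[] args) {
        Sequencer s = new Sequencer();
        s.step(false); // still traveling: no transition
        s.step(true);  // goal reached: head home
        s.step(true);  // home reached: done
        System.out.println(s.state());
    }
}
```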
Fuzzy rule bases (Listing 5.16) are intended both for implementing control mechanisms and for behavior fusion, and are one of the most prominent elements of the language. Fuzzy rules in BG can be formed from any combination of fuzzy operators, variables and fuzzy sets in the antecedent, and any combination of variables and fuzzy sets in the consequent (including TSK consequents). In addition to standard fuzzy rules, BG includes background rules, whose antecedents always evaluate to the same membership degree: the rule's parameter. These rules are useful for covering the undefined part of the input space. This is achieved by specifying a very low firing degree, which dominates if no other rule fires and produces a negligible contribution if any does. Background rules play an important role when used in conjunction with an automatic learning system, since such systems tend to leave areas of the input space uncovered, depending on the examples. In the current implementation, different t-norms and t-conorms can be chosen, but they can only be changed globally, before a BG program is executed. Unless changed, the minimum t-norm and the maximum t-conorm are used.
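The effect of a background rule can be illustrated with a toy TSK-style inference (constant consequents, weighted-average defuzzification); the firing degrees and consequent values are made up for the example.

```java
// Sketch of a background rule: it always fires at a fixed low degree, so its
// consequent dominates only when no ordinary rule covers the input.
public class BackgroundRules {
    // Weighted-average defuzzification over ordinary rules plus one background rule.
    public static double infer(double[] firing, double[] consequent,
                               double bgDegree, double bgConsequent) {
        double num = bgDegree * bgConsequent, den = bgDegree;
        for (int i = 0; i < firing.length; i++) {
            num += firing[i] * consequent[i];
            den += firing[i];
        }
        return num / den; // den > 0 as long as bgDegree > 0
    }

    public static void main(String[] args) {
        // No ordinary rule fires: the background output (stop, 0.0) wins.
        System.out.println(infer(new double[]{0.0, 0.0}, new double[]{10.0, -10.0}, 0.01, 0.0));
        // An ordinary rule fires strongly: the background contribution is negligible.
        System.out.println(infer(new double[]{0.9, 0.0}, new double[]{10.0, -10.0}, 0.01, 0.0));
    }
}
```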
The BG language provides a structure and methods for managing two-dimensional spatial information for map building and trajectory generation (Listing 5.18). Fuzzy grid maps are used as the map structure. The user can specify the map initialization parameters: the horizontal and vertical size, the cell size, and the trajectory-generation method. Two trajectory-generation methods are supported: A* and the direct method. The latter computes the bearing of the goal point with respect to the robot's current orientation without considering any obstacle. This method is useful for implementing fully reactive goal-seeking behaviors.
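The direct method reduces to computing the goal bearing relative to the robot's orientation; a sketch under the assumption of a planar pose (x, y, theta), with hypothetical names:

```java
// Sketch of the "direct" trajectory method: the turn toward the goal is
// computed from the robot pose alone, ignoring all obstacles.
public class DirectPath {
    // Turn (radians, normalized to [-pi, pi]) from the robot's current
    // orientation theta toward the goal point (gx, gy).
    public static double turnToGoal(double x, double y, double theta,
                                    double gx, double gy) {
        double desired = Math.atan2(gy - y, gx - x);
        double turn = desired - theta;
        while (turn >  Math.PI) turn -= 2 * Math.PI; // wrap into [-pi, pi]
        while (turn < -Math.PI) turn += 2 * Math.PI;
        return turn;
    }

    public static void main(String[] args) {
        // Robot at the origin facing +x, goal straight up on +y: turn left by pi/2.
        System.out.println(turnToGoal(0, 0, 0, 0, 5));
    }
}
```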
BG was designed as the development language for the BGA architecture, and thus supports the definition of the different BGA elements (Listing 5.20): agents, tasks, behaviors and blenders. The different processing blocks (behaviors, tasks and blenders) may contain real variables with local scope, although local fuzzy sets are not supported. A behavior definition may include a weight, which is used in the BGA fusion process; if no weight is specified, the value 1 is assumed. When a BG program is used in conjunction with learning techniques, the variables (and their respective ranges) to be used in the blender block can be specified. In this way the search space of the learning method can be bounded (see chapter 7).
A BG program (Listing 5.22) is simply a collection of agents and variables. Additionally, an initialization block can be specified that runs exactly once when the BG program starts, and in which any required initialization can be performed.
The BG language interpreter has been developed entirely in Java. The syntax analyzer is based on an LALR parser generated from the grammar definition, much as is done with YACC (Johnson, 1975), but using Java instead of C. The BG interpreter consists of two modules: the BG parser and the BG executor. The BG parser takes as input a file containing a BG program and performs the lexical and syntactic analysis. Since the BG language is simple, semantic analysis (basically just type checking) is carried out simultaneously with parsing. Once a BG program has been analyzed, the module produces an intermediate code for later execution. The BG executor takes that intermediate code as input without performing any checks, and then interprets it (in effect acting as a transliterator). The intermediate code is stored as a directed graph whose nodes correspond to the different elements of the BG language (variables, expressions, commands, behaviors, tasks and agents). In this way, the most time-consuming part of the interpretation process (the lexical and semantic analysis) is decoupled from execution for efficiency reasons. Although the BG executor is also written entirely in Java, itself an interpreted language, BG programs have run satisfactorily even on an i486 processor.
The BG interpreter has been incorporated into a development environment called BGen, which also integrates a parameterized simulator for different holonomic robots, visualization and data-logging tools, and a behavior-learning tool. The BGen environment is intended not only for development but also for final deployment. In the development stage, the user defines and specifies a BG program that reflects a particular BGA architecture, and then simulates the robot's overall behavior to verify the suitability of the solution. Once the BG program is ready for deployment as a final application, it is transferred to the real robot, which also runs the BGen environment but uses an interface to the robot's low-level elements instead of the simulator. This is possible because the whole environment is written in Java, which implies that the robot must include a Java virtual machine, available for most operating systems. The robots used in this thesis to test the BGen environment run Linux. This development model has some advantages:
• The development and final execution environments are identical.
• The time needed to move from development to execution is minimal.
• Different platforms can reuse code, except for the very low-level functions.
Figure 0.25: Example BGA architecture.
Once an architecture (BGA), a language (BG) and a development environment (BGen) are available, the most natural next step is to use them in an application on the Quaky-Ant platform (see chapter 6). The robot's task is to navigate from a given initial position to a goal position and then return to the initial one, in a completely unknown environment similar to any indoor office setting (with different rooms and corridors). The coordinates of the two positions are known beforehand. Additionally, the robot must neither collide nor become stuck. To meet this objective, the system has been decomposed, extracting its most important elements (Armstrong et al., 1998), into the following BGA agents (Figure 0.25):
• Reactive Control. The BGA agent in charge of reactive navigation. It implements a series of behaviors to avoid collisions, follow a given path, align with a wall, or escape from U-shaped obstacles.
• Planning. The deliberative BGA agent in charge of high-level planning and goal supervision.
• Navigation. The deliberative BGA agent in charge of building a model of the environment, localizing the robot within it, and generating a possible path to reach the goal.
• Sensor Processing and Actuator Control. The BGA agents in charge of processing sensor data and controlling the motors. The former accesses the raw sensor signals, filters and processes them, and places the relevant information on the blackboard; it is in this agent that the sensor fusion algorithms run (see chapter 3). The latter reads motor commands from the blackboard and sends the corresponding control actions to the motors.
Figure 0.26: Reactive Control agent.
The Reactive Control agent (Figure 0.26) is the most important and critical one, since it implements the basic reactive behaviors, which are responsible for preserving the robot's physical integrity through collision avoidance. Almost all the behaviors are defined by fuzzy rule bases, obtained from previous experience or through fuzzy modeling techniques. The following reactive behaviors have been identified and developed:
• Avoid-left. Avoids an obstacle located to the robot's left.
• Avoid-front. Avoids an obstacle located in front of the robot.
• Avoid-right. Avoids an obstacle located to the robot's right.
• Align-left. Keeps the robot at a given distance from the left wall.
• Align-right. Keeps the robot at a given distance from the right wall.
• Move-to-goal. Steers the robot toward the current goal.
• Escape. Turns the robot randomly to escape from a U-shaped wall.
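One of these behaviors can be sketched as a single fuzzy rule; the membership shape and turn gain below are illustrative stand-ins, not the rule base actually used.

```java
// Sketch of a behavior in the spirit of Avoid-left: one fuzzy rule
// "IF left distance IS near THEN turn right", with the turn command
// scaled by the rule's firing degree.
public class AvoidLeft {
    // Trapezoidal "near" set: fully near below 0.3 m, not near beyond 0.8 m.
    static double near(double d) {
        if (d <= 0.3) return 1.0;
        if (d >= 0.8) return 0.0;
        return (0.8 - d) / 0.5;
    }

    // Proposed turn in degrees; negative means turn right, away from the wall.
    public static double turnCommand(double leftDistance) {
        return near(leftDistance) * -30.0;
    }

    public static void main(String[] args) {
        System.out.println(turnCommand(0.2)); // obstacle close on the left: strong turn
        System.out.println(turnCommand(1.5)); // nothing near: no correction
    }
}
```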
Figure 0.27: Navigation agent.
Figure 0.28: Planning agent.
The Navigation agent (Figure 0.27) is in charge of building a map of the environment, computing a path to the goal position, and keeping the robot localized. It uses the navigation structure supported by BG, and the map is continuously updated with the sensor measurements. The Planning agent (Figure 0.28) supervises the task being executed, changing the goal when a leg has been completed and stopping execution when the overall task is complete. This simple planner is implemented with the automaton structure of the BG language. The automaton states have associated goal coordinates, which the Navigation agent uses to produce a heading that the robot then follows until the state's goal is met.
Several experiments have been carried out to test the architecture, the language, the development environment, and the application described above as a whole. The first (Figure 0.30) is based on a 5.2 x 4.9 m environment (Figure 0.29) with narrow doors only slightly wider than the robot's diameter, and was run in the simulator with the Quaky-Ant platform. The results show that the robot is able to perform the task, describing a smooth trajectory and avoiding the obstacles.
Figure 0.29: Environment of the first experiment.
Figure 0.30: Results of the first experiment.
The second experiment (Figure 0.32) was carried out with a real Nomad 200 robot in a laboratory (Figure 0.31) at Örebro University, Sweden, in a 4.2 x 7.5 m environment and without odometry correction. It is worth noting that, besides completing the task, this experiment was repeated several times with different start and end points, and even with human obstacles at low speed. Once the behaviors and fusion rules (which had been developed for the Quaky-Ant robot) were properly tuned, the robot did not collide and always found the route to the goal point, provided one existed.
Figure 0.31: Environment of the second experiment.
Figure 0.32: Results of the second experiment.
The third experiment compares the robot's behavior with A*-based and direct-method trajectory generation. In the first case trajectory generation is a deliberative process, while in the second it is purely reactive. The deliberative run (Figure 0.33) shows results similar to those of the first experiment: the reactive behaviors help the robot with obstacle avoidance, while the trajectory-generation module produces safe paths. The reactive run (Figure 0.34) unfolds differently. While the reactive behaviors again help the robot avoid obstacles, the trajectory-generation module produces paths straight to the goal without taking obstacles into account. In this case the robot completes the task, but describes a trajectory that depends on the configuration of the environment. Note that with this reactive approach the robot can become trapped in U-shaped obstacles or in infinite loops.
Figure 0.33: Third experiment, deliberative run.
Figure 0.34: Third experiment, reactive run.
The fourth experiment (Figure 0.37) was carried out with a real Nomad 200 robot in a laboratory (Figure 0.35) at Örebro University, Sweden, in a 4.2 x 7.5 m environment, this time with odometry correction. The setup is similar to that of the second experiment, but involves longer legs and a much longer task execution time. The robot starts at an arbitrary point and tries to build a map of the environment by moving toward unexplored areas. There are two notable differences with respect to the previous experiments:
• In addition to the fuzzy grid map, used for trajectory generation, the robot also builds a fuzzy segment map, used for localization (see chapter 2).
• The Planning agent is similar to the previous ones, but instead of receiving the goal coordinates from a human operator (or as a parameter of a BG program), it receives them from a new agent called Topology, which incorporates a planner based on a topological map.
Figure 0.35: Environment of the fourth experiment.
The topological planner builds and updates a global topological map (Duckett and Nehmzov, 1998) by means of the relaxation technique (Duckett et al., 2000). The topological map (Figure 0.36) keeps track of the places that have been visited, as well as candidate unoccupied places. The planner selects one of these places and sends its coordinates to the Planning agent as the new goal. The experiment shows that the robot is able to build a map of the environment while remaining localized. Note that the odometry errors become very significant after a few minutes.
Figure 0.36: Topological map.
Figure 0.37: Results of the fourth experiment.
Several conclusions about the BGen environment, and about the underlying language and architecture, can be drawn from the experiments.
BG programs are simple and easy to understand. Without knowing beforehand what a BG program does, it is easy to work out its purpose, although this of course also depends on how the program was written. In any case, even though the tasks involved in achieving the robot's objective are complicated, the BG language hides the details and lets the programmer concentrate on the application domain. Specific navigation techniques are accessible from the language, and new methods can even be added. This favors rapid development of robot programs as well as their future maintenance.
The BGen environment is open and portable. While BGen was developed at the University of Murcia in parallel with the Quaky-Ant robot, some of the experiments show results on a completely different one. Interfacing BGen with the new system, and modifying the BG program to account for the new robot's geometry and speed, took less than two weeks while the author was at Örebro University, Sweden. It is worth noting that at that point the environment had not yet been fully tested on a real robot. These facts support the flexibility of the environment and the ease of porting it to other systems.
Java-based robotic environments are practical and usable. It is commonly argued that Java programs are extremely slow. While it is true that just-in-time-compiled Java programs are slightly slower than their natively compiled equivalents, Java can be used in most robotic applications. Moreover, the BG interpreter runs on top of the Java virtual machine, which means BG programs are interpreted twice. This is not a serious problem, since most of the processing-intensive techniques (map building, trajectory generation, and localization) run directly as Java libraries. The added value of the BGen environment is, again, its portability and code reuse at two levels: at the system level, by reusing the complete BGen libraries, and at the application level, by reusing BG programs across different platforms.
The BGen environment is extensible. One of the experiments shows that new techniques or modules can be added to the BGA architecture. In that case the topological planner is not a Java program, but a C/C++ program that communicates through the Java Native Interface. In this way, previously developed services or applications can be added to the environment.
0.6 Learning behavior fusion
A very important issue in behavior-based control is how to efficiently coordinate conflicts and competition among different types of behaviors to achieve good performance (see chapter 7). The BGA architecture's solution, based on a fuzzy rule system, has already been presented. Applying learning techniques to the fusion rule base can improve the robot's performance. One advantage of this scheme is that the robot can be trained with only a few behaviors; if new ones are to be added later, only the fusion rule base needs to be retrained, preserving the existing behaviors.
The learning process proposed in this thesis is as follows. The user selects the BGA agent whose fusion rule base is to be learned. An interesting and practical approach is to start from a predefined set of fuzzy rules, which is used as input to the learning process (an empty rule set can also be passed). Learning uses a genetic algorithm whose individuals are fuzzy rule bases (Pittsburgh encoding), and the result of the learning process is a fusion rule base.
The algorithm used is a modification of one employed for fuzzy modeling (Gómez Skarmeta and Jiménez, 1997). To simplify the encoding, the algorithm imposes some restrictions: fuzzy sets must be trapezoidal, and rules may use only conjunctions. While the first restriction may affect the performance of the results, the second imposes no real limitation: any set of fuzzy rules with disjunctions and negations can be reduced to one with only conjunctions, simply by changing the number of rules and complementing some sets. Trapezoidal sets are commonly used in many control applications, and the experiments show that they are adequate for this problem. The original encoding has been modified to include background rules.
Apart from the rule encoding and some parameters derived from it, a key point is the definition of the fitness function. It depends strongly on the application context and is generally not simple. The function must measure how good each individual is, which in turn reflects the robot's performance; it must therefore take into account how the robot executes a predefined task. The user must define a simulation environment with at least the following parameters: the robot's initial and final positions and what it must do, an environment with walls, doors, corridors and obstacles, and the number of simulations each individual will run. To avoid infinite loops in the simulations, two more parameters are needed: the maximum time available to solve the task, and the maximum number of collisions allowed.
Evaluating the fitness function for an individual involves using its fuzzy rule base as the fusion rule base of a BG program and simulating the robot with that program; each generation therefore produces as many BG programs as there are individuals. When a simulation finishes, it returns the following parameters:
• Whether or not the goal was achieved.
• The total number of simulation steps taken.
• The number of collisions between the robot and the environment.
• The last leg completed by the robot.
• The percentage of distance remaining for the robot to finish the current leg.
The fitness function defined from these parameters tries to account for the different aspects relevant to good robot performance: rewarding task completion, rewarding short execution times, and punishing collisions with walls. Additionally, the function must measure a robot's performance even if it did not reach the goal. To that end it also considers the number of legs completed and the degree of completion of the last unfinished leg. Otherwise, evolution could not distinguish individuals that are not yet good enough to achieve the goal but may nevertheless be good candidates for further evolution. This is very important, especially at the beginning of the evolutionary process.
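A fitness function in this spirit might combine the simulation outputs as follows; the weights and scales are illustrative, not the values used in the thesis.

```java
// Sketch of a fitness function over the simulation outputs listed above:
// reward completion and short runs, punish collisions, and give partial
// credit for completed legs so unfinished individuals stay comparable.
public class Fitness {
    public static double score(boolean goalReached, int steps, int maxSteps,
                               int collisions, int legsDone, double lastLegFraction) {
        double f = 0.0;
        if (goalReached) f += 100.0;                     // task completed
        f += 10.0 * legsDone + 10.0 * lastLegFraction;   // partial credit per leg
        f += 20.0 * (1.0 - (double) steps / maxSteps);   // faster runs score higher
        f -= 5.0 * collisions;                           // punish wall contacts
        return f;
    }

    public static void main(String[] args) {
        // Finished both legs quickly, with one collision:
        System.out.println(score(true, 400, 1000, 1, 2, 0.0));
        // Never reached the goal, but got halfway through the first leg:
        System.out.println(score(false, 1000, 1000, 0, 0, 0.5));
    }
}
```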
Several experiments have been carried out to assess the performance of the genetic algorithm on the behavior-fusion problem. In all cases, the learning process starts from a given BG program, the one described in the BGA architecture examples. The BG program specifies the number of input and output variables, which correspond to the input variables of the fusion block and to the reactive behaviors, respectively. Because simulation is very time-consuming, the scenarios used (Figure 0.38) are very small, although they pose three classical reactive problems: the goal occluded by a symmetric obstacle, the robot having to turn left with the goal on its right, and the robot having to turn right with the goal on its left. The mutation and crossover probabilities were chosen slightly higher than in typical genetic systems. The reason is related to the time needed to run a simulation: with low probabilities, the exploration time (the time needed to explore different parts of the solution space) is longer; with high probabilities it is shorter, at the cost of poorer local search. In this application, global search capability is considered more important, hence the high probabilities.
Figure 0.38: Learning environments.
The definition of behaviors in BG and their corresponding model in BGA allow weights to be used to indicate the importance of each behavior. In this learning context the weights play an important role because they are information that is not evolved and remains static throughout the learning process; moreover, this information is provided by the user, acting as domain expert. To assess the effect of these weights on the learning process, two experiments were run, one of them with all weights set to the same value. In these cases, learning with weights was found to produce better results than learning without them, and also better results than the initial rule set.
0.7 A web-based teleoperation application
In recent years, web-based robotic applications (Taylor and Dalton, 2000) have grown to the point that most major laboratories offer applications for accessing their robots over the Internet, and several technologies exist to let a robot be teleoperated over a network.
BGen-Web is a web-based robotic exploration application (see chapter 8) that serves as a proof of concept. Its aim is to provide a web-based service for highly interactive supervisory control (Sheridan, 1992) of a mobile robot, with the objective of exploring a partially or completely unknown environment. The objectives of the BGen-Web application can thus be summarized as:
• The application must work in real environments, both indoor and outdoor. In the latter case, since no all-terrain vehicle is available, only paved paths are considered. The intended use is in exploration and surveillance tasks.
• The system must be built on top of the BGA architecture, thereby reusing the previously developed resources.
• The user experience must be as interactive as possible, limited only by bandwidth requirements.
• The degree of user control must depend on a profile, ranging from closed-loop control to supervisory control by the operator.
As can be seen in examples available in the literature (Taylor and Dalton, 2000a) (Saucy and Mondada, 2000) (Simmons et al., 2000), HTML-based interfaces are not sufficient for highly interactive systems because they demand too much processing on the server side. As another example shows (Tso et al., 1998), the way to solve this problem is to increase the processing on the client, and in this case Java technology provides a multi-platform framework for building high-performance interfaces. Since the BGA architecture and its corresponding BGen environment and BG language have been developed in Java, it is straightforward to incorporate the previous developments into a web-based architecture. Although the BGen-Web application does not depend on a specific platform, the Quaky-Ant robot has been used. Among other elements (see chapter 2), it features a set of ultrasonic and infrared sensors, a video camera, a radio modem for bidirectional data transmission, and a 2.4 GHz video transmitter.

CHAPTER 0: GENERAL DESCRIPTION
The BGen-Web architecture (Figure 0.39) uses HTML as the mechanism for distributing the software and the configuration files, and for launching the Java applet. The clients provide interfaces for sensor data, visual feedback, and robot control. The web server provides the software repository and the gateway for the BGA architecture protocol (KQML). The video server provides real-time video. Both servers can physically run on the same machine. The BGA gateway acts as a proxy service: it receives requests from the different clients and posts them on the blackboard, and it reads sensor information from the blackboard and sends it to the different clients. Real-time video is provided by the Java Media Framework (JMF), an API that allows applications and applets to incorporate time-based media. It thus lets programmers develop software that presents, captures and stores such media, and also controls the type of processing applied to those streams. Finally, the JMF offers RTP support (RFC 1889) for use with video or audio on demand, or with interactive videoconferencing applications. The media type used for BGen-Web transmission is the ITU H.261 recommendation (Tekalp, 1995). In the current BGen-Web implementation the video server sends QCIF H.261 images at 7.5 frames/s, which provides sufficient video quality over the Internet.
Figure 0.39: BGen-Web architecture.
The operation of BGen-Web is very simple. When the applet (Figure 0.40) starts, it connects both to the BGA proxy (through a unicast address) and to the video server (through a multicast address), and begins to receive sensor data and real-time video feedback. The applet has components that display the sensor readings (ultrasound, infrared, virtual sensors, bumpers, compass, odometry and GPS), the video, and the current fuzzy grid map, and that allow the robot to be steered. At the beginning the map is empty, and it is updated as the robot moves. In supervisory control the operator can choose any desired goal point by clicking on the map. The robot runs a BG program that tries to reach the current goal, and when it reaches it, it stops and waits. The operator can select a new goal at any time. In this way the operator can explore unknown environments. For data acquisition purposes, the operator can also drive the robot by closing the control loop through the speed and steering scrollbars. This option is only available to those operators whose machine identification is stored in the proxy. When a speed or steering command reaches the proxy, the proxy checks the source address, and if it is not among those allowed the command is not forwarded to the robot.
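The proxy's address check for closed-loop commands can be sketched as follows. This is an illustrative reduction of the idea only; the class and method names (`CommandGate`, `mayForward`) are hypothetical and not the actual BGen-Web code:

```java
import java.util.Set;

// Sketch: a speed/steering command is forwarded to the robot only if
// the sender's machine identification (here, its address as a string)
// is present in the allowlist stored in the proxy.
public class CommandGate {
    private final Set<String> allowedHosts;

    public CommandGate(Set<String> allowedHosts) {
        this.allowedHosts = allowedHosts;
    }

    /** Returns true if a closed-loop command from this host may be forwarded. */
    public boolean mayForward(String sourceHost) {
        return allowedHosts.contains(sourceHost);
    }
}
```

Requests from hosts not in the set are simply dropped rather than rejected with an error, which matches the behaviour described above.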
Figure 0.40: Applet of the BGen-Web application.
The BGen-Web system has been successfully tested in different places and with different configurations (Figure 0.41). The application serves as a proof of concept, and the service is not available at all times because it is only activated for demonstrations and tests, since the equipment is shared with other experiments.
Figure 0.41: A teleoperation sequence with BGen-Web.
0.8 Conclusions
This thesis presents a series of works related to mobile robotics, covering the different layers needed to make a mobile robot perform tasks autonomously in an a priori unknown environment. Each layer provides a set of capabilities that are used by the upper layers. These layers are the physical platform, the sensor fusion layer, the navigation layer, the control architecture and language layer, and the application layer. The results and contributions obtained in each of them are summarised below.

Quaky-Ant is a simple yet powerful design with a flexible and modular architecture. When compared with commercial robots, Quaky-Ant comes out better in aspects such as cost, performance and expandability. The main features of this robot can be summarised as follows:
• It has an open architecture. Two key points are the use of the standard I2C bus for interconnecting the different modules and the definition of an open generic protocol.
• A consequence of the architecture's design is its robustness (mainly due to the custom-made protocols) and its extensibility (all modules are treated in the same way, regardless of their nature).
• The architecture is not tied to any control architecture or development system, even though it serves as the reference point for the control architecture developed here.
It operates smoothly and correctly in indoor environments and is also able to operate in non-rugged outdoor environments. In any case, it does not exhibit an outstanding top speed (approximately 0.45 m/s) because this was not a design concern. In contrast, it exhibits manoeuvrability and precision.
Two different methods have been developed to solve the sensor fusion problem in a priori unknown environments. In this setting, techniques that rely on a model and make use of a priori information are not applicable, and therefore neural networks and fuzzy rule-based systems have been used instead. These fusion techniques require examples for the learning stage. Obtaining such examples from the real sensors is an extremely complex problem. The approach followed has been to use examples obtained with a simulator. For this reason, sonar and infrared sensor models have been evaluated and integrated into the simulation environment, and also used in the validation of the sensor fusion. The sonar model is based on a sonar-tracing technique that has been modified to take into account the effects of the sonar lobes, and the infrared sensor model has been obtained by linearising the experimental values.

Once the neural network and fuzzy rule methods have been trained, they are tested on different simulated environments, ranging from small to large. The fusion methods are then compared with the raw sensor data and with a simple system based on heuristic rules. The results show that, in most cases, the proposed fusion methods perform better than the raw sensor data and the heuristic rule system. Furthermore, of all the fusion methods tested, the fuzzy rule-based system obtained using fuzzy tabu clustering has the best overall performance.
Fuzzy tabu clustering is an algorithm based on fuzzy c-means, but with the introduction of a search for the global minimum by means of a tabu search technique. The algorithm has also been successfully applied to different examples, showing results comparable with the best-known fuzzy clustering algorithms available in the literature.
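As a point of reference, the standard fuzzy c-means membership update that fuzzy tabu clustering builds on can be sketched as follows (one-dimensional data for brevity; the tabu search over cluster centres, which is the thesis's actual contribution, is not shown):

```java
public class FuzzyCMeans {
    /**
     * Membership of point x in each cluster (standard FCM update with
     * fuzzifier m > 1): u_i = 1 / sum_j (d_i / d_j)^(2/(m-1)),
     * where d_i is the distance from x to centre i.
     * Illustrative 1-D sketch, not the thesis implementation.
     */
    public static double[] memberships(double x, double[] centers, double m) {
        int c = centers.length;
        double[] u = new double[c];
        for (int i = 0; i < c; i++) {
            double di = Math.abs(x - centers[i]);
            if (di == 0) {        // point coincides with a centre: crisp membership
                u = new double[c];
                u[i] = 1.0;
                return u;
            }
            double sum = 0;
            for (int j = 0; j < c; j++) {
                double dj = Math.abs(x - centers[j]);
                sum += Math.pow(di / dj, 2.0 / (m - 1));
            }
            u[i] = 1.0 / sum;
        }
        return u;
    }
}
```

Fuzzy tabu clustering keeps this membership update but replaces the plain alternating optimisation of the centres with a tabu-guided search, to escape the local minima that fuzzy c-means is prone to.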
Two navigation schemes have been developed to allow the Quaky-Ant robot to navigate in an a priori unknown environment. The first procedure relies on the assumption of reliable odometry for operations over short time intervals and short distances. Under this assumption, the robot is provided with a map-building method and a path-planning algorithm that run simultaneously, so that the robot builds a map as it moves, resulting in continuous motion. This approach is based on a fuzzy grid map for map building and on an A*-based algorithm for path planning. The fuzzy grid map has been modified to meet the required task: virtual sensors are used instead of the sonars, the initialisation procedure has been changed, and some heuristics have been incorporated to modify the certainty of visited cells. The A*-based planner has been defined with a new evaluation function, both to generate paths that leave enough space between the robot and the obstacles and to be time-efficient, so that node evaluation is fast enough to run the planner at a high frequency. The navigation scheme has been used successfully in different scenarios, both simulated and real.
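The idea of an A* evaluation function that trades path length against clearance can be sketched on a small occupancy grid. This is a minimal illustration of the principle (4-connected grid, fixed per-cell proximity penalty), not the thesis's actual planner or cost function:

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.PriorityQueue;

public class ClearancePlanner {
    /**
     * A* over a 4-connected grid. Moving into a free cell costs 1, plus
     * nearPenalty if that cell touches an obstacle, so the planner prefers
     * routes with clearance whenever a wider route exists.
     * Returns the number of steps of the path found, or -1 if unreachable.
     * Illustrative sketch only.
     */
    public static int plan(boolean[][] obst, int sr, int sc, int gr, int gc,
                           double nearPenalty) {
        int R = obst.length, C = obst[0].length;
        double[][] best = new double[R][C];
        for (double[] row : best) Arrays.fill(row, Double.POSITIVE_INFINITY);
        int[][] steps = new int[R][C];
        // queue entries: {f = g + h, row, col}
        PriorityQueue<double[]> open =
            new PriorityQueue<>(Comparator.comparingDouble((double[] n) -> n[0]));
        best[sr][sc] = 0;
        open.add(new double[]{heur(sr, sc, gr, gc), sr, sc});
        int[] dr = {1, -1, 0, 0}, dc = {0, 0, 1, -1};
        while (!open.isEmpty()) {
            double[] n = open.poll();
            int r = (int) n[1], c = (int) n[2];
            if (r == gr && c == gc) return steps[r][c];
            for (int k = 0; k < 4; k++) {
                int nr = r + dr[k], nc = c + dc[k];
                if (nr < 0 || nc < 0 || nr >= R || nc >= C || obst[nr][nc]) continue;
                double g = best[r][c] + 1
                         + (nextToObstacle(obst, nr, nc) ? nearPenalty : 0);
                if (g < best[nr][nc]) {
                    best[nr][nc] = g;
                    steps[nr][nc] = steps[r][c] + 1;
                    open.add(new double[]{g + heur(nr, nc, gr, gc), nr, nc});
                }
            }
        }
        return -1;
    }

    // Manhattan distance: admissible, since every step costs at least 1.
    private static double heur(int r, int c, int gr, int gc) {
        return Math.abs(r - gr) + Math.abs(c - gc);
    }

    private static boolean nextToObstacle(boolean[][] o, int r, int c) {
        for (int i = -1; i <= 1; i++)
            for (int j = -1; j <= 1; j++) {
                int nr = r + i, nc = c + j;
                if (nr >= 0 && nc >= 0 && nr < o.length && nc < o[0].length
                        && o[nr][nc]) return true;
            }
        return false;
    }
}
```

Because the proximity penalty only ever increases edge costs, the Manhattan heuristic remains admissible and the search stays correct while steering paths away from walls.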
The BGA architecture takes advantage of several well-known, well-developed concepts. It is based on a blackboard architecture that coordinates a multi-agent system. The agents can be divided into levels and distributed according to their functionalities, including the schemes of three-layer hybrid architectures, where behaviour coordination is achieved through a fusion method derived from context-dependent blending. The main contributions of the BGA architecture are the following:
• BGA effectively brings together different control architecture concepts and paradigms (namely, blackboard architectures, hybrid architectures and multi-agent systems), extending and combining their features.
• BGA incorporates a new fusion mechanism based on context-dependent blending, which effectively tackles the behaviour coordination problem. In addition, its design allows expert knowledge to be used during the learning process of the fusion rule base.
• BGA makes use of KQML, common in message-passing systems but not usual in blackboard systems, implemented over UDP to improve performance.
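The fusion idea behind context-dependent blending can be illustrated with a minimal sketch: each behaviour proposes a command together with a context-dependent activation degree, and a fixed user-given importance weight scales its influence on the final command. This is an illustrative reduction of the concept, not the actual BGA blender:

```java
public class Blender {
    /**
     * Weighted fusion of behaviour outputs, in the spirit of
     * context-dependent blending: the final command is the average of the
     * proposed commands, weighted by activation (in [0,1], set by the fuzzy
     * context rules) times a static user-given importance weight.
     * Illustrative sketch only.
     */
    public static double fuse(double[] commands, double[] activations,
                              double[] weights) {
        double num = 0, den = 0;
        for (int i = 0; i < commands.length; i++) {
            double w = activations[i] * weights[i];
            num += w * commands[i];
            den += w;
        }
        return den == 0 ? 0 : num / den;  // no active behaviour: neutral command
    }
}
```

Note how the static weights play exactly the role described earlier for the learning experiments: they are not evolved, but they bias the fusion toward the behaviours the user considers more important.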
The BG language is a high-level robot programming language that fully supports the BGA architecture and therefore lets the user develop BGA-based control systems in a way that is both simple and powerful. BG helps in the development cycle of robotics applications by providing a single, uniform environment for declaring and specifying all the elements of the BGA architecture (behaviours, tasks, blenders and agents), supporting behaviour modelling through the extensive use of fuzzy logic. Moreover, a fully distributed BGA architecture can be achieved by running BG programs on different processors. The BGen development environment includes a BG language interpreter, as well as a simulator for various suitable holonomic robots, visualisation and data-logging tools, and a behaviour learning tool.

The BGen environment is intended both for development and for deployment, and provides a generic and simple way to create and test BG programs off the robot, and to run those programs on a real robot. Development and deployment times are therefore reduced considerably, and the code can be reused across different platforms. A series of experiments has shown that BGen, which includes the BG language and the BGA architecture, provides a good working environment for mobile robotics problems.
A genetic-algorithm-based method has been developed to assist in the creation of the behaviour fusion fuzzy rule bases. The algorithm has been used to obtain and learn a new set of fusion rules from an initial rule set provided by the user (which may be the empty set). The genetic algorithm has been adapted to encode individuals that are converted into BG fusion fuzzy rule bases. These individuals can therefore be standard MIMO fuzzy rules and also background fuzzy rules. Furthermore, the fitness function has been defined to include the effects of the experimental evaluation of the robot's behaviour using simulation. This function takes into account different parameters that measure the robot's performance in both successful and unsuccessful runs. This aspect is crucial in the initial part of the learning process because most individuals do not reach the goal, and discriminating between them is very important to speed up the learning process. Several experiments have been carried out to test the influence of the behaviour weights and the results of the learning process. As shown, the behaviour weights can help the learning process reduce the search space by making use of the user's domain knowledge. Moreover, the use of the learning process improves the robot's performance with respect to the user-defined fusion rules.
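A fitness function of this kind must rank successful runs above failed ones while still discriminating among failures. The following sketch captures that shape; the constants and the exact form are assumptions for illustration, not the thesis's actual function:

```java
public class FusionFitness {
    /**
     * Illustrative fitness for a candidate fusion rule base evaluated by
     * simulation. Successful runs always score above failed runs, and among
     * failed runs the fitness grows with the progress made toward the goal,
     * which is what allows discrimination early in learning when most
     * individuals never reach the goal. Hypothetical sketch only.
     */
    public static double fitness(boolean reachedGoal, double pathLength,
                                 double startGoalDist, double finalGoalDist) {
        if (reachedGoal) {
            // shorter paths score higher; any success scores above 1.0
            return 1.0 + 1.0 / (1.0 + pathLength);
        }
        // fraction of the start-to-goal distance actually covered, in [0, 1)
        double progress = Math.max(0,
                (startGoalDist - finalGoalDist) / startGoalDist);
        return Math.min(progress, 0.999);  // strictly below any success
    }
}
```

With this structure the genetic search receives a useful gradient even in the first generations, instead of a flat "failed" signal for almost every individual.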
BGen-Web is a web-based robotic exploration application created as a proof of concept. The system provides a web-based service for highly interactive supervision of a mobile robot, with the aim of exploring a partially known or completely unknown environment, and as such it has been tested with satisfactory results in real environments, both indoors and outdoors. The system has been built on top of the BGA architecture in order to use all the previously developed software resources. Using the sensor data and the real-time video feedback, the operator's experience is highly interactive when controlling a robot over the Internet (limited only by bandwidth requirements). This is an aspect that distinguishes it from other web-based robotics applications.

Another important aspect of BGen-Web is task granularity. Some systems provide control interfaces at the lowest level (for instance, direct control of the motors), while others offer command control at a higher level (for instance, sending the robot to a previously known place). BGen-Web provides commands at an intermediate level: the user chooses the desired position and the robot tries to reach it. BGen-Web therefore shows levels of interactivity beyond those of other systems, which give the operator much less information and are not perceived as interactive. Moreover, BGen-Web operates in real-world environments, as WITS and Xavier do.
0.9 Future work
This thesis presents solutions in different areas to achieve a distributed intelligent control architecture for autonomous mobile robots. This work opens new lines and areas of research for improving the different components.

The sensor fusion techniques have shown good performance, especially the hybrid neuro-fuzzy approach which, from a qualitative point of view, produces good results in the real world. The experimental setup needed to extend these tests, in order to include real data and obtain quantitative results, is very difficult. Moreover, real data acquisition is genuinely complex and has not yet been carried out. Doing so would also improve the results.

The navigation problem has been satisfactorily solved for most situations, but some aspects still need to be improved. The fuzzy grid map assumes perfect odometry, and the cells are updated based on the current location. If this map is used together with a fuzzy segments map, the quality of the results depends strongly on the location update frequency. If tracking of the current environment is lost for any reason, initial localisation could fail in highly symmetric environments, such as those found in buildings with corridors. In that case, a geometric matching technique is not enough, and this remains an open problem.
The BGA architecture is flexible enough to tackle different problems and tasks, such as those presented in this thesis. Although this architecture is not restricted to a single robot, some additional structures are needed in order to use BGA in a multi-robot scenario. One possible improvement is the inclusion of a message-passing facility for communication between different robots, while keeping the current blackboard model for the robot's internal communication. In this context, the Java-based JINI network technology is an interesting framework for establishing communications between the different agents of the architecture. It provides a simple mechanism that allows agents to connect to form a community without any planning, installation or human intervention. Additionally, the BG language should be modified to incorporate the new control structures needed for communication with other robots.
The BG language was originally designed to use fuzzy grid maps and A* path planning. During the work of this thesis new methods and techniques have been added, but they are not visible in the BG language. A clear improvement would be to use generic navigation structures (map building, path planning and localisation) that would be instantiated at run time using a configuration file. In this way the control program would simply send update commands to these structures and receive the corresponding feedback through a generic interface. From the readability point of view, the language could also be extended to support more data types (booleans, integers, data structures). From the performance point of view, the current BG interpreter is very flexible and fast, but if BG programs were compiled to Java a considerable increase in execution speed would be achieved, at the expense of more complex program preparation (on-the-fly program modifications would not be supported). The BGen architecture currently supports the Quaky-Ant and Nomad 200 robots, but it is being updated to work with other commercial robots, such as the RWI B21.
It has been shown that, for behaviour fusion, the genetic-algorithm-based approach produces good results. Its drawback lies in the time required to obtain a good solution. The problem is twofold. First, different scenarios are required to rule out random search and to test different skills. If there are many scenarios, or their dimensions are large, the algorithm spends a lot of time simulating the robot. In this case, the use of a distributed or parallel version of the algorithm could notably improve the time required to obtain a solution. In addition, more termination conditions should be incorporated to avoid wasting time on clearly useless solutions.
The BGen-Web web-based robotics application has proven effective. It exhibits a high level of interactivity, mainly due to the use of video feedback, and has many interesting real-world applications, such as semi-autonomous exploration. Normally, only a few Internet Service Providers (ISPs) provide or allow the routing of multicast packets. This clearly limits the number of potential users. One improvement could be the incorporation of a video-on-demand server to allow non-multicast users to receive the video signal, at the expense of a higher network load. Another problem is security. Currently, access control is performed based on the IP address, and no encryption is carried out. A possible improvement would be to use a Java implementation of SSL (Alan et al., 1996) to implement certificate-based encryption and authentication. Finally, if JINI technology were incorporated into the BGA architecture, the BGen-Web application would benefit from several value-added services, such as automatic discovery of BGen-Web servers.
The BGen environment could also incorporate new technologies to support distributed simulation. First, the current user interface includes only a two-dimensional representation of the environment. BGen could benefit from the use of web-based three-dimensional visualisation technologies, such as VRML (Roehl et al., 1997) or Java3D (Brown and Petersen, 1999). These allow a web browser to display three-dimensional content generated from a Java application. Most browsers support VRML, while only those running on Windows and Solaris currently support Java3D. However, this could change in the near future since Sun is developing Java3D over OpenGL. On the other hand, the use of the IEEE 1278.1 standard, the Distributed Interactive Simulation (DIS) protocol, could allow the use of a more elaborate and organised simulation environment within a distributed framework. In this case, DIS could be a cornerstone for integrating non-monolithic simulation models for multiple robots. Moreover, Java implementations of DIS are available.
PART II
Abstract
This thesis presents a series of mobile robotics related works, which cover the different
layers needed to have a mobile robot perform tasks autonomously in an a priori unknown
environment. Soft-Computing techniques (neural networks, genetic algorithms, and especially
fuzzy logic) are used in the development of the different layers.
The hardware architecture that leads to the Quaky-Ant platform is described first. This
is a mobile robot which is the basis of most of the tests of the different algorithms in real
environments. A solution to the sensor fusion problem, which is based on this platform, is
described. The fusion is first solved by using a simple neural network, and then, it is solved
by a hybrid system which is based on the ANFIS network and some clustering methods. One
of them, the fuzzy tabu clustering, has been developed in the context of this thesis. Once a
reliable sensor system is assumed (after the sensor fusion process), the navigation problem is
outlined. This is solved, in two steps, by the application of two navigation techniques. In the
first step a fuzzy grid map is applied for the map-building problem, and an A* search is
applied for the path-planning problem. In the second step, a fuzzy segments map is applied
for the map-building and localisation problems.
Once the basic navigation and environment perception capabilities have been
developed, a distributed control architecture (BGA) is defined. This serves as the basis of the
different control-related elements and allows us to distinguish between deliberative and
reactive processes. The different elements of the architecture communicate by a KQML over
UDP based protocol. To aid in the development of control programs using the BGA
architecture, a programming language (BG) has been defined. It makes heavy use of fuzzy logic and
serves to define and specify the different control modules. The BGA architecture and the BG
language have been integrated in a programming and simulation environment (BGen), which
has been developed completely in Java in order to be incorporated in different mobile robots.
To solve the behaviour fusion problem a genetic algorithm has been used. It generates and
tunes the fusion fuzzy rules by evaluating the different candidate solutions in the BGen
environment. Finally, an application of both the Quaky-Ant platform and the BGen
environment is shown (BGen-Web). This allows the teleoperation of the mobile robot in order
to perform exploratory tasks and also exhibits a high degree of interactivity due to the use of
multicast video transmission.
Chapter 1
Introduction
1.1 Objectives and contents
Mobile robotics is a relatively new research field which is related to the control of
autonomous or semi-autonomous vehicles. Mobile robots are not only collections of
algorithms but also physical embodiments that perform in real environments. Thus, mobile
robots provide a real world framework in which to test theoretic concepts and algorithms.
Although most mobile robots are in an experimental stage, important industrial or commercial
applications are emerging.
This thesis addresses the problem of making a mobile robot work with some reliability
in completely unknown environments. This topic is rather general, and thus, it is related to the
specification of mobile robot tasks, the development of mobile robot behaviours, the
interpretation of the environment, and the validation of the final system. There are many
families of techniques available to tackle the different subproblems. This thesis uses and
develops intelligent techniques from what is known as soft computing (Zadeh, 1994) or
computational intelligence (Schultz et al., 2000), making extensive use of fuzzy logic, genetic
algorithms, and neural networks. One important point is the use of learning techniques to
assist in the development of both robot behaviours and environment interpretation.
Given the robotics problem above, the main objective of this thesis is the use,
development and integration of techniques that do not make use of a priori environment
models. Moreover, a development environment has been created to make the robotics system
robust and to assist in the programming of control software. This development environment
includes modules for specification, learning and validation of mobile robot behaviours, and
integrates a hybrid (reactive-deliberative) architecture which runs in a distributed
environment. The different areas of work of this thesis can be summarised in the following
points:
• Development of a distributed control architecture, based on the use of intelligent techniques.
• Specification and development of the platform and the communication model between the different elements of the system.
CHAPTER 1: INTRODUCTION
• Use and development of learning techniques for behaviour definition and combination and for environment interpretation.
• Use and development of navigation techniques for unknown environments: map building, path planning and localisation.
This thesis is divided into a small number of chapters. Each chapter is self-contained,
examines previous research in the area, describes the proposed approach, and finally, presents
some results and conclusions. This organisation reflects the layered approach usually applied
to mobile robotics problems. Chapter 2 deals with the hardware platform developed for
testing and validation of the different algorithms and techniques. It describes the physical
platform of the Quaky-Ant robot and the previous developments upon which it is built.
Chapter 3 is about sensors and sensor fusion. It describes some techniques to obtain valuable
data from noisy and/or unreliable sensors. Chapter 4 is related to robot navigation. It describes
the way the robot builds a model of the world, how it plans the path to reach the desired goal
point, and how it remains referenced and localised to the modelled world. Chapter 5 deals
with the software architecture used to develop mobile robotics control software (BGA) and
the programming language used for robot programming (BG). Chapter 6 presents a sample
software scheme, which is based on the BGA architecture and using the BG language,
developed to perform different tasks, and shows some results obtained with the Quaky-Ant
robot. Chapter 7 is related to robot behaviour learning. It focuses on how to combine the
different behaviours of a robot using learning techniques without the intervention of a human
operator. Chapter 8 presents an application of the different works of this thesis in a web-based
exploration task. Finally, chapter 9 makes a brief overview of the thesis, summarises the
results, and indicates directions of further research. The remaining sections of this
introductory chapter serve as an overview of some fields related to this thesis and to put the
later chapters into context.
1.2 Some background on mobile robotics
1.2.1 Mobile robotics
Mobile robotics (Dudek and Jenkin, 2000) is a relatively new research area that deals
with the control of autonomous and semiautonomous vehicles. What sets mobile robotics
apart from other research areas such as conventional manipulator robotics, artificial
intelligence, and computer vision is the emphasis on problems related to the understanding of
large-scale spaces, that is, regions of space substantially larger than those that can be observed
from a single vantage point. Although at first blush the distinction between sensing in large-scale space, with its requirement for mobility, and local sensing may appear obscure, it has far-reaching implications. To behave in large-scale environments not only implies dealing
with the incremental acquisition of knowledge, the estimation of positional error, the ability to
recognise important or familiar objects or places, and real-time response, but it requires that
all these functionalities be exhibited in concert. This issue of extended space influences all of
mobile robotics. The tasks of moving through space, sensing about space and reasoning about
space are fundamental problems within the study of mobile robots.
Mobile robots are not only a collection of algorithms for sensing, reasoning, and
moving about space. They are also physical embodiments of these algorithms and ideas that
must cope with all of the vagaries of the real world. As such, mobile robots provide a reality
check for theoretical concepts and algorithms. Mobile robotics is the domain where literally
the “rubber meets the road” for many algorithms in path planning, knowledge representation,
sensing and reasoning.
Although most of the mobile robot systems currently in operation are experimental,
some mobile robot systems are beginning to be deployed in industrial settings. Real
applications in which current mobile robots have been successfully installed are characterised
by one of the following attributes: an absence of an on-site human operator, a potentially high
cost, and the need to tolerate environmental conditions that might not be acceptable to a
human. As such, robots are especially well suited for tasks that exhibit one or more of the
following characteristics:
• An inhospitable environment into which sending a human being would be either very costly or very dangerous.
• A remote environment into which sending a human operator would be too difficult or would take too long. An extreme instance is domains completely inaccessible to humans, such as microscopic environments.
• A task with a very demanding duty cycle or a very high fatigue factor.
• A task that is highly disagreeable to a human.
Mobile robots are feats of engineering. The actuators, processors, user interfaces,
sensors, and communication mechanisms that permit a mobile robot to operate must be
integrated so as to permit the entire system to function as a complete whole. The physical
structure of a mobile robot is complex, requiring a considerable investment of human and
financial resources to keep it operating.
Robots can be considered from several different perspectives. At a physical, hardware,
or mechanistic level, robots can be decomposed into the following:
• A mechanism for making the robot move through its environment. It includes the physical organisation of motors, belts, and gears necessary to make the robot move.
• A computer or collection of computers for controlling the robot.
• A collection of sensors with which the robot gathers information concerning its environment.
• Communications hardware to enable the robot to communicate to an off-board operator or any externally based computers.
CHAPTER 1: INTRODUCTION
At a device level, the hardware details can be abstracted, and a robot can be considered
as:
• A software level abstraction of the motors, encoders, and motor driver boards that allow the robot to move. Most mobile robot hardware manufacturers provide support for the underlying hardware at this level rather than force the users to deal with the details of actually turning the motors.
• Software level mechanisms or libraries to provide access to the robot’s sensors. For example, the current readings of the sonar sensors as an array of distances.
• A standard communications mechanism, such as an interface across a distributed network to the outside world.
From a still more abstract perspective, mobile robots can be considered at a purely
computational level, so that the sensors, communications, and locomotive systems are seen
simply as software modules that enable the robot to interact with its environment. Typical
components in a software architecture include the following:
• A motion control subsystem.
• A sensor control and processing subsystem.
• A sensor interpretation subsystem, which decides on future actions.
Even higher levels of abstraction exist. The term cognitive robotics is used to refer to
the use of artificial intelligence techniques within a mobile robot and often assumes the
existence of an idealised computational abstraction of the robot.
1.2.2 History of mobile robotics
Artificial intelligence and mobile robotics (Nehmzow, 2000) have always been linked.
Even before the days of the 1956 Dartmouth College Conference, at which the term “artificial
intelligence” was coined, it was known that mobile robots could be made to perform
interesting tasks and to learn. William G. Walter built a couple of mobile robots in the early
50s which were able to learn tasks like obstacle avoidance and phototaxis by instrumental
conditioning, changing charges in a robot’s capacitor which controlled the robot’s behaviour
(Walter, 1950). Early pioneers in artificial intelligence, such as Marvin Minsky and John
McCarthy, became interested in robotics almost straightaway after the 1956 Dartmouth
Conference. In the late 50s Minsky, together with Richard Greenblatt and William Gosper,
attempted to build a ping-pong playing robot. Due to technical difficulties with the machine
hardware, eventually a robot was built that could catch a ball using a basket instead of the
robot’s gripper.
At Stanford, Nils Nilsson (Nilsson, 1969) developed the mobile robot SHAKEY (Figure
1.1) in 1969. This robot used two stepper motors in a differential drive arrangement to provide
locomotion and was equipped with a triangulating optical range finder, a vidicon TV camera
with controllable focus and iris and tactile sensors mounted on bumpers. It was connected to
DEC PDP-10 and PDP-15 computers via radio and video links for off-board data processing.
SHAKEY used programs for perception, world modelling, and acting. Low level action
routines took care of simple moving, turning, and route planning. Intermediate level actions
strung the low level ones together in ways that robustly accomplished more complex tasks.
The highest level programs could make and execute plans to achieve goals given to it by a
user. The system also generalised and saved these plans for possible future use. SHAKEY’s
tasks included both obstacle avoidance and object movement within a highly structured
environment. All obstacles were simply uniformly coloured blocks and wedges. SHAKEY
maintained a list of formulae representing the objects of its environment and, using the
classical logic based problem solver called STRIPS, it determined plans of actions that it then
executed.
Again, although the reasoning system worked properly, SHAKEY often had problems
generating the symbolic information needed for the planner from the raw data obtained from
the sensors. The hardware was the difficult part of the robot project. Hans Moravec, then a
student at Stanford, recalls (Crevier, 1993):
“An entire run of SHAKEY could involve the robot getting into a room,
finding a block, being asked to move the block over the top of the platform,
pushing a wedge against the platform, rolling up the ramp, and pushing the
block up. SHAKEY never did this as one complete sequence. It did it in
several independent attempts, which each had a high probability of failure.
You were able to put together a movie that had all the pieces in it, but it
really was flaky.”
Figure 1.1: The SHAKEY mobile robot.
Also at Stanford, John McCarthy started a project in the early 70s to build a robot that
would assemble a colour television kit, and again the hardware of the robot (the physical act
of inserting components into printed circuit boards with sufficient accuracy) proved to be the
difficult part. Many researchers who were interested in robotics in the early days of artificial
intelligence left the hardware aspect of robotics aside and concentrated again on the software
and reasoning components of the control system.
The JPL Rover, developed in the 70s at the Jet Propulsion Laboratory in Pasadena, was
designed for planetary exploration. Using a TV camera, laser range finder and tactile sensors
the robot categorised its environment as “traversable”, “non-traversable” and “unknown”.
Navigation was performed by dead reckoning using an inertial compass.
At Stanford, Hans Moravec developed CART (Figure 1.2) in the late 70s (Moravec,
1990). This mobile robot’s task was obstacle avoidance using a camera sensor. The robot
would take nine pictures at one location to create a two-dimensional world model. It would
then move 1 metre ahead and repeat the process. Processing those images took 15 minutes: 5
minutes to digitise the 9 pictures, 5 minutes to perform a low level visual reduction of the
image, in which obstacles were represented as circles, and 5 minutes for the maintenance of
the world model and path planning. CART was successful at avoiding obstacles, albeit very
slow. It had, however, problems in getting its own position right and in seeing obstacles that
lacked sufficiently high contrast.
Figure 1.2: The Stanford CART mobile robot.
In the late 70s, HILARE was developed at LAAS in Toulouse (Briot et al., 1979). It was
one of the first European mobile robot projects. HILARE used computer vision, laser range
finders and ultrasonic sensors to navigate in its environment. A slow process of scene
analysis, which was carried out every 10 seconds, and a faster dynamic vision process, which
was carried out every 20 centimetres of movement, were the underlying control principles.
Nearby obstacles were avoided using ultrasonic sensors. Navigation and path planning were
achieved by using a two-dimensional polygon representation of space and a global coordinate system.
1.2.3 Robot intelligence
But what about robot intelligence? There is no usable and clear definition of
intelligence (Nehmzow, 2000). Intelligence refers to behaviour, and the yardstick by which it
is measured is very dependent on education, understanding, and point of view. Alan Turing
wrote in 1947:
“The extent to which we regard something as behaving in an intelligent
manner is determined as much by our own state of mind and training as by
the properties of the object under consideration. If we are able to explain
and predict its behaviour or if there seems to be little underlying plan, we
have little temptation to imagine intelligence. With the same object,
therefore, it is possible that one man would consider it as intelligent and
another would not; the second man would have found out the rules of its
behaviour.”
However intelligence is defined, the definition is tied to human behaviour. Humans
consider themselves intelligent, and therefore any machine that does what humans do has to
be considered intelligent too. For mobile robotics, this view has interesting consequences.
People have always felt that games like chess require intelligence, whereas moving about in
the world without any major problem is just ordinary stuff, not requiring intelligence. Rodney
Brooks defined intelligence as:
“… what humans do, pretty much all the time.”
With this definition, ordinary behaviour is the key to intelligent robots. It has proven a
lot more difficult to build robots that move around without getting into trouble than it has to
build chess playing machines. This artificial intelligence based point of view is well accepted
and it serves as a generalised reference, although there are some cases where it is questioned, as
Wallace Marshall stated in 1987:
“Artificial stupidity may be defined as an attempt by computer scientists to
create computer programs capable of causing problems of a type normally
associated with human thought.”
Like intelligence, the behaviour of any robot cannot be evaluated independently from
the robot’s environment and the task which the robot is performing. Robot, task and
environment depend upon each other, and influence each other (Figure 1.3).
Figure 1.3: Robot, task and environment interlinks.
A famous example of this fact is Tom Smither’s spider example:
“… extremely competent at survival in the countryside, but utterly
incompetent in the bathtub!”
This means, consequently, that a general-purpose robot cannot exist, in the same way
that a general-purpose living being does not exist. A robot’s function and operation are
defined by the robot’s own behaviour within a specific environment, taking into account a
specific task. Only the simultaneous description of agent, task and environment describes an
agent completely.
1.2.4 Soft Computing
There are many techniques that can be used to provide a robot with intelligence.
Nowadays, one of the hottest fields is what is called Soft Computing (Bonissone, 1997), which
is a recently coined term describing the symbiotic use of many emerging computing
disciplines. Lotfi Zadeh wrote in 1994:
“... in contrast to traditional, hard computing, soft computing is tolerant of
imprecision, uncertainty, and partial truth.”
In this context, the main components of Soft Computing are Fuzzy Logic (FL),
Probabilistic Reasoning (PR), Neural Networks (NN) and Genetic Algorithms (GA). Fuzzy
Logic, introduced by Zadeh (Zadeh, 1965), provides a language, with syntax and local
semantics, in which an expert can translate his qualitative knowledge about a problem to be
solved. FL’s main characteristic is the robustness of its interpolative reasoning mechanism.
Probabilistic Reasoning, based on the original work of Bayes (Bayes, 1763) and Dempster-Shafer’s theory of belief (Dempster, 1967)(Shafer, 1976), provides a mechanism to evaluate
the outcome of systems affected by randomness or other types of probabilistic uncertainty.
PR’s main characteristic is its ability to update previous outcome estimates by conditioning
them with newly available information. Neural Networks, first explored by Rosenblatt
(Rosenblatt, 1959) and Widrow and Hoff (Widrow and Hoff, 1960), are computational
structures that can be trained to learn patterns from examples. By using a training set that
samples the relation between inputs and outputs, and a back-propagation type algorithm
(Werbos, 1974), NNs provide a supervised learning algorithm that performs fine-grained local
optimisation. Genetic Algorithms, proposed by Holland (Holland, 1975), provide a way to
perform randomised global search in a solution space. In this space, a population of candidate
solutions, encoded as chromosomes, is evaluated by a fitness function in terms of its
performance. The best candidates evolve and pass some of their characteristics to their
offspring.
The common denominator of these technologies is their departure from classical
reasoning and modelling approaches that are usually based on boolean logic, analytical
models, crisp classification, and deterministic search. In ideal problem formulations, the
systems to be modelled or controlled are described by complete and precise information. In
this case, formal reasoning systems, such as theorem provers, can be used to attach binary
truth-values to statements describing the state or behaviour of the physical systems.
When attempting to solve real-world problems, however, it is easy to realise that the
problems are typically ill defined systems, difficult to model and with large scale solution
space. In this case, precise models are impractical, too expensive, or non-existent. The
relevant available information is usually in the form of empirical prior knowledge and input-output data representing instances of the system’s behaviour. Therefore, approximate
reasoning systems capable of handling such imperfect information are needed. Soft
Computing technologies provide a set of flexible computing tools to perform these
approximate reasoning and search tasks. Moreover, these tools can also be combined into
hybrid systems, such as the control of GAs and NNs parameters by FL, the evolution of NNs
topologies and weights by GAs or their application to tune FL controllers, and the realisation of
FL controllers as NNs tuned by backpropagation-type algorithms. The figure below (Figure
1.4) provides a graphical summary of these hybrid algorithms and their components. Lighter
cells correspond to techniques and technologies covered or used in this thesis.
In the following subsections neural networks, genetic algorithms and fuzzy logic are
introduced. The idea is to provide a very light introduction so that the non-expert reader can
follow the parts of this thesis related to intelligent techniques and their applications. For a more
in-depth discussion the reader is suggested to consult (Mitchell, 1997) and (Jang et al., 1997).
[Figure 1.4 diagram: it organises approximate reasoning approaches (probabilistic models; multivalued and fuzzy logics) and search/optimisation approaches (neural networks; evolutionary computation), together with their hybrid combinations, such as FL controllers tuned by NNs or generated and tuned by GAs, NN topologies and weights generated by GAs, and NN or GA parameters controlled by FL.]
Figure 1.4: Soft Computing overview.
1.2.4.1 Neural networks
Neural network (Mitchell, 1997) learning methods provide a robust approach to
approximating real-valued, discrete-valued, and vector-valued target functions. For certain
types of problems, such as learning to interpret real-world sensor data, artificial neural
networks are among the most effective learning methods currently known. The study of
artificial neural networks has been inspired in part by the observation that biological learning
systems are built of very complex webs of interconnected neurons. In rough analogy, artificial
neural networks are built out of a densely interconnected set of simple units, where each unit
takes a number of real-valued inputs and produces a single real-valued output.
The simplest neural network is the perceptron. It is composed of a single neuron with
some real inputs and one output, which gives 1 if a linear combination of the inputs is greater
84
HUMBERTO M ARTÍNEZ BARBERÁ
than 0, or –1 otherwise. Formally, given the inputs x1, …, xn, the output is as follows (Eq.
1.1):
o(x1, ..., xn) = 1 if w0 x0 + w1 x1 + ... + wn xn > 0, and −1 otherwise    (1.1)
where wi, i = 0, …, n, are the weights assigned to the inputs (x0 is a fixed input equal to 1).
These weights indicate the importance of the corresponding inputs. The perceptron produces
its output using a step function, which is called the squashing function (Figure 1.5).
Figure 1.5: Single perceptron structure.
This kind of node is very powerful for function representation or approximation. A
perceptron can represent logic functions such as AND and OR. It is important to notice that a
single perceptron cannot represent the XOR function or other arbitrary functions.
The basic rule for perceptron training is very simple. It applies each example in turn
and varies the weights wi (Eq. 1.2) accordingly.
∆wi = η (t − o) xi,    wi ← wi + ∆wi    (1.2)
where t is the desired output, and o is the output obtained with the current example. This
process converges provided the example set is linearly separable and η is small enough. This
η, called learning rate, controls the convergence speed of the hyperplane that separates the
examples.
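The threshold output of Eq. 1.1 and the basic training rule of Eq. 1.2 can be sketched together as follows; the AND training set, the learning rate and the epoch count are illustrative choices, not taken from the text.

```python
# Perceptron with the basic training rule. w[0] is the bias weight,
# paired with the fixed input x0 = 1.

def predict(w, x):
    """Threshold output: 1 if w0 x0 + ... + wn xn > 0, else -1 (Eq. 1.1)."""
    s = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1 if s > 0 else -1

def train(examples, eta=0.1, epochs=50):
    """Apply delta_wi = eta * (t - o) * xi to every example (Eq. 1.2)."""
    n = len(examples[0][0])
    w = [0.0] * (n + 1)
    for _ in range(epochs):
        for x, t in examples:
            o = predict(w, x)
            w[0] += eta * (t - o) * 1.0          # bias input x0 = 1
            for i, xi in enumerate(x):
                w[i + 1] += eta * (t - o) * xi
    return w

# AND is linearly separable, so the rule converges; XOR would not be,
# as a single perceptron cannot represent it.
and_set = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
w = train(and_set)
print([predict(w, x) for x, _ in and_set])   # [-1, -1, -1, 1]
```

With these settings the weights settle after a handful of epochs; a larger learning rate would still converge here, only along a different trajectory.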
When the examples are not linearly separable the basic rule does not
provide convergence. In this case a different technique, known as delta rule, solves the
problem by providing asymptotic convergence to the hyperplane that best matches the
examples, even if they are not linearly separable. It uses a function (Eq. 1.3) to measure the
training error, that is, how far the approximation of the perceptron is from the function
defined by the examples.
E(w) = (1/2) ∑d∈D (td − od)²    (1.3)
where D is the training set, td is the desired output and od is the output of the perceptron.
From the definition it is clear that the function to be minimised is a parabolic surface with a
global minimum. The delta rule is a two-step process. First, the different ∆wi are calculated
for each example, and second, the wi are updated with the new values (Eq. 1.4). The different
wi are initialised randomly and the ∆wi are set to zero. The process is stopped when the
desired error is achieved. Other learning rules are described in the literature (Mitchell, 1997).
∆wi ← ∆wi + η (t − o) xi,    wi ← wi + ∆wi    (1.4)
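The two-step process of Eqs. 1.3 and 1.4 can be sketched as a batch delta rule on a linear (unthresholded) unit; the learning rate, epoch count and noisy line-fitting data are illustrative assumptions, and the weights are zero-initialised here for reproducibility rather than randomly as the text suggests.

```python
# Batch delta rule: accumulate the delta_wi over all examples (Eq. 1.4),
# then update the weights, minimising the squared error of Eq. 1.3.

def linear_out(w, x):
    """Linear unit o = w0 + w1 x1 + ... + wn xn (no threshold)."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def delta_rule(examples, eta=0.05, epochs=200):
    n = len(examples[0][0])
    w = [0.0] * (n + 1)               # zero init for reproducibility
    for _ in range(epochs):
        dw = [0.0] * (n + 1)          # first step: accumulate the delta_wi
        for x, t in examples:
            o = linear_out(w, x)
            dw[0] += eta * (t - o)    # bias input x0 = 1
            for i, xi in enumerate(x):
                dw[i + 1] += eta * (t - o) * xi
        w = [wi + di for wi, di in zip(w, dw)]   # second step: update the wi
    return w

def sq_error(w, examples):
    """Training error of Eq. 1.3."""
    return 0.5 * sum((t - linear_out(w, x)) ** 2 for x, t in examples)

# Noisy samples of the line t = 2x - 1; the rule converges asymptotically
# towards the least-squares fit even though no weights fit the data exactly.
data = [((0.0,), -1.1), ((0.5,), 0.1), ((1.0,), 0.9), ((1.5,), 2.1)]
w = delta_rule(data)
print(round(w[0], 2), round(w[1], 2))   # -1.06 2.08 (the least-squares fit)
```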
Figure 1.6: Multilayer neural network.
The perceptron can approximate simple and linear functions. If more complex functions
are to be approximated or modelled a richer structure is needed. The multi-layer network
(Figure 1.6) is similar to a series of interconnected perceptrons where the linear output unit
has been replaced by a non-linear function (Figure 1.7), usually a sigmoid function (Eq. 1.5).
Figure 1.7: Sigmoid neuron structure.
σ(y) = 1 / (1 + e−y)    (1.5)
This structure is more complex than the single perceptron, and thus, it can approximate
more complex and non-linear functions. In a similar way as with the delta rule, the
backpropagation algorithm learns the weights for a multilayer network. It also uses a training
error function (Eq. 1.6) considering different input and output nodes.
E(w) = (1/2) ∑d∈D ∑k∈O (tkd − okd)²    (1.6)
where D is the training set, O is the output nodes set, tkd is the desired output of the
example d in the node k, and okd is the output obtained in node k for the example d. In this
case the error surface is not like the perceptron one, and now more than one local minimum
can appear. Unfortunately, this means that gradient descent is guaranteed only to converge
toward some local minimum, and not necessarily to the global minimum error. Despite this
fact, backpropagation has been found to produce excellent results in many real-world
applications.
The backpropagation algorithm proceeds as follows. It first initialises the values of the
different weights wi randomly. Then, it iteratively applies a two-phase process to each
example. The first phase propagates the input forward through the network, computing the output ou
of every unit u in the network. The second phase propagates the error backward through the
network. For each network output unit k, its error term is computed (Eq. 1.7).
δk ← ok (1 − ok)(tk − ok)    (1.7)
Then, for each hidden unit h, its error term is computed (Eq. 1.8).
δh ← oh (1 − oh) ∑k∈O wkh δk    (1.8)
Finally, the network weights wji are updated (Eq. 1.9).
∆wji = η δj xji,    wji ← wji + ∆wji    (1.9)
This method (the stochastic gradient descent backpropagation) is in fact very similar to
the delta rule and takes into account the derivative of the sigmoid units. The algorithm is
typically applied thousands of times (epochs) over the entire training set. The termination
condition is usually a maximum number of epochs, although different criteria, like a
threshold on the learning error gradient, can be used.
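The two-phase loop of Eqs. 1.7–1.9 can be sketched for one hidden layer of sigmoid units and a single sigmoid output; the network size, learning rate, epoch count and the XOR task are illustrative choices, not taken from the text.

```python
import math
import random

# Stochastic gradient backpropagation for a small two-layer sigmoid network.

def sigmoid(y):
    return 1.0 / (1.0 + math.exp(-y))

def forward(net, x):
    """Phase 1: propagate the input forward, returning hidden and output values."""
    wh, wo = net                           # hidden-layer and output-unit weights
    h = [sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))) for w in wh]
    o = sigmoid(wo[0] + sum(wi * hi for wi, hi in zip(wo[1:], h)))
    return h, o

def train(examples, n_hidden=3, eta=0.5, epochs=2000, seed=0):
    rnd = random.Random(seed)
    n_in = len(examples[0][0])
    wh = [[rnd.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    wo = [rnd.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    for _ in range(epochs):
        for x, t in examples:
            h, o = forward((wh, wo), x)
            delta_o = o * (1 - o) * (t - o)         # phase 2, Eq. 1.7
            delta_h = [h[j] * (1 - h[j]) * wo[j + 1] * delta_o
                       for j in range(n_hidden)]    # Eq. 1.8, single output unit
            wo[0] += eta * delta_o                  # Eq. 1.9, bias input = 1
            for j in range(n_hidden):
                wo[j + 1] += eta * delta_o * h[j]
                wh[j][0] += eta * delta_h[j]
                for i, xi in enumerate(x):
                    wh[j][i + 1] += eta * delta_h[j] * xi
    return wh, wo

def error(net, examples):
    """Training error of Eq. 1.6 for a single output unit."""
    return 0.5 * sum((t - forward(net, x)[1]) ** 2 for x, t in examples)

# XOR is not representable by a single perceptron but is learnable here.
xor = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
net = train(xor)
print(error(net, xor) < error(train(xor, epochs=0), xor))   # True: the error drops
```

Since gradient descent only guarantees a local minimum, a different seed or hidden-layer width can change how far the error falls.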
The set of functions that can be represented by a feedforward multilayer network
depends on the width and depth of the network. Although much is still unknown about which
function classes can be described by which types of networks, three quite general results are
known:
• Boolean functions. Every boolean function can be represented exactly by some network with two layers of units, although the number of hidden units grows exponentially in the worst case with the number of network inputs.
• Continuous functions. Every bounded continuous function can be approximated with arbitrarily small error (under a finite norm) by a network with two layers of units (Cybenko, 1989). The theorem in this case applies to networks that use sigmoid units at the hidden layer and linear units at the output layer. The number of hidden units required depends on the function to be approximated.
• Arbitrary functions. Any function can be approximated to arbitrary accuracy by a network with three layers of units (Cybenko, 1988). Again, the output layer uses linear units, the two hidden layers use sigmoid units, and the number of units required at each layer is not known in general.
1.2.4.2 Genetic algorithms
Genetic algorithms (Mitchell, 1997) provide a learning method motivated by an analogy
to biological evolution. Rather than search from general-to-specific hypotheses, or from
simple-to-complex, genetic algorithms generate successor hypotheses by repeatedly mutating
and recombining parts of the best currently known hypotheses. At each step, a collection of
hypotheses, called the current population, is updated by replacing some fraction of the
population by offspring of the best current hypotheses. The best hypothesis is defined as the
one that optimises a predefined numerical measure for the problem at hand, called the
hypothesis fitness.
Although different implementations of genetic algorithms vary in their details, they
typically share the following structure. The algorithm operates by iteratively updating the
population, a pool of hypotheses. On each iteration, all members of the population are
evaluated according to the fitness function. A new population is then generated by
probabilistically selecting the best individuals from the current population. Some of these
selected individuals are carried forward into the next generation population intact. Others are
used as the basis for creating new offspring individuals by applying genetic operations such as
crossover and mutation.
Hypotheses in genetic algorithms are often represented by bit strings, so that they can be
easily manipulated by genetic operators. These hypotheses can be quite complex. For example,
sets of if-then rules can easily be represented in this way by choosing an encoding of rules
that allocates specific substrings for each rule precondition and postcondition.
The generation of successors in a genetic algorithm is determined by a set of operators
that recombine and mutate selected members of the current population. These operators
correspond to idealised versions of the genetic operations found in biological evolution. The
two most common operators are crossover and mutation. The crossover operator (Figure 1.8)
produces two new offspring from two parent strings by copying selected bits from each
parent. The bit at position i in each offspring is copied from the bit at position i in one of the
parents. The choice of which parent contributes the bit for position i is determined by an
additional string called the crossover mask. In addition to recombination operators, a second
type of operators produces offspring from a single parent.
Figure 1.8: Crossover operator.
The mutation operator (Figure 1.9) produces small random changes to the bit string by
choosing a single bit at random and then changing its value. Some genetic systems employ
additional operators, especially operators that are specialised to the particular hypothesis
representation used by the system (Mitchell, 1997).
Figure 1.9: Mutation operator.
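The two operators described above can be sketched over bit strings; the mask shape and parent strings are illustrative.

```python
import random

# Bit-string genetic operators: the crossover mask decides, position by
# position, which parent contributes each bit of each offspring.

def crossover(parent1, parent2, mask):
    """One application of the mask yields two complementary offspring."""
    child1 = [a if m else b for a, b, m in zip(parent1, parent2, mask)]
    child2 = [b if m else a for a, b, m in zip(parent1, parent2, mask)]
    return child1, child2

def single_point_mask(length, point):
    """Mask for single-point crossover: ones up to the cut, zeros after."""
    return [1] * point + [0] * (length - point)

def mutate(bits, rnd):
    """Point mutation: flip one randomly chosen bit."""
    i = rnd.randrange(len(bits))
    flipped = bits[:]
    flipped[i] = 1 - flipped[i]
    return flipped

p1, p2 = [1, 1, 1, 1, 1, 1], [0, 0, 0, 0, 0, 0]
c1, c2 = crossover(p1, p2, single_point_mask(6, 2))
print(c1, c2)   # [1, 1, 0, 0, 0, 0] [0, 0, 1, 1, 1, 1]
```

A uniform crossover would simply use a random 0/1 mask instead of the single-point one.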
The fitness function defines the criterion for ranking potential hypotheses and for
probabilistically selecting them for inclusion in the next generation population. If the task is
to learn classification rules, then the fitness function typically has a component that scores the
classification accuracy of the rule set over a set of provided examples. More generally, when
the bit-string hypothesis is interpreted as a complex procedure, the fitness function may
measure the overall performance of the resulting procedure rather than performance of
individual rules. This is the case when the bit-string represents a collection of if-then rules
that will be chained together to control a robotic device. In some genetic systems the
probability that a hypothesis will be selected is given by the ratio of its fitness to the fitness of
other members of the current population (Eq. 1.10). This method is called roulette wheel
selection.
Pr(h) = f(h) / ∑i f(hi)    (1.10)
In tournament selection two hypotheses are first chosen at random from the current
population. With some predefined probability p the best of these two is then selected, and
with probability (1 – p) the worst hypothesis is selected. Tournament selection often yields a
more diverse population than roulette wheel selection.
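Both selection schemes can be sketched as follows; the bit-string population and the count-the-ones fitness are toy assumptions, not taken from the text.

```python
import random

# Roulette-wheel selection (Eq. 1.10) and tournament selection.

def fitness(h):
    """Toy fitness: the number of ones in the bit string."""
    return sum(h)

def roulette(population, rnd):
    """Pr(h) = f(h) / sum_i f(h_i): sample proportionally to fitness."""
    total = sum(fitness(h) for h in population)
    r = rnd.uniform(0, total)
    acc = 0.0
    for h in population:
        acc += fitness(h)
        if r <= acc:
            return h
    return population[-1]

def tournament(population, rnd, p=0.8):
    """Pick two at random; with probability p keep the fitter one."""
    a, b = rnd.sample(population, 2)
    best, worst = (a, b) if fitness(a) >= fitness(b) else (b, a)
    return best if rnd.random() < p else worst

rnd = random.Random(1)
pop = [[1, 1, 1, 1], [1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
picks = [roulette(pop, rnd) for _ in range(1000)]
print(picks.count([1, 1, 1, 1]) > picks.count([1, 0, 0, 0]))   # True
```

Note that the zero-fitness string is essentially never drawn by the roulette, while the tournament still gives it a chance, which is one source of its greater diversity.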
Genetic algorithms employ a randomised beam search method to seek a maximally fit
hypothesis. To contrast the hypothesis space search of genetics with that of neural network
backpropagation, for example, the gradient descent search in backpropagation moves
smoothly from one hypothesis to a new hypothesis which is very similar. In contrast, the
genetic algorithm search can move much more abruptly, replacing a parent hypothesis by an
offspring which may be radically different from the parent. Thus a genetic algorithm is less
likely to fall into the same kind of local minima that can plague gradient descent methods.
In many natural systems, individual organisms learn to adapt significantly during their
lifetime. Different evolution models can be defined to integrate species evolution and
individual learning. In Lamarckian evolution the experiences of a single individual directly
affect the genetic makeup of its offspring. Although current scientific evidence
overwhelmingly contradicts Lamarck’s model in biological systems, recent computer studies
have shown that Lamarckian processes can sometimes improve the effectiveness of genetic
algorithms. Although Lamarckian evolution is not an accepted model of biological evolution,
other mechanisms have been suggested by which individual learning can alter the course of
evolution. The Baldwin effect, which provides an indirect mechanism for individual learning
to positively impact the rate of evolutionary progress, is based on the following observations:
if a species is evolving in a changing environment, there will be evolutionary pressure to
favour individuals with the capability to learn during their lifetime; and those individuals who are
able to learn many traits will rely less strongly on their genetic code to hard-wire traits,
relying on individual learning to overcome the missing or not quite optimised traits in their
genetic code.
1.2.4.3 Fuzzy sets and systems
Fuzzy set theory (Jang et al., 1997) provides a systematic calculus to deal with
imprecise and incomplete information linguistically. Numerical computations can be
performed using the membership functions associated to linguistic labels. These linguistic
labels can be placed in the antecedent and consequent of if-then rules, resulting in what is
called fuzzy if-then rules. Moreover, a selection of fuzzy if-then rules forms the key
component of a fuzzy inference system that can effectively model human expertise in a
specific application.
A classical set is a set with a crisp boundary. Usually a classical set (Eq. 1.11) can be
defined using a membership function µ, which characterises the elements that belong to the
set over a given universe of discourse X.
µc : X → {0,1}    (1.11)
In contrast to a classical set, a fuzzy set, as the name implies, is a set without a crisp
boundary. That is, the transition from “belong to a set” to “not belong to a set” is gradual.
Usually, a fuzzy set (Eq. 1.12) is defined using a membership function µ, which characterises
the degree of membership to the set of all the elements from a given universe of discourse X.
Given this definition, a classical set is a special case of a fuzzy set where the elements have
full or zero degree of membership.
µf : X → [0,1]    (1.12)
A fuzzy set is uniquely specified by its membership function. In practice, when the
universe of discourse X is a continuous space, X is usually partitioned (Figure 1.10) into
several fuzzy sets whose membership functions cover X in a more or less uniform manner.
These fuzzy sets, which usually carry names that conform to adjectives appearing in daily
linguistic usage, are called linguistic values or linguistic labels.
90
HUMBERTO M ARTÍNEZ BARBERÁ
Figure 1.10: A fuzzy partition.
Union, intersection, and complement are the most basic operations on classical sets.
Corresponding to these ordinary operations, fuzzy sets have similar operations: union (Eq.
1.13), which intuitively corresponds to the smallest fuzzy set containing A and B, intersection
(Eq. 1.14), which intuitively corresponds to the largest fuzzy set contained in both A and B,
and complement (Eq. 1.15), which intuitively corresponds to the elements that do not belong
to the set A. The operations defined this way are usually referred as classical fuzzy operators.
C = A ∪ B:  µC(x) = max{µA(x), µB(x)}    (1.13)
C = A ∩ B:  µC(x) = min{µA(x), µB(x)}    (1.14)
Ā = X − A:  µĀ(x) = 1 − µA(x)    (1.15)
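Over a discrete universe, the classical fuzzy operators of Eqs. 1.13–1.15 reduce to pointwise max, min and complement; the temperature-like sets below are illustrative.

```python
# Classical fuzzy operators over a discrete universe. Fuzzy sets are
# represented as dictionaries mapping element -> membership degree.

def f_union(a, b):
    """Smallest fuzzy set containing A and B: pointwise max (Eq. 1.13)."""
    return {x: max(a.get(x, 0.0), b.get(x, 0.0)) for x in set(a) | set(b)}

def f_intersection(a, b):
    """Largest fuzzy set contained in both A and B: pointwise min (Eq. 1.14)."""
    return {x: min(a.get(x, 0.0), b.get(x, 0.0)) for x in set(a) | set(b)}

def f_complement(a):
    """Degree of not belonging to A: 1 - membership (Eq. 1.15)."""
    return {x: 1.0 - m for x, m in a.items()}

# Two illustrative fuzzy sets over a few temperature values.
cold = {0: 1.0, 10: 0.7, 20: 0.2, 30: 0.0}
warm = {0: 0.0, 10: 0.3, 20: 0.8, 30: 0.4}
print(f_union(cold, warm)[10], f_intersection(cold, warm)[10])   # 0.7 0.3
```

Other t-norms and t-conorms (product, bounded sum, and so on) would fit the same interface in place of min and max.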
Since most fuzzy sets in use have a universe of discourse X consisting of the real line R,
it would be impractical to list all the pairs defining a membership function. A more
convenient and concise way to define a membership function is a mathematical formula.
Typical membership functions (Figure 1.11) have triangular, trapezoidal, Gaussian, or bell
shapes.
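The standard shapes can be written as simple closed-form functions. The sketch below is illustrative only; the parameterisations and parameter names are common conventions, not definitions taken from the thesis:

```python
import math

# Common parameterisations of the standard membership function shapes.

def triangular(x, a, b, c):
    """Rises linearly from a to the peak b, falls linearly to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def trapezoidal(x, a, b, c, d):
    """Flat top between b and c, with linear shoulders."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

def gaussian(x, centre, sigma):
    return math.exp(-((x - centre) ** 2) / (2.0 * sigma ** 2))

def bell(x, a, b, c):
    """Generalised bell: 1 / (1 + |(x - c)/a|^(2b))."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))

print(triangular(5.0, 0.0, 5.0, 10.0))  # 1.0
print(gaussian(0.0, 0.0, 1.0))          # 1.0
```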
CHAPTER 1: INTRODUCTION
Figure 1.11: Standard membership function shapes.
The extension principle is a basic concept of fuzzy sets theory that provides a general
procedure for extending crisp domains of mathematical expressions to fuzzy domains. If f is a
function from X to Y, and A is a fuzzy set on X defined as (Eq. 1.16):
A = µA(x1)/x1 + … + µA(xn)/xn    (1.16)

Then the extension principle states that the image of the fuzzy set A under the mapping f
can be expressed as a fuzzy set B (Eq. 1.17):

B = f(A) = µB(y1)/y1 + … + µB(yn)/yn,  where yi = f(xi) and µB(y) = max{µA(x) : x = f⁻¹(y)}    (1.17)
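For a finite fuzzy set, the extension principle reduces to taking, for each image point, the maximum membership among its preimages. The following is an illustrative sketch; the mapping and membership values are arbitrary:

```python
# Extension principle (Eq. 1.17) on a finite fuzzy set represented as
# {element: membership}. Illustrative sketch only.

def extend(f, fuzzy_a):
    """Image of fuzzy set A under f: µB(y) = max over x with f(x) = y of µA(x)."""
    fuzzy_b = {}
    for x, mu in fuzzy_a.items():
        y = f(x)
        fuzzy_b[y] = max(mu, fuzzy_b.get(y, 0.0))
    return fuzzy_b

# A = 0.3/-1 + 1.0/0 + 0.6/1 mapped through f(x) = x²;
# -1 and 1 collapse onto y = 1, so µB(1) = max(0.3, 0.6) = 0.6.
a = {-1: 0.3, 0: 1.0, 1: 0.6}
print(extend(lambda x: x * x, a))  # {1: 0.6, 0: 1.0}
```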
Binary fuzzy relations (Eq. 1.18) are fuzzy sets in X × Y, which map each element of
X × Y to a membership grade between 0 and 1. These fuzzy relations are in fact 2D
membership functions.

R = {((x, y), µR(x, y)) | (x, y) ∈ X × Y}    (1.18)
A fuzzy if-then rule assumes the form (Eq. 1.19):

If x is A then y is B    (1.19)
where A and B are linguistic values defined by fuzzy sets on universes of discourse X
and Y. Fuzzy rules are usually also expressed as fuzzy implications (A → B). In essence, this
expression describes a fuzzy relation between two variables x and y, and thus it can be viewed
as a 2D membership function.
Fuzzy reasoning is an inference procedure that derives conclusions from a set of fuzzy
if-then rules and known facts. The basic rule of inference in traditional two-valued logic is
modus ponens, according to which the truth of a proposition B can be inferred from the truth
of A and the implication A → B. If these propositions make use of fuzzy sets of appropriate
universes, the procedure is called fuzzy reasoning or generalised modus ponens. Given a
certain rule and fact (Eq. 1.20), the fuzzy reasoning is as follows (Eq. 1.21):
Premise 1 (fact): x is A′
Premise 2 (rule): if x is A then y is B
Consequence (conclusion): y is B′    (1.20)

µB′(y) = max_x min{µA′(x), µR(x, y)}    (1.21)
where µR(x,y) is a fuzzy implication, which, for instance, can be the Mamdani fuzzy
implication (Eq. 1.22).
Rm = A × B = ∫_{X×Y} µA(x) ∧ µB(y) / (x, y),  where x ∧ y = min{x, y}    (1.22)
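Over discretised universes, Eqs. 1.20-1.22 amount to building the relation table with the min implication and then applying the max-min composition. The sketch below is illustrative; the grids and membership values are arbitrary:

```python
# Generalised modus ponens (Eqs. 1.20-1.21) over discretised universes, with
# the fuzzy relation µR built from the Mamdani (min) implication of Eq. 1.22.

def mamdani_relation(mu_a, mu_b):
    """µR(x, y) = min{µA(x), µB(y)} as a 2D table."""
    return [[min(a, b) for b in mu_b] for a in mu_a]

def generalised_modus_ponens(mu_a_prime, relation):
    """µB'(y) = max over x of min{µA'(x), µR(x, y)}."""
    n_y = len(relation[0])
    return [max(min(mu_a_prime[i], relation[i][j]) for i in range(len(relation)))
            for j in range(n_y)]

mu_a = [0.0, 1.0, 0.5]        # rule antecedent A on a 3-point X grid
mu_b = [0.2, 1.0]             # rule consequent B on a 2-point Y grid
mu_a_fact = [0.0, 0.8, 1.0]   # observed fact A'
rel = mamdani_relation(mu_a, mu_b)
print(generalised_modus_ponens(mu_a_fact, rel))  # [0.2, 0.8]
```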
For instance, given a series of rules with multiple antecedents (Eq. 1.23), the fuzzy
reasoning procedure is graphically as follows (Figure 1.12):
Premise 1 (fact): x is A′ and y is B′
Premise 2 (rule): if x is A1 and y is B1 then z is C1
Premise 3 (rule): if x is A2 and y is B2 then z is C2
Consequence (conclusion): z is C′    (1.23)
Figure 1.12: Graphic representation of fuzzy reasoning.
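The graphical procedure of Figure 1.12 can be mirrored in code for crisp inputs: each rule's firing strength is the minimum of its antecedent memberships, consequents are clipped at that strength, and the clipped sets are aggregated pointwise. An illustrative sketch, with arbitrary rule sets and grades:

```python
# Two-rule Mamdani inference with crisp inputs (treated as fuzzy singletons),
# following the scheme of Eq. 1.23 / Figure 1.12. Values are illustrative.

def clip(mu_c, level):
    """Clip a consequent fuzzy set at the rule's firing strength."""
    return [min(m, level) for m in mu_c]

def infer(x_grades, y_grades, consequents):
    """x_grades[i], y_grades[i]: antecedent memberships of rule i at the crisp
    inputs. Firing strength is the min over antecedents; outputs aggregate by
    pointwise max."""
    clipped = [clip(c, min(gx, gy))
               for gx, gy, c in zip(x_grades, y_grades, consequents)]
    return [max(vals) for vals in zip(*clipped)]

c1 = [1.0, 0.5, 0.0]   # consequent C1 on a 3-point z grid
c2 = [0.0, 0.5, 1.0]   # consequent C2
out = infer([0.7, 0.4], [0.9, 0.6], [c1, c2])
print(out)  # [0.7, 0.5, 0.4]
```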
A fuzzy inference system is a popular computing framework based on the concepts of
fuzzy set theory, fuzzy if-then rules, and fuzzy reasoning. It has found successful applications
in a wide variety of fields, such as automatic control, data classification, decision analysis,
expert systems, time series prediction, robotics, and pattern recognition. It is also known by
other names, such as fuzzy logic controller. The basic structure of a fuzzy inference system consists
of three conceptual components: a rule base, which contains a selection of fuzzy rules, a
database, which defines the membership functions used in the fuzzy rules, and a reasoning
mechanism, which performs the inference procedure upon the rules and given facts to derive a
reasonable output or conclusion. Usually, a fuzzy inference system can take either fuzzy
inputs or crisp inputs (which are viewed as fuzzy singletons), and the outputs are fuzzy sets.
Sometimes it is necessary to have crisp outputs instead, especially in situations where the fuzzy
system is used as a controller. In this case a defuzzification method is applied to obtain a crisp
value that best represents a fuzzy set. With crisp inputs and outputs, a fuzzy inference system
implements a nonlinear mapping from its input space to output space. This mapping is
accomplished by a number of fuzzy if-then rules, each one describing the local behaviour of
the mapping. In particular, the antecedent of a rule defines a fuzzy region in the input space,
while the consequent specifies the output in the fuzzy region.
Three types of fuzzy inference systems have been widely employed in various
applications. The differences between them lie in the consequents of their fuzzy rules, and
thus their aggregation and defuzzification procedures differ accordingly. The Mamdani model
uses standard fuzzy sets in both the antecedents and consequents of the rules, and utilises a
defuzzification procedure to obtain a crisp value from the output. There are many different
defuzzification methods, although the most widely adopted strategy is known as the centroid
of area (Eq. 1.24), which gives the centroid of the area of the fuzzy set as the result.
z_COA = ∫_Z µA(z) · z dz / ∫_Z µA(z) dz    (1.24)
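On a sampled output universe, the integrals of Eq. 1.24 become sums. An illustrative sketch, with an arbitrary grid and membership values:

```python
# Discrete approximation of the centroid-of-area defuzzifier (Eq. 1.24).

def centroid_of_area(z_values, mu_values):
    """z_COA = Σ µ(z)·z / Σ µ(z) over a sampled output universe."""
    num = sum(m * z for z, m in zip(z_values, mu_values))
    den = sum(mu_values)
    return num / den

z = [0.0, 1.0, 2.0, 3.0]
mu = [0.0, 0.5, 1.0, 0.5]
print(centroid_of_area(z, mu))  # 2.0
```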
The Sugeno model, also known as TSK, uses standard fuzzy sets in the antecedents of
the rules, and utilises a crisp function f(x1,…,xn) in the output. Usually this function is a
polynomial in the input variables x1,…,xn, but it can be any function as long as it can
approximately describe the output of the model within the fuzzy region specified by the
antecedent of the rule. Since each rule has a crisp output, the overall output is obtained via
weighted average, thus avoiding the time-consuming process of defuzzification required in a
Mamdani model. Unlike the Mamdani model, the Sugeno model cannot follow the
compositional rule of inference strictly in its fuzzy reasoning mechanism. This poses some
difficulties when the inputs to a Sugeno model are fuzzy. Without the time-consuming and
mathematically intractable defuzzification operation, the Sugeno fuzzy model is by far the
most popular candidate for sample-data-based fuzzy modelling.
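The weighted-average aggregation can be sketched as follows; the consequent coefficients are arbitrary, and a first-order, single-input consequent is assumed for illustration:

```python
# Sugeno (TSK) output: each rule i has firing strength w_i and a crisp
# consequent f_i(x); the overall output is the weighted average of the
# rule outputs, so no defuzzification step is needed.

def sugeno_output(weights, consequent_fns, x):
    outputs = [f(x) for f in consequent_fns]
    total = sum(weights)
    return sum(w * o for w, o in zip(weights, outputs)) / total

rules = [lambda x: 2.0 * x + 1.0,   # f1: illustrative linear consequent
         lambda x: 0.5 * x]         # f2
print(sugeno_output([0.75, 0.25], rules, 2.0))  # 4.0
```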
The Tsukamoto model uses standard fuzzy sets in the antecedents and consequents of
the rules, but those fuzzy sets are constrained to have monotonic membership functions in
the consequents. As a result, the inferred output of each rule is defined as a crisp value
induced by the rule’s firing strength. The overall output is taken as the weighted average of
each rule’s output. Since each rule infers a crisp output and the aggregation is by weighted
average, the Tsukamoto model avoids the time-consuming process of defuzzification.
However, this model is not used often since it is not as transparent as either the Mamdani or
Sugeno models.
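The Tsukamoto scheme can be sketched by inverting each monotonic consequent at the rule's firing strength; the linear consequents on [0, 1] below are an assumed example, not taken from the thesis:

```python
# Tsukamoto-style output: each consequent has a monotonic membership
# function, so a firing strength w maps to the unique crisp z with µ(z) = w.
# The overall output is the weighted average of the rule outputs.

def tsukamoto_output(weights, inverse_fns):
    zs = [inv(w) for w, inv in zip(weights, inverse_fns)]
    return sum(w * z for w, z in zip(weights, zs)) / sum(weights)

inverses = [lambda w: w,          # inverse of µ1(z) = z on [0, 1]
            lambda w: 1.0 - w]    # inverse of µ2(z) = 1 - z on [0, 1]
print(round(tsukamoto_output([0.6, 0.2], inverses), 2))  # 0.65
```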
As with neural networks, the representational power of fuzzy systems has been
characterised. From a control point of view, any continuous nonlinear function can be
approximated as exactly as needed with a finite set of fuzzy variables, values, and rules
(Kosko, 1992). What remains unanswered, however, is how many rules are needed and how to find
them. In many cases, relatively small and simple systems perform well in real-world
applications. The advantages of using fuzzy systems for control usually fall into the following
categories (Driankov et al., 1993):
• Expert knowledge representation. In many industrial processes there is a variety of
conventional control loops, but a human operator is still needed during certain phases of
the operation (starting/closing phase, parameterisation of controllers, or switching
between operational modes). The knowledge of the operator is usually based on
experience and usually does not lend itself to being expressed in differential equations. In
this case, fuzzy control offers a method for representing and implementing the expert’s
knowledge.
• Robust nonlinear control. In some cases there are good and exact models of systems
available. In such cases it is not difficult to realise classical controllers that work pretty
well for known system parameters within a narrow range. In the presence of major
disturbances, classical systems usually face a trade-off between fast reactions with
significant overshoot and smooth but slow reactions. In this case, fuzzy control offers
ways to implement simple but robust solutions that cover a wide range of system
parameters and that can cope with major disturbances.
• Reduction of development time. Fuzzy control offers languages at two different levels of
expertise: a symbolic level, which is appropriate for describing the application-related
expert knowledge, and a compiled level, which is well understood by the development
engineers. Since there is a well-defined, formal translation between those levels, a fuzzy-based
approach can help reduce communication problems. In typical real systems,
communication problems between application and development specialists can lead to very
long development times.
1.3 Contributions of this thesis
This thesis describes an experimental investigation into the issues of mobile robot task
specification, mobile robot behaviour development, environment interpretation, and robot
performance validation. To this end, a series of methods, algorithms, techniques and physical
robots have been integrated into effective working systems.
The novel contributions of this research consist of:
• The definition of BGA, a new control architecture, which is based on the concept of
multi-agent systems and relies on a blackboard architecture. From a cognitive point of
view, this architecture is used as a hybrid three-layer architecture.
• In the context of the BGA architecture, the definition and implementation of BGL, a new
robot programming language and its development environment, to speed up the
development time for control software.
• The implementation of a UDP-based KQML protocol for agent communication in BGA,
including a simple yet powerful content description language. One of its applications is
the extension of the BGA architecture to a web-based robotics application.
• The modification of the context-dependent blending technique for behaviour fusion and its
later integration in a learning environment to automatically obtain the blending rules.
• The development of a new clustering algorithm, which is applied to a sensor fusion
problem.
• The modification and improvement of different map-building and path-planning
algorithms to achieve autonomous control of a robot in unknown environments.
Chapter 2
Hardware Architecture
2.1 Introduction
During the preparation of this thesis, different mobile robotics platforms were built and
tested, from the very simple first MadCar to the final Quaky-Ant robot. In this chapter the
different models are presented and described, and their most important pros and cons are
analysed. The Quaky-Ant robot, which is one of the robots used to obtain the real-world results
of this thesis, is described in greater depth. The development of the different robots was
constrained by different factors, the most important being the physical dimensions of the
laboratory and adjacent corridors, and the cost. The physical constraint necessitated that the
robots were no larger than medium size (approximately 50 centimetres in diameter). The
budget limited the final robot cost to no more than $3,500.
In addition to the initial constraints, some requisites were imposed on the final robot,
namely:
• The platform was to be flexible enough to test the different algorithms and architectures
presented in this thesis (see following chapters).
• The platform was to be robust, easy to upgrade, and able to use different sensors (mainly
sonar and infrared sensors).
• The platform was to be capable of successfully carrying out complex tasks, operating in
both indoor environments and non-rough outdoor environments.
A brief survey of the medium size robotics market (see appendix A) shows that there
are clearly two commercial robot groups: the high-tech group and the do-it-yourself group.
While the latter does not meet the criteria specified above, the former provides ways of
enhancing and expanding the systems, and includes the necessary set of sensors, either in
basic configuration or as optional packages. The platforms that are close to fulfilling the
requirements are the Magellan and Pioneer-2 DX (see appendix A). They present some
important drawbacks:
• They are not easy to expand using custom sensor modules, mainly because their
interconnection buses and protocols are proprietary.
• They are intended to be used in conjunction with their own development environments,
and the only option for testing different systems completely is to build gateways, with the
corresponding performance penalty.
Although the present work could have been performed using off-the-shelf mobile
robots, it was decided to develop one from scratch, taking commercial robots as models when
possible and trying to improve them. Either way, leaving aside comparisons of performance
with commercial robots, the Quaky-Ant robot features a flexible and modular architecture
designed to support custom developments, namely algorithms for sensors (chapter 3),
navigation algorithms (chapter 4), and a control architecture (chapter 5). Finally, the robot has
been successfully used in real world applications (chapters 6 and 8).
2.2 MadCar micro-robot
2.2.1 Description
The robot named MadCar (Figure 2.1) was the first platform developed and built to
perform tests for this thesis (Villalba and Martínez Barberá, 1998). MadCar is based on an
AA-Kidsmate BMW Sport RC (radio-controlled for hobbyist use) 1:10 scale model car. The
purpose of this robot was to test some simple reactive algorithms in a low cost platform, using
sonar as the main sensory system. The whole system is based on a microcontroller that is in
charge of reading sensors, performing some processing, and actuating motors. The car has
two 12V DC motors, one connected to a mechanical differential for traction (capable of
driving the car forward or backward), and the other connected to an Ackerman style steering
system (to turn front wheels to the left or to the right). The main characteristics of the
platform are presented in Table 2.1, and the distribution of the different elements is shown in
Figure 2.2.
Dimensions
Base: 50 x 25 cm
Height: 18 cm
Weight: 2 Kg
Sensors
4 sonar sensors
1 shaft encoder
1 Hall effect compass
4 switches
Actuators
2 DC motors
Additional equipment
1 MC68HC11
1 RF modem (100 MHz)
10 NiCd 1.2V 1.8Ah batteries
1 NiCd 9V 0.8 Ah battery
Table 2.1: Main characteristics of MadCar.
The car has different sensors (Table 2.1 and Figure 2.2) and the most relevant
characteristics are listed below:
• The custom Hall effect compass is used for measuring direction and elevation. The
accuracy is not very good because of the mechanical design: it is based on an
oil-gimballed mechanical compass, with the magnetic ball having a tendency to get trapped in
the mountings.
• The four ultrasonic sonar sensors are used to measure the distance to walls and objects. A
custom board to control and multiplex the sonar sensor firing modules was designed.
This control board does not allow firing of more than one sensor simultaneously.
• The shaft encoder sensor mounted on the main axle is used for measuring car speed. The
precision is 8 pulses per revolution, and it was intended only as a rough estimate of the
car’s speed.
• The four switches are arranged in two bumpers (one located at the front and the other at
the rear) and they are used to detect collisions.
Figure 2.1: MadCar’s lateral view.
Figure 2.2: Hardware arrangement.
A HandyBoard (Martin, 1994), a Motorola 68HC11 microcontroller based board with
32 KB of static memory, controls the robot. It has a set of A/D-D/A input/output converters,
in which all the sensors and actuators are connected (Figure 2.3), and an integrated RS-232
serial port, to which a wireless modem is attached. The control software has been developed
using Interactive C (Martin, 1996) using a 68040 Macintosh as the host computer. The
modem, a custom half-duplex wireless 1200 baud modem, in conjunction with the RC
transmitter/receiver, is used to control the car manually and to transfer sensor data to the host
computer to perform data analysis. The resulting control software (see chapter 5 for the
control software architecture used) is then downloaded to the microcontroller for later
autonomous execution.
Figure 2.3: MadCar’s sensors and effectors.
2.2.2 Sonar controller
The sonar controller uses the Polaroid 6500 series Ultrasonic Sonar Range Finder
modules. These sensors operate using a single transducer for both sending and receiving the
ultrasonic pulses (at 49.4 kHz). The maximum intensity width of the ultrasonic beam is
typically between 15 and 20 degrees. The module can measure from 15 centimetres up to 10
metres with an accuracy of ± 1% over the entire range. The interface to these modules is
simple: there are some signals that need to follow a simple timing (Figure 2.4) to control the
firing of the sensor.
Figure 2.4: Polaroid 6500 timing diagram.
As the microcontroller board does not have enough free I/O ports to control the four
Polaroid 6500 modules, a sonar controller board was designed to fulfil the following
objectives:
• The board had to be capable of firing all the sensors (although not simultaneously).
• The number of I/O lines should be minimal.
The sonar control board (Figure 2.5) works as follows. The microcontroller selects the
sensor to be fired (Select0 and Select1 lines), and then sends the firing command (Init line).
The firing signals are generated and, as a result, the 6500 module emits an ultrasonic pulse.
When this pulse returns to the transducer, an echo signal is sent to the microcontroller (Echo
line) where an interrupt is generated.
Figure 2.5: Sonar controller schematics.
Figure 2.5 shows the control circuitry for one 6500 module. The Select0 and Select1
lines set the active module using a 74LS139 multiplexer. When an Init signal arrives from the
microcontroller (active in low level) it is passed on to a BC237 transistor in common emitter
configuration. This acts as an inverter and a current driver to generate the INIT signal in the
6500 module. In order to avoid detecting the emitted pulse as an echo (only one transducer
per 6500 module is used), the 6500 module generates an internal blanking signal for 2.38 ms.
This sets the minimum sensing distance to 0.4 m. To reduce this distance to 0.1 m a blanking
inhibit signal BINH is also needed. The Init signal from the microcontroller is driven through
an RC network that produces a delay of 0.6 ms, which is passed on to the 74LS139, and finally
to a 7404 inverter to obtain the BINH signal. Additionally, the BINH is inverted to obtain the
blanking BLNK signal. The ECHO signal from the 6500 module is driven to an open-collector
inverter gate to obtain the Echo signal in the microcontroller.
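Once the Echo interrupt arrives, the measured round-trip time maps to a distance through the speed of sound. The arithmetic can be sketched as follows; this is illustrative only, pin handling is abstracted away and the function names are ours:

```python
# Time-of-flight arithmetic for the firing sequence described above: the host
# selects a module, raises Init, and measures the time until the Echo
# interrupt. Speed of sound is taken as ~343 m/s at room temperature.

SPEED_OF_SOUND = 343.0  # m/s

def echo_time_to_distance(echo_delay_s):
    """The pulse travels to the obstacle and back, so halve the round trip."""
    return SPEED_OF_SOUND * echo_delay_s / 2.0

# A 5.83 ms round trip corresponds to roughly 1 metre.
print(round(echo_time_to_distance(0.00583), 2))  # 1.0
```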
2.2.3 Shaft encoder
To measure the speed of the car approximately, 8 magnets have been placed around the
perimeter of one of the rear wheels. A Hall effect based encoder generates a pulse, which is
driven to the microcontroller whenever a magnet is detected. The wheels have a diameter of 7
cm, giving an encoder accuracy of 2.74 cm. As the Hall effect sensor signal cannot be directly
attached to the microcontroller, because the signal needs some conditioning, an encoder board
was designed.
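The quoted accuracy follows directly from the wheel geometry: one pulse corresponds to one eighth of the 7 cm wheel's circumference. A sketch of the arithmetic (illustrative; the function names are ours):

```python
import math

# Odometry arithmetic for the shaft encoder described above: 8 magnets on a
# 7 cm diameter wheel give one pulse per eighth of the circumference.

WHEEL_DIAMETER_CM = 7.0
PULSES_PER_REV = 8

def distance_per_pulse_cm():
    return math.pi * WHEEL_DIAMETER_CM / PULSES_PER_REV

def speed_cm_s(pulse_count, interval_s):
    """Average speed over a sampling interval from the number of pulses seen."""
    return pulse_count * distance_per_pulse_cm() / interval_s

print(round(distance_per_pulse_cm(), 2))  # 2.75, matching the ~2.74 cm above
```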
The encoder circuitry (Figure 2.6) consists of three stages: a band-pass filter, an
amplifier, and a hysteresis comparer. The UGN3501 Hall effect sensor produces a voltage
that depends on the distance to the magnet. In our case there was some noise with a frequency
of 500 Hz. The band pass filter (R1C1-R2C2) eliminates noise below 5 Hz and above 100 Hz
(with 8 magnets this gives an upper bound of 2 m/s, far higher than the car’s working speed).
After the filter, the signal is driven to the A741 opamp based amplifier, which amplifies the
signal 10 times. Finally, the amplified signal is converted to a square wave signal by the
hysteresis comparer. This square wave signal is driven to a PN2221 transistor in common
emitter configuration to limit the output voltage to +5V in order to protect the digital I/O
circuitry of the HandyBoard.
Figure 2.6: Shaft encoder schematics.
2.2.4 Electronic compass
As an additional sensor system, MadCar has a custom electronic compass, which
measures the orientation of the car with respect to the Earth’s magnetic North pole. The
electronic compass, which has been developed and built, is composed of two subsystems:
• Mechanical subsystem: composed of an extremely low cost mechanical oil-gimballed
compass, intended for use in bicycles.
• Electronic subsystem: detects the orientation of the mechanical compass and sends the data
to the microcontroller.
The compass system works as follows: the mechanical compass contains a sphere that
aligns to the Earth’s magnetic North pole using a small magnet mounted on its surface. There
are two Hall effect sensors orthogonally disposed on the outside of the sphere mountings.
These sensors measure the deviation of the sphere with respect to a reference position, and so
indirectly measure the orientation with respect to the magnetic North pole. In this way the
system does not measure the Earth’s magnetic field directly but the mechanical compass
magnetic field. The interface circuitry works in a similar way to the shaft encoder, but uses a
pair of 634SS2 Hall effect sensors. Figure 2.7 shows the circuitry corresponding to only one
sensor. The output of the 634SS2 is differential, so the first step is to subtract the
corresponding outputs, eliminate noise and amplify the signal 10 times (using A741
operational amplifiers). The signal is then sent to a variable gain stage to adapt the range of
the signal (from full magnetic North to full South). Finally, the output is shifted so that a 0
Gauss magnetic field reads as 2.5 V.
Figure 2.7: Electronic compass schematics.
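Recovering a heading from the two orthogonal, offset-corrected channels amounts to taking the angle of the measured field vector. An illustrative sketch: the 2.5 V zero-field offset is from the text, but everything else, including treating the channels as ideal sine/cosine components, is an assumption:

```python
import math

# Heading recovery for two orthogonal Hall sensors whose outputs are shifted
# so that a 0 Gauss field reads 2.5 V. After removing the offset, the pair is
# treated as the (cos, sin) components of the heading. Illustrative only.

ZERO_FIELD_V = 2.5

def heading_degrees(v_x, v_y):
    """Angle of the field vector from the two offset-corrected channels."""
    angle = math.degrees(math.atan2(v_y - ZERO_FIELD_V, v_x - ZERO_FIELD_V))
    return angle % 360.0

print(heading_degrees(3.5, 2.5))  # 0.0   (full-scale X, zero Y)
print(heading_degrees(2.5, 3.5))  # 90.0
```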
2.2.5 Conclusions
Although much effort was put into trying to develop good electronic and software
systems, no modifications were made to the mechanics. This strongly conditioned the way the
robot performed its tasks (see chapter 5 for a performance discussion). The main lesson
learned was: “mechanics do matter!”. The most important mechanical drawbacks were:
• The steering system is an all-or-nothing turning system, which is a major problem in
accomplishing smooth, long-radius turns. Thus, the robot performed in a jerky
way when turning.
• The sphere of the mechanical compass, due to nonuniformities in the mountings, could
suddenly get trapped, producing the same output although the car was changing its
direction abruptly.
• The original car assembly was very lightweight. As a children’s toy, it was designed to
obtain a good top speed on smooth surfaces, and consequently it is equipped with a
low-reduction gearbox. As the added weight was 5 times the original, control at low speed
was very critical.
Despite the above facts, it nevertheless provided a very good robotics background, and
most of it would form the basis of the following robots.
2.3 La-Garra robot
2.3.1 Description
The robot named La-Garra (Figure 2.8) was developed just after the completion of
MadCar. It is based on a MiniMax’96 Robot Kit from Wirz Electronics. The purpose of this
robot was to continue the development of the software architecture while taking advantage of
the previously developed sensing system. The whole system is based on a microcontroller that
is in charge of reading sensors and actuating motors, and a personal computer without its
cover. The robot has two 12V DC motors attached to the differentially steered wheels (to turn
them independently forward or backward) and has the support of two castor wheels (one in
the front and one in the rear). This is a typical diamond configuration to overcome the
steering difficulties of MadCar, and thus La-Garra is a holonomic platform. The main
characteristics of the platform are presented in Table 2.2, and the distribution of the different
elements is shown in Figure 2.9.
Dimensions
Base: 40 x 40 cm
Height: 55 cm
Weight: 6 Kg
Sensors
4 sonar sensors
2 shaft encoders
4 switches
3 battery level sensors
1 B/W camera
Actuators
2 DC motors
1 robotics arm
Additional equipment
1 MC68HC11
1 i486DX/2 @ 66 MHz
1 10baseT ethernet
1 NiCd 12V 1.8 Ah battery
1 NiCd 12V 2 Ah battery
1 NiCd 9V 0.8 Ah battery
Table 2.2: Main characteristics of La-Garra.
The robot has different sensors (Table 2.2 and Figure 2.9). The most relevant
characteristics are listed below:
• The four ultrasonic sonar sensors are used to measure the distance to walls and objects.
The control circuitry is identical to that used in MadCar.
• The two optical shaft encoder sensors mounted on the axles of the steering wheels are
used for odometry. The precision is 28 pulses per revolution, which was intended only as
a rough estimate of the robot’s speed and position.
• The four switches are positioned in two bumpers (one located in the front and the other in
the rear) and they are used to detect collisions.
• The three proprioceptive battery level sensors are used to trigger an alarm when the output
voltage drops dangerously.
In addition, some extra elements were added to test whether they were feasible and worth
the development time:
• A low-resolution QuickCam camera with 256 grey-scale tones. The idea was for it to be
capable of identifying tennis balls and/or soft drink cans to perform collection and
cleaning tasks.
• A LynxArm mechanical robotics arm with 4 degrees of freedom, which serves as a pan-tilt
support for the camera.
Figure 2.8: La-Garra’s lateral view.
Figure 2.9: Hardware arrangement.
La-Garra, as with MadCar, has a HandyBoard, although now it only takes care of
interfacing the sensing system and the actuators. The control board is a standard i486DX2
personal computer without its covering case. It has 10 Mb of memory and a 500 Mb hard
disk, and runs the Linux operating system. It works as the master of the system, whereas the
HandyBoard runs as a slave. For data communication between the different subsystems, there
are two serial RS232C ports (486 to 68HC11, and 486 to LynxArm) and a Centronics parallel
port (486 to QuickCam). In addition, there is a wired ethernet link that is used just for
development and testing purposes. In the very beginning the control software was developed
in C, but there were some problems testing the software both on board and off board, and thus
it was decided to rewrite everything in Java (see chapter 5 regarding the new control software
architecture developed).
2.3.2 Conclusions
With this robot the suitability of a distributed processing architecture was tested and
simple tasks were tried. Major achievements were due to software, as there was no
significant difference in the final result compared to MadCar. The main lesson learned was:
“mechanics still do matter!”. The most important problems found were:
• The platform was still inadequate because of its lack of stability in tight turns.
• Odometry needs encoders with much higher precision.
• There is not enough processing power on board to perform control and image analysis.
2.4 Quaky-Ant robot
2.4.1 Description
The robot named Quaky-Ant (Figure 2.10) started as some “minor” modifications to
La-Garra in search of robustness. It soon became apparent that the only way to achieve better
performance was to spend some time on designing the mechanics of the platform and to
increase the number and type of sensors. Curiously, the platform bears a resemblance to a
Pioneer from ActiveMedia Robotics (see appendix A), but it is merely a resemblance. A
major step ahead with this robot was the incorporation of a distributed computing architecture
organised in two layers, a low-level layer based on small microcontrollers and a high-level
layer based on a microprocessor. In addition to this, the software and algorithms were
sufficiently evolved so that more complex tasks could be accomplished, and consequently the
robot needed to exhibit more features. The design goal was to develop a mobile robot that
could work autonomously and could be teleoperated in both indoor and outdoor
environments, while reusing most of the hardware and all the software.
The robot has two 12V DC motors attached to the differentially steered wheels (to turn
them independently forward or backward) and has the support of a small castor wheel
(located in the rear). This is a typical triangle configuration to achieve better equilibrium. In
addition, ground clearance is small to allow for a lower centre of gravity. The main
characteristics of the platform are presented in Table 2.3, and the distribution of the different
elements is shown in Figure 2.11.
Dimensions
Base: 50 x 40 cm
Height: 22 cm
Weight: 15 Kg
Sensors
10 sonar sensors
7 infrared sensors
2 shaft encoders
1 flux-gate compass
1 GPS receiver
1 RGB camera
Actuators
2 DC motors
Additional equipment
10 PIC16F877
1 i486DX/4 @ 133 MHz
1 10baseT ethernet
1 radio modem 900 MHz
1 video transmitter 2.4GHz
1 NiCd 12V 2 Ah battery
1 NiCd 12V 7 Ah battery
Table 2.3: Main characteristics of Quaky-Ant.
The robot has different sensors (Table 2.3 and Figure 2.11). The most relevant
characteristics are listed below:
• The ten ultrasonic sonar sensors are used to measure the distance to distant walls and
objects. The control circuitry is based on the one used in MadCar and La-Garra.
• The seven infrared sensors are used to measure the distance to near walls and objects.
• The two high-resolution shaft encoder sensors mounted on the axles of the driving wheels
are used for odometry.
• The flux-gate compass measures the heading of the robot with respect to the Earth’s
magnetic North pole. It is intended to compensate for odometry errors in indoor environments.
• The GPS receiver serves as an absolute positioning system. It is intended to compensate
for odometry errors in outdoor environments.
• The RGB camera is used to provide video feedback in remote-controlled robotics tasks.
Figure 2.10: Quaky-Ant’s lateral view.
Figure 2.11: Hardware arrangement.
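As a sketch of how the two wheel encoders support odometry, the standard differential-drive pose update can be written as follows. This is illustrative only; the wheel base value is an assumption, not a measured Quaky-Ant parameter:

```python
import math

# Differential-drive odometry: incremental wheel travel from the two encoders
# gives the pose update. Illustrative sketch; WHEEL_BASE_M is assumed.

WHEEL_BASE_M = 0.4  # assumed distance between the driving wheels

def update_pose(x, y, theta, d_left, d_right):
    """Advance the pose (x, y, heading in radians) by the encoder increments."""
    d_centre = (d_left + d_right) / 2.0
    d_theta = (d_right - d_left) / WHEEL_BASE_M
    x += d_centre * math.cos(theta + d_theta / 2.0)
    y += d_centre * math.sin(theta + d_theta / 2.0)
    return x, y, theta + d_theta

# Straight-line motion: both wheels advance 0.1 m, heading unchanged.
print(update_pose(0.0, 0.0, 0.0, 0.1, 0.1))  # (0.1, 0.0, 0.0)
```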
2.4.2 Modular architecture and interconnection
The core of the processing system is an i486DX4 based single board computer (SBC) at
133 MHz, with 32 Mb of memory and 24 Mb of solid state disk (DiskOnChip). It runs the
Java virtual machine under the Linux operating system, conveniently reduced to fit into 24
Mb. It was decided to modularise the system (Figure 2.12) so that the SBC is in charge of the
high-level control software, and leave the low-level control and data acquisition software in
different units. These modules interface with each other through a standardised bus for power
and inter-module communication. Each module contains its own microprocessor that
implements the inter-module communication protocol and performs the required low-level
signal processing functions for sensor and actuator control. Only a few units are directly
attached to the SBC (the GPS receiver and the RGB camera focus controller), because it was
not worth adapting them to the interconnection bus. This architecture presents the following
benefits:
• Expandability: new sensor or actuator modules can be added at any time without any
modification of the existing units.
• Flexibility: all modules use the same bus protocol, and thus system maintenance and
upgrading do not require much effort.
• Robustness: if one module fails, the others will continue working.
• Specialisation: each module contains only the hardware necessary to comply with its
mission, reducing the overall size and price of the system.
The inter-module communication bus is an I2C bus (Lekei, 1997) operating at 100 Kb/s.
I C is a bus design and communication protocol that allows multiple modules to be connected
to a common two-wire bus. One wire provides a high speed synchronous clock (SCL), while
the other provides a two-way data communication line (SDA). Target modules are
distinguished from one another by pre-appending each message on the data line with an
address header. Only the module that matches the address acts on that message. In addition to
the standard I2C lines, there are additional lines for +5 V power (VCC), ground (GND), reset
(RESET), and two global sensor-disabling lines (US-INT and IR-INT). These last two lines are
intended for power saving while in debug mode.
All the modules connected to the bus are based on the PIC16F877 microcontroller,
running at 20 MHz, and having 8 Kb of program memory and 196 bytes of data memory. The
I2C bus master is also attached to the SBC. The transition from 68HC11 to 16F877 enabled
the fast development of all the different modules, at the expense of a more complex
programming task (the use of a very reduced instruction set assembler instead of the C
language). The PIC16F877 was chosen, among other things, because it has built-in hardware
support for the I2C bus.
HUMBERTO MARTÍNEZ BARBERÁ
Figure 2.12: Quaky-Ant modular architecture.
The custom protocol that has been defined and implemented over the I2C bus uses a
single-master configuration, with two different high-level data packet formats: PWRT
for sending commands and PANS for requesting data. The protocol has been defined as
stateless, so that if any packet between any other two is lost, the microcontrollers do not enter
into a wait state. Should any slave hang for any reason, the whole system only loses the
functionalities given by that microcontroller, while the others continue working. This decision
has been taken for simplicity and robustness, mainly because communication deadlocks are
thus avoided. As I2C uses byte-oriented and master-initiated data transmission, PWRT and
PANS data packets are then converted to I2C logic sequences.
PWRT packets (Figure 2.13) are used to send commands from the master module to any
slave module. The data interchange process is as follows (Figure 2.15): the master initiates the
communication by sending the slave address (ADDR), then the code of the command to be executed
(CMD), the list of parameters needed for that command (PARAMi), and finally the end-of-packet
code (TRAIL).
ADDR | CMD | PARAM1 | … | PARAMn | TRAIL
Figure 2.13: PWRT packet format.
PANS packets (Figure 2.14) are used to receive data from any slave module at the
master module. The data interchange process is as follows (Figure 2.15): the master initiates
the communication by sending the slave address (ADDR), and then the slave sends the list of data
available (ANSi) and the end-of-packet code (TRAIL).
ADDR | ANS1 | ANS2 | … | ANSm | TRAIL
Figure 2.14: PANS packet format.
a) PWRT packet
b) PANS packet
Figure 2.15: PWRT and PANS packets data interchange.
Each slave module implements only a subset of the commands available, but
does not produce an error when an unimplemented command arrives. Usually, the data
that the slave modules return is the current reading of their own sensors. The amount of data
that each module type returns depends on the sensors it implements, and the control
application is responsible for knowing how to parse that data correctly. This protocol enables
the interoperation of many different subsystems with a low protocol overhead. The different
parameters and options are discussed in the following sections.
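For illustration, the packet framing above can be sketched in a few lines. The concrete byte values (the TRAIL code, command codes and addresses) are hypothetical, since only the packet structure is fixed by the description above.

```python
# Sketch of the PWRT/PANS framing described above. The concrete byte
# values (TRAIL, command codes, addresses) are hypothetical; only the
# packet structure follows the text.

TRAIL = 0xFF  # hypothetical end-of-packet code


def build_pwrt(addr, cmd, params):
    """PWRT: master -> slave command packet (ADDR CMD PARAM1..PARAMn TRAIL)."""
    return bytes([addr, cmd, *params, TRAIL])


def parse_pans(packet):
    """PANS: slave -> master answer packet (ADDR ANS1..ANSm TRAIL)."""
    if packet[-1] != TRAIL:
        raise ValueError("missing end-of-packet code")
    return packet[0], list(packet[1:-1])


# Example: command 0x10 with two parameters to a slave at address 0x22
pkt = build_pwrt(0x22, 0x10, [0x01, 0x02])
addr, answers = parse_pans(bytes([0x22, 0x64, 0x32, TRAIL]))
```

Because the protocol is stateless, the parser above can simply reject a malformed packet and carry on, mirroring the no-wait-state behaviour of the microcontrollers.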
2.4.3 External video and data communication
One of the objectives when building this robot was mobility while remaining connected
to a fixed station, for debugging tasks, distributed control (see chapter 5) and
teleoperation (see chapter 8). In the development and test scenario, the robot is connected
remotely to the Faculty of Computer Science network, where the remote-debugging host is
located. Additionally, there is also a video receiver connected to another host that broadcasts
images across the network. The mobility objective directly excludes the use of a tethered
solution and implies as well the need to send data (the system status), receive data (the desired
commands), and send video (visual feedback) in wireless mode (Cheah, 1999). The video was
not a problem because a custom modified video transmitter, working in the 2.4 GHz band,
was reused from another project. The data communication between the different high-level
systems is based on the UDP/IP protocols (see chapter 5), so this protocol was a must. The
first option tested was a wireless ethernet link at 2 Mb/s, also working in the 2.4 GHz band.
After some tests, the frequency assignment decided upon was the one that appears in Figure
2.16. The leftmost peak corresponds to the video signal and the rightmost peak corresponds to
the ethernet signal.
Figure 2.16: Wireless video and ethernet bandwidths.
The ethernet solution had a major problem: since it acts as a repeater to the wired
10/100 Mb/s fixed ethernet network, the wireless link was overloaded almost all the time due
to the amount of traffic. In short, it only worked at night (when the Faculty’s traffic decreases
noticeably). To overcome these difficulties it was decided to use a wireless modem at 115
Kb/s, working in the 900 MHz band (Cheah, 1999). A PPP channel is established over the
wireless link, to support the UDP/IP protocols. This final solution provided robustness and
high availability (the frequency does not interfere either with the video link or other wireless
links that exist in the area) at the expense of a reduced bandwidth that excludes the use of
complex real-time communication patterns. A possible improvement over these two
approaches (the wireless ethernet and the wireless modem) is the use of modern IEEE 802.11
wireless LAN cards (Welch, 1999) that also operate in the 2.4 GHz band and provide 11 Mb/s
transmission. It was not possible to test any of these cards during the work of this thesis,
although such a link will be incorporated in the future. Another possibility is the use of a
low-end computer with two network cards acting as an inexpensive router. This option was not
tested because the necessary equipment was not available at the time.
2.4.4 Sonar and Infrared module
Each sonar and infrared module can handle up to four sensors: two sonar sensors and
two infrared sensors. The module is designed so that a flexible firing pattern can be achieved,
and is based on the work of MadCar’s sonar controller. In fact, the previous experience with
MadCar’s sonar controller sped up the development of the sonar and infrared module. The
sensor used is still the Polaroid 6500 range finder module, which operates using a single
transducer for both sending and receiving the ultrasonic pulses (at 49.4 kHz). The maximum
intensity width of the ultrasonic beam is typically between 15 and 20 degrees. The module can
measure from 15 centimetres up to 10 metres with an accuracy of ± 1% over the entire range.
For infrared range measurement, this module uses the Sharp GP2D02 High Sensitive Distance
Measuring Sensor. These sensors operate using an infrared LED emitter and a PSD light
detector. The maximum intensity width of the light beam is typically about 10 centimetres.
The sensor can measure from 10 centimetres up to 120 centimetres, with accuracy and
resolution depending on the distance.
Figure 2.17: Sonar and infrared module schematics.
The board (Figure 2.17) has two I2C connectors, four address selection switches, and an
additional +12 V power connector. As the 6500 modules are powered using +6 V, a LM7806
voltage regulator obtains the required voltage from the +12 V supply. No sensor is
multiplexed, so each one has its own firing circuitry. The sonar INIT lines and infrared VIN
lines are globally enabled/disabled using the US-INT and IR-INT bus lines and the 7408 AND
gates. The sonar sensor firing timing (Figure 2.4) and circuitry is similar to that described in
MadCar’s sonar controller. The microcontroller uses INIT to fire the 6500 module, which is
driven through a BC237 transistor in common emitter configuration. This acts as an inverter
and a current driver to excite the 6500 module. In order to avoid detecting the emitted pulse as
an echo (only one transducer is used per 6500 module), the 6500 module generates an internal
blanking signal for 2.38 ms. This sets the minimum sensing distance to 0.4 m. To reduce this
distance to 0.1 m a blanking inhibit signal BINH is also needed. The microcontroller
introduces a delay of 0.9 ms between the INIT and BINH signals. The ECHO signal from the
6500 module is attached to an interrupt-driven digital input of the microcontroller, in open-collector configuration. The infrared firing is simple (Figure 2.18). The serial protocol to
interface the GP2D02 uses the VIN and VOUT lines. The microcontroller uses VIN to fire the
GP2D02. After 70 ms, when a valid measure is present, the infrared triggers VOUT, so that
the serial protocol interchange starts. Each bit is signalled by the microcontroller using VIN,
and the infrared returns bit values using VOUT. The microcontroller should wait 0.2 ms
between two contiguous VIN pulses. When finished, a valid 8 bit value has been received.
Figure 2.18: GP2D02 infrared sensor timing chart.
The board interprets three different commands, developed using the custom
PANS/PWRT protocol: two for setting firing masks for the sonar (US-PAT) and infrared
(IR-PAT) sensors, and one for actually firing them (START). Each mask specifies how many
sensors of its class are to be fired: none, one, or both. After a START, there
should be a minimum period of 100 ms before issuing a PANS packet in order to obtain valid
readings; otherwise, the board returns the last valid set of measures. The typical sensor cycle
(Figure 2.19) is as follows: the master sends a US-PAT command to set the sonar mask, then sends
an IR-PAT command to set the infrared mask, and last sends a START command to fire the
specified sensors simultaneously. After a time greater than 100 ms, the master can send a
PANS packet to receive data. It is important to notice here that, whatever sensor masks are set,
the board always sends the last valid readings corresponding to all the sensors.
Figure 2.19: Typical sonar and infrared protocol sequence.
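The typical cycle just described can be sketched as master-side pseudo-driver code. The US_PAT, IR_PAT and START byte codes and the stand-in bus object are illustrative only, since the actual byte assignments are not listed here.

```python
import time

# Illustrative master-side cycle for the sonar/infrared board described
# above. The command byte codes are invented for the sketch; the
# sequencing (set masks, fire, wait >= 100 ms, request readings)
# follows the text.

US_PAT, IR_PAT, START = 0x01, 0x02, 0x03  # hypothetical command codes


def sensor_cycle(bus, board_addr, sonar_mask, ir_mask, settle_s=0.1):
    bus.send_pwrt(board_addr, US_PAT, [sonar_mask])  # which sonars to fire
    bus.send_pwrt(board_addr, IR_PAT, [ir_mask])     # which infrareds to fire
    bus.send_pwrt(board_addr, START, [])             # fire them simultaneously
    time.sleep(settle_s)                             # >= 100 ms for valid readings
    return bus.request_pans(board_addr)              # last valid set of measures


class LoggingBus:
    """Stand-in bus used only to demonstrate the call sequence."""
    def __init__(self):
        self.log = []

    def send_pwrt(self, addr, cmd, params):
        self.log.append((addr, cmd, tuple(params)))

    def request_pans(self, addr):
        return [0, 0, 0, 0]  # dummy readings
```

Running `sensor_cycle(LoggingBus(), 0x20, 0b11, 0b01)` records the three-command sequence in the order mandated by the protocol.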
2.4.5 Motor control module
Each motor control module can handle one motor, one encoder and up to eight digital
inputs, intended to be used for bumpers or switches. The board (Figure 2.20) has one I2C
connector, one encoder connector, one motor connector, four address selection switches, and
an additional +12 V power connector. Each motor control board is attached to a MAXON
A-Max 32 12 V, 15 W motor and an HP HEDS-5540 three-channel, 500 pulses per revolution
optical encoder. The motor is geared with a 60:1 ratio so that 30,000 pulses are generated per
revolution. As the wheels have a diameter of 14.87 cm, in theory a high accuracy can be
obtained. The microcontroller interfaces the encoder using three interrupt-driven digital input
ports to process the two quadrature (A and B) and index (Z) signals. To interface the motor,
each board has an L298 dual H-bridge current driver, with both drivers in parallel to support up
to 4 A at 12 V. The microcontroller then sets the speed of the motor by varying the duty cycle of
a 4 kHz PWM signal. To change the direction of turn or to brake the motor, it uses two lines
(A1 and A2), and to disable the motor (free-running mode) it uses an enable signal (E).
Figure 2.20: Motor control module schematics.
There are two different commands that the board interprets: one to enable the motor
(ENA) and the other to change the speed and direction (CTRL). A CTRL command is valid
until a new CTRL command is received, independently of the ENA command. The pulse
counter is a 3 byte register to permit flexible encoder management. Whenever the master
issues a PANS, the microcontroller returns the current encoder value and the current bumper
status, and then resets the encoder counter.
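The figures above (500 pulses per revolution, 60:1 gearing, 14.87 cm wheel diameter) fix the linear resolution of the odometry; the following sketch simply applies that arithmetic.

```python
import math

# Encoder-to-distance conversion for the motor module described above:
# a 500 pulse/rev encoder behind a 60:1 gearbox yields 30,000 pulses
# per wheel revolution; the wheel diameter is 14.87 cm.

PULSES_PER_REV = 500 * 60       # 30,000 pulses per wheel revolution
WHEEL_DIAMETER_M = 0.1487


def pulses_to_metres(pulses):
    """Linear distance travelled for a given encoder pulse count."""
    circumference = math.pi * WHEEL_DIAMETER_M
    return pulses * circumference / PULSES_PER_REV
```

One full wheel turn corresponds to roughly 0.467 m, so a single pulse resolves about 0.016 mm of travel, which is why high odometric accuracy is achievable in theory.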
2.4.6 Compass module
The compass board (Figure 2.21) has two I2C connectors and four address selection
switches. It uses a Precision Navigation Vector 2XG mechanically gimballed 2-axis electronic
compass. It features an accuracy of 2° RMS with a resolution of 1°. The microcontroller polls
the Vector 2XG at its maximum rate (5 Hz). When reading data from the compass (Figure
2.22), the microcontroller starts by setting the polling signal (P/C) and then waits for
approximately 100 ms until the compass gets a valid measure and sets the end-of-capture line
(EOC). The microcontroller then sets the slave mode line (SS), so that the serial protocol
interchange starts. Each bit is signalled by the microcontroller using SCLK, and the compass
returns bit values using SDO. The microcontroller should wait 0.1 ms between two
contiguous SCLK pulses. When finished, two valid 16 bit values, corresponding to the X and
Y components of the heading, have been received.
Figure 2.21: Compass module schematics.
Figure 2.22: Vector 2XG timing chart.
As the compass board works in continuous polling mode, it does not need any special
configuration, and so the board does not implement any command. Whenever the master
issues a PANS, the microcontroller returns the last valid reading of the compass, consisting of
the X and Y values.
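Since the board returns only the raw X and Y heading components, the high-level software must convert them to an angle. A minimal sketch follows; treating the values as signed 16-bit integers and using atan2 is an assumption here, as the actual axis and sign conventions are fixed by the Vector 2XG datasheet.

```python
import math

# Convert the two 16-bit X/Y components returned by the compass board
# into a heading in degrees [0, 360). Interpreting the raw words as
# signed two's-complement values is an assumption of this sketch.


def to_signed16(value):
    """Reinterpret a raw 16-bit word as a signed integer."""
    return value - 0x10000 if value >= 0x8000 else value


def heading_degrees(x_raw, y_raw):
    x, y = to_signed16(x_raw), to_signed16(y_raw)
    return math.degrees(math.atan2(y, x)) % 360.0
```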
2.4.7 Bus master module
The bus master board (Figure 2.23) has one I2C connector, one RS-232 connector, one
20x2 LCD connector, and one +12 V power connector. The +12 V input is tied to a LM7805
voltage regulator to obtain the +5 V needed to power all the boards connected to the bus. The
microcontroller’s logic is in charge of accepting and sending serial data using the RS-232 port
through a MAX232 integrated circuit. The serial packet format is similar to that of the bus.
The RS-232 port is set at a speed of 38.4 Kbaud. In addition to the commands
supported by the different boards, the bus master also accepts commands to send or format
data to the LCD display.
Figure 2.23: Bus master module schematics.
2.4.8 Conclusions
The Quaky-Ant is a simple but powerful design that meets the operational
requirements set out and discussed at the beginning of the chapter. Measured in terms of cost,
performance and expandability, Quaky-Ant compares favourably with commercial robots. The
main characteristics of this robot can be summarised as:
• It features an open architecture. Two key points are the use of a standard I2C bus for the interconnection of different modules and the definition of a generic open protocol (PWRT/PANS packets).
• One consequence of the design of the architecture is its robustness (mainly due to the custom protocols) and its expandability (all the modules are addressed the same way irrespective of their nature).
• The architecture is not tied to any control architecture or development system, although one has been developed (see chapter 5).
• It operates smoothly and correctly in indoor environments and is also capable of operating in non-rough outdoor environments. Either way, it does not exhibit an outstanding top speed (approximately 0.45 m/s), because this was not a design concern. In contrast, it exhibits good manoeuvrability and precision.
Chapter 3
Sensor Fusion
3.1 Introduction
A mobile robot collects information about its environment through external sensors.
This is combined with the information provided by internal sensors and both of them are used
to perform navigation tasks. External sensors used in mobile robots are ultrasonic, infrared,
and laser sensors, vision cameras and others. Each of them provides readings with a degree of
uncertainty depending on the sensor characteristics. In many applications, matching
inaccurate information coming from various sensors results in a reduction in uncertainty of
global sensor information.
The Quaky-Ant robot (see chapter 2) has ultrasonic and infrared sensors. Sonar sensors
measure distances up to nine metres, but short distances are very inaccurate and they have
many problems with multiple reflections and wide beam width. Infrared sensors measure up
to eighty centimetres, with less uncertainty than sonar in obstacle avoidance tasks (due to
their narrower aperture). Both kinds of sensors are placed with similar orientation and location
around the robot, and thus the different redundant sensors can reduce the uncertainty of the information.
Many techniques have been studied in the process of integrating data from different
sensors (sensor fusion). These methods translate the different sensory inputs into reliable
estimates that can be used by other navigation subsystems. There have been several reviews
of sensor fusion (Luo, 1989), (Kam, 1997). Luo (Luo, 1989) shows the advantages gained
through the use of multi-sensory information as a combination of four aspects: redundancy,
complementarity, timeliness and cost of information. Kam (Kam, 1997) makes a review
describing integration techniques in two categories: low level fusion used for direct
integration of sensory data, and high level fusion used for indirect integration of sensory
information through control signals or commands suggested by different modules in the
control architecture.
Sensor fusion with known statistics relies on well-developed techniques such as Kalman
filtering or Bayesian theory. Their goal is to determine a parameter θ ∈ Rᵐ from a set of
observations z ∈ Rⁿ. They are both related through the non-linear function g(z, θ) = 0.
Extended Kalman filtering (Leonard, 1992) is the most popular tool proposed in the literature
for sensor fusion in mobile robot navigation.
CHAPTER 3: SENSOR FUSION
If there is no model of the uncertainty, other specific techniques such as rule based
sensor fusion (Flynn, 1988), fuzzy logic or neural networks (Chang, 1996) and others are
used. These algorithms do not require an explicit analytical model, thus resulting in more
flexible models. These models are obtained by taking into account experiments (inputs and
outputs) with no prior knowledge, or the knowledge of an expert operator through descriptors
or rules.
One of the goals of this thesis is for the Quaky-Ant robot to perform delivery tasks in an
indoor office-like environment, without prior knowledge of the floor plan, and without any
modification or special engineering of the environment. This objective, among other things,
dictates the necessity of having reliable sensors both to build maps and to navigate the robot.
Quaky-Ant has only low cost sensors that return a distance as output: ultrasonic sonars and
infrared sensors. These two kinds of sensors can be complementary (each one has its own
error sources) and a natural way to cope with their uncertainty is to perform signal-level
sensor fusion. Thus, the output of each pair of sensors targeting the same area is fused in what
is called a virtual sensor. As the robot builds its own map without an a priori world model,
fusion techniques that need a model of the environment in advance (like Kalman filters) are
not applicable. Supervised learning based techniques have been previously used for fusion
and are also used in this thesis, in particular a neural network and a fuzzy reasoning system.
One of the most crucial problems when using a supervised learning method for sensor
fusion is data acquisition. This acquisition is not trivial because quite a few representative
sensor measurements are needed: both the sensor reading and the real distance, for different
incidence angles, object surfaces, and distances. In this case, the real experimental set-up is
very difficult and very time consuming, and thus a different
approach is taken. There are good sensor models available in the literature, and thus the data
acquisition is performed using sensor simulation. The fusion method is trained with these
data, and then the fusion model is validated in real-world environments by an iterative process
(Figure 3.1). The fusion module will accept the distance measured by each sensor and
produce the fused distance, which will be used by other robotics-related tasks like map-building (chapter 4), robot control (chapter 5), or feature-level sensor fusion (chapter 6).
Figure 3.1: Fusion method training and validation.
3.2 Sensor models
3.2.1 Ultrasonic sonar range finders
Quaky-Ant uses the Polaroid 6500 series Ultrasonic Sonar Range Finder. These
sensors operate using a single transducer for both sending and receiving the ultrasonic pulses
(at 49.4 kHz). The maximum intensity width of the ultrasonic beam is typically about 15 to 20
degrees. The module can measure from 15 centimetres up to 10 metres with an accuracy of
±1% over the entire range. There are three fundamental kinds of error:
• Loss of echo signal. When the angle of incidence between the sonar and a given object is wide enough, the pulses do not go back to the receiver and the sensor does not detect the object.
• Uncertainty due to angular aperture. Given the typical sonar beam, the further away the object is, the more uncertainty the measure has. The uncertainty zone of the sonar is a cone with its vertex located on the sensor. The distance to the object is known, but its position is not.
• Double rebounds. Sometimes, when the angle of incidence between the sonar and the object is large, the reflected pulse can reach another object and then go back to the sensor. In this case the sensor detects the object but produces a wrong measurement (longer than the real one).
There are some sonar models available in the literature (Barshan, 1990), (Leonard, 1992),
based on the physical model of an ultrasonic pulse reaching planes and corners. These
methods have a local scope, because they decompose the environment into discrete elements
(segments, corners, cylinders, etc.) and then model the sonar behaviour with each element.
The major drawbacks of these methods are that a given environment must first be
decomposed into the different elements, and that the methods fail to model double rebounds, which
are very significant in a cluttered environment. To cope with these problems there are
techniques (Gallardo, 1999) based on the well-known ray-tracing method from computer
graphics (Watt, 1992) and the sonar models described below.
The method of sonar tracing for a given sonar is as follows (Gallardo, 1999):
• Let θ be the orientation of the sonar. First compute the sensibility level ∆θd (Eq. 3.1). Then send δ rays from θ − ∆θd to θ + ∆θd. Recursively follow each of the rays until they reach the sensor. Return the distance of the shortest path.
• To compute the path of an emitted ray rout, first find the intersection point p with the environment, and the reflected direction rref. Then test if rref reaches the sensor with amplitude greater than a0. In this case the distance d is calculated (Eq. 3.2) based on the path followed by the ray (p1 … pn).
• If the amplitude is not greater than a0, then compute the next reflection, in the rref direction. This way rref becomes rout. Then follow the same method described above.
The method to compute the amplitude of a ray is as follows:
• Let rin be the ray from the transducer to the intersection point p.
• Compute the angle ∆κ between rin and rref.
• Compute the angle ∆η between rin and θ.
• Compute the sensibility a (Eq. 3.3).
∆θd = θ0 · √(−ln(a0) / 2)    (3.1)

d = Σi=1…n pi    (3.2)

a = e^(−2 · (∆θ²/θ0² + ∆η²/θ0² + ∆κ²/κ0²))    (3.3)
This method produces quite good results because it can model the three sonar error
sources. Nevertheless, it does not take into account the main lobe effect in not completely
specular environments, which is the case in real world environments. Thus, when the
minimum distance ray is far from the centre of the lobe, the resulting time-of-flight is quite
conservative and produces distances shorter than the real ones. To produce more realistic
distances the previous method has been modified to take into account the angular distance
from the centre of the lobe. The new addition is as follows:
• Let tofi be the rays that reach the sensor, and tofmin the one with the minimum distance.
• Let sε be the set of rays such that |tofi − tofmin| < ε.
• Compute the corrected minimum distance tofout as a weighted average of the rays in sε (Eq. 3.4):

wgti = 1 − (∆ηi − ∆ηmin)²

tofout = Σi∈sε (wgti · tofi) / Σi∈sε wgti    (3.4)
In this case, ε has been empirically set to 10 centimetres, which produces fair results.
The original method gives results with more uncertainty in regular environments, while the
modified one produces more consistent measures (Figure 3.2).
Figure 3.2: Sonar-tracing original method (left) and modified method (right).
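The amplitude computation (Eq. 3.3) and the lobe correction (Eq. 3.4) can be sketched as follows. Ray generation and environment intersection are omitted, and interpreting ∆ηmin as the smallest angular offset within the selected ray set is an assumption, since the thesis does not spell that detail out.

```python
import math

# Sketch of the sonar-tracing amplitude (Eq. 3.3) and the weighted
# time-of-flight correction (Eq. 3.4). Interpreting delta_eta_min as
# the smallest angular offset within the selected ray set is an
# assumption of this sketch.

EPS = 0.10  # metres; the empirically chosen epsilon quoted in the text


def sensibility(d_theta, d_eta, d_kappa, theta0, kappa0):
    """Ray amplitude of Eq. 3.3."""
    return math.exp(-2.0 * (d_theta ** 2 / theta0 ** 2
                            + d_eta ** 2 / theta0 ** 2
                            + d_kappa ** 2 / kappa0 ** 2))


def corrected_tof(rays, eps=EPS):
    """Eq. 3.4: rays is a list of (tof, delta_eta) pairs."""
    tof_min = min(tof for tof, _ in rays)
    selected = [(tof, eta) for tof, eta in rays if abs(tof - tof_min) < eps]
    eta_min = min(eta for _, eta in selected)
    weights = [1.0 - (eta - eta_min) ** 2 for _, eta in selected]
    return sum(w * tof for w, (tof, _) in zip(weights, selected)) / sum(weights)
```

Rays far from the lobe centre receive a lower weight, so the corrected time of flight drifts towards the readings near the axis, as the modified method intends.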
3.2.2 Infrared sensors
Quaky-Ant’s uses the Sharp GP2D02 High Sensitive Distance Measuring Sensor. These
sensors operate using an infrared LED emitter and a PSD light detector. The maximum
intensity width of the light beam is typically about 10 centimetres. The module can measure
from 10 centimetres up to 120 centimetres with accuracy and resolution depending on the
distance. There are three fundamental kind of errors:
• Object colour/reflectivity. As light reflectivity varies with colour, the darker the object, the further away it appears. The accuracy degrades quickly with the distance. The object can even remain undetected.
• Global illuminance. As regular illumination contains light in the infrared band, the brighter the light, the further away the object appears.
• Quantisation error. This sensor converts the measure to 8 bits. This conversion is not linear over the full range, and thus, the further away the object, the less accurate is the reading.
Figure 3.3: Sharp GP2D02 distance output (left) and sensing range (right).
As there are not many infrared models in the literature, a simple model based on the
manufacturer’s datasheet has been developed. The angle of incidence is not as important as in
sonars, and thus it is simply ignored. The output of the infrared sensor is not linear with the
distance (Figure 3.3). Due to this non-linearity, the sensor response is modelled using the
following equations (Eq. 3.5), decomposing the response into a piece-wise linear curve.
y = −0.25x + 95.00     x ∈ [60, 120]
y = −1.14x + 148.57    x ∈ [25, 60]
y = −5.33x + 253.33    x ∈ [10, 25]    (3.5)
where y is the output of the sensor and x is the real distance to the wall. The output of
the sensor is an 8 bit value. This model is a reasonable approximation for objects with a
reflectivity between 18% and 90%, and the range of maximum accuracy is between 10 and 80 centimetres.
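The piece-wise model of Eq. 3.5 can be sketched directly, together with its inverse (useful to recover a distance estimate from an 8-bit reading). Returning None outside the sensing range is an implementation choice of the sketch, not part of the model.

```python
# Piece-wise linear GP2D02 model of Eq. 3.5: real distance x (cm) to
# 8-bit sensor output y. Readings outside [10, 120] cm are out of the
# sensing range and are reported as None (an implementation choice).

SEGMENTS = [  # (x_low, x_high, slope, intercept), from Eq. 3.5
    (10.0, 25.0, -5.33, 253.33),
    (25.0, 60.0, -1.14, 148.57),
    (60.0, 120.0, -0.25, 95.00),
]


def gp2d02_output(x_cm):
    """Forward model: real distance in cm -> sensor output."""
    for lo, hi, slope, intercept in SEGMENTS:
        if lo <= x_cm <= hi:
            return slope * x_cm + intercept
    return None


def gp2d02_distance(y):
    """Invert the model: sensor output -> distance estimate in cm."""
    for lo, hi, slope, intercept in SEGMENTS:
        x = (y - intercept) / slope
        if lo <= x <= hi:
            return x
    return None
```

The three segments meet almost continuously at their break points, so inverting segment by segment yields an unambiguous distance for in-range readings.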
3.3 Sensor fusion using neural networks
As stated above, neural networks (see chapter 1) are to be used for solving the fusion
problem. Many authors have used neural networks as sensor fusion models with sonar sensors
(Chang, 1996) or acoustic sensors (Pearson et al., 1995). Others use them as interpreters
(Thrun, 1998a), (Santos, 1994) where information coming from different sonar sensors is
merged to produce a probability value of space occupancy. There are many antecedents in
sensor fusion with sensors of different natures as well. Buchberger (Buchberger, 1993)
compiles information coming from a laser radar and some sonar sensors, while Courtney (Courtney,
1994) uses readings from ultrasonic, vision and infrared sensors to localise a mobile robot. In
this case, a standard multilayer perceptron is used. The following subsection describes the
training process as well as presenting some results.
HUMBERTO MARTÍNEZ BARBERÁ
3.3.1 Experiments and results
To evaluate the neural network fusion model, it is compared against the simple but
effective Flynn's rules model (Flynn, 1988). This fusion method is as follows:
• If the sonar reading is greater than the maximum range for the infrared sensor, then ignore the infrared sensor.
• If the sonar reading is at its maximum value, then the real distance is greater.
• Whenever the infrared sensor detects a change from non-detection to detection, and the associated sonar reading is less than 10 feet, then a valid depth discontinuity has been detected.
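A minimal reading-level sketch of these three rules might look as follows. The constants `IR_MAX_RANGE` and `SONAR_MAX`, and the detection-flag interface, are assumptions for illustration, not values from Flynn's paper.

```python
IR_MAX_RANGE = 0.8   # assumed maximum infrared range, metres
SONAR_MAX = 10.0     # assumed sonar saturation value, metres

def flynn_fuse(sonar_m, ir_detects, prev_ir_detects):
    """Apply Flynn's three rules to one (sonar, infrared) reading pair.
    Returns a distance estimate and a depth-discontinuity flag."""
    if sonar_m > IR_MAX_RANGE:
        ir_detects = False           # rule 1: ignore the infrared
    if sonar_m >= SONAR_MAX:
        return SONAR_MAX, False      # rule 2: the real distance is greater
    # rule 3: non-detection -> detection transition with a close sonar (10 ft)
    discontinuity = ir_detects and not prev_ir_detects and sonar_m < 3.05
    return sonar_m, discontinuity
```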
In this way, the original sonar boundary is redrawn to take into account features that are
found by the infrared sensor. Although this model works well in simulators and in some real
environments, it presents some problems:
• The rules do not exploit the fact that modern infrared sensors have a greater accuracy (like those of Quaky-Ant), and the robot should rely on them to get accurate short distance measures. A better fusion method should give more importance to the infrareds.
• It does not take into account the physical characteristics of each pair of sensors: the sensor manufacturing process is not really uniform and there are some calibration differences between sensors (Gallardo, 1999).
Two different scenarios have been used for both collecting sensor data and for neural
network validation. The first one is heavily influenced by the basement of the Computer
Science Faculty (where our lab is located). Its overall dimensions are 17 by 22 metres, with
corridors 2 metres wide, and doorways of 80 centimetres (Figure 3.4). The other one is
based on the floor plan of the Trinity College Robot Fire Fighting Contest1. Its overall
dimensions are 3.5 by 3.5 metres, with corridors and doorways of 65 centimetres (Figure 3.5).
1 http://www.trincoll.edu/~robot
Figure 3.4: Basement environment.
Figure 3.5: Trinity environment.
The data acquisition procedure consists of storing triplets of data (sonar, infrared, and
real measures) in different positions along two different, manually defined paths in both
scenarios (Figure 3.6). This data set is divided into two sets: the training-set, which is directly
used for neural network training (75% of the data set), and the test-set (25% of the data set),
which is used for neural network validation. It can be observed that the data set is not regular
and contains many contradictory values, as happens in the real world. This fact reinforces the
idea that learning with simulated data can be helpful in a real world scenario.
Figure 3.6: Training data set.
The neural networks are defined by the following parameters:
• The topology, defined by the number of input, hidden and output nodes.
• The learning rate, which governs the influence of each example in learning.
• The momentum, which indicates the strength of the vector pointing to the highest error variation direction.
• The number of epochs, which defines the duration of the training procedure.
In order to select the appropriate values for the network parameters, an iterative
procedure is followed (Figure 3.7). Instead of training a single network, different networks
with different parameter values are trained and the best network (the one with the minimum
test-set error) is returned as the output of the learning procedure. To speed up the execution of
the neural network training procedure, four hosts (in a variety of different platforms) have
been involved, each one running a backpropagation algorithm, where the only difference
between them is the computational power of the machine they are running on.
Figure 3.7: Neural network training process.
The number of input and output nodes is fixed: two input nodes (sonar and infrared
sensors) and one output node (virtual sensor). The momentum and the number of epochs have
also been fixed. The momentum is not important for a coarse-grained search, and the number
of epochs, if set high enough, is only an arbitrary measure. Thus, for each learning
experiment, a series of hidden nodes and learning rates have been tried: hidden nodes
numbers are in the set {5, 10, 15, ... , 80}, and learning rate values are in the set {0.01, 0.06,
0.11, 0.16}. A total of 16 x 4 = 64 learning experiments have been run. By applying the
automated training procedure to the acquired data set, the best test-set error is obtained
(Figure 3.8) with the following parameters: 10 hidden nodes and a learning rate of 0.04363 in
20,000 epochs.
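The grid search itself can be sketched as below. Here `train_and_test` is a hypothetical stand-in for a full backpropagation run (in the thesis each run executed on one of four hosts); it returns a mock, deterministic error surface instead of training a real network.

```python
import itertools
import random

def train_and_test(hidden, lr):
    """Stand-in for one backpropagation training session, returning a mock
    test-set error. Hypothetical error surface with its minimum near
    10 hidden nodes and a learning rate near 0.05."""
    rng = random.Random(hash((hidden, lr)) % 2**32)  # deterministic noise
    return abs(hidden - 10) * 0.01 + abs(lr - 0.05) + rng.random() * 0.001

hidden_grid = range(5, 85, 5)              # {5, 10, 15, ..., 80}
lr_grid = [0.01, 0.06, 0.11, 0.16]
trials = list(itertools.product(hidden_grid, lr_grid))   # 16 x 4 = 64 runs
# keep the network with the minimum test-set error
best_params = min(trials, key=lambda p: train_and_test(*p))
```

With a real `train_and_test`, `best_params` would be the (hidden nodes, learning rate) pair returned by the automated procedure of Figure 3.7.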
Figure 3.8: Error evolution with different training parameters.
The output of the neural network approximates and smoothes the data set (Figure 3.9).
One point to notice is that the same neural network is used for each pair of sensors, with all of
them exhibiting the same physical characteristics. If data is to be collected from the real world, it
is necessary to obtain different data sets, and then produce a series of neural networks, one for
each pair of sensors. Gallardo (Gallardo, 1999) reported that there are noticeable differences
in a series of Polaroid 6500 sonar sensors.
Figure 3.9: Neural network fusion surface.
After the training process has finished, the robot is run in both scenarios along different
predefined paths, taking measurements from both sensors. These measurements are fused
and compared to the real distance to calculate the error of the fused distance. The purpose of
this step is to test the generalisation capabilities of the neural network, and to quantify how
well it performs. A Euclidean metric is used to compute the absolute error of all the sensors,
measured in metres (Eq. 3.6).
err = ∑i=1..n |virtui − reali|    (3.6)
where virtui is the fused distance and reali is the real distance to the wall. The results of
the tests are summarised in tables 3.1 and 3.2. They represent the average error (Avg), the
variance (Var), and the root mean square error (RMSE) using the previously defined error
metric. It can be noted that the neural network fusion performs better in all cases except in
the trinity scenario, in which the average error is higher. This is due to the fact that the walls
are always so close that most of the time the infrared suffices to approximate the real measure,
whereas using Flynn's rules, the sonar is only used for long distances.
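The three summary statistics can be computed from the fused and real readings roughly as follows; this is a sketch, and the exact normalisation used for the thesis tables may differ.

```python
import math

def error_stats(fused, real):
    """Per-reading absolute errors in the spirit of Eq. 3.6, summarised as
    the average error (Avg), variance (Var) and RMSE used in the tables."""
    errs = [abs(v - r) for v, r in zip(fused, real)]
    n = len(errs)
    avg = sum(errs) / n
    var = sum((e - avg) ** 2 for e in errs) / n
    rmse = math.sqrt(sum(e * e for e in errs) / n)
    return avg, var, rmse
```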
Method    Avg      Var        RMSE
ANN       0.5499   9.5839     0.6638
IR        0.4278   8.6730     0.5194
SONAR     3.7716   192.4129   4.0186
FLYNN     0.4841   17.0875    0.6778
Table 3.1: Errors in the trinity scenario.
Method    Avg      Var        RMSE
ANN       2.8014   153.7345   3.3804
IR        4.4054   256.6320   5.0515
SONAR     4.7821   123.8674   5.0812
FLYNN     3.2924   143.5941   3.7761
Table 3.2: Errors in the basement scenario.
3.4 Sensor fusion using fuzzy rules
In addition to simple neural networks, this thesis approaches the fusion problem by way
of fuzzy rule systems (see chapter 1). In this case, the rules are obtained using a hybrid
learning mechanism. The process is composed of three techniques, each one carrying out one
of the different phases of fuzzy modelling. First, a technique to find the number of rules that
compose the system (system structure) is needed. A self-organising clustering method called
mountain clustering is used in order to find the number of different behaviours in the data set.
The output of this method is the number of initial centroids and it represents the number of
rules in the system. These centroids are the input to another clustering method whose goal is
to tune these centroids in order to obtain improved ones, which constitute the initial
parameters of the rules. The K-means, the fuzzy K-means or the fuzzy tabu clustering methods
will perform this step. An adaptive neural network called ANFIS carries out the tuning of the
rules as the final step in this schema. In the next subsections the different algorithms that are
part of the learning process are described.
3.4.1 Some background on clustering methods
Clustering is an important technique used in searching for structures in data. Given a
finite set of data, p, the problem of clustering in p is to find a collection of cluster centres,
which can properly characterise relevant classes of p. In classical cluster analysis, these
classes are required to form a partition of p, such that the degree of association is considered
strong for data lying in the blocks of the partition and weak for data in different blocks (Duda
et al., 1973). If this requirement is relaxed, a crisp partition on p is replaced with a weaker
requirement of a fuzzy partition on p, and therefore it falls in the area of fuzzy clustering.
The fuzzy clustering problem may be stated as follows: given mp patterns in Rn, allocate
each pattern to a cluster with a certain degree of membership such that an objective function
is minimised. This function has three parameters: the patterns, the clusters and the
memberships. Generally, this function is a sum of squared distances between each pattern and
each centroid of the clusters multiplied by the membership to that cluster. Several distances
may be used, like: Hamming, Euclidean, Mahalanobis, maximum, Minkowski, etc (Yuan et
al., 1995).
The fuzzy clustering problem can be mathematically described as follows (Eq. 3.7):
Minimise J(p, c, u)    (3.7)

Subject to:
∑i=1..mp uij = 1, j ∈ [1, ..., nc]
0 ≤ uij ≤ 1
pi ∈ Rn, i ∈ [1, ..., mp]
cj ∈ Rn, j ∈ [1, ..., nc]

where:
• J is the objective function (given)
• mp is the number of patterns (given)
• p is the pattern vector (given)
• nc is the number of clusters (given)
• c is the centroids vector (to be found)
• u is the memberships matrix (to be found)
and the memberships matrix is defined as follows:
• uij is the degree of membership of pattern pi to the cluster cj
The modified mountain clustering is an improvement of the mountain clustering method
proposed in (Yager and Filev, 1994). It features a computing time which is proportional to the
number of data points instead of their dimension. At the beginning, all the data in the training
set are candidates for clusters. The second step is to calculate a density function called
mountain for each candidate. In the third step, the point with maximum density is chosen as
the next cluster. After this, the effect of the last cluster found must be eliminated, so a
Gaussian component, centred on this cluster and proportional to its height, is subtracted from
the density function. If the density of the last cluster is higher than a predetermined
constant, the process continues at the second step.
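A compact sketch of this density-peaking loop is shown below, under assumed values for the Gaussian width (`sigma`) and the stopping constant (`stop_ratio`); neither value comes from the thesis.

```python
import math

def mountain_clusters(points, sigma=1.0, stop_ratio=0.5):
    """Sketch of the modified mountain method: every data point is a
    candidate centre, the densest candidate becomes the next cluster, and
    a Gaussian component centred on it (proportional to its height) is
    subtracted before looking for the next peak."""
    def d2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    # step 2: mountain (density) function for every candidate
    dens = [sum(math.exp(-d2(p, q) / (2 * sigma ** 2)) for q in points)
            for p in points]
    centres, first_peak = [], None
    while True:
        k = max(range(len(points)), key=lambda i: dens[i])  # step 3: peak
        peak = dens[k]
        if first_peak is None:
            first_peak = peak
        elif peak < stop_ratio * first_peak:   # assumed stopping constant
            break
        centres.append(points[k])
        # eliminate the effect of the cluster just found
        for i, p in enumerate(points):
            dens[i] -= peak * math.exp(-d2(p, points[k]) / (2 * sigma ** 2))
    return centres
```

On two well-separated blobs of points, the loop returns one centre per blob before the residual density falls below the threshold.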
The k-means clustering divides the data X={x1,...,xn} into a known number of groups
{C1,...,Ck}, and finds the point (centre) cj of each group that minimises a cost function J. The
groups are defined by a k x n matrix U, where ui,j = 1 ⇔ xj ∈ Ci. If every cj is fixed, then the ui,j
values minimising J are (Eq. 3.8):
ui,j = 1  if ||xj − ci||² ≤ ||xj − ck||² ∀k ≠ i
ui,j = 0  otherwise    (3.8)
On the other hand, if every ui,j is fixed, the cj that minimise J are the centres of the areas
of each group (Eq. 3.9):

cj = (1 / |Cj|) ∑ xk, xk ∈ Cj    (3.9)
Therefore the clustering process is as follows:
1. Initialise the centres cj randomly.
2. Obtain the U matrix.
3. Modify the centres cj.
4. If J does not reach a tolerance level, go to step 2.
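These four steps translate almost directly into code; the sketch below assumes Euclidean distance and a fixed iteration cap, and stops early once the centres no longer move.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means following the four steps above (Eqs. 3.8 and 3.9)."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)        # step 1: random initialisation
    for _ in range(iters):
        # step 2 / Eq. 3.8: assign each point to its nearest centre
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda i: sum((a - b) ** 2
                                      for a, b in zip(p, centres[i])))
            groups[j].append(p)
        # step 3 / Eq. 3.9: move each centre to the mean of its group
        new = [tuple(sum(c) / len(g) for c in zip(*g)) if g else centres[i]
               for i, g in enumerate(groups)]
        if new == centres:                 # step 4: converged
            break
        centres = new
    return centres
```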
The fuzzy k-means clustering is an improvement on the last method, in which the data
belong to each cluster in a fuzzy way. The U matrix is then not binary and the ui,j are values in the
range [0..1].
One technique of fuzzy clustering is the fuzzy c-means clustering method, which is based
on fuzzy c-partitions, where c designates the number of fuzzy classes in the partition. The fuzzy
c-means clustering algorithm was initially developed by Dunn and later generalised by
Bezdek (Bezdek, 1981). The fuzzy c-means algorithm is an iterative procedure for
approximately minimising the objective function J by using a Picard iteration. A
characteristic of this procedure is that it converges q-linearly to either a local minimum or a
saddle point of J.
3.4.2 Fuzzy tabu clustering
In order to treat the possible existence of multiple local minima, a new algorithm has
been developed based on a tabu search technique while preserving the fuzzy c-means
algorithm ideas. The tabu search technique is different from the well-known hill-climbing
local search techniques in the sense that it does not become trapped in local optimal solutions,
as is the case of the classical fuzzy c-means (Al-Sultan, 1995). The algorithm has been
implemented and tested on various problems, and the results are very encouraging.
3.4.2.1 The tabu search
The tabu search is a heuristic that can be used to solve combinatorial optimisation
problems. It allows moves out of a current solution that make the objective function worse,
in the hope that it will eventually reach a better solution (Al-Sultan, 1995).
The tabu search requires the following basic elements to be defined:
• Configuration is a solution or an assignment of values to variables.
• A move characterises the process of generating a feasible solution to the combinatorial problem that is related to the current solution (i.e. a move is a procedure by which a new trial solution is generated from the current one).
• Set of candidate moves is the set of all possible moves out of a current configuration. If this set is too large, one could operate with a subset of this set.
• Tabu restrictions: these are certain conditions imposed on moves that make some of them forbidden. These forbidden moves are known as tabu. To take these tabu moves into account, a list of a given size records these forbidden moves. It is called the tabu list.
• Aspiration criteria: these are rules that override tabu restrictions, i.e. if a certain move is forbidden by a tabu restriction, then the aspiration criterion, when satisfied, can make this move allowable.
Given the above basic elements, the tabu search scheme can be described as follows:
start with a certain (current) configuration, evaluate the criterion function for that
configuration. Then, follow a certain set of candidate moves. If the best of these moves is not
tabu or if the best is tabu, but satisfies the aspiration criterion, then pick that move and
consider it to be the next current configuration; otherwise, pick the best move that is not tabu
and consider it to be the new current configuration. Repeat the procedure for a certain number
of iterations. On termination, the best solution obtained so far is the solution obtained by the
algorithm. The tabu list has a certain size, and when the length of the tabu list reaches that size
and a new move enters the list, then the first move on the tabu list is freed from being tabu and
the process continues (i.e. the tabu list is circular).
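The scheme can be sketched as a generic loop; the function names and the FIFO implementation of the circular tabu list are our own choices for illustration.

```python
from collections import deque

def tabu_search(start, neighbours, cost, n_iter=100, tabu_size=10):
    """Generic tabu search skeleton as described above: always move to the
    best candidate that is not tabu, with a best-so-far aspiration
    criterion and a circular (FIFO) tabu list."""
    current, best = start, start
    tabu = deque(maxlen=tabu_size)          # circular tabu list
    for _ in range(n_iter):
        cands = sorted(neighbours(current), key=cost)
        for c in cands:
            if c not in tabu or cost(c) < cost(best):   # aspiration criterion
                current = c
                break
        else:
            continue                        # all candidates tabu: no move
        tabu.append(current)
        if cost(current) < cost(best):
            best = current
    return best
```

On a simple one-dimensional cost such as (x − 7)² with moves x ± 1, the loop descends to the optimum and keeps it as the best-so-far while continuing to explore.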
3.4.2.2 The fuzzy tabu clustering algorithm
In this section a new tabu search-based algorithm for fuzzy clustering is presented
(Delgado et al., 1997). However, before the algorithm is stated, some notation is needed. The
objective function (Eq. 3.10) is similar to that of the fuzzy c-means algorithm (Liu et al.,
1995). The Euclidean distance is always used in the calculations, although a different metric
could be used.
The aim of this method is to divide the data P={p1,...,pm} into a known number of
groups C={C1,...,Cn}, and find the point (centre) cj of each group that minimises a cost
function J. The groups are defined by an m x n matrix U, where ui,j ∈ [0..1] is the degree of
membership of pi to cluster Cj. The cost function J is shown in (Eq. 3.10), where the
parameter m is the fuzziness of the clustering.
J = ∑i=1..mp ∑j=1..nc ||pi − cj||² · uij^m    (3.10)
The centroid of each cluster is calculated as follows (Eq. 3.11):

cj = ( ∑i=1..mp pi · uij^m ) / ( ∑i=1..mp uij^m )    (3.11)
The membership of a pattern to a cluster is calculated as follows (Eq. 3.12):

uij = (1 / ||pi − cj||²)^(1/(m−1)) / ∑k=1..nc (1 / ||pi − ck||²)^(1/(m−1))    (3.12)
To simplify the explanations, let a configuration be a tuple composed of a centroids
vector and the corresponding degree of membership matrix, and it will be denoted as Ak. Each
tentative solution will be a specific configuration, calculated from equations (Eq. 3.11) or (Eq.
3.12) depending on the method used to generate these tentatives. These methods are discussed
later on. In addition, the following parameter names are used:
• MTLS: maximum tabu list size.
• P: probability threshold.
• NTS: number of trial solutions.
• ITMAX: maximum number of iterations.
• TLL: tabu list length.
The algorithm can be described, by means of four simple steps, as follows:
(s1) Initialisation step. Let A0 be an arbitrary solution and J0 be the corresponding
objective function value computed using equation (Eq. 3.10). Let Ab = A0 and Jb = J0.
Select values for the following parameters: MTLS, P, NTS, and ITMAX. Let k = 1, let
TLL = 0, and go to step s2.
(s2) Generation step. Using Ab generate NTS trial solutions Ak^1, Ak^2, ..., Ak^NTS, evaluate
their corresponding objective function values Jk^1, Jk^2, ..., Jk^NTS, and go to step s3.
(s3) Selection step. Order Jk^1, Jk^2, ..., Jk^NTS in ascending order and denote them by
Jk^[1], Jk^[2], ..., Jk^[NTS]. If Ak^[1] is not tabu, or if it is tabu but Jk^[1] < Jb, then let Ac = Ak^[1] and
Jc = Jk^[1], and go to step s4; otherwise let Ac = Ak^[L] and Jc = Jk^[L], where Jk^[L] is the best
objective function value of Jk^[1], Jk^[2], ..., Jk^[NTS] that is not tabu, and go to step s4. If all
of Jk^[1], Jk^[2], ..., Jk^[NTS] are tabu, go to step s2.
(s4) Modification step. Insert Ac at the bottom of the tabu list and let TLL = TLL + 1 (if
TLL = MTLS + 1, delete the first element in the tabu list and let TLL = TLL − 1). If Jb >
Jc, let Ab = Ac and Jb = Jc. If k > ITMAX, stop (Ab is the best solution found and Jb is
the corresponding best objective function value); otherwise, let k = k + 1 and go to step
s2.
To generate the trial solutions, two methods have been tested:
(t1) Move centroids. In this case the idea is to move the centroids over the space and
then to calculate the new memberships matrix, using equation (Eq. 3.12). For each
centroid, if a random number is greater than P, the move is performed. This move can
be done in two ways:
(m1) Complete space. For each coordinate of the centroid compute a random
number and, if greater than P, move this coordinate across the complete
solution space, delimited by the patterns.
(m2) Decreasing radius. In the same way as the method described above,
compute a random number and, if greater than P, move the coordinate
inside an n-dimensional sphere centred on the centroid. At the first iteration, the
radius is set to include the complete space and is then reduced linearly at each
iteration.
(t2) Shake memberships. In this method the idea is to swap the degrees of membership
of patterns to clusters and then to calculate the new centroids, using equation (Eq. 3.11).
For each pattern, a shake is performed if a random number is greater than P.
3.4.2.3 Experiments and results
Several experimental results are presented to demonstrate that with the tabu search-based fuzzy clustering algorithm it is much easier to find global or near-global optimum
solutions than when using the traditional fuzzy clustering algorithms. To show the influence
of the different parameters of the algorithm, the SIMPLE data set (Figure 3.10), which
consists of four differentiated and spaced clusters, and the well-known Anderson's IRIS data
set, which consists of three unclearly differentiated clusters, are used. Each test is
identified by:
• A set of values for the algorithm parameters (as described in the previous section).
• Repeat: as stated in the previous section, the first step of the algorithm generates a random initial cluster configuration, and thus each run will depend on this configuration. To reduce the effect of this randomness, each test is run several times.
• Move: the kind of movement generation method used in each test.
Figure 3.10: The SIMPLE dataset.
The probability threshold (P) is a key parameter in obtaining good objective function
values. The lower the value of P, the more global the search performed is. The higher the
value of P, the more local the search performed is. There is a trade-off between local and
global search. As is shown below (Figure 3.11), the range [0.55 ... 0.65] has been found to be
the best. The initial parameters are shown in (Table 3.3).
MTLS = 350, ITMAX = 500, move = t1-m1, repeat = 10, NTS = 15, m = 1.5, mp = 52, nc = 4
Table 3.3: Initial parameters for the probability threshold experiment.
The following curve (Figure 3.11) represents the average of the objective function for
each run with different probability thresholds.
Figure 3.11: Values of the objective function versus the probability threshold (SIMPLE dataset).
The next parameter that will heavily condition the success of the algorithm is the
maximum number of iterations (ITMAX). Moreover, the performance of algorithms tends to
be measured in terms of number of iterations. The experiment will consist of a series of runs
of the SIMPLE data set with the parameters listed in (Table 3.4), varying ITMAX from 100 to
3000, and a series of runs of the IRIS data set with the parameters listed in (Table 3.5),
varying ITMAX from 100 to 4000. As can be seen, the performance grows with the maximum
number of iterations, as might be expected.
MTLS = 70% of ITMAX, P = 0.6, move = t1-m1, repeat = 10, NTS = 15, m = 1.9, mp = 52, nc = 4
Table 3.4: Initial parameters for the iterations experiment (SIMPLE dataset).
MTLS = 2500, P = 0.85, move = t1-m1, repeat = 10, NTS = 15, m = 1.5, mp = 150, nc = 3
Table 3.5: Initial parameters for the IRIS experiment.
The objective function values and percentages of well-classified elements for the
SIMPLE data set are shown in (Figure 3.12). The upper curve is the average objective
function, and the lower one is the percentage of well-classified elements. As can be seen, the
percentage of good classifications approaches 100% at approximately 1500 iterations.
Figure 3.12: Values of the objective function and percentage of well classified elements versus iterations (SIMPLE dataset).
The percentages of well-classified patterns for the IRIS data set are shown in (Figure
3.13). The curve is the average of the percentage of well-classified patterns. All values below
the discontinuous line are acceptable classifications, with less than 15 per cent classification
error (Bezdek et al., 1992).
Figure 3.13: Percentage of well classified elements versus iterations (IRIS dataset).
The objective function values for the IRIS data set are shown in (Figure 3.14). The
upper curve is the average objective function and the lower is the average deviation from the
average objective function. All values below the discontinuous line are acceptable
classifications (with less than 15 per cent classification error).
Figure 3.14: Values of objective function versus iterations (IRIS dataset).
Another important parameter is the fuzziness (m). The lower the value of m, the crisper
the partitions obtained. The higher the value of m, the fuzzier the partitions obtained. There is
a trade-off between fully crisp and fully fuzzy clustering. As can be seen (Figure 3.15), the best
results are obtained with m in the range [1.5 ... 1.9]. The experiment consists of a series
of runs of the IRIS data set with the parameters listed in (Table 3.6), with m varying from 1.1
to 2.5.
MTLS = 850, P = 0.65, move = t1-m1, repeat = 10, NTS = 15, ITMAX = 1000, mp = 150, nc = 3
Table 3.6: Initial parameters for the fuzziness experiment (IRIS dataset).
Figure 3.15: Percentage of well classified elements versus fuzziness (IRIS dataset).
The movement generation methods have been tested with different data sets. The
experiment consists of a series of runs of the IRIS data set with the parameters listed in
(Table 3.7), and the methods: shake memberships (S), move centroids constant radius (MC), and
move centroids linear radius (ML).
MTLS = 70% of ITMAX, P = 0.6, m = 1.9, repeat = 10, NTS = 15, ITMAX = 500, ..., 3000, mp = 150, nc = 3
Table 3.7: Initial parameters for the movement experiment (IRIS dataset).
The results of the movement generation test (Figure 3.16) show that the best method is
ML. The method MC presents a high degree of randomness (due to its global nature). The
method S yields a poorer rate of correct classifications in the long run, but with few
iterations (approximately 1000) it gets the best results. Thus, this method should
be used as the initialisation step of the algorithm in order to get a good first trial and reduce
the number of iterations significantly.
Figure 3.16: Percentage of well classified elements versus movement method (IRIS dataset).
It should be noted that although the number of iterations is greater than in the case of
the fuzzy c-means algorithm, the kind of search performed by the proposed algorithm needs
them if it is not to become trapped in a local minimum. The algorithm has been applied to
different examples (Figure 3.14) (Figure 3.17) showing comparable results to the well-known
fuzzy clustering algorithms available in the literature (Bezdek et al., 1992) (Yuan et al., 1995)
(Liu and Xie, 1995) (Pal et al., 1993).
Figure 3.17: Percentage of well classified elements and objective function values versus iterations (IRIS dataset).
3.4.3 Adaptive network based fuzzy inference system (ANFIS)
The ANFIS adaptive network (Jang et al., 1997) obtains a model consisting of fuzzy
rules. It simulates the TSK reasoning mechanism (Kam, 1997) (Kurz, 1996), and it benefits
from its adaptive capabilities to tune the rule parameters, which may have been estimated
using some previous clustering algorithm. The output of clustering algorithms is a set of
centroids {c1,...ct} describing the training set points with similar features. These clusters
become the antecedent parameters of the fuzzy rules, which in TSK form are like (Eq. 3.13):
Ri ≡ if (x1 ∈ A1i) ∧ ... ∧ (xn ∈ Ani) then yi = p0i + p1i x1 + ... + pni xn    (3.13)
where x1,...,xn are crisp values. Each consequent is a linear function with crisp inputs
x1,...,xn, crisp output yi, and crisp parameters p0i,...,pni (it can be observed that if p1i,...,pni are 0,
then the consequent is a crisp constant). Using approximate reasoning theory and the
centre of area (CoA) mechanism, the inferred output for a TSK model with t rules is obtained
using (Eq. 3.14):
y* = ( ∑i=1..t τi (p0i + p1i x1 + ... + pni xn) ) / ( ∑i=1..t τi )    (3.14)
where the firing strength of each rule is (Eq. 3.15):

τi = ∏j=1..n Aji(xj)    (3.15)
The ANFIS network architecture is composed of 4 layers (Figure 3.18). Nodes of the
same layer perform similar tasks. The nodes of layers 1 and 4 are adaptive nodes, and their
values are the parameters of the antecedents and consequents, respectively.
Figure 3.18: ANFIS network structure.
Learning is achieved by combining the backpropagation algorithm and the least
squares method. In each epoch the method runs a forward and a backward step. In the forward
step, for each input vector, the net is evaluated up to layer 4, and the parameters of the
consequents are estimated using least squares. Next, errors are calculated from each pair
of network and desired outputs. Last, in the backward step, the errors are propagated and the
parameters of the antecedents are modified (backpropagation).
3.4.4 Experiments and results
The experimental setup for generating fuzzy rules is quite similar to that of the neural
network training (see previous section). Two different scenarios have been used both for
collecting sensor data and for validation. The first one is the trinity scenario,
whose overall dimensions are 3.5 by 3.5 metres, with corridors and doorways of 65
centimetres (Figure 3.19). The other one resembles a corridor with different rooms. Its overall
dimensions are 5.2 by 4.9 metres, with corridors 1.2 metres wide, and doorways of 80
centimetres (Figure 3.20). The set of examples (Figure 3.6) and the error metric (Eq. 3.6) are
the same as in the neural network training.
Figure 3.19: Trinity learning environment.
Figure 3.20: Corridor learning environment.
To generate the fuzzy rules, different combinations of clustering methods have been
tried, but only the three combinations that obtained the best results are described. They use
mountain clustering to find the number of clusters (Table 3.8), a clustering method (tabu, k-means or fuzzy k-means), and an ANFIS network (Table 3.11) to generate rules from the
clusters. The output of the ANFIS network is a series of TSK rules, which use bell fuzzy sets
in the antecedents and the product t-norm. The specific clustering methods of the three
combinations are shown below:
• Experiment ANFIS-3: uses the fuzzy k-means clustering (Table 3.9).
• Experiment ANFIS-5: uses the k-means clustering.
• Experiment ANFIS-8: uses the fuzzy tabu clustering (Table 3.10).
Parameter   Meaning                           Value
α           Density function height           1
β           Last cluster effect elimination   1.5
γ           Tolerance threshold               0.001
Table 3.8: Mountain clustering parameters.
Parameter   Meaning               Value
f           Fuzziness             2
δ           Tolerance threshold   0.001
Table 3.9: Fuzzy k-means clustering parameters.
Parameter   Meaning                        Value
m           Fuzziness                      1.5
MTLS        Maximum tabu list size         300
P           Probability threshold          0.65
NTS         Number of trial solutions      5
ITMAX       Maximum number of iterations   1000
TLL         Tolerance threshold            0.001
Table 3.10: Fuzzy tabu clustering parameters.
Parameter   Meaning                              Value
λ           Forgetting factor                    1
κ           Initial step size                    0.2
n           Decreases before augmenting          2
m           Oscillations before decreasing       4
p           Step increase ratio                  0.1
q           Step decrease ratio                  0.1
epochs      Iterations of the learning process   10000
Table 3.11: ANFIS network parameters.
After the training process has finished, the robot is run in both scenarios along the paths
used to extract the examples (Figure 3.19) (Figure 3.20). The purpose of this process is to test
the learning capabilities of the different fuzzy rule sets. The results are summarised below
(Table 3.12) (Table 3.13), including the results of the neural network and the raw infrared and
sonar sensors for comparison purposes. They represent the average error (Avg), the variance
(Var) and the root mean square error (RMSE) using the previously defined error metric.
Method    Avg      Var       RMSE
ANFIS-3   0.2112   7.7404    0.3493
ANFIS-5   0.1807   1.1832    0.2617
ANFIS-8   0.1743   5.7561    0.2966
ANN       0.2052   5.0959    0.3050
FLYNN     0.1781   6.3026    0.3078
IR        0.2498   2.6540    0.3779
SONAR     0.5906   20.8355   0.9901
Table 3.12: Errors in the trinity learning scenario.
Method    Avg      Var        RMSE
ANFIS-3   0.5954   65.7508    1.0059
ANFIS-5   0.5901   64.3527    0.9958
ANFIS-8   0.5599   65.3739    0.9834
ANN       0.6024   64.8455    1.0056
FLYNN     0.6091   71.5587    1.0424
IR        1.0938   57.7467    1.3318
SONAR     1.8561   342.3615   2.6208
Table 3.13: Errors in the corridor learning scenario.
The results have been plotted (Figure 3.21) (Figure 3.22) in order from lowest to
highest average error. It can be noted that the combination ANFIS-8 produces the best results,
both in Avg and RMSE. The fuzzy rules obtain better results (compared with the simple
Flynn's rules) in the corridor scenario because it is similar to a real world setup, whereas the trinity
scenario has corridors which are too narrow for the robot size.
Figure 3.21: Trinity learning results.
Figure 3.22: Corridor learning results.
Finally, the robot is run in both scenarios along different paths (Figures 3.23 and
3.24). The purpose of this process is to test the generalisation capabilities of the different
fuzzy rule sets. The results are summarised below (Tables 3.14 and 3.15), including the
results of the neural network and the raw infrared and sonar sensors for comparison purposes.
They represent the average error (Avg), the variance (Var) and the root mean square error
(RMSE) using the previously defined error metric.
Figure 3.23: Trinity validation environment.
Figure 3.24: Corridor validation environment.

Method    Avg      Var       RMSE
ANFIS-3   0.3018   11.9367   0.4588
ANFIS-5   0.2657   11.9246   0.3590
ANFIS-8   0.2846   11.0117   0.4372
ANN       0.3130   11.1385   0.4575
FLYNN     0.2750   14.2645   0.4672
IR        0.3547   4.9095    0.5240
SONAR     1.2684   47.5786   1.7466
Table 3.14: Errors in the trinity validation scenario.
HUMBERTO MARTÍNEZ BARBERÁ
Method    Avg      Var        RMSE
ANFIS-3   0.6920   55.9554    1.0190
ANFIS-5   0.6905   55.5552    1.0160
ANFIS-8   0.6282   59.3456    0.9940
ANN       0.6762   56.8823    1.0130
FLYNN     0.6829   66.1706    1.0621
IR        1.1076   60.4763    1.3534
SONAR     2.0324   328.0456   2.7223
Table 3.15: Errors in the corridor validation scenario.
The results have also been plotted (Figures 3.25 and 3.26), ordered from lowest to
highest average error. It can be noted that the combination ANFIS-8 produces the best result
in the corridor scenario, but in the trinity scenario ANFIS-5 and Flynn's rules obtain
slightly better results. This is because the trinity scenario can be navigated using only
infrareds and the validation path follows a very complicated trajectory. In any case, the best
overall results are obtained with the ANFIS-8 fuzzy rules, which have been obtained using the
fuzzy tabu clustering method described above.
Figure 3.25: Trinity validation results.
Figure 3.26: Corridor validation results.
3.5 Conclusions
Two different methods have been developed for solving the sensor fusion problem in a
priori unknown environments. In this problem, model based techniques that make use of a
priori information are not applicable, and thus, neural networks and fuzzy rule based systems
are used instead. These fusion techniques require some examples for the learning stage.
Obtaining such examples from real sensors is an extremely complex problem. The approach
followed is to use examples obtained using a simulator. For this reason, models of sonar and
infrared sensors have been evaluated and integrated in a simulation environment, which is
also used for sensor fusion validation. The sonar model is based on the sonar-tracing
technique which has been modified to take into account the sonar lobe effects, and the
infrared model has been obtained by linearising experimental values.
Once the neural network and fuzzy rules fusion methods have been trained, they are
tested on different simulated environments, ranging from small to large. The
fusion methods are then compared with raw sensor data and with a simple and heuristic rule
based system. The results show that in most cases the proposed fusion methods perform better
than the raw sensors and simple rules. In addition, the fuzzy rule system obtained using fuzzy
tabu clustering has the best overall performance of all the different fusion methods tested.
The fuzzy tabu clustering is an algorithm based on the fuzzy c-means algorithm, but
with the introduction of a global minimum search by means of a tabu search technique. The
algorithm has also been successfully applied to different examples, showing results comparable
with the well-known fuzzy clustering algorithms available in the literature.
Chapter 4
Robot Navigation
4.1 Introduction
For a mobile robot, the ability to navigate is one of the most important of all. Staying
operational, for instance avoiding dangerous situations such as collisions, comes first. But if
any tasks related to specific places in the robot's environment are to be performed,
navigation is a must. Navigation can be defined as the combination of three fundamental
competences (Figure 4.1):
• Map-building. This is the process involved in constructing a map from sensor readings
taken at different robot locations. The correct treatment of sensory information and the
reliable localisation of the robot are fundamental points in the map-building process.
• Localisation. This is the process involved in obtaining or knowing the actual robot's pose
or location from sensor readings and the current map. An accurate map and reliable
sensors are key points in achieving good localisation.
• Path-planning. This is the process involved in generating a feasible and safe trajectory
based on the current map from the current robot location to a goal position. In this case it
is very important to have an accurate map and a reliable localisation.
As can be noted, these competences are tightly coupled. If one of them fails, the robot
performance is greatly degraded or reduced. A typical example is when the sensory system
does not detect a door and the path-planner cannot trace a path through the corresponding
wall.
Often, one or more of those competences may not be present in a given robot, usually
when it is working in known and structured environments and when the robot is provided
with a priori information or special sensor systems. One of the objectives of this thesis is to
work on unknown environments, and thus, the robot is provided with the three competences.
The following subsections describe some basic techniques to develop and implement these
competences.
The Quaky-Ant platform has been provided with algorithms and methods to address the
navigation problem. Initially, the system relied on the assumption of reliable odometry for
short time and distance operations. In line with this assumption, it incorporated a map
building method and a path planner. The chosen algorithms proved to be adequate in terms of
performance and robustness. Then, to overcome that demanding assumption, the platform
incorporated a localisation method based on another map building method. The final
combination of methods proved to perform quite well. The following sections show these
methods used in the Quaky-Ant platform.
Figure 4.1: Navigational tasks.
4.1.1 Map representation methods
The most natural representation of the robot’s environment is a map (Dudek and Jenkin,
2000). In addition to representing places in an environment, a map may include other
information, including reflectance properties of objects, regions that are unsafe or difficult to
traverse, or information from prior experiences. In general, map representation methods can be
divided into two main groups: those that rely primarily on an underlying metric representation
and those that are topological. Metric maps can also be decomposed into spatial and
geometric representations.
The most straightforward spatial representation is to sample discretely the two- or
three-dimensional environment to be described. The simplest method is to sample space in
cells of a uniform grid. Samples taken at points in the lattice express the degree of occupancy
at the sample point. In two dimensions the grids are known as occupancy grids. In three
dimensions the sampling elements are known as voxels. The main advantage of a regular
lattice representation is its extreme generality: no strong assumptions are made regarding
object type, and thus these grids can represent anything. The main disadvantage of this
representation is that grid resolution or fidelity is limited by the cell size, and the grid is
storage intensive even if much of the environment is empty or occupied. In addition to
representing the level of occupancy at particular locations in space, elements of the lattice can
also be tagged with other attribute information such as the confidence of the occupancy value,
the safety or desirability of occupying the cell, terrain information and so forth.
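The idea can be illustrated with a minimal occupancy-grid sketch in Python; the class and attribute names, the cell size, and the 0.5 "unknown" prior are assumptions made for the example, not part of the thesis.

```python
# Minimal occupancy-grid sketch: a uniform lattice where each cell stores a
# degree of occupancy plus optional attribute tags (confidence, terrain, ...).

class OccupancyGrid:
    def __init__(self, width, height, cell_size=0.1):
        self.cell_size = cell_size          # metres per cell (assumed value)
        self.occ = [[0.5] * width for _ in range(height)]   # 0.5 = unknown
        self.tags = {}                      # (i, j) -> attribute dictionary

    def world_to_cell(self, x, y):
        return int(y / self.cell_size), int(x / self.cell_size)

    def set_occupancy(self, x, y, value, **attributes):
        i, j = self.world_to_cell(x, y)
        self.occ[i][j] = value
        if attributes:                      # tag the cell with extra attributes
            self.tags[(i, j)] = attributes

grid = OccupancyGrid(100, 100, cell_size=0.1)
grid.set_occupancy(1.25, 0.42, 0.9, confidence=0.7)
```

The storage cost is the main point: the grid allocates all width × height cells even though only one has been observed.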
One alternative to the storage problem is to represent space using cells of nonuniform
shape and size, although a recursive hierarchical representation is used more commonly. The
most common of these methods is the quadtree, which is a recursive data structure for
representing a square two-dimensional region. It is a hierarchical representation of cells that
can potentially reduce storage and aid in certain types of computations. Cells that are neither
uniformly empty nor full are subdivided into four equal subparts. Subparts are subdivided in
turn until they are either uniformly empty or full or until a resolution limit is reached. The
three-dimensional analogue of the quadtree is the octree. In general, the number of cells varies
roughly with the areas of the obstacles being described. Thus, for environments where most of
the space is free or occupied, quadtree-like representations are very suitable, at the expense of
some computational overhead.
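A minimal sketch of the quadtree construction, assuming a binary (empty/full) square grid whose side is a power of two; names and the leaf encoding are illustrative, not taken from the thesis.

```python
# Quadtree sketch: recursively subdivide a square occupancy map until regions
# are uniform or a resolution limit is reached.

def build_quadtree(grid, x, y, size, min_size=1):
    """grid is a square 2D list of 0 (empty) / 1 (full) cells."""
    cells = [grid[y + dy][x + dx] for dy in range(size) for dx in range(size)]
    if all(c == cells[0] for c in cells) or size <= min_size:
        return cells[0]                      # uniform leaf (or resolution limit)
    h = size // 2                            # subdivide into four equal subparts
    return [build_quadtree(grid, x,     y,     h, min_size),   # NW
            build_quadtree(grid, x + h, y,     h, min_size),   # NE
            build_quadtree(grid, x,     y + h, h, min_size),   # SW
            build_quadtree(grid, x + h, y + h, h, min_size)]   # SE

g = [[0, 0, 1, 1],
     [0, 0, 1, 1],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
tree = build_quadtree(g, 0, 0, 4)
```

Here three quadrants collapse into single leaves, which is exactly the storage saving the text describes for mostly free or mostly occupied environments.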
Quadtrees are somewhat restrictive because, unfortunately, not all space is well
characterised by this power-of-two representation, nor are all environments aligned with
power-of-two boundary planes. An alternative spatial decomposition method, which is not
quite as restrictive, is the binary space partitioning tree (BSP tree). It is a hierarchical
representation within which a rectangular cell is either a leaf or is divided into two BSP trees
by a plane parallel to one of the external boundaries of the environment. BSP trees provide
the same binary space-splitting characteristics as quadtrees but do not enforce the same
exactly even subdivision. A BSP representation of a given space is not necessarily unique, and
radically different representations of the same space can be obtained.
Geometric maps are those made up of discrete geometric primitives: lines, polygons,
points, polynomial functions and so forth. Such maps have the advantage of being highly
space-efficient because an arbitrarily large region of space can be represented by a model with
only a few numerical parameters. In addition, geometric maps can store occupancy data with
almost arbitrarily high resolution and without becoming liable to the storage penalties
incurred by techniques based on spatial sampling. Geometric maps are composed of the union
of simple geometric primitives. Such maps are characterised by two key properties: the set of
basic primitives used for describing objects, and the set of composition and deformation
operators used to manipulate objects. The primary shortcoming of geometric model-based
representations relates to the fact that they can be difficult to infer reliably from sensor data.
Three fundamental modelling problems are regularly encountered: lack of stability, lack of
uniqueness, and lack of expressive power.
The lack of stability is related to the representation, which may change in a drastic way
given just a small variation in the input. One approach to addressing the instability is to add a
stabiliser that acts to damp variations in the modelling system with variations in the input.
The stabiliser is a criterion that serves to bias the model-fitting process toward specific types
of solutions. The lack of uniqueness is due to the fact that many different environments may
map to the same representation. This problem occurs because many modelling systems,
especially those that have substantial expressive power and that approximate the input data
rather than expressing it exactly, can express a single set of observations in more than one
way. Some approaches associate a cost with each model used and its approximation errors,
and attempt to find a single model that minimises the total cost. The lack of expressive power
is due to the fact that it may be difficult (or impossible) to represent the salient features of the
environment within the modelling system because they are not well suited to expressing
distributions of measurements and the associated error models. One approach is to explicitly
represent aspects of the original distribution that are important, such as means or standard
deviations of uniform or Gaussian distributions. Although better than ignoring the data
misfits, such error models may fail to capture the complex distributions associated even with
simple sensors.
Geometric representations rely on metric data as the core of the representation.
Unfortunately these metric data are likely to be corrupted by sensor noise at the very least. To
avoid reliance on error-prone metric data, a non-metric topological representation may be
used. The key to a topological relationship is some explicit representation of connectivity
between regions or objects. In its purest form this may involve a complete absence of metric
data. A topological representation is based on an abstraction of the environment in terms of
discrete places with edges connecting them. In some cases the edges have lengths and are
oriented with respect to the nodes.
Typically, a robot’s environment is modelled as a graph whose vertices correspond to
landmarks, which can be recognised using sensors, placed within the environment. Each
vertex corresponds to one of the unique landmarks, whereas edges correspond to known
straight paths between landmarks. Each edge may be labelled with the distance that needs to
be travelled along the edge to arrive at the next landmark. Although the robot has no real
understanding of the geometric relationship between locations in the environment, the
representation does encode sufficient information for the robot to conduct point-to-point
navigation. The representation is also extremely compact.
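Such a landmark graph can be sketched as follows; the landmark names and edge distances are invented, and breadth-first search stands in for whatever traversal a real system would use.

```python
from collections import deque

# Topological map sketch: landmarks as vertices, traversable straight paths as
# edges labelled with the travel distance (illustrative names and values).
topo_map = {
    "door":     [("corridor", 2.0)],
    "corridor": [("door", 2.0), ("lab", 5.0), ("hall", 3.0)],
    "lab":      [("corridor", 5.0)],
    "hall":     [("corridor", 3.0)],
}

def route(graph, start, goal):
    """Breadth-first search over landmarks: fewest-hops point-to-point route."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt, _dist in graph[path[-1]]:
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None                # goal not reachable in this graph
```

Note how compact the representation is: four vertices and four distance labels suffice for point-to-point navigation, with no geometric relationship between locations stored at all.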
4.1.2 Path planning
The basic path planning problem refers to determining a path in configuration space
between an initial configuration of the robot and a final configuration, such that the robot does
not collide with any obstacle in the environment, and that the planned motion is consistent
with the kinematic constraints of the vehicle (Dudek and Jenkin, 2000). Configuration space
is a formalism for motion planning, in which a configuration is a specification of the physical
state of a robot with respect to a fixed environmental frame. The basic formulation can be
augmented in several ways, such as considering the minimum length path, moving obstacles,
minimum time path, etc. To render the planning problem tractable it is often necessary to
make a variety of simplifications with respect to the real environment. There are two groups
of techniques for path planning: discrete and continuous state space search.
Searching a discrete state space involves a set of possible problem states and a state
transition function to determine the states directly reachable from any given state. A search
method is an algorithm to control the exploration of the state space in order to identify a path
from some initial state to the goal. Many graph search methods exist in the literature. They
assign a node to each possible state and establish arcs between nodes that are traversable,
that is, there is a free path from one to another. Given this state transition function, a graph
search algorithm will find a path from an initial node s to any of a final set t, if any of such
paths exists. Some graph search formulations make use of evaluation functions to obtain
optimal or heuristically optimal paths. Usually, these formulations try to reduce the number of
nodes that are evaluated in order to obtain faster searches. If no path exists from the start to
the goal, these algorithms eventually will evaluate every possible reachable state before
returning false. This failure can take a considerable amount of time to detect. Although
effective, simple graph search can be too slow for robot path planning for a reasonably sized
search space. The most widely used graph search algorithm for path planning is A*, which
finds optimal paths from s to t, and can be tuned to expand a minimum number of nodes.
Dynamic programming is another general purpose technique that can be used for path
planning. It is a recursive procedure for evaluating the minimum cost path to any point in the
environment from some source. To use this method the problem must adhere to a principle of
optimality, which can be stated as an assertion that, given three points A, B, and C, the optimal
path from A to C via B can be computed by determining the optimal path from A to B and
from B to C. The discretised environment is referred to as a cost table. The idea is to compute
the cost of each cell in the table by computing an increment to the cost of an adjacent cell for
which the cost is already known. To efficiently compute the path from s to t without
completing the entire cost table, heuristic functions are often used to expand cells
preferentially. In general, these methods produce results similar (in terms of optimality and
search time) to those of graph search techniques. The difference usually resides in how easy it
is to define or compute the evaluation function, which depends heavily on the nature of the
environment and the specific problem.
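The cost-table idea can be sketched as follows, assuming unit step costs and 4-connectivity; this is a plain wavefront propagation, without the heuristic cell expansion mentioned above.

```python
from collections import deque

# Dynamic-programming cost table sketch: the cost of a cell is an increment
# over an adjacent cell whose cost is already known.

def cost_table(grid, source):
    """grid: 2D list, 1 = obstacle, 0 = free; source: (row, col)."""
    INF = float("inf")
    cost = [[INF] * len(grid[0]) for _ in grid]
    cost[source[0]][source[1]] = 0
    frontier = deque([source])
    while frontier:
        r, c = frontier.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):   # 4-connectivity
            nr, nc = r + dr, c + dc
            if (0 <= nr < len(grid) and 0 <= nc < len(grid[0])
                    and grid[nr][nc] == 0 and cost[nr][nc] == INF):
                cost[nr][nc] = cost[r][c] + 1    # increment over a known cell
                frontier.append((nr, nc))
    return cost

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
c = cost_table(grid, (0, 0))
```

A minimum-cost path is then recovered by descending the table from the goal towards cost 0.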
Given a technique to search a discrete state space for a path, it remains to take an
environment and construct a state space to represent it. The most straightforward approach is
to represent the space using a grid map, assign a node to each cell, and then connect
adjacent cells. Other mechanisms for mapping a robot’s environment onto a discrete
searchable space include visibility graphs and Voronoi diagrams. A visibility graph produces
a minimum length path from a start to a goal by solving a simple graph traversal algorithm.
The visibility graph is defined such that a set of vertices is made up of the union of all the
vertices of the polygonal obstacles in the environment as well as the start and end points. The
edges of the graph connect all vertices that are visible to one another. One unfortunate feature
of a path found using a visibility graph is that the path passes through vertices and is
arbitrarily close to the edges of obstacles. For non-circular robots this method is not complete:
it may fail to find a path even if one exists. The visibility graph brings the robot too close to
the boundaries of the obstacles. An alternative approach is a Voronoi diagram, which keeps
the robot as far as possible from all the obstacles in the environment. A Voronoi diagram is
the locus of points equidistant from the closest two or more obstacle boundaries, including the
workspace boundary. This set of points has the useful property of maximising the clearance
between points and obstacles. Given a start and end position, the robot could move directly
away from its nearest obstacle until it moves to a point lying on the Voronoi diagram. It can
then follow the set of arcs and straight line segments that compose the diagram until it reaches
the nearest collision-free position to the goal. The major problem with this type of planning is
that paths are relatively long, because this technique is often too conservative to be attractive.
Rather than searching through a discrete space, an alternative is to model the space of
the robot as a continuum space and to consider path planning as the determination of an
appropriate trajectory within this continuum. This is known as continuous state space
planning. Two of the most well known techniques for continuous state space are the potential
fields and the vector field histogram methods. These are methods that model the world as
forces that drive the robot towards the goal while driving it away from obstacles (see chapter
5 for a detailed description of these methods). These methods are very fast to compute, at the
expense of possibly driving the robot into local minima, although some heuristics try to
overcome this problem. Alternatively, there is a class of path planners that are based on the
bug algorithm and are guaranteed to find a path from the start to the goal if such a path exists.
These methods operate by switching between two simple behaviours: moving directly
towards the goal location, and circumnavigating an obstacle. The drawback of these methods
is that the generated trajectories can be arbitrarily worse than the optimal path to the goal.
4.1.3 Localisation
For numerous tasks a mobile robot needs to know where it is either on an ongoing basis
or when specific events occur. This problem has many different connotations. In the strongest
sense, knowing where the robot is involves estimating some global representation of the
space. This is usually referred to as strong localisation. The weak localisation problem, in
contrast, involves merely knowing if the current location has been visited before. Between the
extremes of the weak localisation and strong localisation problems there exists a continuum of
different problem specifications that involve knowing where the robot is or estimating the
robot’s pose.
In certain circumstances it may be necessary to infer the robot’s position without an a
priori estimate of its location. This type of positioning is referred to as global localisation. A
more common version of the localisation problem is the need to refine an estimate of the
robot’s pose continually. This task is known as pose maintenance or local localisation. A key
step in the process of performing either local or global localisation involves matching the set
of current observations to some established map. Standard matching methods can be broadly
classified into the following categories:
• Data-data matching. Directly matching the current raw data with predicted raw data
(extracted from the map either by predictive modelling or using stored data sets).
• Data-model matching. Matching the observed data to more abstract models stored in the
map (based on a model of how models and data are associated).
• Model-model matching. Matching models stored in the map to models generated from
current observations.
In general, matching with raw data can reduce dependence on a priori assumptions about
the environment but tends to be less robust unless the matching technique itself is very
sophisticated (in which case it resembles model-model matching).
Most devices for measuring position and distance are relative measurement tools. By
counting the number of rotations executed by a vehicle’s drive wheels, for example, and using
knowledge of the wheel’s size and the vehicle’s kinematics, an estimate of the rate of position
change can be obtained. Computing absolute coordinates thus involves the integration of such
local differential quantities. This technique is known as dead reckoning, and when using
odometers it is called odometry. Dead reckoning can have acceptable accuracy over sufficiently
small steps given a suitable terrain and drive mechanism. For larger or complex motions,
however, the unavoidable errors in the individual position estimates have a major
consequence. Simple scaling errors can be corrected, but more complex errors are essentially
impossible to eliminate.
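A dead-reckoning sketch for a differential-drive vehicle follows; the wheel radius, track width and encoder resolution are invented values, and the small-step pose update is the usual approximation, not a formula from the thesis.

```python
import math

# Dead reckoning: integrate wheel-encoder increments into a pose estimate.

WHEEL_RADIUS = 0.05   # m (assumed)
TRACK = 0.30          # m, distance between the wheels (assumed)

def integrate(pose, d_ticks_l, d_ticks_r, ticks_per_rev=1000):
    """pose = (x, y, theta); returns the updated pose after one encoder step."""
    x, y, th = pose
    dl = 2 * math.pi * WHEEL_RADIUS * d_ticks_l / ticks_per_rev
    dr = 2 * math.pi * WHEEL_RADIUS * d_ticks_r / ticks_per_rev
    d = (dl + dr) / 2                 # distance travelled by the robot centre
    dth = (dr - dl) / TRACK           # heading change
    # small-step approximation: translate along the mean heading
    x += d * math.cos(th + dth / 2)
    y += d * math.sin(th + dth / 2)
    return x, y, th + dth

pose = (0.0, 0.0, 0.0)
for _ in range(10):                   # ten equal straight-line steps
    pose = integrate(pose, 100, 100)
```

Any error in WHEEL_RADIUS or TRACK is integrated step after step, which is precisely why the absolute estimate drifts over long motions.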
Another different broad set of techniques is based on the use of external referencing
systems. These are usually known as landmark-based localisation. In principle, the problem
is related to pose estimation of a landmark with respect to a fixed sensor. Important variations
of the problem arise when the landmarks are unlabelled, when the landmarks are difficult to
detect, or when the measurements are inaccurate. Landmarks can be active or passive, natural
or artificial. Active landmarks are typically transmitters that emit unique signals and are
placed about the environment. Active landmarks avoid many of the problems commonly
associated with passive, naturally occurring landmarks. In general, the use of an artificial
landmark, either active or passive, can extend the operational range of the underlying sensor
technology relative to a natural landmark.
Triangulation refers to the solution of constraint equations relating the pose of an
observer to the positions of a set of landmarks. Pose estimation using triangulation methods
from known landmarks has been practised since ancient times, and it is the basis of multiple
landmarks localisation. Sensor-based servoing is a technique based on storing a specific
sensor image I(qgoal) associated with a target position for the robot qgoal and using the
difference between I(qgoal) and I(qc) to move the robot from its current position qc to the goal.
As a result of the approximations used and sensor noise, actually attaining the target position
usually involves an iterative approach. Sensor based servoing usually lacks a robust and stable
feature space, quantitative positioning, and can only be used to return to previously visited
places. On the other hand, the technique has low overhead and makes only simple
assumptions about the environment.
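A two-landmark triangulation sketch, assuming absolute bearings are available (i.e. the observer's heading is known; a compass or a third landmark would remove that assumption). The landmark positions are illustrative.

```python
import math

# Triangulation: recover the observer position from the known positions of two
# landmarks and the measured absolute bearings towards them.

def triangulate(l1, b1, l2, b2):
    """l1, l2: landmark (x, y); b1, b2: absolute bearings (rad) towards them."""
    (x1, y1), (x2, y2) = l1, l2
    det = math.sin(b1 - b2)           # parallel bearings cannot be intersected
    if abs(det) < 1e-9:
        return None
    # the observer lies at l1 - t1*(cos b1, sin b1); solve the 2x2 system
    t1 = (-(x1 - x2) * math.sin(b2) + (y1 - y2) * math.cos(b2)) / det
    return x1 - t1 * math.cos(b1), y1 - t1 * math.sin(b1)

# An observer at (1, 1) sees a landmark at (1, 3) due north
# and a landmark at (4, 1) due east:
pos = triangulate((1.0, 3.0), math.pi / 2, (4.0, 1.0), 0.0)
```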
To accomplish position estimation using external referencing, the disparity between the
known and observed locations of obstacles provides an estimate of positions to be used in
correcting the drift in estimates from internal sensors. A Kalman filter is one of the most
widely used techniques to combine pose estimates from dead reckoning and other landmarkbased measures in an optimal manner. In the context of mobile robotics, the criteria that
assure an optimal result are often violated. In practice, to account the different nonlinearities
of the sensors response functions a variant of the Kalman filter, the extended Kalman filter
(EKF), is obtained by linearising the model of the system. Although the Kalman filter is a
powerful technique that often performs well under suitable conditions, it does have its
shortcomings. Fundamentally, it depends on the acceptability of the linearisation of the
system being modelled (extended), an acceptable estimate of the sources of error in the
system, and on well behaved error distribution (e.g. Gaussian). In addition, the need for a
simple relationship between geometric structures and observed measurements makes it
difficult to apply such methods in some environments.
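A one-dimensional Kalman filter sketch of this combination of dead reckoning and landmark measurements; the noise values are illustrative and the real robot formulation is multidimensional.

```python
# 1D Kalman filter: fuse a dead-reckoning prediction with a landmark-based
# position measurement. All noise variances are invented for the example.

def kalman_step(x, p, u, z, q=0.05, r=0.2):
    """x, p: position estimate and its variance; u: odometry increment;
    z: absolute position measurement; q, r: process / measurement noise."""
    # predict: apply the motion command, uncertainty grows
    x_pred = x + u
    p_pred = p + q
    # update: blend prediction and measurement with the Kalman gain
    k = p_pred / (p_pred + r)
    x_new = x_pred + k * (z - x_pred)
    p_new = (1 - k) * p_pred
    return x_new, p_new

x, p = 0.0, 1.0                       # very uncertain initial estimate
x, p = kalman_step(x, p, u=1.0, z=1.2)
```

The update pulls the prediction towards the external measurement and shrinks the variance, which is the drift-correcting effect described above.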
4.3 Map building and path planning
The first navigation procedure relies on the assumption of reliable odometry for short
time and distance operations. With this assumption in mind, the robot is provided with a map
building method and a path planning algorithm. The robot does not know the environment a
priori, and thus, it has to perform both tasks simultaneously. A fuzzy grid map is used for map
building, while the path planning is based on the A* algorithm. As opposed to other
navigation schemes, the robot is required to build the maps while it is moving, and thus
approaches based on non-continuous movement (the robot maps the area, generates a short-distance
map, and then moves that distance) are not desired. The rationale for this requirement
is to exhibit some kind of reactivity (see chapter 5), so that the robot behaves in a more
animal- or human-like way.
The proposed navigation procedure is as follows. Each time new sensor values are
available, the map is updated. This step, which is not very time consuming, is performed at a
given frequency (approximately every 300 ms). The robot is moving simultaneously, and with
a certain periodicity a new path is computed based on the current map. This step is performed
at a slightly lower frequency (approximately every 1200 ms), because the path planning is
comparatively more time consuming. The different algorithms are explained below.
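The two update rates can be sketched with a simple cyclic scheduler; the stub functions, the log, and the exact cycle counts are illustrative, not the actual Quaky-Ant control loop.

```python
# Two-rate scheduling sketch: the map is updated every cycle (~300 ms) while a
# new path is planned every fourth cycle (~1200 ms).

CYCLE_MS = 300
PLAN_EVERY = 4          # 4 x 300 ms = 1200 ms

log = []

def update_map(t):      # stub: stands for the fast map update step
    log.append(("map", t))

def plan_path(t):       # stub: stands for the slow path-planning step
    log.append(("plan", t))

for cycle in range(8):                   # simulate eight control cycles
    t = cycle * CYCLE_MS
    update_map(t)                        # fast, runs every cycle
    if cycle % PLAN_EVERY == 0:
        plan_path(t)                     # slow, runs at the lower frequency
```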
4.3.1 Fuzzy grid map
The choice of internal map representation depends on various characteristics, but mainly
on the environment itself (obstacle density, room dimensions, etc.). Lee (Lee, 1996) makes a
classification according to the range of geometric properties that can be derived from the map.
Considering the properties of an indoor, high-density environment, a metric map may be
considered the most appropriate choice because of its compact representation of the
environment. Square-cell grid maps are the most widely used metric maps (Moravec,
1985), (Borenstein, 1988), (Lim, 1992). Although they have many shortcomings, the
advantage of employing grid maps is that path planning is quite direct and simple. In recent
years, different tools based on neural networks (Kurz, 1996), (Thrun, 1998) or fuzzy logic
(Oriolo et al., 1998) have been introduced to build grid maps.
The method used for map building is based on fuzzy grid maps (Oriolo et al., 1998),
and is described below. Each cell in the map has two values associated: the degree of
certainty that the cell is empty εij and the degree of certainty that the cell is occupied ωij.
These values are calculated for each cell in the beam of a single sensor. As sonar and infrared
sensors have different aperture widths, that of the wider sensor (sonar) is used. Let τ be the
fused value for a given pair s of sonar and infrared sensors. Let cs be cells in the aperture of s.
For each cs, whose polar coordinates from s are θ and ρ, the value of its corresponding εij and
ωij is computed using equations (Eq. 4.1) and (Eq. 4.2).
ε(θ, ρ) = kε gb gc gd    (4.1)

ω(θ, ρ) = kω ga gc gd    (4.2)

These equations combine some possibilities and beliefs about the cells, taking into
account their position along the sensor beam. In the equations described below, the following
values have been used: ∆ρ = 0.15, kε = 0.1, kω = 0.5, h1 = 1.2, h2 = 1.0, and h3 = 0.1.

• The possibility ga that cells in the arc of the sensor range τ are obstacles (Eq. 4.3).
• The possibility gb that cells in the circular sector of radius τ are empty (Eq. 4.4).
• The belief gc that shorter distances are more accurate (Eq. 4.5).
• The belief gd that cells near to the beam axis are empty (Eq. 4.6).

ga(ρ) = 1 − ((ρ − τ)/∆ρ)²  if |ρ − τ| < ∆ρ;  0 otherwise    (4.3)

gb(ρ) = 1  if ρ < τ − ∆ρ;  0 otherwise    (4.4)

gc(ρ) = min(1, h1 e^(−h2 ρ) + h3)    (4.5)

gd(θ) = 1 − ((12/π) θ)²  if |θ| < π/12;  0 otherwise    (4.6)
The εij and ωij obtained in this way are then aggregated with the previous values of each
cell, to obtain the aggregated degree of certainty that the cells are empty E(i,j) or occupied
O(i,j), using equations (Eq. 4.7) and (Eq. 4.8).

E(i,j) = ∪k εk(i,j)    (4.7)

O(i,j) = ∪k ωk(i,j)    (4.8)

As these E(i,j) and O(i,j) correspond to fuzzy degrees of membership, such information
is combined in an additional value for each cell M(i,j), which represents, in a conservative
way, whether the cell is occupied. Its meaning can be defined precisely from its definition: the
degree of certainty that the cell is empty, is not occupied, and is not ambiguous (Eq. 4.10),
where the ambiguity C(i,j) is given by (Eq. 4.9).

C(i,j) = E(i,j) ∩ O(i,j)    (4.9)

M(i,j) = E(i,j) ∩ ¬O(i,j) ∩ ¬C(i,j)    (4.10)
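A sketch of this aggregation for a single cell, assuming the usual max/min/complement fuzzy operators (the text gives only the set expressions, so this operator choice is an assumption):

```python
# Fuzzy aggregation sketch (Eq. 4.7-4.10): union as max, intersection as min,
# complement as (1 - x). Reading values are invented for the example.

def aggregate(E, O, eps, omg):
    """Fold one new reading (eps, omg) into the cell certainties E, O."""
    return max(E, eps), max(O, omg)          # Eq. 4.7 / 4.8: fuzzy union

def navigation_value(E, O):
    """Eq. 4.9 / 4.10: empty AND not occupied AND not ambiguous."""
    C = min(E, O)                            # ambiguity: empty and occupied
    return min(E, 1 - O, 1 - C)              # conservative free-space degree

E, O = 0.0, 0.0
for eps, omg in [(0.7, 0.1), (0.5, 0.3)]:    # two illustrative readings
    E, O = aggregate(E, O, eps, omg)
m = navigation_value(E, O)
```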
Using this method, the robot uses and maintains three different maps:
• The map of cells that are likely to be empty (Figure 4.2), which is constructed using
E(i,j). Darker cells correspond to a higher possibility of being empty, while lighter cells
correspond to a lower possibility.
• The map of cells that are likely to be occupied (Figure 4.3), which is constructed using
O(i,j). Darker cells correspond to a higher possibility of being occupied, while lighter cells
correspond to a lower possibility.
• The navigation map (Figure 4.4), which is constructed using M(i,j). This map is the
one that is used later for path planning. Darker cells correspond to a higher certainty of
being an obstacle, while lighter cells correspond to a higher certainty of being empty.
Figure 4.2: Empty cells map.
Figure 4.3: Occupied cells map.
Figure 4.4: Navigation map.
One important point to be noted here is derived from the fact that the original method
(Oriolo et alt., 1998) only takes into account sonar sensors, modelling the uncertainty directly
on the map. In this case, virtual sensors are used to build the map instead of using the sonars
directly. These virtual sensors fuse sonar and infrared sensors to obtain a more reliable
measure (see chapter 3). For distances greater than 80 centimetres the fusion method behaves
as a sonar, while for smaller distances it behaves as an infrared. In this case the infrared beam
is more or less similar to the sonar one when projected on a cell map, because the beam is
discretised into 10-20 centimetre cells. Thus, the map model is still valid.
Another important difference is the map initialisation. The original method was
considered for use in a sense-map-plan-move approach, and the unexplored area is modelled
as an ambiguous (εij = 1.0, ωij = 1.0) or occupied area (εij = 0.0, ωij = 1.0). As has been stated
above, this behaviour is not desired and consequently the path-planning procedure is different.
This requires that the unexplored area is modelled as empty area (εij = 1.0, ωij = 0.0) so that
tentative paths may cross unexplored areas if necessary.
Finally, all the cells occupied by the robot are set to the maximum values of certainty
that the cells are empty (εij = 1.0) and not occupied (ωij = 0.0). Trivially, if the robot is over
these cells, there is not any obstacle there. The influence of this strategy can be noted in the
maps shown above (Figure 4.2) as very dark cells over the robot’s trajectory.
4.3.2 A*-based planning
The most direct method for path-planning using grid maps is to consider each cell as a
node in a graph and then apply a graph search algorithm. The A* algorithm (Hart et al., 1968)
allows heuristic information to be incorporated when available, resulting in an efficient
search. Let s be the cell which the robot is currently located on. Let t be the desired objective
cell. Let f(n) be an evaluation function that heuristically decides how likely the node n is to be
expanded. The A* algorithm proceeds as follows:
(s1) Mark s as open, and calculate f(s).
(s2) Select the open node n whose value f(n) is smallest. Resolve ties arbitrarily, but
always in favour of any node n = t.
(s3) If n = t, mark n as closed and terminate the algorithm.
(s4) Otherwise, mark n as closed and apply the successor operator to n. Calculate f(n)
for each successor n and mark as open each successor not already marked as closed.
Remark as open any closed node ni which is a successor of n and for which f(ni) is
smaller now than it was when ni was marked closed. Go to step s2.
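The steps s1–s4 can be sketched over an 8-connected grid of traversal costs; the grid representation and the Euclidean heuristic used here are illustrative assumptions:

```python
import heapq, math

def a_star(grid, s, t):
    """A* over an 8-connected grid (steps s1-s4). grid[y][x] holds the
    traversal cost of a cell; s and t are (x, y) tuples. The Euclidean
    distance to t is used as the heuristic h*."""
    h = lambda n: math.hypot(t[0] - n[0], t[1] - n[1])
    open_heap = [(h(s), s)]            # s marked open with f*(s) (s1)
    g = {s: 0.0}
    parent = {s: None}
    closed = set()
    while open_heap:
        f, n = heapq.heappop(open_heap)  # open node with smallest f* (s2)
        if n == t:                       # goal reached: rebuild path (s3)
            path = []
            while n is not None:
                path.append(n)
                n = parent[n]
            return path[::-1]
        if n in closed:
            continue
        closed.add(n)                    # mark closed and expand (s4)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if dx == dy == 0:
                    continue
                m = (n[0] + dx, n[1] + dy)
                if not (0 <= m[1] < len(grid) and 0 <= m[0] < len(grid[0])):
                    continue
                step = grid[m[1]][m[0]] * (math.sqrt(2) if dx and dy else 1.0)
                if m not in g or g[n] + step < g[m]:
                    g[m] = g[n] + step   # better path found: (re)open m
                    parent[m] = n
                    heapq.heappush(open_heap, (g[m] + h(m), m))
    return None
```

Stale heap entries are simply skipped when popped, which plays the role of re-marking closed nodes as open.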
The evaluation function f(n) can be written as a sum of two parts (Eq. 4.11):
f(n) = g(n) + h(n)
(4.11)
where g(n) is the actual cost of an optimal path from s to n, and h(n) is the actual cost of
an optimal path from n to the goal t. As f(n) is not known a priori, an estimate f *(n) is used.
The new evaluation function can be written as a sum of two parts (Eq. 4.12):
f *(n) = g*(n) + h*(n)
(4.12)
where g *(n) is the cost of the path from s to n with minimum cost so far found, and h*(n)
is an estimate of the cost of an optimal path from n to the goal t. From the definition, the
following condition (Eq. 4.13) must hold true:
0 ≤ g(n) ≤ g*(n)
(4.13)
A* is said to be complete under the admissibility condition (Eq. 4.14):
0 ≤ h*(n) ≤ h(n)
(4.14)
Moreover, the heuristic function h(n) is said to be locally consistent if, for any pair of
adjacent cells (a, b) the following condition (Eq. 4.15) holds true:
0 ≤ h*(a) ≤ h*(b) + w(a,b)
(4.15)
where w(a,b) is the cost of the arc between the cells a and b. The choice h*(n) = 0 is
trivially admissible and locally consistent, resulting, however, in a non-informed search.
Taking a fuzzy grid map M(i,j), as described in the previous subsection, the path
planning problem can be stated in the following way: the robot has to navigate from an initial
cell s’ to a goal cell t’. Instead of looking for a path starting at the robot location, the A* is
applied to look for a path starting in the goal cell. In this way, the starting position s’
corresponds to the A* goal t, and the goal position t’ corresponds to the A* start s. There are
approaches to solve this problem taking into account sensor noise, the robot size and
minimum length paths (Oriolo et al., 1998), but not simultaneously. In this thesis a different
approach has been taken to find minimum length paths while providing safe clearance of
obstacles. Using the previous problem statement, the h*(n) function (Eq. 4.16) is defined as
follows:
h*(n) = √((sx − nx)² + (sy − ny)²)
(4.16)
where (sx, sy) and (n x, ny) are the coordinates of the cells s and n respectively. Trivially,
the condition (Eq. 4.14) holds true as do the local consistency criteria (Eq. 4.15), because this
h*(n) is the Euclidean distance of the most direct path between s and n. The definition of the
g*(n) is more complex as it is computed in a two step process: first, the fuzzy map is dilated
locally around the cell a, and second, the resulting cost is augmented. The dilation step dil(a)
is performed as follows (Eq. 4.17):
dil(a) = max{ Mij : i ∈ [ax − δ, ax + δ], j ∈ [ay − δ, ay + δ] } / kdil
(4.17)
where δ is the size of the dilation window, and k dil is the dilation weight. The
augmentation step aug(x) is performed as follows (Eq. 4.18):
aug(x) = e^(kaug·x)
(4.18)
where k aug is the augmentation weight. Finally, the g*(n) function is recursively
computed as follows (Eq. 4.19):
v(n) = aug(dil(n))
g*inc(n, n+1) = v(n)   if nx = (n+1)x ∨ ny = (n+1)y
g*inc(n, n+1) = √2·v(n)   otherwise
g*(s) = 0
g*(n+1) = g*(n) + g*inc(n, n+1)
(4.19)
where v(n) is the composition of the dilation and augmentation steps, and g*inc(n, n+1) is
the incremental step for the computation of g*(n). The incremental function g*inc(n, n+1)
computes the cost of traversing from cell n to n+1, where n+1 is the next cell in the path to
the goal. Basically, the function returns v(n), if both cells are in the same horizontal or
vertical line, or √2·v(n) if the cells are diagonally aligned.
The dilation function serves to fill small gaps in the maps so that tentative paths do not
cross walls or small openings. By choosing different values of δ, different robot radii or sizes
can be taken into account. The augmentation function exponentially spreads the degrees of
certainty that cells are occupied or empty so that small certainty values give similar results
while high certainty values give much bigger results. The idea is to avoid the use of occupied
cells for path planning, and thus, this augmentation helps the use of lower certainty values
M(i,j) when possible. The dilation function is applied only locally in order to avoid applying
the transformation to the whole map because plans usually involve small portions of the map
and the dilation process is time consuming. A sample of the effects of the evaluation function
f *(n) is shown below (Figure 4.5). The dilation effect is observed as peaks with half the height
of their surrounding. The augmentation effect is observed as the difference between cells with
different certainty values.
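The node cost v(n) = aug(dil(n)) can be sketched as follows. The window extremum and the role of kdil in the dilation are assumed readings of (Eq. 4.17), and the default parameter values are the ones quoted below for the Quaky-Ant set-up:

```python
import math

def dil(M, ax, ay, delta=2, k_dil=2.0):
    """Local dilation (Eq. 4.17, assumed reading): neighbourhood maximum
    of the map values in a (2*delta+1)-cell window, scaled by the
    dilation weight. The window is clipped at the map borders."""
    window = [M[j][i]
              for j in range(max(0, ay - delta), min(len(M), ay + delta + 1))
              for i in range(max(0, ax - delta), min(len(M[0]), ax + delta + 1))]
    return max(window) / k_dil

def aug(x, k_aug=5.0):
    """Augmentation (Eq. 4.18): exponential spreading of certainties."""
    return math.exp(k_aug * x)

def v(M, ax, ay):
    """Node cost v(n) = aug(dil(n)) used inside g* (Eq. 4.19)."""
    return aug(dil(M, ax, ay))
```

Cells far from any obstacle get the unit cost aug(0) = 1, while cells in the dilation halo of an obstacle inherit an exponentially increased cost, which is what keeps planned paths clear of walls.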
Figure 4.5: Evaluation function f*(n).
Using the previous A* based planning scheme, the navigation strategy uses the
computed path to set a look-ahead point which is used as a short-term goal location. This
look-ahead point is defined as the point of the path from s to t which is at a distance of δlook
cells from s. The distance δlook depends on both the robot’s average speed and the navigation
control cycle frequency. For the Quaky-Ant robot, the following navigation parameters are
used: k dil = 2.0, δ = 2, kaug = 5.0, δlook = 10. A sample path is shown below (Figure 4.6), where
the dilation effect (the path leaves enough clearance near the walls) and the look-ahead point
(small circle in the front of the robot) can be noted.
Figure 4.6: A*-based path.
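The look-ahead point selection described above can be sketched in a few lines; the default δlook = 10 is the value quoted for Quaky-Ant:

```python
def look_ahead(path, delta_look=10):
    """Short-term goal: the point of the planned path that lies
    delta_look cells ahead of the robot cell s (the path start).
    If the path is shorter than delta_look, the goal itself is used."""
    return path[min(delta_look, len(path) - 1)]
```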
4.3.3 Experiments and results
Different experiments have been conducted to test the simultaneous map-building and
path-planning. One of these tests is shown below (Figure 4.7). The Quaky-Ant robot is
simulated in autonomous mode (see chapter 6), and has to navigate from a given home
position (bottom-right corner) to a goal location (top-left corner) in an a priori unknown
environment. The different slides show the evolution of the fuzzy grid map (top image) and
the A* evaluation function f *(n) (bottom image). The world map is superimposed on map
images for reference. Cells that are not expanded are represented with the maximum value of
f*(n) found so far.
The robot starts (position “a”) with no previous knowledge, and thus the path to the goal
is an almost straight line. As the robot adds new features to the fuzzy grid map, the path
consequently avoids the modelled walls (position “b”). Due to sensor noise, parts of the walls
are not modelled, and in this way paths can be generated through them if the corresponding
holes are bigger than the robot (position “c”). If more evidence about the wrong walls is
acquired, even if they are not completely continuous, the dilation step prevents the method
from crossing the walls with a path (position “d”). When the local environment is almost fully
modelled (position “e”), the path is traced over safe positions until the robot reaches the goal
(position “f”).
a) Starting the run.
b) Crossing the 1st doorway.
c) Entering the corridor.
d) In the corridor.
e) Crossing the 2nd doorway.
f) Reaching the goal.
Figure 4.7: Simultaneous map-building and path-planning (simulated).
The experiment area has a dimension of 5.4 x 4.95 metres, and it is tessellated in 0.075
metres wide cells, resulting in a grid map that contains 4,752 cells. The following table (Table
4.1) shows the number of cells in the path, the number of expanded nodes and the time taken
to expand the nodes in each of the previous navigation slides (Figure 4.7). The test has been
performed in a Mac PowerPC G3 running at 300 MHz on top of the BGen environment (see
chapter 5), which has been developed in Java.
Position   Path   Expanded   Time (ms)
a          52     52         5
b          53     814        100
c          48     579        148
d          40     374        23
e          20     99         7
f          5      5          1
Table 4.1: Number of expanded cells.
In addition to simulated experiments, the navigation approach has also been tested in
real world environments (see chapter 6). One of these is shown below (Figure 4.8). The
mapped area has a dimension of approximately 8 x 8 metres, and it is composed of wooden
walls (bottom-left room) and standard-building walls (top-right corridor). The cleanness of
the map and the safe path generated are points to be noticed.
Figure 4.8: Simultaneous map-building and path-planning (real world).
4.4 Map building and localisation
The first navigation procedure relied on the assumption of reliable odometry for short
time and distance operations. This is clearly an important constraint in large environments or
in operations requiring long time. To overcome this limitation, the previous system is
complemented with a localisation method based on a fuzzy segments map. Global grid maps
are not suitable for localisation, while local grid maps (based on sonar signature matching)
require a precise compass reference to obtain reliable results (Duckett and Nehmzov, 1998).
In addition, the previous approach, if provided with a reliable localisation, performs fast and
produces safe paths. Taking into account these facts, it was decided to improve the system by
adding a separate technique for localisation, which did not need an a priori environment
model.
The new navigation procedure is as follows. The fuzzy grid map and the A* path
planner are executed in a similar way as described in the previous section, but instead of using
positions returned by the odometry, they use the corrected localised positions.
Simultaneously, a local fuzzy segments map is updated each time a sensor buffer is full
(approximately every 1200 ms). This is compared (approximately every 2500 ms) to a global
fuzzy segments map to compute the differences and thus obtain the current odometry drift,
which is then corrected. Both the local and global maps are constructed from scratch, without
any prior knowledge. The different algorithms are explained below.
4.4.1 Fuzzy segments map
The method used for map building is based on the fuzzy segments map (Gasós and
Martín, 1996b)(Gasós, 2000), which is a geometric representation composed of line segments
only. By carefully managing the uncertainty related to the robot’s position and sensor noise,
the method tries to overcome some of the problems typically related to geometric models. In
this way, a fuzzy segments map vastly reduces the lack of stability by decreasing the
expressive power of the line fittings which keep the uncertainty all the way along the process.
The method is described below.
Let {ou, …, ov}, v ≥ u + 3, be a set of at least four consecutive collinear sensor observations,
and let {(xu, yu), …, (xv, yv)} be the corresponding points in the plane. A segment is a tuple (Eq.
4.20):
B = {θ, ρ, (xi, yi), (xj, yj), k}
(4.20)
where θ, ρ are the parameters of the equation x·cos(θ) + y·sin(θ) = ρ, obtained by
eigenvector line fitting of the k = v − u + 1 points in the plane, and the limits of the segment
(xi, yi), (xj, yj) are the perpendicular projections on the line of the first and last points (xu, yu)
and (xv, yv) respectively.
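The eigenvector line fitting behind (Eq. 4.20) can be sketched as a closed-form total least-squares fit for 2-D points; the sign convention chosen for ρ is an assumption of this sketch:

```python
import math

def fit_segment(points):
    """Eigenvector (total least-squares) fit of collinear observations
    (Eq. 4.20): returns theta, rho of x*cos(theta) + y*sin(theta) = rho,
    the projections of the first/last points onto the line, and k."""
    k = len(points)
    mx = sum(x for x, _ in points) / k
    my = sum(y for _, y in points) / k
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    # Line direction = principal eigenvector of the 2x2 scatter matrix,
    # which has the closed form below; the normal is 90 degrees away.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy) + math.pi / 2
    rho = mx * math.cos(theta) + my * math.sin(theta)
    if rho < 0:  # keep rho non-negative (assumed convention)
        rho, theta = -rho, theta + math.pi
    def project(p):
        d = p[0] * math.cos(theta) + p[1] * math.sin(theta) - rho
        return (p[0] - d * math.cos(theta), p[1] - d * math.sin(theta))
    return theta, rho, project(points[0]), project(points[-1]), k
```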
The uncertainty in the real location of a segment B is due to three different groups of
factors: the lack of knowledge about the way the sensor signal reflects in the objects, the
distance between the sensor and the object, and the uncertainty of the robot location due to
estimation by dead reckoning. Given a segment B, the uncertainty of its real location due to
the first group of factors is expressed by representing the value of ρ as a trapezoidal fuzzy set
fρ1 which depends on the scatter of the points. The trapezoidal fuzzy set is built by assigning
the confidence interval with confidence level 0.68 to the α-cut in 1, and the interval with
confidence level 0.95 to the α-cut in 0. For a normal distribution of observations, the values
0.68 and 0.95 generate limits of the interval that are located at a distance equal to one and two
times the standard deviation. These intervals are computed from a sample of size k drawn
from a normal distribution with sample mean ρ and variance s²ρ of the mean given by the
following equation (Eq. 4.21):
s²ρ = Σ(l=u..v) ρ²l / (k(k − 1))
(4.21)
where ρl is the perpendicular distance from the point (xl, yl) to the line. Symmetrical 1-β
confidence limits are given by ρ ± t β / 2 sρ , in which tβ/2 is the β/2 percentage point of the t
distribution with k-1 degrees of freedom. Thus fρ1 is defined as follows (Eq. 4.22):
fρ1 = (−t0.025·sρ, −t0.16·sρ, t0.16·sρ, t0.025·sρ)
(4.22)
The uncertainty of the real location of B due to the second group of factors is
represented by a trapezoidal fuzzy set fρ2 (Eq. 4.23):
fρ2 = (−k2·d, −k1·d, k1·d, k2·d),   k2 > k1
(4.23)
where k1 and k 2 are constants with a value that depends on the particular sensor and
working environment, and they are calculated as bounds of the uncertainty due to these
factors in normal working conditions. The d value is the average distance of the observations
used to build the segment.
Finally, the uncertainty in the real location of B due to the third group of factors is
represented by a trapezoidal fuzzy set fρ3 (Eq. 4.24):
fρ3 = (−k4·a, −k3·a, k3·a, k4·a),   k4 > k3
(4.24)
where k 3 and k 4 are constants with a value that depends on the particular robot and
terrain properties, and a measures the displacement of the robot since the last time its location
was updated using an absolute positioning system.
Since the three sources of uncertainty are mutually independent, the total uncertainty on
the real location of the segment fρ is obtained by addition of the corresponding trapezoidal
fuzzy sets (Eq. 4.25):
fρ = fρ1 ⊕ fρ2 ⊕ fρ3
(4.25)
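Assuming the usual arithmetic of trapezoidal fuzzy numbers, where ⊕ adds the four corner abscissas component-wise, (Eq. 4.25) can be sketched as:

```python
def tz_add(*sets):
    """Addition of trapezoidal fuzzy sets (Eq. 4.25), each represented
    as the corner tuple (a, b, c, d): the sum is component-wise."""
    return tuple(sum(t[i] for t in sets) for i in range(4))

# Hypothetical corner values for the three uncertainty sources.
f_rho = tz_add((-0.4, -0.2, 0.2, 0.4),   # f_rho1: scatter of the fit
               (-0.3, -0.1, 0.1, 0.3),   # f_rho2: sensing distance
               (-0.2, -0.1, 0.1, 0.2))   # f_rho3: odometric drift
```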
A fuzzy segment is a segment in which information regarding the uncertainty on the
parameter ρ has been added as a trapezoidal fuzzy set (Eq. 4.26):
F = {θ, ρ, fρ, (xi, yi), (xj, yj), k}
(4.26)
Given a fuzzy segment F, the uncertainty of the real position has been represented
previously as a fuzzy set fρ, and the uncertainty of the real orientation θ is now computed as a
trapezoidal fuzzy set fθ (Eq. 4.27):
fρ = (−ρ0, −ρ1, ρ1, ρ0)
fθ = (θ − atan(2ρ0/l), θ − atan(2ρ1/l), θ + atan(2ρ1/l), θ + atan(2ρ0/l))
(4.27)
where l is the length of the segment.
Each fuzzy segment provides information about the position and orientation of an object
boundary, and also, about the uncertainty in the robot’s knowledge about these values. Since
the segments are obtained from different sensors or from different positions of the robot, a
key aspect of the process of map building is detecting and combining the segments that might
have come from the same object boundary. Fuzzy segments that are found to be collinear are
combined and their respective uncertainties are propagated in the integration process. Two
fuzzy segments F1 and F2 are considered to be collinear if the following criterion is satisfied
(Eq. 4.28):
M(fθ1, fθ2) ≥ 0.5   ∧   M(fρ1, fρ2) ≥ 0.5
(4.28)
where
M(X,Y) = AXY·(AX + AY) / (2·AX·AY)
(4.29)
where AX and AY denote the area enclosed by the fuzzy sets X and Y respectively, and
AXY denotes the area of the intersection of X and Y. Once two collinear segments have been
detected, they have to be combined to form a boundary and their associated uncertainty has to
be propagated.
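The matching measure (Eq. 4.29) can be sketched with a numerical integration of the intersection area; representing trapezoids by their corner tuple (a, b, c, d) is an assumption of this sketch:

```python
def tz_mu(t, x):
    """Membership of x in the trapezoid t = (a, b, c, d)."""
    a, b, c, d = t
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

def tz_area(t):
    """Area under a height-1 trapezoid."""
    a, b, c, d = t
    return ((d - a) + (c - b)) / 2.0

def match(X, Y, steps=10000):
    """Matching measure (Eq. 4.29): intersection area weighted by the
    mean of the inverse areas. The intersection area is integrated
    numerically with a midpoint rule."""
    lo, hi = min(X[0], Y[0]), max(X[3], Y[3])
    dx = (hi - lo) / steps
    a_xy = sum(min(tz_mu(X, lo + (i + 0.5) * dx),
                   tz_mu(Y, lo + (i + 0.5) * dx)) for i in range(steps)) * dx
    return a_xy * (tz_area(X) + tz_area(Y)) / (2 * tz_area(X) * tz_area(Y))
```

With this formulation, two identical fuzzy sets give a measure of 1 and disjoint sets give 0, so the threshold 0.5 in (Eq. 4.28) sits halfway between the two extremes.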
A fuzzy boundary is a fuzzy segment obtained by successively combining all collinear
segments that intersect. Given two collinear fuzzy segments F1 and F2, the combination is a
new fuzzy segment Fr (Eq. 4.30):
Fr = {θr, ρr, fρr, (xir, yir), (xjr, yjr), k1 + k2}
(4.30)
where
θr = (k1·θ1 + k2·θ2) / (k1 + k2)
ρr = (k1·ρ1 + k2·ρ2) / (k1 + k2)
fρr = (2 − M(fρ1, fρ2)) · (k1·fρ1 ⊕ k2·fρ2) / (k1 + k2)
(4.31)
and (xir, yir), (xjr, yjr) are the extreme points of the perpendicular projections of (xi1, yi1),
(xj1, yj1), (xi2, yi2), (xj2, yj2) on the line (θr, ρr). A fuzzy segments map is a collection FM (Eq.
4.32) of fuzzy boundaries.
FM = {F1, F2, …, Fm}
(4.32)
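The combination step (Eqs. 4.30–4.31) can be sketched as follows, assuming the weighted means are normalised by k1 + k2 and that scaling and ⊕ act component-wise on the trapezoid corners:

```python
def combine(F1, F2, match_rho):
    """Combination of two collinear fuzzy segments (Eqs. 4.30-4.31,
    assumed normalisation by k1 + k2). Each segment is a hypothetical
    tuple (theta, rho, f_rho, k); match_rho is M(f_rho1, f_rho2)."""
    th1, r1, f1, k1 = F1
    th2, r2, f2, k2 = F2
    k = k1 + k2
    theta_r = (k1 * th1 + k2 * th2) / k
    rho_r = (k1 * r1 + k2 * r2) / k
    # The worse the rho match, the wider the combined uncertainty.
    f_r = tuple((2 - match_rho) * (k1 * a + k2 * b) / k
                for a, b in zip(f1, f2))
    return theta_r, rho_r, f_r, k
```

Note how a perfect match (M = 1) leaves the weighted trapezoid untouched, while a marginal match (M = 0.5) inflates it by a factor of 1.5.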
The original work included a method to generate segments based on single sensor scans.
The idea was to obtain some measurements from a single sensor. These measurements were
then preprocessed to eliminate easily detectable misreadings and to group together smoothly
changing measurements. Finally, each group of measurements was fitted to a straight line to
obtain fuzzy segments. This approach (Figure 4.9) combines the information from different
sensors at the segment level. If two segments from two different sensors are collinear, the
segments are combined and merged into a new segment.
Figure 4.9: Fuzzy segments map from single sensors.
While the original approach generates good maps, it lacks the ability to combine
measurements from different sensors at a sensor level. This makes the map generation process
slow, because a lot of time is needed to obtain good sensor measurements to fit a straight line.
The previous example (Figure 4.9) shows the trajectory followed by the robot (yellow line),
the fuzzy segments (blue and grey lines), and some sonar scans (orange dots). As can be
clearly seen, the bottom-right measurements could be fitted with a line, but as the points come
from different sensors, the previous approach does not generate a fuzzy segment there. To
overcome this problem, a new segment generation method has been developed.
The new method builds and maintains a circular sensor buffer in a heuristic way, which
is described below. Let n be the number of sensors, and let m be the number of measurements
stored in the buffer for each sensor. The number of entries in the sensor buffer is t = n x m.
When a series of new values for the n sensors is available, those measurements that are
smaller than a given threshold δlen overwrite the oldest previously stored values sequentially.
The sensor buffer is shown below (Figure 4.10), where light grey cells are the old values and
dark grey cells are the most recent sensor readings. The buffer thus constructed serves to
extract candidate points for the line fitting procedure.
Figure 4.10: Raw sensor buffer (left) and sensor buffer with segments (right).
To generate a fuzzy segment, at least four points are required to be considered as a line.
The algorithm to generate a line fitting is as follows:
(s1) Initialisation. Let O = {o1, o2, …, ot} be the sensor buffer. Let P be the set of
candidate points that constitute a segment. Let i be an arbitrary cell in the sensor
buffer, which for simplicity is set to 1. Let δmax be the maximum distance between
consecutive points on a line. Let δgap be the maximum number of outliers in a
segment, that is, the number of cells in a sequence that do not belong to a line. Let δang
be the maximum difference in orientation between two lines for them to be considered
as collinear.
(s2) First point. Let ilast be equal to i, let inext be equal to i + 1, and let g be equal to 0.
The value oi is set as the first candidate point to form the line, and consequently
P = {oi}. Increment the index i.
(s3) Second point. If d(oi, P1) < δmax, then let ilast be equal to i, let P = P ∪ {oi},
increment i, and go to step s4. Otherwise, increment i and g. If g > δgap then let i be
equal to inext, and go to step s2. Otherwise, repeat step s3.
(s4) Third point. Let α be the orientation of the line P1P2, and β be the orientation of
the line P2oi. If d(oi, P2) < δmax and |α − β| < δang, then let ilast be equal to i, let
P = P ∪ {oi}, increment i, and go to step s5. Otherwise, increment i and g. If g > δgap
then let i be equal to inext, and go to step s2. Otherwise, repeat step s4.
(s5) Extra points. Let u be the number of points in P. Let α be the orientation of the
line Pu−1Pu, and β be the orientation of the line Puoi.
(s5.1) If d(oi, Pu) < δmax and |α − β| < δang, then let ilast be equal to i, let
P = P ∪ {oi}, and increment i. If i = t, then generate a fuzzy segment with P and go
to step s2. Otherwise, repeat step s5.
(s5.2) Otherwise, increment i and g. If g > δgap and u ≥ 4, then generate a fuzzy
segment with P and go to step s2. Otherwise, if g > δgap then let i be equal to
inext, and go to step s2. Otherwise, repeat step s5.
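A condensed sketch of steps s1–s5 follows; the ilast/inext bookkeeping is simplified to a single restart index, so only the grouping logic of the original steps is kept:

```python
import math

def extract_segments(buffer, d_max=0.2, gap_max=10, ang_max=math.radians(15)):
    """Sketch of steps s1-s5: group consecutive buffer points into
    candidate segments of at least four points. The fuzzy-segment
    fitting itself (Eq. 4.20) is done afterwards on each group."""
    dist = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
    ang = lambda p, q: math.atan2(q[1] - p[1], q[0] - p[0])
    segments, i, t = [], 0, len(buffer)
    while i < t:
        P, g, i_next = [buffer[i]], 0, i + 1   # s2: first candidate point
        j = i + 1
        while j < t:
            o = buffer[j]
            # s3/s4/s5.1: accept if close enough and (from the third
            # point on) roughly collinear. Angle differences are not
            # wrapped; adequate for the short chains handled here.
            ok = dist(o, P[-1]) < d_max and (
                len(P) < 2 or abs(ang(P[-2], P[-1]) - ang(P[-1], o)) < ang_max)
            if ok:
                P.append(o)
                g = 0
            else:                               # s5.2: count an outlier
                g += 1
                if g > gap_max:
                    break
            j += 1
        if len(P) >= 4:                         # enough points: emit group
            segments.append(P)
            i = max(j, i + 1)
        else:                                   # restart after first point
            i = i_next
    return segments
```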
This method heuristically obtains more consistent segments in a given environment than
the original approach, while requiring fewer movements. The method has been tested with the
following values: n = 16, m = 10, δlen = 2 metres, δmax = 0.2 metres, δgap = 10, δang = 15º. An
example below (Figure 4.11) shows the trajectory followed by the robot (yellow line), the
fuzzy segments (blue and red lines), and some sonar scans (grey dots). As can be seen, the
trajectory followed by the robot is much smaller than in the previous example (Figure 4.9)
given the same environment, while there are many more consistent segments. This is a key
point for achieving a good localisation (see next subsection).
Figure 4.11: Fuzzy segments map from sensor buffer.
4.4.2 Fuzzy segments localisation
The fuzzy segments map representation is used for self-localisation (Gasós and Martín,
1996a). While the robot is navigating in an environment in order to accomplish a task, the
sensor observations are used to build a partial fuzzy segments map of the robot surroundings.
This map is compared to the global fuzzy segments map, which has been previously
constructed. When the matching results provide enough evidence of the coincidence of both
maps, the robot location is updated, based on their differences. This localisation task is run as
a continuous process during all the time the robot is moving in order to continuously update
its position.
The global map is built from scratch while keeping track of the areas that have been
already visited. For this task a coarse binary grid (1 x 1 metre cells) represents the areas
visited by the robot. While the robot moves through areas not yet visited, it adds new
segments to the global map and the cells are marked as visited. When the robot arrives at
visited areas, it adds the new segments to the local map. The local and global maps are
compared until a significant number of local segments match those in the global map. Then,
the transformation that brings both maps together is used to estimate the dead reckoning
errors accumulated in the time span between the construction of the maps. In this case, the
local map is added to the global map, allowing changes in the environment to be completed
and updated, and the process continues in the same way.
One key point in the localisation process is the correct matching between the local and
global map. If there are not enough perpendicular segments, the matching is not considered.
In other words, if only parallel segments match, there is not enough certainty in the position
and the localisation is not considered. Clearly, the greater the number of correct segments that
are generated, the better the localisation results. This is the main benefit achieved with the
segment generation scheme described in the previous subsection.
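The perpendicularity requirement can be sketched as a simple check on the orientations of the matched segments; the tolerance used here is a hypothetical parameter, not taken from this chapter:

```python
import math

def has_perpendicular(matched_thetas, tol=math.radians(30)):
    """Accept a local/global match only if the matched segments include
    a near-perpendicular pair. Parallel-only matches leave the position
    under-constrained along the segments' direction, so they are
    rejected. tol is a hypothetical tolerance."""
    for i, a in enumerate(matched_thetas):
        for b in matched_thetas[i + 1:]:
            d = abs(a - b) % math.pi          # orientation difference
            if abs(d - math.pi / 2) < tol:    # roughly perpendicular pair
                return True
    return False
```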
Figure 4.12: Fuzzy segments localisation.
An example is shown above (Figure 4.12), which represents the robot trajectory (yellow
line), the current local map (blue segments), the visited areas map (light grey boxes), and the
global map (dark grey segments). The large odometric error in the fifth complete round
of the environment can be noticed; previous rounds produced errors of the same magnitude. The
localisation succeeds in finding the transformation between the local map and the global map.
4.4.3 Experiments and results
Different experiments have been conducted to test the simultaneous map-building and
self-localisation. One of these tests is shown below (Figure 4.13). The Quaky-Ant robot is
simulated in autonomous mode (see chapter 6), and has to navigate from a given home
position (bottom-right corner) to a goal location (top-left corner) and then return to the home
position in an a priori unknown environment. The different slides show the evolution of the
sensor buffer (top image) and the local and global fuzzy segments maps (bottom image).
a) Starting the run.
b) In the corridor.
c) Reaching the goal.
d) Returning to the home.
e) Crossing a doorway.
f) Reaching the home.
Figure 4.13: Simultaneous map-building and self-localisation.
The robot starts (position “a”) with no previous knowledge, and updates the global map
while it is moving through unexplored areas (positions “b” and “c”). When the robot reaches a
visited area it starts matching the local and global map (position “d”) and correcting the
odometry (position “e”) until it reaches the home position (position “f”).
4.5 Conclusions
Two navigation schemes have been developed to allow the Quaky-Ant robot to navigate
in a priori unknown environments. The first procedure relies on the assumption of reliable
odometry for short time and distance operations. Bearing this assumption in mind, the robot is
provided with a map building method and a path planning algorithm, which are run
simultaneously and thus the robot is required to build the map while it is moving, resulting in
a continuous movement. This approach is based on a fuzzy grid map for map-building and an
A* based algorithm for path-planning. The fuzzy grid map has been modified to accomplish
the required task: the virtual sensors are used instead of sonars, the initialisation procedure
has been changed, and some heuristics have been incorporated to modify the certainty of
visited cells. The A* based planner has been defined with a new evaluation function to
generate paths that leave enough space between the robot and the obstacles, and also to be
efficient in time, so that the node evaluation is fast enough to run the planner at a high
frequency. The navigation scheme has been successfully used in different, both simulated and
real world, scenarios.
The second procedure overcomes the limitation of the previous approach by adding a
localisation method. This is accomplished by building fuzzy segment maps that model the
previously visited environment and the current sensed environment. The transformations
needed to match these maps are used to correct the current robot’s position. The performance
of the localisation method is greatly affected by the number and quality of the fuzzy
segments. The segments generation procedure has been modified to make use of a sensor
buffer, which produces more consistent segments. Although the motivation behind the
navigation procedure was to make the overall work of the system more robust, it also proved
to be capable of solving the SLAM (Simultaneous Localisation And Map-building) problem
in the different, both simulated and real world, tests carried out. More tests are still needed to
verify or prove in which situations or environments this set-up is valid for the SLAM
problem.
Chapter 5
Control Architecture
5.1 Introduction
Previous chapters have dealt with robot hardware, sensor processing and navigation
algorithms. This chapter deals with frameworks to group those systems, concepts and
algorithms into a working system by providing a processing structure. This section starts by
introducing some concepts and previous works related to control architectures. Other sections
then describe:
• A new control architecture, putting emphasis on its characteristics and relating it to
previous works.
• A new robot programming language, presenting its syntax and semantics and a comparison
with other robotic languages.
• A sample control system, which has been developed using the architecture and language
proposed, intended to control the Quaky-Ant robot.
Finally, some examples are presented using the development environment built upon
that language and architecture, and some remarks about the implementation of the system
are given.
5.1.1 Deliberative and reactive control architectures
The operation of a mobile autonomous robot in an unstructured environment, as occurs
in the real world, needs to take into account many details. A main function of the controller is
to be able to operate under conditions of imprecision and uncertainty. For instance, the a
priori knowledge of the environment is, in general, incomplete, uncertain, and approximate.
Typically, the perceptual information acquired is also incomplete and affected by noise.
Moreover, the execution of a control command is not completely reliable due to the
complexity and unpredictability of the real world dynamics.
Early mobile robots were more concerned with the process of actually getting the robot
to move rather than with considering the limitations of the control architecture on the
performance of the vehicle. As an example, the Stanford CART (Figure 1.2) was capable of
performing limited point-to-point navigation in an unknown but static environment. The
operational cycle (Moravec, 1990) started with the robot in a halted position, taking some
pictures, and taking a long time to process them. Then it planned a new path, executed a little
bit of it, and paused again. This process was repeated until the robot achieved its goal. The
structure of the software that controlled the CART can be thought of as a sequence of separate
computational units or modules, each one of which is processed serially with the output of
one action forming the input to the next. Although the overall system performance seems low
by modern standards, the structure of the underlying system architecture is the basis of many
of today's mobile robot systems. This horizontal or functional decomposition of the robotic task
forms the basis of deliberative architectures (Figure 5.1). A deliberative architecture (Nilsson,
1980) breaks the problem of driving the robot into separate functions or components, each of
which must be processed in turn, with the output of one module acting as the input to the next.
Traditional deliberative architectures are based on sensing followed by planning. This is also
known as the sense-plan-act or sense-model-plan-act cycle. These deliberative architectures
must deal with the problem of organising information and its representation within the robot.
Two standard mechanisms have emerged to address this issue: hierarchical and blackboard
systems.
Figure 5.1: Deliberative architecture.
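The serial sense-plan-act cycle can be sketched as follows. This is a minimal illustration in a one-dimensional world; the module bodies and the stop condition are purely illustrative, not taken from any particular system.

```python
# Minimal sketch of the sense-plan-act cycle of a deliberative
# architecture: each module runs serially, the output of one forming
# the input of the next. The 1-D world model is illustrative only.

def sense(world):
    # Perception: return an (imperfect) snapshot of the environment.
    return dict(world)

def plan(model, goal):
    # Deliberation: compute a sequence of actions from the model.
    # Here, a trivial 1-D planner stepping towards the goal.
    steps = goal - model["position"]
    return [1 if steps > 0 else -1] * abs(steps)

def act(world, action):
    # Execution: apply one action to the world.
    world["position"] += action

def sense_plan_act(world, goal):
    # Sense, plan, execute a little bit of the plan, then re-plan,
    # as in the operational cycle of the Stanford CART.
    while world["position"] != goal:
        model = sense(world)
        path = plan(model, goal)
        act(world, path[0])
    return world["position"]
```

Note that the robot is effectively halted while `plan` runs, which is exactly the latency problem discussed below.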
Hierarchical systems decompose the control process by function. Low-level processes
provide simple functions that are grouped together by higher-level processes in order to
provide overall vehicle control. The NASA-NBS Standard Reference Model for Telerobot
Control System Architecture (NASREM) is a classical example of a hierarchical control
system (Albus et al., 1994), although it originally included both a hierarchical and a
blackboard control architecture. The Autonomous Benthic Explorer (Bradley et al., 1990)
underwater robot provides another typical example of hierarchical control. It utilises a
distributed hierarchical control architecture organised in two broad layers. The lower layer
provides low-level control of the robot’s sensors and actuators, whereas the higher layer
provides mission planning and operational activity monitoring of the lower level.
HUMBERTO MARTÍNEZ BARBERÁ
Blackboard systems (Hayes-Roth, 1985) decompose the control task into separate units
and attempt to minimise the communication between units. Blackboard based systems rely on
a common pool of information (or blackboard) shared by the independent computational
processes that act on the blackboard. Fundamental to any blackboard-based system is a
mechanism to provide efficient communication between the various computational processes
and the blackboard. These systems provide a loose coupling between each subsystem,
allowing the straightforward exchange of information between the subsystems, and they
provide a clear and consistent representation of information that can be used by each
individual subsystem. The Ground Surveillance Robot (GSR) system (Harmon, 1987), an
autonomous robotic system designed to navigate from one known geographical location to
another over unknown natural terrain, is a typical blackboard-based system. GSR uses a
blackboard to represent information and to pass information from one software module to
another. Although blackboard based systems offer a natural mechanism for sharing
information to a number of parallel computational tasks, the shared database can lead to
bottlenecks in terms of processing. In addition, the asynchronous nature of the blackboard
system can make software development difficult and can lead to subtle timing errors between
the various computational modules.
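A blackboard can be sketched as a common data pool queried and updated by otherwise independent modules. The module names and keys below are illustrative, not drawn from any of the systems cited.

```python
# Minimal blackboard sketch: independent computational processes
# communicate only through a shared pool of information, without
# knowing about each other.

class Blackboard:
    def __init__(self):
        self._data = {}

    def post(self, key, value):
        self._data[key] = value

    def read(self, key, default=None):
        return self._data.get(key, default)

def perception_module(bb):
    # Writes sensor readings onto the blackboard.
    bb.post("range_front", 0.4)

def avoidance_module(bb):
    # Reads another module's output without knowing who produced it.
    if bb.read("range_front", float("inf")) < 0.5:
        bb.post("command", "turn")

bb = Blackboard()
perception_module(bb)
avoidance_module(bb)
```

The loose coupling is apparent: either module can be replaced as long as the shared keys keep their meaning, but the shared pool is also where the processing bottlenecks and timing subtleties mentioned above arise.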
By 1985 it was becoming clear that deliberative model-based systems had many
shortcomings. The standard criticism of these systems is the structure’s inability to react
rapidly, for example in the case of an emergency. These systems tend to have a long latency
time built into the system, although there have been attempts to address this by specifying
temporal constraints (Tsotsos, 1997). In addition to latency problems, deliberative systems
often make the traditional assumption that the world remains static between successive
activations of the perception module. Conditions that violate this assumption can cause
problems for the robot, for example moving obstacles. As quoted from Gat (Gat, 1998):
“This problem often manifests itself as running researcher syndrome,
characterised by having to chase the robot to push the emergency stop
button after it makes a wrong turn.”
Some researchers in the mid-1980s suggested that a different execution mechanism was
needed (Payton, 1986). The reason: mobile robots were increasingly required to navigate and
perform purposeful autonomous tasks in more complex domains, in real world like
environments. These requirements demanded some reactive capacity in their navigation
systems.
Whereas deliberative systems are based upon planning, reactive architectures (Figure
5.2) directly couple sensors and actuators. The reactive control paradigm is based on animal
models of intelligence: the overall action of the robot is decomposed by behaviour rather than
by a deliberative reasoning process. Common to most reactive systems is that goals may not
be represented explicitly, but rather they are encoded by the individual behaviours that
operate within the controller. One key point in reactive systems is how to coordinate the
different behaviours. There are two modes of behaviour coordination: cooperative and
competitive. The cooperative model (Figure 5.4) provides the ability to concurrently use the
outputs of more than one behaviour at a time. Since the behaviours are combined with a set of
gains, the output tends to be smooth with sensor noise. However, it is prone to local minimum
traps or deadlock situations arising from possible conflicts among the different behaviours.
On the other hand, the competitive model (Figure 5.3) selects only one behaviour as the final
output. Consequently, it bypasses the local minimum problem but at the expense of not being
smooth with sensor noise and also having the problem of limit cycles.
Figure 5.2: Reactive architecture.
To address the coordination problem some different reactive architectures have
emerged such as subsumption (Brooks, 1986), motor schema (Arkin, 1989), reflexive
behaviour (Payton, 1986) and situated agents (Agre et al., 1987).
Subsumption control systems (Brooks, 1986), perhaps the best known of the reactive
architectures, consist of a number of behaviour modules arranged in a hierarchy. Different
layers in the architecture take care of different behaviours. Lower layers control the most
basic behaviours of the device, while the higher behaviour modules control more advanced
functions. The overall structure is similar to hierarchical motor patterns as they are used in
ethology. The subsumption architecture introduced the concept of providing levels of
competence, where each level includes all earlier levels. The behaviours are organised in a
static hierarchy in which lower levels of competence have an implicit priority over higher
ones. Then, the final emergent behaviour of the system is the set of reactions that emerge
from the various modules provided to achieve each of these levels of competencies. Lower
level layers interface to higher level ones by suppressing the results of the higher level
computations and superseding their results. That is, it is based on a competitive coordination
model (Figure 5.3).
Figure 5.3: Competitive behaviour coordination.
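The competitive (winner-takes-all) model of subsumption can be sketched as a priority scan over the layers, where a lower level of competence that produces an output suppresses the higher ones. The layer functions and thresholds are illustrative.

```python
# Sketch of competitive coordination as used in subsumption: layers
# are scanned from the lowest (highest-priority) level of competence;
# the first layer that claims the actuators suppresses the rest.

def avoid(sensors):
    # Level 0: most basic competence; fires only near obstacles.
    if sensors["range"] < 0.5:
        return {"turn": 90, "speed": 0.0}
    return None          # does not claim the actuators

def wander(sensors):
    # Level 1: higher competence, effective only if level 0 is silent.
    return {"turn": 0, "speed": 0.3}

def arbitrate(sensors, layers):
    # Exactly one behaviour provides the final output at any time.
    for layer in layers:
        command = layer(sensors)
        if command is not None:
            return command
    return {"turn": 0, "speed": 0.0}

cmd = arbitrate({"range": 0.3}, [avoid, wander])
```

Because the selection is discrete, the output can switch abruptly between layers as sensor readings cross a threshold, which is the noise sensitivity and limit-cycle problem noted above.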
In its original presentation, subsumption is, at least ostensibly, not a radical departure
from deliberative control at all, but rather an attempt to make deliberative architectures more
efficient by applying task-dependent constraints to the subsumption layers. Where
subsumption radically departs from deliberative models is in its repudiation of plans, and
indeed of symbolic representation in general. Time-consuming computations like planning
and world modelling generate an internal state whose semantics reflect world states, whether
they are past, present or future. Deliberative approaches get into trouble when the robot’s
internal state loses sync with the reality that it is intended to represent. The reactive solution
to this problem is to minimise the use of internal state as much as possible. If there is no state,
then it cannot lose sync with the world, sometimes stated as:
“The world is its own best model.”
Subsumption reached a pinnacle with a robot called Herbert, which was programmed to
find and retrieve soda cans in an office environment. While Herbert’s capabilities were
impressive even by today’s standards, it also appeared to represent the limits of what could be
achieved with subsumption. Herbert was very unreliable and no subsumption-based robot
seems to have ever matched its capabilities since. One possible cause of subsumption’s
apparent capability ceiling is that the architecture lacks mechanisms for managing
complexity. The years following the introduction of subsumption saw a profusion of new
robot control architectures, some developed more or less independently (Arkin, 1990) and
others introduced as a direct response to subsumption’s shortcomings (Rosenblatt et al.,
1989).
Motor schema systems (Arkin, 1989) are characterised by their neuroscientific and
psychological plausibility, the absence of hard arbitration between behaviours, the fusion of
behavioural outputs through the use of vector summation, and the inherent flexibility due to
the dynamic instantiation and deinstantiation of behaviours on an as-needed basis. Motor
schemas consist of individual motor behaviours, each of which reacts to sensory information
obtained from the environment. The output of each schema is a velocity vector that represents
the direction and speed at which the robot is to move for that schema given the current sensed
view of the environment. Individual schemas are simply the vector output of a computation
based on a local internal state and the current input sensor values of the robot. Each schema is
straightforward to compute. At any point in time a given set of schemas is active, and the
outputs of the members of the active set of motor schemas are combined as a normalised
vector sum. That is, the behaviour coordination model is cooperative (Figure 5.4), opposed to
the winner-takes-all approach of subsumption architecture.
Figure 5.4: Cooperative behaviour coordination.
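Cooperative coordination by normalised vector summation, in the spirit of motor schemas, can be sketched as follows. The schema functions and gains are illustrative.

```python
# Sketch of cooperative coordination: each active schema returns a
# velocity vector, and the active set is blended as a weighted,
# normalised vector sum rather than by winner-takes-all selection.
import math

def goal_schema(robot, goal):
    # Attractive unit vector towards the goal.
    dx, dy = goal[0] - robot[0], goal[1] - robot[1]
    d = math.hypot(dx, dy) or 1.0
    return (dx / d, dy / d)

def obstacle_schema(robot, obstacle):
    # Repulsive unit vector away from the obstacle.
    dx, dy = robot[0] - obstacle[0], robot[1] - obstacle[1]
    d = math.hypot(dx, dy) or 1.0
    return (dx / d, dy / d)

def combine(vectors, gains):
    # Weighted vector sum, normalised to a unit command.
    x = sum(g * v[0] for g, v in zip(gains, vectors))
    y = sum(g * v[1] for g, v in zip(gains, vectors))
    n = math.hypot(x, y) or 1.0
    return (x / n, y / n)

v = combine([goal_schema((0, 0), (10, 0)),
             obstacle_schema((0, 0), (0, -5))],
            gains=[1.0, 0.5])
```

The blending makes the output smooth with respect to sensor noise, but opposing schema vectors can cancel out, which is precisely the local-minimum trap discussed above.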
5.1.2 Hybrid control architectures
If a reactive system can be built to provide control for a limited task, one obvious
extension is to build different reactive modules for different portions of the robot’s mission
and to swap between the different reactive modules as required. Various mechanisms have
been proposed to allow the specification of which set of behaviours should be operational and
for how long. These include BDL (Rochwerger et al., 1994), RAP (Firby, 1994), DAMM
(Payton, 1986), RPL (McDermott, 1991), and many others.
The Behaviour Description Language (BDL) was designed for the integration of
multiple behaviours on a task-specific basis (Rochwerger et al., 1994) in an autonomous all-terrain robot (based on an HMMWV platform) for driving on and off roads. Overall control
of the robot is accomplished via a collection of independent perception-action processes that
are executed concurrently. Each process converts sensory data into some type of action. A
global blackboard is used for communication between the processes. The overall behaviour of
the robot is defined by the set of processes currently running. BDL is based on an augmented
finite state machine model. Transitions between states are controlled by external events, and
associated with each state is a run set (the minimum set of behaviours to be executed) and a
kill set (the behaviours that must not be executed when the robot is in this state). This model
provides a mechanism to allow different sets of behaviours to be enabled under specific
circumstances.
The Reactive Action Packages system (RAP) provides an architecture to integrate and
interface between reactive control modules (Firby, 1989)(Firby, 1994). It allows continuous
reactive primitives to be grouped together and executed in sequence or in parallel to perform
complex tasks. Underlying the RAP system is the concept that individual reactive behaviours
are executed in parallel and the state of the individual behaviours, the robot and the
environment can be queried by the RAP controller. On the basis of this state information and
on asynchronous events generated by the reactive behaviours, different behaviours can be
scheduled and sequenced to generate more complex plans.
These systems, in general, are not taskable, that is, it is not possible to change the task
they perform without rewriting their control program. This is due to the fact that the
behaviour combination mechanism implicitly specifies the task. The natural evolution of these
systems is what is called hybrid architectures, which combine reactive controllers and
deliberative modules. At least three different groups of researchers working more or less
independently came up with very similar solutions to the tasking problem at about the same
time (Connell, 1992)(Gat, 1991)(Bonasso, 1991). All three solutions consisted of control
architectures that comprised three main components: a reactive feedback control mechanism,
a slow deliberative planner, and a sequencing mechanism that connected the first two
components. Connell’s sequencer was based on subsumption, Bonasso used the REX/GAPPS
system (Kaelbling, 1987), and Gat’s was based on RAP. Bonasso’s group later adopted RAP
as their sequencing component, while Gat’s sequencer was developed into the new language
ESL (Gat, 1997). The RAP-based architecture is called 3T, Connell’s subsumption-based
architecture is called SSS, and Gat’s architecture is called ATLANTIS.
Figure 5.5: Three-Layer hybrid architecture.
These architectures, known as three-layer architectures (Gat, 1998), organise
algorithms according to whether they contain no state, contain state reflecting memories about
the past (world models), or contain state reflecting predictions about the future (plans). The
three-layer architecture consists of three components (Figure 5.5): a reactive feedback control
mechanism, a reactive plan-execution mechanism, and a mechanism for performing time-consuming deliberative computations. These components run as separate computational
processes. This is most easily accomplished by using a multitasking or multithreading
operating system, but it can also be done by carefully structuring the code so that the components
are interleaved within a single computational process. Several names have been proposed in
the literature for these layers. For instance, in the 3T architecture they are called skill layer,
sequencing layer, and planning layer, and in the ATLANTIS architecture they are called
controller, sequencer, and deliberator. In the context of this thesis they are referred to as
controller, sequencer, and planner.
The controller consists of one or more execution threads that implement one or more
feedback control loops, tightly coupling sensors to actuators. The transfer function(s)
computed by the controller can be changed at run time. Usually the controller contains a
library of transfer functions called basic reactive behaviours. Which ones are active at any
given time is determined by an external input to the controller. Classic examples of basic
reactive behaviours are wall following, collision avoidance, moving to a close destination, etc.
There are several architectural constraints on the algorithms that go into the controller:
• Computing one iteration of the transfer function should be of constant-bounded time and
space complexity, and this constant should be small enough to provide stable closed-loop
control for the desired behaviour.
• Algorithms in the controller should be designed to detect any failure to perform the
function for which they were designed. Rather than attempting to design algorithms that
never fail (which is impossible on real robots), it is more worthwhile to design algorithms
that never fail to detect a failure. This allows other components of the system to take
corrective actions to recover from the failure.
• The use of internal state should be avoided whenever possible. An important exception to
this rule is filtering algorithms, which can be used in the controller although they rely on
internal state. If internal state is to be used for other purposes, it should expire after some
constant-bounded time.
• Internal state in the controller should not introduce discontinuities in a behaviour, in the
mathematical sense. It is the responsibility of the sequencer to manage transitions between
regimes of continuous operation.
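These constraints can be illustrated with a minimal wall-following behaviour: a constant-time, stateless transfer function that reports failure explicitly instead of failing silently. The gains and thresholds are illustrative.

```python
# Sketch of a basic reactive behaviour respecting the constraints
# above: constant-bounded iteration time, no persistent internal
# state, a continuous transfer function, and explicit failure
# detection so the sequencer can take corrective action.

def follow_wall(range_side, target=0.5, gain=2.0):
    """One constant-time iteration: sensor reading -> (command, ok)."""
    if range_side is None or range_side > 3.0:
        # The wall has been lost: the behaviour cannot perform its
        # function, so it reports failure rather than guessing.
        return (0.0, False)
    # Proportional, continuous transfer function on the wall distance.
    turn_rate = gain * (range_side - target)
    return (turn_rate, True)
```

A sequencer that receives `(0.0, False)` can then switch to a recovery behaviour, rather than the robot silently drifting away from the wall.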
A number of special-purpose languages have been developed for programming the
controller, like ESL (Gat, 1991). Most of the special-purpose languages for programming the
controller were developed at a time when robots could support only very small processors for
which no other development tools were available. One current trend is simply to program the
controller in C.
The job of the sequencer is to select which basic reactive behaviour, or set of basic
reactive behaviours, the controller should use at a given time, and to supply parameters for the
behaviour. By changing primitive behaviours at strategic moments, the robot can be led into
performing useful tasks. The problem is that the outcome of selecting a particular basic
reactive behaviour (or set) in a particular situation might not be the intended one, and so a
simple linear sequence of behaviours is unreliable. The sequencer must be able to respond
conditionally to the current situation, whatever it might be. One approach to the problem is to
enumerate all the possible states the robot can be in and pre-compute the correct primitive to
use in each state for a particular task. In some constrained environments this may be a
tractable problem. However, this universal plan approach has two drawbacks: it is often not
possible for a robot to know its current state, and it does not take into account the previous
history, which often contains useful information.
A practical alternative is conditional sequencing, which is a more complex model of
plan execution motivated by human instruction following. Conditional sequencing systems
include constructs for responding to contingencies and managing multiple parallel interacting
tasks. It is possible to construct a conditional sequencing system in a traditional programming
language like C, but because the control constructs are so much more complex than those
provided by such languages, conditional sequencing is much more effectively done with
special purpose languages. These include RAP (Firby, 1989), PRS (Georgeff et al., 1987),
REX/GAPPS (Kaelbling, 1987), Behaviour Language (Brooks, 1989), or ESL (Gat, 1997).
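The essence of conditional sequencing can be sketched as an augmented state machine that selects which basic behaviour the controller runs and reacts to the outcome of each step rather than following a fixed linear plan. The states, events and behaviour names below are illustrative, not taken from RAP or any of the languages cited.

```python
# Sketch of a conditional sequencer: transitions depend on events
# reported by the controller, so the plan responds to contingencies
# instead of executing a fixed linear sequence.

TRANSITIONS = {
    # (state, event) -> next state
    ("go_to_door", "arrived"): "cross_door",
    ("go_to_door", "blocked"): "find_other_door",
    ("cross_door", "arrived"): "done",
}

BEHAVIOUR_OF = {
    # Basic reactive behaviour selected in each state.
    "go_to_door": "move_to_goal",
    "cross_door": "move_to_goal",
    "find_other_door": "wander",
}

def sequence(events):
    state, trace = "go_to_door", []
    for event in events:
        trace.append(BEHAVIOUR_OF.get(state, "idle"))
        # Unknown (state, event) pairs leave the state unchanged.
        state = TRANSITIONS.get((state, event), state)
    return state, trace

final, trace = sequence(["arrived", "arrived"])
```

The same initial state leads to different behaviour sequences depending on the contingencies encountered, which is what makes the system taskable without rewriting the behaviours themselves.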
There are two major approaches to the design of conditional sequencing languages.
They can be complete languages on their own (e.g. RAP and PRS) or they can be layered on
top of a syntactically extensible programming language like LISP (e.g. Behaviour Language
and ESL). In any case, the sequencer should not perform computations that take a long time
compared to the rate of environment change at the level of abstraction presented by the
controller. Exactly how long they take depends on both the environment and the repertoire of
behaviours. Usually this implies that the sequencer should not perform any search or temporal
projection.
The planner is the locus of time-consuming computations. Usually this means such
things as exponential search-based algorithms, but it could also include polynomial-time
algorithms with large constants (such as certain vision processing algorithms). The key
architectural feature of the planner is that several behaviour transitions can occur between the
time a deliberative algorithm is invoked and the time it produces a result. There are no
architectural constraints on algorithms in the planner, which usually are written in standard
programming languages. The planner can interface to the rest of the system in two different
ways. It can produce plans for the sequencer to execute (e.g. 3T architecture) or it can respond
to specific queries from the sequencer (e.g. ATLANTIS architecture), although these two
approaches are not mutually exclusive.
5.1.3 Behaviour definition
The three-layer architecture specifies how to organise different algorithms and impose
some constraints on them. One key point is how to formulate the behaviours that compose the
controller while sticking to the restrictions imposed: constant-bounded execution time, failure
detection, no internal state usage, and continuous transfer function. In fact, behaviours can be
viewed from a classical control point of view: they have to provide closed-loop feedback
control (Figure 5.6). There are many ways to formulate reactive behaviours, drawing on
control theory, potential field based methods, neural networks, and fuzzy logic, to cite just a
few. Examples of some of them, and their applicability to mobile robotics, are presented
below.
Figure 5.6: Typical closed-loop feedback based behaviour.
Fuzzy logic provides tools that are of potential interest to mobile robot control (Saffiotti,
1999). Most applications of fuzzy logic in this field concern the use of fuzzy control
techniques to implement individual behaviour units. Fuzzy controllers are a convenient choice
when an analytical linear model of the system to be controlled cannot be easily obtained.
Fuzzy controllers have shown a good degree of robustness in the face of large variability and
uncertainty in the parameters, and they lend themselves to efficient implementations,
including hardware solutions. These characteristics fill the requirements for reactive
behaviours well: a mathematical model of the environment is usually not available, sensor
data is uncertain and imprecise, and bounded time execution (that is, real time operation) is of
essence. For this reason fuzzy control was the first application of fuzzy logic techniques
in this domain, starting with Sugeno and Nishida’s model car (Sugeno et al., 1985), and it is still the
most common (Voudouris et al., 1994)(Surmann et al., 1995). Fuzzy logic has later been
applied for path-tracking (Sánchez et al., 1997)(Ollero et al., 1998), simulated (Biris and
Shen, 1997) and real world navigation (García-Alegre et al., 1993)(Topalov et al., 1998) (Xu
and Tso, 1999), low-level car motor control (Reyes et al., 1999), obstacle avoidance (Stanev
et al., 1998)(Martínez et al., 1994), and wall-following (Urzelaiz et al., 1997)(García-Rosa
and García-Alegre, 1990).
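A minimal fuzzy behaviour of this kind can be sketched as follows: triangular membership functions, two rules and weighted-average defuzzification. The membership parameters and rule base are illustrative, not taken from any of the controllers cited.

```python
# Sketch of a fuzzy obstacle-avoidance behaviour: the input range is
# fuzzified with triangular membership functions, two rules fire to a
# degree, and a crisp turn command is obtained by weighted average.

def tri(x, a, b, c):
    # Triangular membership function with support [a, c] and peak at b.
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzy_avoid(front_range):
    # Fuzzify the input reading.
    near = tri(front_range, -0.1, 0.0, 1.0)
    far = tri(front_range, 0.5, 2.0, 3.5)
    # Rules: IF near THEN turn hard (60 deg); IF far THEN go straight.
    rules = [(near, 60.0), (far, 0.0)]
    total = sum(w for w, _ in rules)
    if total == 0.0:
        return 0.0
    # Weighted-average (Sugeno-style) defuzzification.
    return sum(w * out for w, out in rules) / total
```

Because the rule activations overlap, the output varies smoothly as the obstacle gets closer, instead of switching abruptly at a threshold.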
Fuzzy modelling, as a complement to the conventional modelling techniques, has
become an active research topic and found successful applications in many areas. However,
most existing fuzzy models have been built based only on the operator’s experience and
knowledge, but when a process is complex there may be no such expert (Wang et al., 1996).
This can be the case when developing fuzzy behaviours for robot control, for example when
there are many input variables and the control of the plant is not simple. In this kind of
situation the use of learning techniques is of fundamental importance. The problem can be
stated as follows. Given a set of data for which we presume some functional dependency, the
question arises whether there is a suitable methodology to derive fuzzy rules from these data
which characterise the unknown function as precisely as possible. Recently, several
approaches have been proposed for automatically generating fuzzy if-then rules from
numerical data without domain experts (Wang and Mendel, 1992)(Ishibuchi et al., 1995)(Joo
et al., 1997)(Gómez-Skarmeta and Jiménez, 1997a)(Gómez-Skarmeta and Jiménez, 1997b).
Mobile robotics applications using fuzzy modelling include, for instance, wall-following
(García-Cerezo et al., 1997), obstacle avoidance (Fukuda and Kubota, 1999), moving
obstacles avoidance (Pratihar et al., 1999), parking of truck and trailer (Denna et al., 1999),
and robot stabilisation (Kwon et al., 1998).
Neural networks have an important application area in the field of mobile robotics
because of their adaptive and learning properties (Kröse and van Dam, 1997). Neural
networks can be used for the control of systems with unknown or complex behaviour, where
the operator’s experience can be used to obtain a set of examples to train the network. In the
case of robotics, neural networks are applied for the control of nonlinear systems because of
their ability to learn a nonlinear model from examples. Moreover, neural networks can be
adaptive, in the sense that they continue learning during operation. This allows for continuous
improvement of the controller while in operation. One of the first examples of neural sensor-motor mapping is the Autonomous Land Vehicle in a Neural Net, ALVINN (Pomerleau,
1991). The neural network controlled the CMU’s Navlab vehicle on a road-following task
without having a model of the road. It learned associations between visual patterns and
steering commands. There are many successful uses of neural networks for controlling mobile
robots in the literature (Opitz and Lernfahrzeug, 1990)(Glasius et al., 1995)(Nguyen and
Widrow, 1990)(Aguilar y Contreras, 1994)(Zalama et al., 1995). Nevertheless, the
effectiveness of the neural networks is completely restricted by the training patterns, since the
robot is unable to react properly to an environment which does not fall in the space specified
by the training patterns, on account of the poor exploration property of neural networks (Xu
and Tso, 1999).
Potential field methods (PFM) for obstacle avoidance have gained increasing popularity
among researchers in the field of mobile robots. PFMs are based on the idea of
imaginary forces acting on a robot (Andrews and Hogan, 1983)(Khatib, 1985). In these
approaches obstacles exert repulsive forces onto the robot, while the target applies an
attractive force to the robot. The sum of all forces, the resultant force, determines the
subsequent direction and speed of travel. The popularity of PFMs is due to several
advantages: they are simple and elegant, can be implemented quickly, and initially provide
acceptable results without requiring many refinements. Early implementations
of PFMs appeared in purely reactive robots (Brooks, 1986)(Arkin, 1989) and in off-line systems
for path planning (Thorpe, 1985). Generalised PFMs were proposed as an attempt to
combine local and global path planning (Krogh, 1984)(Krogh and Thorpe, 1986).
One interesting implementation of PFM is the virtual force field (VFF) method
(Borenstein and Koren, 1989). The VFF method is based on the force field (FFM) method,
another early PFM implementation (Brooks, 1986)(Brooks and Connel, 1986), and the
certainty grid (see the section on maps in chapter 4)(Moravec and Elfes, 1985). As the robot
moves, range readings are taken and projected into the certainty grid, called histogram grid.
The algorithm uses a small square window of the grid, such that the robot is centred in the
window. Each occupied cell inside the window applies a repulsive force to the robot, which is
inversely proportional to the square of the distance. Thus the resultant repulsive force is the
vectorial sum of all individual repulsive forces. At any time during the motion, a constant
magnitude force pulls the robot towards the target. Finally, the resultant of the vectorial sum
of the global repulsive force and the goal’s attractive force is used as the reference for the
robot’s steering rate command.
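The force computation behind the VFF method can be sketched as follows: each occupied cell repels the robot with a force inversely proportional to the squared distance, a constant-magnitude force attracts it towards the target, and the direction of the resultant is the steering reference. The constants are illustrative.

```python
# Sketch of the virtual force field (VFF) resultant: repulsion from
# occupied histogram-grid cells (~ 1/d^2) plus constant-magnitude
# attraction towards the target; the resultant's direction is used
# as the steering reference.
import math

def vff_direction(robot, occupied_cells, target,
                  f_repulse=1.0, f_attract=1.0):
    fx = fy = 0.0
    for (cx, cy) in occupied_cells:
        dx, dy = robot[0] - cx, robot[1] - cy
        d2 = dx * dx + dy * dy or 1e-9
        d = math.sqrt(d2)
        # Repulsion ~ 1/d^2, directed away from the occupied cell.
        fx += f_repulse * dx / (d * d2)
        fy += f_repulse * dy / (d * d2)
    dx, dy = target[0] - robot[0], target[1] - robot[1]
    d = math.hypot(dx, dy) or 1e-9
    # Constant-magnitude attraction towards the target.
    fx += f_attract * dx / d
    fy += f_attract * dy / d
    return math.atan2(fy, fx)
```

When the repulsive and attractive contributions cancel, the resultant vanishes: this is the local-minimum trap listed among the inherent problems of PFMs below.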
Despite the initial soundness of the VFF method, there are four problems that are
inherent to PFMs, independently of their particular implementation (Koren and Borenstein,
1991):
• They produce trap situations due to local minima.
• They fail to find a passage between closely spaced obstacles.
• There are oscillations in the presence of obstacles.
• There are oscillations in narrow passageways.
The experience with PFMs led to the development of the vector field histogram
(VFH) method for point robots (Borenstein and Koren, 1991) and later for non-point robots
(Borenstein and Raschke, 1992). The VFH method builds the histogram grid the same way as
VFF does. However, VFH introduces an intermediate data representation, called the polar
histogram, that retains the statistical information of the histogram grid, but reduces the
amount of data that needs to be handled in real-time. The polar histogram is an array, where
each cell represents an arc sector. The certainty values of the active window are mapped onto
the polar histogram, and the resulting values can be viewed as the instantaneous polar
obstacle density. The algorithm then computes the required steering direction in the same way
as the VFF, from the goal’s orientation and the polar histogram. Some modifications to the
original VFH have been published recently, namely the VFH+ (Ulrich and Borenstein, 1998)
and the VFH* (Ulrich and Borenstein, 2000) methods. There are several more approaches to
the PFM problem in the literature. Some of them introduce fictitious goals or attractors
(Maravall et al., 2000), and others are based on different concepts, like variable thermal
conductivity (Wang and Chirikjian, 2000).
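The polar histogram mapping at the heart of VFH can be sketched as follows: certainty values in the active window are accumulated into angular sectors around the robot, and the steering direction is chosen among the sectors of low obstacle density nearest the goal direction. The sector count and density threshold are illustrative.

```python
# Sketch of the VFH polar histogram: window cells are mapped to arc
# sectors (instantaneous polar obstacle density), and a free sector
# close to the goal direction is chosen for steering.
import math

def polar_histogram(robot, window_cells, n_sectors=36):
    h = [0.0] * n_sectors
    for (cx, cy, certainty) in window_cells:
        angle = math.atan2(cy - robot[1], cx - robot[0]) % (2 * math.pi)
        sector = int(angle / (2 * math.pi / n_sectors)) % n_sectors
        h[sector] += certainty      # instantaneous obstacle density
    return h

def steer(hist, goal_sector, threshold=1.0):
    # Pick the free sector (density below threshold) circularly
    # nearest to the goal sector.
    n = len(hist)
    candidates = [s for s in range(n) if hist[s] < threshold]
    return min(candidates,
               key=lambda s: min(abs(s - goal_sector),
                                 n - abs(s - goal_sector)))

hist = polar_histogram((0, 0), [(1.0, 0.0, 3.0)])  # obstacle ahead
sector = steer(hist, goal_sector=0)
```

The intermediate histogram retains the statistical content of the grid while reducing the data handled in real time to a single array of sector densities.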
5.1.4 Behaviour coordination
Behaviour-based control systems have been a major cornerstone in mobile robotics,
and have achieved notable success when incorporated in hybrid architectures. Since the first
appearance of behaviour-based approaches, researchers have noticed the importance of the
problem of behaviour coordination: how to coordinate the simultaneous activity of several
independent behavioural components to obtain an overall coherent behaviour that carries out
the intended task. The first approaches were based on a competitive coordination model
(Brooks, 1986) (Payton, 1986) (Firby, 1987). A common example to illustrate that this model
may sometimes be inadequate is the following: a robot navigates following a path and
suddenly encounters an obstacle. At this point it has two options, to turn to the left or to the
right. This choice may be indifferent for the obstacle avoidance behaviour, but from the path
following perspective one might be dramatically better than the other. In competitive
coordination schemes the obstacle avoidance behaviour usually knows nothing about this, and
therefore makes an arbitrary decision. Of course, a more elaborate coding of the behaviours
could solve the problem, but at the expense of computational cost, behaviour maintenance due
to lateral effects, or behaviour independence.
In order to overcome these deficiencies several researchers have proposed cooperative
coordination schemes that allow parallel execution of different behaviours, as in motor
schemas (Arkin, 1989), and context-dependent blending (Saffiotti, 1997b). The behaviour
coordination problem can be divided into two conceptually different sub-problems: how to
decide which behaviour should be activated at a time (behaviour arbitration), and how to
combine the result from different behaviours that are simultaneously active (command
fusion).
The behaviour arbitration policy determines which behaviour or set of behaviours
should influence the operation of the robot at each moment, and thus ultimately determines
the task actually performed by the robot. Subsumption based architectures rely on a fixed
arbitration policy, hard-wired into a network of suppression and inhibition links (Brooks,
1986). These simple arbitration strategies that only allow one behaviour at a time to be active
may suffer serious limitations. Hence, a further step is to decide which behaviours are
active dynamically, based on both the current goal and the environmental contingencies, but
without allowing the concurrent execution of behaviours (Payton, 1986)(Firby, 1987). Both
fixed and dynamic arbitration policies can be implemented using the mechanisms of fuzzy
logic (Saffiotti, 1997b). Fuzzy logic provides some advantages: the ability to express partial
and concurrent activations of behaviours, and the smooth transition between behaviours. The
first appearances of fuzzy logic for arbitration were based on fixed priority schemes (Berenji
et al., 1990)(Pin and Watanabe, 1994). More flexible arbitration policies were obtained using
fuzzy meta-rules, or context rules, where more than one behaviour can be activated at a time
and their output fused (see below for a discussion on output fusion). Fuzzy context rules
appeared to switch between flight modes in a fuzzy-controlled unmanned helicopter (Sugeno
et al., 1993) and to combine behaviours in the robot Flakey (Saffiotti et al., 1993). Fuzzy
context rules have since been applied successfully in an increasing number of
robots (Voudouris et al., 1994)(Surmann et al., 1995)(Tunstel, 1996)(Goodridge and Luo,
1994)(Michaud et al., 1996). It should be noted that the use of fuzzy context rules is
independent of the way in which individual behaviours are implemented, that is, behaviours
do not need to be fuzzy.
Some researchers have proposed coordination schemes where several behaviours can be
concurrently activated. Command fusion relates to how the outputs of the different active
behaviours are combined to produce just one output. Initial approaches to command fusion
were based on vector summation methods (Arkin, 1989)(Khatib, 1986). In these approaches, a
force vector represents each command, and commands from different behaviours are
combined by vector summation. The robot responds to the force resulting from the
combination. In this case, fuzzy logic can be effectively used to perform a more general form
of weighted command fusion. The first proposal of fuzzy command fusion appeared as a
reaction to the existing single behaviour arbitration schemas (Rosenblatt and Payton, 1989),
and later it was re-stated in terms of fuzzy logic (Yen and Pfluger, 1995). Several other
authors have used fuzzy logic to perform fuzzy command fusion. Some of these actually
implement a trivialised form of fuzzy command fusion, using weighted singletons as fuzzy
outputs and CoG defuzzification (Goodridge et al., 1994)(Goodridge et al., 1997), or
symmetric rectangles and CoG defuzzification (Pin and Watanabe, 1994). Both these methods
can be shown to be equivalent to a vector summation scheme. The majority of authors have
opted for full implementation of fuzzy command fusion, through the combination of arbitrary
fuzzy sets followed by a defuzzification step (Yen and Pfluger, 1995)(Saffiotti et al.,
1993)(Tunstel, 1996)(Voudouris et al., 1994)(Michaud et al., 1996)(Michaud, 1997).
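The equivalence noted above, that weighted-singleton fusion with CoG defuzzification behaves like a normalised vector summation, can be sketched in a few lines of C (the function and variable names are illustrative, not taken from any of the cited systems):

```c
/* Fuse singleton commands from several behaviours by centre-of-gravity
 * defuzzification: the result is the weighted average of the commands,
 * i.e. a normalised vector summation. */
double fuse_singletons(const double *cmd, const double *weight, int n)
{
    double num = 0.0, den = 0.0;
    int i;
    for (i = 0; i < n; i++) {
        num += weight[i] * cmd[i];
        den += weight[i];
    }
    return (den > 0.0) ? num / den : 0.0;  /* no active behaviour: neutral */
}
```

With two behaviours proposing steering commands of -30 and +10 degrees at weights 0.25 and 0.75, the fused command is 0 degrees, exactly the normalised weighted sum.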
The most general form of behaviour combination using fuzzy logic is obtained by using
both fuzzy behaviour arbitration (with fuzzy context rules) and fuzzy command fusion (with
fuzzy combination rules). This approach subsumes all the approaches listed as special cases,
and it is known as context-dependent blending or CDB in short (Saffiotti, 1997b). The idea of
context-dependent blending was initially suggested by Ruspini (Ruspini, 1991) and later
developed by Saffiotti (Saffiotti et al., 1993). CDB can be characterised as follows:
• Each behaviour generates preferences from the perspective of its goal.
• Each behaviour has a context of activation, representing the situations where it should be
  used.
• The preferences of all behaviours, weighted by the truth value of their contexts, are fused
  to form a collective preference.
• One command is chosen from this collective preference.
CHAPTER 5: CONTROL ARCHITECTURE
Figure 5.7: Context-dependent blending as a hierarchical fuzzy controller.
The preferences are represented by fuzzy sets, which correspond to control actions
generated by fuzzy controllers. Contexts are represented by formulae in fuzzy logic, which
serve as the antecedent in the fuzzy arbitration rules. Fusion and choice are respectively
performed by a fuzzy combination operator and by defuzzification. CDB can be implemented
in a hierarchical fuzzy controller (Figure 5.7), where the behaviours correspond to the lower
level rule bases and the context rules correspond to the meta-rule base of the hierarchical
controller. This is the implementation originally used in the Flakey robot (Saffiotti et al.,
1993). Hierarchical fuzzy controllers have been proposed and applied to building complex
robot behaviours using CDB (Voudouris et al., 1994)(Congdon et al., 1993). When using
CDB, behaviours take their input from the local perceptual space (LPS), a common storage
structure that stores perceptual data, interpretations of these data, and the representation of
subgoals created by the planner. This way, the use of the LPS makes the integration of
reactivity and long-term planning possible (Saffiotti, 1993). The planner generates a set of
subgoals that are later exploited by the controller to orient the robot’s activity.
In general, CDB can be used to execute a full plan for action, represented by a set of
context rules. The rules tell which behaviours should be used in each situation. CDB is used
to blend the recommendations from different concurrent behaviours, each weighted by the
corresponding context. Fuzzy context rules provide a flexible means to encode behaviour
arbitration strategies. Like fuzzy control rules, context rules allow complex strategies to be
written in a modular way using a logical format. Fuzzy command fusion can then be used to combine
recommendations from concurrent behaviours. The resulting context-dependent blending is
strictly more general than other coordination schemes commonly used in robotics. The fact
that the same format is used for control rules and the arbitration rules allows increasingly
complex behaviours to be written in a hierarchical fashion. Following its implementation in
Flakey, CDB has been used by several researchers in autonomous mobile robots, both in real
robots (Goodridge and Luo, 1994)(Surmann et al., 1995)(Voudouris et al., 1994)(Arrúe et al.,
1997) and simulators (Tunstel, 1996)(Michaud et al., 1996)(Aguirre et al., 1998). It is
interesting to note that learning techniques have also been applied to CDB (Bonarini and
Basso, 1997). A detailed discussion is included in chapter 7.
Several other methods exist: case-based reasoning (Ram et al., 1997), utility fusion
(Rosenblatt, 2000), and neuro-fuzzy approaches, ranging from general architectures (Li et al.,
1997) to simple linear combinations of behaviours (Benreguieg et al., 1998), to cite just a few.
5.1.5 Multiagent systems
In recent years, the term agent (Russell and Norvig, 1995)(Hendler, 1999) has grown in
significance in many different fields, and especially within the artificial intelligence
community (AI). Initially the term intelligent agent was applied to planning systems that
interact with a changing environment in real time. The underlying idea was that agents have
deliberative capabilities while being able to work in real-world-like applications. When things
like e-commerce appeared, the term bounded rational agent was used to refer to systems with
decision-making capabilities that are restricted in terms of time or computational resources.
The underlying idea was that agents are capable of taking decisions in real-world-like
applications as well. This new idea was particularly well received and reinforced an important
part of the agents literature.
Meanwhile, both the embedding of an agent in an environment and the notion of a
decision-making agent came to the forefront when working with mobile robots, particularly
autonomous mobile robots that operate in natural environments. As the capabilities of such
robots improved (they seemed more intelligent), more agent terminology entered the robotics
literature. Thus roboticists also started using the term agent to refer to their work. The
underlying idea was that agents are also physical systems that take decisions in real-world-like
applications. Many current agent architecture ideas, particularly those of multiple-level
architectures and emergent behaviour, grew largely from the robotics field.
The field of artificial intelligence has recognised some key aspects (although they are
not required simultaneously) related to the notion of agent: changing environments,
decision-making capabilities, and physical systems. Much of the work on artificial intelligence is
related somehow to agents. Hendler relates an interesting case on the effect of agents in
artificial intelligence (Hendler, 1999):
“David Miller […] was fond of telling government agencies that they
should stop funding AI (artificial intelligence) and start funding IA
(intelligent agents).”
Now, several other communities are using the term agent to refer to many different
concepts. For instance, with the growth of Internet applications a new model of software has
arisen, one that takes encapsulation beyond the object level and gives software modules a
temporal extent. These agents are particularly useful when combined with mobile code. When these
mobile agents are working through the net to perform purposeful tasks for their users, some
new terms appear: search agent, which helps a user to find things on the net; routing agent,
which helps packets to move faster; and information-filtering agent, which looks for
particular conditions in a network or data resource, and alerts the user or takes other actions.
In addition the term agent also relates to advanced user interfaces. Thus, the term agent, even
limited to an information technology context, refers to many different software concepts.
Although agents have been deployed in many domains, the functions that researchers are
exploring have some commonalities. In this thesis the term agent is used to refer to
decision-making entities that operate in real-world-like environments to control mobile robots, that is,
the robotics and artificial intelligence domain.
The distributed artificial intelligence (DAI) field has existed for less than two decades.
Traditionally, distributed artificial intelligence is broken into two sub-disciplines: distributed
problem solving (DPS) and multiagent systems (MAS). The main topics considered in DPS
are information management issues such as task decomposition and solution synthesis. For
example, a constraint satisfaction problem can often be decomposed into several not entirely
independent sub-problems which can be solved on different processors. These partial solutions
can then be synthesised into a solution of the original problem. Multiagent systems aim to provide
both principles for construction of complex systems involving multiple agents and
mechanisms for coordination of independent agents. Thus, MAS allow the sub-problems of a
constraint satisfaction problem to be subcontracted to different problem solving agents with
their own interests and goals.
MAS are applicable in many domains (Stone and Veloso, 2000), particularly when there
are different elements with distinct (possibly conflicting) goals and proprietary information
and when a system to handle their interactions is needed. MAS are also conceivable in
systems that are not distributed, for example a domain that is easily broken into components
so that independent tasks can be handled by separate agents. Another benefit of multiagent
systems is their scalability. Since they are inherently modular, it should be easier to add new
agents in a multiagent system than it is to add new capabilities to a monolithic system.
Moreover, from a programmer’s point of view the modularity of multiagent systems can lead
to simpler programming, thus enhancing performance and/or reliability (following the “keep
it simple, stupid!” principle).
In the mobile robotics field, MAS are particularly useful both in single robot systems
and in multi-robot systems. As this thesis is on single-robot systems, the latter case is not
considered. For example, the three-layer architecture can be broken into different agents that
can be laid out as a multiagent system. This topic is discussed in the following section.
Several taxonomies, which make use of more than three dimensions, have been
presented for DAI (Decker, 1987) and MAS (Parunak, 1996). Typical MAS have coarse agent
granularity and high-level communication, while they vary across the whole range in the
dimensions of heterogeneity and distributed control. In fact, degree of heterogeneity is a
major MAS dimension, and all the methods of distributing control appear in the literature as
major issues. Recently, Stone and Veloso presented a new MAS taxonomy based solely on
heterogeneity and communication characteristics (Stone and Veloso, 2000). In all these cases,
communicating agents (be they homogeneous or heterogeneous) need a framework within
which to communicate (Brenner, 1998). They should define both a
communication/coordination method, which defines how agents communicate with each other
to share information (the architecture of the communication infrastructure), and an agent
communication language, which defines what agents communicate to the others (the format
of the messages that are interchanged).
There are a number of fundamentally different communication/coordination methods
(Brenner, 1998)(Cabri et al., 2000), but only non-mobile approaches are considered here. The
simplest case, when referring to software-based agents, is the invocation of an agent’s
procedure by another agent. The invoking agent explicitly uses calling parameters to inform
its communication partner of a specific wish and of its intentions. The return values of the
procedure represent the answer of the addressed agent. Depending on where the addressed
agent is physically located, the procedure invocation can take place either locally or remotely.
This simple method lacks the ability to support complex communication patterns.
The second method arises out of DAI and is known as the blackboard metaphor. The
blackboard (Hayes-Roth, 1985) represents an extension to the agenda of traditional AI
systems and rule-based expert systems. A blackboard provides all agents within a MAS (Kirn,
1996) with a common work area in which they can exchange information, data, and
knowledge. The blackboard can be used both for task sharing, where an agent requests other
agents for help in solving specific subtasks, and result sharing, where an agent which has
solved a specific subtask communicates with other agents to make its results available. No
private areas exist within simple blackboards. All agents have access to every information
item and they have the complete responsibility in deciding which information they actually
access. For example the Ambit system is a MAS formal model based on a blackboard for
agent coordination (Cardelli and Gordon, 1998).
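The core of the blackboard metaphor can be sketched as a shared table that agents write to and read from, without addressing each other directly. The following is a single-threaded toy sketch (a real blackboard would add locking and event notification; all identifiers are invented):

```c
#include <string.h>

/* A toy blackboard: a fixed table of (key, value) entries that any
 * agent may read or write; agents never address each other directly. */
#define BB_SIZE 8

struct entry { char key[16]; double value; int valid; };
static struct entry board[BB_SIZE];

void bb_write(const char *key, double value)
{
    int i;
    for (i = 0; i < BB_SIZE; i++) {
        if (board[i].valid && strcmp(board[i].key, key) == 0) {
            board[i].value = value;          /* update existing entry */
            return;
        }
    }
    for (i = 0; i < BB_SIZE; i++) {
        if (!board[i].valid) {               /* claim a free slot */
            strncpy(board[i].key, key, sizeof board[i].key - 1);
            board[i].key[sizeof board[i].key - 1] = '\0';
            board[i].value = value;
            board[i].valid = 1;
            return;
        }
    }
}

int bb_read(const char *key, double *value)
{
    int i;
    for (i = 0; i < BB_SIZE; i++) {
        if (board[i].valid && strcmp(board[i].key, key) == 0) {
            *value = board[i].value;
            return 1;                        /* found */
        }
    }
    return 0;                                /* not on the blackboard */
}
```

A producer agent calls bb_write("range", r) after each sensor cycle, and any consumer agent may later call bb_read("range", &r); neither needs to know the other exists.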
Blackboards provide a very flexible concept for the cooperation and communication of
problem-solving units. However, the central structure of a blackboard is a growing obstacle to
the continuing expansion of network-centric applications. All agents registered on the
blackboard are forced to place their information directly on the blackboard, irrespective of
their position in the network. In some cases this might produce a bottleneck because of high
network load. The third method of agent communication, message passing, tries to overcome
this limitation. In this case, the messages that agents exchange with each other can be used to
establish communications and cooperation mechanisms using defined protocols. In message
passing systems an agent, called the sender, transfers a specific message to another agent,
called the receiver. In contrast to blackboard systems, messages are directly exchanged
between two agents. One important issue is that all the agents have to know the semantics of
the content of the messages. An agent should not treat a received message just as a fact, but
rather analyse it for implicitly associated actions and intentions. The speech-act theory
(Austin, 1962) attempts to cater for these demands. A speech act designates a message that
contains not only a true or false statement but also exercises a direct influence on its
environment by causing changes within it. The communication within a typical
MAS does not usually consist just of sending unrelated messages; instead, dialogues take place
in which several related messages are exchanged between two or more conversation
partners. For example, D’Agents is a Java-based MAS which uses a message-passing scheme
for agent coordination (Karnik and Tripathi, 1998).
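A message-passing scheme can be sketched, by contrast, as a per-receiver mailbox: the sender deposits a message addressed to one agent, which consumes it in FIFO order. This is a toy sketch with invented names; real systems add addressing, serialisation, and a transport protocol:

```c
#include <string.h>

/* A toy mailbox: senders deposit messages for one receiver, which
 * retrieves them in FIFO order. Unlike a blackboard, each message is
 * exchanged directly between the two agents involved. */
#define MBOX_CAP 4

struct message { int sender; char content[32]; };

struct mailbox { struct message q[MBOX_CAP]; int head, count; };

int mbox_send(struct mailbox *m, int sender, const char *content)
{
    struct message *slot;
    if (m->count == MBOX_CAP)
        return 0;                            /* receiver's queue is full */
    slot = &m->q[(m->head + m->count) % MBOX_CAP];
    slot->sender = sender;
    strncpy(slot->content, content, sizeof slot->content - 1);
    slot->content[sizeof slot->content - 1] = '\0';
    m->count++;
    return 1;
}

int mbox_receive(struct mailbox *m, struct message *out)
{
    if (m->count == 0)
        return 0;                            /* nothing pending */
    *out = m->q[m->head];
    m->head = (m->head + 1) % MBOX_CAP;
    m->count--;
    return 1;
}
```

Note that the sender must know which mailbox (i.e. which agent) to address, which is precisely the extra coupling, compared to a blackboard, discussed below.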
Although message-passing based MAS are more efficient in terms of
network bandwidth usage, agents must somehow know about the others, their capabilities and
their knowledge. In contrast, in blackboard-based MAS, agents do not need to know each
other, but only the blackboard. The use of one of these methods is dictated by the current
application. There are other possible coordination strategies in the field of MAS, mainly
related to mobile agents, for instance meeting oriented coordination and Linda-like
coordination (Cabri et al., 2000), that will not be discussed here because they are outside the
scope of this thesis.
Just as the term agent has been overused, the term agent communication language
(ACL) refers to several different concepts. Hendler proposes a classification of ACLs
according to their level of use (Hendler, 1999):
• Performative. This level provides the protocols with which agents can establish some
  form of knowledge interchange. Many current proposals, for standard ways for agents to
  communicate, focus primarily on the design of unambiguous performative systems.
• Service. When agents work across a distributed network, they might need to advertise
  their capabilities, find agents that can provide capabilities which they cannot, or both. This
  level provides the infrastructure upon which such tasks can be performed.
• Content. Once agents are communicating (using performative terms and having found
  each other through the service advertisements), they need a deep level of problem-solving
  knowledge, often domain-dependent. The content level is in charge of providing a
  framework so that different agents can understand or extract knowledge from the
  messages.
• Control add-ins. Where multiple agents might compete for resources, they need
  mechanisms that let them represent or reason about their own computational needs or
  those of other agents. This level is in charge of providing a framework for the exchange of
  information related to their processing requirements (or even to negotiate mobility).
Many different languages have been used for the exchange of information and
knowledge between applications (sockets, remote procedure calls, remote method invocation,
etc). What distinguishes ACLs from such attempts are the objects of the discourse and their
semantic complexity (Labrou et al., 1999). ACLs stand a level above such methods for two
reasons: ACLs handle propositions, rules, and actions instead of simple objects with no
semantics associated with them, and an ACL message describes a desired state in a
declarative language, rather than a procedural method. When using an ACL, agents transport
messages over a lower level protocol (for example TCP/IP, HTTP, IIOP, etc). The ACL itself
defines the types of messages, and their meanings, which agents can exchange.
Currently, most well known ACLs are KQML, with its many dialects and variants, and
FIPA ACL. KQML (Labrou and Finin, 1994)(Labrou and Finin, 1997) is a high level,
message oriented communication language and protocol for information exchange
independent of content syntax and applicable ontology. Thus, KQML is independent of the
transport mechanism (TCP/IP, IIOP, etc), independent of the content language (KIF, SWL,
PROLOG, etc), and independent of the ontology assumed by the content. Conceptually there
are three layers in a KQML message: content, communication, and message. The content
layer bears the actual content of the message in the program’s own representation language.
Every KQML implementation ignores the content portion of the message, except to determine
where it ends. The communication layer encodes a set of features to the message that describe
the lower-level communication parameters, such as the identity of the sender and recipient,
and a unique identifier associated with the communication. The message layer, which encodes
a message that one agent would like to transmit to another, is the core of KQML. This layer
determines the kinds of interactions one can have with a KQML-speaking agent. The message
layer’s primary function is to identify the network protocol with which to deliver the message
and to supply a speech act or performative that the sender attaches to the content. This speech
act indicates whether the message is an assertion, a query, a command, or any other of a
set of known performatives. In addition, since the content is opaque to KQML, the message
layer also includes optional features which describe the content language, the ontology it
assumes, and some type of description of the content.
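An illustrative KQML message, in the style of the examples in (Labrou and Finin, 1997), might look as follows. The agent names, ontology, and content expression are invented for illustration; only the performative (ask-one) and the keyword parameters belong to KQML's message and communication layers:

```lisp
(ask-one
  :sender     robot-agent       ; communication layer: who is asking
  :receiver   map-agent         ;                      who should answer
  :reply-with query-1           ; identifier for matching the reply
  :language   KIF               ; content language (opaque to KQML)
  :ontology   navigation        ; ontology assumed by the content
  :content    (distance door-1 ?d))
```

The :content expression is never interpreted by KQML itself; only the receiving agent, which understands the declared language and ontology, gives it meaning.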
The Foundation for Intelligent Physical Agents ACL (FIPA ACL), like KQML, is based
on speech-act theory (Austin, 1962): messages are actions or communicative acts, as they are
intended to perform some action by virtue of being sent (FIPA, 1997). The FIPA ACL
specification consists of a set of message types and the description of their pragmatics, that is,
the effects on the mental attitudes of the sender and receiver agents. FIPA ACL is
superficially similar to KQML. Its syntax is identical to KQML except for different names for
some reserved primitives. Thus, it maintains the KQML approach of separating the outer
language (that of the message format) from the inner language (that of the content format). In
FIPA ACL the communication primitives are called communicative acts, instead of
performatives as in KQML.
5.2 The BGA control architecture
In this thesis a new control architecture, called BGA, is proposed. BGA is a multi-agent
architecture that combines both reactive and deliberative agents. These agents play an
important role as they are the structural elements of the architecture, and BGA uses a
blackboard scheme as the mechanism by which the different agents communicate. Agents can be
layered and distributed according to their functionalities, and in this way some examples are
presented using a three-layer hybrid architecture schema. As some of these agents may
implement different reactive behaviours, the fusion of their outputs is accomplished by a new
behaviour blending method derived from the context-dependent blending.
This section introduces the antecedents of the BGA architecture, an early study on
reactive architectures for small micro-robots (MadCar’s architecture). Then the BGA
architecture is described in detail, with emphasis on its structure and how the different
elements are interconnected. Finally, there is also a discussion on the characteristics of the
architecture and how it relates to previous works and concepts (those described in the
previous section).
5.2.1 Antecedents: MadCar’s architecture
Early works on the BGA architecture were based on reactive architectures for small
micro-robots, when the available robotics platform was the non-holonomic robot called
MadCar (see chapter 2), which possesses four sonar sensors, one wheel encoder and one
electronic compass. The intended task was a simple safe-wander while maintaining a given
magnetic heading. MadCar’s processing unit is a small 8-bit microcontroller
(MC68HC11), which is programmed using a pre-emptive multitasking interpreted C-based
language (Interactive-C).
Purely reactive architectures provide a valuable framework when working with robots
with a reduced set of sensors and low processing capabilities. The reactive process of
perception, behaviour generation and behaviour fusion can easily be implemented in such
small systems. Moreover, these architectures (compared to deliberative ones) perform well
when there is not enough information to model the world. For instance, MadCar’s processing
unit has 32 KB of RAM and a 4 MHz clock. Evidently, with such resources, an occupancy
grid with an A* path-planner was not viable (see chapter 4).
A first look into existing reactive architectures showed that subsumption (Brooks, 1986)
and motor schema (Arkin, 1989) did not provide any means of communication between the
different behaviours, at least from a theoretical point of view (particular implementations may
share some global information). In addition when the control system becomes complex, the
modification of individual behaviours without introducing side effects is very difficult. This is
related to granularity. For instance, in the following example (Figure 5.8) the granularity of
the behaviours is high, implying that some tasks are unnecessarily duplicated. A careful
division of the two behaviours into a set of more elaborated ones can solve the problem at the
expense of obtaining a more complex behaviour inhibition/suppression network.
Figure 5.8: Sample subsumption based behaviours (a Dock and an Avoid behaviour merged
through a suppression node into the motors).
Blackboard systems (Hayes-Roth, 1985) provide a common pool shared by independent
computational processes and a communication mechanism between them. Blackboard
architectures typically have three different elements:
• The blackboard: the structure that stores information, typically implemented as shared
  memory. The blackboard is the only way to share data between the different processes.
• The knowledge bases (KB): the different computational processes, each one possibly
  implementing its own problem resolution method.
• The control module (CM): the process in charge of sequencing and/or arbitrating the
  execution of the different KBs.
It has been shown that blackboard based architectures are well suited for the integration
of multiple paradigms and real-time execution (Vranes and Stanojevic, 1995). Real-time
operation can be achieved by assigning a given quantum to every KB. Then, the CM can
schedule the execution of the different KBs to meet the desired real-time constraints. In principle,
blackboard systems can be classified as deliberative architectures (see the section above).
With some minor modifications, mainly in the way that the different KBs are specified,
hybrid architectures can be developed using the blackboard paradigm as the basis, and thus
reactivity can be achieved. The architecture proposed for MadCar (Figure 5.9) was based on
this hybrid blackboard-based architecture (Gómez-Skarmeta and Martínez-Barberá, 1997),
where the different KBs are now called behaviours. The main differences with respect to
standard blackboard systems are:
• Most of the behaviours are direct mappings between sensor perceptions and motor
  actions, and thus they are referred to as reactive behaviours. These exhibit bounded-time
  execution, so the system’s latency can be sufficiently small, with a low granularity being
  obtained as well.
• The control module is no longer explicit, that is, there is no special process in charge of
  activating/deactivating the different behaviours. The responsibility lies in the different
  behaviours, which are in charge of specifying which should be the following behaviour to
  be executed. This schema is similar to that of a subsumption behaviour network. The
  implicit control module is implemented using a priority-based stack, which stores pointers
  to the behaviours that are to be executed (called events). When a behaviour is running it
  pushes events onto the stack. When a behaviour finishes running, the new behaviour
  scheduled for execution is the one with the highest priority at the top of the stack.
• Additionally, there are two asynchronous processes that manage sensory-motor
  information. The sensing module is in charge of acquiring sensory information,
  processing it, and then signalling important events (like a collision against a wall). Events
  are generated by inserting behaviours with high priority into the stack. The motor module
  is in charge of implementing the closed-loop control of the physical robot’s actuators.
Figure 5.9: MadCar’s hybrid architecture.
Following this architecture, MadCar’s task (a safe-wander behaviour while maintaining
a given magnetic heading) is implemented using six behaviours (Figure 5.10), a sensing
module, and a motor module. Both the Forward and Backward behaviours are direct
mappings between the desired forward/backward velocity and the corresponding traction
motor controls. The Turn behaviour is a direct mapping between the desired heading and the
corresponding steering motor control. The Collision behaviour checks for the activation of
bumper sensors and then proposes a new direction to follow (it commands the robot to move
in the opposite direction to the current one). The Path behaviour, using the information
produced by the compass, tries to maintain the desired robot heading. Lastly, the Avoid
behaviour performs obstacle avoidance, using range information produced by the sonars, and
modifies the desired speed and heading. The sensing module implements the interface to fire
the four sonar modules. The motor module implements a MISO fuzzy inference system, with
trapezoidal sets and singleton outputs. Its rule base, which takes as input the desired heading
and speed, produces control actions that are then applied to the steering and the traction
motors using PWM signals.
Figure 5.10: MadCar’s behaviours implementation (des = desired).
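The motor module's rule base can be illustrated with a two-rule sketch: trapezoidal sets match the heading error, and singleton outputs are combined by a weighted average. The membership parameters and rules below are invented for illustration; the actual rule base is richer and also takes the desired speed as input:

```c
/* Trapezoidal membership function with feet a,d and shoulders b,c. */
double trapezoid(double x, double a, double b, double c, double d)
{
    if (x <= a || x >= d) return 0.0;
    if (x < b)  return (x - a) / (b - a);   /* rising edge */
    if (x <= c) return 1.0;                 /* plateau */
    return (d - x) / (d - c);               /* falling edge */
}

/* Two-rule MISO controller for the steering motor: each rule matches
 * the heading error (degrees) with a trapezoid and proposes a singleton
 * output in [-1, 1]; the crisp command is their weighted average. */
double steer_control(double heading_error)
{
    /* rule 1: IF error is negative THEN steer left (-1) */
    double w1 = trapezoid(heading_error, -180.0, -90.0, -10.0, 0.0);
    /* rule 2: IF error is positive THEN steer right (+1) */
    double w2 = trapezoid(heading_error, 0.0, 10.0, 90.0, 180.0);
    double den = w1 + w2;
    return (den > 0.0) ? (w1 * -1.0 + w2 * 1.0) / den : 0.0;
}
```

The resulting crisp value would then be converted into a PWM duty cycle for the steering servo; small errors near zero activate neither rule strongly, so the command fades smoothly to neutral.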
To illustrate how the stack-based controller is implemented, some pseudo-code listings
are shown. The sensing module (Listing 5.1) operates as an asynchronous process that reads
in raw sensory data (read function), processes and filters them (filter function), and then
writes data onto the blackboard (write function). In case of a collision, the module pushes a
high priority event (HIT) onto the stack (add function).
do forever
{
    read sonars;
    read encoder;
    read compass;
    read bumpers;
    filter sonars;
    write encoder;
    write compass;
    write bumpers;
    write sonars;
    if (collision)
        add (HIT, PRI_CRITICAL);
}
Listing 5.1: Sensing module.
The stack control (Listing 5.2) runs as another asynchronous process that continually
pops events from the stack (FORWARD, BACKWARD, HIT, etc.), using a priority-based
extraction method (extract function), and then executes their corresponding reactive
behaviours (parse_forward, parse_backward, etc.).
add (FORWARD, PRI_NORMAL);
do forever
{
    switch (extract ())
    {
        case FORWARD:
            parse_forward ();
            break;
        case BACKWARD:
            parse_backward ();
            break;
        case REVERSE:
            parse_reverse ();
            break;
        case HIT:
            parse_hit ();
            break;
        case STEER:
            parse_steer ();
            break;
    }
}
Listing 5.2: Stack control process.
When a behaviour is executed, it occurs in the same thread as the stack control process.
Behaviour implementations should do different tasks. They should add the required events to
the stack, perform their own processing, and then send commands to the motor control
module. For example, the collision reaction behaviour (Listing 5.3), depending on where the
robot collided (front, back, left or right), drives the robot in the opposite direction by pushing
events onto the stack, e.g. add (REVERSE, PRI_NORMAL), and sending commands to the
steering motor, e.g. servo (RIGHT).
if (collision & HIT_MASK_FORWARD)
{
    add (REVERSE, PRI_NORMAL);
    add (BACKWARD, PRI_NORMAL);
    add (REVERSE, PRI_NORMAL);
}
else if (collision & HIT_MASK_BACKWARD)
{
    add (REVERSE, PRI_NORMAL);
    add (FORWARD, PRI_NORMAL);
    add (REVERSE, PRI_NORMAL);
}
else if (collision & HIT_MASK_LEFT)
{
    servo (RIGHT);
    add (current_sense, PRI_NORMAL);
}
else if (collision & HIT_MASK_RIGHT)
{
    servo (LEFT);
    add (current_sense, PRI_NORMAL);
}
Listing 5.3: Collision reaction behaviour.
The behaviour to drive the robot forward (Listing 5.4) is very simple. First it signals
two events (that will be processed afterwards), and then it sets traction motor speed (motor
function) to full forward.
add (FORWARD, PRI_NORMAL);
add (STEER, PRI_NORMAL);
motor (FULL_FORWARD);
Listing 5.4: Drive forward behaviour.
The proposed architecture enabled MadCar to achieve the intended task successfully,
despite its mechanical and electronic (see chapter 2) constraints. Some facts affected the
operation of MadCar negatively:
• The sonar subsystem was not very reliable in operation, producing lots of false readings
due to voltage drop of the sonar battery.
• The minimum speed was extremely high compared to the sonar firing rate.
• The steering system was all-or-nothing, producing jerky paths.
Although these problems caused many collisions in highly cluttered environments, the
system performed acceptably well thanks to its reactive nature and its ability to drive the
non-holonomic platform backwards. The architecture had an important drawback, inherent to most
reactive architectures: the behaviour coordination scheme. It was very easy to implement but
suffered, as subsumption and others do, from the tasking problem. To change the robot’s intended
task, it is necessary to re-code the behaviour sequencing scheme, by introducing new events
in the stack. This is not hard as long as the task is not complex. Nevertheless, complex tasks
need to take into account the inter-dependence of many behaviours and side effects do occur
(Saffiotti, 1997b), and thus this simple mechanism does not scale appropriately in this
scenario. As a result, the BGA architecture was proposed to address more complex tasks,
integrating reactivity and deliberation.
5.2.2 BGA: a multiagent architecture
MadCar’s reactive architecture was well suited to small systems with few sensors that
exhibit simple reactive behaviours, and, in fact, it was used successfully in both
MadCar and La-Garra. But the more evolved Quaky-Ant robot (see chapter 2) allows for new
and more complex tasks, and consequently, the architecture was modified to meet the
following requirements:
• Seamless integration of reactive (e.g. obstacle avoidance) and deliberative modules (e.g.
path-planning), each being a semiautonomous and independent entity that is in charge of a
given task, so that encapsulation, isolation, and local control can be achieved.
• There is no assumption about the knowledge each subsystem has or the resolution method
it implements. In addition, extensive use of fuzzy logic should be made.
• Flexible and dynamic organisation, allowing the layout of modules depending on
processing needs and enabling and disabling modules at run-time.
• Learning techniques are incorporated in the architecture at different levels (behaviours,
blenders, tasks). For a discussion of learning in BGA see chapter 7.
In this context and to achieve the desired objectives, the initial simple blackboard-based
architecture was developed into a multi-agent system that uses a blackboard to share
information. The new architecture, called BGA, is built using three basic computational
units:
• Behaviour: the behaviour is the simplest computational unit in BGA. Behaviours are
concurrent processes that have a near-constant bounded execution time. Behaviour units
are intended for reactive purposes.
• Blender: when there are two or more behaviours, the blender is in charge of behaviour
coordination, both behaviour arbitration and command fusion. The blender is not
mandatory, although it is recommended.
• Task: the task is the most complex computational unit in BGA. Tasks are asynchronous
processes that do not have a constant bounded execution time. Task units are intended for
deliberative purposes.
Figure 5.11: BGA architecture.
Using behaviours, blenders, and tasks as the basic building blocks, the BGA
architecture (Figure 5.11) is composed of the following types of elements:
• Agent: an agent is simply a canvas that serves to group behaviours, blenders and tasks. An
agent should not simultaneously contain behaviours and tasks, and thus two types of
agents can be distinguished: reactive agents, those that only contain behaviours, and
deliberative agents, those that only contain tasks. Agents run in an asynchronous,
distributed fashion.
• Blackboard: the blackboard serves as the inter-connection infrastructure that links the
different agents of the system. All the agents use a common protocol to communicate with
the blackboard. This communication can be realised using network or shared-memory
based resources.
• Sensor processing: sensor processing units are in charge of accessing the different raw
sensor signals, filtering and processing them, and then sending that data to the blackboard.
There is typically only one sensor processing module in the architecture.
• Actuator control: actuator control units are in charge of accessing the blackboard to read
motor commands and then sending them to the different actuators. There is typically only
one actuator control module in the architecture.
5.2.2.1 Reactive agents
A key idea in BGA is that each reactive agent is decomposed into very simple
behaviours. These access the BGA blackboard by way of some input and output variables. As
two different behaviours may modify the same variable, a behaviour coordination method is
needed to arbitrate behaviours and fuse commands. BGA blenders are based on the context-dependent blending concept, a cooperative behaviour coordination method. In addition to the
characteristics depicted in section 5.1.4, this method presents some advantages related to
scalability and learning. In competitive behaviour coordination methods, like subsumption
(Brooks, 1986) and MadCar’s architecture (see previous subsection), the selection of which
behaviour is active depends on how they are implemented (Gómez et al., 1999c). If the
overall task needs to be changed, some effort needs to be applied in changing the
suppression/inhibition network (subsumption) or modifying the order events are pushed onto
the stack (MadCar’s architecture). Nevertheless, with cooperative approaches the
management resides in the fusion method, which makes the process of defining individual
behaviours a simple task, while much effort must be applied to the definition of the blending
rules. This may appear to be a lot more work, compared to competitive approaches, but it is
not so. Firstly, once the behaviours and blending rules have been defined it is not very
difficult to modify the individual behaviours or the rules. In this case, if the overall task needs to
be changed, it often only requires adding further behaviours and blending rules.
Secondly, if learning techniques are used in conjunction with context-dependent blending, the
modification of the overall task only requires the adding of the required behaviours and the
retraining of the blending rule base (Gómez et al., 1999c). For a detailed discussion on the
learning of blending rules see chapter 7.
Figure 5.12: BGA reactive agent.
A BGA reactive agent (Figure 5.12) is composed of three kinds of modules that are
placed sequentially inside the agent's execution loop:
• A common module, which is executed before any behaviour, builds or updates the local
perceptual space (LPS) of the agent (see section 5.1.4).
• A set of behaviours that are executed concurrently. These are standard reactive modules
that directly map input (from sensors or LPS) to actions.
• A blender module, a new behaviour blending method based on context-dependent
blending (CDB), which specifies what behaviours are active and how the outputs of these
behaviours are fused (see section 5.1.4).
The reactive agent’s execution loop starts in the common module. It reads sensory
information, performs whatever processing is needed, and then builds or updates the LPS. All
the modules that compose the agent share this LPS. When the common module ends, all the
behaviours are executed concurrently, provided their computation time is small and bounded.
They can use both information stored in the blackboard and the LPS. When finished, they
produce some output that should be stored in the blackboard. Instead of the behaviours
accessing the blackboard directly, the blending module selects which outputs will be taken
into account and how they will be fused.
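The execution cycle just described (common module, then behaviours, then blender) can be sketched as follows. For clarity the behaviours run sequentially here rather than concurrently, the LPS is a plain array, and all type and method names (ReactiveAgentLoop, Behaviour, Blender) are illustrative assumptions, not BGA's actual API.

```java
import java.util.List;
import java.util.function.Consumer;

// Illustrative sketch of one BGA reactive-agent cycle: a common module
// updates the LPS, all behaviours execute against it, and the blender
// decides what reaches the blackboard.
public class ReactiveAgentLoop {
    public interface Behaviour { void execute(double[] lps); }
    public interface Blender  { void fuse(double[] lps); }

    private final Consumer<double[]> commonModule; // builds/updates the LPS
    private final List<Behaviour> behaviours;
    private final Blender blender;
    private final double[] lps;

    public ReactiveAgentLoop(Consumer<double[]> commonModule,
                             List<Behaviour> behaviours,
                             Blender blender, int lpsSize) {
        this.commonModule = commonModule;
        this.behaviours = behaviours;
        this.blender = blender;
        this.lps = new double[lpsSize];
    }

    // One execution cycle: common module -> behaviours -> blender.
    public void cycle() {
        commonModule.accept(lps);
        for (Behaviour b : behaviours) b.execute(lps);
        blender.fuse(lps);
    }
}
```

The point of the sketch is only the ordering of the three stages within a cycle; a real agent would run the behaviours in parallel threads, as the text states.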
The BGA blending mechanism works as follows. At development time, each behaviour
is provided with a given fixed weight (wgt), which determines how important the output of
the behaviour is (actually it is used to weight different outputs). Each behaviour, based on its
sensory input (blackboard and LPS), produces actions or output variables (ov) that are meant
to be put in the blackboard. Instead of working directly with output variables, behaviours
work with local copies of the output variables (lov). The blender implements a fuzzy meta-rule base (crisp rule bases are merely particular cases), where each rule (eq. 5.1) has inputs
(iv) from the LPS as the antecedent and degrees of activation of some behaviours (beh) as the
consequent.
IF iv1 is ISET1 and … and ivn is ISETn THEN beh1 is OSET1 and … and behm is OSETm
(5.1)
where:
• ivi is the ith input variable.
• ISETi is the ith fuzzy set for the ith input variable.
• behj is the jth behaviour of the rule.
• OSETj is the jth fuzzy set for the jth output variable.
The fuzzy inference mechanism then produces the degree of activation of the different
behaviours. These degrees define the behaviour arbitration schema. When the degree for a
certain behaviour is zero, then the behaviour is not active. If the degree is one, the behaviour
is fully active. In any other case, the behaviour is only partially activated. The last step in the
blending procedure is the command fusion (eq. 5.2). The local copies of the output variables
are weighted taking into account both the weight and the degree of activation of the behaviour
that produced them.
αj = wgtj ⋅ behj

ovi = ( Σj αj ⋅ lovij ) / ( Σj αj )		(5.2)
where:
• ovi is the ith output variable.
• lovij is the local copy of the ith output variable in the jth behaviour.
• wgtj is the weight of the jth behaviour.
• behj is the result of the fuzzy inference for the jth behaviour.
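The fusion step of eq. 5.2 can be sketched directly in code. This is a minimal sketch assuming behaviours deliver crisp local output copies in an array; the class and parameter names (BgaFusion, wgt, beh, lov) mirror the equation's symbols but are otherwise illustrative, not the BGA implementation.

```java
// Sketch of the BGA command fusion step (eq. 5.2): each behaviour j has a
// fixed weight wgt[j] and an inferred activation beh[j]; its local output
// copies lov[j][i] are combined into the blackboard outputs ov[i].
public class BgaFusion {
    // wgt[j]: fixed weight; beh[j]: activation degree in [0,1];
    // lov[j][i]: local copy of output variable i in behaviour j.
    public static double[] fuse(double[] wgt, double[] beh, double[][] lov) {
        int m = wgt.length;          // number of behaviours
        int n = lov[0].length;       // number of output variables
        double[] alpha = new double[m];
        double sum = 0.0;
        for (int j = 0; j < m; j++) {
            alpha[j] = wgt[j] * beh[j];   // alpha_j = wgt_j * beh_j
            sum += alpha[j];
        }
        double[] ov = new double[n];
        for (int i = 0; i < n; i++) {
            double acc = 0.0;
            for (int j = 0; j < m; j++) acc += alpha[j] * lov[j][i];
            // ov_i = (sum_j alpha_j * lov_ij) / (sum_j alpha_j)
            ov[i] = (sum == 0.0) ? 0.0 : acc / sum;
        }
        return ov;
    }
}
```

Note how a behaviour with activation zero simply drops out of the weighted average, which is exactly the arbitration effect described above.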
These fuzzy context rules provide, as they do in CDB, a flexible means of encoding
behaviour arbitration strategies. Like fuzzy control rules, context rules allow complex
strategies to be written in a modular way using a logical format. In addition, context rules
overcome the limitations of typical fixed arbitration strategies, like subsumption, in which
relationships between behaviours are hard-coded in advance, and thus, the modification of one
behaviour produces side effects because of behavioural dependence. These context rules
provide a framework to define independent behaviours and a way to combine them.
While the activation degrees produced by the context rules vary according to
environmental changes, fixed weights act as scaling factors. Each particular application will
dictate which combination of weight and inference is the best (to use only weights, to use
only inference, or to combine both). Weights differentiate the BGA blending scheme from
CDB, playing their role by assisting in the development of the control system, and expressing
the natural importance of the behaviours. Some behaviours appear to be more important than
others, for instance, it seems rational that the behaviour in charge of collision-avoidance has a
higher weight than the behaviour in charge of path-following. The developer sets in advance
the overall importance of each behaviour. This can be viewed, in some sense, as a competitive
coordination approach, because these weights are fixed a priori. When the behaviours are
running, the LPS is modified based on the sensory input, and the blending mechanism adapts
or reacts to the environment by changing the degree of activation of the different behaviours.
This can be viewed as a cooperative coordination approach, because the degree of activation
of the different behaviours dynamically changes according to the environment, and some
behaviours are active at the same time. Thus, the BGA behaviour coordination mechanism
mixes both the competitive and the cooperative parts to produce the final result. When all the
behaviour’s weights are set to the same value, the result is approximately equivalent to the
CDB method, and in some cases, the same result could be achieved with CDB as long as the
context rules are carefully designed, at the cost of losing the expressiveness of the weights.
Weights are also important when learning techniques are used in the development of the
blending rule base. Instead of performing a blind search over a wide parameter space, weights
allow the learning mechanism to converge faster because the search space is reduced. Of
course, this approach works because it uses the developer’s knowledge about the problem (the
importance of the different behaviours). When this knowledge is unavailable (or partially
unavailable), the method reduces to the CDB scheme. Chapter 7 includes a detailed
discussion on how learning techniques are applied to BGA blending.
It is important to note here that there is a subtle difference, regarding defuzzification,
between BGA blending and CDB: the order in which defuzzification is performed when
behaviours are implemented using fuzzy rules. The simplest approach is to perform
defuzzification at each behaviour, just before command fusion takes place (Figure 5.13a). In
this case each behaviour has the same importance, irrespective of the shape of its output fuzzy
sets, and the output is averaged.
Figure 5.13a: Defuzzification before command fusion.
The more complex approach is that of CDB, which is intended for behaviours
implemented as fuzzy inference systems that do not perform the defuzzification step (Figure
5.13b). The output variables of the different behaviours are fuzzy sets instead of crisp values
(as in the previous example), and the defuzzification takes place just after the command fusion
step (Saffiotti, 1997). In this case the shape of the behaviours' output fuzzy sets does matter,
and in fact influences the output of the system.
Figure 5.13b: Defuzzification after command fusion.
Finally, the approach of the BGA blending scheme, which is intended for behaviours
that are not necessarily implemented as fuzzy inference systems, tries to make a trade-off
between the two previous schemes (Figure 5.13c). The first exhibits the simplest
implementation while the second has a more appealing or “natural” response. In this case
defuzzification is performed at each behaviour, just before command fusion takes place, as in
the first example. It is here where weights play their role, by weighting the output of the
behaviours. Now different behaviours, when activated, influence the overall system with
different strengths.
Figure 5.13c: Defuzzification before weighted command fusion.
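The defuzzify-then-weight scheme of Figure 5.13c can be sketched with a sampled centroid defuzzifier followed by the weighted fusion. The sampling representation of the fuzzy set and the helper names are assumptions for illustration, not the thesis implementation.

```java
// Sketch of "defuzzification before weighted command fusion" (Figure 5.13c):
// each behaviour defuzzifies its own output fuzzy set (centroid of sampled
// membership values), and only the resulting crisp values are fused with the
// fixed weights and activation degrees.
public class Defuzzify {
    // Centroid defuzzification of a fuzzy set sampled at points x[] with
    // membership values mu[].
    public static double centroid(double[] x, double[] mu) {
        double num = 0.0, den = 0.0;
        for (int k = 0; k < x.length; k++) {
            num += x[k] * mu[k];
            den += mu[k];
        }
        return (den == 0.0) ? 0.0 : num / den;
    }

    // Weighted fusion of per-behaviour crisp outputs, with
    // alpha_j = wgt_j * beh_j as in eq. 5.2.
    public static double fuseCrisp(double[] crisp, double[] wgt, double[] beh) {
        double num = 0.0, den = 0.0;
        for (int j = 0; j < crisp.length; j++) {
            double alpha = wgt[j] * beh[j];
            num += alpha * crisp[j];
            den += alpha;
        }
        return (den == 0.0) ? 0.0 : num / den;
    }
}
```

Because defuzzification happens per behaviour, the shape of each output fuzzy set only influences that behaviour's own crisp value; the relative influence between behaviours comes from the weights, as the text explains.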
In CDB the outputs of the different behaviours reflect both preferences (the commands
to be executed) and desirability measures for these preferences (the shapes of the output fuzzy
sets). Combining preferences thus uses more information than a simple vector-weighting
schema. This desirability depends on the current behaviour interests (it depends on the
behaviour’s goal and the sensory input), and it dynamically changes as the behaviour changes
its preferences. In BGA blending, the outputs of the different behaviours reflect only
preferences, while the weights of the behaviours are fixed at all times and depend on the overall
importance of the behaviours (in some sense they are no longer desirability measures, but
have a similar effect to fixed ones). As has been discussed above, this is by no means more
restrictive, and in fact, it directly expresses some of the developer’s knowledge about the
behaviours.
5.2.2.2 Deliberative agents
BGA deliberative agents perform typical deliberative tasks, like reasoning, planning,
and world modelling. Just as reactive behaviours use the concept of local perceptual space,
deliberative tasks use the extended perceptual space (EPS). The EPS is a storage structure that
stores global and high-level perceptual data, interpretations of these data, and the
representation of goals and subgoals. Thus, deliberative agents take their input both from the
blackboard and their EPS and put their output both in their EPS and on the blackboard. Tasks
can also share their EPS or parts of them through the blackboard, allowing for more
decentralised computing paradigms. An example of EPS can be the associated information for
a grid map and a path-planner. In this case the EPS stores the position of the goal and it is
updated with the fusion of low-level sensory information (sonars, infrareds, or other range
sensors), and the resulting heading of the path-planner. Occasionally, a deliberative agent may
want to communicate some command to a reactive agent. The deliberative agent takes the
appropriate information from its EPS, and then sends the command to the blackboard. The
reactive agent, at each execution cycle updates its LPS with information from the blackboard.
This is how the BGA architecture enables the asynchronous transmission of commands
between reactive and deliberative agents. In short, the LPS is related to data of local nature,
while the EPS is related to data of global nature.
A BGA deliberative agent (Figure 5.14) is composed of two kinds of modules that are
placed sequentially inside the agent's execution loop:
• A common module, executed before any task, which usually updates one or some of
the extended perceptual spaces (EPS) of the different tasks.
• A set of tasks that are executed sequentially. These are standard deliberative modules that
do not directly map input to actions, but instead model the world (through the use of the
EPS) or reason about the world.
Figure 5.14: BGA deliberative agent.
The deliberative agent’s execution loop starts in the common module. It reads data from
the blackboard, performs whatever processing is needed, and then builds or updates one or
some of the EPS. All the modules that compose the agent have their own EPS (although they
can share them by using the blackboard). When the common module ends, the first task in the
list is executed. It uses information stored in the blackboard and its EPS to perform a
deliberative computation. When finished, the following task in the list performs its
processing. Deliberative agents do not need to perform time-bounded computations, and their
tasks can be arranged hierarchically using the task sequencing mechanism. Thus traditional
AI deliberative schemes, like the sense-plan-act model, can be achieved.
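The sequential task loop can be sketched as below. Representing the EPS as a simple map and the common module as a blackboard-snapshot copy are simplifying assumptions; the Task and DeliberativeAgentLoop names are illustrative, not BGA's API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of one BGA deliberative-agent cycle: the common module
// updates the extended perceptual space (EPS), then the tasks run strictly
// one after another (no bounded-time requirement applies to them).
public class DeliberativeAgentLoop {
    public interface Task { void execute(Map<String, Object> eps); }

    private final Map<String, Object> eps = new LinkedHashMap<>();
    private final List<Task> tasks;

    public DeliberativeAgentLoop(List<Task> tasks) { this.tasks = tasks; }

    public Map<String, Object> eps() { return eps; }

    // One cycle: update the EPS from the blackboard, then run tasks in order.
    public void cycle(Map<String, Object> blackboardSnapshot) {
        eps.putAll(blackboardSnapshot);       // common module (simplified)
        for (Task t : tasks) t.execute(eps);  // sequential task execution
    }
}
```

Because later tasks see the EPS entries written by earlier ones, the task list itself realises the hierarchical sequencing mentioned above (e.g. a world-modelling task feeding a path-planning task).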
One important point that differentiates reactive and deliberative agents is that tasks do
not have explicit arbitration schemes. It is the designer’s responsibility to avoid conflicts
between the different tasks. For example, consider a typical case where two deliberative
agents, using different techniques, produce commands or goals for reactive agents. These goals can be
contradictory, and a direct access to the blackboard might produce undesirable results. The
solution is that these agents do not access the same variable in the blackboard, but instead use
their own variables in the blackboard. Then a third agent can combine or fuse the results of
these two agents and effectively perform command fusion at the agent level. Following this
example, it is easy to see that the BGA deliberative agent approach is by no means restrictive
and can group a variety of possible arbitration and computational schemes.
5.2.2.3 Discussion
BGA relies on and takes advantage of various well-developed concepts. It is based on a
blackboard architecture as the mechanism to communicate the different subsystems. These
subsystems are based on the notion of agent, which plays an important role as the structural
element of BGA. Agents can be layered and distributed according to their functionalities, and
thus different cognitive models can be implemented. In the following sections an example based
on the three-layer hybrid architecture schema is presented. As some of these agents may
implement different reactive behaviours, the fusion of their outputs is accomplished by a
blending method derived from the context-dependent blending. The main contributions of the
BGA architecture are the following:
• BGA effectively groups different concepts and paradigms of control architectures (namely
blackboard architectures, hybrid architectures, and multi-agent systems) by extending and
combining their characteristics.
• BGA incorporates a new blending mechanism, based on context-dependent blending,
that effectively addresses the behaviour coordination problem.
The blackboard architecture (Hayes-Roth, 1985) is a proven mechanism for coordinating
different processing elements. It is simpler to implement and more efficient than a message-passing
system when coordinating loosely-coupled subsystems. Moreover, the blackboard
architecture does not assume any special problem solving technique in the elements that it
coordinates (Vranes et al., 1995). It is important to note that when hard timing constraints exist,
blackboard architectures are not appropriate for the coordination of distributed systems
(although there are some real-time single-processor systems successfully working which use
blackboard architectures). Under this paradigm, in BGA, a series of elements shares the
available information about the system and the environment. Each element can read the
relevant information from the blackboard, perform its own processing, and then write each
possible result onto the blackboard. These elements run in the same or different machines as
asynchronous processes (or execution threads) allowing an effective distributed architecture.
In BGA, these elements are in fact agents that are capable of taking decisions and also
of serving as a way to modularise and organise the robot control software. In this way these
BGA elements can also be considered intelligent agents (Hendler, 1999), and both reactive
and deliberative agents can coexist in BGA. Additionally, the BGA architecture has a
sensor-processing unit and an actuator-control unit, which can be viewed as specialised agents. The
BGA architecture can be classified as a heterogeneous communicating multiagent system (Stone
and Veloso, 2000), as there are agents of different natures that communicate with the others
via the blackboard.
Reactive agents can exhibit different reactive behaviours simultaneously. Hence, they
need to rely on a behaviour blending mechanism to address the behaviour coordination
problem, by providing a way to arbitrate the behaviours (enabling and disabling their outputs,
both totally and partially) and to fuse the commands they produce (by somehow weighting the
outputs). The BGA blending mechanism is based on the context-dependent blending concept
(Saffiotti, 1997) modified to fit into the architecture and augmented to allow for and assist in
the use of learning techniques (see chapter 7).
The distinction between reactive and deliberative agents allows for a layered
distribution of functionalities in the architecture. Although BGA does not impose any
constraints on the organisation of agents, the usual way to organise agents is to follow the
three-layer architecture (Gat, 1998) schema. A three-layer architecture can be implemented in
BGA by using only one deliberative agent (planner layer), which implements some tasks, and
one reactive agent, which implements some behaviours (controller layer) and contains a
behaviour blender (sequencer layer). In this case, a BGA instance is functionally equivalent to
a three-layer architecture. In a typical BGA system there are usually more than two agents
(see following sections and chapters), and the corresponding agent organisation can be
viewed as an augmented and distributed three-layer architecture, where the planner, controller
and sequencer layers are partitioned and distributed across the different agents.
5.2.3 Inter-agent communication in BGA
The BGA architecture relies heavily on the communication services provided by the
blackboard (Figure 5.11). Agents communicate by sending data to the blackboard and
receiving data from the blackboard. Thus, the BGA architecture needs an ACL to support this
data interchange, and both KQML and FIPA ACL are, in principle, good candidates. They are
almost identical with respect to their basic concepts and the principles they observe (Labrou et
al., 1999)(Vasudevan, 1998). The two languages differ primarily in the details of their
semantic frameworks. In one sense this difference is substantial: because of the different
semantic framework it would be impossible to come up with exact mappings or
transformations between KQML performatives and their completely equivalent FIPA
primitives, or vice versa. Both languages assume a basic non-commitment to a reserved
language, although in FIPA ACL an agent must have some limited understanding of a
semantic language to process a received message properly. Another difference between the
two ACLs is in their treatment of the registration and facilitation primitives. In KQML these
tasks are associated with performatives that the language treats as first-class objects. FIPA
ACL, intended to be a pure ACL, does not consider these tasks in their own right. Instead, it
treats them as requests for action and defines a range of reserved actions that cover the
registration and life-cycle tasks. Moreover, FIPA ACL does not provide facilitation
primitives.
There are two important points regarding the use of FIPA ACL and KQML as the ACL
for the BGA architecture:
• From a functional point of view, both ACLs provide more facilities than those really
needed for BGA. BGA itself is not intended as a general-purpose agent architecture, but as
an architecture for integrating reactive and deliberative agents in a mobile robotics
framework. This way, the number of performatives required in BGA is limited and
application-specific.
• There are not many implementations of FIPA ACL, but there are many implementations
of KQML. Moreover, some of these KQML implementations are for Java. One of the
requirements of the BGA architecture is that it needs to be platform independent, and the
way to achieve this is through the use of Java.
Although FIPA ACL seems to be the natural successor to KQML, it was decided to use
KQML as the ACL for the BGA architecture for the above reasons. There are three important
Java based agent development systems which implement KQML, namely JATLite, Jackal,
and JAFMAS. These are described below.
JATLite2 (Java Agent Template Lite), developed at Stanford, is a package of Java
programs that allows users to create communicating agents quickly. Agents run as applets
launched from the browser, and for that reason all agents register with an agent message
router facilitator that handles message delivery. This way, JATLite is a centralised system,
and peer-to-peer communication between agents is not possible.
Jackal3, developed at the University of Maryland, is a Java package that provides a
comprehensive communications infrastructure for Java-based agents, while maintaining
maximum flexibility and ease of integration (Cost et al., 1998). Jackal, which strongly
emphasises conversations between agents, provides an environment for a planning and
scheduling project for manufacturing (CIIMPLEX). It provides a flexible framework for
designing agents around conversations and includes extensive support for registration,
naming, and control of agents, through its own protocol (KNS). Conversation specifications
can be downloaded and employed on the fly in Jackal's multi-threaded conversation
space. A blackboard-style interface provides the agent application with a maximally flexible
interface to the message stream.
JAFMAS4 (Java-based Agent Framework for Multi-Agent Systems), developed at the
University of Cincinnati, is a set of Java classes that supports the implementation of
communicating agents (Chauhan, 1997)(Chauhan and Baker, 1998). JAFMAS supports
directed (point-to-point) communication as well as subject-based, broadcast communications.
JAFMAS, which also includes some support for conversations, provides a working
environment for AARIA, a planning and scheduling project for manufacturing (Parunak et al.,
1997).
2 http://java.stanford.edu
3 http://jackal.cs.umbc.edu/Jackal
4 http://www.ececs.uc.edu/~abaker/JAFMAS
Although both Jackal and JAFMAS are good candidates for general-purpose MAS
systems, the KQML implementation in BGA relies on JATLite for the following reasons:
• JATLite is an agent-router-centred implementation, instead of a decentralised
implementation like Jackal and JAFMAS. This is not a drawback by any means because
BGA is a blackboard architecture and all the agents are in communication with the
blackboard. This way, agents do not need to know how to reach the others, and those
decentralised capabilities are not needed.
• JATLite is the simplest implementation, providing basic methods for constructing and
parsing KQML messages. JATLite does not provide conversation support, as Jackal and
JAFMAS do. The BGA architecture does not require all the features of a full general-purpose
KQML-based MAS, and hence the “simpler the better” principle supports the
use of JATLite.
There is a drawback common to all three implementations of KQML: they use TCP as
the message transmission protocol (although Jackal and JAFMAS can use multicast-UDP for
agent discovery). TCP (Tanenbaum, 1996) is a robust protocol designed to provide reliable
end-to-end byte stream over an unreliable internetwork. An internetwork differs from a single
network because different parts may have wildly different topologies, bandwidths, delays,
packet sizes, and other parameters. This reliability is achieved at the expense of a very high
latency compared to UDP (Tanenbaum, 1996), a connectionless transport protocol that
provides an unreliable end-to-end datagram service. The BGA architecture is intended for
mobile robotics applications, where agents typically run in a single network, and response
times are very important. Since lowering the latency lowers the response time, it was decided
to re-implement the JATLite transport layer to support both shared memory communication
(between threads in a simple Java Virtual Machine) and UDP communications.
A BGA connection is a one-way stateless link between an agent and the blackboard,
implemented as a UDP socket (Comer and Stevens, 1993). Each connection is referred to by
the server-side address and UDP port. For a full-duplex transmission to occur between the
blackboard and an agent, two connections are needed. Thus, each connection is used to send
data from the client side to the server side of that connection. There is a facilitator process
(following KQML terminology) which is aware of the existence of agents and their respective
network addresses, and is in charge of maintaining a list of available connections. In BGA the
blackboard acts as the facilitator (Figure 5.15).
Figure 5.15: BGA transport layer infrastructure.
When the BGA system starts up, the blackboard process starts to listen on a given UDP
port, and each agent starts to listen on its own local UDP port as well. All the agents are
provided with the address and port on which the blackboard process is listening. When an
agent starts running, it should send a message (CONNECT) to the blackboard to indicate that
it is available. The blackboard process then updates its internal connections table with the
agent’s local UDP port, so that the blackboard can later send any answer to that agent. When
an agent stops running, it should send a message (DISCONNECT) to the blackboard to
indicate that it will no longer be available. There are two additional messages available to the
agents: sending data to the blackboard (PUT) and receiving data from the blackboard (GET).
These four types of BGA messages are implemented using the KQML (Labrou and Weber,
1993)(Labrou, 1997)(Labrou and Finin, 1997) performatives register, unregister, tell, and
ask-if. Both the tell and ask-if performatives need the specification of the ontology and the
language used for the content field. In this case, all the messages currently assume a given
ontology named ROBOT and a content description language called BGCL. The BG Content
Language (BGCL) is based on the s-expressions of the LISP language, which are balanced
parenthesis lists (it is the same syntax as KQML itself). The ROBOT ontology specifies the
semantics of the elements of the BGCL. Each element of a BGCL list should be a tuple
containing, at least, two elements: a variable name and its value.
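Since a BGCL content field is a flat list of (name value) tuples, a small converter suffices to move between the textual form and a variable table. The following Java sketch is hypothetical (the real BG interpreter has its own parser) and only covers the flat two-element tuples used by PUT and GET.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Converts between a BGCL content string such as
// "((motL 50.0) (motR -35.0))" and a name-to-value table.
public class Bgcl {
    private static final Pattern TUPLE =
        Pattern.compile("\\(\\s*([A-Za-z_][A-Za-z0-9_]*)\\s+(-?[0-9.]+)\\s*\\)");

    // Extract every (name value) tuple, preserving order.
    public static Map<String, Double> parse(String content) {
        Map<String, Double> vars = new LinkedHashMap<>();
        Matcher m = TUPLE.matcher(content);
        while (m.find())
            vars.put(m.group(1), Double.parseDouble(m.group(2)));
        return vars;
    }

    // Render a variable table back as a BGCL list.
    public static String format(Map<String, Double> vars) {
        StringBuilder sb = new StringBuilder("(");
        boolean first = true;
        for (Map.Entry<String, Double> e : vars.entrySet()) {
            if (!first) sb.append(" ");
            first = false;
            sb.append("(").append(e.getKey()).append(" ").append(e.getValue()).append(")");
        }
        return sb.append(")").toString();
    }
}
```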
The CONNECT message is implemented using the performative register. In the
example below (Listing 5.5), the agent named Agent-A wants to notify that it has just started
running, and thus it simply sends the register performative to the Blackboard process.
(register
    :sender     Agent-A
    :receiver   Blackboard
    :reply-with none )
Listing 5.5: CONNECT message.
The DISCONNECT message is implemented using the performative unregister. The
KQML standard specifies that every unregister performative should correspond to a
previously issued register performative. Agents are registered the first time they send a
register message to the facilitator (that is, the blackboard), and thus, the in-reply-to field is not
taken into account and can be left blank. When the facilitator receives an unregister
HUMBERTO MARTÍNEZ BARBERÁ
performative, it simply eliminates the sender from the connection list and makes no additional
processing. In the example below (Listing 5.6) the agent named Agent-A wants to end and
simply sends the unregister performative to the Blackboard process.
(unregister
    :sender      Agent-A
    :receiver    Blackboard
    :in-reply-to none )
Listing 5.6: DISCONNECT message.
The PUT message is implemented using the performative tell. This message is utilised
to modify the values of a list of variables on the blackboard. In the example below (Listing
5.7), the agent named Agent-A wants to modify two variables corresponding to the speed and
direction the motors should run. The agent forms a valid BGCL list with the names of the
variables and the desired values and then sends a tell performative to the blackboard including
that list in the content field.
(tell
    :sender    Agent-A
    :receiver  Blackboard
    :language  BGCL
    :ontology  ROBOT
    :content   ((motL 50.0) (motR -35.0)) )
Listing 5.7: PUT message.
The GET message is implemented using the performatives ask-if and tell. This message
is utilised to get the values of a list of variables from the blackboard. In the example below
(Listing 5.8), the agent named Agent-A wants to receive the current value of two variables
corresponding to the speed and direction the motors are running. The agent forms a valid
BGCL list with the names of the variables and dummy values, and then sends an ask-if
performative to the blackboard which includes that list in the content field. The blackboard
then replies using a tell performative, including in the content field the list of the required
variables that do exist with their corresponding values.
(ask-if
    :sender     Agent-A
    :receiver   Blackboard
    :reply-with id1
    :language   BGCL
    :ontology   ROBOT
    :content    ((motL 0.0) (motR 0.0)) )

(tell
    :sender      Blackboard
    :receiver    Agent-A
    :in-reply-to id1
    :language    BGCL
    :ontology    ROBOT
    :content     ((motL 50.0) (motR -35.0)) )
Listing 5.8: GET message sequence.
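Ignoring the network layer, the blackboard-side semantics of PUT and GET can be sketched as follows. MiniBlackboard and its method names are hypothetical, not the actual BGA classes.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the blackboard side of PUT and GET: tell overwrites
// variables, ask-if returns the subset of requested variables that
// actually exist on the blackboard, with their current values.
public class MiniBlackboard {
    private final Map<String, Double> table = new HashMap<>();

    // PUT: apply every (name value) tuple of the tell content.
    public void tell(Map<String, Double> content) {
        table.putAll(content);
    }

    // GET: for an ask-if content with dummy values, reply with the
    // current values of the variables that exist.
    public Map<String, Double> askIf(Map<String, Double> content) {
        Map<String, Double> reply = new HashMap<>();
        for (String name : content.keySet())
            if (table.containsKey(name))
                reply.put(name, table.get(name));
        return reply;
    }
}
```

Note that, as in the listing above, variables that do not exist are silently dropped from the reply rather than reported as errors.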
5.2.4 Conclusions
BGA relies on and takes advantage of various well-developed concepts. It is based on a
blackboard architecture that coordinates a multiagent system. Agents can be layered and
distributed according to their functionalities, including three-layer hybrid architecture
schemas, where the behaviour coordination is accomplished by a blending method derived
from the context-dependent blending. The main contributions of the BGA architecture are the
following:
• BGA effectively groups different concepts and paradigms of control architectures (namely
blackboard architectures, hybrid architectures, and multi-agent systems), extending and
combining their characteristics.
• BGA incorporates a new blending mechanism, based on context-dependent blending, that
effectively addresses the behaviour coordination problem.
• BGA makes use of KQML, which is typical of message-passing systems but not habitual
in blackboard systems, implemented over UDP for performance.
5.3 The BG programming language
One of the objectives of this research is to have a simple mobile robot performing
delivery tasks in an indoor, office-like environment, without prior knowledge of the floor
plan, and without any modification or special engineering of the environment. This objective,
among other things, means that methods are needed to develop the control software based on
different robot behaviours. The BGA architecture, defined in the previous section, provides a
good framework to develop such control software. To support the development of control
programs further, a high-level robot programming language, called BG, has been defined and
its corresponding interpreter has been built. The language fully supports the BGA architecture
and allows the user to develop BGA-based control systems in a friendly but powerful way.
This interpreter has been incorporated into a development environment, and it is also intended
for deployment.
This section discusses some related robot programming languages and development
environments. Then the BG language is described in detail, putting emphasis on its syntax and
its semantics, and presenting some examples. Finally, the development environment for the
BG language is described.
5.3.1 Antecedents
Traditionally, behaviour-based robotics researchers have avoided complex
control and data structures, in part because of the need for efficiency. More recently, a number of
C-like languages for robot programming have been developed (Horswill, 2000). The parallel
functional decision trees (PFDT) language (Schaad, 1998) is able to emulate many of the
popular behaviour-based programming constructs, including both parallel and serial control
structures. It is implemented as a C++ class library. The CES language (Thrun, 1998b)
extends C with probabilistic data types and support for gradient descent function
approximators. A particularly appealing property of CES is that training information need not
be specified at the output of the function approximator. Instead, desired values can be
provided for other variables that are computed from the output of the approximator. The
compiler does the appropriate symbolic manipulations to compute the gradient of the derived
quantity with respect to the weights of the approximator. TDL (Simmons and Apfelbaum,
1998) is a C++ extension that provides support for subgoal decomposition and
synchronisation. It includes a graphical front-end for the language that allows a form of visual
programming. MissionLab (MacKenzie and Arkin, 1997) is another visual tool, which is
intended to allow novice programmers to efficiently task robots, although it does not allow
the development of new low-level behaviours. COLBERT (Konolige, 1997) is another C-like
language for low-level reactive systems that is designed to interface with the Saphira
architecture (Konolige et al., 1997). COLBERT makes use of fuzzy rule bases for the
definition of behaviours and behaviour blenders. Frob (Peterson et al., 199) is an embedded
language in Haskell which can be used for reactive control loops. GOLOG (Levesque et al.,
1997) is a logic programming language that combines the automatic inference capabilities of
logic programming with the explicit control operations of imperative languages. GRL
(Horswill, 2000) is a language that extends traditional functional programming techniques to
behaviour-based systems, in a similar way as Frob. GRL is an architecture-neutral language
embedded within Scheme and provides a wide range of constructs for defining data flow
within communicating parallel control loops. A simple compiler distils the code into a single
while loop containing straight line code and then emits it as Scheme, C, C++ or BASIC code.
Both Frob and GOLOG provide considerably more expressiveness than GRL at the cost of
increased execution time.
Each one of the languages above has its own pros and cons. One of the main goals of
the BGA architecture is to tackle the behaviour blending problem and the learning of the rules
that drive the blending process. The two languages that are more interesting from the BGA
point of view are COLBERT and GRL. COLBERT uses a similar mechanism for behaviour
definition and blending, but it is intended for the implementation of the low-level software
only. GRL is a higher-level language that is flexible enough to implement different control
schemes. As new ideas and functionalities, especially the use of Java, needed to be tried, a
new high level programming language, named BG, was defined and implemented borrowing
ideas from COLBERT and GRL.
5.3.2 BG syntax and semantics
To support the development of BGA-based control systems, a high-level language,
called BG, along with its corresponding interpreter were specified and developed. The main
goal of BG is to aid in the development cycle of robotics applications by providing a single
and uniform framework in which to declare and specify all the elements of the BGA
architecture (behaviours, tasks, blenders and agents). Moreover, behaviour modelling is
supported by an extensive use of fuzzy logic. A BG program is made up of the definition of
some BGA agents and their interface with the blackboard. A BG program is intended to be
executed as a single process (which can contain different execution threads), and thus, a full
distributed BGA architecture is composed of many BG programs with each one running in a
different processor. BG, in its current version, incorporates many traditional elements and
control structures of classical imperative languages:
• Local and global variables.
• Expressions and assignments.
• Arithmetical, logical and relational operators.
• Conditionals.
• C-like functions.
BG supports two data types: real numbers and fuzzy sets with standard shapes. In
addition to that, BG presents advanced non-standard control structures like:
• Deterministic finite state machines.
• Fuzzy rule bases.
• Approximate reasoning support.
• Robot navigation support.
• Agents, tasks, behaviours and behaviour blending support.
The BNF notation is used to specify the BG syntax, which is largely based on the C
language syntax. In BNF, lowercase names represent non-terminal symbols and uppercase
names represent terminal symbols. The non-terminal number represents any real value (either
positive or negative) and the terminal ID represents any correctly defined identifier. In both
cases their syntaxes correspond to that of C. All terminal symbols except ID are considered
reserved keywords and thus they cannot be used as identifiers. Moreover, the character string
corresponding to a terminal symbol is that composed of the same letters, but in lowercase. For
instance, the terminal FLOAT corresponds to the string “float”.
BG supports the use of comments as in C (“/* */”) or in C++ (“//”) and they are used
along the different examples to clarify the meaning of some elements of the language.
5.3.2.1 Variables, operators and expressions
BG supports only real numbers and fuzzy sets. One design criterion of BG was to make it
as simple as possible by supporting only a minimal set of types and instructions without
losing generality. As integers can be viewed as a particular case of real numbers, only the
latter type is available. Variables are declared by specifying their type. The vdecl
non-terminal corresponds to a variable declaration (Listing 5.9). The FLOAT type corresponds to
real numbers coded using the IEEE 754 format, and they can be optionally initialised. Real
variables can be accessed to read or to modify their content and are only available in the
scope of a BG program. When these variables are not in the scope of a BG program (that is,
when they are used to access a BGA blackboard) they must be declared using modifiers.
Modifiers constrain the access to these variables in read or write operations to reduce the
communication overhead. Three types of modifiers are supported:
• Sensor: the variable is accessed in read-only mode.
• Effector: the variable is accessed in write-only mode.
• Status: the variable is accessed in read-write mode.
vdecl   ::= SET ID “=” ftype fuzzy “;”
          | FLOAT varlist “;”
          | SENSOR FLOAT idlist “;”
          | EFFECTOR FLOAT idlist “;”
          | STATUS FLOAT idlist “;”

ftype   ::= TRAPEZOID
          | SIGMOID
          | BELL
          | TRIANGLE
          | CRISP
          | TSK

fuzzy   ::= “[” number ( “,” number )* “]”

varlist ::= var ( “,” var )*
var     ::= ID | ID “=” number
idlist  ::= ID ( “,” ID )*
Listing 5.9: BNF for variable declaration.
The SET type corresponds to fuzzy sets that must be initialised. Five types of fuzzy sets
are supported, as well as TSK rule consequents:
• Trapezoid. Corresponds to fuzzy sets with trapezoidal shape (Figure 5.16) defined by four
points [a, b, c, d]. It is mandatory to specify all four points so that they form a valid
trapezoid, although b and c are allowed to be coincident (this way a triangle is obtained).
• Triangle. Corresponds to fuzzy sets with triangular shape (Figure 5.16) defined by three
points [a, b, c]. It is mandatory to specify all three points so that they form a valid
triangle.
• Crisp. Corresponds to crisp sets (Figure 5.16) defined by a single number [a]. It is
mandatory to specify just a single point.
• Sigmoid. Corresponds to fuzzy sets with sigmoidal shape (Figure 5.17) defined by two
points [a, b]. It is mandatory to specify both points so that they form a valid sigmoidal
function (Eq. 5.3).
• Bell. Corresponds to fuzzy sets with bell-like shape (Figure 5.17) defined by three points
[a, b, c]. It is mandatory to specify all three points so that they form a valid bell-like
function (Eq. 5.4).
• TSK. Corresponds to TSK rule consequents defined by points [x0, x1, …, xn] so that x0 is
the independent term and x1 … xn are the coefficients corresponding to the n input
variables. It is mandatory to specify enough points so that they form a valid TSK
consequent (see chapters 1 and 3) at run-time, that is, the number of points is checked
only when executing the rules.
Figure 5.16: Trapezoid, triangle and crisp fuzzy sets.
Figure 5.17: Bell and sigmoid fuzzy sets.
µ(x) = 1 / (1 + e^(−b(x − a)))        (5.3)

µ(x) = 1 / (1 + |(x − c) / a|^(2b))        (5.4)
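As a sketch, the sigmoid (Eq. 5.3) and bell (Eq. 5.4) functions, together with the trapezoid, can be evaluated as follows in Java. This is a hypothetical rendering for illustration, not the interpreter's own code.

```java
// Membership functions following the [a, b], [a, b, c] and [a, b, c, d]
// point lists of the SET declarations above.
public class Membership {
    // Eq. 5.3: mu(x) = 1 / (1 + exp(-b * (x - a)))
    public static double sigmoid(double x, double a, double b) {
        return 1.0 / (1.0 + Math.exp(-b * (x - a)));
    }

    // Eq. 5.4: mu(x) = 1 / (1 + |(x - c) / a|^(2b))
    public static double bell(double x, double a, double b, double c) {
        return 1.0 / (1.0 + Math.pow(Math.abs((x - c) / a), 2.0 * b));
    }

    // Trapezoid [a, b, c, d]: 0 outside (a, d), 1 on [b, c], linear in
    // between; b == c yields a triangle, as noted above.
    public static double trapezoid(double x, double a, double b, double c, double d) {
        if (x < a || x > d) return 0.0;
        if (x >= b && x <= c) return 1.0;
        if (x < b) return (x - a) / (b - a);
        return (d - x) / (d - c);
    }
}
```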
Real variables can be operated with standard arithmetical (+, -, *, /), logical (&&, ||, !),
and relational operators (==, !=, <, >, <=, >=). The logical and relational operators are
always evaluated and give the result as a real number (1.0 for true and 0.0 for false).
Additionally, real variables and fuzzy sets can be operated with standard fuzzy operators
(&&, ||, !, IS) that represent t-norm, t-conorm, inverse and degree of membership,
respectively. By default, the semantics of the t-norm and t-conorm are the min and the max,
respectively. Variables, sets, numbers and operators can be combined to form expressions.
The expression non-terminal corresponds to a real variable expression (Listing 5.10), and the
fexpression non-terminal corresponds to a fuzzy expression (Listing 5.11).
expression ::= expression “+” expression
             | expression “-” expression
             | expression “*” expression
             | expression “/” expression
             | “!” expression
             | expression “&&” expression
             | expression “||” expression
             | expression “==” expression
             | expression “!=” expression
             | expression “<” expression
             | expression “>” expression
             | expression “<=” expression
             | expression “>=” expression
             | “(” expression “)”
             | ID
             | number
             | function
Listing 5.10: BNF for real expressions.
fexpression
::= fexpression “&&” fexpression
| fexpression “||” fexpression
| “!” fexpression
| fexpression IS fexpression
| “(” fexpression “)”
| ID
| number
| set fuzzy
| function
Listing 5.11: BNF for fuzzy expressions.
5.3.2.2 Commands and functions
The BG language, like C, allows the use of classical control structures such as assignments
and conditionals, as well as finite state machines, fuzzy rule bases and navigation support
methods. These structures are called commands or instructions, and they can be grouped in
the so-called blocks. A block is a set of commands that are executed sequentially. The block
non-terminal corresponds to a block declaration, which is composed of one or many command
non-terminal symbols (Listing 5.12).
block   ::= “{” (command)+ “}” | command

command ::= IF “(” expression “)” block
          | IF “(” expression “)” block ELSE block
          | ID “=” expression “;”
          | PRINTF “(” ID “)” “;”
          | HALT “;”
          | function “;”
          | rules
          | grid
          | fsm
          | SHIFT ID “;”
          | RETURN expression “;”
Listing 5.12: BNF for commands.
Conditionals (IF and IF-ELSE) and assignments (“=”) are similar to their counterparts in
C. Additionally, the PRINTF command outputs the value of the variable that is its argument,
and the HALT command stops the execution of the BG interpreter, so that no more actions
are sent to the robot. The special-case fsm, grid and rules non-terminals are discussed below.
funcdecl ::= FUNCTION ID “(” idlist “)” block
function ::= ID “(” idlist “)”
Listing 5.13: BNF for functions.
BG supports the use of simple user-defined functions that are global to the different BG
modules, which correspond to the elements of the BGA architecture (agents, behaviours and
tasks). The implicit return type of BG functions is real numbers and the actual value that a
function returns is specified using the RETURN command (Listing 5.12). The funcdecl
non-terminal corresponds to the declaration of a BG function and the function non-terminal
corresponds to the call of a BG function (Listing 5.13). The parameters of a BG function are
also explicit and must be real values. If no return command appears, BG functions
return zero by default.
5.3.2.3 Finite state machines
The use of finite state machines (FSM) allows the programmer to sequence tasks and
actions, and define the conditions to change from one task to another. The BG language, in
its current implementation, does not have instructions for loops (while, for, or repeat-until),
but these can be modelled using finite state machines. The main reason for this lack of loops
is to keep the duration of each control loop under control (so that an error cannot lock it). In
the future, loops will be incorporated which, as in COLBERT (Konolige, 1997), are
internally finite state machines. Another use of FSMs is the implementation of simple
planners that merely chain simple actions. The fsm non-terminal corresponds to the
declaration of an FSM (Listing 5.14). Each state has a unique identifier and a block of
commands that are executed continually while the state is active. The SHIFT command
(Listing 5.12) can be used to change the current state of the FSM. The initial state of an FSM
is indicated in its definition (Listing 5.14).
fsm    ::= FSM START ID “{” states “}”
states ::= ( STATE ID “:” block )+
Listing 5.14: BNF for finite state machines.
For example, a robot has to move from a home location (home_x, home_y) to given
goal coordinates (gx, gy), and then return home. It features a sensor to identify its position
(x, y), and a function (ingoal) that determines if the robot is in its current goal. The task
supervision could correspond to the following sample BG code (Listing 5.15), with its
corresponding state machine (Figure 5.18). This graph uses ellipsoids to represent states, a
thick arrow to represent the initial state and thin arrows to represent transitions. Each thin
arrow is labelled with the symbol that triggers the corresponding transition. A special symbol
(“*”) is used as a wildcard and denotes every possible input symbol except those
corresponding to the other thin arrows leaving the state. In the example below, the FSM starts
in the HOME state, which sets some variables in order to establish the home and goal
coordinates and then shifts to the LOOKING state. This state executes its code (labelled as
“user code”) repeatedly until the ingoal variable is set to 1.0 (that is, the robot is in the goal
location). When this occurs, some variables are modified in order to replace the goal
coordinates by the home position and then CHANGING is set as the new active state. This
state is executed until the ingoal is set to 0.0 (that is, the robot no longer is in the goal
location), and then it shifts to the RETURNING state. This state is executed until the ingoal is
set to 1.0 (that is, the robot is in the home location), and then it shifts to the GOAL state. This
state simply sets a flag variable to indicate that the task has been accomplished.
Figure 5.18: Sample FSM-based planner.
fsm start HOME
{
    state HOME:
    {
        goal   = 0.0;          // Goal not reached
        home_x = x;            // Set origin coordinates
        home_y = y;
        goal_x = gx;           // Set goal coordinates
        goal_y = gy;
        shift LOOKING;
    }
    state LOOKING:
    {
        /* User code */
        if (ingoal == 1.0)
        {
            goal_x = home_x;   // Change goal coordinates
            goal_y = home_y;
            shift CHANGING;
        }
    }
    state CHANGING:
    {
        /* User code */
        if (ingoal == 0.0)
            shift RETURNING;
    }
    state RETURNING:
    {
        /* User code */
        if (ingoal == 1.0)
            shift GOAL;
    }
    state GOAL:
    {
        goal = 1.0;            // Goal reached
        halt;
    }
}
Listing 5.15: Example of an FSM.
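The run-time behaviour of such an FSM — the active state's block runs once per control cycle, and shift merely records the next state — can be mirrored by the following Java sketch. Names follow the BG example, and the ingoal condition is fed from outside instead of a real sensor; this is an illustration, not interpreter code.

```java
// Java sketch of the goal-seeking FSM: one step() call per control
// cycle, executing the block of the currently active state.
public class GoalFsm {
    enum State { HOME, LOOKING, CHANGING, RETURNING, GOAL }

    State state = State.HOME;
    double goal = 0.0;

    public void step(boolean ingoal) {
        switch (state) {
            case HOME:      state = State.LOOKING; break;           // set coordinates, shift
            case LOOKING:   if (ingoal)  state = State.CHANGING;  break;
            case CHANGING:  if (!ingoal) state = State.RETURNING; break;
            case RETURNING: if (ingoal)  state = State.GOAL;      break;
            case GOAL:      goal = 1.0; break;                      // task accomplished
        }
    }
}
```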
5.3.2.4 Fuzzy rule bases
Fuzzy rule bases are intended for both control and behaviour blending (see previous
section on BGA architecture), and are one of the most noticeable features of the BG language.
BG fuzzy rules can be constructed by using any combination of BG fuzzy operators (t-norm,
t-conorm, inverse, and degree of membership), variables and fuzzy sets to define the
antecedent, and by using any combination of variables and fuzzy sets to define the
consequent. In this way BG supports multiple-input multiple-output (MIMO) fuzzy rules, as
well as any combination of input and output variables. The rules non-terminal corresponds to
a fuzzy rule base (Listing 5.16), which is composed of a set of fuzzy rules (the IF terminal)
and also of a special kind of rules, called background rules (the BACKGROUND terminal). A
background rule represents a fuzzy rule whose antecedent always evaluates to the same fixed
degree of membership, that of the parameter. These rules are useful to cover undefined input
space. This is typically achieved by specifying a very low degree of membership, which
dominates the output if no other rule fires and otherwise affects the output without any
significance. Background rules play an important role when used in conjunction with
automated learning systems, because these methods usually tend to leave uncovered input
space regions.
In the current implementation, the t-norm and t-conorm operators can only be changed
globally for all the fuzzy rule bases. The examples in this thesis assume that the min and the
max operators are used as t-norms and t-conorms, except where explicitly indicated. Finally,
the center of gravity method is used for defuzzification.
rules    ::= RULES “{” (rule)+ “}”
rule     ::= IF “(” fexpression “)” postlist “;”
           | BACKGROUND “(” expression “)” postcond “;”
postlist ::= postcond ( “,” postcond )*
postcond ::= ID
           | ID IS ID
           | ID IS ftype fuzzy
           | ID IS number
Listing 5.16: BNF for fuzzy rule bases.
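The centre of gravity defuzzification mentioned above can be sketched over a sampled output universe, with the default min/max choices for t-norm and t-conorm. This is an illustrative implementation with hypothetical names, not the interpreter's code.

```java
// Centre-of-gravity defuzzification: the clipped memberships of all
// fired consequents are max-combined, then the weighted mean of the
// sampled universe is returned.
public class Defuzzifier {
    // Membership function of one consequent set.
    public interface Set { double mu(double x); }

    // degrees[i] is the firing strength of rule i (t-norm of its
    // antecedent); sets[i] is its consequent set.
    public static double cog(double lo, double hi, int samples,
                             double[] degrees, Set[] sets) {
        double num = 0.0, den = 0.0;
        for (int s = 0; s <= samples; s++) {
            double x = lo + (hi - lo) * s / samples;
            double mu = 0.0;
            for (int i = 0; i < degrees.length; i++)                      // t-conorm = max,
                mu = Math.max(mu, Math.min(degrees[i], sets[i].mu(x)));   // clipping = min
            num += x * mu;
            den += mu;
        }
        return den == 0.0 ? 0.0 : num / den;
    }
}
```

A background rule fits this scheme naturally: it contributes a constant low degree everywhere, which only dominates when no other rule fires.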
For instance, if a robot has to maintain a given distance to a wall on its right, the
following BG code performs the task (Listing 5.17). Right is the input variable that measures
the distance from the robot to the wall, which can be small (fuzzy sets CLOSE and NEAR),
medium (fuzzy set MED), or large (fuzzy set FAR). Turn is the output variable that controls
whether the robot should steer to the left (fuzzy sets TSL and TL), to the right (fuzzy set TSR), or
should continue straight ahead (fuzzy set TN).
// Sets for ToF sensor range [0 .. 10] (metres)
set CLOSE = trapezoid {0.0, 0.0, 0.17, 0.2};
set NEAR  = trapezoid {0.17, 0.2, 0.3, 0.4};
set MED   = trapezoid {0.3, 0.4, 1.3, 1.4};
set FAR   = trapezoid {1.3, 1.4, 10.0, 10.0};

// Sets for steering [-10 .. 10]
set TL    = trapezoid {-10.0, -9.0, -6.0, -5.0};  // Left
set TSL   = trapezoid {-6.0, -5.0, -2.0, -1.0};   // Small left
set TN    = trapezoid {-2.0, -1.0, 1.0, 2.0};     // None
set TSR   = trapezoid {1.0, 2.0, 5.0, 6.0};       // Small right
set TR    = trapezoid {5.0, 6.0, 9.0, 10.0};      // Right

rules
{
    if (right is FAR)   turn is TSR;  // Turn small right
    if (right is MED)   turn is TN;   // No turn
    if (right is NEAR)  turn is TSL;  // Turn small left
    if (right is CLOSE) turn is TL;   // Turn left
}
Listing 5.17: Example of a fuzzy rule base.
5.3.2.5 Robot navigation support
The BG language provides a basic structure and methods to manage two-dimensional
spatial information for constructing maps and for path planning. The grid non-terminal
corresponds to a grid map structure (Listing 5.18) that consists of a series of square cells
arranged in an N by M matrix. Each cell has two values associated: the degree of certainty that
the cell is empty eij, and the degree of certainty that the cell is occupied wij. These values are
calculated and fused for each cell in the beam of a single sensor to produce the final M(i,j)
values that are used for navigation purposes. A detailed description of the map-building and
path-planning processes is given in chapter 4. When a grid is initialised, the user provides
some parameters: the horizontal (N) and vertical (M) size of the grid map (the left-bottom cell
serves as the origin of coordinates), the length in metres of the cell sides, and the path-planning method. Currently BG supports two types of path-planning methods, which produce
as a result the absolute heading that the robot should follow to reach the goal:
• A*. An A* algorithm uses the M(i,j) values of the grid to search for the shortest path from
the current robot position to the goal. This process is time consuming and obtains the full
free-of-collision path to the goal. This mode is used when the A_STAR terminal is
specified.
• Direct. This method calculates the heading to the goal from the current robot position
without taking into account the grid map, which is useful to implement completely
reactive goal-seeking behaviours. This process is very fast and merely obtains the heading
to the goal. This mode is used when the DIRECT terminal is specified.
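The Direct mode reduces to a single trigonometric step, sketched below in Java (an illustrative rendering, not the interpreter's code):

```java
// Direct path-planning mode: the absolute heading to the goal from the
// current position, ignoring the grid contents entirely.
public class DirectPlanner {
    // Heading in radians, counter-clockwise from the positive x axis.
    public static double heading(double x, double y, double gx, double gy) {
        return Math.atan2(gy - y, gx - x);
    }
}
```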
Each time the grid structure is processed, it can be updated with the following
information:
• The current absolute location (x, y) of the robot, using the LOCATION grid command
(Listing 5.18).
• The current absolute location (x, y) of the goal place, using the GOALPOS grid command
(Listing 5.18).
• The absolute location (x, y) of a possible obstacle, using the OBSTACLE grid command
(Listing 5.18).
• A relative sensor reading (x, y, sensor, sensor_value) from the given absolute location,
using the UPDATE grid command (Listing 5.18).
grid     ::= GRID gmode UPDATES ID gbody
gmode    ::= A_STAR
           | DIRECT
gbody    ::= CELLS number BY number SIDE number gblock
           | CELLS number SIDE number gblock
           | gblock
gblock   ::= “{” (gcommand)+ “}”
gcoord   ::= ID “,” ID
gcommand ::= IF “(” expression “)” gaction
           | LOCATION “(” gcoord “)” “;”
           | gaction
gaction  ::= GOALPOS “(” gcoord “)” “;”
           | OBSTACLE “(” gcoord “)” “;”
           | UPDATE “(” gcoord “,” ID “,” number “,” ID “)” “;”
Listing 5.18: BNF for grid map structures.
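A minimal Java sketch of the grid structure and the coordinate mapping behind these commands follows. Names are hypothetical, and the certainty fusion of e_ij and w_ij described in chapter 4 is reduced here to a plain occupancy mark.

```java
// N-by-M matrix of square cells with one occupancy value per cell,
// plus the absolute-coordinate-to-cell mapping used by the
// location/goalpos/obstacle commands.
public class GridMap {
    final int n, m;          // number of cells per axis
    final double side;       // metres per cell side
    final double[][] occ;    // occupancy per cell

    public GridMap(int n, int m, double side) {
        this.n = n; this.m = m; this.side = side;
        this.occ = new double[n][m];
    }

    // Map an absolute (x, y) position to a cell index; the left-bottom
    // cell serves as the origin of coordinates.
    public int cellX(double x) { return (int) Math.floor(x / side); }
    public int cellY(double y) { return (int) Math.floor(y / side); }

    // OBSTACLE command: mark the cell containing (x, y) as occupied.
    public void obstacle(double x, double y) {
        int i = cellX(x), j = cellY(y);
        if (i >= 0 && i < n && j >= 0 && j < m) occ[i][j] = 1.0;
    }
}
```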
An example of a grid map generated in a robot run is shown below (Figure 5.19). A
circular robot navigates in an indoor, office-like floor plan. It has ten sonar range sensors
(sonar0, …, sonar9) at different angles, each one identified by a number (0, …, 9). In each
control loop cycle, the current robot location and the current goal are updated. If a sonar
reading is less than a given threshold (5 metres) the grid is updated with the reading;
otherwise it is discarded. Finally, the variable heading is updated using the output of the A*
path-planner and the values of the cells that form the map are updated (Listing 5.19). It is
interesting to note here that “a_star” refers to the path-planning method, “updates heading”
refers to the variable that is updated with the result of the path-planning, and “update ()”
refers to the command to update the grid map.
Figure 5.19: Example of a BG grid map and the path obtained.
grid a_star updates heading cells 71.0 by 71.0 side 0.05
{
    location (x, y);
    goalpos (gx, gy);

    if (sonar0 < 5.0) update (x, y, alpha, 0, sonar0);
    if (sonar1 < 5.0) update (x, y, alpha, 1, sonar1);
    if (sonar2 < 5.0) update (x, y, alpha, 2, sonar2);
    if (sonar3 < 5.0) update (x, y, alpha, 3, sonar3);
    if (sonar4 < 5.0) update (x, y, alpha, 4, sonar4);
    if (sonar5 < 5.0) update (x, y, alpha, 5, sonar5);
    if (sonar6 < 5.0) update (x, y, alpha, 6, sonar6);
    if (sonar7 < 5.0) update (x, y, alpha, 7, sonar7);
    if (sonar8 < 5.0) update (x, y, alpha, 8, sonar8);
    if (sonar9 < 5.0) update (x, y, alpha, 9, sonar9);
}
Listing 5.19: Example of a grid-based processing.
5.3.2.6 Agents, behaviours and tasks definition
BG was designed to serve as the development language of BGA architectures, and thus,
it supports the definition of the different elements of a BGA architecture: agents, tasks,
behaviours and blenders. The agent non-terminal corresponds to an agent (Listing 5.20)
which can be reactive or deliberative. The former is specified using the REACTIVE keyword.
A deliberative agent is declared by specifying an optional common block (the common
non-terminal) and a series of tasks (the task non-terminal). A reactive agent is a little more
complicated: it is declared by specifying an optional common block (the common
non-terminal), a series of behaviours (the behaviour non-terminal), and an optional blender
block (the blender non-terminal). All the different processing blocks (namely common, task,
behaviour, and blender blocks) have in common that they can make use of local real variables
(local fuzzy sets are not supported), and these can be used to store values that are not used
elsewhere along a BG program. In addition, behaviours can optionally specify a weight,
which is used in the BGA behaviour fusion procedure (see previous section). If the weight is
not specified, it is set to 1.0 by default. When using a BG program in conjunction with
learning techniques to develop a certain behaviour blender, reactive agents can specify what
input variables will be used inside the blender block along with their respective ranges. This
way the learning method can limit the search space (see chapter 7 for an in depth discussion).
The blendl non-terminal corresponds to a list of real variables that are used in the blender
block.
agent     ::= AGENT ID REACTIVE agbody
            | AGENT ID “{” common (task)+ “}”
agbody    ::= “{” blendl common (behaviour)+ blender “}”

blendl    ::= BLENDING blenv ( “,” blenv )* “;”
blenv     ::= ID RANGE “(” number “,” number “)”
            | ID

local     ::= ( FLOAT varlist “;” )?
common    ::= ( COMMON “{” local (command)+ “}” )?
blender   ::= ( BLENDER “{” local (command)+ “}” )?

behaviour ::= BEHAVIOUR ID ( WEIGHT number )? behbody
behbody   ::= “{” local (command)+ “}”

task      ::= TASK ID “{” local (command)+ “}”
Listing 5.20: BNF for agent definition.
A simple reactive agent definition is shown below (Listing 5.21). The agent has three behaviours: one that avoids obstacles (avoid), one that aligns the robot to the left wall (alignL), and one that aligns the robot to the right wall (alignR). These behaviours, based on sensory data (left, right, and front, the input variables that represent the distance to objects on the left, right and front respectively), propose control commands (turn and speed). The blender block specifies how the output of these behaviours is fused in each time step (Figure 5.20), based on its sensory input (left, right, and front). The example shows how these elements are specified using BG. The agent is specified to be reactive (“reactive” keyword), and thus it is composed of a common module, some behaviours and a blender module. The actual implementation of the behaviours and the common module is not shown in the example. The
blender simply contains a fuzzy rule base (“rules” keyword), where the consequents are the behaviours previously mentioned. The declaration of the fuzzy sets is not shown; they should be declared before the agent declaration. The fuzzy sets that appear in the antecedents are similar to those of the previous examples (Listing 5.17), while the fuzzy sets that appear in the consequents (LOW, HALF, and HIGH) represent low, medium and high degrees of activation.
HUMBERTO MARTÍNEZ BARBERÁ
Figure 5.20: Result of the behaviour fusion.
agent simpleControl reactive
{
    blending left, right, front;

    common
    {
        /* Code goes here */
    }

    behaviour avoid weight 1.0
    {
        /* Code goes here */
    }

    behaviour alignL weight 0.5
    {
        /* Code goes here */
    }

    behaviour alignR weight 0.5
    {
        /* Code goes here */
    }

    blender
    {
        rules
        {
            background (0.01) avoid  is LOW;
            background (0.01) alignR is LOW;
            background (0.01) alignL is LOW;

            if (left is CLOSE)  avoid is HIGH;
            if (front is CLOSE) avoid is HIGH;
            if (right is CLOSE) avoid is HIGH;

            if (left is NEAR)   avoid is HALF;
            if (front is NEAR)  avoid is HALF;
            if (right is NEAR)  avoid is HALF;

            if (left is MED)    alignL is HALF;
            if (right is MED)   alignR is HALF;

            if (left is FAR)    alignL is HIGH;
            if (right is FAR)   alignR is HIGH;
        }
    }
}
Listing 5.21: Example of a reactive agent definition.
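The behaviour weights declared above, together with the activation degrees (LOW, HALF, HIGH) inferred by the blender, govern how much each behaviour's proposed command contributes to the fused one. As a rough illustration only, a weighted-average sketch in Python; the helper name `blend` is hypothetical, and the exact BGA fusion formula is the one described in the previous section:

```python
def blend(proposals):
    """Fuse behaviour proposals for one output variable (e.g. turn).

    proposals: list of (activation, weight, command_value) triples, where
    activation is the blender's inferred degree for the behaviour and weight
    is the behaviour's declared weight. Returns the weighted average.
    """
    num = sum(a * w * v for a, w, v in proposals)
    den = sum(a * w for a, w, _ in proposals)
    return num / den if den else 0.0

# Two equally active, equally weighted behaviours pulling in opposite
# directions cancel out; an inactive behaviour contributes nothing.
```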
5.3.2.7 BG programs definition
A BG program is simply a collection of agents and variables. The program non-terminal corresponds to a valid BG program (Listing 5.22), composed of a variable declaration section, a function declaration section, an initialisation block and an agent declaration section.
The initialisation block (the initializ non-terminal) is executed only once when the BG
program is started and it is intended to perform whatever initialisation is needed.
program   ::= (vdecl)* (funcdecl)* initializ (agent)+
initializ ::= ( INITIALIZATION “{” (command)+ “}” )?
Listing 5.22: BNF for BG programs.
5.3.3 The BGen development environment
The BG language interpreter has been developed using the Java language. The BG
parser has been produced using the Java CUP 5 parser generator. Java CUP is a system for
generating LALR parsers from a grammar specification. It serves the same role as the widely
used program YACC (Johnson, 1975), which uses C instead of Java. The BG interpreter is
composed of two modules: the BG parser and the BG executor. The BG parser takes as input
a file containing a BG program and then it proceeds with the lexical (using a custom-made
finite state lexical parser) and syntactical analysis (using the CUP parser). As the BG
language is simple, the semantic analysis (basically just type checking) is performed
simultaneously with the syntactical analysis. When a valid BG program has been parsed, the
module produces intermediate code for later execution. The BG executor takes the intermediate code as input, performs no additional checking, and interprets the code directly (working, in fact, as a transliterator). The intermediate code is stored as a directed graph, whose nodes
correspond to different elements of the BG language (variables, expressions, commands,
behaviours, tasks and agents). This way, the time-consuming part of the interpretation process
(the lexical and syntactical parsing) is decoupled from the execution part for efficiency
reasons. Although the BG executor is programmed using Java, which in fact is also an
interpreted language, BG programs can be executed successfully on i486 processors.
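The decoupling of parsing from execution can be pictured with a tiny expression-graph sketch (hypothetical classes, not the actual BGen implementation): the parser builds the node graph once, and the executor then evaluates it on every control cycle without re-parsing.

```python
class Num:
    """Constant leaf node of the intermediate-code graph."""
    def __init__(self, value):
        self.value = value
    def eval(self, env):
        return self.value

class Var:
    """Variable leaf node; resolved against the execution environment."""
    def __init__(self, name):
        self.name = name
    def eval(self, env):
        return env[self.name]

class Add:
    """Interior node for an addition expression."""
    def __init__(self, a, b):
        self.a, self.b = a, b
    def eval(self, env):
        return self.a.eval(env) + self.b.eval(env)

# Built once by the parser...
expr = Add(Var("left"), Num(0.5))
# ...then evaluated cheaply on each cycle with fresh sensor data in env.
```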
The BG language interpreter has been incorporated into a development environment,
called BGen, which also integrates a simulator for different customised holonomic robots,
data visualisation and recording tools, and a behaviour learning tool. The BGen environment
is intended not only for development but also for deployment. In the development stage the
user defines and specifies a BG program that reflects a particular BGA architecture and then
simulates the overall behaviour of the robot to test the feasibility of the solution. Once the BG
program is ready for deployment, it is transferred to the real robot, which contains the BGen
environment but uses a robot interface module instead of the simulator. This is possible
because all the software has been developed in Java; it also implies that the real robot must include a Java virtual machine, which is available for most common operating systems. The
robots used in this thesis to test the BGen environment run the Linux operating system. This
approach for development and deployment presents some advantages:
• Both the development and deployment environments are identical.
• The development to deployment time is considerably reduced.
• Different platforms can reuse the code, except for the lowest-level functions.

5. http://www.cs.princeton.edu/~appel/modern/java/CUP
Figure 5.21: BGen edition and component testing modules.
BGen consists of different modules, and the most important are described below. The
edition module (Figure 5.21) contains a simple text editor and some tools to aid in the process
of writing BG programs. One of these tools presents the structure of a BG program
graphically and another tool serves to test functions, individual behaviours or complete agents
by computing their output when input variables vary inside a given range. In the example, the
“Controller Graph” window displays the controller as a tree, where the root is the controller
itself. The first level corresponds to high-level module categories, namely functions and
agents. The second level in the hierarchy corresponds to the current elements that are inside
these categories. In this case there are four functions (min, max, sgn, and limits) and four
agents (VehicleControl, Localization, Planning, and ReactiveControl). The third level
corresponds to the different sub-elements of the elements that appear in the second level. In
this case they are the leaf nodes of the agents. The “Component Test Options” window lets
the user select a component, two variables and a range of values. Using these values the
component is evaluated and the corresponding control surface is displayed in the “Component
Test Output” window.
Figure 5.22: BGen simulation module.
The robot simulation module (Figure 5.22) simulates holonomic mobile robots and
executes BG programs. It graphically displays the robot’s behaviour in a given environment,
the map being constructed by the robot, and also the current behaviour activation. It shows
additionally the values of the robot’s sensors, the robot’s path, the number of collisions and
epochs, and the shortest path from the robot to the goal. In the example, the “Robot
Simulation Environment” window lets the user specify the parameters of the simulation, that is, the actual map of the environment, the robotics platform being simulated, and the BG
control program. The “Simulated World” window displays the robot, the map, and the robot’s
sensor values. The “Behaviour Blending” window displays the degree of activation of the
different behaviours implemented in the BG program at each time step. The “Occupancy
Grid” window has two tabs: one that displays the grid map generated by the BG program and the path-planning result, with the real map superimposed, and another that displays the 3D cost function used by the path-planning method specified in the BG program. Additionally,
windows that display the values of the effector variables of the BG program can also be
activated and shown.
Figure 5.23: BGen sensor simulation and map-building module.
The sensor simulation and map-building module (Figure 5.23) simulates sonar and
infrared sensors, and serves as a testbed for different data fusion algorithms and map-building
techniques. In the example, the “Sensor Simulation Environment” window lets the user
specify the map, the robot, and the path that the robot will follow. It also allows the selection
of parameters related to the simulation: the simulation method used for the sensors, the
number and type of sensors to be simulated, the filters to be applied to the sensors, and the
sensor fusion method that is used. When a simulation is run, the module displays the values of
the sensors, the path of the robot, some performance measures (for instance, the root mean square error of the sensor fusion method), and optionally the internal data used to
simulate sensors (for instance, the sound paths in the case of sonar sensors). The “Occupancy
Grid” window displays the grid map generated during the robot’s path, and lets the user
change the way cells are updated.
5.3.4 Conclusions
BG is a high-level robot programming language that fully supports the BGA
architecture and thus, it allows the user to develop BGA based control systems in a friendly
but powerful way. BG aids in the development cycle of robotics applications by providing a
single uniform framework to declare and specify all the elements of the BGA architecture
(behaviours, tasks, blenders and agents), and by supporting behaviour modelling via extensive
use of fuzzy logic. In addition, a fully distributed BGA architecture can be achieved by running
BG programs in different processors.
The BGen development environment incorporates a BG language interpreter, a simulator for different customised holonomic robots, data visualisation and recording tools, and a behaviour learning tool. The BGen environment is intended both
for development and deployment, and provides a simple generic way to develop and test BG
programs off-robot, and to execute such programs in a real robot. Thus, the development to
deployment time is considerably reduced, and different platforms can reuse the code.
Chapter 6
A Sample Agent Architecture
6.1 Introduction
Once an architecture (BGA), a programming language (BG) and a development
environment (BGen) are available, the natural next step is to use them in an application with the Quaky-Ant robot (see chapter 2). The robot’s task is to navigate from a given home position
to a goal location, and then return to the initial position, in a completely unknown
environment, but similar to a real indoor floor plan (with different rooms and corridors). The
coordinates of the two places are known beforehand. In addition, the robot should neither
collide nor get frozen (or get stuck in an infinite loop). To accomplish this goal the robot must
be able to perform certain tasks: navigation with obstacle avoidance, goal seeking, map
building and localisation, some form of high level planning, etc. The whole system has been
decomposed, by extracting the most important elements (Armstrong et al., 1998), into the
following series of agents (Figure 6.1) which will be described in the next subsections:
• Reactive Control. This is the BGA reactive agent that is in charge of the reactive navigation. It implements a series of behaviours to avoid collisions, follow a given path, align to the walls, and escape from U-shaped obstacles.
• Planning. This is the BGA deliberative agent that is in charge of the high level planning and the global goal supervision.
• Navigation. This is the BGA deliberative agent that is in charge of building a model of the environment, locating the robot in it, and generating feasible paths to reach the goal.
• Sensor Processing and Actuator Control. These are the BGA elements that are in charge of processing raw sensor data and controlling the motors. The former accesses raw sensor signals, filters and processes them, and puts relevant data on the blackboard. The latter reads motor commands from the blackboard and sends the corresponding control actions to the motors.
CHAPTER 6: A SAMPLE AGENT ARCHITECTURE
Figure 6.1: Sample agent architecture.
6.2 Sensor processing and actuator control
The sensor processing module accesses raw sensor signals, filters and processes them,
and puts relevant data on the blackboard. Most parts of this module are platform dependent,
because they communicate with the hardware platform, although they have been encapsulated
to provide a uniform interface across different platforms. The basic functionalities of this
module are the following:
• Sonar and infrared range sensors data acquisition and fusion.
• Encoders data sampling and processing.
• Collision detection.
• Absolute positioning and compass sensors data acquisition.
The sensor fusion method, which is platform independent and user selectable, is based
on a set of fuzzy rules that combine information from the sonar and infrared sensors to
produce a feasible range. This method has been described in depth in chapter 3. The sensor
processing module writes the following information on the blackboard continuously:
• Range sensors: information provided by the fusion method (virtu0 – virtu9).
• Collision sensors: information provided by the touch sensors (bump0 – bump4).
• Odometry: information provided by the encoders (x, y, alpha).
• Absolute position: information provided by the GPS (gps_x, gps_y).
• Absolute heading: information provided by the compass (compass).
• Goal detection: flag to indicate that the goal (gx, gy) has been reached (ingoal).
In the current example, the collision sensors, the compass and the GPS are not used, but they are in other applications (most notably the one presented in chapter 8).
The actuator control module reads motor commands from the blackboard and sends the
corresponding control actions to the motors. As occurs with the sensor processing module,
most parts of this module are platform dependent and have been encapsulated as well. The
Quaky-Ant robot presents differential steering, and thus the motor module accepts commands
of the form <motL, motR>, where motL and motR correspond to the power applied to the left
and right motors respectively. To allow for simple and intuitive programs, the tuple <speed,
turn> is used to decouple the robot’s speed and steering angle, instead of using motL and
motR directly. The reason for decoupling these variables is that developers usually feel more comfortable with decoupled speed and turning rate than with differential motor commands.
These four variables are related by the following equations:
    ml = 2 · turn + 10      if turn < 0
    ml = 10                 if turn ≥ 0                                  (6.1)

    mr = 10                 if turn ≤ 0
    mr = (−2) · turn + 10   if turn > 0                                  (6.2)

    motL = ml · speed · km                                               (6.3)

    motR = mr · speed · km                                               (6.4)
The turn variable operates in the [-10, 10] range, the speed variable operates in the [-1, 1] range, and the motL and motR variables operate in the [-100, 100] range. The ml (Eq. 6.1) and mr (Eq. 6.2) variables, which operate in the [-10, 10] range, translate the turn command into a percentage of the desired maximum speed. ml and mr are later combined into motL (Eq. 6.3) and motR (Eq. 6.4), scaling the result by km (in this case km = 10).
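Equations 6.1–6.4 translate directly into code. A minimal Python sketch (the helper name motor_commands is ours, not from the thesis sources):

```python
KM = 10  # overall power scale (km in Eqs. 6.3-6.4)

def motor_commands(speed, turn, km=KM):
    """Translate a decoupled <speed, turn> pair into differential <motL, motR>.

    speed in [-1, 1], turn in [-10, 10]; motL and motR end up in [-100, 100].
    Follows Eqs. 6.1-6.4: turning reduces power on the inner wheel.
    """
    ml = 2 * turn + 10 if turn < 0 else 10       # Eq. 6.1
    mr = 10 if turn <= 0 else -2 * turn + 10     # Eq. 6.2
    return ml * speed * km, mr * speed * km      # Eqs. 6.3 and 6.4
```

At turn = 0 both motors receive the same power; at turn = ±10 and full speed the wheels counter-rotate, spinning the robot in place.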
6.3 Reactive control
This is the most important and crucial agent because it implements the basic reactive
behaviours and has the responsibility of maintaining the physical integrity of the robot by
avoiding collisions. The reactive behaviours make extensive use of fuzzy logic in the form of
fuzzy rule bases, which have been obtained both from previous experience and fuzzy
modelling techniques (Gómez-Skarmeta and Martínez-Barberá, 1997)(Gómez-Skarmeta et
al., 1999c). The following reactive behaviours (Figure 6.2) have been identified and
implemented:
• Avoid-left (avoidL): avoids an obstacle located on its left.
• Avoid-front (avoidF): avoids an obstacle located on its front.
• Avoid-right (avoidR): avoids an obstacle located on its right.
• Align-right (alignR): holds the robot at a given distance from the wall on its right.
• Align-left (alignL): holds the robot at a given distance from the wall on its left.
• Move-to-goal (toGoal): drives the robot in the direction of the current goal.
• Escape (escape): turns the robot randomly in order to escape from a U-shaped wall.
These behaviours and the different modules that compose the reactive control agent are described in the following subsections. Special attention is paid to the different fuzzy rule bases, the fuzzy sets that appear in the rule bases, and the different variables that play an active role in the control of the robot (both those from the blackboard and those from the LPS).
Figure 6.2: Reactive control agent.
6.3.1 Common block
The common block reads sensory information from the blackboard, processes it, and then updates the LPS, which is later used by the different behaviours. In the current example the LPS (Figure 6.2) contains the following features extracted from the environment: uobject, left, leftd, front, rightd, and right. These features account for the presence of objects in different directions, and are defined as follows:
• On-front (front): indicates the presence of an object on the front (Eq. 6.5).
• On-left (left): indicates the presence of an object on the left (Eq. 6.6).
• On-right (right): indicates the presence of an object on the right (Eq. 6.7).
• On-diagonal-left (leftd): indicates the presence of an object between the front and the left (Eq. 6.8).
• On-diagonal-right (rightd): indicates the presence of an object between the front and the right (Eq. 6.9).
• U-shaped-wall (uobject): indicates whether the robot is in a U-shaped wall (Listing 6.1).
Most of these features are computed based on the distances provided by the virtual
sensors, which fuse data from infrared sensors and sonar sensors located around the robot’s
perimeter (Figure 6.3). These features do not correspond to real measures directly but to
reference distances, which are platform dependent. These features, in fact, are the result of a
simple and heuristic sensor fusion method. Thus, the features combine information from
different sensors to reduce the amount of sensory information provided to the behaviours. In
this case, the set of 10 virtual sensors (virtui) is reduced to only 5 features (left, leftd, front,
rightd, and right). The coefficients that appear in the fusion equations (Eqs. 6.5, 6.6, 6.7, 6.8, and 6.9) have been obtained by trial and error, and the underlying heuristic reflects how important each sensor is for the corresponding feature.
Figure 6.3: Combination of Quaky-Ant’s range sensors.

    front = min{virtu3, virtu4}                                          (6.5)

    left = (virtu9 + virtu0) · 0.5                                       (6.6)

    right = (virtu7 + virtu8) · 0.5                                      (6.7)

    leftd = 0.45 · min{virtu5, 0.75} + 0.45 · min{virtu6, 0.75} + 0.1 · min{virtu7, 0.75}    (6.8)

    rightd = 0.45 · min{virtu2, 0.75} + 0.45 · min{virtu1, 0.75} + 0.1 · min{virtu0, 0.75}   (6.9)
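The feature extraction of Eqs. 6.5–6.9 can be sketched as a small Python helper (the function name features is ours) that maps the 10 virtual sensor readings to the 5 distance features:

```python
def features(virtu):
    """Reduce the 10 virtual range sensors to the 5 LPS distance features.

    virtu is a sequence of 10 readings (virtu0..virtu9), each in [0, 10].
    The diagonal features saturate each contributing sensor at 0.75.
    """
    front = min(virtu[3], virtu[4])                        # Eq. 6.5
    left = (virtu[9] + virtu[0]) * 0.5                     # Eq. 6.6
    right = (virtu[7] + virtu[8]) * 0.5                    # Eq. 6.7
    leftd = (0.45 * min(virtu[5], 0.75)                    # Eq. 6.8
             + 0.45 * min(virtu[6], 0.75)
             + 0.1 * min(virtu[7], 0.75))
    rightd = (0.45 * min(virtu[2], 0.75)                   # Eq. 6.9
              + 0.45 * min(virtu[1], 0.75)
              + 0.1 * min(virtu[0], 0.75))
    return left, leftd, front, rightd, right
```

Note that with all sensors wide open (reading 10), leftd and rightd saturate at 0.75 because of the per-sensor clipping.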
A different case is the uobject feature. It is computed using a series of fuzzy rules
(Listing 6.1), in which the antecedents are previously computed features (leftd, front, and
rightd). These rules use fuzzy sets (Figure 6.4) that are also used for behaviours. The
interpretation of the uobject feature is that the higher the value, the more likely it is that the robot is inside a U-shaped obstacle.
rules
{
    background (0.01) uobject is 0.0;

    if ((front is CLOSE) && ((rightd is CLOSE) || (leftd is CLOSE))) uobject is 1.0;
    if ((front is NEAR)  && ((rightd is CLOSE) || (leftd is CLOSE))) uobject is 0.7;
    if ((front is CLOSE) && ((rightd is NEAR)  || (leftd is NEAR)))  uobject is 1.0;
    if ((front is NEAR)  && ((rightd is NEAR)  || (leftd is NEAR)))  uobject is 0.7;
}
Listing 6.1: U-shaped obstacle recognition.
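One plausible reading of how these crisp-consequent rules produce uobject (min for &&, max for ||, and an activation-weighted average that includes the 0.01 background rule) can be sketched as follows. The function name and the fusion scheme are an illustration of this style of rule base, not a verbatim transcription of the BG semantics; the membership degrees are assumed to be computed beforehand from the fuzzy sets of Figure 6.4.

```python
def uobject(front_deg, leftd_deg, rightd_deg):
    """Weighted-average fusion of the crisp-consequent rules in Listing 6.1.

    Each argument maps fuzzy-set names to membership degrees, e.g.
    front_deg = {"CLOSE": 0.8, "NEAR": 0.2}.
    """
    g = lambda degs, s: degs.get(s, 0.0)
    rules = [(0.01, 0.0)]  # background rule: uobject is 0.0
    for front_set, value in (("CLOSE", 1.0), ("NEAR", 0.7)):
        for side_set in ("CLOSE", "NEAR"):
            side = max(g(rightd_deg, side_set), g(leftd_deg, side_set))  # ||
            activation = min(g(front_deg, front_set), side)              # &&
            rules.append((activation, value))
    num = sum(a * v for a, v in rules)
    den = sum(a for a, _ in rules)
    return num / den
```

With no rule firing only the background contributes, so uobject stays at 0.0; with a fully close front and side obstacle it approaches 1.0.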
In addition to the LPS, the different behaviours also use the following variables
extracted from the blackboard (Figure 6.2): alpha is the current robot’s global heading,
heading is the desired robot’s heading, speed is the desired robot’s linear speed, and turn is
the desired robot’s rate of turn. A deliberative agent sets the heading variable, while the
sensor processing and actuator control module sets or updates the others.
6.3.2 Obstacle avoidance behaviours
Most of the reactive behaviours have been defined using expressive fuzzy logic
controllers (as is the case of the obstacle avoidance behaviours), which, for instance, use
fuzzy sets for range measures, speed control, and turning rate control. Five fuzzy sets
(CLOSE, NEAR, MED, FAR, and VFAR) have been defined for range measures (Figure 6.4).
These sets are not uniformly distributed along the sensors range [0, 10], and the safety area
(the area where an object might be dangerous for the robot) is covered by CLOSE, NEAR, and
MED. Three fuzzy sets (SLOW, SMEDIUM , and SFULL) have been defined for speed setting
(Figure 6.5). These sets are almost uniformly distributed along the speed range [0, 1]. Seven
fuzzy sets (TTL, TL, TSL, TC, TSR, TR, TTR) have been defined for turning rate setting
(Figure 6.6). These sets are almost uniformly distributed along the turning rate range [-10,
10], except TTL and TTR which are outside the actuator range. Although the actuator control
module forces values to be in the range, these out-of-range values are useful when a
behaviour needs to stress that a tight turn is really needed. Thus, higher values are used in the
inference process and it will be more likely that higher values will be obtained as a result.
Figure 6.4: Fuzzy sets for range sensors.
Figure 6.5: Fuzzy sets for speed setting.
Figure 6.6: Fuzzy sets for turning rate setting.
The obstacle avoidance behaviours are intended to avoid collisions with obstacles that
are in the vicinity of the robot. There are four obstacle avoidance behaviours (Figure 6.2):
• Avoid-left (avoidL): drives the robot out of an obstacle on the left.
• Avoid-front (avoidF): drives the robot out of an obstacle on the front.
• Avoid-right (avoidR): drives the robot out of an obstacle on the right.
• Escape (escape): drives the robot out of a U-shaped obstacle.
The avoidL behaviour (Listing 6.2) drives the robot to the right when it encounters an obstacle on the left. The closer the obstacle, the tighter the turn to the right. When the obstacle is close enough, the translation speed is set to a small value. This behaviour is crucial when crossing doorways and small corners.
behaviour avoidL weight 1.0
{
    fusion turn, speed;

    rules
    {
        background (0.01) speed is SFULL;
        background (0.01) turn is TC;

        if ((leftd is CLOSE) && (front is CLOSE)) turn is TTR;
        if ((leftd is CLOSE) && (front is NEAR))  turn is TTR;
        if ((leftd is CLOSE) && (front is MED))   turn is TTR;
        if ((leftd is CLOSE) && (front is FAR))   turn is TTR;
        if ((leftd is CLOSE) && (front is VFAR))  turn is TTR;

        if ((leftd is NEAR) && (front is CLOSE)) turn is TR;
        if ((leftd is NEAR) && (front is NEAR))  turn is TR;
        if ((leftd is NEAR) && (front is MED))   turn is TR;
        if ((leftd is NEAR) && (front is FAR))   turn is TSR;
        if ((leftd is NEAR) && (front is VFAR))  turn is TSR;

        if ((leftd is MED) && (front is CLOSE)) turn is TR;
        if ((leftd is MED) && (front is NEAR))  turn is TR;
        if ((leftd is MED) && (front is MED))   turn is TSR;
        if ((leftd is MED) && (front is FAR))   turn is TC;
        if ((leftd is MED) && (front is VFAR))  turn is TC;

        if ((leftd is CLOSE) || (front is CLOSE)) speed is SLOW;
        if ((leftd is NEAR)  || (front is NEAR))  speed is SLOW;
        if ((leftd is MED)   || (front is MED))   speed is SMEDIUM;
    }
}
Listing 6.2: Avoid-left reactive behaviour.
The avoidR behaviour (Listing 6.3) drives the robot to the left when it encounters an obstacle on the right. The closer the obstacle, the tighter the turn to the left. When the obstacle is close enough, the translation speed is set to a small value. This behaviour is crucial when crossing doorways and small corners.
behaviour avoidR weight 1.0
{
    fusion turn, speed;

    rules
    {
        background (0.01) speed is SFULL;
        background (0.01) turn is TC;

        if ((rightd is CLOSE) && (front is CLOSE)) turn is TTL;
        if ((rightd is CLOSE) && (front is NEAR))  turn is TTL;
        if ((rightd is CLOSE) && (front is MED))   turn is TTL;
        if ((rightd is CLOSE) && (front is FAR))   turn is TTL;
        if ((rightd is CLOSE) && (front is VFAR))  turn is TTL;

        if ((rightd is NEAR) && (front is CLOSE)) turn is TL;
        if ((rightd is NEAR) && (front is NEAR))  turn is TL;
        if ((rightd is NEAR) && (front is MED))   turn is TL;
        if ((rightd is NEAR) && (front is FAR))   turn is TSL;
        if ((rightd is NEAR) && (front is VFAR))  turn is TSL;

        if ((rightd is MED) && (front is CLOSE)) turn is TL;
        if ((rightd is MED) && (front is NEAR))  turn is TL;
        if ((rightd is MED) && (front is MED))   turn is TSL;
        if ((rightd is MED) && (front is FAR))   turn is TC;
        if ((rightd is MED) && (front is VFAR))  turn is TC;

        if ((rightd is CLOSE) || (front is CLOSE)) speed is SLOW;
        if ((rightd is NEAR)  || (front is NEAR))  speed is SLOW;
        if ((rightd is MED)   || (front is MED))   speed is SMEDIUM;
    }
}
Listing 6.3: Avoid-right reactive behaviour.
The avoidF behaviour (Listing 6.4) drives the robot to the left or to the right when it encounters an obstacle on the front. The closer the obstacle, the tighter the turn to the left or right. The turn is made to the left or to the right depending on which side has a greater open area. When the obstacle is close enough, the translation speed is set to a small value. This behaviour is crucial when avoiding frontal collisions.
behaviour avoidF weight 1.0
{
    fusion turn, speed;

    rules
    {
        background (0.01) turn is TC;
        background (0.01) speed is SFULL;

        if ((rightd is CLOSE) && (front is CLOSE) && !(leftd is CLOSE)) turn is TTL;
        if ((left is CLOSE)   && (front is CLOSE) && !(right is CLOSE)) turn is TTR;
        if ((rightd is NEAR)  && (front is CLOSE) && !(leftd is NEAR))  turn is TTL;
        if ((left is NEAR)    && (front is CLOSE) && !(right is NEAR))  turn is TTR;

        if ((rightd is CLOSE) && (front is NEAR) && !(leftd is CLOSE)) turn is TL;
        if ((left is CLOSE)   && (front is NEAR) && !(right is CLOSE)) turn is TR;
        if ((rightd is NEAR)  && (front is NEAR) && !(leftd is NEAR))  turn is TL;
        if ((left is NEAR)    && (front is NEAR) && !(right is NEAR))  turn is TR;

        if ((front is CLOSE) || (front is NEAR)) speed is SLOW;
    }
}
Listing 6.4: Avoid-front reactive behaviour.
The escape behaviour (Listing 6.5) drives the robot to the left or to the right when it encounters a U-shaped obstacle. The behaviour works with some hysteresis: it becomes active when the front object is at less than 0.2 metres, and it becomes inactive when the front obstacle is at more than 0.3 metres. This behaviour, in fact, is not purely reactive because it has a state that accounts for the hysteresis effect. The turn is made to the left or to the right depending on which side has a greater open area. When a U-shaped obstacle is detected, the translation speed is set to a small value. This behaviour is crucial when approaching sharp concave corners.
behaviour escape weight 1.25
{
    fusion turn, speed;

    if (front > 0.3)
        ushape = 0.0;
    if ((front < 0.2) && (ushape == 0.0))
    {
        ushape = 1.0;
        if (leftd < rightd)
            usense = 1.0;
        if (leftd >= rightd)
            usense = -1.0;
    }
    if (ushape == 1.0)
    {
        if (usense == 1.0)
            turn = 10.0;
        if (usense == -1.0)
            turn = -10.0;
        turn = turn * 10.0;
        speed = 0.5;
    }
}
Listing 6.5: Escape behaviour.
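The hysteresis logic of the listing can be isolated in a small state holder. A Python sketch (hypothetical class; the 0.2 m / 0.3 m thresholds and the usense convention follow the listing):

```python
class EscapeState:
    """Hysteresis of the escape behaviour: trips when the front obstacle is
    closer than 0.2 m, releases when it is farther than 0.3 m."""

    def __init__(self):
        self.ushape = 0.0   # 1.0 while the escape manoeuvre is active
        self.usense = 0.0   # turn direction chosen when the manoeuvre trips

    def update(self, front, leftd, rightd):
        if front > 0.3:
            self.ushape = 0.0
        if front < 0.2 and self.ushape == 0.0:
            self.ushape = 1.0
            self.usense = 1.0 if leftd < rightd else -1.0
        return self.ushape
```

Between 0.2 m and 0.3 m neither condition fires, so the behaviour keeps whatever state it had; this is what prevents it from chattering on and off at a single threshold.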
6.3.3 Goal reaching behaviour
The goal reaching behaviours are intended to drive the robot to the current goal. There
is only one goal seeking behaviour (Figure 6.2): toGoal. This behaviour (Listing 6.6) drives
the robot to the left or to the right depending on the difference between the current heading
and the desired heading (the heading required to reach the goal). The behaviour itself is purely
reactive because it has no state, but the resulting behaviour might not be reactive because it
uses a heading that could have been computed by a state-based method. This behaviour is
important to reach the goal in an efficient way, although it is not strictly necessary because
the robot could reach the goal by performing an indefinite random walk.
behaviour toGoal weight 0.3
{
    fusion turn;
    float  diff;

    diff = alpha - heading;
    if (diff > 3.1415)
        diff = diff - (2 * PI);
    if (diff < -3.1415)
        diff = diff + (2 * PI);

    rules
    {
        if (diff is GN)  turn is TTL;
        if (diff is GSN) turn is TL;
        if (diff is GZ)  turn is TC;
        if (diff is GSP) turn is TR;
        if (diff is GP)  turn is TTR;
    }
}
Listing 6.6: Move-to-goal reactive behaviour.
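The diff normalisation at the top of the listing wraps the heading error into [-π, π] so that the rules always command the turn of smallest magnitude. A Python equivalent (assuming alpha and heading are each already in [-π, π], so a single wrap suffices):

```python
import math

def heading_diff(alpha, heading):
    """Wrap the heading error alpha - heading into [-pi, pi]."""
    diff = alpha - heading
    if diff > math.pi:       # the listing uses the approximation 3.1415
        diff -= 2 * math.pi
    if diff < -math.pi:
        diff += 2 * math.pi
    return diff
```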
Five fuzzy sets (GN, GSN, GZ, GSP, and GP) have been defined for heading differences (Figure 6.7). These sets are not uniformly distributed along the heading difference range [-3.5, 3.5].
Figure 6.7: Fuzzy sets for heading difference.
6.3.4 Alignment behaviours
The alignment behaviours are intended to maintain the robot along the centre of
corridors. There are two alignment behaviours (Figure 6.2):
• Align-right (alignR): aligns the robot to the wall on the right.
• Align-left (alignL): aligns the robot to the wall on the left.
The alignR behaviour (Listing 6.8) keeps the robot at a given distance from the wall on its right. The greater the difference between the actual distance and the desired distance, the tighter the turn to the left or right. If the wall is close, the behaviour drives the robot to the left; otherwise it drives the robot to the right. When the wall is close enough, the translation speed is set to a small value. This behaviour is important when navigating along corridors.
behaviour alignR weight 0.15
{
    fusion turn, speed;

    rules
    {
        if (right is VFAR)  turn is TR,  speed is SFULL;
        if (right is FAR)   turn is TSR, speed is SFULL;
        if (right is MED)   turn is TC,  speed is SFULL;
        if (right is NEAR)  turn is TSL, speed is SMEDIUM;
        if (right is CLOSE) turn is TL,  speed is SLOW;
    }
}
Listing 6.8: Align-right reactive behaviour.
The alignL behaviour (Listing 6.9) keeps the robot at a given distance from the wall on its left. The greater the difference between the actual distance and the desired distance, the tighter the turn to the left or right. If the wall is close, the behaviour drives the robot to the right; otherwise it drives the robot to the left. When the wall is close enough, the translation speed is set to a small value. This behaviour is important when navigating along corridors.
behaviour alignL weight 0.15
{
    fusion turn, speed;

    rules
    {
        if (left is VFAR)  turn is TL,  speed is SFULL;
        if (left is FAR)   turn is TSL, speed is SFULL;
        if (left is MED)   turn is TC,  speed is SFULL;
        if (left is NEAR)  turn is TSR, speed is SMEDIUM;
        if (left is CLOSE) turn is TR,  speed is SLOW;
    }
}
Listing 6.9: Align-left reactive behaviour.
6.3.5 Behaviour blending
The output of the different behaviours is fused by the blender block (Figure 6.2), which
is defined using a fuzzy rule base. The blender (Listing 6.10) selects at each control cycle the
behaviours that are more appropriate given the current sensory information, extracted from
the LPS. There are three groups of rules:
• Rules for U-shaped obstacles, which modify the escape and toGoal behaviours.
• Rules for obstacle avoidance, which modify the avoidL, avoidR, avoidF and toGoal behaviours.
• Rules for wall alignment, which modify the alignL and alignR behaviours.
The underlying idea is: when the robot is navigating a free area the wall alignment rules prevail, but when there are obstacles in the vicinity of the robot the obstacle avoidance rules prevail. If the obstacles are close enough then the U-shaped obstacle rules apply. In either case, the toGoal behaviour has a default high value, so that it is active until it is explicitly deactivated, which occurs when the robot approaches an obstacle.
blender
{
    rules
    {
        background (0.01)  escape is LOW;
        background (0.01)  avoidR is LOW;
        background (0.01)  avoidL is LOW;
        background (0.01)  avoidF is LOW;
        background (0.01)  toGoal is HIGH;
        background (0.001) alignR is LOW;
        background (0.001) alignL is LOW;

        // Rules for U-shaped objects
        if (uobject is UZONE)  escape is HIGH, toGoal is 0.0;

        // Rules for obstacle avoidance
        if (leftd is CLOSE)  avoidL is HIGH,  toGoal is LOW;
        if (front is CLOSE)  avoidF is HIGH,  toGoal is LOW;
        if (rightd is CLOSE) avoidR is HIGH,  toGoal is LOW;
        if (leftd is NEAR)   avoidL is HALFH, toGoal is HALFL;
        if (front is NEAR)   avoidF is HALFH, toGoal is HALFL;
        if (rightd is NEAR)  avoidR is HALFH, toGoal is HALFL;
        if (leftd is MED)    avoidL is HALFL, toGoal is HALFH;
        if (front is MED)    avoidF is HALFL, toGoal is HALFH;
        if (rightd is MED)   avoidR is HALFL, toGoal is HALFH;

        // Rules for corridor alignment
        if (left is MED)   alignL is HALFL;
        if (right is MED)  alignR is HALFL;
        if (left is FAR)   alignL is HALFH;
        if (right is FAR)  alignR is HALFH;
        if (left is VFAR)  alignL is HIGH;
        if (right is VFAR) alignR is HIGH;
    }
}
Listing 6.10: Behaviour blending rules.
Four fuzzy sets (LOW, HALFL, HALFH, and HIGH) have been defined for behaviour
activation (Figure 6.7). These sets are distributed along the degree of activation range [0, 1].
In this case, a value of 1 means that the behaviour is executed at its maximum degree, while a
value of 0 means that the behaviour is not executed. One fuzzy set (UZONE) has been defined
for U-shaped object presence (Figure 6.8). This set uses part of the degree of presence range
[0, 1]. In this case, a value of 1 means that a U-shaped object is likely to be present, while a
value of 0 means that no U-shaped object is likely to be present.
Figure 6.7: Fuzzy sets for behaviour activation. [Membership functions for LOW, HALFL, HALFH and HIGH over [0, 1], with breakpoints at 0.1, 0.2, 0.4, 0.5, 0.7 and 0.8.]
Figure 6.8: Fuzzy set for U-shaped object presence. [Membership function for UZONE over [0, 1], with breakpoints at 0.0, 0.5 and 1.0.]
6.4 Navigation
The navigation agent is in charge of building a map of the environment, computing a
path to the goal location, and keeping track of the robot location. To implement the map, this
agent uses the grid structure provided by the BG language. This grid is updated continuously
with the different sonar readings. The robot’s pose is computed relying only on odometry,
under the assumption of near perfect odometry. This assumption holds true as long as the total
distance of each robot run is kept short, and, in fact, experiments with real robots were
reduced to an area of 20 by 20 metres (see also chapter 8). Only one task (map) has been defined for this agent (Figure 6.9), which is in charge of updating the grid map.
Figure 6.9: Navigation agent. [Block diagram: the map task with its blackboard and EPS variables (x, y, alpha, heading, gx, gy, virtui, grid).]
The map task (Listing 6.11) updates the grid map, which is stored in its EPS, using the
different sensor measures (virtui) with a given robot’s pose (x, y, and alpha). Sensor readings
that are longer than 5 metres are rejected in the map building process, to reduce the
uncertainty in the map. This task also specifies the use of the A* algorithm for path planning,
and the initial (x, and y) and last position (gx, and gy) of the path. Finally, the path planning
method updates the desired heading variable (heading).
agent Navigation
{
    task map
    {
        grid a_star updates heading cells 72.0 by 66.0 side 0.075
        {
            location (x, y);
            goalpos (gx, gy);
            if (virtu0 < 5.0) update (x, y, alpha, 0.0, virtu0);
            if (virtu1 < 5.0) update (x, y, alpha, 1.0, virtu1);
            if (virtu2 < 5.0) update (x, y, alpha, 2.0, virtu2);
            if (virtu3 < 5.0) update (x, y, alpha, 3.0, virtu3);
            if (virtu4 < 5.0) update (x, y, alpha, 4.0, virtu4);
            if (virtu5 < 5.0) update (x, y, alpha, 5.0, virtu5);
            if (virtu6 < 5.0) update (x, y, alpha, 6.0, virtu6);
            if (virtu7 < 5.0) update (x, y, alpha, 7.0, virtu7);
            if (virtu8 < 5.0) update (x, y, alpha, 8.0, virtu8);
            if (virtu9 < 5.0) update (x, y, alpha, 9.0, virtu9);
        }
    }
}
Listing 6.11: Navigation agent.
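The grid update performed by the map task can be approximated by the following Python sketch. It is not the actual BGen grid code: the free/occupied update rule and the cell bookkeeping are simplifying assumptions, while the 5-metre rejection threshold and the 0.075 m cell side follow Listing 6.11:

```python
import math

SIDE = 0.075          # cell side in metres, as in Listing 6.11
MAX_RANGE = 5.0       # readings beyond this are rejected

def update(grid, x, y, alpha, sensor_angle, reading):
    """Mark cells along one sensor ray: free space up to the hit, occupied at the hit."""
    if reading >= MAX_RANGE:
        return                              # too uncertain: rejected
    theta = alpha + sensor_angle
    steps = int(round(reading / SIDE))      # number of free cells along the ray
    for i in range(steps):
        d = i * SIDE
        cell = (int((x + d * math.cos(theta)) / SIDE),
                int((y + d * math.sin(theta)) / SIDE))
        # Evidence of free space: drive the cell value towards 0.
        grid[cell] = 0.5 * grid.get(cell, 0.5)
    hit = (int((x + reading * math.cos(theta)) / SIDE),
           int((y + reading * math.sin(theta)) / SIDE))
    # Evidence of an obstacle: drive the cell value towards 1.
    grid[hit] = 0.5 * grid.get(hit, 0.5) + 0.5

grid = {}
update(grid, x=1.0, y=1.0, alpha=0.0, sensor_angle=0.0, reading=0.9)
update(grid, x=1.0, y=1.0, alpha=0.0, sensor_angle=0.0, reading=6.0)  # rejected
```

Repeated updates from different poses accumulate evidence, which is why the text describes darker (more certainly occupied) cells emerging in Figure 6.14.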
6.5 Planning
The planning agent is in charge of supervising the task which is executed by the robot, changing the goal when a partial run has been completed, and stopping the execution when the overall task has been accomplished. To implement this simple planner, this agent uses the finite state machine structure provided by the BG language. The different states of this FSM have associated goal co-ordinates. These are used for path planning (navigation agent), which produces a heading that later drives the robot (reactive control agent) to the desired state's goal. Only one task (plan) has been defined for this agent (Figure 6.10), which is in charge of managing the robot's goal.
Figure 6.10: Planning agent. [Block diagram: the plan task with its blackboard and EPS variables (x, y, gx, gy, ingoal, stage, home_x, home_y).]
The plan task (Listing 6.12) manages information related to the robot's goal. The task starts by updating its EPS, setting the home location (home_x, home_y) to the starting robot position (x, y). The sensor processing module provides a function that sets a variable (ingoal) of the blackboard when the current goal (gx, gy) is achieved. State transitions occur based on this variable. A counter of how many subgoals are left to accomplish the global task is also updated (stage). This is valuable information when a learning procedure is applied for behaviour learning. For a detailed discussion see chapter 7.
agent Planning
{
    task plan
    {
        fsm start HOME
        {
            state HOME:
            {
                home_x = x;
                home_y = y;
                stage = 2.0;
                shift LOOKING;
            }
            state LOOKING:
            {
                if (ingoal == 1.0) shift CHANGING;
            }
            state CHANGING:
            {
                gx = home_x;
                gy = home_y;
                stage = 1.0;
                if (ingoal == 0.0) shift RETURNING;
            }
            state RETURNING:
            {
                if (ingoal == 1.0) shift HALT;
            }
            state HALT:
            {
                stage = 0.0;
                halt;
            }
        }
    }
}
Listing 6.12: Planning agent.
The state machine (Figure 6.11), which corresponds to the planning agent (Listing 6.12), starts in the HOME state, which sets some variables in order to establish the home and goal coordinates and then shifts to the LOOKING state. This state is executed until the ingoal variable is set to 1.0 (the robot is in the goal location), and then shifts to the CHANGING state. This new state is executed until the ingoal variable is set to 0.0 (the robot is no longer in the goal location). While the state is being executed the new goal coordinates are set to those of the home location. When the robot exits the goal area, the new state is set to RETURNING. This state is executed until ingoal is set to 1.0 (the robot is in the home location), and then it shifts to the HALT state. This state simply halts the robot and indicates that the robot's task has been achieved (stage is set to 0.0).
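The same four-state logic can be re-expressed compactly in Python (an illustrative mirror of Listing 6.12, not BG code; the ctx dictionary stands in for the blackboard and EPS):

```python
def plan_step(state, ingoal, ctx):
    """One control cycle of the planner FSM of Listing 6.12 (illustrative)."""
    if state == "HOME":
        ctx["home"] = ctx["pos"]          # remember the starting position
        ctx["stage"] = 2.0
        return "LOOKING"
    if state == "LOOKING":
        return "CHANGING" if ingoal else "LOOKING"
    if state == "CHANGING":
        ctx["goal"] = ctx["home"]         # new goal: return home
        ctx["stage"] = 1.0
        return "RETURNING" if not ingoal else "CHANGING"
    if state == "RETURNING":
        return "HALT" if ingoal else "RETURNING"
    ctx["stage"] = 0.0                    # HALT: task accomplished
    return "HALT"

ctx = {"pos": (0.5, 0.5), "goal": (3.0, 4.0)}
state = "HOME"
# Simulated run: reach the goal, leave it, then reach home again.
for ingoal in [False, False, True, True, False, False, True, False]:
    state = plan_step(state, ingoal, ctx)
```

Running the loop above drives the machine through LOOKING, CHANGING and RETURNING before halting with stage set to 0.0, exactly the sequence described in the text.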
Figure 6.11: Planning agent's finite state machine. [States HOME, LOOKING, CHANGING, RETURNING and GOAL; transitions LOOKING → CHANGING on ingoal = 1, CHANGING → RETURNING on ingoal = 0, and RETURNING → GOAL on ingoal = 1; asterisks mark the default self-transitions.]
6.6 Experiments and results
Using the agent architecture explained above, a series of simulator and real robot runs have been performed to test the performance and suitability of both the BG language and the BGA architecture. This performance is measured qualitatively rather than quantitatively. For some quantitative results see chapter 7, and for web-based experiments with a real robot see also chapter 8. All simulator experiments presented in this section have been simulated using sound reflective walls (to emphasise the robustness of the control programs, the sonar sensors chosen are noisier in simulation than in real world environments). The floor plans of the experiments correspond to office-like environments with narrow corridors and doorways, and these are unknown to the robot before each experiment.
6.6.1 Point-to-point navigation experiment (nav-A)
This simple experiment, named nav-A, shows a point-to-point navigation task, in which
a simulated robot has to reach a goal location starting from a given initial position. No
odometry errors are taken into account, and thus perfect odometry is assumed. This
experiment is carried out in a complex floor plan (Figure 6.12), named “corridor”, which
simulates a corridor with different rooms but on a smaller scale. The overall dimensions of
this scenario are 5.2 by 4.9 metres, which forces the robot to move in a reduced environment
and make sharp turns (to test its obstacle avoidance behaviours). The robot starts in the
bottom-right corner and has to reach the top-left corner. The BG program for this experiment
is that described in the previous section, with a minor modification in the planning agent. The
LOOKING state of the planner shifts to the HALT state when the ingoal condition is true, and
thus the robot finishes the task.
Figure 6.12: Corridor floor plan (home and goal positions marked).
The BGen environment is provided with the Quaky-Ant robot description, the
“corridor” floor plan description, which includes the home and goal locations, and the BG
program. Some snapshots of the sequence are shown below (Figure 6.13). These have been
obtained by modifying the images captured from two BGen windows. The upper part of each
image shows the current behaviour activation, while the lower part shows the current
measurement of the sensors, a portion of the complete “corridor” map, and the path followed
by the robot. The thin triangles correspond to infrared sensors, the thick triangles correspond
to sonar sensors and the arrows correspond to the fused virtual sensor. Some snapshots of the
map-building and path-planning process are also shown below (Figure 6.14). These have been
obtained by capturing a BGen window. The image shows the current grid map represented as
small squares of different shading (the darker the cell, the more certainty that it is occupied).
The “corridor” floor plant is superimposed as thin straight lines. The shortest path to the goal
is represented as a thin non-straight line. And the robot position is represented as a circle
(larger than the robot).
Step 1: The robot starts.
Step 2: The robot has just passed through a doorway.
Step 3: The robot is in the corridor.
Step 4: The robot is going through a doorway.
Step 5: The robot has reached the goal.
Figure 6.13: Navigation experiment with simulated Quaky-Ant robot (nav-A).
Figure 6.14: Different views of the grid-map during the nav-A experiment (steps 1 to 5).
The experiment shows that, with the BG program described above, the robot is able to reach the goal while describing a smooth path and avoiding obstacles. It is important to note here that the doorways pose a very real difficulty, since their width is only slightly greater than the size of the robot. In this example the reactive behaviours help the robot in the obstacle avoidance problem, while the path-planning module modifies the most direct path to the goal in order to guide the robot through zones without obstacles.
6.6.2 Point-to-point navigation experiment (nav-B)
This experiment, named nav-B, shows a point-to-point navigation task, in which a real robot has to reach a goal location, starting from a given initial position, and then return to the initial position. The robot, Milou (Figure 6.15), is a Nomad-200 commercial robot, which belongs to the Centre for Applied Autonomous Sensor Systems (AASS), Örebro University, Sweden, which the author visited for some months. Due to a hardware failure, the infrareds were not working properly, and thus the robot relied only on sonars. As the experiment involves only short runs, the BG program relies on odometry and does not use any localisation method. Odometry errors are therefore not taken into account.
Figure 6.15: Milou, a Nomad-200 mobile robot.
This experiment is carried out in a laboratory floor plan (Figure 6.16), named “AASS-lab-1”, which is made of wooden walls. The approximate overall dimensions of this scenario
are 4.2 by 7.5 metres. The robot starts in the bottom-right corner, has to reach the top-right
corner, and then return to the bottom-right corner. There is no direct path between the starting
and goal locations.
6 http://www.robots.com
Figure 6.16: AASS-lab-1 floor plan.
The BG program for this experiment is similar to that described in the previous section, but with some obvious minor modifications. The changes from the original BG program for Quaky-Ant described in the previous section can be summarised as:
• The number and orientation of the sonar sensors is different and thus the numbering is different. The infrareds are not fused to sonars. Moreover, the navigation agent uses more sensors (16 instead of 10) to update the grid map.
• Milou runs a little bit faster than Quaky-Ant, and thus a larger clearance area is needed. For this reason the distance-related fuzzy sets have been shifted 15 centimetres to allow for a greater reaction time.
• The linear and rotational speeds are already decoupled in Milou's low-level controllers, and thus the BG variables turn and speed are simply scaled and passed on to Milou's controllers.
The BGen environment is provided with the Milou robot description, the home and goal
locations, and the BG program. Some images of the sequence are shown below (Figure 6.17).
Figure 6.17: Navigation experiment with Milou robot (nav-B).
The experiment shows that, with the BG program described above, the robot is able to
reach the goal and return to the initial position, describing a smooth trajectory and without
colliding with the walls. It is important to remember here that the robot does not know the
environment in advance, and thus the paths are recomputed dynamically as soon as new
sensor information is available. Moreover, this experiment has been repeated with different
starting and goal points, and even with slow-moving human obstacles, and once the low-level
behaviours and blending rules are correctly adjusted the robot never collides and always finds
its way to the goal point (provided there is a valid path).
6.6.3 Point-to-point navigation experiment (nav-C)
This experiment, named nav-C, shows a point-to-point navigation task, in which a
simulated robot has to reach a goal location starting from a given initial position and then
return to the initial position. This experiment is similar to nav-B, but with the robot and floor plan (Figure 6.12) of nav-A. Two different runs, which differ in the path-planning method,
are compared. While the first is a deliberative approach, based on an A* path-planner, the
second is a purely reactive approach, based on the most direct path to the goal.
The deliberative run (Figure 6.18) produces similar results to experiments nav-A and
nav-B. The reactive behaviours help the robot in the obstacle avoidance problem, while the
path-planning module modifies the most direct path to the goal in order to guide the robot
through free zones. In this case no paths are traced through previously discovered obstacles.
Figure 6.18: Navigation experiment with deliberative path-planning (nav-C).
The reactive run (Figure 6.19) proceeds in a completely different way. While the reactive behaviours help the robot in the obstacle avoidance problem, the path-planning module always returns a direct path to the goal without taking into account known obstacles. In this case the robot still completes the task but follows a path that depends on the configuration of the environment. It is important to notice that when using this reactive approach the robot can get trapped in U-shaped obstacles or in infinite loops (as is usual in reactive path-planning based systems).
Figure 6.19: Navigation experiment with reactive path-planning (nav-C).
The experiment shows that changing a strategy in BGen can be as simple as changing a
line of BG code. In this case, the only difference between both runs is the first parameter of
the grid structure, which is defined in the navigation agent. The deliberative approach is
obtained using an A* path-planner (a_star option), and the reactive approach is obtained
using a reactive planner (direct option). It is important to note that the obstacle avoidance
behaviour prevails in both cases, and thus, the robot keeps its integrity in different
environments, provided the robot's sensors can detect the obstacles. A typical problem with
the Nomad-200 robot is that its sonar sensors are placed too high and cannot usually detect
chairs, tables and small furniture in general.
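The deliberative option can be illustrated with a minimal grid-based A* in Python. This is only a sketch: the real a_star option plans over the fuzzy occupancy grid, so the unit costs and the 4-connected neighbourhood here are simplifying assumptions:

```python
import heapq

def a_star(blocked, start, goal, w, h):
    """Shortest 4-connected path on a w-by-h grid, avoiding blocked cells."""
    gx, gy = goal
    frontier = [(0, start)]
    cost = {start: 0}
    came = {start: None}
    while frontier:
        _, cur = heapq.heappop(frontier)
        if cur == goal:                              # reconstruct the path
            path = []
            while cur is not None:
                path.append(cur)
                cur = came[cur]
            return path[::-1]
        x, y = cur
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            if 0 <= nx < w and 0 <= ny < h and nxt not in blocked:
                new = cost[cur] + 1
                if new < cost.get(nxt, float("inf")):
                    cost[nxt] = new
                    came[nxt] = cur
                    # Manhattan distance: an admissible heuristic on this grid.
                    heapq.heappush(frontier, (new + abs(nx - gx) + abs(ny - gy), nxt))
    return None                                      # no path exists

# A wall across the middle with a single gap forces a detour,
# whereas the "direct" reactive planner would head straight into it.
blocked = {(2, y) for y in range(4)}                 # wall at x = 2, open only at y = 4
path = a_star(blocked, start=(0, 2), goal=(4, 2), w=5, h=5)
```

Switching between this planner and a straight-line one is the analogue of swapping the a_star and direct options in the grid declaration of the navigation agent.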
6.6.4 Map-building experiment (map-A)
This experiment, named map-A, shows an autonomous map-building task. A real robot
is placed in a completely unknown environment in order to build a map. This experiment has
been carried out using a set-up similar to that of experiment nav-B: the same robot without the
infrared sensors. As the experiment involves long runs, the BG program uses a localisation
method based on a fuzzy segments map (see chapter 4). This experiment was carried out in a
laboratory floor plan (Figure 6.20), named “AASS-lab-2”, which is made of wooden walls.
The approximate overall dimensions of this scenario are 4.2 by 7.5 metres. The robot starts
somewhere in the environment and tries to build a map of the environment by moving to open
and unmapped areas.
Figure 6.20: AASS-lab-2 floor plan.
The BG program for this experiment is similar to that used in the nav-B experiment, but with two important modifications:
• In addition to the fuzzy grid map, which is used for path-planning, the robot also builds a fuzzy segments map, which is used for localisation. The combination of these two kinds of maps has proved to perform very well.
• The planning agent is similar to that of the nav-A experiment, but instead of receiving goal coordinates from a human operator (as a parameter of the BG program), it receives goal coordinates from a new agent, the topology-based planner.
The topology-based planner builds and maintains a global topological map (Duckett and Nehmzow, 1998) by relaxation (Duckett et al., 2000). The topological map (Figure 6.21) keeps track of the places visited (black circles) and also of tentative free places (red circles). The planner selects one of these free places, and sends its coordinates as a new goal to the planning agent.
Figure 6.21: Topological map.
The BGen environment is provided with the Milou robot description and the BG program. The map resulting after a ten-minute run is shown below (Figure 6.22). The darker segments correspond to the global map, and the lighter segments correspond to the local map.
Figure 6.22: Map-building experiment with Milou robot (map-A).
The experiment shows that the robot is able (with the BG program described above and a topology-based planner) to build a map of the environment while staying localised. It is important to note that odometry errors become very significant after some minutes, and the robot is still capable of localising. Although the purpose of the experiment was to test the interoperability of a BG program with a non BGen-based program, the set-up also proved capable of solving the SLAM (Simultaneous Localisation And Map-building) problem. More tests are still needed to verify in which situations or environments this set-up is valid for the SLAM problem.
6.7 Conclusions
The experiments presented in the previous section show that BGen, which includes the
BG language and the underlying BGA architecture, provides a good working environment for
mobile robotics related problems.
BG programs are simple and easy to understand. Without knowing in advance what a BG program does, it is easy to figure out the purpose of the program, although of course, this also depends on how well written the program is. In any case, although the tasks involved in achieving a robot's goal are complex, the BG language hides the details and allows the robot programmer to concentrate on the application domain. All the specific techniques for map-building, localisation, path-planning, etc. are available directly from the language, while new methods can be added in the future. This favours a quick development of robot programs as well as their maintenance.
The BGen environment is open and portable. While all the development of the BGen environment was carried out at the University of Murcia (Spain), in parallel to the development of the Quaky-Ant robot, one of the experiments (nav-B) shows its performance on a completely different robot. The porting and interfacing of the BGen environment to the new system, and the rewriting of the BG program to account for the new robot geometry and speed, took less than two weeks, while the author was at the University of Örebro (Sweden). It is important to note also that, until then, the environment had not been completely tried on a real robot. These facts support that the BGen environment is flexible and easy to port to other platforms.
Java-based robotics environments are practical and usable. It is often argued that Java programs are extremely slow. While it is true that just-in-time compiled Java programs are a little slower than native ones, it is still possible to use Java in most robotics applications. Moreover, the BG interpreter runs on top of a Java virtual machine. Thus, BG programs are interpreted twice, but this is not a problem because most computing-intensive techniques (map-building, localisation and path-planning) run directly as Java libraries. The added value of the BGen system is again its portability and reuse of code at two different levels: at a system level, reusing the whole set of BGen libraries, and at an application level, reusing BG programs for different robot platforms.
The BGen environment is expandable. One of the experiments (map-A) shows that new techniques or modules can be incorporated into the BGA architecture. In this case the topology-based planner is not a Java program but a C/C++ program interfaced by way of the Java Native Interface (JNI). In this way new services or legacy applications can be added.
Chapter 7
Behaviour Fusion Learning
7.1 Introduction
A key issue of behaviour-based control is how to efficiently coordinate conflicts and
competition among different types of behaviour in order to achieve a good performance.
Instead of just inhibiting and suppressing strategies according to assigned priorities, the BGA
architecture uses a fusion method which is based on context dependent blending (see chapter
5). A fuzzy rule base carries out this fusion, enabling/disabling the output of the different
behaviours, both totally and partially. The use of learning techniques (Bonarini and Basso,
1997) (Dorigo and Colombetti, 1998) in the fusion rule base may result in a robot navigation
performance improvement. One of the advantages of this approach is that the system can be
trained with just some simple behaviours, and then, if new behaviours are to be added, only
the fusion rule base needs to be re-trained while preserving the behaviours previously
developed. Different learning techniques have been used for the fusion problem, for instance:
Q-learning of motor schemas in robot soccer and robot formation maintenance (Balch and
Arkin, 1995) (Balch, 1998), genetic programming for hierarchical fuzzy control of robots
(Tunstel, 1995) (Tunstel, 1996), reinforcement learning of behaviour selection in reactive
systems and in sequential behaviour patterns (Dorigo and Colombetti, 1998), Q-learning
based methods for behaviour selection in robot soccer (Stone, 2000), reinforcement learning
of fuzzy behaviours combination (Bonarini and Basso, 1997). Other approaches do not
consider the fusion problem and apply learning techniques to develop single or multiple
behaviours.
The proposed learning process is as follows. The user selects the BGA agent whose
blending rule base is to be learned. An interesting and practical approach is to start with a
predefined set of fuzzy rules, which are then used as input to the learning method. This rule
set can even be the empty set. In either case, the individuals of the genetic algorithm are fuzzy
rule bases (Pittsburgh coding), and the result of the learning process is a fusion fuzzy rule
base. The structure of the learning method is described in the following section.
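Schematically, the process just described might look as follows in Python. This is a sketch only: the encoding is reduced to a flat gene vector, and the fitness function (which in practice comes from simulated navigation runs) is replaced by a stand-in:

```python
import random

random.seed(1)

def fitness(rule_base):
    """Stand-in for the real evaluation (simulated navigation runs in the thesis).
    Here: prefer rule bases whose genes are close to 0.5."""
    return -sum((g - 0.5) ** 2 for g in rule_base)

def evolve(initial, pop_size=20, generations=30, pm=0.1):
    """Pittsburgh coding: each individual IS a whole (encoded) fusion rule base."""
    n = len(initial)
    # Seed the population with the user-defined rule base plus random variants.
    pop = [initial] + [[random.random() for _ in range(n)]
                       for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]            # elitist selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, n)         # simple one-point crossover
            child = a[:cut] + b[cut:]
            child = [g + random.gauss(0, 0.05) if random.random() < pm else g
                     for g in child]             # per-gene mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

best = evolve([random.random() for _ in range(8)])
```

Because the whole rule base is one individual, re-training after adding a behaviour only means re-running evolve with a longer gene vector, leaving the individual behaviours untouched.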
7.2 Behaviour fusion using genetic-based algorithms
7.2.1 Introduction
Genetic algorithms (Biethahn et al., 1995) (Goldberg, 1989) are adaptive procedures of search and optimisation, inspired by the mechanisms of natural evolution, that find solutions to problems (see chapter 1). They imitate, on an abstract level, biological principles such as a population based approach, the inheritance of information, the variation of information via crossover/mutation, and the selection of individuals based on fitness. For any type of application, genetic algorithms are characterised by:
• A genetic-like representation of solutions to the problem.
• A method to obtain or create an initial population of solutions.
• A fitness function to measure how good any solution is. The function plays the role of the environment, in which the best solutions may have a greater probability of survival.
• Genetic operators that affect the composition of children during reproduction.
• A set of parameters used by the algorithm to guide its evolution: population size, number of generations, crossover and mutation probabilities, etc.
Genetic algorithms have shown good results in obtaining fuzzy rule bases, although
usually using different algorithms in the generation and tuning process or just performing one
of these steps (Herrera et al., 1995) (Ishigami et al., 1995). The proposed algorithm for
behaviour fusion (Gómez Skarmeta and Jiménez, 1997) (Gómez Skarmeta and Jiménez,
1999) (Gómez Skarmeta et al., 1997a) has two operation modes:
• Only tuning. The genetic algorithm is used to tune a fuzzy rule base, which has been previously introduced using any method.
• Generation and tuning. The genetic algorithm is used to generate and tune a fuzzy rule base directly from data. It is also possible to incorporate an initial rule base, as in the previous mode, to help the search process. The user defines the maximum number of rules (RMAX) a priori.
In the context of behaviour fusion, the algorithm is used in generation and tuning mode
with an initial rule base obtained manually from user knowledge. The main characteristics of
the algorithm and its parameters are described in the following subsections.
7.2.2 Representation and initial population
To simplify the representation of the fuzzy rules, the algorithm imposes and assumes
some constraints: the fuzzy sets must be trapezoidal and the rules must use only conjunctions.
While the first constraint can influence the performance of the results, the second one does
not impose any restriction. Any set of fuzzy rules with disjunctions and negations can be
reduced to a rule set with conjunctions only by changing the number of rules and
complementing some fuzzy sets. Trapezoidal fuzzy sets are usually used in many control
applications and the experimental results show that they are adequate for this problem.
Conjunction-only MIMO fuzzy rules have the following form (Eq. 7.1):
Ri ≡ if xi1 ∈ Ai1 ∧ … ∧ xin ∈ Ain then yi1 = Bi1 ∧ … ∧ yim = Bim        (7.1)
where xi1 … xin are the n input variables, yi1 … yim are the m output variables, and Ai1 …
Ain, Bi1 … Bim are trapezoidal fuzzy sets, which are defined as follows (Eq. 7.2):
Tij = [t¹ij, t²ij, t³ij, t⁴ij]        (7.2)
Each rule Ri is coded by concatenating all the points that determine the different fuzzy
sets. To avoid using very small fuzzy sets, a cut-off threshold sth is applied. If one fuzzy set
has a support which is smaller than a percentage sth of its range, it is not taken into account for
the final rule. The general format is shown below (Eq. 7.3):
Ri ≡ [a¹i1, a²i1, a³i1, a⁴i1, …, a¹in, a²in, a³in, a⁴in, b¹i1, b²i1, b³i1, b⁴i1, …, b¹im, b²im, b³im, b⁴im]        (7.3)
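With this four-point coding, the support of a set and the cut-off test on sth can be computed directly. The following Python fragment is illustrative only; the sth value and the variable range are assumptions:

```python
def support(t):
    """Support width of a trapezoidal set t = [t1, t2, t3, t4] (Eq. 7.2)."""
    return t[3] - t[0]

def keep_set(t, var_range, sth=0.05):
    """Keep a set only if its support is at least a fraction sth of the variable range."""
    lo, hi = var_range
    return support(t) >= sth * (hi - lo)

wide = [0.1, 0.2, 0.4, 0.5]
tiny = [0.30, 0.31, 0.31, 0.32]
print(keep_set(wide, (0.0, 1.0)))   # kept
print(keep_set(tiny, (0.0, 1.0)))   # discarded: support below 5% of the range
```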
An individual Hi is composed of the concatenation of the different rules and a series of control bits (Eq. 7.4). These are used to determine which rules are enabled or disabled (di) and which ones are background rules (ki). Background rules (see chapter 5) use the same coding as regular fuzzy rules, but they only use the first number, which corresponds to the degree of activation of the rule, and the values representing the consequent. To avoid the excessive generation of background rules, only a percentage bth of the total number of rules RMAX is allowed.

Hi ≡ [R1, …, RRMAX, d1, k1, …, dRMAX, kRMAX]        (7.4)
If the individuals are to be used in tuning mode only, the control bits are not used, and
RMAX corresponds to the initial set of rules. Using this coding scheme, the input variables are
those used to distinguish or establish the different activation contexts and the output variables
are the degrees of activation of the different behaviours.
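A sketch of how such an individual could be decoded into its effective rule base is shown below (Python; the flat-list layout and the sizes n = 2, m = 1, RMAX = 4 are illustrative assumptions consistent with Eq. 7.4):

```python
N_IN, N_OUT, RMAX = 2, 1, 4
RULE_LEN = 4 * (N_IN + N_OUT)          # four trapezoid points per fuzzy set

def decode(ind):
    """Split a flat individual [rules | control bits] into its effective rule base."""
    rules_part = ind[:RMAX * RULE_LEN]
    control = ind[RMAX * RULE_LEN:]
    regular, background = [], []
    for i in range(RMAX):
        d, k = control[2 * i], control[2 * i + 1]
        if not d:
            continue                   # disabled rule: skipped entirely
        rule = rules_part[i * RULE_LEN:(i + 1) * RULE_LEN]
        if k:
            # Background rule: only the activation degree and the consequent count.
            background.append((rule[0], rule[4 * N_IN:]))
        else:
            regular.append(rule)
    return regular, background

genes = [round(0.1 * i, 1) for i in range(RMAX * RULE_LEN)]
ctrl = [1, 0,  1, 1,  0, 0,  1, 0]     # rules 0, 1, 3 enabled; rule 1 is background
regular, background = decode(genes + ctrl)
```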
As stated previously, the learning procedure is run in generation and tuning mode. To
create the initial population two criteria are followed:
• To explore all the available information (user defined fusion rule base).
• To explore the solution space to obtain an appropriate diversity.
According to these criteria, three different alternatives F1, F2, F3 are used to initialise the solutions, which are applied with different probabilities pf1, pf2, pf3. The F1 and F2 forms assume the existence of an initial rule base with k rules, where 1 ≤ k ≤ RMAX. The initialisation methods are described below:
(F1) All the rules of the user defined rule base are introduced at random positions of the solution, the corresponding values of the control set dj are set to 1, and the corresponding values of the control set kj are set according to the type of the rules. The remaining (RMAX − k) rules are obtained randomly according to the domains of the input and output variables, the corresponding values of the control set dj are set to 0 and the corresponding values of the control set kj are obtained randomly.
(F2) Each rule of the solution is, with equal probability, either selected from the initial rule base or obtained randomly. All the values of the control sets dj and kj are generated randomly.
(F3) All the RMAX rules of the solution and their corresponding control sets dj and kj
are obtained randomly.
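The three initialisation forms and the pf1, pf2, pf3 selection can be sketched as follows (illustrative names and simplified rules; for brevity the sketch draws the F1 control bits kj randomly, whereas the thesis sets them according to the type of each user rule):

```python
import random

# Hedged sketch of the initialisation forms F1, F2, F3. `base` is the
# user-defined fusion rule base (k <= R_MAX rules) and random_rule() draws
# a rule from the variable domains.
R_MAX = 20

def random_rule():
    return tuple(round(random.random(), 3) for _ in range(4))

def init_f1(base):
    """F1: user rules at random positions with d=1; the rest random, d=0."""
    rules = [random_rule() for _ in range(R_MAX)]
    d = [0] * R_MAX
    for pos, rule in zip(random.sample(range(R_MAX), len(base)), base):
        rules[pos], d[pos] = rule, 1
    k = [random.randint(0, 1) for _ in range(R_MAX)]
    return rules, d, k

def init_f2(base):
    """F2: each slot from the user base or random, with equal probability."""
    rules = [random.choice(base) if random.random() < 0.5 else random_rule()
             for _ in range(R_MAX)]
    return (rules,
            [random.randint(0, 1) for _ in range(R_MAX)],
            [random.randint(0, 1) for _ in range(R_MAX)])

def init_f3(base):
    """F3: all rules and control bits obtained randomly."""
    return ([random_rule() for _ in range(R_MAX)],
            [random.randint(0, 1) for _ in range(R_MAX)],
            [random.randint(0, 1) for _ in range(R_MAX)])

def initial_population(base, size, pf=(0.33, 0.33, 0.33)):
    """Each individual is built with F1, F2 or F3, chosen with pf1, pf2, pf3."""
    forms = [init_f1, init_f2, init_f3]
    return [random.choices(forms, weights=pf)[0](base) for _ in range(size)]
```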
7.2.3 Genetic operators and parameters
The operation of the algorithm is directed by a series of parameters (Goldberg, 1989).
Basically, in addition to the above-defined parameters, they correspond to the type of genetic
operators used, the probabilities of both mutation (Pm) and crossover (Pc), the maximum
number of generations (Geners) and the population size (Popul).
Since the individuals contain two different types of information (denoted as rule set and
control set) it is necessary to use different genetic operators to perform completely the
recombination process. The following operators are used for the rule set (Gómez Skarmeta
and Jiménez, 1999):
• Simple crossover. Two individuals Hi and Hj are selected as parents for the crossover
operation. Let r and s be two random numbers such that r ∈ [1, RMAX·(n + m + 2)] and
s ∈ [r, RMAX·(n + m + 2)]. The elements of Hi and Hj are interchanged between
positions r and s to obtain new individuals.
• Arithmetical crossover. Two individuals Hi and Hj are selected as parents for the
crossover operation. Let r and s be two random numbers such that
r ∈ [1, RMAX·(n + m + 2)] and s ∈ [r, RMAX·(n + m + 2)], and let α ∈ [0, 1] be a
random number. For each fuzzy number u_k from Hi with r ≤ k ≤ s and for each number
v_k from Hj with r ≤ k ≤ s, the following equation is applied (Eq. 7.5):

u_k^t ← α·u_k^t + (1 − α)·v_k^t,   t = 1…4
v_k^t ← α·v_k^t + (1 − α)·u_k^t,   t = 1…4     (7.5)
• MaxMin crossover. Two individuals Hi and Hj are selected as parents for the crossover
operation. Let r and s be two random numbers such that r ∈ [1, RMAX·(n + m + 2)] and
s ∈ [r, RMAX·(n + m + 2)]. For each fuzzy number u_k from Hi with r ≤ k ≤ s and for each
number v_k from Hj with r ≤ k ≤ s, the following equation is applied (Eq. 7.6):

u_k^t ← max{u_k^t, v_k^t},   t = 1…4
v_k^t ← min{u_k^t, v_k^t},   t = 1…4     (7.6)
• Simple mutation. Let u_k be a fuzzy set which has been selected for mutation, and let α1,
α2, α3, α4 be random real numbers. The new fuzzy set is obtained as follows (Eq. 7.7):

u_k^1 ← α1 ∈ [max{a, u_k^1 − (u_k^2 − u_k^1)/2}, u_k^1 + (u_k^2 − u_k^1)/2]
u_k^2 ← α2 ∈ [u_k^2 − (u_k^2 − u_k^1)/2, u_k^2 + (u_k^3 − u_k^2)/2]
u_k^3 ← α3 ∈ [u_k^3 − (u_k^3 − u_k^2)/2, u_k^3 + (u_k^4 − u_k^3)/2]
u_k^4 ← α4 ∈ [u_k^4 − (u_k^4 − u_k^3)/2, min{b, u_k^4 + (u_k^4 − u_k^3)/2}]     (7.7)

where [a, b] is the domain of the variable corresponding to the fuzzy set u_k.
• Nonuniform mutation. The action of this operator depends on the age of the population,
and its effect is a fine local tuning in the last generations. Let u_k be a fuzzy set which has
been selected for mutation, let r ∈ [0, 1] be a random real number, let s ∈ {0, 1} be a
random bit, and let c be a parameter which represents the degree of nonuniformity. The
new fuzzy set is obtained as follows (Eq. 7.8):

u_k^t ← u_k^t + (b − u_k^t)·(1 − r^((1 − g/Geners)^c))   if s = 0
u_k^t ← u_k^t − (u_k^t − a)·(1 − r^((1 − g/Geners)^c))   if s = 1
for t = 1…4     (7.8)

where g is the number of the current generation, and [a, b] is the domain of the
variable corresponding to the fuzzy set u_k.
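The five rule-set operators above can be sketched on flat real-coded values as follows (an illustrative sketch with our own naming; the thesis applies them to the trapezoid parameters of the rules, but the mechanics are the same):

```python
import random

# Illustrative sketch of the rule-set operators of Eqs. 7.5-7.8.
def simple_crossover(h1, h2, r, s):
    """Interchange the elements of both parents between positions r and s."""
    c1, c2 = h1[:], h2[:]
    c1[r:s + 1], c2[r:s + 1] = h2[r:s + 1], h1[r:s + 1]
    return c1, c2

def arithmetical_crossover(h1, h2, r, s, alpha):
    """Blend each pair of elements in [r, s] with weight alpha (Eq. 7.5)."""
    c1, c2 = h1[:], h2[:]
    for i in range(r, s + 1):
        c1[i] = alpha * h1[i] + (1 - alpha) * h2[i]
        c2[i] = alpha * h2[i] + (1 - alpha) * h1[i]
    return c1, c2

def maxmin_crossover(h1, h2, r, s):
    """One child takes the maxima, the other the minima, in [r, s] (Eq. 7.6)."""
    c1, c2 = h1[:], h2[:]
    for i in range(r, s + 1):
        c1[i], c2[i] = max(h1[i], h2[i]), min(h1[i], h2[i])
    return c1, c2

def simple_mutation(u, a, b):
    """Eq. 7.7 on a trapezoid u = [u1, u2, u3, u4] with domain [a, b]: each
    point is redrawn within half-intervals that preserve order and domain."""
    return [random.uniform(max(a, u[0] - (u[1] - u[0]) / 2), (u[0] + u[1]) / 2),
            random.uniform((u[0] + u[1]) / 2, (u[1] + u[2]) / 2),
            random.uniform((u[1] + u[2]) / 2, (u[2] + u[3]) / 2),
            random.uniform((u[2] + u[3]) / 2, min(b, u[3] + (u[3] - u[2]) / 2))]

def nonuniform_mutation(u, a, b, g, geners, c):
    """Eq. 7.8 on one value: the perturbation shrinks as g nears geners."""
    r = random.random()
    delta = 1.0 - r ** ((1.0 - g / geners) ** c)
    if random.randint(0, 1) == 0:        # s = 0: move towards b
        return u + (b - u) * delta
    return u - (u - a) * delta           # s = 1: move towards a
```

Note how, in the nonuniform mutation, the exponent (1 − g/Geners)^c tends to 0 as g approaches Geners, so the perturbation vanishes and the search becomes purely local.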
As the control sets are bit strings, simple binary operators are used (Gómez Skarmeta
and Jiménez, 1999). They are described below:
• Classical crossover. Two individuals Hi and Hj are selected as parents for the crossover
operation. Let r and s be two integer random numbers such that r ∈ [1, RMAX] and
s ∈ [r, RMAX]. The elements of dr and ds, and kr and ks, are interchanged to obtain new
individuals.
• Classical mutation. Let di and ki be two control bits which have been selected for
mutation. The new control set is obtained by assigning 0 or 1 randomly to di and ki.
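The two control-set operators admit a similarly brief sketch (again with illustrative naming; d and k are the bit strings of Eq. 7.4):

```python
import random

# Sketch of the binary control-set operators described above.
def classical_crossover(d1, k1, d2, k2, r, s):
    """Interchange the bits of d and k between positions r and s."""
    nd1 = d1[:r] + d2[r:s + 1] + d1[s + 1:]
    nd2 = d2[:r] + d1[r:s + 1] + d2[s + 1:]
    nk1 = k1[:r] + k2[r:s + 1] + k1[s + 1:]
    nk2 = k2[:r] + k1[r:s + 1] + k2[s + 1:]
    return (nd1, nk1), (nd2, nk2)

def classical_mutation(d, k, i):
    """Reassign bit i of both control sets at random."""
    d, k = d[:], k[:]
    d[i], k[i] = random.randint(0, 1), random.randint(0, 1)
    return d, k
```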
When crossover or mutation are to be applied, the algorithm selects one of the
previously defined methods randomly. Although different probabilities may be selected, in
the current application all the operators have the same probability of being selected.
7.2.4 Fitness function
While the other algorithm-related topics are more or less standard, the definition of the
fitness function depends strongly on the application context and is usually not
straightforward. This function has to measure how good each individual is, which
directly affects the performance of the robot. Thus, the fitness function takes into account how
the robot executes a predefined task. The user must define a simulation framework with, at
least, the following properties or parameters:
• Goal: where the robot starts, where the robot ends, and what the robot has to do.
• Map: a floor plan with walls, doors, corridors and obstacles.
• Iters: the number of simulation runs per individual.
To avoid infinite loops in the simulation some additional parameters are needed:
• Timeout: maximum allowable time for each run, measured in epochs.
• Maxcolls: maximum allowable number of collisions.
Evaluating the fitness function for each individual implies the use of its fuzzy rule
bases as fusion rules in a BG program, and then the simulation of the robot with those BG
programs, with a different BG program for each individual in each generation. When a
simulation finishes, it returns the following values:
• Gotcha: -1.0 if the goal is accomplished, 1.0 otherwise.
• Epochs: the total number of simulation steps performed.
• Colls: the number of robot-wall collisions.
• Stage: the last portion of the whole task already completed by the robot.
• Way: the percentage of the distance left from the start to the goal location in the current
stage.
The way (Eq. 7.11) is computed as the percentage that the Euclidean distance between the
current robot and goal locations (Eq. 7.10) represents of the shortest Euclidean distance
between the start and goal locations (Eq. 7.9). The fitness function (Eq. 7.12) is then
computed using the different parameters and values defined above.

Total = √((startx − goalx)² + (starty − goaly)²)     (7.9)

ToGoal = √((robotx − goalx)² + (roboty − goaly)²)     (7.10)

Way = (ToGoal / Total) × 100     (7.11)

Fitness = Σ_{i=1}^{Iters} [(Timeout_i − Epochs_i) − k1·Colls_i − k2·Gotcha_i − k3·Stage_i·Way_i]     (7.12)
The fitness function thus defined tries to take into account the different aspects relevant
to a good robot performance: rewarding the full completion of the task (gotcha), rewarding
low execution times (epochs), and punishing collisions with the walls (colls). In addition, this
function is required to measure somehow the robot’s performance, even if it did not complete
the task. Thus, it also takes into account: the number of subgoals completely achieved (stage),
and the degree of completion of the last subgoal (way). Otherwise, the evolution cannot
differentiate between different individuals that are not good enough as final solutions, but
could be good candidates to continue evolving. This is very important at the beginning of the
learning process.
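The fitness evaluation described above can be sketched as follows (a minimal sketch; the field names and the k1, k2, k3 weight values are illustrative, since the thesis does not fix them here):

```python
import math

# Hedged sketch of Eqs. 7.9-7.12. Each run dict holds the values returned
# by one simulation (epochs, colls, gotcha, stage, way).
def way(start, goal, robot):
    total = math.dist(start, goal)       # Eq. 7.9: start-goal distance
    to_goal = math.dist(robot, goal)     # Eq. 7.10: robot-goal distance
    return 100.0 * to_goal / total       # Eq. 7.11: % of the distance left

def fitness(runs, timeout, k1=10.0, k2=100.0, k3=1.0):
    """Eq. 7.12, summed over the Iters runs of one individual: reward low
    epochs and success (gotcha = -1.0), punish collisions and distance left."""
    return sum((timeout - run["epochs"])
               - k1 * run["colls"]
               - k2 * run["gotcha"]
               - k3 * run["stage"] * run["way"]
               for run in runs)
```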
7.3 Experimental results
Different experiments have been carried out to test the performance of the genetic-based
algorithm for the behaviour fusion problem. In all cases, the learning process starts from a
given BG program, which is the one described previously as a sample architecture (see
chapter 6). The BG program specifies the number of input and output variables, which
correspond respectively to the different fusion variables (Figure 6.3) and the reactive
behaviours (Figure 6.2) of the reactive control agent. It also specifies the initial fuzzy rule
base (Listing 6.10), which is used in the initialisation process. The task (simulation parameter
Goal) has been defined in a similar way as in experiments nav-B and nav-C of the sample
architecture: the robot starts at a home location, it has to reach a goal location, and then it has
to return to the home position. As the fitness function is computed using the results of a
simulation, this needs to take the least time possible. For this reason only very small
environments are used (simulation parameter Map). Small environments have the
disadvantage of presenting only a few of the situations which are useful for the learning process.
Thus, the approach used for training the fusion rule base is to run the simulation in three
different small scenarios which are complementary (Figure 7.1). These scenarios represent
three classical reactive problems: the goal is occluded by a symmetrical obstacle, the robot
has to turn to the left while the goal is to its right, and the robot has to turn to the right while
the goal is to its left. If only one simulation is run per individual (simulation parameter Iters),
only the first scenario is used. Otherwise, the other two are also used if Iters is set to 2 or 3. It
is important to note that the alignL and alignR behaviours (see chapter 6) are not used due to
the size of the scenarios: these behaviours are intended for larger environments with wider
corridors.
Figure 7.1: Simulation maps for learning.
Different tests have been carried out to select the appropriate parameters of the genetic
algorithm, which are summarised below (Table 7.1). As can be noted, the mutation and
crossover probabilities are higher than in typical genetic-based systems. The reason is related
to simulation execution time. If the probabilities are set to low values the exploratory time
(i.e. the number of generations needed to explore different parts of the solution space) is
greater. If the probabilities are set to higher values the exploratory time is smaller at the
expense of a poorer local search. In the current application the global search capability is
preferred, and thus, high probabilities are used.
Parameter      Meaning                                        Value
pf1, pf2, pf3  Initialisation probabilities for F1, F2, F3    0.33, 0.33, 0.33
sth            Set size cut-off threshold                     0.05%
bth            Background rules number threshold              7.5%
Pm             Probability of mutation (control set)          0.15
               Probability of mutation (rule set)             0.25
Pc             Probability of crossover (control set)         0.3
               Probability of crossover (rule set)            0.9
RMAX           Maximum number of rules                        100
n              Number of input variables                      6
m              Number of output variables                     5

Table 7.1: Parameters of the genetic algorithm.
Using the previously defined learning environment setup and genetic parameters, an
experiment (Test#31) is performed with the initial fusion rule base modified so that the robot
neither reaches the goal nor collides. The underlying idea is to test the exploratory ability of
the algorithm because the initial fusion rule base is very simple. However, it is not a search
from scratch because the rules provide a good starting point. The specific parameters of the
experiment are shown below (Table 7.2). The fitness function (Figure 7.2) starts with negative
values, which correspond to runs where the goal is not reached, and ends with high positive
values, which correspond to runs where the goal is reached in a small number of epochs. The dark
curve corresponds to the fitness of the best individual, while the light one corresponds to the
average fitness of all the individuals. It can be noted that the population is benefiting from the
evolution (the average is slowly growing). The learning process took 111.9 hours to finish on
a Mac PowerPC G3 at 300 MHz.
Parameter   Meaning                                 Value
Geners      Number of generations                   1033
Popul       Number of individuals                   8
Iters       Number of simulations per individual    2
Timeout     Maximum number of epochs                500
Maxcolls    Maximum number of collisions            5

Table 7.2: Parameters of Test#31.
Figure 7.2: Evolution of the fitness function (Test#31).
As stated previously (see chapter 5), the behaviour definition approach of BG and the
underlying BGA model allow the use of weights to indicate the importance of each behaviour.
In the learning context these weights play an important role because the information is not
evolved and is kept static during the learning process. This information is contributed by the
user, who acts as a domain expert. The behaviour fusion mechanism uses both the fusion
fuzzy rules and the behaviour weights. Thus, evolving just the fuzzy rules makes sense, and it
is justified by the fact that weights establish high level priorities, while fuzzy rules give a
finer control. To test the influence of the weights in the learning process, two experiments
(Test#34 and Test#35) have been carried out. The first one has been performed using different
weights for the different behaviours and the second one has been performed without using
different weights (i.e., all the weights have been set to 1.0). The specific parameters of both
experiments are shown below (Table 7.3).
Parameter   Meaning                                 Value
Geners      Number of generations                   50
Popul       Number of individuals                   6
Iters       Number of simulations per individual    2
Timeout     Maximum number of epochs                700
Maxcolls    Maximum number of collisions            7

Table 7.3: Parameters of Test#34 and Test#35.
The same set of initial fusion rules as in the nav-B experiment (see chapter 6) is used to
launch two learning processes: one which uses user-defined behaviour weights and another
which does not use behaviour weights. Each learning process took 15.7 hours to finish on a
PC Pentium III at 450 MHz. The first one (Figure 7.3) achieves a lower best fitness value but
achieves a much better population average.
Figure 7.3: Evolution of the fitness function (Test#34 and Test#35).
To test the generalisation capabilities of both approaches a series of experiments has been
carried out. Each experiment consists of five tests to measure the performance in different
scenarios, which are described below:
• Test A*: the initial rule set using behaviour weights and a path planner using the A*
algorithm.
• Test A*+GA+: the same as above but the fusion rules are those of Test#34.
• Test A*+GA-: the same as above but the fusion rules are those of Test#35.
• Test Direct: the initial rule set using behaviour weights and a path planner using the
reactive approach.
• Test Direct+GA+: the same as above but the fusion rules are those of Test#34.
The experiment has been run five times, and the results and their averages are
summarised below (Table 7.4). As can be noted, the use of weights produces better results in
all cases: the average of the test A*+GA+ is greater than the average of the test A* and the
average of the test Direct+GA+ is greater than the average of the test Direct. The test A*+GA-
produces better local results but on average produces poorer results compared to the tests A*
and A*+GA+. It should be noted that the tests involving the reactive path planner are more
difficult because the timeout is fixed for all tests, and, as has been shown (see chapter 6), it
does not produce optimal paths.
Tests      A*      A*+GA+   A*+GA-   Direct   Direct+GA+
           533     541      558      520      529
           529     550      -1010    -1135    -635
           523     549      -1140    508      341
           522     548      569      513      513
           514     539      -1320    -1140    509
Average    524.2   545      -468     -146.8   251

Table 7.4: Influence of priorities in the learning process.

7.4 Conclusions
A genetic-based method has been developed to assist in the development of behaviour
fusion fuzzy rule bases. The algorithm is used to obtain or learn a new fusion rule set from an
initial set of rules provided by the user (this may be the empty set). The genetic algorithm has
been adapted to encode individuals which are converted to BG fusion fuzzy rule bases. Thus,
these individuals can be standard MIMO fuzzy rules and also background fuzzy rules.
Moreover, the fitness function has been defined to include the effects of the experimental
evaluation of the robot’s behaviour by using simulation. This function takes into account
different parameters which measure the robot’s performance in both successful and
unsuccessful runs. This point is crucial in the initial part of the learning procedure because
most of the individuals do not reach the goal, and it is very important to discriminate between
them to speed up the learning process.
Different experiments have been carried out to test the influence of the behaviour
weights and the results of the learning process. As has been shown, behaviour weights can aid
the learning process by narrowing the search space through the use of the user's knowledge of
the domain. Moreover, the learning process improves the performance of the robot with
respect to user-defined fusion rules.
Chapter 8
A Web-Based Teleoperation Application
8.1 Introduction
In recent years, web-based robotics applications (Taylor and Dalton, 2000) have grown
in such a way that many of the major robotics laboratories have applications to access one of
their robots from the web. The roots of web-based robotic applications go back to the
early days of teleoperation. In teleoperation, a human operator forms part of the closed-loop
control of a robot. At its simplest, communication is through mechanical linkages and
feedback is by direct viewing. Such equipment has long been used in the nuclear industry,
where just a few centimetres of lead glass separate the operator from the robot. Teleoperation
from greater distances through communication links and with video or other forms of
feedback finds application in hazardous environments, notably in the nuclear industry
(hundreds of metres), the control of underwater remotely operated vehicles (several
kilometres), and space applications (thousands of kilometres). However, regardless of the
application, all teleoperation systems have the following specific features (Sayers, 1999):
• An operator interface, incorporating a master input device that the operator uses to
command the system.
• A slave output device that performs the operator's commanded actions at the remote site.
• A communication scheme between sites.
As teleoperation systems have moved from a mechanical connection to a digital
technology, the opportunities for assisting the operator have increased. Possibilities for
operator aids include (Sayers, 1999):
• Reindexing. Reindexing is where the connection between master and slave devices is
temporarily removed, the master arm is repositioned, and then the connection is
re-established. This allows a master device with a relatively small workspace to move a slave
device through a large motion by using a series of discrete motions. This technique is
basically used with teleoperated robotic arms (Sayers, 1999).
• Feedback interfaces. The use of video cameras at the remote site has long been used as a
feedback aid to provide views of the task site from different points of view. Video
transmission demands high bandwidth availability. When this is not possible, computer-generated
imagery supplies the operator with a virtual interface that combines low
bandwidth sensory data to form a realistic image. These virtual interfaces are also useful
to overlay computer predictions or visual clues onto video images (Kim and Stark, 1989).
• Kinesthetic aids. For some tasks it is interesting that the operator receives physical
feedback in addition to video feedback. These haptic-augmented systems give superior
performance compared to purely visual systems. Using haptic systems the operator can
sense the response of his commands directly in the control interface. This technique is
applied in teleoperated robotic arms and in the control of remotely operated vehicles (Brooks et
al., 1990).
When there is a significant time delay in the communication link (that is, as the distance
between sites increases), instability occurs and manual closed-loop control is no longer
suitable (Sheridan, 1992). There are some techniques to cope with such delays:
• Long range teleoperation. This was the first way of eliminating manual closed-loop control.
It consists of a classical open-loop control where the operator does not have direct
feedback. These systems tolerate delays by using the human operator as an intelligent link
to the control system. An error in the pose of the remote device is visible in the returned
imagery, and the operator sends commands to the remote device to compensate. While
open-loop control systems are certainly not optimal, they are simple and they work.
However, as delays become large, the efficiency with which the operator can work drops
dramatically (Ferrell, 1965).
• Predictive displays. One way to improve performance is to provide the operator with
immediate feedback. Because of the communications delay, that immediate feedback
cannot come from the remote site, and instead it must be generated at the operator station.
Predictive display is suitable where a model of the system is available. Examples of such
systems include manipulator control (Machida et al., 1990) and underwater vehicle
control (Sayers, 1999).
• Supervisory control. One technique that can improve operator performance and avoid the
instability problem is a shared or supervisory control scheme, where control of the robot is
shared between a local control loop and the remote human operator (Sheridan, 1992). The
goal is not to make a robot that can realise the full task autonomously but to enable the
robot to perform some simple subtasks that the operator can sequence.
Web telerobotics is a concept still in its infancy. It involves control of a remote robot or
device from within a web browser over the Internet. It requires a supervisory control scheme
to avoid instability and makes a robot available to a vast number of people, thus opening up a
new range of applications. In August 1994, a SCARA type robot was connected to the
Internet at the University of California at Berkeley7. Four weeks later, in September 1994, an
ASEA IRB-6 robot was connected to the Internet through a web server at the University of
7 http://www.usc.edu/dept/garden
Western Australia 8 so that anyone with web access could direct the robot to manipulate
wooden blocks on a table. These were the first physical devices connected to the web (Taylor
and Dalton, 2000a). In August 1996, the IRB-6 robot was replaced by an ABB IRB 1400
robot. The software was rewritten to work with the new robot and to incorporate knowledge
gained from the first version. The robot is now in use most of the time.
Other groups have since pursued this concept. Cox9 put up a system that allows web
users to remotely schedule photos from a robotic telescope. Paulos and Canny have
implemented several Internet robots with elegant user interfaces (Paulos and Canny, 1996).
Bekey and Goldberg used a six-axis robot, rotating platform, and stereo cameras to allow
remote scholars to closely inspect a rare sculpture (Akatsuka et al., 1998). An impressive
variety of other Internet robot projects were reported in the recent IROS and ICRA
workshops. To get a rough idea about the number and type of such web based robots see
(Taylor and Dalton, 2000a)(Goldberg et al., 2000)(Saucy and Mondada, 2000)(Schulz et al.,
2000)(Taylor and Dalton, 2000b)(Stein, 2000)(Simmons et al., 2000)(Hirukawa and Hara,
2000).
Additionally, there is a long list of mobile robots that can be controlled over the web.
Most of these systems provide exclusive remote control to a single person or provide queues
to schedule user requests. KhepOnTheWeb (Saucy and Mondada, 2000) is a typical
representative of mobile robots with a web interface. Users can give elementary movement
actions to the robot and observe it using several web-cams. Xavier (Simmons et al., 2000) was
probably the first mobile robot to operate in a populated office building controlled through the
web. Xavier can be advised by web users to move to an office and tell a “knock-knock” joke
after arrival. The robot collects requests both on-line and off-line and then processes them
during special working hours. After the completion of the tasks, the robot informs the user via
e-mail. The Autonomous Observer (Simmons et al., 1999) consists of a semi-autonomous
mobile robot that follows routes specified by a remote user, searches for soft-drink cans, and
collects them in a bag. For this mission the robot has a gripper and a basket attached. The
remote user sends the robot requests to move to different places and when to look for cans. The
Rhino and Minerva museum tour-guide robots (Schulz et al., 2000) are good examples of
successful interaction of web-robots with people in real-world environments. They operate in
highly populated museums, performing the function of a robotic tour guide, and leading the
people in the museum through exhibitions. The web is used to make the museum accessible to
remote visitors, who watch images from the robot's camera and who can control the robot's
operation. Web-based operation can take place while the museum is open, which means that
both virtual and physical visitors share the control of the robot. NASA has also used the
Internet for the control of vehicles in space (Tso et al., 1998). For example, NASA's rover interface10
provides the user with different views of the surrounding environment. The images in these
8 http://telerobot.mech.uwa.edu.au
9 http://www.eia.brad.ac.uk/rti
10 http://mars.graham.com/wits
views are calibrated and contain range data. The user performs measures of the environment
and navigation through a point-and-click interface.
Although the field is rather immature and there are no commercial services provided
with these technologies, there are a large number of groups pursuing this concept. One of
these is well funded and provides an Internet-accessible machining service, called
CyberCut11. This suggests commercial services will become common in the future. Web-based
telerobotics is suitable not only for classical hazardous environments but also for new
areas such as the following:
• Tele-manufacturing. The CyberCut machining service is available from the web, and it
seems to be the first commercial venture of web-telerobotics. Remote access to machinery
can reduce production costs by reducing external manpower. This is a very promising
application.
• Training. Providing access to robots and other expensive equipment for training purposes,
where purchasing cannot be justified, is another promising area. Nevertheless, it seems
that there are currently no web-based training applications.
• Entertainment. Following the experience of the University of Western Australia web-based
robotic arm, it is apparent that many people consider operating them an
entertainment. There are other successful experiences of museum tour-guide robots like
Rhino and Minerva (Schulz et al., 2000).
• Surveillance. The Autonomous Observer (González-Baños et al., 1999) is an incipient
web-based application where a mobile robot can detect and track objects. This has
practical use in surveillance and law-enforcement applications. The operator can
command the robot to follow a suspicious subject without being exposed to danger. Web-operated
surveillance robots can also be useful for personal applications when the
owners of a house go on holiday.
• Exploration. The experiments carried out by NASA (Tso et al., 1998) have shown that
there is a potential application of web-robotics in space missions. A private company
called LunaCorp, in conjunction with Carnegie Mellon University, plans to launch the
first private lunar mission12. The project involves landing a pair of web-based
teleoperated robotic vehicles on the Moon's surface. This application is intended also as
an entertainment mission.
11 http://cybercut.berkeley.edu
12 http://www.ri.cmu.edu/lri
8.2 Web robotics technologies
Currently there are many different technologies which enable a robot to be teleoperated
from a network. The term web-telerobotics is commonly used to refer to web based
telerobotics applications, that is, teleoperation applications that can be run inside, or be
launched from, a web browser. What these applications have in common is that, as long as the
client machine has a compatible web browser and the required Java Virtual Machine (Java
VM), they are platform independent. This is not really a constraint because there are free
browser implementations and free Java VMs for almost all typical operating systems and
machines. This section describes the technologies involved in some successful web-robotics
applications. These do not include cases like the GRIP13 system (González-Baños et al.,
1999), which makes use of X-Windows programs for the client side and cannot be considered
a truly platform-independent solution: an X-Windows server is needed on the client machine,
and in some cases it is difficult to get free implementations.
8.2.1 The UWA Telerobot
The goal of the University of Western Australia's Telerobot (Taylor and Dalton, 2000a) is to
provide enough information to the remote operator of a robotic arm to make interpretation
easy while minimising transmitted data and maintaining a simple layout. The Telerobot,
which initially was an ASEA IRB-6, is an ABB IRB 1400 robotic arm. The operator users of
the robot submit requested moves by filling in an HTML form, by clicking on images of the
workspace, or by specifying multiple moves in a script form. Additionally, there are observer
users that do not control the robot but observe what the robot is doing.
The control method of the UWA Telerobot (Figure 8.1) is implemented by submitting
the details of an HTML page, the user interface (Figure 8.2), to a web server, which receives
the request and launches a CGI script. One CGI is launched for every request, and several
copies may be running simultaneously to serve an operator and many observers. Only one
person at a time can control the robot. The CGI script determines whether the request has
come from an operator, and in such case, establishes communication with the robot server and
the image server. It then generates an HTML page from a template, replacing tags
representing variables with their current values. The image server provides some services
such as operator-controllable image sizes, a software zoom, and images from multiple
cameras. Once a move is complete, new images of the workspace are taken, and a new form
with these images and the robot position is returned via CGI to the user.
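The template step described above can be sketched as follows (an illustrative sketch, not the actual UWA code; the %%NAME%% tag syntax and the pose values are our assumptions):

```python
import re

# Sketch of template-tag substitution: the CGI script generates the returned
# HTML page by replacing tags representing variables with current values.
def render(template, values):
    """Replace each %%NAME%% tag with the corresponding current value."""
    return re.sub(r"%%(\w+)%%",
                  lambda m: str(values.get(m.group(1), "")),
                  template)

page = render("<p>Robot pose: %%POSE%%. Last move: %%MOVE%%.</p>",
              {"POSE": "x=120 y=85 z=30", "MOVE": "grip open"})
```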
13 http://underdog.stanford.edu/GRIP
Figure 8.1: UWA Telerobot system architecture.
a) Operator/observer interface.
b) Live video feedback.
Figure 8.2: UWA Telerobot user interface.
Other technologies like JavaScript and Java have been used in Telerobot. JavaScript
provides the capacity for limited local data processing instead of relying on CGIs for interface
management. When specifying 3D locations using images, clicks in two different images are
272
HUMBERTO M ARTÍNEZ BARBERÁ
needed. Under normal CGI operation a click results in a request being sent to the web server.
JavaScript is used to allow specification of robot moves with a single submission to the web
server. Java is used to produce a wireframe model of the robot, which allows users to simulate
moves before submitting them to the real robot.
Some other web-based teleoperated robotic arms are very similar in both operation and
technological support. For instance, the Mercury Project (Goldberg et al., 2000) allows users
to command the remote arm to perform archaeological-like tasks.
8.2.2 EPFL KhepOnTheWeb
The goal of the Swiss Federal Institute of Technology of Lausanne's KhepOnTheWeb (Saucy and
Mondada, 2000) is to provide access, through network communication facilities, to a
complex and unique mobile robotics setup for carrying out research in mobile robotics. The
system consists of a mobile robot that moves in a wooden maze. The robot is a small Khepera
mobile robot (55 mm in diameter) which carries a small color CCD camera. Additionally
there is a panoramic overhead camera. The labyrinth is made in such a way that the operator
can see several effects. It contains a mirror that allows the user to watch the robot, some walls
are lower than the robot allowing the user to look further, and there is a ramp that gives the
robot access to a view outside the maze.
Figure 8.3: EPFL KhepOnTheWeb system architecture.
The control method of KhepOnTheWeb (Figure 8.3) is based on plain HTML pages.
When the user connects to the system (Figure 8.4) the server creates a new CGI process,
which continuously updates the video on the user interface. The user, using clickable images,
can control the robot movement, the overhead camera orientation and zoom, and has the
option of changing from robot camera to overhead camera. Additionally, a Java applet
running in the client continually requests information about the state of the robot and the time
left to the user.
There is no local intelligence on the robot such as obstacle avoidance. Thus, the
operator performs a closed-loop control of the robot using video as the feedback. In the
Khepera case, there is no need for intelligent-like behaviours because of the robot’s light
weight. This also means that there is no risk of destroying a wall or the robot itself. The
KhepOnTheWeb designers chose this option because having direct control has the advantage
that the user can see the result of his own action without any external contribution. Of course,
this has a major drawback: the control of the robot is more difficult without help and under
important delays. Moreover, the robot operates in a tethered mode, and thus, autonomy is not
a concern: power is always available and the robot does not run out of batteries. Although the experience
with KhepOnTheWeb is quite satisfactory for the user, the approach does not scale to real
world environments.
a) Operator interface.
b) Overhead live video feedback.
Figure 8.4: EPFL KhepOnTheWeb user interface.
Additionally, a virtual model of the experiment has also been developed using VRML
and Java. This provides the user with the possibility of training himself by running a
simulation locally on his machine, and without time-delay constraints (Michel et al., 1997).
8.2.3 CMU Xavier
The Carnegie Mellon University’s Xavier (Simmons et al., 2000) web venture started as
an aid in testing navigation algorithms. A web site was provided so that users could send tasks
to the robot and have some feedback on the degree of completion of the task. The experiment
consists of an autonomous mobile robot that moves in a real building. The robot is a custom
mobile robot (0.6 m in diameter) based on a robotic base from Real World Interface, Xavier,
which has a full set of sensors (bumpers, encoders, laser, and sonars) and a color CCD
camera.
The control method of Xavier (Figure 8.5) is very simple and is based on plain HTML
pages. There are two different interfaces: the observer interface (Figure 8.6-b) and the
operator interface (Figure 8.6-a). When an observer connects to the system the server creates
a new CGI process, which continuously updates the video on the user interface. The user is
provided, in addition to the video, with a map of the building and the current position of the
robot. The operator interface is just another HTML page where the user can select the required task
and then send the command to a CGI. When the task is completed, the user is notified via email. There are three types of tasks: telling a joke, saying hello, or taking a picture. In all
cases the user selects the destination room where the task is to be performed.
Figure 8.5: CMU Xavier system architecture.
Xavier, as opposed to KhepOnTheWeb, does not permit operator closed-loop control.
Instead, it allows only very high-level task specification. To achieve the desired task, the
robot has a task sequencing and a navigation module (Figure 8.5). The robot is provided in
advance with a model of the building. The navigation module is composed of four layers:
• Servo-control. This layer controls both the base and the pan-tilt head, and provides simple
  velocity and/or position control. It also provides feedback on command execution and
  position information, based on encoder readings.
• Obstacle-avoidance. This layer keeps the robot moving in a desired direction, while
  avoiding static and dynamic obstacles. This module is based on the Lane-Curvature
  method to find highly traversable lanes in the desired direction, and the Curvature-
  Velocity method to switch between lanes and avoid dynamic obstacles (Ko and Simmons,
  1998). Both methods take vehicle dynamics into account to provide safe high-speed
  motion, at an average of 0.45 m/s.
• Navigation. The navigation layer is responsible for getting the robot from one location to
  another. It uses a partially observable Markov decision process (POMDP) model to
  maintain a probability distribution of where the robot is at all times (Koenig and
  Simmons, 1998). Thus, while the robot almost never knows precisely where it is, it rarely
  gets lost.
• Path-planning. This layer determines efficient routes based on a topological map that is
  augmented with rough metric information. It uses a decision-theoretic approach to choose
  plans with high expected utility, and which take sensor and actuator uncertainty into
  account.
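The POMDP position tracking reduces, at each step, to a discrete Bayes update of the belief over locations: predict through the motion model, then weight by the observation likelihood and renormalise. A minimal sketch (illustrative Python, not CMU's code):

```python
def pomdp_update(belief, transition, observation_likelihood):
    """One discrete Bayes/POMDP belief-update step.

    belief:                 probability of each location, summing to 1
    transition[j][i]:       probability of moving from location j to i
    observation_likelihood: likelihood of the current sensor reading at each location
    """
    n = len(belief)
    # prediction step: push the belief through the motion model
    predicted = [sum(transition[j][i] * belief[j] for j in range(n)) for i in range(n)]
    # correction step: weight by the observation likelihood and renormalise
    posterior = [observation_likelihood[i] * predicted[i] for i in range(n)]
    z = sum(posterior)
    return [p / z for p in posterior]
```

Even when no single location dominates the posterior, the distribution remains concentrated near the true position, which is why Xavier "rarely gets lost" despite never knowing exactly where it is.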
a) Operator interface.
b) Observer interface.
Figure 8.6: CMU Xavier user interface.
The Xavier experiment suffers real world problems: inaccuracy of the sensors, moving
obstacles and autonomy. Despite these facts, the system performs satisfactorily most of the
time, with an extremely low failure ratio. Since the robot is being tasked at a high level, high
bandwidth interaction is not needed and both video and position are updated at a very low
rate. Even if communication is lost completely, Xavier can still continue achieving its current
task. The Xavier approach has a real negative impact on web-based interaction: commanding
at a high level is not as interactive as teleoperation. Some other web-based tasked mobile
robots are very similar in both operation and technological support. For instance, the Rhino
and Minerva museum tour-guide robots (Schulz et al., 2000) allow users to select which area
of the museum they want to see. One of the differences between Minerva and Xavier is that
the former uses a higher update rate so that user experience feels more interactive.
8.2.4 NASA WITS
The NASA Web Interface for Telescience (WITS) has been developed to provide
Internet-based distributed ground operations for planetary lander and rover missions (Tso et
al., 1998). WITS provides an integrated environment for operations at a central location and
collaboration with geographically distributed scientists. There are two versions of WITS, the
private one for NASA use, and the public one that is free and available to the general public.
Both versions have similar functionalities, but they differ in real robot kinematics and
commands (as required by law). WITS has been used and deployed satisfactorily in some
NASA Mars missions:
• Sojourner mission (1997). The WITS control interface was tested off-line with a
  simulated rover, but providing real data.
• Polar Lander mission (1998). The WITS control interface was tested on-line to control
  both the robotic arm and the camera of the mission. The public version works like the
  Sojourner version: simulated robot with real data.
• Field Integrated Design and Operations (FIDO) mission (2003). The WITS control
  interface will be used for Internet-based control of the FIDO rover. During May 2000,
  field tests were conducted with a FIDO prototype on Earth (Backes and Norris, 2000).
  During these tests, operators used WITS to visualise downlink data and generate
  command sequences from several laboratories and control centres around the world.
The WITS system architecture (Figure 8.7) includes a database, a server and multiple
clients (Backes et al., 2000a; Backes et al., 2000b). The database is a structured file system
that holds downlink data and uplink sequence information. The server provides
communication between clients and the database. Clients are distributed over the Internet and
provide the interface needed to view downlink data and generate command sequences. The
WITS system is implemented using Java and some of its extensions, like Java3D (for drawing
3D synthetic models) and the Java Cryptography Extension (to assure secure
communications). The user gets the software through an HTML page and then stops using
HTML. The Java applet is then in charge of accessing the user’s local WITS database and the
WITS server. The user generates a sequence of actions locally, using the FIDO simulator to
check the results. When finished, the user sends the sequence to the WITS server. It is
checked using a sequence integration and verification module (SEQ_GEN) and then the full
sequence is sent to the rover. Additionally, the user accesses the data received through the
downlink in the remote WITS database. These data include the robot position, images from
the navigation stereo cameras, the panoramic stereo cameras and other sensors.
WITS, as happens with Xavier, does not allow the closed-loop control of the remote
rover, although the reason is quite different. WITS has been designed to assist in space
missions where delays are very considerable and closed-loop control offers very low
performance or none at all. Nevertheless, WITS, in contrast to Xavier, allows the user to
specify complete tasks without any constraints. The user generates a sequence of actions that
the rover executes at the end of the process. The sequence includes an integrity test.
Figure 8.7: NASA WITS system architecture.
Figure 8.8: NASA WITS user interface.
One of WITS’s advantages is that it allows collaboration (Backes et al., 2000a) between
mission users by providing secure communications and daily downlink data and allowing
them to specify and generate command sequences that are stored in a common database.
Users can see each other’s targets and sequences and can use target inputs from each other’s
sequences. Secure communications are provided through the WITS Encrypted Data Delivery
System (WEDDS) service (Norris and Backes, 2000), which is a communication platform
based on the Secure Sockets Layer (SSL) protocol. WITS is a fully functional Internet-based
teleoperation system which features a very complete and detailed user interface aimed at
assisting in space missions. Although the user is presented with many controls and options,
interactivity is greatly penalised because, as happens with Xavier, the user does not feel that
he is really controlling the robot (this was not a design concern), except when the system is in
simulation mode.
8.3 The BGen-Web system
8.3.1 Objectives
BGen-Web is a proof-of-concept web-robotics exploration application. The BGen-Web
system aim is to provide a web-based service for highly interactive supervisory control
(Sheridan, 1992) of a mobile robot in order to perform exploration of scarcely known, or
completely unknown, environments. This objective is very similar to some of the examples
described in the previous subsection. Nevertheless, it also exhibits features not found in those
examples. The BGen-Web application’s objective can be summarised in the following points:
• The application should run in real world environments, both indoor and outdoor. In the
  latter case, as there is no all-terrain vehicle available, only the on-road case is considered.
• The intended use is for exploration and surveillance of scarcely known or completely
  unknown environments.
• The system should be built on top of the BGA architecture and thus use all the resources
  from the software already developed.
• The user’s experience should be as highly interactive as possible, limited only by network
  bandwidth requirements.
• The user’s degree of control should be profile dependent, from operator closed-loop
  control (mostly for data-gathering use) to operator supervisory control (for regular remote
  operators).
A major premise is that neither the robot nor the remote operators know the
environment in advance. In some sense, both the NASA WITS and EPFL KhepOnTheWeb
systems can be considered to belong to the same class of applications. In both cases, the first
time the operator connects to the system, he does not know anything about the environment
and has to rely on the robot’s sensors to get a model of the world. The key difference is that in
BGen-Web the user himself does not build a model of the environment but uses the model
provided by the robot. Thus, in BGen-Web the robot is provided with range sensors to
construct a model of the environment and positioning sensors to locate the robot within the
model. In addition the robot has a camera to provide the operator with video feedback.
Although the operator can be allowed to close the control loop, the operator typically only
sends desired goal locations, referred to the robot’s initial position, and then the robot moves
autonomously to that place. While the robot is moving, it constructs a map of the environment
and computes the path to reach the goal, without colliding with obstacles. Hence, BGen-Web
makes full use of the characteristics of BGen and the sample agent architecture developed for
the Quaky-Ant robot.
Another important aspect of BGen-Web is the task granularity. KhepOnTheWeb
provides interface to control at the lowest level, i.e., direct control of the motors. WITS
provides control commands at a higher level, for instance, positioning the robot at a given
heading and distance. Xavier provides control commands at the task level, for instance, going
to a previously known room. BGen-Web provides commands at an intermediate level
between WITS and Xavier. Thus, BGen-Web achieves interactivity levels beyond those of
KhepOnTheWeb (much more information is provided to the operator), and WITS and Xavier
(they do not feel interactive). Additionally, BGen-Web, like WITS and Xavier, operates in
real world environments. Nevertheless, BGen-Web operation time and distance is somewhat
limited because it incorporates localisation methods that degrade noticeably with time and
distance. In any case, as was stated at the beginning of the section, the system is intended to
be a proof-of-concept and, by no means, a robust and readily deployable system.
8.3.2 BGen-Web architecture
As shown in the previous examples (Telerobot, KhepOnTheWeb, and Xavier), HTML-based interfaces are not the answer for highly interactive systems because HTML demands a
lot of processing on the server side. As WITS demonstrates, the way to accomplish the
objective is by increasing the processing on the client side, and thus, Java technology provides
a multi-platform and easily deployable framework to achieve high performance interfaces (in
the multi-platform context). As the whole BGA and BGL implementations are already based
on Java, it is a very simple process to incorporate previous developments in a web-based
architecture. Although the BGen-Web architecture is not robot dependent, the current
implementation relies on the Quaky-Ant robot (see chapter 2). Among other elements, it has a
series of sonar and infrared sensors, a video camera, a radio modem for bidirectional data
transmission and a 2.4 GHz video emitter.
As in WITS, the BGen-Web architecture (Figure 8.9) uses HTML as the means of
distributing software and configuration files, and launching the Java applet. The clients
provide interfaces to sensor data, video feedback, and robot control. The web server provides
the software repository and a gateway to the BGA architecture protocol (KQML). The video
server provides the real-time video. Both servers can run physically on the same machine. The
BGA gateway acts as a proxy. It receives requests from the different clients and puts the
information onto the blackboard; it then reads sensor data from the blackboard and sends the
information to the different clients.
Figure 8.9: BGen-Web system architecture.
The real-time video feedback has been developed using the Java Media Framework
(JMF), which is an API that allows incorporating time based media types in Java applications
and applets. It allows programmers to develop software that presents, captures (using video
cameras, microphones, etc), and stores such media, and also controls the kind of processing
applied to the multimedia flows. Moreover, JMF offers RTP support to be used with on-demand audio and video applications or interactive videoconferencing applications.
The Real-Time Transport Protocol (RTP) is a protocol that provides end-to-end real-time data transmission. It is the Internet standard for audio and video transport, as it is defined
in the IETF (Internet Engineering Task Force) RFC 1889, developed by the AVT working
group (Audio Video Transport). RTP by itself does not offer on time delivery of packets or
other quality of service (QoS) warranties. RTP is therefore complemented with the Real-Time
Control Protocol which allows monitoring the quality of data distribution and controlling and
identifying RTP transmissions. An RTP stream is a series of UDP packets, which have their
origin in a given source and can be sent to a multicast address, thus producing a multi-receiver emission which makes scalable use of the network. Each packet has two parts: a
header structure and a payload. There is a 7-bit payload-type identification field in the
header. Its value is an index in a payload format profile. The mappings for audio and video
payloads are specified in the RFC 1890.
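The fixed header layout just described can be sketched as a small parser (illustrative Python, not part of the Java-based BGen-Web implementation). Field positions follow RFC 1889, and payload type 31 is the RFC 1890 static mapping for H.261:

```python
import struct

def parse_rtp_header(packet: bytes) -> dict:
    """Parse the 12-byte fixed RTP header (RFC 1889)."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,          # 2 bits, always 2 for RFC 1889 RTP
        "padding": (b0 >> 5) & 1,
        "extension": (b0 >> 4) & 1,
        "csrc_count": b0 & 0x0F,
        "marker": b1 >> 7,
        "payload_type": b1 & 0x7F,   # the 7-bit payload-type field
        "sequence": seq,
        "timestamp": ts,
        "ssrc": ssrc,
    }

# Example packet: version 2, payload type 31 (H.261), followed by payload bytes
pkt = struct.pack("!BBHII", 0x80, 31, 1000, 160, 0xDEADBEEF) + b"\x00" * 4
hdr = parse_rtp_header(pkt)
```

The payload-type value is then looked up in the payload format profile to select the decoder, which is how a receiver knows the stream carries H.261 video.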
The media type used for video transmission in BGen-Web is the H.261 ITU
recommendation (Tekalp, 1995). It is a video compression standard developed to support
ISDN videoconferencing services at n x 64 kbps, with n = 1 … 30 (n is the number of ISDN
channels). The H.261 standard presents two important features: it specifies a maximum
coding delay of 150 ms (the user does not appreciate interactive visual feedback with a
greater delay), and the low-cost VLSI hardware implementation of the codecs is possible (this
is very important in standardising the format in videoconferencing equipment). The input
image format is specified using the Common Intermediate Format (CIF). For low bit-rate
applications there is also a smaller format (QCIF) which is a quarter of the CIF. The CIF and
QCIF formats are specified as follows (Table 8.1):

                               CIF                  QCIF
Number of pixels / line
  Luminance (Y)                360 (352)            180 (176)
  Chrominance (U, V)           180 (176)             90 (88)
Number of lines / image
  Luminance (Y)                288                  144
  Chrominance (U, V)           144                   72
Interlacing                    1:1                  1:1
Temporal rate (frames/s)       30, 15, 10 or 7.5    30, 15, 10 or 7.5
Aspect ratio                   4:3                  4:3

Table 8.1: H.261 CIF and QCIF image formats.
At 30 frames/s, the CIF data rate is 37.3 Mbps and the QCIF data rate is 9.35 Mbps.
With 10 frames/s QCIF images, a 48:1 compression is required to use only one 64 Kbps
channel. CIF images should only be used with six or more 64 Kbps channels. In the current
BGen-Web implementation, the video server sends H.261 QCIF images at 7.5 frames/s,
which provides sufficient video feedback quality over the Internet.
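These figures can be reproduced from the sample counts in Table 8.1 (a quick check in illustrative Python; the transmitted widths of 360 and 180 pixels per line are used, which is what yields the 37.3 Mbps figure):

```python
def raw_rate_bps(y_w, y_h, c_w, c_h, fps, bits=8):
    """Uncompressed data rate: one luminance plane plus two chrominance planes,
    each sample coded with `bits` bits, at `fps` frames per second."""
    samples_per_frame = y_w * y_h + 2 * (c_w * c_h)
    return samples_per_frame * bits * fps

# CIF at 30 frames/s: ~37.3 Mbps of raw video
cif_30 = raw_rate_bps(360, 288, 180, 144, 30)

# QCIF at 10 frames/s, squeezed into a single 64 kbps ISDN channel:
qcif_10 = raw_rate_bps(180, 144, 90, 72, 10)
ratio = qcif_10 / 64_000          # required compression, roughly 48:1
```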
The operation method in BGen-Web is very simple. When the applet (Figure 8.10)
starts up, it connects to both the BGA proxy (using a unicast address) and the video server
(using a multicast address), and starts receiving sensor data and real-time video feedback. The
applet has some widgets to show the current sensor readings (sonar, infrared, virtual sensors,
bumpers, compass, odometry and GPS), the video feedback and the current fuzzy grid map,
and to drive the robot. At start-up the map is empty and it is updated while the robot moves.
The operator in supervisory control mode can choose the desired goal point by clicking on the
map. The robot runs a BG program which tries to reach the current desired goal, and when it
reaches the goal, it stays still. The operator can choose a different goal point at any time. In
this way, the typical operator can explore unknown environments. In addition, for data
acquisition purposes, the operator can also work in human closed-loop control mode using the
steer and speed sliders. This option is only available to a certain set of operators whose
machine identifiers are stored in the BGA proxy. Thus, whenever a steering/driving command
arrives at the proxy, its source address is compared to the allowed list, and if not found, it is
not forwarded to the robot.
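The proxy-side gate for closed-loop commands can be sketched as follows (illustrative Python; the names `ALLOWED_OPERATORS` and `forward_to_robot`, and the addresses, are hypothetical, not part of the BGA API):

```python
# Machines whose operators may use closed-loop (steer/speed) control.
# Addresses here are placeholders, not the real deployment list.
ALLOWED_OPERATORS = {"10.0.0.5", "10.0.0.6"}

def handle_drive_command(source_addr: str, command: dict, forward_to_robot) -> bool:
    """Forward a steering/driving command only if its source address is on
    the allowed list; otherwise drop it silently, as the BGA proxy does."""
    if source_addr not in ALLOWED_OPERATORS:
        return False
    forward_to_robot(command)
    return True
```

Supervisory commands (goal points), by contrast, are forwarded for any connected client, so the access check applies only to the closed-loop path.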
The BG program implements an architecture similar to that described in chapter 6,
which is composed of three agents:
• Reactive behaviours. This agent contains the same set and implementation of behaviours
  used for autonomous robot navigation. These include obstacle avoidance and goal seeking
  behaviours.
• Navigation. This agent builds a fuzzy grid map and computes robot paths using the A*
  path planning algorithm.
• Task supervision. This agent is an FSM which supervises the current goal. If the robot
  reaches the desired goal position, it stops the robot until a new goal position is available.
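The goal-watching logic of the task supervision agent can be sketched as a minimal two-state FSM (illustrative Python; the actual agent is written in BG, and the tolerance value is an assumption):

```python
class TaskSupervisor:
    """Sketch of the goal-supervision FSM: GO while a goal is pending,
    IDLE (robot stopped) once it is reached, until a new goal arrives."""

    def __init__(self, goal_tolerance=0.2):
        self.state = "IDLE"
        self.goal = None
        self.tol = goal_tolerance      # metres; hypothetical value

    def set_goal(self, x, y):
        """A new goal point can arrive at any time, e.g. from a map click."""
        self.goal = (x, y)
        self.state = "GO"

    def step(self, robot_x, robot_y):
        """One supervision cycle given the robot's estimated position."""
        if self.state == "GO":
            dx = self.goal[0] - robot_x
            dy = self.goal[1] - robot_y
            if (dx * dx + dy * dy) ** 0.5 <= self.tol:
                self.state = "IDLE"    # goal reached: stay still
        return self.state
```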
Figure 8.10: The BGen-Web teleoperation applet.
8.3.3 Experiments and results
The BGen-Web system has been tried and tested in different places and configurations.
With this proof-of-concept application the service is not available all the time; it is set up
only for exhibitions or tests, because the hardware is shared with other experiments. A
typical set-up (Figure 8.11), mainly used for exhibitions, is as follows: the Quaky-Ant robot is
used in its standard hardware (see chapter 2) and software (see chapter 6) configurations, with
a modified task supervision FSM. The robot uses a radio modem bidirectional link to send
sensor data and receive commands (usually goal points), and a 2.4 GHz video emitter to send
video feedback. The analogue video is digitised and streamed in a PC running the video
server. The radio modem is connected to a PC, which manages the PPP link and runs the
BGA proxy. Another PC runs the web-based client to monitor sensors, receive video
feedback and send desired goal points.
Figure 8.11: Experimental set-up.
A sequence of images from a teleoperation run is shown below (Figure 8.12). The robot
is commanded to reach the position occupied by the public. The robot tries to get to the
position twice, but cannot occupy the position because there is an obstacle. Finally it is
ordered to reach a position near the exhibition table, where it stays still.
Figure 8.12: Sequence of a Quaky teleoperation run.
8.4 Conclusions
BGen-Web is a proof-of-concept web-robotics exploration application. The BGen-Web
system provides a web-based service for highly interactive supervisory control of a mobile
robot to perform exploration of scarcely known, or completely unknown, environments, and
as such it has been successfully tested in real world environments, both indoor and outdoor.
This system has been built on top of the BGA architecture in order to use all the resources
from the already developed software. Using sensor data and video feedback in real-time, the
operator’s experience is highly interactive when controlling a robot over the Internet (it is only
limited by network bandwidth requirements). This is an important aspect which distinguishes
it from other web robotics applications.
Another important aspect of BGen-Web is the task granularity. Some systems provide
interfaces to control at the lowest level (for instance direct control of the motors), while others
provide control commands at a higher level (for instance positioning the robot in a given
previously known place). BGen-Web provides commands at an intermediate level: the user
chooses the desired position and the robot tries to reach it. Thus, BGen-Web shows
interactivity levels beyond those of KhepOnTheWeb (much more information is provided to
the operator), and WITS and Xavier (they do not feel interactive). Additionally, BGen-Web
operates in real world environments, as do WITS and Xavier.
Chapter 9
Conclusions and Future Work
9.1 Conclusions
This thesis presents a series of mobile robotics related works, which cover the different
layers needed to have a mobile robot perform tasks autonomously in an a priori unknown
environment. Each layer provides a series of capabilities that are used by upper level layers.
These layers are the physical platform, the sensor fusion layer, the navigation layer, the
control architecture and language layer and the application layer. The following subsections
summarise the results and the contributions achieved in each layer.
9.1.1 The Quaky-Ant mobile robot
The Quaky-Ant is a simple but powerful design that features a flexible and modular
architecture. When measuring cost, performance and expandability Quaky-Ant compares
favourably with commercial robots. The main characteristics of this robot can be summarised
as:
• It features an open architecture. Two key points are the use of a standard I2C bus for the
  interconnection of different modules and the definition of a generic open protocol.
• One consequence of the design of the architecture is its robustness (mainly due to the
  custom protocols) and its expandability (all the modules are addressed the same way
  irrespective of their nature).
• The architecture is not tied to any control architecture or development system, although it
  serves as the reference point of the control architecture developed.
It operates smoothly and correctly in indoor environments and is also capable of
operating in non-rough outdoor environments. Either way, it does not exhibit an outstanding
top speed (approximately 0.45 m/s) because this was not a design concern. In contrast, it
exhibits good manoeuvrability and precision.
9.1.2 Sensor fusion
Two different methods have been developed for solving the sensor fusion problem in a
priori unknown environments. In this problem, model based techniques that make use of a
priori information are not applicable, and thus, neural networks and fuzzy rule based systems
are used instead. These fusion techniques require some examples for the learning stage.
Obtaining such examples from real sensors is an extremely complex problem. The approach
followed is to use examples obtained using a simulator. For this reason, models of sonar and
infrared sensors have been evaluated and integrated in a simulation environment, which is
also used for sensor fusion validation. The sonar model is based on the sonar-tracing
technique which has been modified to take into account the sonar lobe effects, and the
infrared model has been obtained by linearising experimental values.
Once the neural network and fuzzy rules fusion methods have been trained, they are
tested on different simulated environments ranging from a small to a large environment. The
fusion methods are then compared with raw sensor data and with a simple and heuristic rule
based system. The results show that in most cases the proposed fusion methods perform better
than the raw sensors and simple rules. In addition, the fuzzy rule system obtained using fuzzy
tabu clustering has the best overall performance of all the different fusion methods tested.
The fuzzy tabu clustering is an algorithm based on the fuzzy c-means algorithm but
with the introduction of a global minimum search by means of a tabu search technique. The
algorithm has been successfully applied also to different examples showing comparable
results with the well-known fuzzy clustering algorithms available in the literature.
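The fuzzy c-means membership update, which the tabu search wraps, can be sketched for a single data point (illustrative Python; only the membership step is shown, not the alternating centre update or the tabu list):

```python
def fcm_memberships(distances, m=2.0):
    """Fuzzy c-means membership degrees of one data point, given its
    distances to each cluster centre (m is the fuzziness exponent)."""
    if any(d == 0.0 for d in distances):
        # a point lying exactly on a centre gets full membership there
        return [1.0 if d == 0.0 else 0.0 for d in distances]
    e = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k)^(2/(m-1)); memberships sum to 1
    return [1.0 / sum((d_i / d_k) ** e for d_k in distances)
            for d_i in distances]
```

Fuzzy c-means alternates this step with a centre update and converges only to a local minimum; the tabu search layer restarts and steers that iteration towards a global one.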
9.1.3 Navigation
Two navigation schemes have been developed to allow the Quaky-Ant robot to navigate
in a priori unknown environments. The first procedure relies on the assumption of reliable
odometry for short time and distance operations. Bearing this assumption in mind, the robot is
provided with a map building method and a path planning algorithm, which are run
simultaneously and thus the robot is required to build the map while it is moving, resulting in
a continuous movement. This approach is based on a fuzzy grid map for map-building and an
A* based algorithm for path-planning. The fuzzy grid map has been modified to accomplish
the required task: the virtual sensors are used instead of sonars, the initialisation procedure
has been changed, and some heuristics have been incorporated to modify the certainty of
visited cells. The A* based planner has been defined with a new evaluation function to
generate paths that leave enough space between the robot and the obstacles, and also to be
efficient in time, so that the node evaluation is fast enough to run the planner at a high
frequency. The navigation scheme has been successfully used in different, both simulated and
real world, scenarios.
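The idea of an A* evaluation function that trades path length against clearance can be sketched as follows (illustrative Python on a binary occupancy grid; the clearance term and its weight are simplifications, not the BGen implementation):

```python
import heapq
import itertools

def plan(grid, start, goal, clearance_weight=3.0):
    """A*-style planner on a 0/1 occupancy grid whose step cost adds a
    penalty for cells adjacent to obstacles, so paths keep some clearance."""
    rows, cols = len(grid), len(grid[0])
    tick = itertools.count()          # tie-breaker for the priority queue

    def near_obstacle(r, c):
        # cheap clearance term: number of occupied 8-neighbours
        return sum(grid[r + dr][c + dc]
                   for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if 0 <= r + dr < rows and 0 <= c + dc < cols)

    def h(cell):
        # Manhattan-distance heuristic towards the goal
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    g_cost = {start: 0.0}
    came_from = {}
    open_set = [(h(start), next(tick), start, None)]
    while open_set:
        _, _, cell, parent = heapq.heappop(open_set)
        if cell in came_from:
            continue                   # already expanded with a better cost
        came_from[cell] = parent
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = cell[0] + dr, cell[1] + dc
            if 0 <= nr < rows and 0 <= nc < cols and not grid[nr][nc]:
                ng = g_cost[cell] + 1.0 + clearance_weight * near_obstacle(nr, nc)
                if ng < g_cost.get((nr, nc), float("inf")):
                    g_cost[(nr, nc)] = ng
                    heapq.heappush(open_set, (ng + h((nr, nc)), next(tick),
                                              (nr, nc), cell))
    return None                        # goal unreachable
```

Because each node evaluation is a constant-time neighbourhood check, the planner stays cheap enough to re-run at a high frequency while the map is being built.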
The second procedure overcomes the limitation of the previous approach by adding a
localisation method. This is accomplished by building fuzzy segment maps to model the
previously visited environment and the current sensed environment. The transformations
needed to match these maps are used to correct the current robot’s position. The performance
of the localisation method is greatly affected by the number and quality of the fuzzy
segments. The segments generation procedure has been modified to make use of a sensor
buffer, which produces more consistent segments. Although the motivation behind the
navigation procedure was to make the overall work of the system more robust, it also proved
to be capable of solving the SLAM (Simultaneous Localisation And Map-building) problem
in the different, both simulated and real world, tests carried out. More tests are still needed to
determine in which situations or environments this set-up is valid for the SLAM
problem.
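The position correction implied by a matched pair of segments can be sketched as follows (illustrative Python; each segment is reduced here to a midpoint and a direction angle, a deliberate simplification of the fuzzy segment representation):

```python
import math

def pose_correction(stored, sensed):
    """Correction derived from one matched segment pair: the rotation that
    aligns the sensed segment's direction with the stored one, plus the
    translation that aligns their midpoints after that rotation.

    stored, sensed: (mid_x, mid_y, direction) triples in the robot frame.
    Returns (d_theta, tx, ty) to apply to the robot's pose estimate."""
    sx, sy, s_theta = stored
    cx, cy, c_theta = sensed
    d_theta = s_theta - c_theta
    # rotate the sensed midpoint by the angular correction...
    rx = cx * math.cos(d_theta) - cy * math.sin(d_theta)
    ry = cx * math.sin(d_theta) + cy * math.cos(d_theta)
    # ...then the remaining offset is the translational correction
    return d_theta, sx - rx, sy - ry
```

In practice several matched pairs are combined, which is why the number and quality of the extracted segments dominate the performance of the localisation method.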
9.1.4 Control architecture and programming language
The BGA architecture relies on and takes advantage of various well-developed
concepts. It is based on a blackboard architecture that coordinates a multiagent system.
Agents can be layered and distributed according to their functionalities, including three-layer
hybrid architecture schemas, where the behaviour coordination is accomplished by a blending
method derived from the context-dependent blending. The main contributions of the BGA
architecture are the following:
• BGA effectively groups different concepts and paradigms of control architectures (namely
  blackboard architectures, hybrid architectures, and multi-agent systems), extending and
  combining their characteristics.
• BGA incorporates a new blending mechanism, derived from context-dependent blending,
  that effectively addresses the behaviour coordination problem. In addition, its design
  allows the use of expert knowledge during the process of learning the fuzzy fusion rule
  base.
• BGA makes use of KQML, which is typical of message-passing systems but not common
  in blackboard systems, implemented over UDP for performance.
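The blending mechanism can be illustrated with a weighted-average sketch (illustrative Python; plain activation functions stand in here for the fuzzy fusion rule base of the actual architecture):

```python
def blend(behaviours, context):
    """Context-dependent blending sketch: each behaviour proposes a command
    and the blender weights the proposals by context-dependent activation
    degrees in [0, 1], returning their normalised weighted average.

    behaviours: list of (activation_function, proposed_command) pairs.
    """
    total_w = 0.0
    total_cmd = 0.0
    for activation, command in behaviours:
        w = activation(context)      # degree produced by the fusion rules
        total_w += w
        total_cmd += w * command
    return total_cmd / total_w if total_w else 0.0
```

For example, an obstacle-avoidance behaviour whose activation grows with obstacle proximity gradually overrides a goal-seeking behaviour, instead of switching abruptly between the two.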
The BG language is a high-level robot programming language that fully supports the
BGA architecture and thus, it allows the user to develop BGA based control systems in a
friendly but powerful way. BG aids in the development cycle of robotics applications by
providing a single uniform framework to declare and specify all the elements of the BGA
architecture (behaviours, tasks, blenders and agents), and by supporting behaviour modelling
via extensive use of fuzzy logic. In addition, a full distributed BGA architecture can be
achieved by running BG programs in different processors.
The BGen development environment incorporates a BG language interpreter, as well as
a simulator for different customised holonomic robots, data visualisation and recording tools,
and a behaviour learning tool. The BGen environment is intended both for development and
deployment, and provides a simple generic way to develop and test BG
CHAPTER 9: CONCLUSIONS AND FUTURE WORK
programs off-robot, and to execute such programs in a real robot. Thus, the development to
deployment time is considerably reduced, and the code can be reused across different platforms.
A series of experiments has shown that BGen, which includes the BG language and the
underlying BGA architecture, provides a good working environment for mobile robotics
related problems. Some conclusions have been obtained:
• BG programs are simple and easy to understand. Without knowing in advance what a BG
program does, it is easy to figure out its purpose, although of course this also depends on
how well written the program is. In any case, although the tasks involved in achieving a
robot’s goal are complex, the BG language hides the details and allows the robot
programmer to concentrate on the application domain. All the specific techniques for
map-building, localisation, path-planning, etc. are available directly from the language,
while new methods can be added in the future. This favours quick development of robot
programs as well as their maintenance.
• The BGen environment is open and portable. While all the development of the BGen
environment was carried out at the University of Murcia (Spain) in parallel with the
development of the Quaky-Ant robot, one of the experiments shows its performance on a
completely different robot. Porting and interfacing the BGen environment to the new
system, and rewriting the BG program to account for the new robot geometry and speed,
took less than two weeks, while the author was at the University of Örebro (Sweden). It is
important to note also that the environment had not been completely tried on a real robot.
These facts support that the BGen environment is flexible and easy to port to other
platforms.
• Java-based robotics environments are practical and usable. It is often argued that Java
programs are extremely slow. While it is true that just-in-time compiled Java programs
are somewhat slower than native ones, it is still possible to use Java in most robotics
applications. Moreover, the BG interpreter runs on top of a Java virtual machine, so BG
programs are interpreted twice, but this is not a problem because most computing-intensive
techniques (map-building, localisation and path-planning) run directly as Java libraries.
The added value of the BGen system is again its portability and reuse of code at two
different levels: at the system level, reusing the whole set of BGen libraries, and at the
application level, reusing BG programs for different robot platforms.
• The BGen environment is expandable. One of the experiments shows that new techniques
or modules can be incorporated into the BGA architecture. In this case the topology-based
planner is not a Java program but a C/C++ program interfaced by way of the Java Native
Interface. In this way new services or legacy applications can be added.
A genetic-based method has been developed to assist in the development of behaviour
fusion fuzzy rule bases. The algorithm is used to obtain or learn a new fusion rule set from an
initial set of rules provided by the user (which may be the empty set). The genetic algorithm
has been adapted to encode individuals which are converted to BG fusion fuzzy rule bases;
these individuals can encode both standard MIMO fuzzy rules and background fuzzy rules.
Moreover, the fitness function has been defined to include the effects of the experimental
evaluation of the robot’s behaviour by using simulation. This function takes into account
HUMBERTO MARTÍNEZ BARBERÁ
different parameters which measure the robot’s performance in both successful and
unsuccessful runs. This point is crucial in the initial part of the learning procedure because
most of the individuals do not reach the goal, and it is very important to discriminate between
them to speed up the learning process.
Different experiments have been carried out to test the influence of the behaviour
weights and the results of the learning process. As has been shown, behaviour weights can aid
the learning process by narrowing the search space through the user’s knowledge of the
domain. Moreover, the use of the learning process improves the performance of the robot with
respect to user-defined fusion rules.
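The role of the fitness function described above can be sketched as follows. The weights and the performance measures here are illustrative assumptions, not the ones used in the thesis; the point is that unsuccessful runs are still ranked by their progress towards the goal, so early generations have a gradient to learn from:

```java
// Sketch of a fitness function for behaviour-fusion learning: it scores an
// individual (a candidate fusion rule base) from one simulated run, and still
// discriminates between individuals that never reach the goal.
// Weights and field names are illustrative, not those of the thesis.
public class FusionFitness {
    /** Outcome of simulating one individual in one scenario. */
    public static class RunResult {
        public final boolean reachedGoal;
        public final double distanceToGoal;  // remaining distance (m)
        public final double pathLength;      // travelled distance (m)
        public final int collisions;
        public RunResult(boolean reachedGoal, double distanceToGoal,
                         double pathLength, int collisions) {
            this.reachedGoal = reachedGoal;
            this.distanceToGoal = distanceToGoal;
            this.pathLength = pathLength;
            this.collisions = collisions;
        }
    }

    /** Higher is better. Failed runs are ranked by progress towards the goal. */
    public static double evaluate(RunResult r) {
        if (r.reachedGoal) {
            // Successful runs: prefer short, collision-free paths.
            return 1000.0 - r.pathLength - 50.0 * r.collisions;
        }
        // Unsuccessful runs: reward progress, so generations where almost no
        // individual reaches the goal can still be discriminated.
        return 100.0 / (1.0 + r.distanceToGoal) - 50.0 * r.collisions;
    }
}
```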
9.1.5 Web robotics application
BGen-Web is a proof-of-concept web-robotics exploration application. The BGen-Web
system provides a web-based service for highly interactive supervisory control of a mobile
robot to perform exploration of scarcely known, or completely unknown, environments, and
as such it has been successfully tested in real world environments, both indoor and outdoor.
This system has been built on top of the BGA architecture in order to use all the resources
from the already developed software. Using sensor data and video feedback in real-time, the
operator’s experience is highly interactive when controlling a robot over the Internet (it is only
limited by network bandwidth). This is an important aspect which distinguishes
it from other web robotics applications.
Another important aspect of BGen-Web is the task granularity. Some systems provide
interfaces to control at the lowest level (for instance direct control of the motors), while others
provide control commands at a higher level (for instance positioning the robot in a given
previously known place). BGen-Web provides commands at an intermediate level: the user
chooses the desired position and the robot tries to reach it. Thus, BGen-Web shows
interactivity levels beyond those of other systems, which provide much less information to the
operator and do not feel interactive. Additionally, BGen-Web operates in real world
environments, as do WITS and Xavier.
9.2 Future work
This thesis presents solutions in different areas to achieve a distributed architecture for
intelligent control in autonomous mobile robots. From this point, new lines of work open up
for the improvement of the different components.
The sensor fusion approaches have shown good performance, especially the hybrid
neuro-fuzzy approach, which from a qualitative point of view produces fair results in the real
world. The experimental setup needed to extend these tests, in order to include real world data
and obtain quantitative results, is very difficult. Moreover, the real world data acquisition is
quite complex and has not yet been performed. This could also improve the results.
The navigation problem has been successfully solved for most situations, but there are
still some points that can be improved. The fuzzy grid map assumes perfect odometry and
cells are updated based on the current localisation. If this map is used in conjunction with the
fuzzy segments map, the quality of the results depends heavily on the update frequency of the
localisation. The uncertainty in the position is not forwarded to the fuzzy map and, clearly,
this could be an improvement. In addition, the fuzzy segments localisation suffers from the
initial localisation problem. If the tracking of the current environment is lost for any reason,
the initial localisation may fail in highly symmetrical environments, like those found in
buildings with corridors. In this case, a simple geometrical matching does not suffice, and it
remains an open problem.
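The suggested improvement of forwarding position uncertainty to the map could, for instance, discount each cell update by the localisation uncertainty at the time of the sensor reading. The discount formula and the names below are illustrative assumptions, not the thesis implementation:

```java
// Sketch of an uncertainty-aware fuzzy grid cell update: the contribution of
// a new occupancy estimate is discounted by the robot's localisation
// uncertainty, so poorly localised readings barely change the map.
public class UncertainGridUpdate {
    /**
     * Blend a new occupancy observation into a cell value (both in [0, 1]).
     * @param positionUncertainty 0 = perfectly localised, 1 = completely lost
     */
    public static double updateCell(double cellValue, double observation,
                                    double positionUncertainty) {
        // Base blending weight of 0.5, scaled down as uncertainty grows.
        double weight = 0.5 * (1.0 - positionUncertainty);
        return (1.0 - weight) * cellValue + weight * observation;
    }
}
```

With full uncertainty the cell is left untouched; with perfect localisation the observation is blended in at the full base weight.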
The BGA architecture is flexible enough to approach different problems and tasks, like
those presented in this thesis. Although this architecture is not restricted to single robots,
some additional structures are needed to use BGA in a multi-robot scenario. A possible
improvement is the inclusion of a message-passing facility for communication between the
different robots, while keeping the current blackboard model for intra-robot communication.
In this context, the Java-based JINI [14] network technology is an interesting framework for
establishing communications between the different agents of the architecture. It provides
simple mechanisms which enable agents to plug together to form a community without any
planning, installation or human intervention. Additionally, the BG language should be
modified to incorporate the new control structures needed for communicating with other
robots.
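Such a message-passing facility could, for instance, carry KQML-style performatives (the format BGA already uses for inter-agent messages) in UDP datagrams sent between robots. The following is a minimal sketch; the class name and the exact wire format are assumptions:

```java
// Sketch of a KQML-style performative carried over UDP between robots.
// KqmlMessage and its fields are illustrative, not the BGA implementation.
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

public class KqmlMessage {
    private final String performative, sender, receiver, content;

    public KqmlMessage(String performative, String sender,
                       String receiver, String content) {
        this.performative = performative;
        this.sender = sender;
        this.receiver = receiver;
        this.content = content;
    }

    /** Render the message in KQML's parenthesised ASCII syntax. */
    public String toWire() {
        return "(" + performative
             + " :sender " + sender
             + " :receiver " + receiver
             + " :content (" + content + "))";
    }

    /** Fire-and-forget delivery over UDP, trading reliability for latency. */
    public void send(String host, int port) throws Exception {
        byte[] data = toWire().getBytes(StandardCharsets.US_ASCII);
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.send(new DatagramPacket(data, data.length,
                    InetAddress.getByName(host), port));
        }
    }
}
```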
The BG language was initially designed to use fuzzy grid maps and A* path-planning.
During the work of this thesis new methods and techniques were added, but they are not
visible from the BG language. A clear improvement would be the use of generic navigation
structures (map-building, path-planning and localisation) which are instantiated at run-time
using a configuration file. In this way, the control program simply sends update commands to
these structures and receives the corresponding feedback through a generic interface. From a
readability point of view, the language could also be extended to support more data types
(booleans, integers and data structures). From a performance point of view, the current BG
interpreter is very flexible and fast, but if BG programs were compiled to Java, an important
speed-up could be achieved at the expense of a more complex program preparation (program
modification on the fly would no longer be supported). The BGen architecture supports the
Quaky-Ant and Nomad 200 robots, and it is currently being updated to support other
commercial robots, like the RWI B21.
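The proposed generic navigation structures could be sketched as a common interface plus a factory that instantiates the technique named in the configuration file. The interface and class names here are hypothetical:

```java
// Sketch of run-time instantiation of navigation techniques from a
// configuration entry. Interface and class names are hypothetical.
import java.util.Map;
import java.util.function.Supplier;

interface PathPlanner {
    double[][] plan(double[] start, double[] goal);
}

class AStarPlanner implements PathPlanner {
    public double[][] plan(double[] start, double[] goal) {
        // A grid-based A* search would go here; return a trivial
        // straight-line path as a placeholder.
        return new double[][] { start, goal };
    }
}

public class NavigationFactory {
    // Registry mapping configuration names to planner constructors.
    private static final Map<String, Supplier<PathPlanner>> REGISTRY =
            Map.of("astar", AStarPlanner::new);

    /** Instantiate the planner named in the configuration file. */
    public static PathPlanner create(String configName) {
        Supplier<PathPlanner> s = REGISTRY.get(configName);
        if (s == null) throw new IllegalArgumentException(configName);
        return s.get();
    }
}
```

The control program only depends on `PathPlanner`, so swapping the technique is a one-line change in the configuration file.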
The genetic-based approach for behaviour fusion has been shown to produce fair
results. Its disadvantage is the time required to obtain a good solution. The problem is
twofold. First, different scenarios are needed to discard random search and to test different
abilities. If there are many scenarios, or their dimensions are large, the algorithm spends too
[14] http://www.sun.com/jini
much time simulating the robot. In this case, the use of a parallelised or distributed version of
the algorithm could noticeably improve the time needed to obtain a solution. Second, more
termination conditions should be incorporated to avoid wasting time on clearly useless
solutions.
The BGen-Web web-robotics application has been shown to be very effective. It
presents a high degree of interactivity, mostly due to the video feedback, and has many
interesting real world applications, like semi-autonomous exploration. One problem with the
current approach is that it uses multicast. Usually, only some Internet Service Providers
(ISPs) allow the routing of multicast packets. This clearly limits the number of potential
users. An improvement could be the addition of an on-demand video server to allow
non-multicast users to receive video feedback, at the expense of a greater network load.
Another problem is security. Currently, access control is performed based on IP addresses
and no encryption is performed. A possible improvement could be the use of a Java
implementation of SSL [15] (Alan et al., 1996) to implement certificate-based authentication
and encryption. Finally, if JINI technology is incorporated into the BGA architecture, the
BGen-Web application could benefit from some added-value services, like automatic
discovery of BGen-Web servers.
The BGen environment could also incorporate new technologies for distributed
simulation. First, the current user interface includes a two-dimensional representation of the
world. The environment could benefit from the use of web-based three-dimensional
visualisation technologies, like VRML [16] (Roehl et al., 1997) or Java3D (Brown and Petersen,
1999). These allow a web browser to display three-dimensional content generated from a
Java application. Most web browsers support VRML, while currently only Windows and
Solaris based browsers support Java3D. However, this should change, since Sun is developing
Java3D on top of OpenGL. Additionally, the use of the IEEE 1278.1 standard, the Distributed
Interactive Simulation protocol (DIS) [17], could allow the use of a more elaborate simulation
environment organised in a distributed framework. In this case, this framework could be a
fundamental point for integrating a non-monolithic multi-robot simulation. Moreover, Java
implementations of the DIS protocol are available.
[15] http://jcewww.iaik.at
[16] http://www.vrml.org
[17] http://standars.ieee.org/catalog/simint.html
Appendix A
Commercial Mobile Robots: a Survey
A.1 IS Robotics
Real World Interface [18] (RWI), a division of IS Robotics, manufactures mobile robots
and develops robot programming environments. In the medium size segment they have the
following models (Table A.1): B14r, Magellan and microATRV. They are built around the
rFLEX™ robot control architecture, which guarantees time-synchronised sensor and encoder
readings and provides flexibility and responsiveness in robot control. The sensing bus is
based on the FARnet™ logical ring architecture, which permits a simple hub-and-node
configuration, uses standardised components, and provides low network latency. The idea
behind the rFLEX™ architecture is to speed up the development and modification of robot
research projects and applications. The rFLEX™ architecture is accessed through the
Mobility development environment.
B14r
  Characteristics: Diameter: 32 cm; Height: 60 cm; Weight: 27.2 Kg; Payload: 20 Kg; Ground clearance: 0.65 cm; CPU: Pentium 233 MHz and rFLEX™
  Sensors: 16 sonar sensors; 16 infrared sensors; 22 tactile sensors. Optional sensors: vision system, compass
  Actuators: 3 wheels; 2 24V DC motors; Speed: 0.9 m/s; Synchronised drive

Magellan
  Characteristics: Diameter: 36.8 cm; Height: 19.8 cm; Weight: 13.2 Kg; Payload: 9.1 Kg; CPU: rFLEX™
  Sensors: 16 sonar sensors; 16 infrared sensors; 16 tactile sensors. Optional sensors: vision system, laser range finder, compass
  Actuators: 2 wheels; 2 24V DC motors; Speed: 2.5 m/s; Differential drive

microATRV
  Characteristics: Length: 42 cm; Height: 32 cm; Weight: 11.5 Kg; Payload: 5 Kg; Ground clearance: 5.1 cm; CPU: rFLEX™
  Sensors: 17 sonar sensors. Optional sensors: vision system, compass, inertial navigation, laser range finder, GPS
  Actuators: 4 wheels; 2 24V DC motors; Speed: 1.0 m/s; Differential drive

Table A.1: IS Robotics medium size robots

[18] http://www.rwii.com
A.2 ActivMedia Robotics
ActivMedia Robotics [19] manufactures mobile robots. In the medium size segment they
have the following models (Table A.2): Pioneer-2 DX and AmigoBot. They implement a
custom sensing bus, which is accessed through the Saphira development environment and
the COLBERT and AmigoEYES programming languages.
Pioneer-2 DX
  Characteristics: Length: 44 cm; Width: 38 cm; Height: 22 cm; Weight: 9 Kg; Payload: 23 Kg; Ground clearance: 6 cm; CPU: Siemens C166
  Sensors: 8 sonar sensors. Optional sensors: 8 sonar sensors, vision system, compass, tactile sensors, laser range finder, GPS
  Actuators: 2 wheels; 2 12V DC motors; Speed: 1.6 m/s; Differential drive

AmigoBot
  Characteristics: Length: 33 cm; Width: 28 cm; Height: 15 cm; Weight: 3.6 Kg; Payload: 1 Kg; Ground clearance: 3 cm; CPU: Hitachi H8S; Price: $1,800
  Sensors: 10 sonar sensors. Optional sensors: vision system
  Actuators: 2 wheels; 2 12V DC motors; Speed: 1.0 m/s; Differential drive

Table A.2: ActivMedia Robotics medium size robots

[19] http://www.activmedia.com
A.3 Nomadic Technologies
Nomadic Technologies [20] manufactures mobile robots and develops robot programming
environments. In the medium size segment they have the following models (Table A.3):
Nomad Scout II and Nomad Super Scout II. They implement a custom sensing bus, which is
accessed through the XRDev development environment.
Scout II
  Characteristics: Diameter: 41 cm; Height: 35 cm; Weight: 23 Kg; Payload: 5 Kg; Ground clearance: 1.5 cm; CPU: MC68332 16 MHz
  Sensors: 16 sonar sensors; 6 tactile sensors. No optional sensors
  Actuators: 2 wheels; 2 12V DC motors; Speed: 1.0 m/s; Differential drive

Super Scout II
  Characteristics: Diameter: 41 cm; Height: 35 cm; Weight: 23 Kg; Payload: 5 Kg; Ground clearance: 1.5 cm; CPU: Pentium 233 MHz
  Sensors: 16 sonar sensors; 6 tactile sensors. No optional sensors
  Actuators: 2 wheels; 2 12V DC motors; Speed: 1.0 m/s; Differential drive

Table A.3: Nomadic Technologies medium size robots
A.4 K-Team
K-Team [21] manufactures small mobile robots. In the medium size segment they have the
following model (Table A.4): Koala. It implements a custom sensing bus, which is accessed
through plug-in modules for Matlab and LabView, or via cross-compiling in C.

[20] http://www.robots.com
[21] http://www.k-team.com
Koala
  Characteristics: Length: 32 cm; Width: 32 cm; Height: 20 cm; Weight: 3 Kg; CPU: MC68331 22 MHz
  Sensors: 16 infrared sensors. Optional sensors: 4 triangulation infrared sensors
  Actuators: 6 wheels; 2 DC motors; Differential drive

Table A.4: K-Team medium size robot
A.5 Robosoft Service Robotics
Robosoft Service Robotics [22] manufactures and resells mobile robots and manipulators.
In the medium size segment they have the following model (Table A.5): Robulab-III. It
implements a custom sensing bus, which is accessed by programming against the ASDP
libraries in C.
Robulab-III
  Characteristics: Diameter: 48 cm; Height: 47 cm; Weight: 50 Kg; Payload: 30 Kg; CPU: MC68040 25 MHz
  Sensors: Tactile sensors. Optional sensors: video system, 6 infrared sensors, 6 sonar sensors
  Actuators: 2 wheels; 2 24V DC motors; Speed: 1.8 m/s; Differential drive

Table A.5: Robosoft medium size robot
A.6 Applied AI Systems
Applied AI Systems [23] manufactures and resells mobile robots and manipulators. In the
medium size segment they have the following models (Table A.6): Gaia-1a and Labo-3. They
implement a custom sensing bus, which is accessed through cross-compiling in C.

[22] http://www.robosoft.fr
[23] http://www.aai.ca
Gaia-1a
  Characteristics: Length: 53 cm; Width: 36 cm; Height: 26 cm; Weight: 26 Kg; Payload: 10 Kg; CPU: MC68332 16 MHz; Price: $13,000
  Sensors: 7 sonar sensors; 4 infrared sensors. Optional sensors: video system
  Actuators: 4 wheels; 2 12V DC motors; Speed: 0.8 m/s; Differential drive

Labo-3
  Characteristics: Length: 46 cm; Width: 34 cm; Height: 38 cm; Weight: 25 Kg; Payload: 30 Kg; CPU: MC68332 16 MHz; Price: $7,500
  Sensors: 7 sonar sensors; 4 infrared sensors. Optional sensors: video system
  Actuators: 2 wheels; 2 12V DC motors; Speed: 1.8 m/s; Differential drive

Table A.6: Applied AI Systems medium size robots
A.7 Probotics
Probotics [24] manufactures low-cost mobile robots. In the medium size segment they have
the following model (Table A.7): Cye-sr. This robot is extremely low-cost, and uses only
current sensors to detect whether any of its wheels is blocked (that is, whether the robot has
collided). It has a digital I/O port available, and it is programmed using both Java libraries
and the Map-N-Zap software.
Cye-sr
  Characteristics: Length: 25 cm; Width: 40 cm; Height: 12.5 cm; Weight: 4 Kg; Payload: 4.5 Kg; CPU: 16 bit, 16 MHz; Price: $695
  Sensors: 2 current sensors. No optional sensors
  Actuators: 2 wheels; 2 12V DC motors; Speed: 1.0 m/s; Differential drive

Table A.7: Probotics medium size robot

[24] http://www.personalrobotics.com
Appendix B
Obtaining the BGen Software
The BGen environment, some papers and other documents are available on-line for
download at the following address:

http://ants.dif.um.es/~humberto/robots

Additionally, the author can be contacted at the following address:

Humberto Martínez Barberá
Facultad de Informática
Universidad de Murcia
30100 Murcia (Spain)

Phone: +34-968.364666
Fax: +34-968.364151
E-Mail: [email protected]
References
Adams, D. (1979). The Hitchhiker’s Guide to the Galaxy, Pan Books, London
Agre, P.E. & Chapman, D. (1987). Pengi: an Implementation of a Theory of Activity, Proc. 6th
National Conference on AI, Morgan Kaufman, pp 268-272
Aguilar, J.M. & Contreras, J.L. (1994). NAVITE: a Neural Network System for Sensory-Based
Robot Navigation, World Congress on Neural Networks, New Jersey, USA, pp 177-182
Aguirre, E.; García Alegre, M.C. & González, A. (1998). A Fuzzy Safe Follow Wall
Behaviour Fusing Simpler Fuzzy Behaviours, 3rd IFAC Symposium on Intelligent
Autonomous Vehicles, Madrid, Spain, pp 607-612
Akatsuka, Y.; Goldberg, S.B. & Bekey, G.A. (1998). Digimuse: an Interactive Telerobotic
System for Viewing of Three-Dimensional Art Objects, IEEE Intl. Conf. on Intelligent
Robots and Systems, Victoria, Canada
Al-Sultan, K.S. (1995). A Tabu Search Approach to the Clustering Problem, Pattern
Recognition, 28(2):1443-1451
Alan, A.O.; Freier, P. & Kocher, P.C. (1996). The SSL Protocol Version 3.0, IETF Internet
Draft
Albus, J.S.; Quintero, R. & Lumia, R. (1994). An Overview of NASREM: the NASA-NBS
Standard Reference Model for Telerobot Control System Architecture, NIST Technical
Report 5412, Gaithersburg
Andrews, J.R. & Hogan, N. (1983). Impedance Control as a Framework for Implementing
Obstacle Avoidance in a Manipulator, in Control of Manufacturing Processes and Robotic
Systems (D.E. Hardt and W. Book eds.), ASME, Boston, USA
Araújo, R. & Almeida, A. (1999). Learning Sensor-Based Navigation of a Real Mobile Robot
in Unknown Worlds, IEEE Trans. on Systems, Man and Cybernetics, 29(2):164-178
Arkin, R. (1989). Motor-Schema-Based Mobile Robot Navigation, Intl Jour. Robot Res.,
8(4):92-112
Arkin, R. (1990). Integrating Behavioural, Perceptual and World Knowledge in Reactive
Navigation, Robotics and Autonomous Systems, 6(1):105-122
Armstrong, D.G. & Crane, C.D. (1998). A Modular, Scalable, Architecture for Intelligent,
Autonomous Vehicles, 3rd IFAC Symposium on Intelligent Autonomous Vehicles,
Universidad Carlos III, Madrid, Spain, pp 62-66
Arrúe, B.C.; Cuesta, F.; Braunstingl, B. & Ollero, A. (1997). Fuzzy Behaviour Combination
to Control a Non Holonomic Mobile Robot using Virtual Perception Memory, 6th IEEE
Intl. Conf. on Fuzzy Systems, Barcelona, Spain, pp 1239-1244
Austin, J.L. (1962). How to Do Things with Words, Harvard University Press, Cambridge
Backes, P. & Norris, J.S. (2000). Mars Rover Mission Distributed Operations, TMOD
Technology and Science Programs News, June
Backes, P.; Tso, K.S.; Norris, J.S.; Tharp, G.K.; Slostad, J.T. & Bonitz, R.G. (2000a). Mars
Polar Lander Mission Distributed Operations, IEEE Aerospace Conf., Big Sky, USA
Backes, P.; Tso, K.S.; Norris, J.S.; Tharp, G.K.; Slostad, J.T.; Bonitz, R.G. & Ali, K.S.
(2000b). Internet-Based Operations for the Mars Polar Lander Mission, IEEE Intl. Conf.
on Robotics and Automation, San Francisco, USA
Balch, T. (1998). Behavioral Diversity in Learning Robot Teams, Ph.D. Thesis, Georgia
Institute of Technology, USA
Balch, T. & Arkin, R.C. (1994). Communication in Reactive Multiagent Robotic Systems,
Autonomous Robots, 1(1)
Balch, T. & Arkin, R.C. (1995). Motor Schema -Based Formation Control for Multiagent
Robot Teams, 1st Intl. Conf. on Multi-Agent Systems (ICMAS-95), Menlo Park, USA, pp
10-16
Barman, R.A.; Kingdon, S.J.; Little, J.J.; Mackworth, A.K.; Pai, D.K.; Sahota, M.; Wilkinson,
H. & Zhang, Y. (1993). DYNAMO: Real-Time Experiments with Multiple Mobile Robots,
Intelligent Vehicles Symposium, Tokyo, Japan, pp 261-266
Barshan, B. & Kuc, R. (1990). Differentiating sonar reflections from corners and planes by
employing an intelligent sensor, IEEE Trans. on Pattern Analysis and Machine
Intelligence, 12(6):560-569
Bayes, T. (1763). An Essay Towards Solving a Problem in the Doctrine of Chances,
Philosophical Transactions of the Royal Society of London, 53(1):370-418
Benreguieg, M.; Maaref, H. & Barret, C. (1998). Navigation of an Autonomous Mobile Robot
by Coordination of Behaviours, 3rd IFAC Symposium on Intelligent Autonomous Vehicles,
Madrid, Spain, pp 589-594
Beom, H.R. & Cho, H.S. (1995). A Sensor-Based Navigation for a Mobile Robot using Fuzzy
Logic and Reinforcement Learning, IEEE Trans. Systems Man and Cybernetics,
3(25):464-477
Berenji, H.; Chen, Y.Y.; Lee, C.C.; Jang, C.S. & Murugesan, S. (1990). A Hierarchical
Approach to Designing Approximate Reasoning-Based Controllers for Dynamic Physical
Systems, 6th Conf. on Uncertainty in Artificial Intelligence, Cambridge, USA
Berlanga, A.; Borrajo, D.; Fernández, F.; García-Martínez, R.; Molina, J.M. & Sanchís, A.
(1999). Robótica Cognoscitiva y Aprendizaje Automático, VIII Conf. de la Asociación
Española para la Inteligencia Artificial, Murcia, Spain, pp 1-8
Bezdek, J.C. (1981). Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum,
New York
Bezdek, J.C.; Tsao, E.C.-K. & Pal, N.R. (1992). Fuzzy Kohonen Clustering Networks, IEEE
Intl. Conf. on Fuzzy Systems, pp 1035-1043
Biethahn, J. & Nissen, V. (1995). Evolutionary Algorithms in Management Applications,
Springer-Verlag, Berlin
Biris, E. & Shen, Q. (1997). Navigating Mobile Robot with a Fuzzy Logic Controller, 7th
IFSA World Congress, Prague, Czech Republic, pp 345-350
Bonarini, A. & Basso, F. (1997). Learning to Compose Fuzzy Behaviours for Autonomous
Agents, Intl. Journal of Approximate Reasoning, 17(1):409-432
Bonasso, R.P. (1991). Integrating Reaction Plans and Layered Competences through
Synchronous Control, 12th Intl. Joint Conf. On Artificial Intelligence, San Francisco, USA,
pp 1225-1231
Bonissone, P. (1997). Soft Computing: the Convergence of Emerging Reasoning
Technologies, Soft Computing, 1(1):6-18
Borenstein, J.; Everett, H.R. & Feng, L. (1996). Navigating Mobile Robots: Systems and
Techniques, A.K. Peters, Wellesley, MA
Borenstein, J. & Koren, Y. (1988). Obstacle Avoidance with Ultrasonic Sensors, IEEE J.
Robotics and Automation, 4(1):213-218
Borenstein, J. & Koren, Y. (1989). Real-Time Obstacle Avoidance for Fast Mobile Robots,
IEEE Trans. on Systems, Man and Cybernetics, 19(5):1179-1187
Borenstein, J. & Koren, Y. (1991). The Vector Field Histogram: Fast Obstacle Avoidance for
Mobile Robots, IEEE J. on Robotics and Automation, 17(3):278-288
Borenstein, J. & Raschke, U. (1992). Real-Time Obstacle Avoidance for Non-Point Mobile
Robots, SME Trans. on Robotics Research, 2(1):1-10
Bradley, A.M. & Yoerger, D.R. (1993). Design and Testing of the Autonomous Benthic
Explorer, 20th Ann. Technical Symposium and Exhibition (AUVS-93), Washington, USA
Brenner, W.; Zarnekow, R. & Wittig, H. (1998). Intelligent Software Agents, Springer-Verlag,
Berlin, Germany
Briot, M.; Talou, J.C. & Bauzil, G. (1979). Le Système de Perception du Robot HILARE, 2ème
Congrès AFCET/IRIA, Toulouse, France
Brooks, F.P.; Ouh-Uoung, M.; Batter, J.J. & Kilpatrick, P.J. (1990). Project GROPE: Haptic
Displays for Scientific Visualisation, ACM Computer Graphics, 24(4):177-185
Brooks, R.A. (1986). A Robust Layered Control System for a Mobile Robot, IEEE J. Robotics
and Automation, 1(2):14-23
Brooks, R.A. (1989). The Behaviour Language User’s Guide. Artificial Intelligence
Laboratory Memo 1227, MIT, Cambridge, USA
Brooks, R.A. & Connel, J.H. (1986). Asynchronous Distributed Control System for a Mobile
Robot, in Proceedings of the SPIE, 727(1):77-84
Brown, K. & Petersen, D. (1999). Ready to run Java 3D, Wiley, New York, USA
Buchberger, M.; Jorg, K.W. & Puttkamer (1993). Laser Radar and Sonar Based World
Modelling and Motion Control for Fast Obstacle Avoidance of the Autonomous Mobile
Robot MOBOT-IV, IEEE Intl. Conf. on Robotics and Automation, pp 534-539
Cabri, G.; Leonardi, L. & Zambonelli, F. (2000). Mobile-Agent Coordination Models for
Internet Applications, IEEE Computer, 33(2):82-89
Cardelli, L. & Gordon, D. (1998). Mobile Ambients, in Foundations of Software Science and
Computational Structures, Springer-Verlag, Stuttgart, Germany
Chang, C.C. & Song, K.T. (1996). Ultrasonic Sensor Data Integration and its Application to
Environment Perception, Journal of Robotic Systems, 13(10):663-677
Chauhan, D. (1997). JAFMAS: a Java-Based Agent Framework for Multiagent Systems
Development and Implementation, PhD. Thesis, ECECS Dept., University of Cincinnati,
USA
Chauhan, D. & Baker, A. (1998). JAFMAS: a Multiagent Application Development System, in
2nd ACM Conf. on Autonomous Agents (M. Woolridge and T. Finin eds.) ACM Press
Cheah, J.Y.C. (1999). Practical Wireless Data Modem Design, Artech House Publishers,
Boston, USA
Chien, S.Y.P.; Xue, L.Q. & Palakal, M. (1997). Task Planning for a Mobile Robot in an
Indoor Environment using Object-Oriented Domain Information, IEEE Trans. on Systems,
Man and Cybernetics, 27(6):1007-1016
Comer, D.E. & Stevens, D.L. (1993). Internetworking with TCP/IP Volume III: Client-Server
Programming and Applications BSD Sockets Version, Prentice Hall, New Jersey, USA
Congdon, C.; Huber, M.; Kortenkamp, D.; Konolige, K.; Myers, K.; Ruspini, E. & Saffiotti,
A. (1993). CARMEL vs Flakey: a Comparison of Two Winners, AI Magazine, 14(1):49-57
Connel, J. (1992): SSS: a Hybrid Architecture Applied to Robot Navigation, IEEE Int. Conf.
on Robotics and Automation, Los Alamitos, USA, pp 2719-2724
Cost, R.S. (1998). Jackal: a Java-Based Tool for Agent Development, in Working Papers of
the AAAI-98 Workshop on Software Tools for Developing Agents, AAAI Press
Courtney, J.D. & Jain, A.K. (1994). Neural network learning of variable grid-based maps for
the autonomous navigation of robots, IEEE Intl. Conf. on Robotics and Automation, pp
40-45
Crevier, D. (1993). AI: The Tumultuous History of the Search for Artificial Intelligence, Basic
Books, Harper Collins, New York, USA
Cybenko, G. (1988). Continuous Valued Neural Networks with Two Hidden Layers Are
Sufficient, Technical Report, Dept. of Computer Science, Tufts University, USA
Cybenko, G. (1989). Approximation by Superpositions of a Sigmoidal Function, Mathematics
of Control, Signals, and Systems, 2(1):303-314
Dain, R. (1998). Developing Mobile Robot Wall-Following Algorithms using Genetic
Programming, Applied Intelligence, 8(1):33-41
Decker, K.S. (1987). Distributed Problem Solving: a Survey, IEEE Trans. on Systems, Man
and Cybernetics, 17(5):729-740
Delgado, M.; Gómez Skarmeta, A.F. & Martínez Barberá, H. (1997). A Tabu Search
Approach to the Fuzzy Clustering Problem, Sixth IEEE Intl. Conf. on Fuzzy Systems
(FuzzIEEE'97), Barcelona, Spain, pp 125-129
Delgado, M.; Gómez Skarmeta, A.F.; Martínez Barberá, H. & Gómez, J. (1998). Fuzzy Range
Sensor Filtering for Reactive Autonomous Robots, Fifth Intl. Conference on Soft
Computing and Information / Intelligent Systems (IIZUKA'98), Fukuoka, Japan, pp 688-694
Dempster, A.P. (1967). Upper and Lower Probabilities Induced by a Multivalued Mapping,
Annals of Mathematics and Statistics, 38(1):325-339
Denna, M.; Mauri, G. & Zanaboni, A.M. (1999). Learning Fuzzy Rules with Tabu Search –
An Application to Control, IEEE Trans. on Fuzzy Systems, 7(2):295-318
Dorigo, M. & Colombetti, M. (1998). Robot Shaping: an Experiment in Behaviour
Engineering, The MIT Press, Cambridge, USA
Driankov, D.; Hellendoorn, H. & Reinfrank, M. (1993). An Introduction to Fuzzy Control,
Springer-Verlag, Berlin, Germany
Duckett, T.; Marsland, S. & Shapiro, J. (2000). Learning Globally Consistent Maps by
Relaxation, IEEE Intl. Conf. on Robotics and Automation, California, USA
Duckett, T. & Nehmzow, U. (1998). Mobile Robot Self-Localisation and Measurement of
Performance in Middle-Scale Environments, Journal of Robotics and Autonomous
Systems, 24(2):57-69
Duda, R. & Hart, P. (1973). Pattern Classification and Scene Analysis, Wiley, New York,
USA
Dudek, G. & Jenkin, M. (2000). Computational Principles of Mobile Robotics, Cambridge
University Press, Cambridge, UK
Ferrell, W.R. (1965). Remote Manipulation with Transmission Delays, IEEE Trans. on
Human Factors in Electronics, 6(1):24-32
FIPA (1997). FIPA ACL Version 1.0 Specifications: Agent Management and Agent
Communication Language, in http://www.fipa.org
Firby, R.J. (1989). Adaptive Execution in Complex Dynamic Worlds, Technical Report
YALEU/CSD/RR 672, Dept. Computer Science, Yale University, USA
Firby, R.J. (1994). Task Networks for Controlling Continuous Processes, 2nd Intl. Conf. On
AI Planning Systems, Menlo Park, USA, pp 49-54
Flynn, A.M. (1988). Combining Sonar and Infrared Sensors for Mobile Robot Navigation,
Intl. Journal of Robotics Research, pp 5-14
Fukuda, T. & Kubota, N. (1999). An Intelligent Robotic System Based on a Fuzzy Approach,
Proceedings of the IEEE, 87(9):1448-1470
Gallardo, D. (1999). Aplicación del Muestreo Bayesiano en Robots Móviles: Estrategias para
la Localización y Estimación de Mapas del Entorno, PhD Thesis, Dept. of Computer
Technology and Computer Science, University of Alicante, Spain
García Alegre, M.C.; Gasós, J. & García Rosa, R. (1991). Fuzzy Control of Autonomous
Mobile Robots, Congreso Español sobre Tecnologías y Lógica Fuzzy (ESTYLF’91),
Granada, Spain, pp 71-77
García Alegre, M.C.; Ribeiro, A.; Gasós, J. & Salido, J. (1993). Optimization of Fuzzy
Behaviour-Based Robots Navigation in Partially Known Industrial Environments, 3rd IEEE
Intl. Conf. on Industrial Fuzzy Control and Intelligent Systems, Texas, USA
García Cerezo, A.; Mandow, A. & López, M.J. (1997). Fuzzy Modelling Operator Navigation
Behaviours, 6th IEEE Intl. Conf. on Fuzzy Systems, Barcelona, Spain, pp 1339-1345
García Rosa, R. & García Alegre, M.C. (1990). Fuzzy Logic Strategies to Control an
Autonomous Mobile Robot, Intl. Journal on Cybernetics and Systems, 21(1):267-276
Gasós, J. & Martín, A. (1996a). Mobile Robot Localization using Fuzzy Maps, in Lecture
Notes in Artificial Intelligence (A. Ralescu and T. Martín eds.), Springer-Verlag, Germany
Gasós, J. & Martín, A. (1996b). A Fuzzy Approach to Build Sonar Maps for Mobile Robots,
Computers in Industry, 32(1):151-167
Gasós, J. (2000). Integrating Linguistic Descriptions and Sensor Observations for the
Navigation of Autonomous Robots, in Fuzzy Logic Techniques for Autonomous vehicle
Navigation (D. Driankov and A. Saffiotti eds.), Springer-Verlag, Germany
Gat, E. (1991). Reliable Goal-Directed Reactive Control for Real-World Autonomous Mobile
Robots, PhD Thesis, Dept. Computer Science, Virginia Polytechnic Institute and State
University, USA
Gat, E. (1997). ESL: a Language for Supporting Robust Plan Execution in Embedded
Autonomous Agents, IEEE Aerospace Conference, Los Alamitos, USA
Gat, E. (1998). Three Layer Architectures, in Artificial Intelligence and Mobile Robots (D.
Kortenkamp, P. Bonasso and R. Murphy eds.), The MIT Press
Georgeff, M.P. & Lansky, A.L. (1987). Reactive Reasoning and Planning, 6th National Conf.
on Artificial Intelligence, Menlo Park, USA, pp 677-682
Glasius, R.; Komoda, A. & Gielen, S. (1995). Neural Networks Dynamics for Path Planning
and Obstacle Avoidance, Neural Networks, 8(1):125-133
Goldberg, D.E. (1989). Genetic Algorithms in Search, Optimization, and Machine Learning,
Addison-Wesley
Goldberg, K.; Gentner, S.; Sutter, C. & Wiegley, J. (2000). The Mercury Project: a
Feasibility Study for Internet Robots, IEEE Robotics and Automation Magazine, 7(1):35-40
HUMBERTO MARTÍNEZ BARBERÁ
Gómez Skarmeta, A.F. & Jiménez, F. (1997). Generating and Tuning Fuzzy Rules with
Hybrid Systems, Sixth IEEE Intl. Conference on Fuzzy Systems (FuzzIEEE’97),
Barcelona, Spain, pp 247-252
Gómez Skarmeta, A.F. & Jiménez, F. (1999). Fuzzy Modelling with Hybrid Systems, Fuzzy
Sets and Systems, 104(1):199-208
Gómez Skarmeta, A.F. & Martínez Barberá, H. (1997). Fuzzy Logic Based Intelligent Agents
for Reactive Navigation in Autonomous Systems, Fifth International Conference on Fuzzy
Theory and Technology, Raleigh, USA, pp 125-131
Gómez Skarmeta, A.F.; Martínez Barberá, H. & Botía, J. (2000). Fuzzy Modelling in a Multi-Agent Framework for Learning in Autonomous Systems, in Fuzzy Learning and
Applications (M. Russo and L.C. Jain eds.), CRC Press
Gómez Skarmeta, A.F.; Martínez Barberá, H. & Sánchez, M. (1999a). A Fuzzy Agents
Architecture for Autonomous Mobile Robots, International Fuzzy Systems World Congress
(IFSA'99), Taiwan, pp 713-717
Gómez Skarmeta, A.F.; Martínez Barberá, H. & Sánchez, M. (1999b). A fuzzy logic based
language to model autonomous mobile robots, 8th IEEE Intl. Conference on Fuzzy
Systems (FuzzIEEE’99), Seoul, Korea, pp 550-555
Gómez Skarmeta, A.F.; Martínez Barberá, H. & Sánchez, M. (1999c). Learning behaviour
fusion in autonomous mobile robots, First EUSFLAT/ESTYLF Joint Conference
(ESTYLF’99), Mallorca, Spain, pp 91-94
González Baños, H.; Gordillo, J.L.; Lin, D.; Latombe, J.C.; Sarmiento, A. & Tomasi, C.
(1999). The Autonomous Observer: a Tool for Remote Experimentation in Robotics, SPIE
Intl. Symposium on Voice, Video and Data Communications, Boston, USA
Goodridge, S.G.; Kay, M.G. & Luo, R.C. (1997). Multi-Layered Fuzzy Behaviour Fusion for
Reactive Control of an Autonomous Mobile Robot, IEEE Int. Conf. on Fuzzy Systems,
Barcelona, Spain, pp 579-584
Goodridge, S.G. & Luo, R.C. (1994). Fuzzy Behaviour Fusion for Reactive Control of an
Autonomous Mobile Robot: MARGE, IEEE Intl. Conf. on Robotics and Automation, San
Diego, USA, pp 1622-1627
Grand, S. & Cliff, D. (1998). Creatures: Entertainment Software Agents with Artificial Life,
Autonomous Agents and Multi-Agent Systems, 1(1):39-58
Harmon, S.Y. (1987). The Ground Surveillance Robot (GSR): an Autonomous Vehicle
Designed to Transit Unknown Terrain, IEEE Trans. on Robotics and Automation, 3(3)
Hart, P.E.; Nilsson, N.J. & Raphael, B. (1968). A Formal Basis for the Heuristic
Determination of Minimum Cost Paths, IEEE Trans. on Man, Systems, and Cybernetics,
4(2):100-107
Hayes-Roth, B.A. (1985). A Blackboard Architecture for Control, Artificial Intelligence, 26(1):251-324
Hendler, J. (1999). Making Sense out of Agents, IEEE Intelligent Systems, 14(2):32-37
Herrera, F.; Lozano, M. & Verdegay, J.L. (1995). Tuning Fuzzy Logic Controllers by Genetic
Algorithms, Intl. J. of Approximate Reasoning, 12(1):299-315
Hexmoor, H. & Meeden, L. (1997). Learning in Autonomous Robots: A Summary of the 1996
Robolearn Workshop, Knowledge Engineering Review, 11(4)
Hirukawa, H. & Hara, I. (2000). Web-Top Robotics, IEEE Robotics and Automation
Magazine, 7(2):40-45
Hoffmann, F. & Pfister, G. (1996). Evolutionary Learning of Fuzzy Control Rule Base for an
Autonomous Vehicle, Proc. Sixth Int. Conf. Information Processing and Management of
Uncertainty in Knowledge-Based Systems (IPMU'96), Granada, Spain, pp 1235-1240
Holland, J.H. (1975). Adaptation in Natural and Artificial Systems, MIT Press, Cambridge,
USA
Horswill, I.D. (2000). Functional Programming of Behaviour-Based Systems, Autonomous
Robots, 9(1):83-93
Huberman, B. & Clearwater, S.H. (1995). A Multi-Agent System for Controlling Building
Environments, 1st Intl. Conf. on Multi-Agent Systems, Menlo Park, USA, pp 171-176
Ishibuchi, H.; Nozaki, K.; Yamamoto, N. & Tanaka, H. (1995). Selecting Fuzzy If-Then Rules
for Classification Problems Using Genetic Algorithms, IEEE Trans. on Fuzzy Systems,
3(3):260-270
Ishigami, H.; Fukuda, T.; Shibata, T. & Arai, F. (1995). Structure Optimization of Fuzzy
Neural Network by Genetic Algorithms, Fuzzy Sets and Systems, 71(1):257-264
Jang, J.S. (1993). ANFIS: Adaptive-network-based fuzzy inference systems, IEEE Trans. on
Systems, Man, and Cybernetics, 23(3):665-685
Jang, J.S.; Sun, C.T. & Mizutani, E. (1997). Neuro-Fuzzy and Soft Computing, Prentice Hall,
New Jersey, USA
Johnson, S.C. (1975). YACC: Yet Another Compiler Compiler, CS Technical Report#32, Bell
Telephone Laboratories, Murray Hill, USA
Joo, Y.H.; Hwang, H.S.; Kim, K.B. & Woo, K.B. (1997). Fuzzy System Modelling by Fuzzy
Partition and GA Hybrid Schemes, Fuzzy Sets and Systems, 86(1):279-288
Kaelbling, L.P. (1987). REX: a Symbolic Language for the Design and Parallel
Implementation of Embedded Systems, AIAA Conf. on Computers in Aerospace,
Wakefield, USA
Kam, M.; Zhu, X. & Kalata, P. (1997). Sensor Fusion for Mobile Robot Navigation,
Proceedings of the IEEE, 85(1):108-119
Karnik, N.M. & Tripathi, A.R. (1998). Design Issues in Mobile-Agent Programming Systems,
IEEE Concurrency, July/Sept, pp 52-61
Khatib, O. (1985). Real-Time Obstacle Avoidance for Manipulators and Mobile Robots, IEEE
Intl. Conf. on Robotics and Automation, St. Louis, USA, pp 500-505
Khatib, O. (1986). Real-Time Obstacle Avoidance for Manipulators and Mobile Robots, Intl.
Journal of Robotics Research, 5(1):90-98
Kim, W.S. & Stark, L.W. (1989). Cooperative Control of Visual Displays for
Telemanipulation, IEEE Intl. Conf. on Robotics and Automation, pp 1327-1332
Kirn, S. (1996). Kooperativ-Intelligente Softwareagenten, Information Management, 11(1)
Ko, N.Y. & Simmons, R.G. (1998). The Lane-Curvature Method for Local Obstacle
Avoidance, IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems, Victoria, Canada
Koenig, S. & Simmons, R.G. (1998). Xavier: A Robot Navigation Architecture Based on
Partially Observable Markov Decision Process Models, in Artificial Intelligence Based
Mobile Robotics (D. Kortenkamp, R. Bonasso and R. Murphy eds.), MIT Press, USA
Konolige, K. (1997). COLBERT: a Language for Reactive Control in Saphira, German
Conference on Artificial Intelligence, Freiburg, Germany
Konolige, K.; Meyers, K.; Saffiotti, A. & Ruspini, E. (1997). The Saphira Architecture: A
Design for Autonomy, Journal of Experimental and Theoretical Artificial Intelligence,
9(1):215-235
Koren, Y. & Borenstein, J. (1991). Potential Field Methods and Their Inherent Limitations
for Mobile Robot Navigation, IEEE Intl. Conf. on Robotics and Automation, Sacramento,
USA, pp 1398-1404
Kosko, B. (1992). Neural Networks and Fuzzy Systems, Prentice Hall, New Jersey, USA
Krogh, B.H. (1984). A Generalised Potential Field Approach to Obstacle Avoidance Control,
IEEE Intl. Conf. on Robotics Research, Bethlehem, USA
Krogh, B.H. & Thorpe, C.E. (1986). Integrated Path Planning and Dynamic Steering Control
for Autonomous Vehicles, IEEE Intl. Conf. on Robotics and Automation, San Francisco,
USA, pp 1664-1669
Kröse, B. & van Dam, J. (1997). Neural Vehicles, in Neural Systems for Robotics (O.
Omidvar and P. van der Smagt eds.), Academic Press, San Diego
Kurz, A. (1996). Constructing maps for mobile robot navigation based on ultrasonic range
data, IEEE Trans. on Systems, Man, and Cybernetics, 26(2):233-242
Kwon, Y.D.; Won, J.M. & Lee, J.S. (1998). Control of a Mobile Robot by using Evolutionary
Fuzzy Controller, IEEE Intl. Conf. ICEC, pp 422-426
Labrou, Y. (1997). Semantics for an Agent Communication Language, PhD. Thesis, Dept. of
Computer Science and Electronic Engineering, University of Maryland, USA
Labrou, Y. & Finin, T. (1994). A Semantics Approach for KQML – a General Purpose
Communication Language for Software Agents, 3rd Intl. Conf. on Information and
Knowledge Management
Labrou, Y. & Finin, T. (1997). A Proposal for a New KQML Specification, Technical Report
TR-CS-97-03, Computer Science and Electrical Engineering Dept., University of
Maryland, USA
Labrou, Y.; Finin, T. & Peng, Y. (1999). Agent Communication Languages: the Current
Landscape, IEEE Intelligent Systems, 14(2):45-52
Labrou, Y. & Weber, J. (1993). Specification of the KQML Agent Communication Language,
DARPA Knowledge Sharing Initiative, External Interfaces Working Group
Lee, D. (1996). The Map-Building and Exploration Strategies of a Simple Sonar-Equipped
Robot, Cambridge University Press, Cambridge
Lekei, D. (1997). Using a PIC16C5X as a Smart I2C Peripheral, AN541, Microchip
Technology, Chandler, USA
Leonard, J. & Durrant-Whyte, H. (1992). Directed Sonar Sensing for Mobile Robot
Navigation, Kluwer, Dordrecht, The Netherlands
Levesque, H.; Reiter, R.; Lespérance, Y.; Lin, F. & Scherl, R. (1997). GOLOG: A Logic
Programming Language for Dynamic Domains, Journal of Logic Programming, 31(1):59-84
Levy, R. & Rosenschein, J.S. (1992). A Game Theoretic Approach to the Pursuit Problem,
11th Intl. Workshop on Distributed Artificial Intelligence, pp 195-213
Li, W.; Ma, C. & Wahl, F.M. (1997). A Neuro-Fuzzy System Architecture for Behaviour-Based
Control of a Mobile Robot in Unknown Environments, Fuzzy Sets and Systems, 87(1):133-140
Lim, J.H. & Cho, D.W. (1992). Physically Based Sensor Modelling for a Sonar Map in a
Specular Environment, IEEE Intl. Conf. on Robotics and Automation, pp 1714-1719
Liu, J. & Xie, W. (1995). A Genetics-Based Approach to Fuzzy Clustering, IEEE Intl. Conf.
on Fuzzy Systems
López Sánchez, M.; López de Mántaras, R. & Sierra, C. (1998). Possibility Theory-Based
Environment Modelling by Means of Behaviour Based Autonomous Robots, European
Conference on Artificial Intelligence, pp 590-594
Lund, H.H.; Miglino, O.; Pagliarini, L.; Billard, A. & Ijspeert, A. (1998). Evolutionary
Robotics – A Children’s Game, IEEE Intl. Conf. ICEC, pp 154-158
Luo, R.C. & Kay, M.G. (1989). Multisensor Integration and Fusion in Intelligent Systems,
IEEE Trans. on Systems, Man and Cybernetics, 19(5):901-931
Machida, K.; Toda, Y. & Iwata, T. (1990). Graphic-Simulator-Augmented Teleoperation
System for Space Applications, J. of Spacecraft and Rockets, 27(1):64-69
MacKenzie, D. & Arkin, R. (1997). Evaluating the Usability of Robot Programming Toolsets,
Technical Report, Georgia Institute of Technology, USA
Maravall, D.; de Lope, J. & Serradilla, F. (2000). Combination of Model-Based and Reactive
Methods in Autonomous Navigation, IEEE Intl. Conf. on Robotics and Automation, San
Francisco, USA, pp 2328-2333
Martin, F.G. (1994). Circuit to Control: Learning Engineering by Designing LEGO Robots,
Ph.D. Thesis, MIT, USA
Martin, F.G. (1996). The Interactive C Manual for the HandyBoard, MIT, USA
Martinez, A.; Tunstel, E. & Jamshidi, M. (1994). Fuzzy Logic Based Collision Avoidance for
a Mobile Robot, Robotica
Matsuyama, T. (1997). Cooperative Distributed Vision, 1st Intl. Workshop on Cooperative
Distributed Vision, Kyoto, Japan
McDermott, D. (1994). A Reactive Plan Language, Technical Report YALEU/CSD/RR 864,
Dept. of Computer Science, Yale University, USA
Michaud, F. (1997). Selecting Behaviours using Fuzzy Logic, IEEE Int. Conf. on Fuzzy
Systems, Barcelona, Spain, pp 585-592
Michaud, F.; Lachiver, G. & Dinh, C.T.L. (1996). Fuzzy Selection and Blending of
Behaviours for Situated Agents, IEEE Intl. Conf. on Fuzzy Systems, New Orleans, USA
Michel, O.; Saucy, P. & Mondada, F. (1997). KhepOnTheWeb: an Experimental
Demonstrator in Telerobotics and Virtual Reality, in Intl Conf. on Virtual Systems and
Multimedia, IEEE Computer Society Press
Mitchell, T. (1997). Machine Learning, McGraw Hill, New York, USA
Moravec, H. (1990). The Stanford CART and the CMU Rover, in Autonomous Robot Vehicles
(I.J. Cox and G.T. Wilfong eds.), Springer-Verlag
Moravec, H.P. & Elfes, A. (1985). High Resolution Maps from Wide Angle Sonar, IEEE Intl.
Conf. on Robotics and Automation, Washington, USA, pp 116-121
Nehmzow, U. (2000). Mobile Robotics: a Practical Introduction, Series on Applied
Computing, Springer-Verlag, London, UK
Nguyen, D. & Widrow, B. (1990). The Truck Backer-Upper: an Example of Self-Learning
in Neural Networks, in Neural Networks for Control (W.T. Miller, R. Sutton and P.
Werbos eds.), Bradford Book, MIT Press, Cambridge
Nilsson, N.J. (1969). A Mobile Automaton: An Application of Artificial Intelligence
Techniques, Proc. of IJCAI, reprinted in Autonomous Mobile Robots: Control, Planning
and Architecture, 2(1): 233-239
Nilsson, N.J. (1980). Principles of Artificial Intelligence, Tioga Press, Palo Alto, USA
Norris, J.S. & Backes, P. (2000). WEDDS: The WITS Encrypted Data Delivery System, IEEE
Aerospace Conf., Big Sky, USA
Ollero, A.; García Cerezo, A.; Martínez, J.L. & Mandow, A. (1998). Fuzzy Tracking Methods
for Mobile Robots, in Applications of Fuzzy Logic: Towards a High Machine Intelligence
Quotient (M. Jamshidi ed.), Prentice Hall
Opitz, R. (1990). Das Lernfahrzeug: Neural Network Application for Autonomous Mobile
Robots, in Advanced Neural Computers (R. Eckmiller ed.), Elsevier, Amsterdam
Oriolo, G.; Ulivi, G. & Venditelli, M. (1998). Real-Time Map Building and Navigation for
Autonomous Mobile Robots in Unknown Environments, IEEE Trans. on Systems, Man and
Cybernetics-Part B: Cybernetics, 3(28):316-333
Pal, N.C.; Bezdek, J.C. & Tsao, E.C.-K. (1993). Generalized Clustering Networks and
Kohonen's Self-Organizing Scheme, IEEE Trans. on Neural Networks, 4(1):549-557
Parunak, H.vD. (1996). Applications of Distributed Artificial Intelligence in Industry, in
Foundations of Distributed Artificial Intelligence (G.M.P. O’Hare and N.R. Jennings eds.),
Wiley Interscience
Parunak, H.vD.; Baker, A.D. & Clark, S.J. (1997). The AARIA Agent Architecture: an
Example of Requirements-Driven Agent-Based System Design, 1st Intl. Conf. on
Autonomous Agents, Marina del Rey, USA
Paulos, E. & Canny, J. (1996). Delivering Real Reality to the World Wide Web via
Telerobotics, IEEE Intl. Conf. on Robotics and Automation
Paulos, E. & Canny, J. (1999). Designing Personal Tele-Embodiment, IEEE Intl. Conf. on
Robotics and Automation, Detroit, USA
Payton, D.W. (1986). An Architecture for Reflexive Autonomous Vehicle Control, IEEE Intl.
Conf. on Robotics and Automation, pp 1838-1843
Pearson, J.C.; Gelfand, J.J.; Sullivan, W.E.; Peterson, R.M. & Spence, C.D. (1995). Neural
Network Approach to Sensory Fusion, in Multisensor Integration and Fusion for Intelligent
Machines and Systems (R.C. Luo and M.G. Kay eds.), Ablex Publishing, USA
Peterson, J.; Hager, G. & Hudak, P. (1999). A Language for Declarative Robot Programming,
in Proceedings of the Intl. Conference on Robotics and Automation, IEEE Press, USA
Pin, F.G. & Watanabe, Y. (1994). Navigation of Mobile Robots using a Fuzzy Behaviourist
Approach and Custom-Designed Fuzzy Inference Boards, Robotica, 12(1):491-503
Pomerleau, D.A. (1991). Efficient Training of Artificial Neural Networks for Autonomous
Navigation, Neural Computation, 3(1):88-97
Pratihar, D.K.; Deb, K. & Ghosh, A. (1999). A Genetic-Fuzzy Approach for Mobile Robot
Navigation Among Moving Obstacles, Intl. J. of Approximate Reasoning, 20(1):145-172
Ram, A.; Arkin, R.C.; Boone, G. & Pearce, M. (1994). Using Genetic Algorithms to Learn
Reactive Control Parameters for Autonomous Robotic Navigation, Adaptive Behaviour,
2(3):277-304
Ram, A.; Arkin, R.C.; Moorman, K. & Clark, R.J. (1997). Case-Based Reactive Navigation: a
Method for On-Line Selection and Adaptation of Reactive Robotic Control Parameters,
IEEE Trans. on Systems, Man and Cybernetics, 27(3):376-394
Reyes, C.; Alcalde, S.; Hernán, I.; Reviejo, J.; García-Rosa, R. & de Pedro, T. (1999).
Controlador de Vehículos sin Conductor: Simulación e Instrumentación, VIII Conf. de la
Asociación Española para la Inteligencia Artificial, Murcia, Spain, pp 9-17
Rochwerger, B.; Fennema, C.L.; Draper, B.; Hanson, A.R. & Riseman, E.M. (1994).
Executing Reactive Behaviour for Autonomous Navigator, Technical Report CMPSCI
TR94-05, Dept. of Computer Science, University of Massachusetts at Amherst, USA
Roehl, B.; Couch, J.; Reed-Ballreich, C.; Rohaly, T. & Drown, J. (1997). Late Night VRML
2.0 with Java, Ziff-Davis Press, USA
Rosenblatt, F. (1959). Two Theorems of Statistical Separability in the Perceptron, Symp.
Mechanization of Thought Processes, London, UK, pp 421-456
Rosenblatt, J.K. (2000). Optimal Selection of Uncertain Actions by Maximizing Expected
Utility, Autonomous Robots, 9(1):17-25
Rosenblatt, J.K. & Payton, D.W. (1989). A Fine-Grained Alternative to the Subsumption
Architecture, AAAI Spring Symposium on Robot Navigation, Stanford, USA
Ruspini, E. (1991). Truth as Utility: a Conceptual Synthesis, 7th Conf. on Uncertainty in
Artificial Intelligence, Los Angeles, USA
Russell, S.J. & Norvig, P. (1995). Artificial Intelligence: a Modern Approach, Prentice Hall,
Englewood Cliffs, USA
Saffiotti, A. (1993). Some Notes on the Integration of Planning and Reactivity in Autonomous
Mobile Robots, AAAI Spring Symposium on Foundations of Planning, Stanford, USA, pp
122-126
Saffiotti, A. (1997a). The use of Fuzzy Logic for Autonomous Robot Navigation, Soft
Computing, 1(4):180-197
Saffiotti, A. (1997b). Fuzzy Logic in Autonomous Robotics: Behaviour Coordination, Sixth
IEEE Intl. Conference on Fuzzy Systems (FuzzIEEE’97), Barcelona, Spain, pp 573-578
Saffiotti, A. et al. (1994). Robust Execution of Robot Plans using Fuzzy Logic, in Fuzzy Logic
in Artificial Intelligence: IJCAI'93 Workshop. Lecture Notes in Artificial Intelligence 847
(Ralescu A. Ed.). Springer-Verlag, pp 24-37
Saffiotti, A.; Ruspini, E. & Konolige, K. (1993). Blending Reactivity and Goal-Directness in
a Fuzzy Controller, Second IEEE Intl. Conference on Fuzzy Systems (FuzzIEEE’93), San
Francisco, USA, pp 134-139
Saffiotti, A.; Ruspini, E. & Konolige, K. (1999). Using Fuzzy Logic for Mobile Robot
Control, in International Handbook of Fuzzy Sets (D. Dubois, H. Prade and H.
Zimmermann eds.), Kluwer Academic Publishing
Sánchez, O.; Ollero, A. & Heredia, G. (1997). Adaptive Fuzzy Control for Automatic Path
Tracking of Outdoor Mobile Robots. Application to Romeo 3R, IEEE Int. Conf. on Fuzzy
Systems, Barcelona, Spain, pp 593-599
Santos, V.; Gonzalves, J. & Vaz, F. (1994). Perception Maps for the Local Navigation of a
Mobile Robot: a Neural Network Approach, IEEE Intl. Conf. on Robotics and Automation,
pp 2193-2198
Saucy, P. & Mondada, F. (2000). KhepOnTheWeb: Open Access to a Mobile Robot on the
Internet, IEEE Robotics and Automation Magazine, 7(1):41-47
Sayers, C. (1999). Remote Control Robotics, Springer-Verlag, New York, USA
Schultz, A.C. & Nourbakhsh, I.R. (2000). Computational Intelligence in Robotics and
Automation, Autonomous Robots, 9(1):5-6
Schultz, D.; Burgard, W.; Fox, D.; Thrun, S. & Cremers, A.B. (2000). Web Interfaces for
Mobile Robots in Public Places, IEEE Robotics and Automation Magazine, 7(1):48-56
Serradilla, F. & Maravall, D. (1998). A Novel Approach to Autonomous Navigation: the
Sensory Gradient Concept and its Application to Mobile Robots, 3rd IFAC Symposium on
Intelligent Autonomous Vehicles, Madrid, Spain, pp 383-388
Schaad, R. (1998). Representation and Execution of Situated Action Sequences, PhD
Dissertation, Faculty of Economics, University of Zurich, Switzerland
Shafer, G. (1976). A Mathematical Theory of Evidence, Princeton University Press, Princeton,
USA
Sheridan, T. (1992). Telerobotics, Automation, and Human Supervisory Control, MIT Press,
Cambridge, USA
Simmons, R. & Apfelbaum, D. (1998). A Task Description Language for Robot Control, in
Proceedings of the Conference on Intelligent Robots and Systems, IEEE Press, USA
Simmons, R.; Fernández, J.L.; Goodwin, R.; Koenig, S. & O'Sullivan, J. (2000). Lessons
Learned from Xavier, IEEE Robotics and Automation Magazine, 7(2):33-39
Stanev, P.; Enchev, E.; Noikov, S.; Manolov, O. & Zaprianov, J. (1998). Fuzzy Control and
Reflexive Behaviour of an Autonomous Vehicle in an Unknown Environment, 3rd IFAC
Symposium on Intelligent Autonomous Vehicles, Madrid, Spain, pp 521-525
Stein, M.R. (2000). Interactive Internet Artistry, IEEE Robotics and Automation Magazine,
7(2):28-32
Stone, P. (2000). Layered Learning in Multiagent Systems, The MIT Press, Cambridge, USA
Stone, P. & Veloso, M. (1999). Task Decomposition, Dynamic Role Assignment, and Low-Bandwidth Communication for Real-Time Strategic Teamwork, Artificial Intelligence,
110(2):241-273
Stone, P. & Veloso, M. (2000). Multiagent Systems: a Survey from a Machine Learning
Perspective, Autonomous Robots, 8(3):345-383
Stone, P.; Veloso, M. & Riley, P. (1999). The CMUnited-98 Champion Simulator Team, in
RoboCup-98: Robot Soccer World Cup II (M. Asada and Kitano H. eds.), Springer-Verlag,
Berlin, Germany
Sugeno, M. & Nishida, M. (1985). Fuzzy Control of a Model Car, Fuzzy Sets and Systems,
16(1):103-113
Sugeno, M. & Takagi, T. (1983). Multidimensional Fuzzy Reasoning, Fuzzy Sets and
Systems, 9(1)
Surmann, H.; Huser, J. & Peters, L. (1995). A Fuzzy System for Indoor Mobile Robot
Navigation, IEEE Intl Conf. on Fuzzy Systems, Yokohama, Japan, pp 83-86
Sycara, K.; Decker, K.; Pannu, A.; Williamson, M. & Zeng, D. (1996). Distributed Intelligent
Agents, IEEE Expert, 11(6):36-46
Takagi, T. & Sugeno, M. (1985). Fuzzy Identification of Systems and its Application to
Modelling and Control, IEEE Trans. on Systems, Man, and Cybernetics, 15(1):116-132
Tanenbaum, A.S. (1996). Computer Networks, Prentice Hall, New Jersey, USA
Taylor, K. & Dalton, B. (2000a). Internet Robots: a New Robotics Niche, IEEE Robotics and
Automation Magazine, 7(1):27-34
Taylor, K. & Dalton, B. (2000b). Distributed Robotics over the Internet, IEEE Robotics and
Automation Magazine, 7(2):22-27
Tekalp, A.M. (1995). Digital Video Processing, Prentice Hall, New Jersey, USA
Thorpe, C.F. (1985). Path Relaxation: Path Planning for a Mobile Robot, Annual Report,
Robotics Institute, Mobile Robots Laboratory, Carnegie-Mellon University, USA, pp 39-42
Thrun, S. (1998a). Learning Metric-Topological Maps for Indoor Mobile Robot Navigation,
Artificial Intelligence, 99(1):21-71
Thrun, S. (1998b). A Framework for Programming Embedded Systems: Initial Design and
Results, Technical Report CMU-CS-98-142, Carnegie Mellon University, USA
Topalov, A.V.; Tsankova, D.D.; Petrov, M.G. & Proychev, T.P. (1998). Intelligent Sensor-Based Navigation and Control of Mobile Robot in a Partially Known Environment, 3rd
IFAC Symposium on Intelligent Autonomous Vehicles, Madrid, Spain, pp 439-444
Tso, K.S.; Backes, P.G. & Tharp, G.K. (1998). Mars Pathfinder Mission Internet-Based
Operations using Wits, IEEE Intl. Conf. on Robotics and Automation, Leuven, Belgium
Tsotsos, J.K. (1997). Intelligent Control for Perceptually Attentive Agents: the S* Proposal,
Robotics and Autonomous Systems, 21(5):5-21
Tunstel, E. (1995). Adaptive Hierarchy of Distributed Fuzzy Control: Application to
Behaviour Control of Rovers, Ph.D. Thesis, University of New Mexico, USA
Tunstel, E. (1996). Mobile Robot Autonomy via Hierarchical Fuzzy Behaviour Control,
World Automation Congress, ISARM Track, Montpellier, France, pp 837-842
Tunstel, E. & Jamshidi, M. (1996). On Genetic Programming of Fuzzy Rule Based Systems
for Intelligent Control, Intl. Journal of Intelligent Automation and Soft Computing, 3(1)
Ulrich, I. & Borenstein, J. (1998). VFH+: Reliable Obstacle Avoidance for Fast Mobile
Robots, IEEE Intl. Conf. on Robotics and Automation, Leuven, Belgium, pp 1572-1577
Ulrich, I. & Borenstein, J. (2000). VFH*: Local Obstacle Avoidance with Look Ahead
Verification, IEEE Intl. Conf. on Robotics and Automation, San Francisco, USA, pp 2505-2511
Urzelai, J.; Uribe, J.P. & Ezkerra, M. (1997). Fuzzy Controller for Wall-Following with a
Non-Holonomous Mobile Robot, IEEE Int. Conf. on Fuzzy Systems, Barcelona, Spain, pp
1361-1368
Vasudevan, V. (1998). Comparing Agent Communication Languages, OBJS Technical Note
9807, Object Services and Consulting Inc., USA (http://www.objs.com/agility/tech-reports/9807-comparing-ACLs.html)
Velasco, J.R. & Ventero, F.E. (1994). Some Applications of Fuzzy Clustering to Fuzzy
Control Systems, Third International Conference of Fuzzy Theory and Technology,
Durham, USA
Villalba, G. & Martínez Barberá, H. (1998). Robot Autónomo con Control Reactivo, MsE
Thesis, Escuela Politécnica Superior, Universidad de Murcia, Spain
Voudouris, C.; Chernett, P.; Wang, C.J. & Callaghan, V.L. (1994). Fuzzy Hierarchical
Control for Autonomous Vehicles, Intl. Workshop on Intelligent Robotic Systems
Vranes, S. & Stanojevic, M. (1995). Integrating Multiple Paradigms within the Blackboard
Architecture, IEEE Transactions on Software Engineering, 3(21): 244-262
Walter, W.G. (1950). An Imitation of Life, Scientific American, 182(5):42-45
Wang, L. & Langari, R. (1996). Complex Systems Modelling via Fuzzy Logic, IEEE Trans. on
Systems, Man and Cybernetics, 1(26):100-106
Wang, L. & Mendel, J.M. (1992). Generating Fuzzy Rules by Learning from Examples, IEEE
Trans. on Systems, Man and Cybernetics, 22(6):1414-1427
Wang, X. (1996). Planning while Learning Operators, 3rd Intl. Conf. on AI Planning Systems
Wang, Y. & Chirikjian, G. (2000). A New Potential Field Method for Robot Path Planning,
IEEE Int. Conf. on Robotics and Automation, San Francisco, USA, pp 977-982
Watt, A. & Watt, M. (1992). Advanced Animation and Rendering Techniques, ACM Press
Welch, J.C. (1999). Wireless Networking and the AirPort, MacTech Magazine, 15(12):14-32
Werbos, P. (1974). Beyond Regression: New Tools for Prediction and Analysis in the
Behavioral Science, PhD Thesis, Harvard University, Cambridge, USA
Widrow, B. & Hoff, M.E. (1960). Adaptive Switching Circuits, IRE Western Electric Show
and Convention Record, USA, pp 96-104
Xu, W.L. & Tso, S.K. (1999). Sensor-Based Fuzzy Reactive Navigation of a Mobile Robot
Through Local Target Switching, IEEE Trans. on Systems, Man and Cybernetics,
29(3):451-459
Yager, R. & Filev, D. (1994). Approximate Clustering via the Mountain Method, IEEE Trans.
on Systems, Man, and Cybernetics, 24(1):1279-1284
Yen, J. & Pfluger, N. (1995). A Fuzzy Logic Based Extension to Payton and Rosenblatt’s
Command Fusion Method for Mobile Robot Navigation, IEEE Trans. on Systems, Man,
and Cybernetics, 25(6):971-978
Yeong, W.; Howes, T. & Kille, S. (1993). Lightweight Directory Access Protocol, RFC 1487
Yuan, B.; Klir, G.J. & Swan-Stone, J.F. (1995). Evolutionary Fuzzy C-Means Clustering
Algorithm, IEEE Intl. Conf. on Fuzzy Systems
Zadeh, L.A. (1965). Fuzzy Sets, Information and Control, 8(3):338-353
Zalama, E.; Gaudiano, P. & López Coronado, J. (1995). A Real-Time, Unsupervised Neural
Network for the Low-Level Control of a Mobile Robot in a Nonstationary Environment,
Neural Networks, 8(1):103-123