Actas de las XVII Jornadas de ARCA 2015

JARCA 2015
Actas de las XVII Jornadas de ARCA
Sistemas Cualitativos y sus Aplicaciones en Diagnosis, Robótica,
Inteligencia Ambiental y Ciudades Inteligentes
Vinaros, 23 to 27 June 2015
Juan Antonio Ortega Ramírez
Mario Muñoz Organero
Proceedings of the XVII ARCA Days
Qualitative Systems and its Applications in Diagnose, Robotics, Ambient
Intelligence and Smart Cities
Edited by
Juan Antonio Ortega Ramírez
Departamento de Lenguajes y Sistemas Informáticos
Universidad de Sevilla, Spain
and
Mario Muñoz Organero
Departamento de Ingeniería Telemática
Universidad Carlos III de Madrid, Spain
Sistemas Cualitativos y sus Aplicaciones en Diagnosis, Robótica, Inteligencia Ambiental y Ciudades Inteligentes / Juan Antonio Ortega Ramírez, Mario Muñoz Organero
ISBN: 978-84-608-5599-6
© Copyright
Texts: the corresponding authors
Illustrations: the corresponding authors
This edition: Juan Antonio Ortega Ramírez
June 2015.
No total or partial reproduction of this book is permitted, nor its transmission in any form or by any means, whether electronic, by recording or by other methods, without the prior written permission of the copyright holders.
Conference Director
Juan Antonio Ortega Ramírez, Universidad de Sevilla
Organizing Committee
Mario Muñoz Organero, Universidad Carlos III de Madrid
Víctor Córcoba Magaña, Universidad Carlos III de Madrid
Jorge Yago Fernández Rodríguez, Universidad de Sevilla
Álvaro Arcos García, Universidad de Sevilla
Program Committee
Núria Agell, Universidad Ramon Llull
Cecilio Angulo, Universitat Politécnica de Catalunya
Joaquim Armengol, Universitat de Girona
Andreu Català, Universitat Politécnica de Catalunya
Zoe Falomir, Universität Bremen
Luis González, Universidad de Sevilla
Natividad Martínez, Reutlingen University
Quim Meléndez, Universitat de Girona
Lledó Museros, Universitat Jaume I
Juan Antonio Ortega, Universidad de Sevilla
Francisco Ruiz, Universitat Politécnica de Catalunya
Ismael Sanz, Universitat Jaume I
Ralf Seepold, Universität Konstanz
Miguel Toro, Universidad de Sevilla
Jesús Torres, Universidad de Sevilla
Josep Vehí, Universitat de Girona
Francisco Velasco, Universidad de Sevilla
Preface
This volume contains the papers presented at JARCA 2015: XVII Jornadas de ARCA
Sistemas Cualitativos y sus Aplicaciones en Diagnosis, Robótica, Inteligencia
Ambiental y Smart Cities that took place from 22 to 28 June 2015 in Vinaros.
There were 20 accepted papers. Each submission was reviewed by program
committee members.
These proceedings are partially supported by the projects HERMES (TIN2013-46801-C4-1-R, TIN2013-46801-C4-2-R) of the Spanish Ministry of Economy and Competitiveness and Simon (TIC8052) of the Andalusian Regional Ministry of Economy, Innovation and Science, with the cooperation of Fidetia (Fundación para la Investigación y el Desarrollo de las Tecnologías de la Información en Andalucía).
EasyChair was used to manage paper submissions, reviewing and generating the proceedings.
Vinaros, June 2015
Previous Editions
Sevilla 1998
Sevilla 2000
Valladolid 2001
Vilanova i la Geltrú 2002
Lanzarote 2003
Menorca 2004
Benalmádena 2005
Peñíscola y Castellón de la Plana 2006
Girona 2007
Tenerife 2008
Granada 2009
Mallorca 2010
Huelva 2011
Salou 2012
Murcia 2013
Cádiz 2014
Index
Cognitive-AmI project: Widespreading results
Zoe Falomir
A pilot study to test the cognitive performance of a qualitative shape description scheme for juxtaposing shapes
Zoe Falomir, Lledó Museros, Ismael Sanz, Luis González-Abril
Un big picture sobre las tecnologías de computación y almacenamiento distribuidos
Damián Fernández Cerero, Alejandro Fernández-Montes, Juan Antonio Ortega
A multidimensional expertise recommender tool
Germán Sánchez-Hernández, Jennifer Nguyen, Núria Agell
Tracking infrastructure for large crowd movement
Andreas Eberlei, Manuel Herzberg, Jonas Kaltenbach, Dominik Ringgeler, Patrick Datko, Wilhelm D. Scherz, Tatjana Thimm and Ralf Seepold
ENGAGE: an Evidence-based Model of Engagement for Dementia
Giulia Perugia, M.D.; Marta Díaz Boladeras; Andreu Català Mallofré
Plataforma BREATHE de apoyo al cuidador informal: ¿Estamos preparados para el Gran Hermano?
Ángel Martínez, Juan Pablo Lázaro, Isabel Martí
Filtering process and data exchange architecture over ECG custom-hardware platform
Luis Miguel Soria Morillo, Daniel Scherz, Ralf Seepold and Juan Antonio Ortega Ramírez
Drive assistant combined with EEG data applied to aggressive driving perception
Emre Yay, Luis Miguel Soria Morillo, Natividad Martínez Madrid and Juan Antonio Ortega Ramírez
Preparing an energy-efficiency and safety relevant driving system for the evaluation on a driving simulator
Emre Yay, Natividad Martínez Madrid and Juan Antonio Ortega Ramírez
Towards emotion pattern extraction with the help of stress detection techniques in order to enable a healthy life
Wilhelm D. Scherz, Juan Antonio Ortega and Ralf Seepold
SmartDriver: An assistant for reducing stress and improve the fuel consumption
V. Corcoba Magaña and M. Muñoz Organero
Plataforma para gestión de información de ciudadanos de una SmartCity
Jorge Yago Fernández, Álvaro Arcos García, Juan Antonio Álvarez-García, Jesús Torres, Jesús Arias Fisteus, Víctor Corcoba Magaña, Mario Muñoz Organero, Luis Sánchez Fernández
Sistema de reconocimiento de señales de tráfico para una SmartCity
Álvaro Arcos García, Juan Antonio Álvarez-García, Jorge Yago Fernández, Juan Antonio Ortega, Mario Soilán, Belén Riveiro, Pedro Arias-Sánchez
Infraestructuras para gestión de información de una SmartCity
Miguel Luaces, Susana Ladra, Mario Muñoz Organero, Jesús Arias Fisteus, Víctor Córcoba Magaña, Pedro Arias-Sánchez, Belén Riveiro, Juan Antonio Álvarez-García, Juan Antonio Ortega, Jorge Yago Fernández
A Pilot Study on Energy Saving in an Intelligent Building
Zoe Falomir, Alejandro Fernández-Montes, Luis González-Abril
Integrating SOAR Cognitive Architecture in ROS Environment on a Parrot AR.Drone 2.0
Sai Kishor Kothakota, Cecilio Angulo
Towards Parameterizing a Colour Model depending on the Context
Lledó Museros, Ismael Sanz, Zoe Falomir, Luis González-Abril
Calculo de la odometría en un robot cuadrúpedo mediante técnicas de visión artificial
Lucía Lillo-Fantova, Manel Velasco, Cecilio Angulo
Un estudio estadístico de los e-mail que recibe un investigador actual en el área de conocimiento de la computación y de la bioingeniería
Luis González-Abril and Yenny Leal
Cognitive-AmI Project: Description and Results
Zoe Falomir
Bremen Spatial Cognition Research Centre
University of Bremen
Enrique-Schmidt-Str. 5, DE-28359 Bremen
Abstract
This article describes the project Cognitive and Semantic Scene Descriptions for Reasoning and Learning in Ambient Intelligence (Cognitive-AmI, GA 328763), funded by the Marie Curie Intra-European Fellowships programme of the European Union's 7th Framework Programme. It also explains the main results for a general audience, mentions some of the most relevant scientific publications and gives access to the data obtained through the project website: https://sites.google.com/site/cognitiveami/
1 Description
This project aims to obtain qualitative information from images/videos of indoor environments. Why qualitative information? Because qualitative representations abstract away unnecessary details and can handle data that contain uncertainty (for example, noise). Qualitative representations also correspond to linguistic concepts that are meaningful to people and can therefore be used intuitively in user-machine communication. In addition, qualitative spatio-temporal reasoning (QSTR) [Cohn and Renz, 2007; Ligozat, 2011] has already defined useful models for reasoning in space about location [Hernandez, 1991], topology [Egenhofer and Al-Taha, 1992; Cohn et al., 1994], direction [Freksa, 1992], visibility [Tarquini et al., 2007], shape [Falomir et al., 2013], etc. These models have been applied in many scientific areas, such as robotics [Kunze et al., 2014; Falomir et al., 2013b], architecture and design [Bhatt and Freksa, 2015], geographic information systems [Fogliaroni, 2013], sketch recognition [Lovett et al., 2006], etc.
Figure 1. Graphical summary of the Cognitive-AmI project.
The problem with digital images/videos is that they discretize the whole space and represent it as a matrix of colour points or pixels (for example, red, green and blue, or RGB, pixels). These pixels are not related to each other, that is, they do not preserve the properties that space has (for example continuity, interrelations, etc.). As a consequence, a cognitive aspect as basic as knowing where a cup ends and a table begins is not trivial to compute in the world of digital images. With the arrival of sensors that incorporate depth information (for example the MS Kinect or Asus Xtion sensors), this problem has one more dimension to compute, since space is now discretized as a three-dimensional point cloud.
The aim of this project is to use the methods available in computer vision to recognize objects [Bay et al., 2008; Muja and Lowe, 2009], regions [Felzenszwalb and Huttenlocher, 2004] or movements [Zivkovic, 2004] and, from the data obtained, to try to abstract concepts that preserve the properties of space and can help to describe scenes in a more cognitive way.
In the Cognitive-AmI project (Figure 1) we work with images/videos captured by cameras mounted on a robot or inside a building, as is the case of the Cartesium building at the University of Bremen, where users can interact with a system that controls the building through interactive screens.
Specifically, indoor scenes were captured in the Cartesium building, which hosts the Spatial Cognition research centre at the University of Bremen, to obtain a dataset on which to apply the QIDL model, developed to produce a logical and narrative description of spaces using qualitative features of shape, colour, topology, location and size [Falomir, 2015]. The main goal is to describe the location of objects that are needed to carry out a task or are known a priori, but also to describe unknown objects whose colour, shape or location can be named, so that more information can be obtained through interaction with the user. This logical description is given as Horn clauses implemented in Prolog for reasoning about spatial locations (Figure 2). Experiments carried out in common rooms and offices of the Cartesium building have shown the usefulness of the method [Falomir and Olteteanu, 2015]. To help software agents understand indoor environments, these qualitative descriptions have also been linked to ontological descriptions [Falomir, 2014; 2013b].
Figure 2. Outline of the process followed to extract logical descriptions of indoor environments.
Within the Cognitive-AmI project, similarity methods have also been defined to compare scenes [Falomir et al., 2014], shapes [Museros et al., 2015] and colours of paintings [Falomir et al., 2015] using the conceptual neighbourhood relations between qualitative concepts. Similarity values between the colours of paintings by artists such as Dalí, Miró, El Greco, Velázquez and Hundertwasser were computed automatically (Figure 3) and compared with those given by people in a survey. The comparison showed a correlation between the two sets of values. Moreover, since a cognitive system must be able to learn from sensor input, learning techniques (for example support vector machines) have been applied to the categorization of painting styles (such as Baroque, Impressionism and Post-Impressionism) [Falomir et al., 2015b], where it was also observed that the use of colours follows a logic. In addition, the adaptability to the user of the qualitative colour model has been demonstrated [Sanz et al., 2015].
Figure 3. Example of similarity values between the colours used in paintings by Dalí (D) and Miró (M).
The applicability of qualitative models to describing movement in videos has also been shown, for example by obtaining the location and direction of an object at a given instant or over a period of time. Again, this movement description can be written as Horn clauses in Prolog, which can be used to reason about the information obtained and to infer types of movement (for example parabola, bounce off a wall, etc.) [Falomir and Rahman, 2015].
Furthermore, to improve human-machine communication, a grammar has been designed that generates natural-language sentences from the qualitative descriptions obtained from the scenes, thus producing a narrative description [Falomir, 2013]. Studies have also been carried out on how people refer to objects in a scene so as to maximize discrimination from the remaining objects and let the listener know which object is meant. The resulting model [Mast et al., 2015] obtains shape, colour and location features, both absolute and fuzzy, and compares their use. Our studies show that non-precise models (that is, an object may be called red, pink or orange depending on the speaker) are more adaptable to real communication situations than absolute/exact models.
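The idea of referring to an object through the features that best discriminate it from the rest of the scene can be illustrated with a minimal Python sketch. This is not the model of [Mast et al., 2015], whose details are not given here; the scene, the feature names and the function `discriminating_features` are assumptions made only for illustration, and features are treated as exact labels rather than fuzzy ones.

```python
from itertools import combinations

# Hypothetical scene: each object is described by qualitative features.
SCENE = {
    "obj1": {"shape": "cup", "colour": "red", "location": "left"},
    "obj2": {"shape": "cup", "colour": "blue", "location": "right"},
    "obj3": {"shape": "box", "colour": "red", "location": "right"},
}

def discriminating_features(scene, target):
    """Return the smallest feature subset that only the target satisfies."""
    description = scene[target]
    names = sorted(description)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            # Other objects that match the target on every chosen feature.
            confusable = [
                other for other, feats in scene.items()
                if other != target
                and all(feats.get(f) == description[f] for f in subset)
            ]
            if not confusable:
                return {f: description[f] for f in subset}
    return None  # the target cannot be told apart with these features
```

In this toy scene, `obj1` shares its colour with `obj3` and its shape with `obj2`, so the sketch falls back on its location, the only feature that singles it out.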
A qualitative model has also been developed to describe 3D objects based on the depth seen from different perspectives (Figure 4). If we consider three perspectives of an object, for example front, side and top, we can take into account the continuity relations that exist in space to define conditions that must hold in each perspective. We can also use these descriptions to infer information about other perspectives, such as the back, the other side or the bottom, which may be hidden depending on the viewpoint adopted. For example, if the object has an open through-hole, the hole must be represented in every perspective it affects; it is not coherent for it to appear in one perspective and not in another. Accordingly, logical descriptions have been obtained as Horn clauses, which were programmed and tested in Prolog [Falomir, 2015b; Falomir, 2015c]. The results look promising and we think they can help students solve intelligence tests such as those of the German agency Studienstiftung (see footnote 1) on technical drawings of 3D objects (Figure 5). In the 3D world, we have also carried out studies using Kinect sensors to describe real scenes containing oriented objects (for example chairs, whose front differs from the front of the speaker) and have generated narrative descriptions suited to the context [Kluth and Falomir, 2013].
Figure 5. Application of the Q3D model to education, with the aim of helping students solve intelligence tests on technical drawing.
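The coherence condition on through-holes described above can be sketched in Python. The sketch rests on a deliberately simplified rule that is an assumption of this example, not the Horn clauses of [Falomir, 2015b; Falomir, 2015c]: a through-hole seen in one view must also be seen in the opposite view. The view names and the functions `infer_hidden_views` and `is_coherent` are hypothetical.

```python
# Assumed pairing of opposite perspectives.
OPPOSITE = {"front": "back", "back": "front",
            "left": "right", "right": "left",
            "top": "bottom", "bottom": "top"}

def infer_hidden_views(visible):
    """Infer what the hidden views must show.

    visible maps a view name to the set of through-holes seen in that view.
    Each hidden opposite view must show the same through-holes.
    """
    inferred = {}
    for view, holes in visible.items():
        hidden = OPPOSITE[view]
        if hidden not in visible:
            inferred[hidden] = set(holes)
    return inferred

def is_coherent(views):
    """A description is coherent if opposite views agree on through-holes."""
    return all(views[v] == views[OPPOSITE[v]]
               for v in views if OPPOSITE[v] in views)
```

For instance, given a front view with one through-hole and hole-free left and top views, the sketch infers that the back view must show the same hole, and it rejects a description where the hole appears in the front view but not in the back view.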
Finally, cognitive tests have been carried out on creativity and its relation to the typical or atypical associations that people make between linguistic and visual concepts. An example of a Remote Associates Test (RAT) item is what people associate with the triple of concepts Cottage-Swiss-Cake. The studies by Mednick and Mednick [1971] give Cheese as the convergent answer, since there are Swiss cheese, cheesecake and cottage cheese. A computational method (comRAT-C) [Olteteanu and Falomir, 2015] has been developed that converges on a concept related to the other three and yields a large part of the answers obtained in the test of Mednick and Mednick [1971], which was created to measure the level of creativity in people. The computational system can also give other plausible answers by convergence of two concepts, for example Chocolate, shared by Swiss and Cake (Swiss chocolate or chocolate cake).
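The convergence step in the Cottage-Swiss-Cake example can be illustrated with a small Python sketch. It is only a toy under stated assumptions: comRAT-C is built from language data, whereas the `COMPOUNDS` table below is hand-written for this example, and the functions `converge` and `answers` are hypothetical names.

```python
from collections import Counter

# Toy table of compound expressions: cue word -> words it combines with.
# Illustrative data only, not the knowledge base used by comRAT-C.
COMPOUNDS = {
    "cottage": {"cheese", "garden"},
    "swiss": {"cheese", "chocolate", "army"},
    "cake": {"cheese", "chocolate", "birthday"},
}

def converge(cues, compounds=COMPOUNDS):
    """Count, for each candidate word, how many cues it combines with."""
    votes = Counter()
    for cue in cues:
        for candidate in compounds.get(cue, ()):
            votes[candidate] += 1
    return votes

def answers(cues, min_support):
    """Candidates supported by at least min_support cues, alphabetically."""
    votes = converge(cues)
    return sorted(word for word, n in votes.items() if n >= min_support)
```

With this table, requiring support from all three cues leaves only `cheese`, while relaxing the requirement to two cues also admits `chocolate`, mirroring the two kinds of answer described above.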
Figure 4. Example of a volume-based qualitative description of a simple 3D object.
2 Conclusions
This article describes the project Cognitive and Semantic Scene Descriptions for Reasoning and Learning in Ambient Intelligence (Cognitive-AmI, GA 328763) and its main results for a general audience. The project was funded by the European Union through the Marie Curie actions of the 7th Framework Programme (FP7).
For more details on methods and implementation, the corresponding scientific publications are listed in the references.
More information can also be found on the project website: https://sites.google.com/site/cognitiveami/
Acknowledgements
The author thanks the European Union for funding through the Marie Curie actions of the 7th Framework Programme (FP7) (GA 328763), and the Bremen Spatial Cognition Research Centre and its staff.
The collaboration of the following researchers and co-authors of articles is also gratefully acknowledged: Ana-Maria Olteteanu (U. Bremen), Vivien Mast (U. Potsdam), Diedrich Wolter (U. Bamberg), Lledó Museros (U. Jaume I), Ismael Sanz (U. Jaume I), Luis Gonzalez-Abril (U. Sevilla) and Christian Freksa (U. Bremen).
1 Test der Studienstiftung: Gehirnjogging für Hochbegabte, or Test of the German Academic Foundation for gifted minds. http://www.spiegel.de/quiztool/quiztool-49771.html
References
[Bay et al., 2008] H. Bay, A. Ess, T. Tuytelaars, and L. Van Gool. Speeded-up robust features (SURF). Comput. Vis. Image Underst., 110(3):346–359, June 2008.
[Bhatt and Freksa, 2015] M. Bhatt and C. Freksa. Spatial computing for design: an artificial intelligence perspective. In J. S. Gero, editor, Studying Visual and Spatial Reasoning for Design Creativity, pages 109–127, 2015.
[Cohn et al., 1994] A. Cohn, D. Randell, Z. Cui, O. Bennett, and J. Gooday. Taxonomies of logically defined qualitative spatial relations. In N. Guarino and R. Poli, editors, Formal Ontology in Conceptual Analysis and Knowledge Representation, pages 831–846. Kluwer, 1994.
[Cohn and Renz, 2007] A. G. Cohn and J. Renz. Qualitative Spatial Reasoning. In V. L. F. Harmelen and B. Porter, editors, Handbook of Knowledge Representation. Wiley-ISTE, London: Elsevier, 2007.
[Egenhofer and Al-Taha, 1992] M. J. Egenhofer and K. K. Al-Taha. Reasoning about gradual changes of topological relationships. In A. U. Frank, I. Campari, and U. Formentini, editors, Theories and Methods of Spatio-Temporal Reasoning in Geographic Space. Intl. Conf. GIS - From Space to Territory, volume 639 of Lecture Notes in Computer Science, pages 196–219. Springer, Berlin, 1992.
[Falomir et al., 2013] Z. Falomir, L. Gonzalez-Abril, L. Museros, and J. Ortega. Measures of Similarity between Objects from a Qualitative Shape Description. Spatial Cognition and Computation, 13(3):181–218, 2013.
[Falomir, 2013a] Z. Falomir. Towards cognitive image interpretation qualitative descriptors, domain knowledge and narrative generation. In V. Botti, K. Gibert, and R. Reig-Bolao, editors, Artificial Intelligence Research and Development, Frontiers in Artificial Intelligence and Applications, vol. 256, pages 77–86. IOS Press, Amsterdam, 2013.
[Falomir, 2013b] Z. Falomir. Towards scene understanding using contextual knowledge and spatial logics. In J. Dias, F. Escolano, and R. Marfil, editors, Proc. of the 2nd Workshop on Recognition and Action for Scene Understanding (REACTS), pages 85–100, 2013. ISBN 978-84-616-7092-5.
[Falomir et al., 2013b] Z. Falomir, L. Museros, V. Castelló, and L. Gonzalez-Abril. Qualitative distances and qualitative image descriptions for representing indoor scenes in robotics. Pattern Recognition Letters, 38:731–743, 2013. Available: http://dx.doi.org/10.1016/j.patrec.2012.08.012
[Falomir et al., 2014] Z. Falomir, L. Museros, and L. Gonzalez-Abril. Towards a similarity between qualitative image descriptions for comparing real scenes. In Qualitative Representations for Robots, Proc. AAAI Spring Symposium, Technical Report SS-14-06, pages 42–49. ISBN 978-1-57735-646-2. Palo Alto, California, USA, 2014.
[Falomir, 2014] Z. Falomir. An approach for scene interpretation using qualitative descriptors, semantics and domain knowledge. In Knowledge Representation and Reasoning in Robotics, AAAI Spring Symposium Series, pages 95–98. ISBN 978-1-57735-646-5. Palo Alto, California, USA, 2014.
[Falomir, 2015a] Z. Falomir. A qualitative image descriptor QIDL+ applied to ambient intelligent systems. In Proceedings of the 10th International Workshop on Artificial Intelligence Techniques for Ambient Intelligence (AITAmI15), co-located at IJCAI-2015, accepted. Buenos Aires, Argentina, 2015.
[Falomir, 2015b] Z. Falomir. A Qualitative Model for Reasoning about 3D Objects using Depth and Different Perspectives. In 1st Workshop on Logics for Qualitative Modelling and Reasoning (LQMR), Federated Conference on Computer Science and Information Systems (FedCSIS), pages 1–9. Lodz, Poland, September 2015.
[Falomir, 2015c] Z. Falomir. A Qualitative Model for Describing 3D Objects using Depth. In Spatio Temporal Dynamics (STeDy) Workshop at International Joint Conference on Artificial Intelligence (IJCAI). Buenos Aires, Argentina, July 2015.
[Falomir and Olteteanu, 2015] Z. Falomir and A-M. Olteteanu. Logics based on qualitative descriptors for scene understanding. Neurocomputing, 161:3–16, 2015. Available: http://dx.doi.org/10.1016/j.neucom.2015.01.074
[Falomir and Rahman, 2015] Z. Falomir and S. Rahman. From qualitative descriptors of movement towards spatial logics for videos. In J. Dias, F. Escolano, and R. Marfil, editors, Proc. of the 3rd Workshop on Recognition and Action for Scene Understanding (REACTS), accepted, 2015.
[Falomir et al., 2015a] Z. Falomir, L. Museros, and L. Gonzalez-Abril. A model for colour naming and comparing based on conceptual neighbourhood. An application for comparing art compositions. Knowledge-Based Systems, 81:1–21, 2015. Available: http://dx.doi.org/10.1016/j.knosys.2014.12.013
[Falomir et al., 2015b] Z. Falomir, L. Museros, I. Sanz, and L. Gonzalez-Abril. Guessing art styles using qualitative colour descriptors, SVMs and logics. In Artificial Intelligence Research and Development, Frontiers in Artificial Intelligence and Applications. IOS Press, accepted, Amsterdam, 2015.
[Felzenszwalb and Huttenlocher, 2004] P. F. Felzenszwalb and D. P. Huttenlocher. Efficient graph-based image segmentation. Int. J. Comput. Vis., 59(2):167–181, 2004.
[Fogliaroni, 2013] P. Fogliaroni. Qualitative Spatial Configuration Queries. Towards Next Generation Access Methods for GIS. Dissertations in Geographic Information Science. IOS Press, 2013. ISBN 978-1614992486.
[Freksa, 1992] C. Freksa. Using orientation information for qualitative spatial reasoning. In A. U. Frank, I. Campari, and U. Formentini, editors, Theories and Methods of Spatio-Temporal Reasoning in Geographic Space. Intl. Conf. GIS - From Space to Territory, volume 639 of Lecture Notes in Computer Science, pages 162–178. Springer, Berlin, 1992.
[Hernandez, 1991] D. Hernandez. Relative representation of spatial knowledge: The 2-D case. In D. M. Mark and A. U. Frank, editors, Cognitive and Linguistic Aspects of Geographic Space, NATO Advanced Studies Institute, pages 373–385. Kluwer, Dordrecht, 1991.
[Kluth and Falomir, 2013] T. Kluth and Z. Falomir. Studying the role of location in 3D scene description using natural language. In J. A. Ortega, I. Sanz, and L. Museros, editors, XV Workshop of the Association on Qualitative Reasoning and its Applications (JARCA13). Qualitative Systems and their applications to Diagnosis, Robotics and Ambient Intelligence. Proceedings from the University of Seville, pages 33–36, 2013. ISBN 978-84-616-7622-4.
[Kunze et al., 2014] L. Kunze, C. Burbridge, and N. Hawes. Bootstrapping probabilistic models of qualitative spatial relations for active visual object search. In Qualitative Representations for Robots, Proc. AAAI Spring Symposium, Technical Report SS-14-06, pages 81–80, 2014. ISBN 978-1-57735-646-2.
[Ligozat, 2011] G. Ligozat. Qualitative Spatial and Temporal Reasoning. Wiley-ISTE, London: MIT Press, 2011.
[Lovett et al., 2006] A. Lovett, M. Dehghani, and K. Forbus. Learning of qualitative descriptions for sketch recognition. In Proc. 20th Int. Workshop on Qualitative Reasoning (QR), Hanover, USA, July 2006.
[Mast et al., 2015] V. Mast, Z. Falomir, and D. Wolter. Probabilistic reference and grounding with PRAGR for dialogues with robots. Journal of Experimental & Theoretical Artificial Intelligence, under revision, 2015.
[Mednick and Mednick, 1971] S. A. Mednick and M. Mednick. Remote associates test: Examiner's manual. Houghton Mifflin, 1971.
[Muja and Lowe, 2009] M. Muja and D. G. Lowe. Fast approximate nearest neighbors with automatic algorithm configuration. In VISAPP Int. Conf. on Computer Vision Theory and Applications, pages 331–340, 2009.
[Museros et al., 2015] L. Museros, Z. Falomir, I. Sanz, and L. Gonzalez-Abril. Sketch retrieval based on qualitative shape similarity matching: Towards a tool for teaching geometry to children. AI Communications, 28(1):73–86, 2015.
[Olteteanu and Falomir, 2015] A. M. Olteteanu and Z. Falomir. comRAT-C - A computational compound Remote Associates Test solver based on language data and its comparison to human performance. Pattern Recognition Letters (2015),
[Sanz et al., 2015] I. Sanz, L. Museros, Z. Falomir, and L. Gonzalez-Abril. Customizing a qualitative colour description for adaptability and usability. Pattern Recognition Letters, SI: Cognitive Systems for Knowledge Discovery,
[Tarquini et al., 2007] F. Tarquini, F. De Felice, P. Fogliaroni, and E. Clementini. A qualitative model for visibility relations. KI, Advances in Artificial Intelligence 2007.
[Zivkovic, 2004] Z. Zivkovic. Improved adaptive gaussian mixture model for background subtraction. In Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004), volume 2, pages 28–31. IEEE, 2004.
A Qualitative Shape Description Scheme for Juxtaposing Objects Applied to Solve
Spatial Transformation Tasks: a proof-of-concept
Zoe Falomir∗
Universität Bremen
Lledó Museros and Ismael Sanz
Universitat Jaume I
Luis Gonzalez-Abril
Universidad de Sevilla
Abstract
The goal of the study presented in this paper is to determine whether the Qualitative Shape Description Scheme for juxtaposing objects (QSDJux) [Museros et al., 2011, 2010] performs cognitively, that is, whether it can carry out a juxtaposition task in a similar way to human participants solving tests that involve mental transformation skills (e.g. direct or diagonal translation and direct or diagonal rotation). An example case from the psychological test developed by Levine et al. [1999] has been selected and is used as a proof-of-concept for this cognitive verification.
1 Introduction
A fundamental question in human cognition is how people reason about space. The ability to represent and transform spatial information is a vital human skill. It is important in everyday activities, such as navigating in a new city or finding an office in a public building.
The Qualitative Shape Description Scheme for juxtaposing objects (QSDJux) [Museros et al., 2011, 2010] is able to juxtapose two shapes described qualitatively, and to generate a new qualitative shape description which corresponds to the new juxtaposed shape.
The work by Levine et al. [1999] used two-dimensional stimuli divided in half by a vertical line of symmetry. In a test (see Figure 1), the two halves were shown to children either rotated or translated apart. The children were then asked which of four given objects would result from putting the two halves together. The results of this test are analyzed to study how participants perform mental transformations (translation and rotation on bilaterally symmetrical items and on vertically symmetrical items).
In this paper, QSDJux is used to automatically solve an example of the tests presented by Levine et al. [1999]. The aspects to study are: (i) how QSDJux could be improved (regarding both the theoretical framework and the implementation) in order to perform more cognitively; and (ii) whether there is a significant correlation between the problems that are hard for the participants to solve and those that are also challenging for the QSDJux approach.
Figure 1: Example of a question in the tests presented
in Levine et al. [1999]. Given stimuli a, b, c and d,
the participant must determine which object results when
juxtaposing them.
The rest of the paper is organised as follows. Section
2 provides an example of a qualitative shape juxtaposition
(QSDJux). Section 3 presents the proof-of-concept computed
and the results observed. Finally, Section 5 presents the
conclusions and outlines the ideas for future work.
2 Example of Qualitative Shapes Juxtaposition based on QSDJux
This section shows an example of the qualitative juxtaposition (+q) of two figures following the QSDJux scheme. Figure 2 shows the graphic result and its qualitative description, obtained using the qualitative shape description scheme presented in Museros et al. [2011, 2010].
∗ Correspondence to: Zoe Falomir, Bremen Spatial Cognition Centre, FB3 - Informatics, Universität Bremen, P.O. Box 330 440, 28334 Bremen, Germany.
∗ Starting from vertex 1 in F1 , we copy it to the new
object, and repeat this step with the next vertex
clockwise, up to the vertex which is one of the
vertices in the +q operation.
∗ This vertex is replaced by the result of applying
+q between this vertex and the corresponding
one in object F2 .
∗ We continue copying the vertices in object F2
(clockwise) up to the next vertex in object F2
related with the +q operation. This vertex
description is replaced by the +q of this vertex
and its corresponding one in object F1 .
∗ If there are still vertices in object F1 which have
not been considered during the juxtaposition,
they are copied to the new object F3 too.
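The vertex-copying steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: shapes are plain lists of vertex descriptions in clockwise order, the function name `juxtapose_vertices` is hypothetical, and `merge` is a placeholder for the +q operation on two vertex descriptions, whose actual tables are given in Museros et al. [2011, 2010]. It also assumes that the shared edge runs from vertex i to the next one in F1 and from vertex m to the next one in F2, so that the two edges are traversed in opposite directions.

```python
def juxtapose_vertices(f1, f2, i, m, merge):
    """Build the vertex list of F3 = F1(i) +q F2(m).

    f1, f2 : lists of vertex descriptions in clockwise order.
    i, m   : index of the first vertex (clockwise) of the shared edge
             in f1 and f2 respectively (0-based, an assumption).
    merge  : placeholder for +q applied to two coincident vertices.
    """
    n1, n2 = len(f1), len(f2)
    i_next = (i + 1) % n1          # second vertex of the shared edge in F1
    m_next = (m + 1) % n2          # second vertex of the shared edge in F2

    # Step 1: copy the F1 vertices that precede the shared edge.
    before = f1[:i] if i_next else f1[1:i]
    result = list(before)

    # Step 2: the first shared vertex of F1 coincides with the second
    # shared vertex of F2 (the edge is traversed in opposite directions).
    result.append(merge(f1[i], f2[m_next]))

    # Step 3: copy the F2 vertices clockwise up to its other shared vertex.
    k = (m_next + 1) % n2
    while k != m:
        result.append(f2[k])
        k = (k + 1) % n2
    result.append(merge(f2[m], f1[i_next]))

    # Step 4: copy the F1 vertices not yet considered.
    if i_next:
        result.extend(f1[i_next + 1:])
    return result
```

With a square `["a", "b", "c", "d"]`, a triangle `["x", "y", "z"]`, `i = 1`, `m = 0` and a `merge` that simply pairs the two descriptions, the sketch yields five vertices, that is, 4 + 3 − 2. A real +q may additionally drop a merged vertex, which this sketch does not model.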
Figure 2: Example of a juxtaposition of two objects.
3 A Computational Proof-of-Concept and Results
In order to test if QSDJux is able to juxtapose the stimuli used
in Levine et al. [1999] we have chosen an example presented
in that paper, shown in Figures 3 and 4.
Figure 3 shows the array of possible objects to be built
using the objects in Figure 4.
The QSDJux approach is based on a qualitative shape
description scheme, which is defined by the 4-tuple
(P, ∗, C, A) as follows:
• P is a set of primitive shapes, which is the set of
the regular and non-regular polygonal shapes described
by the qualitative shape description theory presented in
Museros and Escrig [2004];
• ∗ is a set of functions/operations, called shape operators,
which in this case is the qualitative juxtaposition
operator, named +q . In order to juxtapose two shapes it
is necessary to indicate the related edges in +q , and it is
indicated using the following notation: A(i) +q B(m),
where A, B are shapes to be juxtaposed, and i and m
indicate the first vertices to be considered (clockwise) in
the juxtaposition operation;
• C is a set of production rules, which specifies how the
shape operators are to be used to construct new shapes
from the already existing shapes. This set is defined
by a set of Extended Backus-Naur Form (EBNF) production
rules and a set of tables that can be found in Museros et
al. [2011, 2010]; and,
Figure 3: Example of choice array in Levine et al. [1999].
• A is a set of explicit axioms, which specifies conditions
that each constructed shape must satisfy. In a sense, A is
a set of constraints or restrictions. In a shape description
scheme, the set A may or may not be defined. In this case,
the restrictions defined are:
– It is not possible to overlap shapes when computing
+q .
– The shapes considered as operands are only simply
connected and closed 2D regions.
– In +q it is necessary to specify the edges involved
in the operation, and these edges must have
similar lengths.
– When considering objects F1 and F2 , the vertices
in object F3 , resulting from juxtaposing F1 and F2
(F3 = F1 +q F2 ), are defined as follows:
Figure 4: Objects to use in order to build the stimuli shown in
Figure 3.
The QSDJux approach has been implemented in an
application where, given two images each containing one object,
the qualitative shape description of the object in
each image is calculated first and then the qualitative shape
description of the object obtained by juxtaposing
both objects is obtained. The application used in the selected
proof-of-concept is presented in Figure 5.
Figure 6: Screenshot of the prototype application of QSDJux
showing the qualitative descriptions of the objects involved
in the juxtaposition and the final qualitative shape description
obtained, but selecting a different pair of equal edges.
Figure 5: Screenshot of the prototype application of QSDJux
showing the qualitative descriptions of the objects involved
in the juxtaposition and the final qualitative shape description
obtained for the selected proof-of-concept.
In the bound i + m − 2, i is the number of vertices of the first
object, m is the number of vertices of the second object, and 2 is
the number of vertices that disappear in the juxtaposition. If the
way the QSDJux approach juxtaposes shapes were the way humans do
it (a deeper study is needed to affirm that), then
stimuli a), b) and c) presented in Figure 3 would be discarded
because they exceed the maximal number of possible vertices
(in this case 4) and the solution would be obvious. However,
there are more complicated cases in the test study by Levine
et al. [1999] in which discarding objects by the maximal
number of vertices would not be so effective.
Moreover, human beings, and especially children, who
usually have a less categorical and more creative way of
thinking, are not constrained when juxtaposing shapes. Thus,
they can try any possible way of juxtaposing the two objects in
Figure 4. Therefore, for the QSDJux approach to reflect
all the possible juxtapositions of two objects, it should be
extended to also define juxtapositions using:
(i) only one vertex (see Figure 7), where the maximal
number of vertices of the resulting object is i + m;
(ii) one vertex and one of the edges of the object (see
Figure 8), where the maximal number of vertices of the
resulting object is i + m − 1;
(iii) edges of different sizes (see Figure 9), where the
maximal number of vertices of the resulting object is
i + m. Note that in the second example of Figure 9,
one of the objects has been scaled and, in this way, the
result is very similar to one of the choices of the array in
Figure 3. If QSDJux could identify that, it would be able to
explain the actions to carry out to build object (c) in
Figure 3.
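The vertex-count bounds quoted in the text (i + m − 2 for edges of similar length, i + m, i + m − 1 and i + m for the three extensions) can be gathered into a small filter that discards choices with too many vertices. The sketch below is illustrative only: the dictionary keys, the function names and the vertex counts used in the usage note are hypothetical, and the i + m − 2 bound is the one the text states for convex objects.

```python
# Number of vertices that disappear for each kind of contact,
# as stated in the text (key names are illustrative).
VERTICES_LOST = {
    "similar_edges": 2,     # current QSDJux: i + m - 2
    "one_vertex": 0,        # extension (i):   i + m
    "vertex_and_edge": 1,   # extension (ii):  i + m - 1
    "different_edges": 0,   # extension (iii): i + m
}


def max_vertices(i, m, contact="similar_edges"):
    """Upper bound on the number of vertices of the juxtaposed object."""
    return i + m - VERTICES_LOST[contact]


def discard_by_vertex_count(choices, i, m, contact="similar_edges"):
    """Keep only the choices whose vertex count does not exceed the bound.

    choices maps a choice label to its number of vertices.
    """
    bound = max_vertices(i, m, contact)
    return {name: n for name, n in choices.items() if n <= bound}
```

For instance, with two three-vertex objects the bound for similar edges is 4, so a hypothetical choice array `{"a": 5, "b": 6, "c": 5, "d": 4}` is reduced to `{"d": 4}`. Under the extended contacts the bound grows and fewer choices can be discarded this way.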
We have tested the QSDJux application using the objects in
Figure 4 in different orientations in order to find out whether the
QSDJux scheme is able to juxtapose two objects even if they
have undergone a direct translation, a diagonal translation,
a direct rotation or a diagonal rotation (see Figure 1 for
examples of these transformations). In all cases the system
was able to juxtapose the objects.
The challenging aspect in QSDJux is to find exactly which
edges to juxtapose. Once these edges are detected,
the resulting object corresponds to one of the objects shown
in the choice array. Note that sometimes it is possible to
juxtapose the objects using distinct pairs of equal edges.
Thus, all the combinations must be computed until the system
finds the juxtaposition that produces one of the objects in the
choice array (see for example Figure 6).
4 Discussion
To test the performance of the computational QSDJux
approach, the juxtaposing edge must not be provided a priori.
Instead, all the possibilities of juxtaposing the two objects must
be tried and a shape similarity method (e.g. the one defined
by Falomir et al. [2013]) must be used to identify which
juxtaposing action corresponds to one of the objects in the
choice array. This is the usual computational procedure to
follow.
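The exhaustive procedure just described can be sketched as a loop over all pairs of edges. This is a minimal sketch under assumptions, not the prototype's code: `solve_choice` is a hypothetical name, and the juxtaposition operator, the similarity measure and the edge-compatibility test are passed in as functions because their real definitions (Museros et al. [2011, 2010]; Falomir et al. [2013]) are not reproduced here.

```python
from itertools import product


def solve_choice(f1, f2, choices, juxtapose, similarity,
                 edges_compatible=lambda f1, i, f2, m: True):
    """Try every edge pair and return the best-matching choice.

    f1, f2           : qualitative descriptions (sequences of vertices).
    choices          : dict mapping a label to a choice description.
    juxtapose        : function (f1, f2, i, m) -> description of F1(i) +q F2(m).
    similarity       : function (desc_a, desc_b) -> score, higher is closer.
    edges_compatible : optional filter on edge pairs (assumed, e.g. similar length).

    Returns (score, label, i, m) or None if no edge pair is compatible.
    """
    best = None
    for i, m in product(range(len(f1)), range(len(f2))):
        if not edges_compatible(f1, i, f2, m):
            continue
        candidate = juxtapose(f1, f2, i, m)
        for label, description in choices.items():
            score = similarity(candidate, description)
            if best is None or score > best[0]:
                best = (score, label, i, m)
    return best
```

Returning the edge indices together with the label keeps track of which juxtaposing action produced the selected object, which is what the test requires the system to explain.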
Another, more cognitive procedure to select the object
from the choice array is to consider the maximal number
of vertices that can result from the juxtaposing
operation. In QSDJux, as the juxtaposition is done using
edges of similar length, if the two objects are convex, the
maximal number of vertices of the resulting object is
i + m − 2.
5 Conclusions and Future Work
The proof-of-concept presented in this paper showed us
that the QSDJux scheme may be used to automatically
solve spatial transformation tests similar to those provided
in Levine et al. [1999], since the QSDJux approach is
able to juxtapose objects even if they have undergone a
direct translation, a diagonal translation, a direct rotation
or a diagonal rotation. This is due to the cognitive
properties inherited from the Qualitative Shape Descriptor used
[Museros and Escrig, 2004; Falomir et al., 2013].
The critical aspect of the QSDJux approach is that the edges
of juxtaposition must be of equal size and must also
be indicated to the system by the user. Therefore, there is
still work to be done in order to carry out the test completely
automatically. First of all, the code implementing the scheme
has to be embedded in a more general system where all
the edge juxtaposition combinations are generated automatically,
without the need for a user to enter the edges involved in
the operation. Here, we have identified the first challenge
to face: determining whether two edges are compatible
for being juxtaposed. Then, the resulting juxtaposed
objects must be compared with the objects in the choice array
automatically. This can be done by using a similarity measure
between qualitative shape descriptors. Here, we have
identified the second challenge, since the shape similarity
method defined by Falomir et al. [2013] corresponds to
an extension of the qualitative shape description used by
QSDJux, which was created to avoid ambiguities in shape
description as explained in Falomir et al. [2008]. Therefore, an
extension of QSDJux to include the improved qualitative
shape descriptor is needed.
Regarding the compatibility of juxtaposition of two edges,
the third challenge is to extend the approach to calculate
juxtapositions of two objects by: (i) only one vertex; (ii) one
vertex and one of the edges of the object; and (iii) edges of
different sizes.
As future work, we intend to tackle the three challenges
mentioned above and then to use the QSDJux approach
to carry out automatic tests and compare them to people’s
performance.
Figure 7: Example of juxtaposition only using one vertex.
Figure 8: Example of juxtaposition using one vertex and one
edge of the objects.
Figure 9: Examples of juxtaposition using edges with no
similar length. Note that in the second example, one of the
objects has been scaled and, in this way, it is very similar to
one of the choices of the array in Figure 3.
Acknowledgments
Dr.-Ing. Zoe Falomir gratefully acknowledges the project
COGNITIVE-AMI (GA 328763) funded by the European
Commission through FP7 Marie Curie IEF actions and the
project Cognitive Qualitative Descriptions and Applications
(CogQDA) funded by the Universität Bremen through the
04-Independent Projects for Postdocs action. The support
by the Bremen Spatial Cognition Centre (BSCC,
http://bscc.spatial-cognition.de/) and the Spatial
Intelligent Learning Centre (SILC) is also acknowledged.
Dr. Luis Gonzalez-Abril acknowledges the funding by
the Spanish Ministry of Economy and Competitiveness
HERMES (TIN2013-46801-C4-1-r) and the Andalusian
Regional Ministry of Economy (project SIMON TIC-8052).
Dr. Lledo Museros and Dr. Ismael Sanz acknowledge
the funding by the Spanish Ministry of Economy and
Competitiveness (project TIN2011-24147), Generalitat
Valenciana (project GVA/2013/135) and Universitat Jaume I
(project P11B2013-29).
References
Z. Falomir, J. Almazán, L. Museros, and M.T. Escrig.
Describing 2D objects by using qualitative models of
color and shape at a fine level of granularity. In
Proc. Spatial and Temporal Reasoning Workshop at
the 23rd American Association on Artificial Intelligence
(AAAI) Conference, ISBN: 978-1-57735-379-9, pages
7–15. AAAI-Association, Chicago, Illinois, USA, 2008.
Z. Falomir, L. Gonzalez-Abril, L. Museros, and J. A.
Ortega. Measures of similarity between objects based
on qualitative shape descriptions. Spatial Cognition &
Computation, 13(3):181–218, 2013.
S.C. Levine, J. Huttenlocher, A. Taylor, and A. Langrock.
Early sex differences in spatial skill. Developmental
psychology, 35(4):940, 1999.
L. Museros and M. T. Escrig. A qualitative theory for shape
representation and matching for design. In Proceedings
ECAI 2004, pages 858 – 862, 2004.
L. Museros, L. Gonzalez-Abril, F. Velasco, and Z. Falomir.
A pragmatic qualitative approach for juxtaposing
shapes. Journal of Universal Computer Science (J.UCS),
16(11):1410–1424, 2010.
L. Museros, Z. Falomir, L. Gonzalez-Abril, and F. Velasco.
A qualitative shape description scheme for generating new
manufactured shapes. In 25th International Workshop on
Qualitative Reasoning (QR), co-located at the 22nd Joint
International Conference on Artificial Intelligence (IJCAI),
pages 116 – 124, Barcelona, Spain, July 2011.
A big picture of distributed computing and storage technologies
Damián Fernández Cerero, Alejandro Fernández-Montes, Juan Antonio Ortega
Departamento de Lenguajes y Sistemas Informáticos
Av. Reina Mercedes s/n, 41012 Sevilla
[email protected], [email protected], [email protected]
Abstract
With the consolidation of the cloud computing paradigm and the growing use of cloud-hosted services, the computing and storage needs of data centres are growing at an ever-increasing rate, and with them new technologies keep appearing. In this work we compare the components of the big data ecosystem, as well as the trends and problems to be addressed in the near future.
1. Introduction
Owing to the rising demand for services provided under the cloud computing paradigm, the massive use of smartphones consuming software as a service, the widespread storage of our files on online platforms and the emergence of new computing paradigms such as the internet of things or the internet of everything, the need for high computing and storage capacity able to serve millions of users in real time is constantly growing.
This evolution in the consumption pattern of software managed and served from huge data centres has forced the so-called big data ecosystem to develop in a short period of time in order to cover the specific needs of the different scenarios, thereby creating a landscape that is poorly standardised, heterogeneous and hard to manage.
In this work we compare the current ecosystem and technologies in the field of distributed computing, the evolution and development of the needs they satisfy, as well as the trends and problems to be addressed in the near future.
2. Problem analysis
The large data centres that form the core of internet applications are composed of hundreds or thousands of nodes that, at the logical level, must operate as a single entity, putting all their operational capacity at the disposal of the operators of these applications.
However, the workload that these data centres serve is very diverse, supporting paradigms as different as:
• Software as a Service: users consume a software service in the cloud that is transparent to them. GMail, Whatsapp and Facebook are examples of this paradigm.
• Platform as a Service: users consume an online service through which they can access application execution environments transparently. Google App Engine is an example of this paradigm.
• Infrastructure as a Service: the service provider offers the user a computer as a service, on which any environment can be installed transparently. Amazon EC2 and JBoss OpenShift are examples of this paradigm.
Because the operational needs, conditions and nature of these workloads are so different, the requirements and constraints of each of them cannot be satisfied with a single technology stack.
To cover these requirements, a range of technologies has emerged in recent months and years, focused on solving very specific problems or on improving existing solutions in very particular situations. This has widened the range of technological options and complicated the operation of the different technology stacks, owing to the large number of possible combinations.
With the aim of alleviating this lack of standardisation in solving problems shared by many data centre operators, several open source foundations, such as the Apache Software Foundation, and industry consortia are working to homogenise the solutions and propose standard stacks for different problems, thus easing their deployment, understanding and adoption.
However, needs and new paradigms seem to develop faster than these standardisation efforts, so new technologies emerge to fill the market gaps created in these new niches, further aggravating the situation described above.
To understand this scenario, in this work we explain the concepts underlying current technologies so that they can be grouped into categories according to their approach and the solution they provide, and we also outline the trends in the evolution of the big data ecosystem and the unmet needs that may arise in the near future.
3. Logical structure of a data centre
To understand the technological solutions that appear on the market, we first need to know how a data centre is logically divided and structured, and how its parts interact and collaborate with one another, yielding a set of pieces that make up the puzzle solving the various computing and storage needs of data centre operators. We therefore describe which part of the big data problem each of the divisions or layers solves.
Although these divisions may vary depending on the point of view used for the categorisation, from an operational perspective the data centre can be divided into at least the following main layers:
• Distributed file systems: this layer contains the solutions responsible for storing and serving the data that the data centre hosts and needs at run time. Examples of such distributed file systems include GFS [Ghemawat et al., 2003] and its open source implementation HDFS [Shvachko et al., 2010], QFS [Ovsiannikov et al., 2013] and Tachyon [Li et al., 2013], among others.
• Resource negotiators: this layer contains the solutions responsible for assigning computational resources such as CPUs or RAM to the different tasks. Examples of such resource negotiators include Apache YARN [Vavilapalli et al., 2013], Apache Mesos [Hindman et al., 2011] and Google Omega [Schwarzkopf et al., 2013], among others.
• Execution engines: this layer contains the solutions responsible for executing the different tasks and, therefore, for satisfying the computing needs together with the distributed file system and the resource negotiator. Examples of such execution engines include Google MapReduce [Dean and Ghemawat, 2008] and its open source implementation, Apache MapReduce; Google Dremel [Melnik et al., 2010] and its open source implementations Apache Drill [Hausenblas and Nadeau, 2013] and Cloudera Impala [Kornacker and Erickson, 2012]; and Google Pregel [Malewicz et al., 2010] and its open source implementation Apache Giraph [Ching, 2013], among others.
• Databases: this layer contains the solutions responsible for offering real-time random access to small parts of the stored information, in contrast to what execution engines offer, namely processing large amounts of information sequentially. Examples of such distributed databases include Google BigTable [Chang et al., 2008] and its open source implementation, Apache HBase [Harter et al., 2014]; Cassandra and a wide variety of NoSQL solutions can also be found.
• Coordinators: this layer contains the solutions responsible for maintaining the correct state and consistency among the multiple replicas of the data, the intermediate states of the tasks, etc. Examples of such coordination solutions include Google Chubby [Burrows, 2006] and Apache ZooKeeper [Hunt et al., 2010], among others.
Taking into account the particularities of the different technologies that can be used within each of the logical layers of our data centre, we can assemble different technology stacks that meet the requirements of the different services our data centre has to support.
References
[Burrows, 2006] Mike Burrows. The chubby lock service for loosely-coupled distributed systems. In Proceedings of the 7th symposium on Operating systems design and implementation, pages 335–350. USENIX Association, 2006.
[Chang et al., 2008] Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C Hsieh, Deborah A Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E Gruber. Bigtable: A distributed storage system for structured data. ACM Transactions on Computer Systems (TOCS), 26(2):4, 2008.
[Ching, 2013] Avery Ching. Scaling apache giraph to a trillion edges. Facebook Engineering blog, 2013.
[Dean and Ghemawat, 2008] Jeffrey Dean and Sanjay Ghemawat. Mapreduce: simplified data processing on large clusters. Communications of the ACM, 51(1):107–113, 2008.
[Ghemawat et al., 2003] Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. The google file system. In ACM SIGOPS operating systems review, volume 37, pages 29–43. ACM, 2003.
[Harter et al., 2014] Tyler Harter, Dhruba Borthakur, Siying Dong, Amitanand S Aiyer, Liyin Tang, Andrea C Arpaci-Dusseau, and Remzi H Arpaci-Dusseau. Analysis of hdfs under hbase: a facebook messages case study. In FAST, volume 14, page 12th, 2014.
[Hausenblas and Nadeau, 2013] Michael Hausenblas and Jacques Nadeau. Apache drill: interactive ad-hoc analysis at scale. Big Data, 1(2):100–104, 2013.
[Hindman et al., 2011] Benjamin Hindman, Andy Konwinski, Matei Zaharia, Ali Ghodsi, Anthony D Joseph, Randy H Katz, Scott Shenker, and Ion Stoica. Mesos: A platform for fine-grained resource sharing in the data center. In NSDI, volume 11, pages 22–22, 2011.
[Hunt et al., 2010] Patrick Hunt, Mahadev Konar, Flavio Paiva Junqueira, and Benjamin Reed. Zookeeper: Wait-free coordination for internet-scale systems. In USENIX Annual Technical Conference, volume 8, page 9, 2010.
[Kornacker and Erickson, 2012] Marcel Kornacker and Justin Erickson. Cloudera impala: Real time queries in apache hadoop, for real. http://blog.cloudera.com/blog/2012/10/cloudera-impala-real-time-queries-in-apache-hadoop-for-real, 2012.
[Li et al., 2013] Haoyuan Li, Ali Ghodsi, Matei Zaharia,
Eric Baldeschwieler, Scott Shenker, and Ion Stoica. Tachyon: Memory throughput i/o for cluster computing frameworks. memory, 18:1, 2013.
[Malewicz et al., 2010] Grzegorz Malewicz, Matthew H
Austern, Aart JC Bik, James C Dehnert, Ilan Horn, Naty
Leiser, and Grzegorz Czajkowski. Pregel: a system for
large-scale graph processing. In Proceedings of the 2010
ACM SIGMOD International Conference on Management
of data, pages 135–146. ACM, 2010.
[Melnik et al., 2010] Sergey Melnik, Andrey Gubarev,
Jing Jing Long, Geoffrey Romer, Shiva Shivakumar,
Matt Tolton, and Theo Vassilakis. Dremel: interactive
analysis of web-scale datasets. Proceedings of the VLDB
Endowment, 3(1-2):330–339, 2010.
[Ovsiannikov et al., 2013] Michael Ovsiannikov, Silvius
Rus, Damian Reeves, Paul Sutter, Sriram Rao, and Jim
Kelly. The quantcast file system. Proceedings of the
VLDB Endowment, 6(11):1092–1101, 2013.
[Schwarzkopf et al., 2013] Malte Schwarzkopf, Andy Konwinski, Michael Abd-El-Malek, and John Wilkes. Omega: flexible, scalable schedulers for large compute clusters.
In Proceedings of the 8th ACM European Conference on
Computer Systems, pages 351–364. ACM, 2013.
[Shvachko et al., 2010] Konstantin Shvachko, Hairong
Kuang, Sanjay Radia, and Robert Chansler. The hadoop
distributed file system. In Mass Storage Systems and
Technologies (MSST), 2010 IEEE 26th Symposium on,
pages 1–10. IEEE, 2010.
[Vavilapalli et al., 2013] Vinod Kumar Vavilapalli, Arun C
Murthy, Chris Douglas, Sharad Agarwal, Mahadev Konar,
Robert Evans, Thomas Graves, Jason Lowe, Hitesh Shah,
Siddharth Seth, et al. Apache hadoop yarn: Yet another resource negotiator. In Proceedings of the 4th annual Symposium on Cloud Computing, page 5. ACM, 2013.
A multidimensional expertise recommender tool
Germán Sánchez-Hernández
ESADE–URL
Sant Cugat del Vallès
Jennifer Nguyen
ESAII–UPC
Barcelona
Núria Agell
ESADE–URL
Sant Cugat del Vallès
[email protected]
[email protected]
[email protected]
Abstract
In day-to-day activities, people in organisations face problems and look for creative new ideas and solutions. Reducing the constraints on communication and knowledge sharing in a globally distributed workforce facilitates the workflow. People finder systems are one of the main solutions in this context for reducing these constraints, leading to time and cost savings. Finding expertise efficiently helps organisations to unlock knowledge within the enterprise, solve problems and identify collaborators, fostering interaction among users with different backgrounds, opinions and levels of expertise. In this work we describe an expertise recommender system based on a fuzzy OWA operator for ranking candidates in expertise search.
1 Introduction
For global organisations, finding an expert for unlocking knowledge within the enterprise, solving problems and identifying collaborators can be challenging, as experts are dispersed and vary in their level of knowledge of a topic. This task can become overwhelming for new employees looking for assistance, project managers looking to put together project teams, or employees looking for advice. Expertise Recommender Systems (ERS) can help people in organisations to find people who have expertise in a specific area. In this work we describe the architecture of an ERS that recommends people based on an appropriate mixing and an optimal matching of the characteristics of the candidates and the preferences of the user. This ERS has been developed within project COLLAGE (http://projectcollage.eu), a European-sponsored enterprise social network implementation.
2 State of the art
ERS, also called Expert Finding Systems (EFS), Expertise Location Systems (ELS) [Maybury, 2007], or expertise retrieval, connect people to areas of expertise [Balog et al., 2012]. To date, several ERSs have been developed. They focus on finding the expert with the “right level of expertise” rather than “the right expert”. The ER described in this work tries to fill this gap by considering additional information regarding secondary skills not directly related to the areas of knowledge of the candidates (subskills), their proximity to the user requesting the recommendation and the current availability of the candidates. In the following, a list of existing expertise recommender systems that come as standard is provided, together with a description of their main characteristics.
• MITRE’s Expert Finder [Mattox et al., 1999] was created within the MITRE Corporation to identify experts within topic domains. The system ranks the experts by the number of times their names are associated with specific terms found in corporate documents, newsletters, communications and so forth. The user enters a keyword search and the system returns the top-ranked experts [Maybury, 2007]. The Expert Locator, implemented in MITRE Corporation, retrieves information from several activity spaces. It applies weights to the documents at the activity space, evidence type, and subactivity space evidence levels [D’Amore, 2008].
• NASA POPS [Grove and Schain, 2008] aggregates information from multiple databases by leveraging Resource Description Framework (RDF) as the exchange and modelling format and Semantic Web. Users can search by organisation, project, and/or competency. The system displays a list of people matching these attributes along with the contact information and social network connecting the candidate and the user.
• IBM SmallBlue [Lin et al., 2008; 2009] uses outbound emails and AIM chat to analyse the social network for “who knows what and whom”. It maps search strings to keywords, identifies candidates matching these keywords and ranks the results by relevance weighting and social network structure.
• INDURE FacFinder [Fang et al., 2008] collects information on faculty from university profiles, NSF awards, and faculty homepages. The system indexes each document in its entirety and applies weights to the proximity of terms. It considers the order of terms, the data source, and document rank when applying weights.
• StrangersRS [Guy et al., 2011] scores people based on their familiarity and similarity to one another. The system recommends people who have similar interests but are unfamiliar with each other.
Some of the analysed systems take into account more information than just the expertise of the candidates. However, this information is not treated in the most effective way: in some cases, it is just used to filter the recommendations; in other cases, it is aggregated by means of a weighted mean, requiring the user to define a set of parameters. Both approaches could result in a candidate being prematurely discarded due to one of the considered criteria, despite meeting most of the requirements of the user with high scores.
This work describes a system that recommends people by simultaneously taking into account not only the expertise of the candidates but also other relevant features. The system allows the user to select the requirements that best fit his/her interests. The recommendation is based on the information entered by the user, and allows him/her to select among the recommended experts from a generated ranking.
• Subskill or quality variables describe knowledge not directly related to the areas of knowledge included in the previous type of variables. These variables may differ according to the environment in which the ERS is being applied. In this work, we have considered specific tools that the candidate knows.
3
• Proximity information is used to measure the distance
between the user and the candidate, that is to say, the
ease of contact. This can be done by taking into account
the physical distance between the users or by considering the logical distance between the departments or institutions of the users involved, depending on the
use case considered. In this paper, we have stored the
relationships between departments in the form of a graph and
computed the distance between them in terms of the distance between their corresponding nodes.
• The Availability variable reports the current availability of
the candidate. A system for managing the availability of
each candidate must be designed. We have decided to
develop a system in which the initial availability of the
candidate is set by the candidate, with the maximum
value as default. The availability is decreased when a task
is assigned to a candidate and it tends towards the maximum
level over time.
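The two variables above can be illustrated with a short Python sketch. It is a sketch under assumptions rather than the system's implementation: the department graph is a plain adjacency dictionary, the distance is taken as the number of edges on a shortest path (computed by breadth-first search), and the linear recovery of availability with a hypothetical `recovery_per_hour` rate is only one possible way in which availability could tend towards its maximum over time.

```python
from collections import deque


def department_distance(graph, source, target):
    """Shortest path length (in edges) between two departments.

    graph maps each department to the departments it is related to.
    Returns None when the two departments are not connected.
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def availability_after(current, hours_elapsed,
                       recovery_per_hour=0.1, maximum=1.0):
    """Availability after some idle time (assumed linear recovery, capped)."""
    return min(maximum, current + recovery_per_hour * hours_elapsed)
```

For example, in a graph where department A is related to B and B to C, the distance between A and C is 2, and a candidate whose availability dropped to 0.9 is back at the maximum of 1.0 after five idle hours under the assumed rate.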
3 Architecture
The ER system introduced in this work is a content-based recommendation system based on the definition of candidates’
profiles and the user’s preferences when asking for a recommendation. The user can take advantage of this system to
know who “is good at what” (expertise) while also fulfilling
his/her other requirements such as experts’ subskills, proximity to the user and current availability. Once the possible
experts are selected, the system returns an ordered list ranked
according to the user’s preferences.
As displayed in Figure 1, the ER system can be considered
as an adaptation of the architecture ER-Arch for expertise recommender systems introduced by McDonald and Ackerman
[McDonald and Ackerman, 2000].
Following this architecture, the ERS comprises
four modules: profiling, identification, selection and interaction.
3.2 Identification module
The Identification module builds an initial list of candidates
eligible for recommendation. The definition of eligibility can
vary depending on the use case. In this work,
we have decided to remove candidates with no availability
(currently set to “do not disturb”).
3.3
Selection module
The Selection module is technically the more complex of the
described modules. It is responsible for analysing and ranking the list of initial candidates. This module consists of the
following tasks:
1. Assessing each candidate according to the user’s requirements. Each requirement set by the user must be evaluated on the profile of each candidate. The assessment of
such variables is computed in the interval [0, 1] in terms
of the distance between the required item and the profile
of the candidate, assessing a maximum value when the
required item is fully fulfilled.
Expertise and Subskill requirements are represented by
the name of the skill, being required expertise associated
with the required level. The assessment of a required
skill s is defined as follows:
3.1 Profiling module
The Profiling module is responsible for building and maintaining profiles to be used in the recommendation process.
Therefore, the main responsibility of this module is to translate data collected from the environment in which the system
is embedded into candidates profiles. Four types of variables
have been considered to characterise the profile of each candidate:
• Expertise variables represent the areas of knowledge of
the candidate useful to solve a problem or to answer to a
enquiry. This information can be considered both in an
explicit or an implicit way. In this work, collecting explicit expertise is related to the manual selection of topics that the candidate knows best. This can be stated by
the own candidate or by other people. On the other hand,
the collection of implicit information implies analysing
documents internal or external to the enterprise to find
key words. The work presented in this paper employs a
predefined list of skills.
min(ps , l)
,
(1)
l
where ps is the level of expertise of a candidate in skill
s and l is the level of expertise required for skill s.
When computing the assessment for a required subskill
s, only the existance of that skill is taken into account:
Ae (ps , l) =
Aq (ps ) = ps ,
15
(2)
Figure 1: Architecture of the ERS developed
where ps is 1 if the candidate has the subskill s and 0 if
not.
Proximity requirement can be stated by the user to be
high or low. Given the distance pd between the institutions of the candidate and the user, its assessment is
computed as follows:
ated with the OWA operator. The objective of the aggregation step is to combine a set of criteria in such a
way that the final aggregation output takes all the single
criteria into account [Dubois and Prade, 1985].
pd −md
Md −md
pd −md
Md −md
1−
if high proximity is required,
otherwise,
(3)
where md and Md stand for the minimum and maximum
distance between all institutions, used to normalise the
input distance.
Finally, a maximum availability is always required an
therefore its computation is as follows:
Ap (pd ) =
Aa (pa ) = pa ,
(4)
where pa is the current availability of the candidate.
2. Aggregating the assessments into an overall degree. The
ERS presented in this work employs an average operator to aggregate, for each candidate, the partial assessments obtained by analysing each requirement. A
lot of families of aggregation operators have been studied. Among them, the OWA operator proposed by Yager
[Yager, 1988] is the most widely used. One of the main
reasons to support this extensive use is that the OWA operator allows the implementation of the concept of fuzzy
majority in the aggregation phase by means of a fuzzy
linguistic quantifier [Zadeh, 1983], which indicates the
proportion of satisfied criteria ‘necessary for a good solution’ [Yager, 1996]. This is done by using the linguistic quantifier in the computation of the weights associ-
Definition 1 An OWA operator of dimension \(n\) is a mapping \(\Phi : \mathbb{R}^n \to \mathbb{R}\), which has a set of weights \(W = (w_1, \dots, w_n)^T\) associated with it so that \(w_i \in [0, 1]\) and \(\sum_{i=1}^{n} w_i = 1\),
\[ \Phi(a_1, \dots, a_n) = \sum_{i=1}^{n} w_i \cdot a_{\sigma(i)}, \tag{5} \]
where \(\sigma\) is a permutation function such that \(a_{\sigma(i)}\) is the \(i\)-th highest value in the set \(\{a_1, \dots, a_n\}\).
Definition 2 Given a function \(Q : [0, 1] \to [0, 1]\) such that \(Q(0) = 0\), \(Q(1) = 1\) and if \(x > y\) then \(Q(x) \ge Q(y)\), an OWA aggregation operator guided by \(Q\) is given as [Yager, 1988]:
\[ \Phi(a_1, \dots, a_n) = \sum_{i=1}^{n} w_i \cdot a_{\sigma(i)}, \tag{6} \]
where \(\sigma : \{1, \dots, n\} \to \{1, \dots, n\}\) is a permutation such that \(a_{\sigma(i)} \ge a_{\sigma(i+1)}\), \(\forall i = 1, \dots, n - 1\), i.e., \(a_{\sigma(i)}\) is the \(i\)-th largest value in the set \(\{a_1, \dots, a_n\}\); and
\[ w_i = Q\left(\frac{i}{n}\right) - Q\left(\frac{i-1}{n}\right), \quad i = 1, \dots, n. \tag{7} \]
That is to say, the computation of the weights \(W\) to be used in the aggregation by means of the OWA operator is guided by the fuzzy linguistic quantifier ‘most of’, represented via a Regular Increasing Monotone (RIM) function \(Q\). In most cases, the function considered will be \(Q(r) = r^{1/2}\).
3. Ranking and selecting candidates. The considered candidates are ranked according to the overall degree obtained from the previous task.
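The three tasks of the Selection module can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: it assumes the quantifier Q(r) = r^(1/2) mentioned above, one partial assessment per requirement, and a proximity assessment that is inverted when high proximity is required. All function names, candidate names and input values are hypothetical.

```python
import math


def assess_expertise(p_s, level):
    """Eq. (1): min(p_s, l) / l, for a required level l > 0."""
    return min(p_s, level) / level


def assess_subskill(has_subskill):
    """Eq. (2): 1 if the candidate has the subskill, 0 otherwise."""
    return 1.0 if has_subskill else 0.0


def assess_proximity(p_d, m_d, M_d, high=True):
    """Eq. (3): normalised distance, inverted when high proximity is required."""
    normalised = (p_d - m_d) / (M_d - m_d)
    return 1.0 - normalised if high else normalised


def owa_weights(n, quantifier=math.sqrt):
    """Eq. (7): w_i = Q(i/n) - Q((i-1)/n) for i = 1..n."""
    return [quantifier(i / n) - quantifier((i - 1) / n) for i in range(1, n + 1)]


def owa(values, quantifier=math.sqrt):
    """Eq. (6): weighted sum of the values sorted in descending order."""
    ordered = sorted(values, reverse=True)
    weights = owa_weights(len(ordered), quantifier)
    return sum(w * a for w, a in zip(weights, ordered))


def rank_candidates(assessments, quantifier=math.sqrt):
    """Task 3: candidate names sorted by decreasing overall degree."""
    scored = {name: owa(vals, quantifier) for name, vals in assessments.items()}
    return sorted(scored, key=scored.get, reverse=True)


# Hypothetical partial assessments: expertise, subskill, proximity, availability.
candidates = {
    "ana": [assess_expertise(3, 4), assess_subskill(True),
            assess_proximity(2, 1, 5), 0.5],
    "ben": [assess_expertise(4, 4), assess_subskill(False),
            assess_proximity(5, 1, 5), 1.0],
}
ranking = rank_candidates(candidates)
```

Because the weights of Eq. (7) telescope, they always sum to Q(1) - Q(0) = 1, whatever RIM quantifier is passed in.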
3.4 Interaction module
The Interaction module provides the user with the necessary tools to define the recommendation requirements. The main task of this module is to translate the request made by the user into specific requirements that the ERS can process. Firstly, expertise and subskill requirements are collected in the same way. If the user explicitly chooses which expertise or subskills are required, the system directly creates a requirement for each of them. Otherwise, if the user asks the system for a recommendation with the same expertise or subskills as him/her, the system must analyse the profile of the user and take note of his/her expertise or subskills. Secondly, the user can select whether s/he prefers recommendations of people close to or far from his/her department or institution. S/he can also disable this item if the proximity of the candidates does not matter. In the first case, the distance between the candidate and the user is analysed by considering the graphs (both the social network and the departments’ graph). Conversely, the second case considers the inverse distance, as people far away have low proximity. Finally, availability is always set as one of the requirements and it is preferred to be high.
4 Conclusion and Future Research
The ERS presented in this work focuses on finding “the right person”, who will provide the user not only with the right level of expertise but also with the right fit for his/her needs. This work details the architecture of such an ERS by describing all the modules that are part of the system. The ERS developed carries out the recommendation by requesting a minimum set of inputs from the user, which enhances its usability.
The proposed variables to describe a candidate in this context are derived from the literature on expertise recommender systems. However, it should be recognised that there may be additional variables related to specific use cases and to their availability which can further enhance the matching between user requirements and the characteristics of the recommended candidates. Future research may include studies of different use cases to define the proper set of variables for each one. Moreover, a new user interface is being designed to improve usability and the readability of the results.
Acknowledgements
The research reported in this paper is partially supported by the SENSORIAL Research Project (TIN2010-20966-C0201), funded by the Spanish Ministry of Science and Information Technology, and by the European Commission funded project COLLAGE: Creativity in Learning through Social Computing and Game Mechanics in the Enterprise (GA318536), 2012-15.
References
[Balog et al., 2012] Krisztian Balog, Yi Fang, Maarten de Rijke, Pavel Serdyukov, and Luo Si. Expertise retrieval. Foundations and Trends in Information Retrieval, 6(2-3):127–256, 2012.
[D’Amore, 2008] Raymond J D’Amore. Expert finding in disparate environments. PhD thesis, University of Sheffield, 2008.
[Dubois and Prade, 1985] Didier Dubois and Henri Prade. A review of fuzzy set aggregation connectives. Information Sciences, 36(1):85–121, 1985.
[Fang et al., 2008] Yi Fang, Luo Si, and Aditya Mathur. FacFinder: Search for Expertise in Academic Institutions. Technical Report SERC-TR-294, Department of Computer Science, Purdue University, 2008.
[Grove and Schain, 2008] Michael Grove and Andrew Schain. POPS – NASA’s expertise location service powered by semantic web technologies. http://www.w3.org/2001/sw/sweo/public/UseCases/Nasa/, February 2008. Accessed February 6, 2015.
[Guy et al., 2011] Ido Guy, Sigalit Ur, Inbal Ronen, Adam Perer, and Michal Jacovi. Do you want to know?: recommending strangers in the enterprise. In Proceedings of the ACM 2011 conference on Computer supported cooperative work, pages 285–294. ACM, 2011.
[Lin et al., 2008] Ching-Yung Lin, Kate Ehrlich, Vicky Griffiths-Fisher, and Christopher Desforges. Smallblue: People mining for expertise search. MultiMedia, IEEE, 15(1):78–84, 2008.
[Lin et al., 2009] Ching-Yung Lin, Nan Cao, Shi Xia Liu, Spiros Papadimitriou, Jimeng Sun, and Xifeng Yan. Smallblue: Social network analysis for expertise search and collective intelligence. In IEEE 25th International Conference on Data Engineering, 2009. ICDE’09, pages 1483–1486. IEEE, 2009.
[Mattox et al., 1999] David Mattox, Mark T. Maybury, and Daryl Morey. Enterprise expert and knowledge discovery. In Hans-Jörg Bullinger and Jürgen Ziegler, editors, HCI (2), pages 303–307. Lawrence Erlbaum, 1999.
[Maybury, 2007] Mark T Maybury. Discovering distributed expertise. Regarding the “Intelligence” in Distributed Intelligent Systems, MITRE, 2007.
[McDonald and Ackerman, 2000] David W. McDonald and Mark S. Ackerman. Expertise recommender: a flexible recommendation system and architecture. In Proceedings of the 2000 ACM conference on Computer supported cooperative work, pages 231–240. ACM, 2000.
[Yager, 1988] Ronald R Yager. On ordered weighted averaging aggregation operators in multicriteria decisionmaking. Systems, Man and Cybernetics, IEEE Transactions on, 18(1):183–190, 1988.
[Yager, 1996] Ronald R Yager. Quantifier guided aggregation using OWA operators. International Journal of Intelligent Systems, 11(1):49–73, 1996.
[Zadeh, 1983] Lotfi A Zadeh. A computational approach to fuzzy quantifiers in natural languages. Computers & Mathematics with Applications, 9(1):149–184, 1983.
Tracking infrastructure for large crowd movement
Andreas Eberlei1, Manuel Herzberg1, Jonas Kaltenbach1, Dominik Ringgeler1,
Patrick Datko1, Wilhelm D. Scherz1, Tatjana Thimm2 and Ralf Seepold1
HTWG Konstanz
1 Faculty of Computer Science, 2 Faculty of Economics and Social Sciences
Brauneggerstr. 55, 78462 Konstanz (Germany) [email protected]
Abstract
Recognizing and understanding the movement patterns of tourists is very important for big cities and holiday destinations. Analyzing these movement patterns can yield important knowledge for tourism management experts, but understanding the different patterns is a difficult task. For this reason, the demand for an infrastructure that is able to record and process movement data is very high. In this paper we describe our approach to a system implementation that is able to collect movement data from a large number of tourists and to create, visualize and analyze different movement patterns.
Keywords: Tracking, logging, grouping
1 Motivation
Understanding how tourists travel is a difficult challenge. Tourism involves the movement of different people through time and space [McKercher et al., 2006]. One major aspect of tourism research is to examine the activities of different tourists in relation to the environment, weather conditions and other factors. Each tourist explores and moves between destinations in a different way, which results in a different movement pattern for each individual tourist [Lau and McKercher, 2006]. Movement patterns can be defined as dimensional changes of location of tourists under certain influences [Leung et al., 2012]. The analysis of these movement patterns provides information that can help to better understand tourist movement behavior. With this information it is possible to create better travel guides and other strategies for tourists. Knowledge of how tourists move through different areas has a significant influence on managing public transportation and infrastructure, traffic forecasting, and the planning of new attractions, restaurants and shopping facilities [Leung et al., 2012]. Based on these facts, the demand for an infrastructure that is able to track the movements of large crowds is very high.
2 Objectives
In our project we combine a quantitative and a qualitative approach. Via the Android application and GPS trackers, quantitative movement data is collected, and qualitative results from surveys will complement the quantitative approach.
The following research questions were derived from a literature review and website analysis:
• What are the movement patterns of tourists at Lake Constance?
• Is there a correlation between movement patterns and duration of stay?
• Is there a correlation between movement patterns and weather conditions or special days, holidays or different seasons?
• Are there differences between the movement patterns of tourists of different ages and genders?
• Which means of transportation do tourists use most to get around?
• What are the consequences for the development of tourism offers or packages?
3 State of the art
Tracking the paths of tourists, together with the technology behind it, has been a popular and promising area of research for more than a decade. Older works in this field of study date from 2004 and 2005, for example [O’Conner et al. 2005] or [Pavón et al. 2004]. While important steps were definitely made at that time, recent years have brought many new technological and social developments, which have influenced this field of research and offered it new and attractive possibilities [Weber & Bauder 2012].
As a result, the development and increased usage of smartphones has made it possible to track the routes of tourists more precisely and consistently [Edwards et al. 2010]. Hence, detectors and cameras, for example, have been replaced, since they are more circular and less detailed. While tracking the movement patterns of tourists has primarily been researched in urban areas, such as Paris [Bauder et al. 2014] or Hong Kong [Shoval et al. 2011], there are also works that focus on the travel destinations of tourists within a certain recreation area. These works can help managers in that field to better understand tourist movement patterns and thus allocate their resources more efficiently, e.g. when planning the road network [Smallwood et al., 2012].
In this research, we focus our studies on the Lake Constance area in Baden-Württemberg, Germany. There are already several mobile applications available to track travel data. Open GPS Tracker1 can track your position data and also draw it in real time using services like Google Maps. An Android application called GeoTracker2 makes it possible to share traveled routes with your friends. While there are multiple possibilities to track your route using an Android application, we think that none of these unites the key elements that we would like to use.
First, it is important to include many users; hence we provide an alternative to the Android application in the form of hardware GPS trackers for those who are not tech-savvy. Additionally, we not only make the travel route available on screen, but also create the possibility to automatically distinguish several modes of travel, e.g. by car or by bike. To reach an adequate number of users, it is important to make the application as attractive as possible. To achieve this goal we implement additional features, such as travel diaries. Finally, providing a web application that presents all of the collected travel data and lets us analyze it will help us to develop cluster and heat maps showing the most visited spots. With these data we can provide users with new features, such as ‘hot spots’ in cities, where many people meet or certain events take place, but also calm areas, which could be visited for recreation.
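The automatic distinction of travel modes and the detection of ‘hot spots’ could be approached along the following lines. This is only a sketch under stated assumptions, not the project's implementation: the mode is guessed from the average speed of a track using illustrative, uncalibrated thresholds, and hot spots are found by simple grid binning, which is a stand-in for a real clustering method. All function names and parameter values are hypothetical.

```python
import math
from collections import Counter

EARTH_RADIUS_M = 6371000.0  # mean Earth radius used by the haversine formula


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude pairs."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def average_speed_kmh(track):
    """Average speed of a track given as (timestamp_s, lat, lon) tuples."""
    distance = sum(
        haversine_m(a[1], a[2], b[1], b[2]) for a, b in zip(track, track[1:])
    )
    duration = track[-1][0] - track[0][0]
    return 0.0 if duration <= 0 else distance / duration * 3.6


def guess_mode(speed_kmh):
    """Illustrative thresholds only (assumed, not calibrated on real data)."""
    if speed_kmh < 7:
        return "walk"
    if speed_kmh < 30:
        return "bike"
    return "car"


def hot_spots(points, cell_deg=0.01, min_count=2):
    """Count (lat, lon) points per grid cell and keep the dense cells."""
    cells = Counter(
        (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        for lat, lon in points
    )
    return {cell: n for cell, n in cells.items() if n >= min_count}
```

In practice a speed-only rule would confuse, for instance, a car in slow traffic with a bicycle, so a production system would likely combine it with the mode the tourist selects in the application.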
4 Methodology
In our infrastructure we track the movement data of different tourists using a smartphone application and hardware GPS trackers. Movement data is stored on a server where tourism management experts can analyze it. The following section describes the methodology used.
4.1 Technology platform
The developed tracking application runs on Android smartphones and is responsible for several tasks. The primary task is to record the movement of the tourists; furthermore, the app includes several helpful features for the user. After registration, the tourist is able to use our application. Tourists can choose the means of transportation they are currently using. They can also search for hotels, restaurants and other points of interest, create travel diaries and access helpful information about the Lake Constance area.
In addition to the Android application, hardware GPS trackers can be used to capture the movements of tourists. GPS trackers can be assigned to tourists during their stay in the Lake Constance area. These trackers are easy to use and are able to record movement all day long. These GPS trackers store GPS data in raw format. The existing infrastructure can handle this data format well. Movement data from the GPS trackers can be imported manually into the database. Figure 1 illustrates our infrastructure.
Figure 1: Tracking infrastructure
In the backend area of the infrastructure we analyze the recorded movement data. We use a web service to visualize the traveled routes of tourists. Routes are displayed on a map and colored differently depending on the means of transportation the tourist has chosen. The web service is also responsible for displaying information to individual tourists. Tourism management experts are able to filter the recorded movement data. There are several filter options, for example age or chosen means of transportation. After applying these filters, the map shows only the matching tracks.
4.2 Tourist Tracking
The project combines a quantitative and a qualitative approach. Quantitative data is collected via the app and the loggers, and qualitative results from the survey will complement the quantitative approach. Thus the methodology follows the concept of triangulation of methods and data simultaneously.
5 Intermediate Results
So far, several goals have been achieved. The development of the tracking application is finished. The application is able to track movements and to send this information to the server. The application also includes all features described in Section 4. A basic server infrastructure has been developed as well. It includes the web service and the database where all information is stored. The web service is able to display information about the tourists who use our application, and it can visualize the routes covered by each tourist on a map. Furthermore, several basic filters can be applied to the recorded movement data. The GPS trackers described in Section 4 are functional and ready to be integrated into the existing infrastructure.
1 http://opengpstracker.org
2 https://play.google.com/store/apps/developer?id=Ilya+Bogdanovich&hl=en
6 Conclusion
At the current point in the project, all milestones have been achieved and all functionality planned so far has been integrated. Tracking the movement patterns of tourists has been researched for many years; however, the approaches often considered special circumstances, such as tracking travel routes in a national park [Smallwood et al., 2012].
In our work we take a more general approach by developing an Android application, which gives us access to a wide range of users. The server infrastructure and the possibility to select among different means of transportation have been implemented as well. Furthermore, the web interface for access to the backbone server is available and already provides different kinds of data filtering. Although our work is progressing steadily, many interesting objectives remain to be achieved in the future.
7 Future Work
In our future work we want to focus more on the backend-related tasks. Enriching our server infrastructure with more useful features is our main goal, which will also benefit the end users and tourism management experts. Making the GPS tracker data available to the server application is one of these aspects. To be able to draw conclusions from the visualization of the collected movement data in the web interface, we want to provide several filtering options. For example, comparing movement patterns across different seasons could be very useful. Even more fine-grained filters that only visualize movements in a certain period of time, e.g. holidays, can also be beneficial. Since routes can differ immensely with the weather, filtering for movements in sunny or stormy weather only can help to better understand the movements as a whole.
One of the key tasks of our future work will be the implementation of so-called ‘heat maps’. That is, we want to visualize spots in the web interface that show areas with a high density of people. To implement these maps, we need to use clustering techniques. Basically, that means collecting a large number of GPS coordinates and grouping those that are close to each other into one cluster. To do this we will use an existing technique, the ‘OPTICS’ algorithm. While this algorithm essentially performs the clustering of the given coordinates, it also offers further features. As seen in Figure 2, the ‘OPTICS’ algorithm can also use different density parameters for certain sub-areas. This can be useful, e.g. when computing clusters in urban as well as rural environments.
Figure 2: Clustering with OPTICS algorithm [Ankerst et al., 1999]
References
[Ankerst et al., 1999] Mihael Ankerst, Markus M. Breunig, Hans-Peter Kriegel, and Jörg Sander, OPTICS: ordering points to identify the clustering structure, SIGMOD Rec. 28, Pages 49-60, 2 June 1999, http://doi.acm.org/10.1145/304181.304187
[Asakura and Hato, 2004] Yasuo Asakura, Eiji Hato, Tracking survey for individual travel behavior using mobile communication instruments. Transportation Research Part C: Emerging Technologies, Volume 12, Issues 3–4, Pages 273-291, ISSN 0968-090X, June–August 2004, http://dx.doi.org/10.1016/j.trc.2004.07.010
[Bauder et al., 2014] Michael Bauder, Tim Freytag, Marie Gérardot, Exploring tourist mobility in Paris. A combined visitor survey and GPS tracking study, EspacesTemps.net, Objects, 17.02.2014, http://www.espacestemps.net/en/articles/analyserlesmobilites-touristiques-a-paris-en-combinantenquetevisiteurs-et-gps
[Edwards et al., 2010] Deborah Edwards, Tracey Dickson, Tony Griffin, Bruce Hayllar, Tracking the Urban Visitor: Methods for Examining Tourists’ Spatial Behaviour and Visual Representations, Cultural tourism research methods, Pages 104-115, 2010, ISBN 9781845935184, http://dx.doi.org/10.1079/9781845935184.0000
[Lau and McKercher, 2006] Gigi Lau, Bob McKercher, Understanding Tourist Movement Patterns in a Destination: A GIS Approach. Tourism and Hospitality Research, Volume 7 No. 1, Pages 39-49, November 2006, http://dx.doi.org/10.1057/palgrave.thr.6050027
[Leung et al., 2012] Xi Yu Leung, Fang Wang, Bihu Wu, Billy Bai, Kurt A. Stahura, Zhihua Xie, A Social Network Analysis of Overseas Tourist Movement Patterns in Beijing: the Impact of the Olympic Games. International Journal of Tourism Research, Volume 14, Pages 469–484, 2012, http://dx.doi.org/10.1002/jtr.876
[McKercher et al., 2006] Bob McKercher, Celia Wong, Gigi Lau. How tourists consume a destination. Journal of Business Research, Volume 59, Issue 5, Pages 647-652, May 2006, ISSN 0148-2963, http://dx.doi.org/10.1016/j.jbusres.2006.01.009
[O’Conner et al. 2005] A. O’Connor, A. Zerger, B. Itami, Geo-temporal tracking and analysis of tourist movement, Mathematics and Computers in Simulation, Volume 69, Issues 1–2, Pages 135-150, 20 June 2005, ISSN 0378-4754, http://dx.doi.org/10.1016/j.matcom.2005.02.036
[Pavón et al., 2004] Juan Pavón, Juan M. Corchado, Jorge J. Gómez-Sanz, and Luis F. Castillo Ossa, Mobile Tourist Guide Services with Software Agents, Mobility Aware Technologies and Applications, Lecture Notes in Computer Science, Volume 3284, Pages 322-330, 2004, http://dx.doi.org/10.1007/978-3-540-30178-3_31
[Shoval et al. 2011] Noam Shoval, Bob McKercher, Erica Ng, Amit Birenboim, Hotel location and tourist activity in cities, Annals of Tourism Research, Volume 38, Issue 4, Pages 1594-1612, October 2011, ISSN 0160-7383, http://dx.doi.org/10.1016
[Smallwood et al., 2012] Claire B. Smallwood, Lynnath E. Beckley, Susan A. Moore, An analysis of visitor movement patterns using travel networks in a large marine park, north-western Australia, Tourism Management, Volume 33, Issue 3, Pages 517-528, June 2012, ISSN 0261-5177, http://dx.doi.org/10.1016/j.tourman.2011.06.001
[Weber and Bauder 2012] Hans-Jörg L. Weber, Michael Bauder, Neue Methoden der Mobilitätsanalyse: Die Verbindung von GPS-Tracking mit quantitativen und qualitativen Methoden im Kontext des Tourismus, Raumforschung und Raumordnung, Volume 71, Issue 2, Pages 99-113, April 2013, http://dx.doi.org/10.1007/s13147-013-0218-y
ENGAGE: an Evidence-based Model of Engagement for Dementia
Giulia Perugia
Universitat Politècnica de Catalunya
Technical Research Centre for Dependency Care and Autonomous Living
Neapolis Building, Rambla de l'Exposició, 59-69
08800 Vilanova i la Geltrú. Barcelona. Spain
[email protected]
Marta Díaz Boladeras
[email protected]
Andreu Català Mallofré
[email protected]
Abstract
The increasingly ageing society is pushing scientists to find solutions to the issues posed by widespread diseases related to senescence. Dementia in particular is seen as an urgent problem to stem. Dementia is a neurodegenerative disorder that impairs functioning, cognition, mood and behaviour.
The importance of non-pharmacological treatments for improving the Quality of Life of people with dementia is a matter of fact for practitioners. Nonetheless, given the lack of appropriate tools to evaluate the outcomes of these treatments, their success is only rarely demonstrated. This issue goes along with another problem that gaming technologies for dementia (e.g. Serious Games) are facing: the need to evaluate their interactive effectiveness according to an appropriate rationale. We propose to use engagement as a unit of measure to evaluate the success of both non-pharmacological treatments and gaming technologies for dementia.
Engagement is a complex psychological state in which resources are mobilised to achieve a goal for intrinsic motives. Within this position paper, we present the state of the art on engagement and propose a definition of engagement for dementia based on Csikszentmihaly’s theory of flow and Tickle-Degnen and Rosenthal’s theory of rapport.
1 Introduction
Dementia is a neurodegenerative disorder that causes people to progressively lose their reasoning and planning abilities, producing cognitive (mnemonic, linguistic, attentional) and functional impairment (i.e. inability to dress and care for themselves), and also affecting orientation in time and space, mood and behaviour. Indeed, dementia brings about several disorders [Cummings et al., 1994], the most common being apathy, depression, agitation and anxiety [Robert et al., 2005]. In addition to this condition, as the body ages, mobility becomes constrained and perceptual abilities (e.g. sight, hearing) are greatly reduced.
The pervasive reduction of mobility and cognition, the sense of loss of oneself and of reference points, and the inability to carry out activities of daily living and to function independently constitute the basic condition of dementia and contribute considerably to damaging Quality of Life (QoL).
In such a context, engagement becomes crucial. Indeed, being engaged in something meaningful and rewarding, appropriate for a person’s cognitive level and motor resources, could improve psychological wellbeing [Hutson et al., 2011] and increase self-esteem [Benveniste et al., 2012], thus enhancing QoL [Banerjee et al., 2006].
2
To better express this concept of sociality, we borrowed the
notion of rapport [Tickle-Degnen and Rosenthal, 1990].
Rapport is the main condition for a good social interaction.
In more detail, it is ‘the dynamic structure of three
interrelating components: mutual attentiveness, positivity
and coordination’.
If Csikszentmihaly’s flow is mainly an experiential concept,
Tickle-Degnen and Rosenthal's rapport is also a behavioural
construct. Indeed, Tickle-Degnen and Rosenthal described
specific non-verbal correlates for rapport, such as head
nodding, eye-contact, mutual gaze, postural mirroring, turntaking cues and interactional synchrony. Our intention is to
study engagement in dementia through a triangulation of
behavioural, experiential and physiological measurements.
Thus, adding further levels of knowledge to the notions of
flow and rapport.
Research goals
The effectiveness of non-pharmacological treatments for
dementia is far from being conclusive. Indeed, albeit nonpharmacological treatments are deemed useful, several
Cochrane reviews report that their success is only rarely
demonstrated [Forbes et al., 2013; Vink and Birks, 2013].
This is mainly due to the wrong outcome measurements
chosen. Indeed, quite often cognitive and diagnostic
assessment tools (e.g. Mini-Mental State Examination,
Neuropsychiatric Inventory) are used to gauge the success
of non-pharmacological treatments and the use of such
measurements presupposes non-pharmacological treatments
to have a range of action they actually do not have.
The newborn business of gaming technologies for people
with dementia (e.g. Serious Games for Dementia - SG4D) is
facing more or less the same problem. Even in the field of
gaming technologies, where the use of the engagement
rationale to judge success is pervasive, when it comes to
dementia, a mix of non-specified behavioural measures,
cognitive assessment and time spent on the game is used to
evaluate involvement [McCallum and Boletsis, 2013].
We would like to measure the success of nonpharmacological treatments and gaming technologies for
people with dementia through the amount of engagement
they elicit, thus collocating their outcomes within the scope
of psychological wellbeing [Kitwood and Bredin, 1992].
A robust tool able to measure the engagement state of a
person with dementia could be crucial to demonstrate the
success of non-pharmacological treatments and helpful in
customising care and entertainment according to each
individual’s preferences.
3
4
Engagement defined
An Observational Model of Engagement (OME) in
dementia has been developed by Cohen-Mansfield et al.
[2010 and 2011]; engagement is here described as ‘the act
of being occupied or involved with an external stimulus’ and
is measured through four outcome variables (duration,
attention, attitude and refusal) referred to the behaviour of
an older adult with dementia with respect to a set of stimuli
he is presented with (e.g. live human social stimuli,
simulated social stimuli). The OME is interesting and
provided with psychometric validity, what we argue is the
decontextualised use of stimuli, which does not attain to the
reality of care facilities.
We would like to promote a different concept of
engagement for dementia: that of a complex
psychological state in which human resources (cognitive,
physical, emotional, social) are mobilised in pursuit of an
objective that one wants to attain for pure enjoyment.
Engagement is a state where tiredness is hardly perceived
since what is being done is meaningful and rewarding. This
concept of engagement is context-dependent and
multi-layered. Indeed, it can be understood only in the
context in which it appears (e.g. non-pharmacological
treatments) and it is made of different levels of
comprehension (physiological, emotional, social, etc.) that
only sometimes express themselves through observable
behavioural features.
4 Theoretical framework
The best-known theory of engagement is
Csikszentmihalyi’s theory of flow. The term flow refers to
the way interviewees named the experience of being
positively immersed in engaging activities [Nakamura and
Csikszentmihalyi, 2002].
The flow state is described as comprising a series of
features: intense and focused concentration, the merging of
awareness and action, the loss of self-consciousness, the
feeling of being in control of one’s own actions, the
distortion of the sense of time and the perception of an
activity as intrinsically rewarding.
Csikszentmihalyi’s flow was subsequently integrated into
the concept of optimal experience [Csikszentmihalyi and
LeFevre, 1989]. An optimal experience occurs when both the
opportunities for action and the skills in the situation are
very high; in such circumstances, the quality of experience
is likely to be very positive.
Csikszentmihalyi’s theory provides a well-defined
conceptual framework for engagement and we have
borrowed many notions from it. Nevertheless, what it
lacks is a dimension of sociality, which is crucial when it
comes to dementia. Indeed, we conceive engagement as a
shared experience of communication and reciprocal support.
5 Related work
A first attempt to implement the features of a psychosocial
model of engagement in technology was made by
Sidner et al. [2005], who described engagement as
‘the process by which two (or more) participants maintain
and end their perceived connection during interactions they
jointly undertake’. They studied human-human facial
tracking and implemented two gaze behaviours (i.e. mutual
facial gaze and directed gaze) on a penguin robot named
Mel. Results showed that, when gaze behaviours are
present, the robot’s attitude is considered more natural,
gaze coordination is enhanced and interaction time
increases.
Rich et al. [2010] conceived perceived connection in
human-human interaction as behaviourally embodied in
Directed Gaze, Mutual Facial Gaze, Adjacency Pair (e.g.
question-answer) and Backchannel (feedback to signal
message reception). They successfully implemented these
behaviours in a human-robot architecture,
enabling the robot to recognise human engagement through
gaze, head nods, head shakes and pointing gestures and to
return coherent behavioural feedback to the human using
the same set of gestures.
In the field of Human-Computer Interaction,
Bianchi-Berthouze [2008] produced a model of the relationship
between body movement and quality of engagement.
Bianchi-Berthouze borrowed Lazzaro’s definition of
engagement [2004] and considered engagement as a
composite of hard fun and easy fun, emotional (altered
state) and social experiences (person factor).
Testing different computer games based on different
interactive modalities, Bianchi-Berthouze noticed that, when
engagement arises, different movement patterns show up.
For instance, when participants were asked to play Guitar
Hero in a dual-pad controller condition, using only the
features of the game they could control with their hands,
several expressions of frustration appeared when they made a
mistake. In contrast, when participants were told how
to use the tilt sensor in the neck of the guitar to play the
game, guitar-player-like movements and expressions of
excitement emerged.
In line with these works, we would like to study
engagement as a psychological state with behavioural,
experiential and physiological correlates in order to inform
non-pharmacological treatments, gaming technologies and
technology-based activities in the form of guidelines for
design intervention, online recognition of behavioural and
physiological patterns and interaction strategies.
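As a purely illustrative sketch of what checking the agreement between a physiological and a behavioural correlate could look like, the Python snippet below computes a Pearson correlation between a skin conductance series and an observer rating. It is not the authors' analysis: the function name, the per-minute sampling and all numeric values are assumptions made up for the example.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equally long numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sx * sy)


# Made-up values for one session, one sample per minute (illustration only).
scl_microsiemens = [2.1, 2.4, 3.0, 3.3, 2.9]  # hypothetical skin conductance level
behaviour_score = [1, 2, 4, 5, 3]             # hypothetical observer rating

r = pearson(scl_microsiemens, behaviour_score)
print(f"correlation between SCL and behavioural score: {r:.2f}")
```

A value of r close to 1 would suggest that the two kinds of measure move together within a session; a real study would of course need many sessions and an appropriate statistical test.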
6 Research questions
In 2012, the special session ‘Measuring Engagement:
Affective and Social Cues in Interactive Media’ was set up
within the yearly conference ‘Measuring Behaviour’, openly
highlighting the importance of understanding engagement for
the design of interactive media. In this context, the research
questions posed were ‘How can we design and predict
engagement?’ and ‘How can we adapt a game to its users and
audience to increase and decrease engagement?’. A focus
was put on the idea that sensors could be thought of as
providers of input modalities for interaction (e.g. postures,
gestures, body movements, facial expressions and brain
activity).
In line with this, but with some limitations due to the target
profile we are working with, we are going to use a
triangulation of measures to gain an insight into the
engagement state of the person with dementia within a
certain activity (non-pharmacological treatments and
technology-based activities). We selected behavioural (e.g.
postures, proxemics, attitude towards people, attitude
towards activity, facial expressions), experiential (self- and
expert estimations) and physiological measures (Skin
Conductance Level and Mean Motor Activity measured
through actigraphy).
The listed physiological measures were chosen since there
is promising evidence on the ability of skin conductance
levels to capture affective valence and arousal and of
actigraphy to grasp motivational states in dementia [Treush
et al., 2015; David et al., 2012; Kuhlmei et al., 2013].
In this context, our main research questions are: ‘Which are
the behavioural correlates of engagement in dementia?’,
‘Which are the physiological correlates of engagement in
dementia patients? Can we identify patterns of arousal
(SCL) and motor activity (Actigraphy) during engagement
states?’ and ‘Are these physiological patterns consistent
with behavioural ones?’
7 Conclusions
In this position paper, we have presented a first
approximation to an evidence-based model of engagement
for dementia. First, we proposed two possible areas of
application of the model: the evaluation of
non-pharmacological treatments and the design of
technology-based gaming activities. Subsequently, we
reviewed the related work in both HRI and HCI. We
then clarified our definition of engagement and
described its theoretical framework, Csikszentmihalyi’s
theory of flow and Tickle-Degnen and Rosenthal’s theory of
rapport.
Finally, we gave an overview of the
triangulation of measurements we are going to use to
produce an evidence-based model of engagement and
set out the research questions we would like to answer.
The model we will build will be used to design new
interventions and new technologies for users with dementia,
to adapt online the behaviour of interactive technologies
according to specific behavioural and physiological states of
the user and to evaluate the quality of the interactive
experience.
Acknowledgments
This work was supported in part by the Erasmus Mundus
Joint Doctorate (EMJD) in Interactive and Cognitive
Environments (ICE), which is funded by Erasmus Mundus
under the FPA no. 2010-2012.
References
[Banerjee et al., 2006] Banerjee, S., Smith, S. C., Lamping,
D. L., Harwood, R. H., Foley, B., Smith, P., Murray, J.,
et al. (2006). Quality of life in dementia: more than just
cognition. An analysis of associations with quality of life
in dementia. Journal of neurology, neurosurgery and
psychiatry, 77(2), 146-148.
[Benveniste et al., 2012] Benveniste, S., Jouvelot, P., Pin,
B., & Péquignot, R. (2012). The MINWii project:
Renarcissization of patients suffering from Alzheimer’s
disease through video game-based music therapy.
Entertainment Computing, 3(4), 111-120.
[McCallum and Boletsis, 2013] McCallum, S., & Boletsis,
C. (2013). Dementia Games: a literature review of
dementia-related Serious Games. In Serious Games
Development and Applications (pp. 15-27). Springer
Berlin Heidelberg.
[Bianchi-Berthouze, 2008] Bianchi-Berthouze, N. (2008).
Body movement as a means to modulate engagement in
computer games. In Proc. Workshop on Whole Body
Interaction, HCI (Vol. 8).
[Nakamura and Csikszentmihalyi, 2002] Nakamura, J., &
Csikszentmihalyi, M. (2002). The concept of flow. The
handbook of positive psychology, 89-105.
[Cohen-Mansfield et al., 2009] Cohen-Mansfield, J.,
Dakheel-Ali, M., & Marx, M. S. (2009). Engagement in
persons with dementia: the concept and its measurement.
The American journal of geriatric psychiatry : official
journal of the American Association for Geriatric
Psychiatry, 17(4), 299-307.
[Rich et al., 2010] Rich, C., Ponsler, B., Holroyd, A., &
Sidner, C. L. (2010). Recognizing engagement in
human-robot interaction. Human-Robot Interaction
(HRI), 2010 5th ACM/IEEE International Conference
on, 375-382.
[Cohen-Mansfield et al., 2011] Cohen-Mansfield, J., Marx,
M. S., Freedman, L. S., Murad, H., Regier, N. G., Thein,
K., & Dakheel-Ali, M. (2011). The Comprehensive
Process Model of Engagement. American Journal of
Geriatric Psychiatry.
[Robert et al., 2005] Robert, P. H., Verhey, F. R., Byrne, E.
J., Hurt, C., De Deyn, P. P., Nobili, F. et al. Grouping for
behavioral and psychological symptoms in dementia:
clinical and biological aspects. Consensus paper of the
European Alzheimer disease consortium. European
Psychiatry.2005;20(7):490–496.
[Csikszentmihalyi and LeFevre, 1989] Csikszentmihalyi,
M., & Lefevre, J. (1989). Optimal Experience in Work
and Leisure. Journal of Personality and Social
Psychology, 56(5), 815-822.
[Sidner et al., 2005] Sidner, C. L., Lee, C., Kidd, C. D.,
Lesh, N., & Rich, C. (2005). Explorations in engagement
for humans and robots. Artificial Intelligence, 166(1-2),
140-164.
[Cummings et al., 1994] Cummings, J. L., Mega, M., Gray,
K., Rosenberg-Thompson, S., Carusi, D. A., &
Gornbein, J. (1994). The Neuropsychiatric Inventory
comprehensive assessment of psychopathology in
dementia. Neurology, 44(12), 2308-2308.
[Tickle-Degnen and Rosenthal, 1990] Tickle-Degnen, L., &
Rosenthal, R. (1990). The Nature of Rapport and Its
Nonverbal Correlates. Psychological Inquiry, 1(4), 285-293.
[David et al., 2012] David, R., Mulin, E., Friedman, L., Le
Duff, F., Cygankiewicz, E., Deschaux, O., ... & Zeitzer,
J. M. (2012). Decreased daytime motor activity
associated with apathy in Alzheimer disease: an
actigraphic study. The American Journal of Geriatric
Psychiatry, 20(9), 806-814.
[Treush et al., 2015] Treusch, Y., Page, J., van der Luijt,
C., Beciri, M., Benitez, R., Stammler, M., & Marcar, V.
L. (2015). Emotional reaction in nursing home residents
with dementia-associated apathy: A pilot study. Geriatric
Mental Health Care.
[Vink and Birks 2013] Vink, A. C., Bruinsma, M. S., &
Scholten, R. J. (2003). Music therapy for people with
dementia. The Cochrane Library.
[Forbes et al., 2013] Forbes, D., Thiessen, E. J., Blake, C.
M., Forbes, S. C., & Forbes, S. (2013). Exercise
programs for people with dementia. Cochrane Database
of Systematic Reviews, 12(4), CD006489.
[Hutson et al., 2011] Hutson, S., Lim, S. L., Bentley, P. J.,
Bianchi-Berthouze, N., & Bowling, A. (2011).
Investigating the suitability of social robots for the
wellbeing of the elderly. Lecture Notes in Computer
Science (including subseries Lecture Notes in Artificial
Intelligence and Lecture Notes in Bioinformatics), 6974
LNCS, 578-587.
[Kitwood and Bredin, 1992] Kitwood, T., & Bredin, K.
(1992). Towards a theory of dementia care: personhood
and well-being. Ageing and society, 12(1992), 269-287.
[Kuhlmei et al., 2013] Kuhlmei, A., Walther, B., Becker, T.,
Mueller, U., & Nikolaus, T. (2013). Actigraphic daytime
activity is reduced in patients with cognitive impairment
and apathy. European Psychiatry, 28(2), 94-97.
[Lazzaro, 2004] Lazzaro, N. (2004). Why We Play Games:
Four Keys to More Emotion Without Story. Game
Developer Conference (GDC), 1-8.
BREATHE platform to support informal caregivers: are we ready for
Big Brother?
Ángel Martínez, Juan Pablo Lázaro, Isabel Martí
Soluciones Tecnológicas para la Salud y el Bienestar S.A
Labs Department
{amartinez,jplazaro,imarti}@tsbtecnologias.es
Abstract
At present, most long-term care for dependent older people in Europe is provided by their relatives (also known as informal caregivers or family caregivers). In addition to their lack of experience and training, there are no specific tools that reduce the workload and improve the daily life of informal caregivers. In the medium to long term this causes severe physical and emotional strain, which results in the well-known caregiver syndrome: social isolation, stress, anxiety, exhaustion, depression, overload, lack of self-esteem and feelings of guilt are only some of the main consequences of prolonged caregiving for people who act as the main caregiver of a dependent person.
The first objective of the BREATHE project is to promote the active ageing of older people so that, if they so wish, they can remain in their own home in a healthy, safe and independent way for as long as possible. The second, a priority objective of the project, is to support the complex decisions that informal caregivers face every day, as well as to combat their social isolation and reduce their stress level and workload, by means of a cloud-based technological platform composed of an intelligent AAL (Active Assisted Living) system and a management tool that fosters good decisions thanks to continuous monitoring (carried out ethically, legally and preserving the dignity and privacy of older people) and to the automatic recognition of activities by the AAL system available in the home of the dependent person.
1 Context
At present, long-term care for older and/or dependent people shows certain similarities across Europe that have been duly documented in the scientific literature. Specifically, the following serve as the starting point of the BREATHE project:
• In Europe, 80% of long-term care for older and/or dependent people is provided by informal caregivers (i.e. people with no specific health and social care training who receive no financial compensation even though they play an important role in daily care, which translates into a considerable workload) [1].
• In Europe, older and/or dependent people prefer to remain at home for as long as possible and to be cared for by their closest relatives [2].
• In Europe, the most common profile of an informal caregiver is: female (76%), middle-aged (55+), usually the daughter or daughter-in-law of the older person, devoting approximately 46 hours/week to care over 60 months (5 years). Furthermore, fewer than half have the opportunity to reconcile personal and working life (60% are unemployed) and, although their technological skills are limited, most of them do use the Internet frequently (fixed and mobile), have one or more technological devices at home that they use regularly (e.g. laptop, Tablet PC, smartphone, etc.) and are active users of social networks (e.g. Facebook) and of the instant messaging applications currently available on mobile phones (e.g. WhatsApp, etc.) [4], [5].
• Continuous long-term care entails a series of risks for the informal caregiver's own health, such as social isolation, stress, anxiety, depression, loss of self-esteem and feelings of guilt, which result in the well-known caregiver syndrome (i.e. a high probability that the caregiver will in turn become a person in need of care) [3].
2 BREATHE platform
BREATHE is a technological platform that provides daily guidance and continuous support to informal caregivers in the long-term care of older and/or dependent people. Although technology has traditionally been placed at the service of older people to address the problem of long-term care, there is currently a trend of research projects and commercial solutions that focus on the informal caregiver rather than on the person who needs care [2]. BREATHE firmly believes in this approach because it understands that better training, together with a reduction in the informal caregiver's stress, saturation and workload, will have a positive effect on the care given to the older person. For this reason, BREATHE is conceived not as a one-off, static tool that solves a specific problem at a given moment, but rather as a continuous and dynamic solution that accompanies the informal caregiver throughout the whole period of care and evolves over time as the conditions and needs of both the older person and the informal caregiver change.
Furthermore, the design and development of the BREATHE support platform start from the premise that caring for an older person with certain dependencies, when the caregiver lacks explicit health and social care training, is a highly complex process because of the intrinsic and unique particularities of the people involved (i.e. older person and caregiver). Therefore, in order to reduce the complexity of the initial problem as far as possible and to set some limits on the wide range of cases that arise in an environment such as this, the standard scenario to be validated during the pilot tests being carried out with real users in Spain, the United Kingdom and Ireland is restricted to older people who live alone and who also have a high degree of autonomy. It is also important to mention that under no circumstances is the BREATHE system intended as an alternative to any technological system that requires a critical response time in an emergency.
Broadly speaking, the BREATHE support platform consists of the following modules: (1) intelligent AAL system (AAL home system), (2) cloud storage and processing system (backend) and (3) management and decision-making tool for the informal caregiver (frontend).
2.2 Intelligent AAL system
The intelligent AAL system consists of a set of technological devices installed in the home of the older and/or dependent person in order to collect certain information about the activities he or she carries out every day. This is done in a way that is completely transparent to the older person, who does not have to operate any mechanism to start the system or keep it working properly, nor wear any device attached to the body.
Furthermore, in order to preserve privacy and guarantee the decision-making of the person being monitored, a fundamental part of the BREATHE system is an interaction device that allows the older person to activate/deactivate the monitoring system, so that he or she has full control over it and decides when and under what conditions the system may gather information about the activities being carried out.
Therefore, the intelligent AAL system currently being installed consists of the following elements: interaction device, set of Z-Wave® sensors, wireless image capture device and a hub element with processing capacity and access to the cloud storage and processing system.
Figure 1 Architecture of the BREATHE platform
Figure 2 Intelligent AAL system
Element: Interaction device
Description/Objective: Tablet PC running Android 4.0 or later. It allows the older or dependent person to activate/deactivate the intelligent AAL system for a given period of time (i.e. 1 hour, 6 hours or 24 hours). Once this time has elapsed, the intelligent AAL system starts working again automatically, after notifying both the older person and his or her main caregiver that it is resuming its activity and that the older person is therefore being monitored again. Its purpose is to preserve the privacy of the older person by giving him or her a tool to control the system and decide when and under what conditions to be monitored.
Element: Set of Z-Wave® sensors
Description/Objective: Z-Wave® is a standard wireless communication technology widely accepted throughout the world (both in academia/research and commercially). This is the communication protocol used by the various sensors (specifically: magnetic contact sensors, motion sensors and electrical current or consumption meters) that have been distributed around the home of the older or dependent person in order to identify, autonomously and without the person's intervention, a specific set of activities that are part of people's daily life. Specifically: activities related to the use of the kitchen (use of the coffee maker, the fridge, the microwave and the worktop) as well as activities related to the person's movement inside (mobility in the corridor) and outside the home (entering and leaving the home).
Element: Wireless image capture device
Description/Objective: The wireless image capture device is the basis of the video-based monitoring system. It is a Wi-Fi camera installed in those areas of the home where the older person carries out the largest number of activities (i.e. in the kitchen and/or the living room). Its purpose is (1) to let the informal caregiver see what is happening in real time in the relative's home, (2) to detect and identify, automatically and by means of image processing and computer vision, a specific set of everyday activities (specifically, the same activities detected by the set of Z-Wave® sensors described above, plus the person's level of activity in the room(s) where the camera has been installed) and (3) to generate a heat map (heatmap) that lets the informal caregiver see quickly and at a glance how active the older person has been throughout the day. Since installing a camera in a domestic environment is always a complex and delicate task because of the obvious risks it entails, it is important to mention that, when developing this part of the system, maximising security and preserving the person's privacy was a fundamental requirement imposed by the BREATHE team itself (when the informal caregiver remotely accesses the home of the person cared for, the image of the older person is replaced by a silhouette-shaped avatar, so the person's real image is never shown). It is also important to mention that all the pilot tests carried out with real users were supervised and approved beforehand by the Ethics Committee of the Faculty of Health Sciences of Trinity College Dublin¹ (Ireland).
Element: Hub element with processing capacity
Description/Objective: At present, and given that we are in the first iteration of the pilot tests that will be carried out with real users throughout the project, there is one hub element with processing capacity for each of the different monitoring systems installed in the home of the older or dependent person. Specifically, we have a personal computer with high processing capacity running GNU/Linux Ubuntu, dedicated to processing and analysing the images acquired by the wireless image capture device, and a low-cost computer (Raspberry Pi Model B+ with RaZberry² communication module) to enable the transmission and reception of the Z-Wave® frames generated by the set of sensors installed in the home. The main purpose of the hub element is to deliver the automatically detected events to the cloud storage and processing system.
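The pause behaviour of the interaction device described in the table above (monitoring switched off for 1, 6 or 24 hours, then resumed automatically after notifying both the older person and the main caregiver) can be sketched as follows. This is a minimal Python illustration under our own assumptions, not the BREATHE implementation: the class name `MonitoringSwitch`, the `notify` callback and the message text are hypothetical.

```python
from datetime import datetime, timedelta

# Pause durations offered by the interaction device, in hours (from the text).
ALLOWED_PAUSE_HOURS = (1, 6, 24)


class MonitoringSwitch:
    """Hypothetical on/off switch for the home monitoring system."""

    def __init__(self, notify):
        # notify(recipient, message) is an assumed callback, e.g. a push message.
        self._notify = notify
        self._paused_until = None

    def pause(self, hours, now):
        """Switch monitoring off for one of the allowed durations."""
        if hours not in ALLOWED_PAUSE_HOURS:
            raise ValueError(f"pause must be one of {ALLOWED_PAUSE_HOURS} hours")
        self._paused_until = now + timedelta(hours=hours)

    def is_active(self, now):
        """Return True if monitoring is on, resuming it if the pause has expired."""
        if self._paused_until is not None and now >= self._paused_until:
            self._paused_until = None
            # Both parties are told that monitoring has resumed.
            for recipient in ("older person", "main caregiver"):
                self._notify(recipient, "Monitoring has been resumed")
        return self._paused_until is None


if __name__ == "__main__":
    switch = MonitoringSwitch(notify=lambda who, text: print(f"to {who}: {text}"))
    start = datetime(2015, 6, 23, 10, 0)
    switch.pause(1, now=start)
    print(switch.is_active(start + timedelta(minutes=30)))  # still paused
    print(switch.is_active(start + timedelta(hours=1)))     # resumed, both notified
```

Passing the current time explicitly instead of reading the system clock is only a design convenience that makes the sketch easy to test.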
2.3 Cloud storage and processing system
The main purpose of the cloud storage and processing system is (1) to store all the events (raw data) that are automatically detected and identified by the intelligent AAL system in the home of the older or dependent person and (2) to convert them, by means of a series of algorithms, into a set of activities that provide a higher level of information to informal caregivers, continuously feeding the management and decision-making tool they have at their disposal to improve the quality of their care.
Google's³ cloud platform (Google App Engine, GAE⁴) is the technology that underpins the backend of the BREATHE system. Although comparing the various cloud solutions currently available is beyond the scope of this document, the main reasons why the BREATHE project chose this platform over others were: (1) the possibility of deploying separate instances of the same software version for each country where the pilot tests are to be carried out (Spain, the United Kingdom and Ireland), (2) the possibility of hosting the data collected automatically by the intelligent AAL system on servers physically located within Europe (and not in the United States, whose data protection policy differs markedly from the European one), (3) the possibility of building applications in different programming languages (Java, Python, PHP or Go), (4) automatic scaling of the application's resources according to service demand, (5) the availability of plug-ins that ease the integration of GAE with the main development tools available today (Eclipse, IntelliJ, Maven, Git, etc.), (6) good, up-to-date documentation and a large number of working examples and libraries, and (7) available REST services (Google Cloud Endpoints⁵) to enable the exchange of information between GAE and mobile devices (iOS, Android OS or JavaScript clients).
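The two backend tasks described in this section, storing raw events and turning them into higher-level activities, can be illustrated with a small rule-based sketch in Python. The paper does not describe the actual algorithms, so everything here is an assumption made for illustration: the sensor identifiers, the 50 W threshold and the `RawEvent` structure are hypothetical; only the activity names come from the text.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class RawEvent:
    """One raw reading forwarded by the home hub (hypothetical structure)."""
    sensor_id: str
    value: float
    timestamp: str


# Assumed rules: sensor id -> (activity name, predicate on the raw value).
SENSOR_RULES = {
    "fridge_door": ("fridge use", lambda v: v == 1),            # contact sensor opened
    "coffee_power": ("coffee maker use", lambda v: v > 50.0),   # watts, assumed threshold
    "corridor_motion": ("corridor mobility", lambda v: v == 1),
    "front_door": ("entering/leaving home", lambda v: v == 1),
}


def to_activities(events):
    """Map raw events to (timestamp, activity) pairs, ignoring unknown sensors."""
    activities = []
    for event in events:
        rule = SENSOR_RULES.get(event.sensor_id)
        if rule is None:
            continue
        name, matches = rule
        if matches(event.value):
            activities.append((event.timestamp, name))
    return activities


def activity_counts(activities):
    """Count how often each activity occurred, e.g. to feed a daily report."""
    return Counter(name for _, name in activities)


if __name__ == "__main__":
    sample = [
        RawEvent("fridge_door", 1, "08:00"),
        RawEvent("coffee_power", 900.0, "08:02"),
        RawEvent("corridor_motion", 1, "08:10"),
    ]
    print(activity_counts(to_activities(sample)))
```

Keeping the rules in a plain dictionary makes it easy to add or change sensors per home, though a deployed system would likely need time windows and noise filtering as well.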
2.4 Management and decision-making tool for the informal caregiver
The management and decision-making tool is the equipment that informal caregivers have at their disposal to find out, first hand, what is happening around the older or dependent person. Since one of the main consequences of caregiving is the deterioration of the caregiver's own health and mood, a further priority objective of this tool is to monitor the informal caregiver, in order to know how he or she is evolving over time and to detect sufficiently early whether there is any risk to the person. It is important to mention that the informal caregiver is not monitored through a technological data collection system such as the intelligent AAL system provided to the older person, but through a series of simple 5-question questionnaires and an automatic sentiment analysis tool that identifies whether a person's mood is positive, negative or neutral, as well as the person's level of overload/saturation.
The informal caregiver tool has been developed and made available to real users in three separate iterations. The first iteration (July 2014) allowed the BREATHE technical team to validate both the technologies (Java EE 7, JPA for the persistence layer and the Vaadin⁶ framework for frontend development) and the cloud infrastructure itself (GAE) that was to host the data generated by at least 30 real pairs (older person and informal caregiver) during the pilot tests to be carried out in Spain, the United Kingdom and Ireland. The second iteration (July 2015) made the BREATHE technology available to 7 real users (5 pairs in Spain and 2 in the United Kingdom), so that for 3 months, continuously and without interruption (24x7), the intelligent AAL system monitored and collected information about the activities carried out by 7 older people in their own homes and presented it in an easy and intuitive way to their main informal caregivers through the informal caregiver tool. The last iteration (October 2015) made it possible to deploy the BREATHE infrastructure in 15 homes with real users (5 pairs in Spain, 5 in the United Kingdom and 5 in Ireland) in order to validate the system until the end of the year (3 months).
As shown in Figure 1 on page 3, there are two variants of the management and decision-making tool for the informal caregiver. First, there is a more complete home version that offers every one of the functionalities implemented during the BREATHE project. Although to access this solution the user only needs to open a web browser and go to the specific URL⁷ where the tool is hosted (i.e. the informal caregiver does not need to install anything, since all the software is deployed in the cloud), this version is intended for situations in which the informal caregiver is at home and has a personal computer with keyboard and mouse to operate the tool. Second, there is a reduced (mobile) version for circumstances in which the informal caregiver is not at home but needs to know in real time what activities the older person is carrying out. This is a mobile application (Android 4.0 or later) with a reduced set of functionalities: real-time notifications and an activity report with trends (up, down or unchanged) on the everyday activities that the older person has carried out throughout the day.
In addition, two roles have been defined within the informal caregiver tool that complement the work of the informal caregiver and ensure the availability/viability of the platform over time. Specifically, there is an administrator role for managing user accounts in the database and for providing the platform with content that supports and guides the informal caregiver according to his or her personal situation. There is also a technical administrator role that makes it possible to identify possible system outages quickly and to contact the technical staff responsible for the pilot sites so that the problem is solved as soon as possible. The three profiles described above (informal caregiver, administrator and technical administrator) have a set of services and tools within their reach in the BREATHE platform itself.
Among the main functionalities of the management and decision-making tool for the informal caregiver, we can highlight:
• Information about the older person's level of activity overall, in the kitchen or regarding mobility inside the home.
• Heatmap (heat map) showing how active the older person has been and in which areas of the room he or she has spent most of the time throughout the day.
• Sentiment analysis tool that automatically determines whether a free text (limited to 140 characters, for simplicity) written by the informal caregiver is positive, negative or neutral.
• Weekly questionnaire of 5 questions (it takes less than 1 minute to complete), with a reminder for the informal caregiver when a questionnaire is pending, which allows us, over time, to identify whether a person needs help (overload) as a result of the high weekly dedication and the stress caused by caregiving.
• Real-time timeline with the activities that the older person has carried out or is carrying out at home.
• Statistics, trends (up, down or unchanged) and evolution of the activities that the older person has carried out at home in three different time periods: the last 24 hours, the last 7 days and the last 30 days.
• Real-time access by the informal caregiver to the camera(s) installed in the home of the older or dependent person. For ethical reasons and in order to preserve the person's privacy, the real image is replaced by a silhouette-shaped avatar. In addition, the older person is notified when the informal caregiver accesses the camera (i.e. is viewing the inside of the home in real time), and this functionality is available only to the older person's main caregiver.
• Evolution of the informal caregiver's state in terms of overload and weekly dedication, taking into account the caregiver's answers to the weekly questionnaires as well as the result of the sentiment analysis tool. En función del estado en el
que se encuentre el cuidador (bajo nivel de sobre), la
herramienta BREATHE le proporcionará una serie de
acciones/actividades con el objetivo de reducir su nivel de estrés y sobrecarga.
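As a minimal sketch of the statistics feature listed above, the following Python snippet counts the activities of the elderly person over the three reporting periods (last 24 hours, last 7 days, last 30 days). The event layout, the function name `summarise_activities` and the period labels are assumptions made for illustration; the source does not describe how BREATHE stores or aggregates its events.

```python
from collections import Counter
from datetime import datetime, timedelta

# The three reporting periods named in the feature list.
PERIODS = {
    "last_24_hours": timedelta(hours=24),
    "last_7_days": timedelta(days=7),
    "last_30_days": timedelta(days=30),
}


def summarise_activities(events, now):
    """Count activities per reporting period.

    `events` is an iterable of (timestamp, activity_name) pairs. This
    record layout is hypothetical, not the BREATHE data model.
    """
    events = list(events)  # allow generators, since we iterate three times
    summary = {}
    for label, span in PERIODS.items():
        start = now - span
        summary[label] = Counter(
            name for timestamp, name in events if start <= timestamp <= now
        )
    return summary


if __name__ == "__main__":
    now = datetime(2015, 6, 23, 12, 0)
    sample = [
        (now - timedelta(hours=2), "cooking"),
        (now - timedelta(days=3), "cooking"),
        (now - timedelta(days=20), "walking"),
    ]
    for label, counts in summarise_activities(sample, now).items():
        print(label, dict(counts))
```

A trend indicator could then be derived by comparing the counts of two consecutive periods, although the source does not state how trends are computed.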

3. Conclusion
The BREATHE platform is not an emergency system/service that provides an immediate response to a critical emergency situation. Rather, BREATHE is conceived as a remote monitoring system that provides continuous support to the informal caregiver in the long-term care of an elderly and/or dependent person. Since the pilot tests with real users in Spain, the United Kingdom and Ireland are still under way (until December 2015), the BREATHE team does not yet have final conclusions based on the users' assessment of the service. However, thanks to the interviews we hold with users and to the use that informal caregivers make of the tool, we already know that:
- Regarding the ability to see in real time what is happening in the home of the elderly or dependent person, most elderly people (80%) and informal caregivers (68%) find it useful and acceptable to have cameras installed and accessible from the informal caregiver tool.
- The rooms in which elderly people would agree to have a camera installed are, in order of priority: the living room, the kitchen and the bedroom.
- Elderly people accept that their main caregiver (a family member) is the only person able to see in real time what is happening inside their home, because they understand that it is a useful feature for him/her.
- Under no circumstances do elderly people or their informal caregivers accept images or videos being stored (not even partially).
- Although under certain circumstances elderly people take a positive view of their main caregiver seeing the real image of what is happening in their home, they prefer to be replaced by a filter or avatar so that their image is protected.
- The timeline of the activities being carried out by the elderly person and the ability to see them in real time (live view) are the most highly rated features of the informal caregiver tool.
- Although the informal caregiver tool was conceived as a complex working application that required a personal computer, keyboard and mouse to get the most out of it, most informal caregivers prefer to access it from small mobile devices (in order: tablet PC and mobile phone).
- On average, informal caregivers access the informal caregiver tool between once and twice a day.
- Wednesday between 09:00 and 10:00 is the time at which informal caregivers access the tool most often (i.e. the highest number of accesses). By contrast, Sunday is the day that generates the least traffic.
- On average, informal caregivers spend between 1 and 3 minutes per session (where a session is the time elapsed between login and logout on the platform).
- In the pilot tests that validated the second iteration of the software (from July to October 2015), the intelligent AAL system collected 191,989 events in 7 homes, distributed as follows: 144,669 (75.53%) from the sensor set and 47,094 (32.55%) from the video system.
- The informal caregivers' assessment of the BREATHE system is, to date, very positive. They fully trust the information that the system makes available to them, and knowing what their elderly relative is doing, as well as being able to see them at a given moment, gives them calm and peace of mind.
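The usage figures above (session length, busiest weekday and hour) could be derived from an access log along the following lines. This is only a sketch under stated assumptions: the log layout (one login/logout timestamp pair per session) and the function names are hypothetical, and the source does not describe how the BREATHE team computed these figures.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def session_minutes(login, logout):
    """Session length in minutes: time elapsed between login and logout."""
    return (logout - login).total_seconds() / 60.0


def busiest_slot(logins):
    """Return the (weekday name, hour) pair with the most logins.

    `logins` must contain at least one timestamp.
    """
    counts = Counter((WEEKDAYS[ts.weekday()], ts.hour) for ts in logins)
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical log entries: (login, logout).
    sessions = [
        (datetime(2015, 3, 4, 9, 15), datetime(2015, 3, 4, 9, 17)),
        (datetime(2015, 3, 11, 9, 40), datetime(2015, 3, 11, 9, 43)),
        (datetime(2015, 3, 8, 20, 0), datetime(2015, 3, 8, 20, 1)),
    ]
    durations = [session_minutes(a, b) for a, b in sessions]
    print("mean session (min):", sum(durations) / len(durations))
    print("busiest slot:", busiest_slot([a for a, _ in sessions]))
```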
Acknowledgements
The BREATHE project has been jointly funded by the Ambient Assisted Living Joint Programme (Call 5, 2012) and by several local authorities and research programmes in Spain, the United Kingdom, Ireland and Italy.
Bibliografia
[1] Who cares? Care coordination and cooperation to enhance quality in elderly care in the EU. Marin et al. 2009
[2] Long-term care challenges in an ageing society: the role
of ICT and migrants. Results from a study on England,
Germany, Italy and Spain. JRC scientific and technical reports. European Commission (EUR 24382 EN). 2010. Public document available on: http://ftp.jrc.es/EURdoc/JRC58533.pdf
[3] Negative caregiving effects among caregivers of Veterans with dementia. Bass D, Judge K, Snow A, et al. American Journal of Geriatric Psychiatry 20(3):239-247. 2012.
[4] EUROFAMCARE Project (contract number QLK6CT-2002-02647). Services for supporting family carers of
elderly people in Europe: characteristics, coverage and
usage. International re-search project funded within the 5th
Framework Programme of the European Community.
http://www.uke.de/extern/eurofamcare (Last access June
2015).
[5] Deliverable D1.1 – Needs and requirements of AAL
and ICT solutions for the informal. Public document available on: http://www.breathe-project.eu/en/publications.
(Last access June 2015).
31
Filtering process and data exchange architecture over ECG custom-hardware platform
Luis Miguel Soria Morillo¹, Daniel Scherz², Ralf Seepold² and Juan Antonio Ortega Ramírez¹
¹ University of Seville, Seville, Spain
² Ubiquitous Computing Lab (UC-Lab), HTWG Konstanz, Konstanz, Germany
1 Introduction
ECG sensors have become popular in recent years, largely thanks to their integration into different wearables. Despite the advantages of this integration (comfort, reduced risk of forgetting the device), these sensors usually lack accuracy. For certain clinical fields and continuous monitoring environments, this alternative is not viable. Furthermore, the ECG hardware in wearables typically yields only enough information to calculate the heart rate and is often useless for obtaining the electrical profile of the heart. To overcome the lack of accuracy while keeping the comfort and gaining access to the physio-electrical profile, a device based on low-cost hardware has been developed for this purpose. The signal captured by this device, although extremely clear, contains certain levels of noise and artifacts that must be filtered.
In this paper, first, a set of filters is applied to the signal from the developed hardware platform in order to reduce the impact of the noise. This process is fast enough to be carried out in real time. With this processing a clearer signal is obtained, which allows its use in certain clinical settings, for example in sleep monitoring systems. Second, thanks to the modular development followed when building the sensor, a framework-based sensor integration platform is proposed. This platform enables data exchange between the different motes of the system and a central device. This central device, although the approach could be applied to other hardware profiles, is a smartphone. The main reasons for this choice were the versatility, the low cost and the widespread adoption of these devices in society.
To illustrate the use of this architecture and of these sensors, a sleep monitoring environment has been chosen. In this environment, the developed ECG sensor has been incorporated; in addition, EEG systems, electromyography and galvanic skin response sensors have been successfully integrated.
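The abstract above does not list the filters used on the ECG platform, so the following Python snippet is only a sketch of what a sample-by-sample (real-time capable) filter chain might look like. It assumes a first-order high-pass stage against baseline wander followed by a moving average over one mains period, which has a spectral null at the mains frequency. The class name, the 250 Hz sampling rate, the 0.5 Hz cut-off and the 50 Hz mains frequency are all assumptions, not values taken from the paper.

```python
import math
from collections import deque


class StreamingEcgFilter:
    """Illustrative real-time filter: high-pass stage plus moving average."""

    def __init__(self, fs_hz=250.0, highpass_hz=0.5, mains_hz=50.0):
        dt = 1.0 / fs_hz
        rc = 1.0 / (2.0 * math.pi * highpass_hz)
        # Coefficient of the discrete first-order RC high-pass filter.
        self._alpha = rc / (rc + dt)
        self._prev_x = None
        self._prev_hp = 0.0
        # Averaging over one full mains period cancels mains interference.
        self._window = deque(maxlen=max(1, round(fs_hz / mains_hz)))

    def process(self, x):
        """Filter one raw sample and return one filtered sample."""
        if self._prev_x is None:
            hp = 0.0  # no history yet: start the high-pass output at zero
        else:
            hp = self._alpha * (self._prev_hp + x - self._prev_x)
        self._prev_x = x
        self._prev_hp = hp
        self._window.append(hp)
        return sum(self._window) / len(self._window)


if __name__ == "__main__":
    ecg_filter = StreamingEcgFilter()
    # Synthetic input: 10 Hz component plus constant offset and 50 Hz hum.
    for n in range(10):
        t = n / 250.0
        raw = (math.sin(2 * math.pi * 10 * t) + 2.0
               + 0.3 * math.sin(2 * math.pi * 50 * t))
        print(round(ecg_filter.process(raw), 4))
```

Each call to `process` does a constant amount of work per sample, which is the property that makes this kind of chain suitable for real-time use on low-cost hardware.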
Drive assistant combined with EEG data applied to aggressive driving perception
Emre Yay¹, Luis Miguel Soria Morillo², Natividad Martínez Madrid¹ and Juan Antonio Ortega Ramírez²
¹ University of Reutlingen, Reutlingen, Germany
² University of Seville, Seville, Spain
Abstract
1 Introduction
According to a report of the Foundation for Traffic Safety, aggressive driving actions were involved in 56 percent of the fatal crashes that occurred in recent years. Furthermore, aggressive driving accounts for more than half of all traffic fatalities. Aggressive drivers tend to have less concern for other drivers, and this leads to bad habits such as speeding, tailgating, honking frequently or gesturing angrily at other drivers. Alcohol rehabilitation programmes are widespread around the world. However, programmes of this kind do not address aggressive behaviour in drivers. In contrast to alcohol problems, aggressive patterns can be identified through a physiological study. For example, electroencephalography (EEG) can help to determine a driver's risk of being involved in a road rage incident.
Rehabilitation programmes in these cases focus on showing such drivers the consequences of aggressive driving by means of multimedia. In some cases, however, driving schools offer practical lessons to correct inappropriate behaviours related to an aggressive profile.
In the absence of intensive enforcement of driving laws, aggressive and stressed driving behaviours are not punished; only the effects of this mental state on the road are. For instance, a driver will be sanctioned for speeding, but not for being under stress or irritated.
In this work, an approach to a fully functional ubiquitous system for detecting aggressive behaviour is presented. Based on biometric parameters and a connection to the in-vehicle CAN bus, the system can detect the presence of several factors that lead to increasing rates of accidents and road safety violations.
Using a portable, commercial EEG sensor, the brain activity of the driver is monitored in real time. Using a database built in this study, the system analyses the current state and its deviation from an aggressive driving profile. Based on this deviation, which is obtained by applying an artificial neural network, the system determines the risk level of the situation.
This system will be connected to a server at a driving school, which allows qualified staff to check the profile of each driver in real time. The infrastructure used in this study will allow the staff to communicate with the driver in order to calm the driver down and thus reduce the aggressiveness of the driving behaviour. In this study, this process is performed autonomously by presenting audio-visual alerts to the driver. Furthermore, if the car is compatible, the system will play relaxing music to help reduce stress.
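The last step described above, turning the deviation from the aggressive driving profile into a risk level and an autonomous response, might be sketched as follows. The neural network itself is not reproduced here. The normalisation of the deviation to [0, 1], the two thresholds, the three risk labels and all names are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Response:
    risk_level: str
    audio_visual_alert: bool
    relaxing_music: bool


def risk_level(deviation, high_below=0.2, medium_below=0.5):
    """Map the deviation from the aggressive profile to a risk level.

    `deviation` is assumed to lie in [0, 1], where 0 means the current
    state matches the aggressive profile. Thresholds are placeholders.
    """
    if not 0.0 <= deviation <= 1.0:
        raise ValueError("deviation must lie in [0, 1]")
    if deviation < high_below:
        return "high"
    if deviation < medium_below:
        return "medium"
    return "low"


def respond(deviation, car_supports_music):
    """Alert the driver at elevated risk; add music if the car allows it."""
    level = risk_level(deviation)
    alert = level != "low"
    return Response(level, alert, alert and car_supports_music)


if __name__ == "__main__":
    print(respond(0.1, car_supports_music=True))
    print(respond(0.9, car_supports_music=True))
```

Note that a small deviation means the driver is close to the aggressive profile, so the risk rises as the deviation falls.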
Evaluation of an energy-efficiency and safety relevant driving system on a driving simulator
Emre Yay and Natividad Martínez Madrid
Reutlingen University, School of Informatics
Alteburgstr. 150, 72762 Reutlingen, Germany
{emre.yay, natividad.martinez}@reutlingen-university.de
Juan Antonio Ortega Ramírez
University of Seville, Department of Languages and Computer Systems
Av. Reina Mercedes s/n, 41012 Seville, Spain
Abstract
An energy-efficiency and safety relevant driving system was developed that tries to improve the driving behaviour in terms of energy-efficiency and safety by showing recommendations to the driver. During the evaluation, the influence of the driving system on the energy-efficiency and safety of the driving behaviour will be measured on a driving simulator. In this paper, the evaluation plan as well as the preparation of the driving system and the driving simulator are explained. In order to allow the creation of a recommendation, the driving system has to be connected to the driving simulator. However, the driving simulator does not provide all the information needed, which is why its limitations have to be considered in the evaluation plan. Furthermore, the driving simulator has to be prepared for the evaluation of the driving system with regard to the scenario defined in the evaluation plan. On the basis of the evaluation plan and the prepared driving system and driving simulator, the evaluation will be carried out with 40 test drivers.
1 Introduction
Energy-efficiency and safety have become more important in recent decades due to human-made climate change and the increasing number of cars on the road, which has led to more accidents and fatalities. Thus, several laws were enacted to reduce energy consumption, such as the CO2 limits for passenger cars in the European Union. Due to the increasing importance of energy-efficiency and road safety, car manufacturers are trying to optimise the car and its individual parts, like the car body or the engine. However, there is also the potential to increase energy-efficiency and road safety by adapting the driving behaviour to the current driving situation. According to Xiaoqui et al. [2011] and Chin and Quek [1997], the driving behaviour has effects on road safety. Furthermore, several studies showed that adapted driving can save up to 30% of energy [Haworth and Symmons, 2001; Helms et al., 2010; Mierlo et al., 2004].
There are already existing driving systems, like eco:Drive [Fiat, 2010], Artemisa [Magana and Organero, 2011] or DriveDiagnostics [Lotan and Toledo, 2006], that try to educate the driver in energy-efficient or safe driving. Some driving systems, like Artemisa, show recommendations or warnings to the driver in order to support efficient or safe driving. Other driving systems, like eco:Drive or DriveDiagnostics, generate reports in which the driving behaviour is rated in terms of energy-efficiency or safety and driving hints are given to improve the driving behaviour. However, existing driving systems cover either safety or energy-efficiency. Furthermore, they neither consider the driver condition nor adapt themselves to the individual driving behaviour. Such adaptation would increase both the user acceptance of the driving system and road safety, as the driving system would not bother the driver with recommendations that are of no interest to the driver. Furthermore, recommendations could be shown only in situations in which the driver is, for example, not stressed. On the basis of these facts, a prototype of a driving system was developed that tries to educate the driver in energy-efficient and safe driving. To this end, the driving system shows recommendations to the driver when the driver is driving energy-inefficiently or unsafely. Furthermore, the driving system adapts itself to the individual driving behaviour and considers the driver condition. This makes it possible to eliminate bad driving habits that cause inefficient or unsafe driving behaviour, while considering the driver's needs.
A driving simulator and 40 test drivers will be used to evaluate the prototype of the driving system. The evaluation tests whether the use of the driving system has an effect on the driving behaviour in terms of energy-efficiency and safety. Furthermore, it will be evaluated whether the adaptiveness of the driving system increases its user acceptance, which can lead to sustained use of the driving system and, thus, to an increase in energy-efficiency and safety.
The following section introduces the developed driving system and briefly describes its architecture. Section 3 shows the architecture of the driving simulator and explains the integration of the developed driving system in the environment of the driving simulator. Section 4 presents the evaluation plan of the driving system. Finally, a conclusion is presented and the further steps of the work are given.
2
Driving system
compares the current driving behaviour of the driver against
the driving rules and the driving profile. On recognition of
an unsafe or inefficient driving behaviour or when the current
driving behaviour differs significantly from the typical driving behaviour in a negative way, the processing layer starts
to decide whether to create a recommendation. During the
decision, the driving system considers the driver condition,
the driving profile and the future car state. In case, the driver
is for example not stressed, did not ignore the recommendation repeatedly in the past and will not improve the driving
behaviour in the near future, the processing layer decides to
show a recommendation and, thus, creates the corresponding
recommendation and passes it to the graphics layer.
The graphics layer of the driving system is responsible for
rendering the graphical user interface and presenting the recommendation to the driver as well as processing the input of
the driver. The graphical user interface is rendered on the invehicle display unit. The driver has the opportunity to switch
the user and to modify the user profile, i.e. name or surname,
using for example the touch capability of the display. The
recommendations are presented to the driver as a text on the
graphical user interface and using an audio voice that reads
the recommendation to the driver.
The driving system collects information from the car, the
driver and the environment in order to check if the driver is
driving energy-efficient or safe and to adapt itself to the individual driving behaviour of the driver. Furthermore, the driving system shows recommendations to the driver when the
current driving behaviour differs significantly from the typical driving behaviour. The driving system avoids to show
recommendations to the driver when the driver is in stress.
This allows to increase the road safety further, as the driving
system does not distract the driver for example in a stressful
driving situation.
3
Driving simulator
The driving simulator consists of three displays and a several
speakers to present the virtual world to the driver, as shown in
Figure 2. Furthermore, the dashboard of the driving simulator
shows the speed and engine speed of the virtual vehicle to
the driver. For steering the virtual vehicle, the driver has the
opportunity to use a steering wheel, pedals and a gear shift
knob. Besides, the controls for steering the virtual vehicle
and the presentation of the virtual environment, the driving
simulator allows the monitoring of the driver conditions, like
the brain activity or the heart beat, using an EEG and an ear
sensor.
The driving simulator is using three computers for simulation, data collection and presentation of applications to the
driver. The vehicle controls and the ear sensor as well as
the displays, speakers and the dashboard are connected to the
simulation computer that simulates the virtual environment
using the open source driving simulator OpenDS1 and calculates the driver stress level using emWave2 . OpenDS and
emWave send the information of the vehicle, like speed or
current gear, and the driver stress level to the data collector
computer using the tcp/ip interface. Additionally, OpenDS allows to steer the virtual vehicle by sending information about
the steering wheel and the pedal position using its tcp/ip interface.
The data collector computer is responsible for collecting
and logging the information from the virtual vehicle and the
user. Furthermore, it simulates the in-vehicle serial-bus sys-
Figure 1: Architecture of the driving system with the three
layers and their modules
Figure 1 illustrates the architecture of the driving system
including the different layers and their tasks. The architecture of the driving system is separated in three layers: data
layer, processing layer and graphical layer. The data layer is
responsible for gathering needed data from the car, the driver
and the environment using the in-car serial-bus systems and
additionally attached sensors. Furthermore, it aggregates the
incoming information and creates or updates a driving profile that represents the typical driving behaviour of the driver.
Finally, the data layer stores the driving profile and the collected data from the serial-bus systems and attached sensors
for further processing. Additionally, the data layer consists of
driving rules that describe an energy-efficient and safe driving behaviour, such as shift the gear as soon as possible or do
not exceed the speed limit.
On the basis of the information stored in the data layer, the
processing layer first predicts the car state. This allows to
show recommendations to the driver before the driver breaks
a rule and, thus, the driver is able to prevent the breaking
of a energy-efficient or safety relevant driving rule. Furthermore, the predicted car state allows the driving system to
avoid showing a recommendation when the driver will improve the driving behaviour in the near future. If the reader
is interested in the prediction of the car state, the reader is
encouraged to read [?]. Based on the predicted car state, the
driving profile and the information that is stored in the data
layer, the processing layer analyses the driving behaviour and
1
More information about the driving simulation software
OpenDS can be found on www.opends.de
2
emWave is a software of the company HeartMath. It allows the
monitoring of the heart-rate variability, the pulse and the stress of
the user. For more information please visit www.heartmath.com
35
Speaker
or engine speed of the engine control unit. The messages
have an hexadecimal identifier that allows to access the message. The incoming information from the data logger is put
into their corresponding control unit. For example, the information speed and engine speed is put in the control unit
engine, whereas the information about the EEG is passed to
the control unit EEG-monitor. This allows to access the gathered information from the virtual environment or the driver
monitoring using the CAN serial-bus system.
As shown in Figure 2, the data collector computer, respectively the CAN-bus simulation, is connected to the embedded
computer using a hardware CAN-interface. Thus, the embedded computer is able to access the data that is stored in the
virtual control units of the CAN-bus simulation for further
processing. The embedded computer represents the in-car
computer on which the driver relevant applications are running, like driving systems or the navigation. The prototype
of the driving system will run during the evaluation on the
embedded computer and will gather all needed data from the
CAN-bus simulation using the CAN-interface of the embedded computer. Therefore, the driving system registers all ids
of the virtual control unit messages in the hardware CANinterface, which then starts to listen for that messages on the
simulated CAN-bus. The incoming information from the simulated CAN-bus will be then used in the driving system for
showing recommendations to the driver.
Speaker
Displays
Speaker
Dashboard
Steering wheel /
Pedals /
Gear shift knob
Touchscreen
Driver seat
Ear
sensor
EEG
Speaker
Speaker
Simulation computer
emWave
TCP/IP
Embedded computer
Application
OpenDS
TCP/IP
CAN-Interface
Data Collector
computer
CAN-Interface
4
CANoe
The evaluation of the driving system will be done using 40
test drivers. During the evaluation it will be tested if the driving system has an influence on the driving behaviour in terms
of energy-efficiency and safety. Furthermore, it will be tested
if the user acceptance increases when using the adaptive feature of the driving system. For each test 20 test drivers will
be used to drive 16 km on a highway and 6 km within a city.
The highway track is shown in Figure 3 and consists of four
lanes, two in each direction. One lap on the highway is about
12 km. The track consists of different speed limits, which are
70 km/h, 80 km/h, 100 km/h and 120 km/h. However, to give
the driver the feeling to drive on a German highway, there are
also parts without a speed limit. The traffic on the highway
Control
Unit
TCP/IP
Control
Unit
CAN-Bus
COM-Server
Data Logger
COMClient
Evaluation plan
Engine
Control
Unit
Figure 2: The architecture of the driving simulator
tem using the software CANoe3 . The EEG sensor that monitors the brain activity of the driver is attached to the data
collector computer. The data logger within the data logger
computer collects the information from the EEG, OpenDS
and emWave and sends them to the CANoe software using
the COM connection between the data logger and CANoe.
Within CANoe, the CAN serial-bus system of the car is simulated that contains the simulation of the different control
units, e.g. engine control unit. Each control unit in the simulation contains different messages, like the messages speed
3
CANoe is a development and testing software tool of
the company Vector GmbH. For more information please visit
www.vector.com
Figure 3: The highway track with its speed limitations
36
Table 2: The evaluation plan to test if the driving system increases the user acceptance
In the second part of the evaluation 20 test drivers will be
used to test if the adaptive feature of the driving system increases the user acceptance of the driving system. In order
to measure the user acceptance, the Usefulness, Satisfaction
and Ease of Use (USE) questionnaire of Lund [2001] will be
used, as it is categorised in usefulness, ease of use, ease of
learning and satisfaction. However, questions about ease of
use and other questions will be removed that did not fit into
the evaluation. Furthermore, new questions will be added that
are relevant for the evaluation, such as a question about the
frequency of the recommendation. According to the evaluation plan shown in Table 2, the test drivers will drive four
journeys. In two journeys the adaptive feature of the driving system will be turned off, whereas in next two journeys
the adaptive feature will be turned on. Test drivers with an
even subject number will start with the highway track and the
adaptive feature turned off. An odd subject number will lead
that the test drivers will start with the city track and the adaptive feature turned on. After the first two journeys and after
the last two journeys the test drivers will fill out the modified
USE questionnaire, in which they will rate their experience
with the driving system in the last two journeys.
Figure 4: The city track showing the driving route and the
positions of the traffic lights
will consider the traffic limits, however, the traffic will drive
at a maximum of 120 km/h on the left lane and 80 km/h on
the right lane, that allows to represent the average speed of
the lorries on a real highway.
Figure 4 shows the city track and the driving route within
the city track. Within the city a general speed limit of 50 k/h
is set as the typical speed limit within cities is 50 km/h in
Germany. One lap on the driving route is about 1.2 km and
consists of three traffic lights and one roundabout. The traffic
within the city will be driving on a two lane road, one lane for
each direction, at a maximum of 30 km/h and will consider
the traffic lights.
For testing the effect of the driving system on the driving behaviour regarding energy-efficiency and safety, 20 test drivers will drive once with and once without the driving system on both tracks. During the journeys the fuel consumption, the driven distance and the time during which the speed limit was exceeded will be measured. However, the first kilometres of the journey will not be measured, as the drivers have to get familiar with the driving simulator. Table 1 shows the evaluation plan, in which the test drivers will drive the first two journeys without the driving system. This makes it possible to measure the usual driving behaviour of the test drivers. In the last two journeys the driving system will be introduced to the test drivers, who will get energy-efficiency and safety relevant driving recommendations from the driving system. Test drivers with an even subject number will start the evaluation by driving on the highway track, while test drivers with an odd subject number will start driving on the city track.
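The assignment rule and one of the metrics described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the one-second sampling period and the assumption that the track order of the first two journeys is repeated in the last two journeys are hypothetical and not taken from Table 1.

```python
def journey_plan(subject_number):
    """Return four journeys as (track, driving_system_on) tuples.

    Even subject numbers start on the highway track, odd ones on the
    city track. Assumption (not confirmed by the text): the same track
    order is reused for the two journeys with the driving system.
    """
    if subject_number % 2 == 0:
        tracks = ["highway", "city"]
    else:
        tracks = ["city", "highway"]
    baseline = [(track, False) for track in tracks]    # journeys 1-2
    with_system = [(track, True) for track in tracks]  # journeys 3-4
    return baseline + with_system


def seconds_over_limit(speeds_kmh, limit_kmh, sample_period_s=1.0):
    """Sum the time of all speed samples that exceed the speed limit.

    Assumes speed samples recorded at a fixed (hypothetical) period.
    """
    return sum(sample_period_s for v in speeds_kmh if v > limit_kmh)


if __name__ == "__main__":
    print(journey_plan(2))
    print(seconds_over_limit([48, 52, 55, 50], limit_kmh=50))
```

Keeping the plan as plain data makes it easy to check that every test driver drives each track once with and once without the driving system.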
5 Conclusion and Further Work
In this paper the evaluation of an energy-efficiency and safety relevant driving system on a driving simulator was introduced. To this end, the driving system was introduced and explained. Furthermore, the driving simulator was explained and the different tracks that will be used in the evaluation were shown. The connection between the prototype of the driving system and the driving simulator was explained as well. Besides the presentation of the driving system and the driving simulator, the evaluation plan was introduced, which shows the order of the test drives. To measure the effect of the driving system on the driving behaviour, the metrics driven distance, fuel consumption and the time during which the speed limit was exceeded will be used. The effect of the driving system's adaptive feature on user acceptance will be measured using the USE questionnaire.
Future work comprises carrying out the evaluation using the driving simulator and the prototype of the driving system on the basis of the evaluation plan. Furthermore, the results of the evaluation will be analysed and another evaluation will be done using a real car in order to verify the results of the driving simulator evaluation in a real environment. On the basis of the experiences with the driving simulator gathered during the evaluation, the driving simulator will be improved by adding more sensors to the virtual car and by improving the driving simulation.
Table 1: The evaluation plan to test if the driving system has an effect on the driving behaviour
References
[Chin and Quek, 1997] Hoong-Chor Chin and Ser-Tong
Quek. Measurement of Traffic Conflicts. Safety Science,
26(3):169–185, 1997.
[Fiat, 2010] Fiat. Eco-driving uncovered: The benefits and
challenges of eco-driving, based on the first study using
real journey data, 2010.
[Haworth and Symmons, 2001] Narelle Haworth and Mark
Symmons. Driving to reduce fuel consumption and improve road safety. Proceedings Road Safety Research,
Policing and Education Conference, 2001.
[Helms et al., 2010] Hinrich Helms, Udo Lambrecht, and
Jan Hanusch. Energieeffizienz im Verkehr. Energieeffizienz, pages 309–329, 2010.
[Lotan and Toledo, 2006] Tsippy Lotan and Tomer Toledo.
An In-Vehicle Data Recorder for Evaluation of Driving
Behavior and Safety. Transportation Research Board of
the National Academies, 2006.
[Lund, 2001] Arnold M. Lund. Measuring Usability with
the USE Questionnaire. Usability and User Experience,
8(2), 2001.
Available at http://web.archive.org/web/20141206144115/http://www.stcsig.org/usability/newsletter/0110_measuring_with_use.html
Last visit 13.06.2015.
[Magana and Organero, 2011] Victor Corcoba Magana and
Mario Munoz Organero. Artemisa, Using an Android device as an Eco-Driving assistant. Cyber Journals: Multidisciplinary Journals in Science and Technology, Journal
of Selected Areas in Mechatronics (JMTC), 2011.
[Mierlo et al., 2004] Joeri Van Mierlo, Gaston Maggetto,
Erik van de Burgwal, and Raymond Gense. Driving style
and traffic measures - influence on vehicle emissions and
fuel consumption. Proceedings / Institution of Mechanical
Engineers, 218, Part D: J. Automobile Engineering:43–50,
2004.
[Xiaoqiu et al., 2011] Fan Xiaoqiu, Ji Jinzhang, and Zhang
Guoqiang. Impact of Driving Behavior on the traffic safety
of Highway Intersection. Third Int. Conf. on Measuring
Technology and Mechatronics, 2:370–373, 2011.
Towards emotion pattern extraction with the help of
stress detection techniques in order to enable a healthy life
Wilhelm D. Scherz1, Juan Antonio Ortega2 and Ralf Seepold1
1
HTWG Konstanz
Faculty of Computer Science
Brauneggerstr. 55, 78462 Konstanz (Germany)
[email protected]
2
Department of Computer Language and Systems
Avda. Reina Mercedes s/n, 41012 Sevilla (Spain)
[email protected]
Abstract
Emotions are manifold, and each occurrence depends on several factors: on objective situations of stress, happiness, sadness, etc., and on individual perceptions and attitudes when a relevant situation occurs. The objective of this work is to discuss how emotion patterns can be extracted with the help of sensors that are able to detect emotions. As a first step towards emotion detection, a small prototype system has been developed to detect stress with the help of a mobile and wearable ECG.
Keywords: emotion pattern, stress, sensor, ECG.
1 Introduction
Besides several emotions that influence our life and quality of life, stress is recognized as a factor with negative impact. Of course, stress may have a positive aspect when it helps us to get out of dangerous situations, because the body gathers its forces, e.g. to run away from a dangerous place. Short-term stress is a natural response. Stress appeared as a mechanism that allows people and animals to react fast and effectively in dangerous situations. Stress triggers biological mechanisms that reorganise the body's priorities and functions and try to reach maximum performance when there is danger. This is called the ‘fight or flight response’ [1].
Moving towards a more ordinary scenario, stress is a negative sensation that is recognized as a disease [2] by organisations like the World Health Organisation (WHO). A well-known consequence of persistent stress is the failure to respond adequately to physical, mental and emotional demands [3, 4, 5].
Stress also has consequences for modern society: long-term high stress levels lead to many diseases like burnout or cardiac infarction [6] [7]. As a consequence for society, the number of people who will face limitations is increasing, and this leads to growing treatment and healing costs for people suffering from long-term stress.
Nowadays, stress is the result of exposure to high demands and pressure in daily life that can be both mental and physical [8], e.g. a constant demand for decisions or constant time pressure. The constant presence of stress causes a variety of disorders; symptoms of an overabundance of stress include fatigue, sleep problems, etc. [9]
Stress can be artificially induced in a laboratory environment using different methods like the Trier test [10] or the Stroop test [11].
The effects of stress have not changed over time, but the lifestyle, technologies and habits of modern life have changed a lot. Figure 1 illustrates some symptoms of stress: in case of a threat, as shown in Figure 1, the body prepares itself to flee or to confront the threat. In this case, the brain releases the hormones cortisol and adrenaline. This aims to reduce the functionality of systems not necessary for immediate survival, like the genitourinary system, digestion, hearing, peripheral vision, etc. In addition, the activity of systems supporting fight-or-flight strategies increases, for example the heart rate rises and the pupils dilate.
Figure 1 Symptoms and physical response of stress.
Stress is very often underestimated because of its subjective perception. This is one of the main reasons that complicate its detection. Some people show symptoms of stress immediately, while others do not notice when they pass the threshold from just ‘being busy’ to an objectively high stress level [12].
Because of the subjective perception of stress, it is important to develop methods to determine stress objectively and, if possible, to find a method to reduce the stress or support a better management of stress. In summary, a person’s emotion is influenced by many factors, and stress is one factor that influences our behavior in a quite objective way. The purpose of this work is to make a first step towards the detection of emotion patterns with the focus on stress. Therefore, a small model has been developed to detect a stress pattern. In order to evaluate this model, a prototype has been developed and tested. In the following chapter, the state of the art with respect to stress detection is reported. Chapter 3 presents the system architecture developed, and chapter 4 introduces the method to detect stress patterns. Chapter 5 interprets the results of the measurement. Chapter 6 concludes this work, and finally, an outlook on future activities is given.
2 State of the art
As mentioned in the introduction, a first step towards emotion recognition will be taken with the help of stress detection. Therefore, this state of the art concentrates on stress detection. Monitoring of stress indicators is common, but usually only to capture the physical characteristic of a single indicator, for example the heart rate, without correlating the parameters. In most cases, there is no direct feedback to the user. Furthermore, these systems work like a black box, in a way that is not transparent to the user. After tracking and storing the data, data processing and interpretation is done offline. Finally, a diagnosis is reported to the user. In parallel, a shift can be observed from highly professional systems to semi-/non-professional systems that support medical recommendation and monitoring. Generally, the systems can be divided into three categories:
• Approaches without sensors
• Approaches requiring a laboratory environment
• Approaches with sensors
The first group covers approaches that do not require sensors. These approaches analyse small differences in behavior between stressed and non-stressed persons. Examples can be found in [13, 14], in which the way of typing while being stressed is monitored. The disadvantage of these approaches is their dependency on the context and the complexity of adapting them to multiple environments. They are often context based rather than human based, because they focus on the context and not on the person.
The second group covers tests that are carried out in controlled laboratory environments. The results they provide are usually accurate and precise. A frequently used method is the measurement of stress hormones like cortisol, adrenaline and others that are released and measurable in saliva and blood. By measuring these indicators, stress can be determined [15]. The main drawback of these approaches is the limited mobility and the lack of real-time detection. In addition, the method is invasive and expensive due to the need for laboratory equipment.
The third group uses external biological sensors, as in [16]. An example of these approaches is the measurement of stress while driving. The driver is monitored with an electrocardiogram (ECG), an electromyogram (EMG) that records the electrical activity of muscles, a skin conductivity (SC) sensor, a breath sensor and a video camera that observes the driver. The main disadvantage is the restricted freedom of movement. A second disadvantage is the missing online analysis of the data while it is collected. In this case, the driver does not obtain any kind of feedback about his current status.
In contrast to the mentioned state-of-the-art approaches, the new system uses a compact low-cost ECG that is wearable, non-invasive and real-time capable. The developed system can report the current status to the user via a simple user interface, and while the data is visualised it can also be buffered and stored for later analysis by professionals or for validation. This work is also based on previous work on stress measurement [3, 5] [17].
3 System architecture
The system architecture should reflect the possibility to connect a sensor to a lightweight mobile unit with pre-processing, storage and communication capability. The focus in this architecture is on ECG signal extraction, because accurate stress detection should be possible by capturing only this biological parameter. One of the advantages of an ECG sensor is that it will not interfere with other sensors, because the ECG sensor only measures the propagated electric field of the heart. In a second step, it should be flexible and open to wearing two ECG sensors without interference. At a future point in time, more parameters should be tracked with the system.
For the first prototype, we used a smartphone as the communication platform for providing connectivity to external servers. The general system architecture is shown in Figure 2.
Figure 2 System architecture for collecting, pre-processing and visualizing of biological data.
The ECG e-Health kit module is in charge of continuous recording and measuring of biological data; the pre-processing module is in charge of receiving, processing and storing the signal; and the user interface module is in charge of giving feedback to the user.
The ECG module computes the electrical impulses obtained from the electrodes. In this case, the e-Health kit module is an ECG module with three electrodes. The ECG module generates the ECG wave as an analogue output to the pre-processing module. Currently the microcontroller is only used for digitising (A/D conversion of) the analogue ECG wave and for pre-processing and filtering the data. We use the digital signal for calculating the heart rate (HR) and the RR interval. The RR interval is defined as the interval between two R peaks, as shown in Figure 3.
Figure 3 Definition of an RR interval.
The difference between an RR interval and the HR is that the HR is calculated by counting and approximating the number of heart beats per minute, whereas the RR interval indicates the duration between two heart beats. The RR interval also changes from one beat to the next.
The stress is later determined using the RR interval and the HR. With the help of a visualization device like a smartphone or a prototyping board, the user can then be informed about his current status. In previous prototypes a traffic light interface has been used. The advantage of this interface is that the information is easy to understand, and it can be realized as part of the board or via a small app on the smartphone.
4 Stress detection method
The method used for detecting stress is based on the ECG signal. There are many different possibilities to capture biological parameters, but the ECG is easier to capture than other biological parameters and is less subject to external influences. Based on the system architecture proposed before, the prototype is able to directly access and process the sensor data in real time. The ECG, also called EKG, describes the electrical characteristic of the heart and is generally used for diagnostics [18]. Another use of ECG data is the identification of persons using unique data that can be found in the ECG signal [19].
Figure 3 shows the RR interval. The RR interval is defined as the time from the R peak of one QRS complex to the next R peak, i.e. the time between two consecutive R waves in an ECG (1). We use the RR interval or the HR to calculate the heart rate variability (HRV). The HRV is later used to determine the stress. The HRV correlates strongly with the respiratory sinus arrhythmia, as shown in [20].
$RR_{interval} = R_i - R_{i-1}$, (1)
We calculate the HRV by examining the relation between two consecutive heart beats. We can assume that the HRV stays constant (constant with respect to the respiration) when a person is not stressed, and that the value changes more strongly when a person is stressed.
The ECG values used in our system range from 0 to 350 mV; the x axis is in msec. To reduce the resources needed for detecting the R peaks and for calculating the RR interval, we defined a threshold of 250 mV. This means we only consider values greater than 250 mV.
$R_i > 250\,\mathrm{mV}$, (2)
The criterion for a peak is that R has to be a local maximum (3). After a peak has been detected we wait for 100 msec before we start scanning for R peaks again.
$R_{i-1} < R_i \wedge R_i > R_{i+1}$, (3)
As mentioned before, the RR interval is defined as the time between two consecutive R waves in an ECG. After the RR intervals are calculated, we correlate them and compute the HRV. If the results are visualised, we obtain a two-dimensional space with the correlation plot (see Figure 4). The values for the plot are defined as
$(X, Y) = \{RR_i, RR_{i+1}\}$, (4)
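The detection steps of equations (1) to (4) can be sketched in Python as follows. This is a minimal illustration and not the code running on the prototype: the function names, the fixed sampling period and the list-based interface are assumptions made for the example, while the 250 mV threshold and the 100 msec waiting time are the values stated in the text.

```python
def detect_r_peaks(samples_mv, sample_period_ms,
                   threshold_mv=250.0, refractory_ms=100.0):
    """Return the times (in msec) of the detected R peaks.

    samples_mv is assumed to be a list of ECG amplitudes taken at a
    fixed sampling period (hypothetical interface).
    """
    peak_times = []
    last_peak = None
    for i in range(1, len(samples_mv) - 1):
        t = i * sample_period_ms
        # After a detected peak, wait 100 msec before scanning again.
        if last_peak is not None and t - last_peak < refractory_ms:
            continue
        r = samples_mv[i]
        above_threshold = r > threshold_mv                    # equation (2)
        local_maximum = (samples_mv[i - 1] < r
                         and r > samples_mv[i + 1])           # equation (3)
        if above_threshold and local_maximum:
            peak_times.append(t)
            last_peak = t
    return peak_times


def rr_intervals(peak_times_ms):
    """Time between consecutive R peaks, equation (1)."""
    return [b - a for a, b in zip(peak_times_ms, peak_times_ms[1:])]


def correlation_points(rr):
    """Points (RR_i, RR_{i+1}) of the correlation plot, equation (4)."""
    return list(zip(rr, rr[1:]))


if __name__ == "__main__":
    # Synthetic signal sampled every 10 msec, for illustration only.
    signal = [0.0] * 200
    signal[10], signal[90], signal[175] = 300.0, 310.0, 305.0
    peaks = detect_r_peaks(signal, sample_period_ms=10)
    print(peaks, rr_intervals(peaks), correlation_points(rr_intervals(peaks)))
```

Because only samples above the threshold are compared with their neighbours, the loop needs no buffering beyond the previous and next sample, which fits the stated goal of keeping the resource usage low.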
Figure 4 shows a peak in the centre of the plot (800 msec for the RR interval). In this case the diagram indicates a lower stress level. If the values were spread more widely, this would indicate that the person is under stress.
Figure 4 Correlation plot of RR intervals that visualizes the HRV and their self-similarity.
Figure 5 Candidate left (yellow) with high HR and no stress. Candidate right (blue) with low HR but stress.
The variability can be expressed as a product of the deviations $SD_1$ and $SD_2$.
$SD_{1,2} = \sqrt{var(x_{1,2})} \rightarrow x_{1,2} = \frac{x_i \pm x_{i+1}}{\sqrt{2}}$, (5)
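As a hedged illustration of equation (5), the following Python sketch computes the two deviations from a list of RR intervals. It assumes the square-root form commonly used for correlation (Poincaré) plot descriptors and the population variance; the function name and the sample values are hypothetical and are not measurements from this work.

```python
import math
from statistics import pvariance


def poincare_sd(rr):
    """Return (SD1, SD2) for a sequence of RR intervals.

    Assumption: x1 = (RR_i - RR_{i+1}) / sqrt(2) and
    x2 = (RR_i + RR_{i+1}) / sqrt(2), with SD = sqrt(var(x)).
    """
    pairs = list(zip(rr, rr[1:]))
    if len(pairs) < 2:
        raise ValueError("at least three RR intervals are required")
    x1 = [(a - b) / math.sqrt(2) for a, b in pairs]
    x2 = [(a + b) / math.sqrt(2) for a, b in pairs]
    return math.sqrt(pvariance(x1)), math.sqrt(pvariance(x2))


if __name__ == "__main__":
    # Illustrative values in msec, not data from the experiment.
    narrow = [800, 805, 798, 802, 800]
    wide = [700, 900, 650, 950, 720]
    for name, rr in (("narrow", narrow), ("wide", wide)):
        sd1, sd2 = poincare_sd(rr)
        print(name, sd1 * sd2)
```

A widely spread point cloud yields a larger product of the two deviations than a tightly clustered one, which is the property the correlation plot is read for in this approach.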
5 Prototype and first results
5.1 Prototype
According to the system architecture described in chapter 3, the prototype has been implemented (Figure 1). As e-Health kit module, an ECG module has been used in which three electrodes are connected to the body of a person (as in a traditional ECG). This kit is a customized development, but in principle it works like any other ECG. The analogue output of the module is used as input for the second module: the pre-processing module. This module has several core tasks:
• Reception of input data stream
• Pre-processing/pattern detection
• Storing of data
• Pushing of data to an external server
The current implementation has been realized on an Arduino UNO R3 prototyping board. The pattern recognition algorithm has been implemented on this board. For storage purposes a micro SD module has been added. The pushing of data has been implemented via different channels: USB, Bluetooth and via a smartphone (see Figure 6).
Figure 6 Arduino UNO with ECG module.
5.2 Stress benchmarking
Currently, we are using two methods for inducing stress in the volunteers. Firstly, we used the Trier Social Stress Test [10], because this method for inducing stress can be easily replicated in our installations. Secondly, we used a driving simulator to create stress while driving a competition.
The Trier Social Stress Test was carried out in three phases (anticipation period, presentation period and cool-down period), each lasting 5 minutes. During the test, the volunteer has to prepare and give a short presentation on a random topic.
In the experiment, the driving simulator induces stress using two different mechanisms. A reward system (points) is introduced that rewards the volunteer for fast and complex driving manoeuvres. The difficulty level increases over time. The consequence of this system is that a fast driver usually makes more complicated manoeuvres to earn more points, but at the same time increases the risk of losing all points earned in case of a mistake.
5.3 Results of the experiment
All candidates were volunteer students aged between 23 and 28; none of them were smokers or alcoholics, and none had a pacemaker. The methods used in this work assume that the candidates have not suffered from cardiac problems or mental anomalies.
The data obtained from the experiment has been processed and visualised using the presented approach. The results of two volunteers with special characteristics are shown below. The first volunteer has a lower heart rate but is under constant stress, and the second volunteer has a higher heart rate but is not under stress. Both volunteers were analysed for the same duration of time. The data of both candidates is visualised in Figure 5.
The right dataset clearly shows that the values in blue are spread more widely than the values of the candidate visualised in yellow (left). The spreading of the blue values is caused by the stress (see Figure 1). Stress influences the heart rate, and as a result the variation between two heartbeats becomes bigger.
The main values of the blue volunteer are between 0.7 and 0.9 sec for the RR interval (the heart rate is between 66.7 bpm and 85.7 bpm). The main values of the yellow volunteer are between 0.45 sec and 0.6 sec (130 bpm and 85 bpm).
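The relation between an RR interval and the heart rate used for the blue volunteer can be written as a one-line conversion. The sketch below assumes the usual relation HR = 60 / RR, with the RR interval given in seconds; the function name is hypothetical.

```python
def rr_to_bpm(rr_seconds):
    """Convert an RR interval in seconds to a heart rate in beats per minute."""
    if rr_seconds <= 0:
        raise ValueError("the RR interval must be positive")
    return 60.0 / rr_seconds


if __name__ == "__main__":
    # Bounds of the RR range reported for the blue volunteer.
    for rr in (0.7, 0.9):
        print(rr, round(rr_to_bpm(rr), 1))
```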
6 Conclusion and future work
The work presented is a system that is still under development, but it shows a system architecture and an algorithm for calculating stress based on the RR intervals and HRV. The system is designed to be lightweight and to be integrated into mobile, cheap and easy-to-use devices like a smartphone or other mobile (embedded) devices. The algorithm was developed to have a small footprint so that it can be ported to small prototyping boards like Arduino.
The results of the experiment show that the detection of stress is successful even under different conditions, like a naturally low or high heart rate.
7 Future work
Some of the main problems of the developed system are its sensitivity to movements, for example during sports, and the sensitivity of the wires to interference from high-frequency sources, for example power lines. These problems can generate strong interference and artefacts in the signal. Possible solutions are adding more filtering or changing to another sensor that does not have these limitations. As an alternative to the ECG approach a pulse oximeter could be used, but this option still has to be tested.
An open point in this approach is how the verification of the results and the reduction of false positives can be realized.
Another question that has to be answered is whether the person has an influence on the results/approach and, if so, how this can be managed. Is calibration of the sensor a solution to the problem?
References
[1] A. S. P. Jansen, X. V. Nguyen, V. Karpitskiy, T. C. Mettenleiter and A. D. Loewy, “Central command neurons of the sympathetic nervous system: basis of the fight-or-flight response,” in Science, vol. 270, 1995, pp. 644-646.
[2] WHO, “Cross-national comparisons of the prevalences and correlates of mental disorders,” Bulletin of the World Health Organization, pp. 413-426, 2000.
[3] J. Martínez Fernández, J. C. Augusto, R. Seepold and N. Martínez Madrid, “A Sensor Technology Survey for a Stress Aware Trading Process,” IEEE Trans. on Systems, Man and Cybernetics Part C: Applications and Reviews, vol. 42, no. 6, pp. 809-824, 11 2012.
[4] J. Martínez Fernández, J. C. Augusto, R. Seepold and N. Martínez Madrid, Why Traders Need Ambient Intelligence, Germany: Springer Berlin Heidelberg, 2010.
[5] J. Martínez Fernández, J. C. Augusto, G. Trombino, R. Seepold and N. Martínez Madrid, “Self-Aware Trader: A New Approach to Safer Trading,” in Journal of Universal Computer Science, 2013.
[6] K. Orth-Gomér, S. Wamala, M. Horsten, K. Schenck-Gustafsson, N. Schneiderman and M. Mittleman, “Marital stress worsens prognosis in women with coronary heart disease: The Stockholm Female Coronary Risk Study,” in Journal of the American Medical Association, 2000.
[7] Y. Mei, M. D. Thompson, R. A. Cohen and X. Tong, “Autophagy and oxidative stress in cardiovascular diseases,” in Biochimica et Biophysica Acta (BBA) - Molecular Basis of Disease, 2015.
[8] T. Kidd, L. A. Carvalho and A. Steptoe, “The relationship between cortisol responses to laboratory stress and cortisol profiles in daily life,” Biological Psychology, pp. 34-40, 25 02 2014.
[9] T. Åkerstedt, J. Axelsson, M. Lekander, N. Orsini and G. Kecklund, “Do sleep, stress, and illness explain daily variations in fatigue?,” Journal of Psychosomatic Research, pp. 280-285, 20 01 2014.
[10] C. Kirschbaum, K.-M. Pirke and D. H. Hellhammer, “The 'Trier Social Stress Test' - A Tool for Investigating Psychobiological Stress Responses in a Laboratory Setting,” Neuropsychobiology, pp. 78-81, 1993.
[11] J. R. Stroop, “Studies of interference in serial verbal reactions,” Journal of Experimental Psychology, pp. 643-662, 1935.
[12] N. Martinez Madrid, J. Martinez Fernandes, R. Seepold and J. C. Augusto, “Ambient assisted living (AAL) and smart homes,” Springer Series on Chemical Sensors and Biosensors, Volume 13, pp. 39-71, 2013.
[13] S. D. Gunawardhane, P. M. De Silva, D. S. Kulathunga and S. M. Arunatileka, “Non Invasive Human Stress Detection Using Key Stroke Dynamics and Pattern Variations,” in International Conference on Advances in ICT for Emerging Regions (ICTer), Colombo, 2013.
[14] L. Vizer, L. Zhou and A. Sears, “Automated stress detection using keystroke and linguistic features: an exploratory study,” Int. J. of Human-Computer Studies, vol. 67, no. 10, pp. 870-886, 2009.
[15] J. Hellhammer and M. Schubert, “The physiological response to Trier Social Stress Test relates to subjective measures of stress during but not before or after the test,” Psychoneuroendocrinology, Volume 37, Issue 1, 2012.
[16] J. A. Healey and R. W. Picard, “Detecting Stress During Real-World Driving Tasks Using Physiological Sensors,” in Intelligent Transportation Systems, IEEE Transactions on (Volume: 6, Issue: 2), 2005.
[17] W. D. Scherz, J. A. Ortega, N. M. Madrid and R. Seepold, Heart Rate Variability indicating Stress visualized by Correlations Plots, vol. 9044, Granada: Springer International Publishing, 2015, pp. 710-719.
[18] D. Dubin, Rapid Interpretation of EKG’s, Tampa, Florida: COVER Publishing Co., 2000.
[19] S. A. Israel, J. M. Irvine, A. Cheng, M. D. Wiederhold and B. K. Wiederhold, “ECG to identify individuals,” Pattern Recognition, pp. 133-142, 21 05 2004.
[20] J. A. Hirsch and B. Bishop, “Respiratory sinus arrhythmia in humans: how breathing pattern modulates heart rate,” in American Journal of Physiology - Heart and Circulatory Physiology, New York, 1981.
[21] D. Leger, “The cost of sleep-related accidents: A report for the National Commission on Sleep Disorders Research,” Sleep, pp. 84–93, 17 May 1995.
[22] J. de Miguel-Díez, P. Carrasco-Garrido, R. Jiménez-García, L. Puente-Maestu, V. Hernández-Barrera and A. López de Andrés, “Obstructive sleep apnea among hospitalized patients in Spain, analysis of hospital discharge data 2008–2012,” Sleep and Breathing, 8 January 2015.
[23] T. Akerstedt, A. Knutsson, P. Westerholm, T. Theorell, L. Alfredsson and G. Kecklund, “Sleep disturbances, work stress and work hours,” Journal of Psychosomatic Research, pp. 741-748, 5 March 2001.
SmartDriver: An assistant for reducing stress and improve the fuel consumption
V. Corcoba Magaña and M. Muñoz Organero
Dpto. de Ingeniería Telemática
Universidad Carlos III
[email protected]
Abstract
Stress, safety and fuel consumption are strongly related variables. If the stress is high, the driver is more likely to make mistakes and have accidents. In addition, he or she will make decisions at short notice. Acceleration and deceleration increase, so less of the energy generated by the engine is put to use. However, the stress can be reduced if we provide information about the environment in advance. In this paper, we propose a driving assistant which issues tips to the driver in order to reduce the stress level. These tips are based on speed. The solution estimates the optimal average speed for each road section. In addition, the solution provides a slowdown profile when the user is close to a stress area. The objective is that the initial vehicle speed minimizes the stress level and sharp accelerations (positive and negative). In addition, the system employs gamification tools to encourage the driver to follow the recommendations. Furthermore, the proposal provides information about the driver and the road state in an anonymous way in order to improve the management of the city traffic. The proposal runs on an Android device, and the driver stress is estimated using non-intrusive sensors and telemetry from the vehicle.
1 Introduction
Most traffic accidents are due to human error. In [National Highway Traffic Safety Administration, 2008] the risk factors of traffic accidents are categorized as follows: human factors (92%), vehicle factors (2.6%), road/environmental factors (2.6%), and others (2.8%). Fatigue and stress are the cause of many accidents.
Stress can be defined as a change from a calm state to an excited state in order to preserve the integrity of the person. Most stressors are intellectual, emotional, and perceptual. There are two types of stress: eustress and distress. Eustress is a good stress that improves performance and motivates; stress is also classified as “eustress” when it leads us to a favorable state. In contrast, if the stress is negative and causes a degradation of performance, it is called “distress”. This type of stress is due to an increase in the workload. This increase may have multiple causes, such as a deceleration lane, a call, traffic density, etc.
There are many proposals to detect stress and measure the workload [Healy and Picard, 2000] [Healy and Picard, 2005]. Most of them are based on physiological features such as electromyogram, electrocardiogram, respiration, and skin conductance. In [Healy and Picard, 2000] the authors propose to use pattern recognition techniques to detect stress in automobile drivers. They employed four physiological sensor signals: electrocardiogram, electromyogram, galvanic skin response, and respiration through chest cavity expansion. They were able to detect stress with a success rate of 86.6% (combining all the features). The success rate was 62.2% using a single sensor. [Zhai and Barreto, 2006] presented an algorithm based on support vector machines to classify the affective state as “stress” or “relaxed”. They monitored users with the following non-invasive and non-intrusive sensors: galvanic skin response, blood volume pulse, pupil diameter and skin temperature. In this case the success rate reached 90.10%. [Rani et al., 2002] proposed a real-time method for stress detection based on heart rate variability (HRV) using Fourier and wavelet analysis. [Ji et al., 2004] presented a probabilistic model for detecting fatigue based on visual characteristics such as eyelid movement, gaze movement, head movement, and facial expression. This method was extended to detect “Nervous” and “Confused” affective states. This work demonstrates the suitability of Bayesian networks for information fusion and estimation of the stress level.
The main problem in this research topic is that there are no solutions to reduce driver stress. The majority of researchers focus on detecting the stress level of the driver. Moreover, most of the proposals are only validated through simulators. In this paper, we propose a system that makes recommendations in order to reduce the driving workload. These tips are constructed taking into account:
• Driver habits
• Vehicle Telemetry
• Road Information
• Driver Profile
Fig. 1. Schema of the driving assistant (layers: Data Layer, Pre-Processing Layer, Processing Layer, Feedback).
2 SmartDriver Assistant
2.1 Data Acquisition System
We need a lot of information in order to make recommendations that will improve safety and fuel consumption. This information can be classified into three groups: driver features, life style tracking, and driving tracking.
There is currently a large number of sensors (fixed and mobile) which allow us to monitor all user activity. The information captured is employed to build recommendations that reduce the pollution caused by vehicles and improve safety.
Figure 1 shows a schema of the proposal. The objective of the recommendations is smooth driving. This type of driving improves safety and reduces stress because the driver has more time to make decisions. In this case, the risk of accidents decreases. In addition, fuel consumption and the emission of gaseous pollutants also improve because the power produced by the engine is put to use. Next, we describe the elements of the solution.
Driver features
Driver features affect both fuel consumption and safety. In the previous
section, we saw that fatigue and stress appear gradually and are strongly related to the driver.
Age, gender, driving experience and personality are
factors that determine the onset of fatigue and stress. In
previous research studies, many authors have highlighted that
older drivers experience fatigue sooner than young drivers.
Women tend to stop driving when they feel
sleepy, whereas men tend to keep driving. Personality is another
influential element. Extroverted people are more likely to
fall asleep, while uninhibited people make more
mistakes while driving even when they are not drowsy, and their
fuel consumption is higher. In our case, we take
into account the driving style (aggressive, normal or efficient). The driver profile can be obtained objectively by analyzing the vehicle telemetry.
Life Tracking
Driver activity affects his or her stress level, fatigue and
health. Drivers also frequently adopt a less fuel-efficient driving style
when they have
not rested enough. The proposed solution monitors the following variables:
• Sleeping time: Many research works conclude that
sleep is strongly related to traffic accidents and tiredness. If the driver has not slept
enough hours, the solution must adapt the recommendations to maximize safety, for example by warning
the user in advance when he or she is approaching a
crossroads.
• Awakened time: Sleep-related factors such as sleep
deficit, sleep fragmentation and sleep apnea also increase accident risk. In [Young et al., 1997], the authors studied the effect of sleep-disordered breathing on
the probability of accidents and concluded
that people with apnea are more likely to have
accidents.
• Working time: If the driver sits for many hours at
the same location, it could indicate that he or she is
at the workplace. This is another variable related to
fatigue. From the point of view of safety, the trip from home to work differs from
the return trip.
• Acceleration and deceleration: Acceleration
(positive and negative) may indicate the presence of
stress or fatigue. A sudden acceleration
may mean that the driver wants to reach the destination early, while a sharp slowdown may mean that he
or she had fallen asleep.
• Standard deviation of vehicle speed: Driving at constant speed as much as possible has a positive effect on fuel
consumption. In this situation, there is no acceleration resistance force, so the tractive force required to move the vehicle is lower. On the other
hand, if the driver has to change speed constantly, the probability of making driving mistakes
increases, as does the stress level.
• Positive Kinetic Energy (PKE): It measures the aggressiveness of driving and depends on the frequency
and intensity of positive accelerations [Firth and
Cenek, 2012]. A low value means that the driver is
not stressed and drives smoothly. An unusually high
value may indicate that the driver is in an area
that requires special attention, such as acceleration
lanes or roundabouts. It is calculated using the following equation:
\[ PKE = \frac{\sum (v_i - v_{i-1})^2}{d} \qquad (1) \]
where v is the vehicle speed (m/s) and d is the trip
distance (meters) between v_i and v_{i-1}.
Driving Tracking
The parameters that are monitored can be classified into three
groups: driver, vehicle, and environment.
Driver:
• Heart Rate Variability (HRV): Stress, fatigue and sleepiness have a great impact on the autonomic nervous
system (ANS). HRV signals are employed as an indicator
of ANS neuropathy for normal, fatigued and drowsy
states because the ANS is influenced by the sympathetic and parasympathetic nervous
systems. This indicator is not intrusive. A high variability means the driver is stressed, whereas a
low variability may mean that the driver is tired or
asleep.
• Skin Temperature: It changes due to the activity of
the central and peripheral nervous systems. Emotional arousal affects the sweat
glands. If the driver is relaxed, sweating decreases.
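The PKE metric of equation 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `positive_kinetic_energy` and its parameters are hypothetical, and counting only speed increases is an assumption based on the statement that PKE depends on positive accelerations (equation 1 itself does not state that condition).

```python
def positive_kinetic_energy(speeds_mps, distance_m):
    """Sketch of Eq. (1): sum of squared speed changes divided by trip distance.

    speeds_mps: consecutive speed samples in m/s (hypothetical input format).
    distance_m: trip distance in meters covered by those samples.
    Assumption: only speed increases (positive accelerations) are counted.
    """
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    total = 0.0
    # Pair each sample with its predecessor: (v_{i-1}, v_i)
    for previous, current in zip(speeds_mps, speeds_mps[1:]):
        if current > previous:
            total += (current - previous) ** 2
    return total / distance_m


if __name__ == "__main__":
    smooth = positive_kinetic_energy([10.0, 10.5, 11.0, 11.0], 100.0)
    aggressive = positive_kinetic_energy([10.0, 14.0, 9.0, 15.0], 100.0)
    print(smooth, aggressive)  # the aggressive trace yields the larger value
```

A trace with frequent, strong accelerations produces a larger value than a smooth one over the same distance, which is the property the assistant relies on.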
• Time: Fuel consumption increases at rush hour because
the driver has to accelerate and slow down more
frequently and the engine runs
for longer. This situation causes stress, which increases the risk of accidents. On the other hand, night
driving increases the likelihood of falling asleep even if
the driver has previously slept, because of the
sleep cycle.
• Traffic state: When traffic is heavy, the stress level
increases. In these cases, the tips must be adapted
to avoid acceleration and deceleration.
• Weather conditions: The number of vehicles on the
road grows when the weather is bad, increasing the likelihood of traffic incidents. Moreover, the rolling resistance coefficient changes, so the advice has to take that factor into account. In addition, many studies highlight that
fatigue appears sooner when it is hot, whereas the cognitive
capacity of the driver worsens when it is cold.
• Traffic signs: A great number of traffic
signs force the user to stop or slow down
and therefore cause stress. In addition, visibility conditions are sometimes
poor and the driver
cannot make decisions in advance. The result is an
increase in fuel consumption due to sharp slowdowns, and even traffic accidents.
• Slope: Energy demand depends on this variable. If
the road slopes upward, the tractive force required to
move the vehicle increases. Conversely, a
downward slope acts as a force
that helps to move the vehicle.
Devices
We employ the following devices to monitor the driver and
the vehicle:
Mobile Device: Current mobile devices have a large number
of sensors that allow us to obtain information about the user
and the environment. In this work, we use the camera to detect traffic signs and to take a picture when the
vehicle is located in a stress area. GPS is
used to obtain the vehicle location and the vehicle telemetry:
speed, acceleration, deceleration, the percentage of time driving at constant speed, etc. The Internet connection is used to obtain information about the road state and weather
conditions. It also allows the device to send data (most stressed users,
traffic incidents and unusual telemetry values) that
third parties can use to improve traffic management.
Microsoft Band: This wearable is cross-platform and provides an SDK for Windows Phone, Android, and iOS. It allows us to estimate the stress level using multiple sensors:
• Optical heart rate monitor
• Three-axis accelerometer
• Gyrometer
• GPS
• Microphone
• Ambient light sensor
• Galvanic skin response sensors
• UV sensor
• Skin temperature sensor
• Capacitive sensor
2.2 Retrieving Driving Samples
The degree of stress is not the same for all users, even if they
are driving under equal conditions. As mentioned above,
a multitude of factors, such as age, gender, and driving experience, affect stress. We have to take all these factors into account
to improve the accuracy of the recommendations. Tips have to be personalized for each driver. For example, an
older driver feels more stress than a young one when driving at
high speed. In this case, we propose to build a cluster considering the following driver features and the road state:
• Age
• Driving experience
• Profile of the driver (Aggressive | Normal | Efficient)
• Gender (Male | Female)
• Weather
• Traffic
2.3 Stress region
On the road there are areas that increase the driver workload. The causes can be curves, traffic signs, roundabouts,
crossings, acceleration lanes, departure lanes and road topology. In this section, we propose an algorithm to reduce the
stress level in this type of region. The proposal estimates an
optimal slowdown pattern when drivers are approaching a
stress area. The objective is that the driver enters the
stress region driving at a speed that minimizes workload.
Prediction of driving workload
A multi-layer perceptron (MLP) [Rumelhart et al., 1985] is
employed to predict the driver workload. The prediction is based
on the results obtained by other drivers with similar characteristics. This algorithm is an artificial neural network that
has multiple layers and whose main advantage is that it can solve
problems that are not linearly separable. Neural networks were proposed in the 1940s, when Warren McCulloch (a psychiatrist
and neuroanatomist) and Walter Pitts (a mathematician) explored the computational capabilities of networks made of
very simple neurons [McCulloch et al., 1943]. Later, in
1958, Rosenblatt introduced the perceptron, the
simplest form of a neural network [Rosenblatt, 1958]. The perceptron consists of
a single neuron with adjustable synaptic weights and a threshold activation function. Convergence was guaranteed only if the problem was linearly separable, because
the perceptron separates its inputs into
two output classes.
The basic MLP structure consists of an input layer, an output
layer and one or more hidden layers. The number of layers
determines the kind of problem that can be solved. The single-layer
perceptron is limited to calculating a single line of separation between classes, whereas a three-layer
perceptron can produce arbitrarily shaped decision regions (Kolmogorov theorem) and is capable of separating
any classes. Each layer has a set of neurons, whose number
depends on the type of problem to be solved. The
neurons are connected to other neurons by weighted
connections. Figure 2 shows the neural network used to estimate the driving workload. Figure 3 shows a schema of the
solution. The proposal estimates the optimal initial speed
based on the previous driver behavior using the neural network. A deceleration profile is then calculated if the driver
must slow down.
Fig. 2. Multilayer Perceptron to estimate the heart rate
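The difference between a single-layer perceptron and a multi-layer one can be illustrated with the XOR problem, which is not linearly separable. The sketch below is only an illustration under stated assumptions: the names `step`, `forward` and `XOR_LAYERS` are hypothetical, the weights are chosen by hand rather than trained, and the network is unrelated to the one in Fig. 2, which is trained on driving samples.

```python
def step(value):
    """Threshold activation: fires (1) when the weighted sum is positive."""
    return 1 if value > 0 else 0


def forward(layers, inputs):
    """Propagate inputs through a list of (weights, biases) layers."""
    activations = list(inputs)
    for weights, biases in layers:
        activations = [
            step(sum(w * a for w, a in zip(row, activations)) + bias)
            for row, bias in zip(weights, biases)
        ]
    return activations


# Hand-picked weights (an assumption for illustration, not trained values).
# Hidden neuron 1 computes OR, hidden neuron 2 computes AND;
# the output neuron fires when OR is true and AND is false, i.e. XOR.
XOR_LAYERS = [
    ([[1.0, 1.0], [1.0, 1.0]], [-0.5, -1.5]),  # hidden layer
    ([[1.0, -1.0]], [-0.5]),                   # output layer
]

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, forward(XOR_LAYERS, [a, b])[0])
```

No single threshold neuron can reproduce this truth table, because one line cannot separate the two classes. The hidden layer is what makes the separation possible.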
Fig. 3. Proposal to reduce driver stress in stress areas. [Figure labels: Driving Behavior, Stress Region, Optimal Initial Speed, Vehicle Speed, Deceleration Profile.]
2.4 Optimal Average Speed
Speed is closely related to the attention demanded
from the driver. When the speed is high, the driver has less
time to make decisions. This increases the stress
level and the likelihood of traffic accidents. In this paper we
propose an algorithm to estimate the optimal speed.
On the roads, we can find road signs that recommend a
speed. These recommendations are based only on the road topology (slope,
angle, and road width). Their aim is to prevent the vehicle
from leaving the road. However, the optimal speed from the
point of view of stress and safety is a dynamic value. It
changes depending on age, gender, physical activity, level of
fatigue, driver skill, traffic and weather conditions, etc.
Particle Swarm Optimization (PSO) is used to estimate the
average speed for each road section. It was proposed by Kennedy and Eberhart in 1995 [Kennedy and Eberhart, 1995]. It
is an evolutionary algorithm based on the social behavior of
bird flocks. The PSO algorithm maintains multiple potential
solutions at one time. Each solution is represented by a particle in the search space. A particle has the following elements:
• Position: In our case, the recommended speed.
• Pbest: The best position found so far by the current particle
(the speed that minimizes the heart rate).
• Gbest: The best position among all particles.
• Speed: It is calculated using equation 2 and determines
the next position of the particle.
\[ v_i(t+1) = w\,v_i(t) + c_1 r_1 \left(Pbest(t) - x_i(t)\right) + c_2 r_2 \left(Gbest(t) - x_i(t)\right) \qquad (2) \]
where v_i(t) is the particle's velocity at time t, w is the inertia
weight, x_i(t) is the particle's position at time t, Pbest(t) is the
particle's individual best solution as of time t, Gbest(t) is
the swarm's best solution as of time t, c_1 and c_2 are two positive constants, and r_1 and r_2 are random values in the range
[0, 1].
The particles "fly" or "swarm" through the search space to
find the minimum value. During each iteration of the algorithm, they are evaluated by an objective function to determine their fitness. The next position is calculated by equation 3:
\[ x_i(t+1) = x_i(t) + v_i(t+1) \qquad (3) \]
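Equations 2 and 3 can be sketched as a one-dimensional PSO loop in which each particle's position is a candidate speed. This is a minimal sketch under assumptions, not the authors' implementation: the function names, the parameter values (w, c1, c2, swarm size, iteration count), the speed bounds and the clamping step are illustrative choices, and `toy_heart_rate` is a made-up stand-in for the trained fitness function, which in the paper is a neural network.

```python
import random


def pso_minimize(fitness, lower, upper, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize a 1-D fitness function with a basic particle swarm."""
    rng = random.Random(seed)
    x = [rng.uniform(lower, upper) for _ in range(n_particles)]  # positions
    v = [0.0] * n_particles                                      # velocities
    pbest = list(x)
    pbest_val = [fitness(p) for p in x]
    best_index = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[best_index], pbest_val[best_index]

    for _ in range(n_iter):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            # Eq. (2): inertia + pull toward Pbest + pull toward Gbest
            v[i] = w * v[i] + c1 * r1 * (pbest[i] - x[i]) + c2 * r2 * (gbest - x[i])
            # Eq. (3): move the particle; clamping to the bounds is an added assumption
            x[i] = min(max(x[i] + v[i], lower), upper)
            value = fitness(x[i])
            if value < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i], value
                if value < gbest_val:
                    gbest, gbest_val = x[i], value
    return gbest, gbest_val


def toy_heart_rate(speed_kmh):
    """Hypothetical fitness: heart rate is lowest at 80 km/h (made-up numbers)."""
    return 60.0 + 0.01 * (speed_kmh - 80.0) ** 2


if __name__ == "__main__":
    speed, heart_rate = pso_minimize(toy_heart_rate, 30.0, 120.0)
    print(round(speed, 1), round(heart_rate, 1))
```

With a real fitness function the swarm would be evaluated on predictions for the current driver and road section. The loop itself would stay the same.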
In equation 3, x_i(t) is the particle's current position and v_i(t + 1) is
its new velocity.
The definition of the fitness function is very important in
this type of algorithm. In our case it is a multi-layer perceptron (MLP). The input variables are the following: average
speed, number of high accelerations (positive and negative),
standard deviation of vehicle speed, positive kinetic energy,
sleeping time, awakened time, working time, and temperature. The challenge is to minimize the stress level. Therefore,
the output is the heart rate. The fitness function is trained using
driving samples obtained by other drivers under similar conditions.
2.5 Providing feedback
As we saw in previous works, it is important to encourage
and motivate the driver so that he or she follows the recommendations of the assistant. We employ gamification techniques
to achieve this goal. Gamification can be defined as the use of game design elements in non-game contexts such as learning environments. The idea is to use concepts from games, such as challenge, competitiveness and
progression, to motivate the user to improve his or her
driving style. In our case, the system provides the following
feedback when the driver completes the trip:
Driving style ranking: The position is obtained using a fuzzy
logic system. This system evaluates the driving from the
point of view of safety and energy saving. The output depends on variables such as acceleration, deceleration, positive kinetic energy, etc. If the user drives smoothly, he or
she will get a high score. Smooth driving has a positive influence
on both fuel consumption and safety. We employ fuzzy
logic because it allows us to simulate human knowledge
when carrying out certain tasks such as driving. The objective
is to model the behavior of an efficient and safe driver.
Safe Driving Ranking: The position in this ranking depends
on the degree of compliance with the safety recommendations
provided by the assistant.
Greenhouse Gases and Fuel Consumption: The driver can compare the result with the values obtained in other similar journeys (road type, weather, and traffic) made in the last month.
Badges: They are an important element for encouraging many people
to save fuel and to drive safely.
Achievements are a traditional gamification method used to
accomplish a certain behavior or to compare the performance
of users. Achievements do not normally imply monetary
compensation; they are based on an emotional reward. Reaching
a pre-set objective has a positive impact on the user and is based on the concept of "status". For this reason,
we have incorporated the following achievements into our
smart-driving assistant:
1. Complete a lap without accelerating sharply
2. Complete a lap without decelerating sharply
3. Complete a lap with a low value of PKE (Positive
Kinetic Energy)
4. Complete a lap following the deceleration pattern
provided by the assistant
5. Complete a lap driving at the recommended average speed
6. Get 5 points in the Driving Style/Safe Ranking
7. Get 7 points in the Driving Style/Safe Ranking
8. Get 10 points in the Driving Style/Safe Ranking
9. Position in the top 5 in the Driving Style Ranking
10. Position in the top 5 in the Safe Driving Ranking
11. First position in the Driving Style Ranking
12. First position in the Safe Driving Ranking
2.6 Information provided by the assistant
The driving assistant described in this paper is one of the elements of the HERMES project [HERMES, 2015]. The objective of this project is to integrate different agents and infrastructure elements of a Smart City in a cooperative and
massive system that optimizes urban movements, minimizes
the emission of pollutant gases, maximizes the well-being of
the citizens and offers new business opportunities in the
Smart City.
In this context, the vehicles act as mobile sensors that collect data and send them to a central system. This central system labels the data semantically and analyzes them using artificial intelligence algorithms. The results are recommendations to improve the management of the city. For example, it
is possible to predict the traffic or infer which is the most polluted area of the city. In addition, the HERMES project allows access to these information flows in real time, following
the philosophy of Linked Open Data. The aim is to build an
ecosystem for business development in the Smart City.
SmartDriver assistant provides the following information:
Traffic incidents: Their number has increased in recent years due
to the rapid growth of the metropolitan population and of the
number of vehicles in circulation. A traffic incident is defined
as a non-recurrent and unpredictable event that interrupts the
normal flow of traffic by reducing the capacity of the road
[Srinivasan et al., 2003]. Traffic incidents include events such
as accidents, disabled vehicles, bad weather conditions, rock
falls, road works, and malfunctioning traffic signs. The solution detects traffic accidents by taking into account the usual
driving profile of the driver and the real-time telemetry.
Traffic signs: In regions where there are many traffic signs,
the stress level and fuel consumption increase. If the driver
knows about them in advance, he or she can adjust the speed, avoiding sudden slowdowns. As a result, the energy generated
by the engine is used to the full and driving mistakes are reduced. In the literature there are many methods to detect traffic signs. In our case, we use an algorithm with three stages:
• Shape: We look for geometric figures that coincide with traffic signs.
• Color detection: We choose the shapes that contain
colors present in traffic signs.
• Viola and Jones: This algorithm is applied to the
shapes selected in the previous step.
Stress Region: There are road sections where driving is more
difficult due to visibility problems (curves), the topology of
the road (roundabouts) or the maneuvers that the driver has to make
(deceleration lanes, acceleration lanes). Data Envelopment
Analysis (DEA) [Charnes et al., 1985] is used to estimate the efficiency of driving and the stress level in each area. DEA is a linear programming methodology for estimating the efficiency of multiple decision-making
units (DMUs) when the production process presents a structure of multiple inputs and outputs. This method was proposed by Charnes, Cooper, and Rhodes [Charnes et al., 1978]. In our proposal,
each DMU represents a different driving sample obtained
under similar conditions (time, weather, traffic, sleeping
time, and road type) by the same driver. The aim is to detect
the road points where the driver workload is high and fuel
consumption increases. This algorithm compares different regions by assigning them an efficiency value. A low value
indicates a stress area.
Unusual driving values: The driving assistant presented in
this paper continuously monitors the driver's behavior.
When it detects an exceptional value, it immediately
sends it to a central system (HERMES). These values can be
sudden accelerations (positive and negative), high speed,
standard deviation of vehicle speed, heart rate, etc. This information can be used to improve traffic management and
safety in a Smart City.
Acknowledgments
The research leading to these results has received funding
from the "HERMES-SMART DRIVER" project TIN2013-46801-C4-2-R within the Spanish "Plan Nacional de I+D+I"
under the Spanish Ministerio de Economía y Competitividad
and from the Spanish Ministerio de Economía y Competitividad funded projects (co-financed by the Fondo Europeo
de Desarrollo Regional (FEDER)) IRENE (PT-2012-1036-370000), COMINN (IPT-2012-0883-430000) and REMEDISS (IPT-2012-0882-430000) within the INNPACTO program.
References
[National Highway Traffic Safety Administration, 2008] National Highway Traffic Safety Administration. National motor vehicle crash causation survey: Report to Congress. National Highway Traffic Safety Administration, Washington, DC, DOT HS 811 059, 2008.
[Healy and Picard, 2000] J. Healy and R. Picard. Smartcar: detecting driver stress. 3rd Int'l Workshop on Nonlinear Dynamics and Synchronization (INDS), 2000.
[Healy and Picard, 2005] J. Healy and R. Picard. Detecting Stress During Real-World Driving Tasks Using Physiological Sensors. IEEE Transactions on Intelligent Transportation Systems, vol. 6, no. 3, 2005.
[Zhai and Barreto, 2006] A. Zhai and A. Barreto. Stress Detection in Computer Users Based on Digital Signal Processing of Noninvasive Physiological Variables. 28th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS '06), pp. 1355-1358, Aug. 30-Sept. 3, 2006. doi: 10.1109/IEMBS.2006.259421
[Rani et al., 2002] P. Rani, J. Sims, R. Brackin, and M. Sarkar. Online stress detection using psychophysiological signals for implicit human-robot cooperation. Robotica, vol. 20, no. 6, pp. 673-685, Nov. 2002.
[Ji et al., 2004] Q. Ji, Z. Zhu, and P. Lan. Real-time nonintrusive monitoring and prediction of driver fatigue. IEEE Trans. Veh. Technol., vol. 53, no. 4, pp. 1052-1068, Jul. 2004.
[Young et al., 1997] T. Young, J. Blustein, L. Finn and M. Palta. Sleep-disordered breathing and motor vehicle accidents in a population-based sample of employed adults. Sleep, vol. 20, pp. 608-613, 1997.
[Firth and Cenek, 2012] William Frith and Peter Cenek. AA Research: Standard Metrics for Transport and Driver Safety and Fuel Economy. Opus International Consultants Central Laboratories, November 2012.
[Rumelhart et al., 1985] David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams. Learning internal representations by error propagation. No. ICS-8506. California, 1985.
[McCulloch et al., 1943] W. S. McCulloch and W. H. Pitts. A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics, 5:115-133, 1943.
[Rosenblatt, 1958] F. Rosenblatt. The perceptron: a probabilistic model for information storage and organization in the brain. Psychological Review, 65(6):386, 1958.
[Widrow and Hoff, 1960] B. Widrow and M. E. Hoff. Adaptive switching circuits. WESCON Conv. Rec., pt. 4, pp. 96-140, 1960.
[Kennedy and Eberhart, 1995] J. Kennedy and R. Eberhart. Particle swarm optimization. Proceedings of the IEEE International Conference on Neural Networks, vol. 4, pp. 1942-1948, Nov/Dec 1995. doi: 10.1109/ICNN.1995.488968
[HERMES, 2015] HERMES project. URL: http://madeirasic.us.es/hermes. Last access: June 2015
[Srinivasan et al., 2003] D. Srinivasan, L. Wee Hoon, and R. L. Cheu. Traffic incident detection using particle swarm optimization. Proceedings of the 2003 IEEE Swarm Intelligence Symposium (SIS '03), pp. 144-151, 24-26 April 2003.
[Charnes et al., 1985] A. Charnes, W. W. Cooper, B. Golany, L. Seiford, and J. Stutz. Foundations of data envelopment analysis for Pareto-Koopmans efficient empirical production functions. Journal of Econometrics, vol. 30, no. 1-2, pp. 91-107, October-November 1985. ISSN 0304-4076.
[Charnes et al., 1978] A. Charnes, W. W. Cooper, and E. Rhodes. Measuring the Efficiency of Decision Making Units. European Journal of Operational Research, vol. 2, pp. 429-444, 1978.
52
Platform for Managing Citizen Information in a Smart City
Jorge Yago Fernández1, Álvaro Arcos García1, Juan Antonio Álvarez-García1, Juan Antonio Ortega Ramírez1, Jesús Torres1, Jesús Arias Fisteus2, Víctor Córcoba Magaña2, Mario Muñoz Organero2, Luis Sánchez Fernández2
1 Depto. Lenguajes y Sistemas Informáticos, Avenida Reina Mercedes S/N, 41012
2 Universidad Carlos III de Madrid, Depto. Ingeniería Telemática, Avda. de la Universidad, 30, 28911
1 {jorgeyago,aarcos,jaalvarez,jortega,jtorres}@us.es
2 {jaf,vcorcoba,mario.munoz,luiss}@it.uc3m.es
Abstract
The growth of the population in urban areas and an
increasingly sedentary pace of life are a growing concern. On the other hand, technological advances in sensors and communication networks
make it possible to obtain, practically in real time, a great deal of information that could not
be known before.
This work in progress uses these advances
to collect data from the inhabitants of an urban
area on a web platform where, in the future,
professionals will be able to obtain anonymous data,
analyze them and provide health patterns based
on them, giving the system the ability
to create action plans, both common and personalized to the citizens' profiles, in order to
improve their quality of life.
1 Introduction
According to the United Nations, cities around the world
are increasingly populated, and the trend is for this growth to
continue [1]. In cities, opportunities multiply and all kinds of needs can be met quickly. The problem arises when the city's
infrastructure cannot support population growth at the same pace and citizens' daily lives
become more stressful, which degrades their health
and increases the risk of accidents [2].
On the other hand, in cities, citizens have
all kinds of services available practically without leaving
home. For example, there are many
food delivery services that do not require the customer
to travel to the premises. City dwellers are
increasingly sedentary, which increases the likelihood
of cardiovascular and coronary disease [3].
The aim of this work is to develop a platform
that is not only useful to citizens but also serves
as a basis for later studies on the collected data, so that the data can be analyzed to make
better decisions, anticipate problems in order to solve them proactively, and coordinate resources to
act efficiently. Although there are currently
platforms capable of analyzing collected data to
suggest activities that improve our health (Zenobase1,
Microsoft Health2 or Google Fit3), the aim of this one is,
in addition, to provide anonymous data to professionals who
can analyze them and draw conclusions, with the possibility of including those conclusions in a knowledge base of the platform
itself so that it can generate recommendations
automatically. A similar platform, although with a different conception, is Apple Research Kit4: Apple
allows access to the data of a given application, whereas
our platform allows access to all the information
uploaded by applications and devices.
To collect the data, we use wearable
devices such as smart bands, smartwatches or chest
straps, which are becoming quite widespread and are
not very expensive. We also use mobile phones, which offer another set of capabilities that can
complement the data collected by the
wearables. These devices are an ideal means of
reaching citizens and creating early alerts to
prevent health problems.
1 https://zenobase.com/
2 https://www.microsoft.com/microsoft-health
3 https://fit.google.com/
4 https://www.apple.com/researchkit/
2
de los ciudadanos, para que dejen sus vehículos propios a
favor del uso del transporte público.
En nuestro trabajo, el sistema analizaría las rutas y los
horarios de los usuarios, en sus trayectos de casa al trabajo y
viceversa, para proponer posibilidades de uso de transporte
público o incluso tramos que pudiera hacer a pie, para aumentar su actividad física. Adicionalmente, estas recomendaciones iniciales, se relacionarían con información obtenida de fuentes de datos externas, como por ejemplo, la meteorológica, para que las sugerencias sean más adecuadas y
sean mejor aceptadas por el usuario.
Estado del arte
Son muchas las ciudades que ya están analizando los datos que pueden ser recogidos desde diversos tipos de sensores para evaluarlos y aplicar técnicas de minería de datos
para obtener conclusiones objetivas. Por ejemplo, en el
trabajo de Boulos [4] toma Barcelona como ejemplo, exponiendo la utilidad de la información recogida por sensores
conectados a Internet, para monitorizar el nivel de polución,
combinado con el uso de una aplicación para smartphone,
que llevan los ciudadanos (algunos de ellos), para monitorizar el nivel de ruido y de esta forma poder crear alertas de
salud ambiental.
En nuestro estudio, se relaciona la información recogida
por sensores que llevará el ciudadano en los trayectos de ida
y vuelta al trabajo en su vehículo, así como los datos recogidos de actividad física y sueño en su día a día, para centrarse en analizar tres de los problemas más generales que
afectan a las grandes ciudades:
2.3 Salud
Los puntos anteriores están relacionados con la salud de
los ciudadanos, pero aparte de éstos, la vida en las ciudades
hace que también aumente el sedentarismo, lo que incrementa el riesgo de padecer enfermedades cardiovasculares.
En este estudio, el sistema propuesto analizaría la actividad física recogida por los sensores, complementándola con
los patrones históricos de actividad del ciudadano e incluso
datos de fuentes externas, para sugerir hábitos más saludables.
2.1 Tráfico y riesgo de accidente
Es uno de los campos de estudio más extenso, debido al
incesante aumento de vehículos en las ciudades. El objetivo
global es optimizar los desplazamientos del conjunto de la
sociedad dentro de una ciudad, para reducir el impacto medioambiental, así como reducir al mínimo los riesgos de
accidente.
One of the factors that most affects accident risk is lack of attention caused by poor sleep quality or too few hours of sleep, as Horne explains in his review of how sleep affects driving [5].
Another factor that affects the risk of an accident is stress while driving, although it depends on many aspects, both external and personal. In this study we will analyse heart rate on the journeys to and from work, since there is a direct relationship between stress and heart rate [6].
2.2 Energy and environmental factors
Population growth leads to more journeys, which in turn generate more pollution. Many pollutants can be found in the air of large cities but, according to analyses such as Mayer's [7], vehicle traffic is one of the largest sources of toxic substances, and the relationship between air pollution and an increased risk of cardiovascular disease or even lung cancer has been demonstrated [8].
One of the most obvious solutions is to promote public transport over the private car in order to reduce pollution levels. The study by Beirão [12] analyses the factors that encourage and discourage people from using public transport: a dual commitment is needed between improving public transport infrastructure and raising awareness
3 Development
A platform is being developed to analyse the data collected by sensors worn by a group of users, in order to infer their behaviour patterns and thus define an automatic system able to make suggestions that improve quality of life, both for the individual and for society as a whole. So that recommendations have a scientific basis, the platform will be given a connector through which medical staff and health professionals can query the data anonymously, along with a way for them to record the conclusions they reach from the analysed data, so that the system can apply those conclusions to future data sets.
To achieve this objective, the system must be developed in several stages:
• In a first stage, enough data must be collected from citizens to be analysed and to extract their behaviour patterns. The data will be obtained from sensors carried by the citizens.
• In a second stage, while data continue to be collected from citizens, a team of health professionals will analyse them and gradually provide the system with a knowledge base, so that it can present personalised automatic recommendations that help the citizen lead a healthier life.
• In a third stage, these recommendations will be combined with data sources external to the citizen, such as weather data, traffic, pollution levels, and public transport timetables and availability, to make recommendations that are more precise and more relevant to the user.
Because the platform can act as a supplier of anonymised information, it can be used by professional teams from other research disciplines to carry out studies with the collected data and produce, in the same way, new results and conclusions that could also be fed into the system.
Collecting data from citizens
After analysing several types of physical devices, as well as the APIs that manufacturers provide for retrieving the data, two types of devices have initially been chosen:
• Fitbit wristbands [13]
• Mobile phones running the Android operating system
The advantage of sensor wristbands is that they are light, do not interfere with daily activity and, unlike mobile phones, are still worn once the user gets home. The most basic models can collect physical activity data, such as the number of steps taken or the intensity of exercise, as well as hours and quality of sleep. More advanced models can also detect heart rate and even geographic location.
Android phones were chosen because Android is the most widespread mobile operating system on the market, with the aim of reaching a larger number of citizens.
In addition, every measurement collected by the devices that citizens carry comes with information that adds value to the data: the context in which it occurs. Systems, and the way they interact with people, must adapt to the person's circumstances and environment [9]. This contextual information allows more appropriate conclusions to be drawn: for example, a high heart rate reading on the way to work on a rainy day does not mean the same as one recorded during an afternoon of sport in a park.
Consequently, in order to make the most appropriate recommendations to the citizen at each moment, this contextual information must be taken into account. Consider a possible scenario. Suppose that the system, based on the data collected by the sensors over a period of time, concludes that the citizen has been leading a very sedentary lifestyle lately. Analysing the current day's sleep parameters, it finds that the citizen has rested properly. It then analyses the routes the citizen usually follows from home to work and from work to home, the times at which each journey usually starts, and the means of transport used. That day, the system detects that the citizen starts the journey home somewhat earlier than usual and used public transport to get to work, so it prepares to suggest walking part of the route home to improve the citizen's lifestyle. Before issuing the recommendation on the citizen's mobile device, it obtains and analyses temperature and weather data for the area where the citizen is, so that if it is raining, is about to rain or is very cold, it does not propose that alternative.
Types of data collected
For this study, the following types of data will be collected from the wristbands:
• Physical activity
• Heart rate
• Hours of sleep
The following types of data will also be collected from the mobile devices:
• GPS location
• Accelerometry
The data will be sent periodically to the web platform, where they will be analysed automatically to establish relationships based on medical criteria and, once a sufficiently large data set is available, to define recommendations that improve the citizen's habits.
Later, these data will be analysed and compared with other external data sources, such as traffic data from the Google Maps API⁵, weather data from AEMET⁶ or air quality data from AirNow⁷, to name a few examples.
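As a rough illustration of how external data could gate a recommendation, such as suggesting that the citizen walk part of the way home, the following sketch combines inferred habits with weather readings. It is only a sketch under stated assumptions: the `Context` fields, the `suggest_walk` function and the temperature threshold are hypothetical and do not come from the platform described here.

```python
from dataclasses import dataclass


@dataclass
class Context:
    """Hypothetical snapshot of what the platform knows at decision time."""
    sedentary: bool        # inferred from recent physical activity data
    slept_well: bool       # inferred from the current day's sleep data
    raining: bool          # raining now or rain expected, from a weather source
    temperature_c: float   # outdoor temperature, from a weather source


def suggest_walk(ctx: Context, min_temp_c: float = 8.0) -> bool:
    """Return True if a 'walk part of the route home' suggestion should be shown.

    min_temp_c is an illustrative threshold, not a value from the paper.
    """
    # Only suggest extra exercise to a rested citizen with a sedentary pattern.
    if not (ctx.sedentary and ctx.slept_well):
        return False
    # External context can veto the suggestion.
    if ctx.raining or ctx.temperature_c < min_temp_c:
        return False
    return True


mild_day = Context(sedentary=True, slept_well=True, raining=False, temperature_c=18.0)
rainy_day = Context(sedentary=True, slept_well=True, raining=True, temperature_c=18.0)
```

With these definitions, `suggest_walk(mild_day)` would allow the suggestion and `suggest_walk(rainy_day)` would suppress it; in a real system the rules would come from the knowledge base built by health professionals rather than from fixed conditions.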
Sending data
The data gathered by the different sensors will be transmitted using the Ztreamy⁸ information exchange system, which provides an interface for publishing and consuming the data streams generated by the sensors. Contextual information can be added to these streams, which makes it possible to build semantic web sensors.
Ztreamy [10] supports filtering and searching on the semantic information in the received data frames, so that the receiver can locate a specific item without having to process all the information.
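The idea of selecting items by their contextual annotations can be sketched in a few lines. The code below does not use the Ztreamy API; the event layout (a dictionary with a `context` entry) and the `filter_events` helper are assumptions made purely for illustration.

```python
from typing import Any, Dict, Iterable, Iterator

Event = Dict[str, Any]


def filter_events(events: Iterable[Event], **required: Any) -> Iterator[Event]:
    """Yield only the events whose context matches every required key/value."""
    for event in events:
        context = event.get("context", {})
        if all(context.get(key) == value for key, value in required.items()):
            yield event


# Hypothetical sensor readings annotated with contextual information.
events = [
    {"sensor": "heart_rate", "value": 72,
     "context": {"activity": "commute", "weather": "rain"}},
    {"sensor": "heart_rate", "value": 130,
     "context": {"activity": "sport", "weather": "clear"}},
    {"sensor": "heart_rate", "value": 95,
     "context": {"activity": "commute", "weather": "clear"}},
]

# Select commute readings without inspecting the payload of every event.
commute_values = [e["value"] for e in filter_events(events, activity="commute")]
```

A consumer interested only in commute readings on rainy days could call `filter_events(events, activity="commute", weather="rain")` in the same way.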
⁵ https://developers.google.com/maps/
⁶ http://www.aemet.es/es/serviciosclimaticos
⁷ http://airnowapi.org/webservices
⁸ http://www.ztreamy.org/
5 Conclusions
A unified system able to collect and analyse data from sensors carried by citizens and from sensors distributed throughout the environment, and to combine these data with other external sources, will make possible an intelligent person-city relationship in which the benefit is mutual.
The proposed system will be able to look after the citizen by knowing the set of parameters that define their habits, such as the journeys they make each day, the time they rest or the stress they suffer, together with the context in which these occur, and will thus be able to make recommendations that may improve their health.
Furthermore, the proposed system would be scalable: as new biometric sensors emerge and evolve, the data they collect can be added to the existing knowledge base in order to produce more precise suggestions for the citizen and to assess the health of the smart city's population as a whole.
Last but not least, the system could also be used as an information supplier for other platforms and for studies on improving quality of life in cities.
6 Acknowledgements
This work has been partially funded by the HERMES project of the Ministerio de Economía y Competitividad (TIN2013-46801-C4-1-r) and by the Junta de Andalucía excellence projects Simon (P11-TIC-8052) and ContextLearning (P11-TIC-7124).
References
1. United Nations, Department of Economic and Social Affairs, Population Division (2014). World Urbanization Prospects: The 2014 Revision, Highlights (ST/ESA/SER.A/352).
2. Hennessy, Dwight A., and David L. Wiesenthal. Traffic congestion, driver stress, and driver aggression. Aggressive behavior 25.6 (1999): 409-423.
3. Bernstein, Martine S., Alfredo Morabia, and Dorith Sloutskis. Definition and prevalence of sedentarism in an urban population. American Journal of Public Health 89.6 (1999): 862-867.
4. Boulos, Maged N Kamel, and Najeeb M Al-Shorbaji. On the Internet of Things, smart cities and the WHO Healthy Cities. International journal of health geographics 13.1 (2014): 10.
5. Horne, Jim, and Louise Reyner. Vehicle accidents related to sleep: a review. Occupational and environmental medicine 56.5 (1999): 289-294.
6. Simonson, Ernst, et al. Cardiovascular stress (electrocardiographic changes) produced by driving an automobile. American Heart Journal 75.1 (1968): 125-135.
7. Mayer, Helmut. Air pollution in cities. Atmospheric environment 33.24 (1999): 4029-4037.
8. Pope III, C. Arden, et al. Lung cancer, cardiopulmonary mortality, and long-term exposure to fine particulate air pollution. Jama 287.9 (2002): 1132-1141.
9. Abowd, Gregory D., et al. Towards a better understanding of context and context-awareness. Handheld and ubiquitous computing. Springer Berlin Heidelberg, 1999.
10. Fisteus, Jesus Arias et al. Ztreamy: A middleware for publishing semantic streams on the Web. Web Semantics: Science, Services and Agents on the World Wide Web 25 (2014): 16-23.
11. Bernstein, Martine S., Alfredo Morabia, and Dorith Sloutskis. Definition and prevalence of sedentarism in an urban population. American Journal of Public Health 89.6 (1999): 862-867.
12. Beirão, Gabriela, and JA Sarsfield Cabral. Understanding attitudes towards public transport and private car: A qualitative study. Transport policy 14.6 (2007): 478-489.
13. Takacs, Judit, et al. Validation of the Fitbit One activity monitor device during treadmill walking. Journal of Science and Medicine in Sport 17.5 (2014): 496-500.
Traffic sign recognition system for a Smart City
Álvaro Arcos García¹, Mario Soilán², Belén Riveiro², Juan Antonio Álvarez-García¹, Jorge Yago Fernández¹, Juan Antonio Ortega¹, Pedro Arias-Sánchez²
² Universidad de Vigo, Departamento de Ingeniería de los Recursos Naturales y Medio Ambiente, Torrecedeira, 86, 36208
¹ {aarcos,jaalvarez,jorgeyago,jortega}@us.es
² {msoilan, belenriveiro, parias}@uvigo.es
Abstract
Traffic sign recognition is highly relevant to transport systems, and its importance will grow in overpopulated Smart Cities. This work in progress presents the techniques used to detect and classify Spanish traffic signs. Although the results are promising, we face the problem that no dataset of manually annotated and classified signs exists specifically for Spain.
1 Introduction
More than half of the world's population currently lives in urban areas, and this figure is expected to keep growing until it reaches 6.3 billion people in 2050 [1]. This growth poses new challenges for the sustainable development of cities, which will have to address critical issues such as the management of waste, traffic, resources, education, energy, water, health [2], unemployment or security [3,4]. To meet these challenges, collaborations have arisen between governments, companies and academic institutions to develop projects that study, design and build solutions to these difficulties using information and communication technologies (ICT) [5,6]. IBM Smart Planet and Smart Cities [7], Oracle iGovernment [8], Amsterdam Smart City [9], Dubai SmartCity [10] and European Smart Cities [11] are some of the leading Smart City projects.
A city is smart when investment in human and social capital and in communication infrastructure produces sustainable economic growth, a high quality of life and efficient management of natural resources, all through participatory government [12,13]. In this context, our work falls within the field of Smart Mobility, that is, improving citizens' mobility through advances in transport and in the systems that manage it. Specifically, we describe work on detecting and classifying traffic signs using computer vision techniques.
Traffic sign recognition (RST) is of great relevance and interest to the automotive market, the industrial sector and the local authorities in charge of sign maintenance. In addition, correct identification of signs from the vehicles that drive past them would make it possible to provide driving assistance [28,29].
Below we describe the state of the art in traffic sign recognition, then the development of our system, which is based on a LiDAR and RGB detector and a HOG+SVM classifier. Finally, we present the conclusions obtained.
2 State of the art
Traffic sign recognition systems (SRST) make it possible to develop intelligent applications that support a sustainable and efficient public and private transport system in cities. The main difficulty is that traffic signs show a wide range of differences between classes in terms of colour, shape and the presence of text or pictograms. SRST must cope with visual variations caused by changes in lighting, rotations, weather conditions, partial occlusion, etc. Moreover, the variety of traffic signs (classes) that must be distinguished is very wide and differs in each country (more than 200 types of signs in Spain [30], Germany [31] and Belgium [32]).
Relevant solutions to the traffic sign recognition (RST) problem are listed below in chronological order.
In 1997, De La Escalera [18] developed a system based on colours in RGB space, whose detections were passed to a neural-network classifier.
In 2005, Bahlmann [14] presented an integrated tracking, recognition and detection system specific to speed-limit signs. The classifier was trained with 4000 samples divided into 23 classes, with between 30 and 600 samples per class. To measure the performance of the classification component, they used a test set of 1700 traffic signs, on which they obtained 94% accuracy.
In 2007, Broggi [16] proposed using neural networks to classify different traffic signs. The appropriate neural network was selected from the shape and colour information extracted in the detection phase. Only qualitative results are shown in this proposal.
Also in 2007, Moutarde [15] proposed a system for recognising European and US speed-limit signs based on recognising individual digits with neural networks. The complete system, including detection and tracking, achieves 89% accuracy for US signs and 90% for European signs on 281 samples.
In the same year, Maldonado-Bascón [22] presented an automatic traffic sign detection and recognition system based on support vector machines (SVM). It segments colours in HSV space, classifies the shape of the signs with a linear SVM using feature vectors based on the distance to the borders and, finally, classifies the pictograms with a Gaussian-kernel SVM.
In 2008, Keller [17] published a digit-based classification system for speed-limit signs that achieved 92.4% accuracy using a set of 2880 training images and another of 1233 test samples.
In 2010, Bascón [19] developed an SVM-based classifier that achieves 95.5% accuracy. The database consists of approximately 36000 samples of Spanish traffic signs divided into 193 classes. This database is not publicly available.
The problem with these studies is that they cannot be compared with one another, because each uses a very different set of signs.
To address this, several databases have recently been published that contain images of traffic signs annotated and divided into classes, providing a generic, standardised set of images. The Belgium Traffic Sign Dataset (BelgiumTS) [23] contains more than 13000 annotated traffic sign images divided into 62 classes. BelgiumTS is split into two subsets, one for detection (BTSD) and one for classification (BTSC). The German Traffic Sign Detection Benchmark (GTSDB) [21] includes 600 training images and 300 evaluation images. Finally, the German Traffic Sign Recognition Benchmark (GTSRB) [20] comprises more than 50000 images divided into 43 classes. The publications with the best results on the GTSRB were briefly described by Stallkamp in 2011 [20].
Although most traffic sign recognition (RST) applications rely exclusively on image or video analysis, Mobile Mapping Systems (SMM) allow new approaches to the problem. An SMM comprises several sensors (GNSS/IMU, laser scanner, cameras, etc.) that are calibrated and referenced to one another in space and time. Mounting a laser scanner on a mobile system makes it possible to acquire geometric and radiometric data about the environment in large volumes and with high precision. New methodologies have recently been developed to detect and recognise different objects in urban environments automatically. Serna and Marcotegui [34] reviewed the methodologies that had been used to detect objects such as buildings, vehicles, pedestrians, cables, etc. by analysing point clouds obtained with SMM. They also propose their own method for classifying up to 20 different objects, based on a segmentation that uses elevation images and mathematical morphology, followed by object classification with SVM. More recently, Yang [35] developed a methodology for classifying buildings, street lamps and vehicles, among other objects. It achieves an overall efficiency above 92% with a methodology based on segmenting supervoxels generated from the attributes (position, colour, intensity) of the points in the cloud, followed by classification based on heuristic criteria.
As for traffic signs, González-Jorge [36] shows that laser scanner systems can capture the geometry of traffic signs from the intensity values of the laser beams reflected by points on the signs. These values are much higher than those of the surroundings, owing to the retroreflective properties of the paint used to manufacture traffic signs. This property opens the door to methodologies for the automatic segmentation of traffic signs. Pu [37] recognises different shapes (rectangle, circle, triangle) from point clouds obtained with an SMM.
3 Development
The system proposed within the national project HERMES is described below. The detection system developed by the Universidad de Vigo is described first, followed by the classification system developed jointly by Vigo and the Universidad de Sevilla.
3.1 Traffic sign detection
Traffic sign detection is performed on 3D point clouds captured by two LiDAR sensors mounted on the LYNX Mobile Mapper [38], the mobile mapping system of the Universidad de Vigo. The proposed method focuses on a segmentation based on the intensity values of the points in the cloud and follows three steps:
1) Ground removal. First, ground points and low-intensity points are removed. To do this, the 3D point cloud is rasterised: it is projected onto the XY plane and a grid is created on that plane to obtain a pixel structure. For the results presented in Section 4.1, the grid cells measure 20x20 cm in urban environments and 50x50 cm on roads. The attributes of the cloud points in each grid cell are used to create one image based on intensity ($I_{int}$) and another based on height ($I_{height}$):

$$I_{int}(x_r, y_r) = \frac{\sum_{i=1}^{n(x_r, y_r)} \mathrm{int}(p_i(x_r, y_r))}{n(x_r, y_r)}$$

$$I_{height}(x_r, y_r) = \sigma_z(x_r, y_r) \cdot \frac{\sum_{i=1}^{n(x_r, y_r)} \left[ z(p_i(x_r, y_r)) - \min(z(p(x_r, y_r))) \right]}{n(x_r, y_r)}$$

where $p(x_r, y_r)$, $\sigma_z(x_r, y_r)$ and $n(x_r, y_r)$ are, respectively, the set of points, the vertical variance and the number of points in raster cell $(x_r, y_r)$.
$I_{int}$ is an image in which traffic signs are represented by a few high-intensity pixels, while $I_{height}$ is an image in which the brightest pixels represent areas of the cloud with points that rise above the ground. The first is binarised using the mean intensity of the image as the threshold, considering only the pixels that correspond to raster cells containing at least one point. The second is binarised using $0.001 \cdot \max(I_{height})$ as the threshold. Finally, an AND operation is performed between the two images, and the points corresponding to the pixels left at 0 are removed from the cloud.
2) Selection of the optimal threshold value. Besides traffic signs, the points remaining in the cloud correspond to surfaces on façades, vehicles, billboards, licence plates, etc. To reduce the number of points, the cloud is first filtered on the basis of the retroreflective properties of the signs: all points whose intensity is below the mean of the cloud are removed, which significantly reduces the number of classes. A Gaussian Mixture Model (GMM) with two components is then fitted. The component with the higher mean defines the class that contains the traffic sign points. Within this class, two normal distributions could be distinguished, corresponding to sign points on either side of the road. Finally, the points in the GMM are grouped into two classes by maximising the posterior probability when classifying each point. The points belonging to the lower-intensity class are removed from the cloud.
3) Point clustering. Once the segmentation is done, the points must be organised into clusters to isolate each traffic sign. This is done with the DBSCAN algorithm [39], which groups the points of the cloud according to their local density. The algorithm also removes isolated points that may exist because of reflections or noise that saturated the LiDAR sensor.
The result of the segmentation is a set of individual point clouds that represent traffic signs and other reflective objects such as car licence plates. At this point, several geometry-based filters can be applied to discard false positives, given the known minimum and maximum dimensions of traffic signs.
The geometric information of the points of each sign can be used to determine, for example, its angle with respect to the vertical by means of a principal component analysis. Figure 1a shows the points corresponding to a sign, together with its angle and its location in UTM coordinates. The resolution of the points is clearly not sufficient to make out the specific meaning of the sign.
Finally, the points of each sign can be mapped to image coordinates of the SMM cameras. Each image is associated with a timestamp and a point on the trajectory, and the exterior orientation and the calibration parameters of the cameras are known. Moreover, if the same sign is captured in several images, the one in which the sign appears from the best viewpoint (without perspective deformation) and at a sufficient size can be chosen. Figure 1b shows the region that contains the 3D points mapped onto the image. The detection process thus becomes invariant to effects that do matter in images, such as shadows, colour variations or perspective.
Fig. 1. Traffic sign detection. (a) Point cloud of a detected sign, showing its angle with respect to the vertical and the location of its centroid. (b) The 3D points are mapped onto the corresponding image, isolating a region of interest that contains the sign.
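The ground-removal step above (rasterisation, the two raster images and the AND of their binarised versions) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the authors' Matlab implementation: points are assumed to be `(x, y, z, intensity)` tuples, the height image is taken as the product of the vertical variance and the mean height above the cell minimum, pixels strictly above each threshold are kept, and all function names are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean, pvariance


def rasterize(points, cell=0.2):
    """Group (x, y, z, intensity) points into XY grid cells of side `cell` metres."""
    cells = defaultdict(list)
    for x, y, z, intensity in points:
        key = (math.floor(x / cell), math.floor(y / cell))
        cells[key].append((z, intensity))
    return cells


def raster_images(cells):
    """Compute the intensity image and the height image for every occupied cell."""
    i_int, i_height = {}, {}
    for key, pts in cells.items():
        zs = [z for z, _ in pts]
        z_min = min(zs)
        # Mean intensity of the points in the cell.
        i_int[key] = mean([i for _, i in pts])
        # Vertical variance times mean height above the lowest point (assumed product).
        i_height[key] = pvariance(zs) * mean([z - z_min for z in zs])
    return i_int, i_height


def cells_to_keep(i_int, i_height):
    """AND of the two binarised images; returns the cells whose points are kept."""
    t_int = mean(list(i_int.values()))              # mean over occupied cells only
    t_height = 0.001 * max(i_height.values())
    return {k for k in i_int if i_int[k] > t_int and i_height[k] > t_height}


# Tiny synthetic cloud: flat ground, a bright elevated sign, a dull tall wall.
cloud = [
    (0.05, 0.05, 0.00, 10.0), (0.10, 0.10, 0.01, 12.0),
    (5.05, 5.05, 2.00, 200.0), (5.10, 5.10, 2.60, 220.0),
    (9.05, 9.05, 0.00, 20.0), (9.10, 9.10, 3.00, 30.0),
]
kept = cells_to_keep(*raster_images(rasterize(cloud, cell=0.2)))
```

On this synthetic cloud only the cell containing the bright, elevated points survives; the ground cell fails both tests and the wall cell fails the intensity test. A real implementation would use array operations over millions of points rather than dictionaries.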
3.2 Traffic sign classification
Traffic sign classification is a multiclass classification problem. Building on the results reported by Mathias in 2013 [23], our proposal consists of two phases: feature extraction and classification. Both the training and the performance testing of the final classifier were carried out using the GTSRB and BTSC datasets separately.
The first phase extracts the histograms of oriented gradients (HOG), described by Dalal in 2005 [24], from each image. Because traffic signs are designed to be recognised by humans regardless of colour, the most discriminative features are shape and inner design. The experiments used cropped greyscale images resized to 40x40 pixels, from which feature vectors of 2916 dimensions were computed. These vectors were computed for the BTSC training images but not for those of GTSRB, since the latter provides precomputed HOG features with the same vector dimension.
The second phase, classification, uses support vector machines (SVM) with a linear kernel, K(x, y) = x · y. The SVM type used is C-SVM, proposed by Boser in 1992 [25], and the classifiers were trained with the one-vs-rest technique, described in detail by Hsu in 2002 [33].
4 Experimentation
4.1 Traffic sign detection
The methodology for detecting traffic signs in point clouds was implemented in Matlab. The point clouds used cover both urban and interurban environments (roads and motorways) in Galicia, Portugal and Brazil, and contain between 36 and 190 million points.
The traffic signs were first annotated manually to validate the results of the methodology. The validation results are shown in Table 1. Performance is better in interurban environments, since in cities there are many objects that can be confused with signs, as well as a higher probability of occlusions and damaged signs.
              Completeness   Correctness
Interurban    93.58%         98.08%
Urban         88.37%         84.44%
Total         92.11%         93.96%
Table 1. Results for traffic sign detection.
False positives are caused mostly by reflective objects with a geometry similar to that of traffic signs, or by reflections on metallic objects that pass the geometric filters applied to the point clouds of sign panels. These objects can be discarded if they are correctly classified as negatives in the classification step.
As for false negatives, partial occlusions can cause the DBSCAN algorithm to fail when clustering the points of a sign. In addition, signs whose paint has lost its retroreflective properties will not be detected correctly. In an infrastructure inspection context, if a previously annotated sign is not detected, the cause may be this degradation of the material.
4.2 Traffic sign classification
The OpenCV library [26] was used in both classification phases. Its SVM implementation is based on LIBSVM [27]. Once a traffic sign classifier had been generated for each of the two datasets, the following results were obtained on the corresponding test image sets:
• BTSC: 95.83% accuracy.
• GTSRB: 94.49% accuracy.
Finally, a classification experiment was run on 1478 images of Spanish traffic signs provided by the Universidad de Vigo. This dataset is not annotated and bears no relation to the classes defined in BTSC and GTSRB, so we cannot determine the classifier's performance quantitatively. Moreover, the slight variations in colour, shape and pictogram between the signs of different countries, together with the absence of certain classes from the datasets, lower the classifier's performance. Figure 1 shows some results.
Fig. 1. From left to right: the sample Spanish traffic sign image, its classification in BTSC and its classification in GTSRB.
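The 2916-dimensional HOG feature vector reported for 40x40 images can be checked with the usual descriptor-length arithmetic. The text does not state the HOG parameters, so the values below (8x8-pixel blocks, 4-pixel block stride, 4x4-pixel cells, 9 orientation bins) are an assumption: they are one configuration that yields exactly 2916 values, and the helper `hog_length` is hypothetical.

```python
def hog_length(window: int, block: int, stride: int, cell: int, bins: int) -> int:
    """Length of a HOG descriptor for a square window.

    window, block, stride and cell are side lengths in pixels;
    bins is the number of orientation bins per cell histogram.
    """
    blocks_per_side = (window - block) // stride + 1   # sliding block positions
    cells_per_block = (block // cell) ** 2             # cells inside one block
    return blocks_per_side ** 2 * cells_per_block * bins


# Assumed configuration: 9 x 9 blocks, 4 cells per block, 9 bins per cell.
length = hog_length(window=40, block=8, stride=4, cell=4, bins=9)
```

Under these assumed parameters the descriptor has 81 blocks of 36 values each, which matches the dimension given in the text; other parameter choices may produce the same length.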
5 Conclusions
Managing public and private transport in large cities will become a problem in the short term. Through the deployment of sensors and the detection and classification of traffic signs, citizens can use advanced driver-assistance systems, and the government can know the state of streets and roads and act accordingly.
The proposed HOG+SVM classifier achieves high accuracy when applied to the German and Belgian datasets separately. However, when these two classifiers are applied to Spanish traffic signs, the results are worse. One possible solution is to create a dataset of annotated images of Spanish signs, similar to BTSC or GTSRB, that allows high classification accuracy with HOG+SVM.
6 Acknowledgements
7 References
1. United Nations - Population Division. 2010. <http://esa.un.org/unpd/wup/>
2. Solanas, Agusti et al. Smart health: a context-aware health paradigm within smart cities. IEEE Communications Magazine 52.8 (2014): 74-81.
3. Su, Kehua, Jie Li, and Hongbo Fu. Smart city and the applications. Electronics, Communications and Control (ICECC), 2011 International Conference on 9 Sep. 2011: 1028-1031.
4. Shapiro, Jesse M. Smart cities: quality of life, productivity, and the growth effects of human capital. The review of economics and statistics 88.2 (2006): 324-335.
5. About City Science - MIT Cities. 2012. <http://cities.media.mit.edu/about/cities>
6. Smart City Lab | Alexandra Instituttet. 2015. <http://alexandra.dk/uk/about_us/labs/smart-citylab>
7. IBM Smarter Cities - Future cities - United States. 2009. <http://www.ibm.com/smartercities>
8. Oracle iGovernment. 2010. <http://www.oracle.com/us/industries/publicsector/046936.html>
9. Amsterdam Smart City. 2009. <http://amsterdamsmartcity.com/>
10. Arab Future Cities Summit Dubai. 2014. <http://www.smartcitiesdubai.com/>
11. European smart cities. 2006. <http://www.smartcities.eu/>
12. Caragliu, Andrea, Chiara Del Bo, and Peter Nijkamp. Smart cities in Europe. Journal of urban technology 18.2 (2011): 65-82.
13. Giffinger, Rudolf et al. Smart cities-Ranking of European medium-sized cities. 2007.
14. Bahlmann, Claus, et al. A system for traffic sign detection, tracking, and recognition using color, shape, and motion information. Intelligent Vehicles Symposium, 2005. Proceedings. IEEE. IEEE, 2005.
15. Moutarde, Fabien et al. Robust on-vehicle real-time visual detection of American and European speed limit signs, with a modular Traffic Signs Recognition system. Intelligent Vehicles Symposium, 2007 IEEE 13 Jun. 2007: 1122-1126.
16. Broggi, Alberto, et al. Real time road signs recognition. Intelligent Vehicles Symposium, 2007 IEEE. IEEE, 2007.
17. Keller, Christoph Gustav, et al. Real-time recognition of US speed signs. Intelligent Vehicles Symposium, 2008 IEEE. IEEE, 2008.
18. De La Escalera, Arturo, et al. Road traffic sign detection and classification. Industrial Electronics, IEEE Transactions on 44.6 (1997): 848-859.
19. Bascón, S. Maldonado, et al. An optimization on pictogram identification for the road-sign recognition task using SVMs. Computer Vision and Image Understanding 114.3 (2010): 373-383.
20. Stallkamp, Johannes et al. The German traffic sign
recognition benchmark: a multi-class classification
competition. Neural Networks (IJCNN), The 2011
International Joint Conference on 31 Jul. 2011:
1453-1460.
21. Houben, Sebastian, et al. Detection of traffic signs
in real-world images: The German Traffic Sign
Detection Benchmark. Neural Networks (IJCNN),
The 2013 International Joint Conference on. IEEE,
2013.
22. Maldonado-Bascón, Saturnino, et al. Road-sign detection and recognition based on support vector
machines. Intelligent Transportation Systems,
IEEE Transactions on 8.2 (2007): 264-278.
23. Mathias, Markus, et al. Traffic sign recognition—
How far are we from the solution?. Neural Networks (IJCNN), The 2013 International Joint Conference on. IEEE, 2013.
24. Dalal, Navneet, and Bill Triggs. Histograms of oriented gradients for human detection. Computer Vision and Pattern Recognition, 2005. CVPR 2005.
IEEE Computer Society Conference on. Vol. 1.
IEEE, 2005.
25. Boser, Bernhard E., Isabelle M. Guyon, and Vladimir N. Vapnik. A training algorithm for optimal
margin classifiers. Proceedings of the fifth annual
workshop on Computational learning theory.
ACM, 1992.
26. Bradski, Gary, and Adrian Kaehler. Learning
OpenCV: Computer vision with the OpenCV library. "O'Reilly Media, Inc.", 2008.
27. Chang, Chih-Chung, and Chih-Jen Lin. LIBSVM: a
library for support vector machines. ACM Transactions on Intelligent Systems and Technology
(TIST)2.3 (2011): 27.
28. Arnoul, P., et al. Traffic signs localisation for
highways inventory from a video camera on board
a moving collection van. Intelligent Vehicles Symposium, 1996., Proceedings of the 1996 IEEE.
IEEE, 1996.
29. Tsai, Yichang, Pilho Kim, and Zhaohua Wang.
Generalized traffic sign detection model for developing a sign inventory. Journal of Computing in
Civil Engineering 23.5 (2009): 266-276.
30. BOE.es - Documento BOE-A-2003-23514. 2010.
<http://www.boe.es/diario_boe/txt.php?id=BOEA-2003-23514>
31. Traffic signs and signals in Germany - ADAC.
2015.
<https://www.adac.de/_mmm/pdf/fi_05_verkehrsz
eichen_engl_0510_30482.pdf>
32. Road signs in Belgium - OpenStreetMap Wiki.
2011.
<http://wiki.openstreetmap.org/wiki/Road_signs_in_Belgium>
33. Hsu, Chih-Wei, and Chih-Jen Lin. A comparison of
methods for multiclass support vector machines. Neural Networks, IEEE Transactions on 13.2
(2002): 415-425.
34. Serna, A. and Marcotegui, B. Detection, segmentation and classification of 3D urban objects using
mathematical morphology and supervised learning.
ISPRS Journal of Photogrammetry and Remote
Sensing 93 (2014): 243-255.
35. Yang, B., Dong, Z., Zhao, G., and Dai, W., Hierarchical Extraction of Urban Objects from Mobile
Laser Scanning Data. ISPRS Journal of Photogrammetry and Remote Sensing 99 (2015): 45-57.
36. González-Jorge, H., Riveiro, B., Armesto, J., and
Arias, P., Geometric evaluation of road signs using
radiometric information from laser scanning data.
Proceedings 2nd Int. Conf. ISARC (2011): 1007-1012.
37. Pu, S., Rutzinger, M., Vosselman, G., and Elberink, S.O., Recognizing basic structure from mobile
laser scanning data for road inventory studies.
ISPRS Journal of Photogrammetry and Remote
Sensing 66 (2011): s28-s39.
38. Optech Inc. 2012. Homepage of the company
Optech Inc., URL: http://www.optech.ca.
39. Ester, M., Kriegel, H., Sander, J., Xu, X., A density-based algorithm for discovering clusters. Proc.
2nd Int. Conf. on Knowledge Discovery and Data
Mining, Portland. (1996): 226-231.
Infrastructures for Information Management in a Smart City
Miguel Luaces1, Susana Ladra1, Mario Muñoz Organero2, Jesús Arias Fisteus2, Víctor Córcoba Magaña2, Pedro Arias-Sánchez2, Belén Riveiro3, Juan Antonio Álvarez-García4, Juan Antonio Ortega4, Jorge Yago Fernández4
1 Universidad de A Coruña, Facultad de Informática, Campus de Elviña s/n, 15071 A Coruña
2 Universidad Carlos III de Madrid, Depto. Ingeniería Telemática, Avda. de la Universidad, 30, 28911
3 Universidad de Vigo, Departamento de Ingeniería de los Recursos Naturales y Medio Ambiente, Torrecedeira, 86, 36208
1 {luaces,sladra}@udc.es
2 {mario.munoz,vcorcoba,jaf,luiss}@it.uc3m.es
3 {parias,belenriveiro}@uvigo.es
4 {jaalvarez,jortega,jorgeyago}@us.es
Abstract
1 Introduction
More than half of the world's population currently lives in urban areas, and this figure is expected to keep growing until it reaches 6.3 billion people in 2050 [1]. This growth poses new challenges for the sustainable development of cities, which will have to address critical issues such as the management of waste, traffic, resources, education, energy, water, health [2], unemployment, and security [3,4]. To meet these challenges, collaborations have emerged between governments, companies, and academic institutions in order to develop projects that study, design, and build solutions to these difficulties using information and communication technologies (ICT) [5,6]. IBM Smart Planet and Smart Cities [7], Oracle iGovernment [8], Amsterdam Smart City [9], Dubai SmartCity [10], and European Smart Cities [11] are some of the leading projects related to the Smart City concept.
A city is smart when investment in human and social capital and in communication infrastructure produces sustainable economic growth, a high quality of life, and efficient management of natural resources, all through participatory government [12,13].
This paper describes the infrastructure proposed in the national HERMES project, which puts forward a transition model for creating smart cities without the need for large investments in infrastructure.
2 Design
Figure 1 shows the design of the software infrastructure for the national HERMES project. It comprises the following modules:
 DBMS
 Data Access Layer
 …
3 Development
3.1. HERMES Citizen
3.2. HERMES Driver
3.3. HERMES S3D
3.4. HERMES Space&Time
5 Conclusions
6 Acknowledgments
References
1. United Nations - Population Division. 2010. <http://esa.un.org/unpd/wup/>
2. Solanas, Agusti et al. Smart health: a context-aware health paradigm within smart cities. IEEE Communications Magazine 52.8 (2014): 74-81.
3. Su, Kehua, Jie Li, and Hongbo Fu. Smart city and the applications. Electronics, Communications and Control (ICECC), 2011 International Conference on 9 Sep. 2011: 1028-1031.
4. Shapiro, Jesse M. Smart cities: quality of life, productivity, and the growth effects of human capital. The review of economics and statistics 88.2 (2006): 324-335.
5. About City Science - MIT Cities. 2012. <http://cities.media.mit.edu/about/cities>
6. Smart City Lab | Alexandra Instituttet. 2015. <http://alexandra.dk/uk/about_us/labs/smart-citylab>
7. IBM Smarter Cities - Future cities - United States. 2009. <http://www.ibm.com/smartercities>
8. Oracle iGovernment. 2010. <http://www.oracle.com/us/industries/publicsector/046936.html>
9. Amsterdam Smart City. 2009. <http://amsterdamsmartcity.com/>
10. Arab Future Cities Summit Dubai. 2014. <http://www.smartcitiesdubai.com/>
11. European smart cities. 2006. <http://www.smart-cities.eu/>
12. Caragliu, Andrea, Chiara Del Bo, and Peter Nijkamp. Smart cities in Europe. Journal of urban technology 18.2 (2011): 65-82.
13. Giffinger, Rudolf et al. Smart cities-Ranking of European medium-sized cities. 2007.
Figure 1: Design of the HERMES software infrastructure
A Pilot Study on Energy Saving in an Intelligent Building
Zoe Falomir
Alejandro Fernández-Montes
Universität Bremen, Germany
Universidad de Sevilla, Spain
[email protected]
[email protected]
1
Extended Abstract
Acknowledgments
Dr.-Ing. Falomir acknowledges funding by the FP7 Marie Curie IEF actions (COGNITIVE-AMI, GA 328763) and by the Universität Bremen and the Spatial Cognition Centre. Dr. Gonzalez-Abril and Dr. Fernández-Montes acknowledge funding by the Spanish Ministry of Economy and Competitiveness (HERMES, TIN2013-46801-C4-1-r) and by the Andalusian Regional Ministry of Economy, Innovation and Science (Simon, TIC-8052).
Nowadays, European countries are becoming more and more conscious of heavy environmental impacts (ozone layer depletion, global warming, climate change, etc.), and green or renewable energy consumption is generally preferred by the population. Since the European Energy Performance of Buildings Directive of 2002¹, new buildings require the installation of solar panels and the use of solar, thermal and photovoltaic energy. Moreover, according to the new European Union guidelines (2010/31/EU, Art. 2 and Art. 9)² [Weissenberger et al., 2014], starting in 2021 new buildings must achieve the nearly zero-energy (or emission) standard, that is, they must require very little energy in use.
Our case study is an eco-friendly smart building without an air-conditioning system, Cartesium (Figure 1(a)), located at the University of Bremen (Figure 1(b)), which is equipped with automatic systems that control the lighting and the blinds. The lighting in the corridors is controlled by presence sensors, while the lighting in the offices is switched on/off depending on whether the door is open or closed. It can also be controlled by the users through the website (Figure 2(b)). The Cartesium building also has external luminosity sensors which gather data regarding luminosity and wind speed (Figure 2(c)).
Successful results on energy saving from lighting have been reported in the literature through the smart use of wireless sensor networks and actuators [Fernández-Montes et al., 2009]. Thus, a theoretical study for the case of the Cartesium building has been carried out and is presented in this paper in order to assess whether automating the lighting and blinds of the offices according to the users' luminance perception would be relevant for energy saving and for cognitive adequacy to the users.
We have constructed a series of theoretical models and evaluated their energy-saving component. Theoretical models are useful because they: (i) make a particular view explicit, making it easier to debate; (ii) bring out the hidden assumptions of an approach; and (iii) may suggest new experiments for empirical data collection. Theoretical investigations of the kind carried out in this paper are very common in many sciences, but less common in engineering research. If it is not possible to collect the necessary empirical data, a lot can still be learned about the possibility of certain outcomes or their plausibility, as with game-theoretic models applied to economics [Gibbons, 1992].
References
A. Fernández-Montes, L. Gonzalez-Abril, J.A. Ortega, and
F. Velasco. A study on saving energy in artificial lighting by
making smart use of wireless sensor networks and actuators.
IEEE Network, 23(6):16–20, 2009.
R. Gibbons. Game theory for applied economists. Princeton
University, 1992.
M. Weissenberger, W. Jensch, and W. Lang. The convergence of
life cycle assessment and nearly zero-energy buildings: The
case of germany. Energy and Buildings, 76:551–557, 2014.
1 Directive 2002/91/CE of the European Parliament and of the Council of 16/12/2002.
2 EI-Energy Efficient Buildings European Initiative, E2B: http://www.e2b-ei.eu
Luis Gonzalez-Abril
[email protected]
Figure 1: (a) Cartesium building, (b) location inside University of Bremen, and (c) map.
Figure 2: Interact@Cartesium system: (a) office example, (b) website for managing the lights and the blinds, and (c) example of data gathered from the external luminosity sensors.
Integration of the SOAR Cognitive Architecture in a ROS Environment on a Parrot AR.Drone 2.0∗
Sai Kishor Kothakota, Cecilio Angulo
Universitat Politècnica de Catalunya · UPC BarcelonaTech
Pau Gargallo 5. ESAII – Departament d’Automàtica, Barcelona, Spain 08028
Abstract
The use of Artificial Intelligence (AI) applications in daily life is growing. However, more support is still needed for their integration into robots, which will lead to the development of intelligent beings capable of making decisions. AI algorithms make work easier and more efficient. In this paper we integrate a cognitive architecture (SOAR), from the domain of computational AI, into a Parrot AR.Drone 2.0, a quadrotor drone. The main goal is to carry out a mission with the Parrot AR.Drone 2.0 by establishing a connection between the ROS operating system and the SOAR cognitive architecture, in which information from the unmanned aircraft's sensors is used to perform the required task. This work aims at bidirectional communication between SOAR and ROS, together with interaction with humans.
1. Introduction
Mobile robotics is a research area undergoing rapid technological development. Mobile robots are used for specific applications, but in general they lack the ability to perform actions that involve decision making in real-time scenarios, as well as interaction with humans where ambiguity exists [Canal et al., 2015].
The usual approach to planning in intelligent mobile robots is mainly based on state-machine technology, so that a given goal is achieved with a suitable complete plan that is available before execution. This complete plan is a sequential combination of simple robot skills for achieving the goal. Since no feedback from the environment is considered, the robot behaves in the same way in different environmental situations. Thus, with complete planning from the start, the robot proceeds with commands that carry little knowledge about the environment [Puigbo et al., 2015]. The main drawback of this approach is that too much information is needed before decision making starts, which prevents the robot from reacting to new situations.
The most common alternative to state machines is planners. Planners are usually based on probabilistic criteria that determine the best combination of skills in real time for a specific goal. A different approach to planners is the use of reasoners, that is, cognitive architectures, which has attracted renewed attention from both academia and industry [Besold et al., 2014]. Thus, a cognitive architecture was applied to a general-purpose service robot in [Puigbo et al., 2015] by establishing one-way communication from the architecture to the robot.
Bidirectional communication between environment and robot, together with the use of a cognitive architecture, would help to pursue the goal with incomplete plans based on delayed information from the environment. A cognitive architecture with incomplete planning could also use the information made available by the user to build plans. Therefore, robots using this type of architecture could replan according to environmental data, and an interactive user environment could also be established. This makes it unnecessary to supply huge amounts of data beforehand and allows the plan to be completed as the specified goal is approached.
Several cognitive architectures exist in the literature [Langleya et al., 2009], such as SOAR [Cho et al., 1991; Laird et al., 2004; Young and Lewis, 1997], ACT-R [Anderson et al., 2004; Anderson, 2007; Trafton et al., 2005], CRAM, and SS-RICS [Kelley, 2006; Kelley and Avery, 2010]. All of them were evaluated in [Puigbo et al., 2015] in a general-purpose robotic scenario in terms of their generalization, reasoning, learning, ability to fulfil subgoals, and scalability. Building on that earlier discussion, SOAR has also been selected as the most suitable architecture for this study.
SOAR is a cognitive architecture that has been under continuous development since 1980 and that offers the ability to communicate with the external environment. SOAR has other special features which, when no initial plan exists for achieving the goal, allow it to apply several available mechanisms and knowledge, such as chunking, reinforcement learning, and subgoaling. This cognitive architecture makes it possible to select the skill needed for the current situation in order to achieve the specified goal, without any predefined list of situations and plans. This will be a key element in establishing bidirectional communication with the environment [Laird, 2008]. SOAR therefore provides a good platform for establishing such communication and the interaction with the user.
Moreover, SOAR has recently been applied to the control of a general-purpose humanoid service robot [Puigbo et al., 2015] to solve complex tasks using the basic skills of a humanoid service robot, such as navigation, sensing the environment, and object recognition. However, only one-way communication between SOAR and ROS is available there, and user intervention is not considered.
Therefore, the main goal of this work is to develop bidirectional communication between SOAR and ROS. Bidirectionality can be exploited in interactive environments so that the robot acts differently in particular situations. The rest of the paper is organized as follows: the implemented architecture is introduced in Section 2; Section 3 describes the robot platform; Section 4 highlights the main results obtained during the experiments; finally, some conclusions and future lines of research are offered.
∗ This work has been partly funded through the research project PATRICIA (TIN2012-38416-C03-01) by the Spanish Ministry of Economy and Competitiveness.
2. Bidirectional SOAR-ROS Communication
SOAR has been selected as the cognitive architecture in our approach to robot planning. In SOAR, a state is a representation of the situation of the environment for the problem currently being solved, an operator transforms a state (producing changes in the representation), and a goal is a desired outcome of the problem-solving activity, that is, the virtual solution that represents how to solve it.
The proposed overall system consists of three mutually connected modules. The cognitive architecture sends commands according to the real-time data it receives. Next, the interface between SOAR and ROS, which is our implementation, receives commands from SOAR and converts them into ROS commands. It then takes data from the ROS environment or from other sources and processes them to be sent to the SOAR cognitive architecture, so that future decisions can be made based on the input data.
The interface for communicating with the external environment or robots by sending and receiving commands is programmed with SML (Soar Markup Language), in which commands are packaged as XML packets and sent. The environment and the debugger it supports are known as clients.
The proposed bidirectional communication between SOAR and ROS is made possible by treating the SOAR environment as the main element. Thus, commands sent from ROS are processed in SOAR, and commands from SOAR to ROS are processed in SOAR and formatted for easy evaluation in ROS. By communicating in this way, both the user's inputs and those from the robot's environment can be processed in SOAR for decision making. This SOAR-oriented interface therefore allows bidirectionality as well as user intervention.
2.1. Working Memory
Short-term information and the structures tested by the generated rules are placed in working memory. A graph structure of elements is built in this way, where each element is created from the rules and from sensors or data from the external environment. The SOAR architecture automatically creates some working memory structures for all agents, whose elements are built as triples: identifier, attribute, and value.
SOAR automatically creates an io attribute, which stands for input-output, to communicate with the outside world. io has two attributes (see Figure 1), input-link and output-link:
The output-link attribute in the working memory structure is used to send commands to the outside world so that some action is performed with each decision of the SOAR architecture. The output-link is where action commands must be created for the agents in their world.
The input-link attribute in the working memory structure is used to obtain information from the outside world or from the sensors. Normally, for a task that involves information from the external environment, most states are built from the perceptual information available from the sensors. This information is created on the input-link of the state structure. It can also be used to take user input and make it available in the input-link structure.
Figure 1: Graphical view of the working memory structures that SOAR generates automatically.
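The triple structure described above can be illustrated with a small stand-alone sketch. It is not SOAR code: the class, the identifiers ("S1", "I1", ...) and the marker-found attribute are hypothetical names chosen for illustration, and a real agent would rely on the working memory that SOAR itself maintains.

```python
from collections import namedtuple

# A working-memory element as described in the text: (identifier, attribute, value).
WME = namedtuple("WME", ["identifier", "attribute", "value"])


class WorkingMemory:
    """Toy triple store; identifiers such as "S1" or "I2" are illustrative only."""

    def __init__(self):
        # Structures created automatically for every agent: state -> io -> links.
        self.elements = [
            WME("S1", "io", "I1"),
            WME("I1", "input-link", "I2"),
            WME("I1", "output-link", "I3"),
        ]

    def add(self, identifier, attribute, value):
        wme = WME(identifier, attribute, value)
        self.elements.append(wme)
        return wme

    def children(self, identifier):
        """Return {attribute: value} for every triple hanging off `identifier`."""
        return {w.attribute: w.value for w in self.elements if w.identifier == identifier}


wm = WorkingMemory()
# Perception written by the environment (here: a hypothetical marker flag).
wm.add("I2", "marker-found", "no")
# Action command created by a rule for the outside world to execute.
wm.add("I3", "command", "C1")
wm.add("C1", "name", "forward")
```

Reading `wm.children("I1")` returns the two links under io, which mirrors the graph in Figure 1: perception enters under the input-link and action commands leave under the output-link.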
2.2. Sending and Receiving Commands in SOAR
SOAR helps the user communicate information by providing the input-link and output-link working memory structures. Using these structures, it is possible to send commands from SOAR and carry out operations successfully. Similarly, the input-link structure can be used to bring data into SOAR (from ROS in our case) for computation or decision making. In this way, data from the robot's on-board sensors can be obtained and used for decision making.
2.3. The Interface Client
To build the communication with the ROS environment, an interface client must be designed so that it can access the input-link and output-link structures and handle the sending and receiving of data in SOAR. This connection between the interface and the SOAR cognitive architecture can be established through SML (Soar Markup Language) clients. These clients help the interface establish a proper connection for transferring data between them.
SOAR offers SML client files for different programming languages (C++, Python, Java), and these files can be used to establish the connection between the programmed interface and the SOAR cognitive architecture.
Actions selected by the SOAR architecture are translated by the interface into the activation of a skill in the ROS environment. These ROS commands are then reflected in the robot's actions, which are the steps towards the specific goal.
The SML clients used from the interface file help the interface connect to the SOAR kernel. The client can either create a local SOAR kernel or connect remotely to an existing SOAR kernel. This kernel object can be used to create a SOAR agent. Agents created in SOAR are responsible for managing the working memory elements. The agent helps to obtain information from the SOAR architecture and also to send information to it (see Figure 2).
Figure 2: Development of the SML client interface.
3. Experimental Setup
To test the bidirectional communication between the SOAR cognitive architecture and the ROS environment, the Parrot AR.Drone 2.0 UAV (Unmanned Aerial Vehicle) was used (see Figure 3). This UAV is equipped with numerous on-board sensors and a controller that maintains stable flight, so it can hold its position while waiting for the next command.
Figure 3: Parrot AR.Drone 2.0.
The Parrot AR.Drone is an interesting research platform for computer vision and robotic exploration [Krajník et al., 2011]. During the experiments, SOAR uses simple robot skills, such as increasing / decreasing altitude, moving forward / backward, rolling right / left, and turning right / left, in order to achieve a specific goal (see Table 1). The Parrot AR.Drone is controlled through the ROS environment, so the ROS commands executed in the interface are derived from the commands decided in SOAR, allowing the unmanned aircraft to carry out the skills or actions.
The Parrot AR.Drone 2.0 responds to ROS commands through the ardrone_driver and ardrone_autonomy packages in the ROS environment¹. These drivers help the interface ensure that the commands received from the SOAR cognitive architecture are processed in the ROS environment. This environment provides navigation data from the on-board sensors before, during, and after flight. The available sensor readings include, among others, altitude, magnetometer, gyroscope, and accelerometer signals, barometer readings, the temperature value, and the wind angle. These can be made available to SOAR to help in decision making.
1 http://autonomylab.org/
The goal of the experiment is to find an Augmented Reality (AR) marker during the UAV's flight and, eventually, to land subject to the user's approval. AR marker detection is carried out with the ar_pose package, an augmented reality marker pose estimator that uses ARToolkit [Amin and Govilkar, 2015]. The UAV searches for an AR marker during its flight and sends information about its presence to the SOAR architecture through the interface. These data are used by SOAR for decision making. Some of the AR markers used are shown in Figure 4.
Figure 4: Augmented Reality (AR) markers.
With this experiment we can demonstrate bidirectional communication between SOAR and ROS, the sending and receiving of commands between SOAR and ROS in the developed interface, and the way in which the user can also send commands to complete the plan, accepting or eventually rejecting the landing proposed by the drone. The experiment has been made available at http://tiny.cc/ardrone.
The application was designed for an enclosed but uncontrolled environment, a corridor in a public building. To reach the goal, the robot can work with two types of interaction:
Category 1. Use of bidirectional communication between the SOAR cognitive architecture and the ROS environment, for example AR marker presence data related to the subcommands allowed on the robot, which are those defined in Table 1.
Category 2. User intervention, whose data can be used for decision making.
In Category 1, depending on whether an AR marker is present, the SOAR module makes decisions as system feedback: the SOAR module modifies the plan in real time so that the goal is achieved after sensing the environment. During operation, the presence of AR markers is checked and the result is sent to the SOAR module before the next decision is made; SOAR in turn uses this information to make decisions, since it then knows whether an AR marker was found in the previous operation. After finding an AR marker, the robot lands at that position if the user so decides.
In Category 2, the user is asked whether the UAV should land after the AR marker has been found: "Enter 1 for landing, 0 for hovering some time before landing". Depending on the user's input, the robot either lands immediately or does so after hovering for a certain period of time. This is how user intervention is introduced into SOAR-ROS communication.
Skill       Action
Forward     Pitch forward
Backward    Pitch backward
Up          Increase altitude
Down        Decrease altitude
Lateral     Roll right
Lateral2    Roll left
Right       Yaw right
Left        Yaw left
Table 1: Available robot skills and their associated actions.
4. Results
The bidirectional SOAR-ROS communication was tested in an indoor environment on the task of a Parrot AR.Drone 2.0 searching for an AR marker. In this experiment, the robot performs operations based on ROS commands, which the interface derives from the decision commands received from the SOAR architecture, and then returns information about the detection of an AR marker to SOAR for decision making. Depending on the user's decision, the plan changes over time and the skills are determined by the SOAR module so as to achieve the goal appropriately. Both categories were tested during execution.
The experiment consists of giving the robot system user input via the command line and checking that the robot is able to perform the actions needed to complete the goal. The situations in which the robot is tested are:
Category 1: "AR marker search and landing". The sequence of actions performed by the robot is: forward, check for AR marker x 5, lateral, check for AR marker, backward, check for AR marker x 5, lateral, check for AR marker, forward, check for AR marker x 5, and so on (up to a certain limit of lateral operations), which shows that complete initial planning is not needed in this case.
Category 2: "User intervention". When an AR marker is found, SOAR is notified of its presence through a feedback system in the interface. The user is then asked to choose among the available options. When the user chooses an option, SOAR carries out an action according to the user's interaction / intervention in the system.
The cognitive system presented in this paper guarantees that the proposed actions lead to the goal, so the robot will find a solution based on feedback from the ROS environment and on user intervention. This solution, however, cannot be claimed to be optimal. For example, in some situations the robot could move to a location that was not on the correct path for the executed command; before moving on to a second action, it needs to take a step in the right direction. In any case, completion of the task is guaranteed, since the architecture continuously provides steps until the specific goal is achieved.
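The Category 1 search sequence can be written down as a short generator. This is a hedged sketch under stated assumptions, not the rules used in the experiment: it reads "x 5" as five repetitions of a move followed by a marker check, and the function names, the parameters steps_per_leg and max_lateral, and the "ask-user" step are hypothetical.

```python
def search_pattern(steps_per_leg=5, max_lateral=3):
    """Yield skills for the back-and-forth sweep; "check" stands for a marker check."""
    direction = "forward"
    laterals = 0
    while True:
        # One leg: move and check, repeated steps_per_leg times.
        for _ in range(steps_per_leg):
            yield direction
            yield "check"
        # Stop once the limit of lateral operations has been reached.
        if laterals == max_lateral:
            return
        yield "lateral"
        yield "check"
        laterals += 1
        direction = "backward" if direction == "forward" else "forward"


def run_search(marker_at, **kwargs):
    """Follow the sweep until check number `marker_at` (1-based) finds the marker."""
    executed, checks = [], 0
    for skill in search_pattern(**kwargs):
        executed.append(skill)
        if skill == "check":
            checks += 1
            if checks == marker_at:
                # Category 2 starts here: the user decides how to land.
                executed.append("ask-user")
                break
    return executed


short_sweep = list(search_pattern(steps_per_leg=2, max_lateral=1))
found_early = run_search(marker_at=3, steps_per_leg=2, max_lateral=1)
```

Because each step is produced only when it is requested, the sweep can be cut short as soon as a check succeeds, which reflects the point made in the text that no complete initial plan is needed.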
5. Conclusions and Future Research
the state of the world has not changed and will select the same action again (retry) to reach the goal. This behaviour could lead to an infinite loop of retries. For the safety of the UAV, when this situation occurs the drone lands as soon as it recognizes the existence of repeated commands.
Acknowledgments
This work was made possible by the support of the Universitat Politècnica de Catalunya (UPC) and SASTRA University, through their exchange programme.
References
[Amin and Govilkar, 2015] Dhiraj Amin and Sharvari Govilkar. Comparative study of Augmented Reality SDK’s. International Journal on Computational Sciences & Applications (IJCSA), 5, Feb 2015.
[Anderson et al., 2004] John R Anderson, Daniel Bothell,
Michael D Byrne, Scott Douglass, Christian Lebiere, and
Yulin Qin. An integrated theory of the mind. Psychological review, 111(4):1036–60, October 2004.
[Anderson, 2007] John R. Anderson. How can the human
mind occur in the physical universe? New York: Oxford
University Press, 2007.
[Besold et al., 2014] T. R. Besold, A. D’Avila Garcez, K. U.
Kuhnberger,
¨
and T. C. Stewart(eds.). Neural-symbolic networks for cognitive capacities (Special issue). Biologically
Inspired Cognitive Architectures (BICA), 9, 1–122, 2014.
[Canal et al., 2015] Gerard Canal, Cecilio Angulo, and Sergio Escalera. Human multi-robot interaction based on
gesture recognition. In International Joint Conference on
Neural Networks, IJCNN, 2015. In Press.
[Cho et al., 1991] Bonghan Cho, Paul S. Rosenbloom, and
Charles P. Dolan. Neuro-Soar: A neural-network architecture for goal-oriented behavior. Proceedings of the Thirteenth Annual Conference of the Cognitive Science Society, pages 673–677, 1991.
[Kelley and Avery, 2010] Troy Dale Kelley and Eric Avery.
A cognitive robotics system: the symbolic and subsymbolic robotic intelligence control system (SS-RICS).
SPIE Proceedings Vol. 7710: Multisensor, Multisource Information Fusion: Architectures, Algorithms, and Applications 2010, 25:460–470, April 2010.
[Kelley, 2006] Troy Dale Kelley. Developing a Psychologically Inspired Cognitive Architecture for Robotic Control :
The Symbolic and Subsymbolic Robotic Intelligence Control System. Internation Journal of Advanced Robotic Systems, 3(3):219–222, 2006.
[Krajnı́k et al., 2011] Tomáš Krajnı́k, Vojtěch Vonásek, Daniel Fišer, and Jan Faigl. AR-Drone as a Platform for Robotic Research and Education. Springer Berlin Heidelberg
: Communications in Computer and Information Science,
pages 172–186, June 2011.
[Laird et al., 2004] John E Laird, Keegan R Kinkade, Shiwali Mohan, and Joseph Z Xu. Cognitive Robotics using
La comunicacion
´ bidireccional entre la arquitectura cognitiva SOAR y el entorno ROS se ha introducido y probado
en un UAV (Unmanned Aerial Vehicle) Parrot AR.Drone 2.0
con objeto de resolver una tarea compleja expresada como la
combinacion
´ de habilidades simples. La comunicacion
´ bidireccional es un tema clave para aumentar la informacion
´ en
la toma de mejores decisiones por parte de un razonador como SOAR. De esta forma, los planes incompletos pueden ser
considerados, informacion
´ del entorno se puede introducir en
el razonador y el usuario puede interactuar en el proceso de
planificacion.
´
La comunicacion
´ bidireccional introducida ha permitido superar dos desventajas principales del anterior enfoque
de comunicacion
´ SOAR-ROS. En primer lugar, la finalizacion
´ de los planes incompletos de un objetivo especificado
basándose en los datos en tiempo real disponibles en el entorno. En segundo lugar, la intervencion
´ del usuario que ayuda a las arquitecturas a comunicarse directamente con los
usuarios y tomar sus sugerencias o entradas para utilizarlos
en la toma de decisiones.
Como inconveniente, ya que se permiten planes parciales
cuando se inicia la ejecucion
´ del plan, la arquitectura cognitiva no puede conocer si el objetivo solicitado al robot será
completamente accesible o no. Sin embargo, ya que la accion
´ del usuario y la retroalimentacion
´ del entorno son ahora
accesibles, volver a planificar se hace más fácil, ası́ como la
intervencion
´ del usuario en base a la informacion
´ del entorno
ayuda en la consecucion
´ de la meta.
Se abren además diversas lı́neas de investigacion
´ futuras.
Ası́, con el uso de esta comunicacion
´ bidireccional, solo
´ una
arquitectura es necesaria sobre las habilidades simples de robots bien definidas y la retroalimentacion
´ de los sensores de a
bordo. Por lo tanto, el robot puede comportarse de diferentes
maneras ante diferentes ambientes, modificando los planes y
ayudando a lograr el objetivo.
Por otra parte, la implementacion
´ actual se puede mejorar
en términos de robustez mediante la resolucion
´ de un problema conocido. Principalmente, si una de las acciones no está
completamente lograda (por ejemplo, el robot no es capaz de
llegar a una posicion
´ en el espacio, porque está ocupada o el
robot no puede encontrar un objeto que está delante de él), la
activacion
´ de la habilidad fallará. En la implementacion
´ actual, ası́ como en los anteriores, el robot no tiene medios para
descubrir la razon
´ del fallo. De ahı́ que el robot detecte que
71
the Soar Cognitive Architecture. In Proc. of the 6th Int.
Conf.on Cognitive Modelling, pages 226–230, 2004.
[Laird, 2008] John E Laird. Extending the Soar Cognitive
Architecture. In Artificial General Intelligence Conference, 2008.
[Langleya et al., 2009] Pat Langleya, John E. Lairdb, and
Seth Rogersa. Cognitive architectures: Research issues
and challenges. Cognitive Systems Research, June 2009.
[Puigbo et al., 2015] J.-Y. Puigbo, A. Pumarola, C. Angulo,
and R. Tellez. Using a Cognitive Architecture for Generalpurpose Service Robot Control. In Connection Science The Society for the Study of Artificial Intelligence and the
Simulation of Behaviour, 2015.
[Trafton et al., 2005] J.G. Trafton, N.L. Cassimatis, M.D.
Bugajska, D.P. Brock, F.E. Mintz, and A.C. Schultz.
Enabling effective human-robot interaction using
perspective-taking in robots.
IEEE Transactions on
Systems, Man and Cybernetics, 25:460–470, July 2005.
[Young and Lewis, 1997] Richard M Young and Richard L
Lewis. The Soar Cognitive Architecture and Human Working Memory. In Models of Working Memory: Mechanisms of Active Maintenance and Executive Control, Nov
1997.
72
Towards Parameterizing a Colour Model depending on the Context
Lledó Museros, Ismael Sanz
Universitat Jaume I
Castellón
Zoe Falomir
Universität Bremen
Bremen
Luis González-Abril
Sevilla
Abstract
In this paper we investigate whether people, when naming colors, are influenced by aspects such as culture, visual ability, experience, and even the capabilities of the display device. To that end, and in order to parameterize a naming model using a taxonomy of colors that is as general as possible, an experiment has been carried out in which people were asked to freely choose a name and, if necessary, an adjective to describe a displayed color. During the experiment, the users' profile and level of confidence were also collected. All this information is used to check whether the Fuzzy Color Descriptor (FCD) presented in [1] is suitable for capturing the influence of context in a color naming model for human-machine communication.
1. Introduction
Since the process of color naming is largely shaped by early childhood learning, it could be considered a subjective process, and therefore influenced by each person's context (culture, language, education, and so on). To check whether this assumption holds, an experimental study based on an online test has been conducted to investigate the process of color naming, considering the effect of age, gender, occupation, cultural level, nationality and mother tongue.
A Fuzzy Color Descriptor (FCD) based on the Hue, Saturation and Lightness (HSL) color space and 3D Radial Basis Functions (RBFs) is used in the experiment to categorize color coordinates into names with a degree of belief. The HSL color space is used because, according to Falomir et al. [2], its topological structure can be intuitively divided into intervals of values corresponding to color names while maintaining the continuity of the parameters and defining a conceptual neighborhood diagram. This fuzzy color model is defined in general terms and parameterized as a baseline using a collection of color data [1]. However, with the results of the experiment presented here it can be adapted to the responses of the group of users consulted.
The study of the influence of context on color naming is not new. Some authors searched for a prototypical, language-independent color naming theory [3-4], while others affirmed language relativism in color naming [5]. Mylonas et al. [6] presented a synthetic observer trained on participants' responses to facilitate color communication. This paper follows their approach, but with a crucial difference: while their work aims to separate participants into culturally homogeneous groups and then create a customization for each group, we try to find a single customization for the whole group of participants that gathers all the cultural differences through the use of fuzzy theory. The advantage of our approach is that it can capture the natural heterogeneity of the group.
Acknowledgements
This work was conducted within the scope of the following projects: Spanish Ministry of Economy and Competitiveness HERMES (TIN2013-46801-C4-1-r), Andalusian Regional Ministry of Economy (project SIMON TIC8052), the Spanish Ministry of Economy and Competitiveness (project TIN2011-24147), Generalitat Valenciana (project GVA/2013/135) and Universitat Jaume I (Project P11B2013-29).
Dr.-Ing. Zoe Falomir acknowledges funding by the project COGNITIVE-AMI (GA 328763) by the European Commission through FP7 Marie Curie IEF actions and the support by the Universität Bremen and the Spatial Cognition Centre.
References
[1] Falomir, Z., Mast, V., Vale, D., Museros, L., Gonzalez-Abril, L., 2014. Towards a fuzzy colour model sensitive to the context, in: Proceedings of the XVI ARCA Days: Qualitative Systems and its Applications to Diagnose, Robotics and Ambient Intelligence, pp. 57–67.
[2] Falomir, Z., Museros, L., Gonzalez-Abril, L., 2015. A model for colour naming and comparing based on conceptual neighbourhood. An application for comparing art compositions. Knowledge-Based Systems 81, 1–21. DOI: http://doi.org/10.1016/j.knosys.2014.12.013.
[3] Berlin, B., Kay, P., 1969. Basic color terms: Their
universality and evolution. University of California Press,
Berkeley.
[4] Kay, P., Regier, T., 2007. Colour naming universals:
The case of Berinmo. Cognition 111, 289–298.
[5] Roberson, D., Davidoff, J.B., Davies, I.R.L., Shapiro,
L.R., 2005. Color categories: Evidence for the cultural
relativity hypothesis. Cognitive Psychology 50, 378–411.
[6] Mylonas, D., Stutters, J., Doval, V., MacDonald, L.,
2013. Colournamer – a synthetic observer for colour
communication. AIC 2013.
Odometry computation for a quadruped robot using computer vision techniques ∗
Lucı́a Lillo-Fantova, Manel Velasco, Cecilio Angulo
Universitat Politècnica de Catalunya · UPC BarcelonaTech
Pau Gargallo 5. ESAII – Departament d’Automàtica, Barcelona, Spain 08028
[email protected], {manel.velasco, cecilio.angulo}@upc.edu
Abstract
In a first phase, the images obtained by the camera of the AIBO robot were published through the ROS programming environment, a standard in mobile robot programming. These images can be used by computer vision algorithms to obtain the robot's odometry. Because of the poor quality of the images from the robot's built-in camera, the robot was fitted with a stereo vision system that incorporates an external, higher-resolution camera. Two sockets have been set up, one for each camera, which send the information through a Raspberry Pi B+ to an external computer. The computer sends motion commands to the AIBO robot, which receives and executes them by means of a URBI client running on the robot's processor. The experiments show the reliability of the method in measuring rotation, but also its limitations in computing translation.
∗ This work was partially funded by the research project PATRICIA (TIN2012-38416-C03-01) of the Spanish Ministry of Economy and Competitiveness.
1. Introduction
This project arises from the desire to recover the functionality of Sony's AIBO robot [Decuir et al., 2004], whose production and support have been discontinued, within an up-to-date and powerful programming environment [Kertész, 2013], and to equip it with the tools needed [Kolovrat, 2013] to navigate autonomously in a working environment by means of computer vision algorithms.
As shown in Figure 1, the URBI programming environment [Baillie, 2005a; 2005b] has been integrated on top of AIBO's operating system, OPEN-R SDK. Through URBI, data are sent over a server/client structure between the robot (server) and an external computer (client). The data received by the client are stored in a temporary buffer for later processing by OpenCV algorithms [Bradski, 2000], which transform them into images that ROS can interpret [Quigley et al., 2009]. The decision to integrate ROS into the project arises from the opportunity to use the large number of libraries and tools that this operating system provides to ease the creation of robotic applications. Among them are tools such as SLAM [Huang and Dissanayake, 2007; Riisgaard and Blas, 2005], SIFT [Flores and Braun, 2011] and RANSAC, which are very useful for the project.
Figure 1: Structure of the model for integrating Sony's AIBO robot into ROS.
Odometry computation for wheeled robots is a problem with an extensive literature of solutions. Odometric information is usually extracted by integrating the values obtained from encoders mounted on the robot's wheels. For legged robots such as AIBO, however, this standard approach cannot be applied. In these cases, an alternative is to compute odometry from the visual information provided by the cameras mounted on the robot.
Initially, a solution to the problem was defined using only the tools built into the AIBO robot itself, together with a laptop to carry out the more complex computations. However, because of the aim of controlling the robot through computer vision and the poor quality of the images captured by the robot's built-in camera, a stereo vision system mounted on the robot has been used instead.
The article is organised as follows. Section 2 gives a brief introduction to the AIBO robot and the methodology for acquiring the image from the AIBO camera and publishing it in ROS. The stereo vision rig and its set-up are then defined. Next, the global image matching method that is to enable the odometry computation is described, including the three-dimensional reconstruction from the disparity computation and the disparity-to-depth matrix obtained from the calibration process. The following sections define the protocol for computing the motion performed by the AIBO and describe the various experiments carried out for this computation. Finally, the conclusions drawn from the observed results and possible extensions of the work presented here are given.
2. The AIBO robot
AIBO is a quadruped robot created by SONY and introduced in 1999 with the ERS-110 model [Téllez, 2004]. It was the first robot of its kind on the market. SONY initially positioned the product as an entertainment robot for home use, but it had a great impact on the scientific community because it facilitated research in fields such as artificial intelligence and robot interaction [Zhang and Chen, 2007].
SONY began selling it while restricting access to the programming language to itself and to RoboCup participants. In 2001 it lifted the copyright restrictions on the AIBO programming kit, thereby allowing its non-commercial use. The kit included the R-CODE and OPEN-R SDK programming languages and the AIBO Remote Framework. In addition to SONY's official platforms, the user community developed a whole series of platforms for the robot to make it easier to program, such as Tekkotsu [Tek, 2014], URBI [Gostai, 2014] and the Webots cross-compilation [Hohl et al., 2006].
The project conceived by SONY lasted only seven years, with production stopping in 2006, but despite its brevity it launched a whole series of models across three robot generations (Figure 2). Although manufacturing has stopped, events such as the International AIBO Convention, where open source software is exchanged, are still held today.
Figure 2: AIBO models in the period 1999-2006.
The AIBO robot has a CMOS camera with a resolution of 350000 pixels (480 x 720) [Pérez et al., 2010a; Pérez et al., 2010b]. The captured images have to be converted to allow the subsequent processing.
The first point to address is transforming the images coming from URBI into a format suitable for publication in ROS. The format used by image topics in ROS is sensor_msgs::Image. The conversion relies on cv_bridge, a ROS library that converts between OpenCV images and ROS messages. For this, the image must likewise be converted to cv::Mat or IplImage, depending on the application, although both represent the same format. Consequently, the first step is to transform the data into OpenCV.
First, the URBI messages of type UMessage are processed; if the content is valid, the message contains a pointer to its value. The acquired data are of type UValue, which holds several variables. Among them, the data of type DATA_BINARY are selected, represented by messages of type UBinary (Figure 3). Finally, the information of subtype UImage is chosen.
Figure 3: Image acquisition.
This information is stored in a buffer of 4 images already in jpg format, giving a 3-channel (R, G, B) image that is then opened with OpenCV and converted into an image of type cv::Mat, the type required by the functions of the cv_bridge library. Through cv_bridge the images are transformed into a format that ROS can interpret and are sent to the corresponding image topic.
3. Stereo vision rig
The image generated by the AIBO itself is of very low quality. Moreover, the robot has only one camera, which makes it difficult to extract the translation between images. For these reasons, a stereo vision rig has been built for the project, enabling tasks such as three-dimensional reconstruction. The rig consists of two webcams held by a support that moves rigidly with the rotation of the AIBO's neck and connected to a Raspberry Pi B+, which sends the received images to the computer.
The two cameras are placed about 5 cm apart between planes (a value of 4.802 cm was measured). This distance is known as the baseline, b, and is very useful for the three-dimensional reconstruction of the images. The camera plane is located 3 cm behind the rotation point of the head and 5 cm above it. These values are useful for relating the rotation and translation seen by the rig to the motion performed by the AIBO itself.
The Raspberry Pi B+ is a single-board computer (SBC) developed by the Raspberry Pi Foundation as a low-cost computer intended to promote programming in schools. It supports several operating systems, among them the Linux distribution chosen here, Raspbian. Since the board has no permanent storage of its own, the operating system is installed on a 32 GB SD card. The OpenCV 2.4.10 libraries must be installed on this operating system for image processing.
For communication between the Raspberry Pi B+ and the computer, two TCP sockets are set up to send the captured images, one socket per camera. In this structure the Raspberry Pi is the client, which connects to the server, the computer. Because there are two sockets, the computer reserves two ports for receiving images. The camera parameters are set through the libv4l2 library, which operates on the video devices.
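The two-socket arrangement can be sketched in Python as follows. The text does not say how image boundaries are marked on the TCP stream, so the 4-byte length prefix used here is an assumption, as are all function names; image capture and encoding are left out.

```python
import socket
import struct

HEADER = struct.Struct("!I")  # assumed framing: 4-byte big-endian payload length

def send_frame(sock: socket.socket, payload: bytes) -> None:
    """Client side (Raspberry Pi): send one encoded image preceded by its length."""
    sock.sendall(HEADER.pack(len(payload)) + payload)

def _recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, since TCP may deliver one frame in several chunks."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("socket closed before the frame was complete")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)

def recv_frame(sock: socket.socket) -> bytes:
    """Server side (computer): receive one length-prefixed image."""
    (length,) = HEADER.unpack(_recv_exact(sock, HEADER.size))
    return _recv_exact(sock, length)

def listen_for_cameras(host: str, ports: tuple) -> list:
    """Server side: reserve one listening port per camera."""
    listeners = []
    for port in ports:
        srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((host, port))
        srv.listen(1)
        listeners.append(srv)
    return listeners
```

Keeping one connection per camera means the left and right streams never interleave, so each receiver only has to reassemble frames from a single source.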
Finally, once assembled, the stereo vision rig was calibrated in order to obtain the intrinsic and extrinsic parameters of the cameras and thus make the necessary corrections to both images, left and right. The parameters obtained are also used to extract the three-dimensional information from the image that is needed for the odometry computation. In addition, since the extracted information is intended for computing the motion of the AIBO robot with the robot's centre of rotation as the origin, the information obtained must be transformed to that centre of rotation.
The origin of coordinates of the cameras is located at the left camera, whereas the origin of coordinates of the AIBO is located at the robot's centre of rotation, which is assumed to lie at the centre of the robot's body. Three points with known reference frames are therefore established:
base_camera: The origin is the left camera of the stereo vision rig. The z axis is perpendicular to the image plane, the y axis is the vertical of the image plane and the x axis is the horizontal of the image plane.
base_head: The origin is the midpoint of the line of contact between the stereo vision rig and the back of the AIBO's head, with the same axis orientation as base_camera. It coincides with the origin of the semicircle at the base of the AIBO's collar.
base_link: The origin is the AIBO's centre of rotation.
The transformations between reference frames are summarised in Table 1.
base_link → base_head      (0.070, 0.000, 0.160)
base_head → base_camera    (−0.024, 0.000, 0.050)
Table 1: Transformations between reference frames, expressed in metres.
Image calibration consists of estimating the camera parameters from captured calibration images showing a pattern of known real dimensions. The pattern is a chessboard of 10x7 squares, each measuring 2.5x2.5 cm.
These parameters are of two kinds, extrinsic and intrinsic. The extrinsic parameters relate the global coordinate system {X_w, Y_w, Z_w} to a coordinate system fixed to the camera {x, y, z}, thus giving the camera's position and orientation with respect to the defined system. The intrinsic parameters relate a point in the camera-fixed coordinate system (x, y, z)^T to its projection onto the image plane (X, Y)^T.
Very similar parameters were obtained for both cameras because they are the same model; the lenses are therefore very similar, resulting in practically identical focal lengths and optical centres.
4. Implementation of the method
Once the stereo vision rig has been calibrated, a step prior to starting up the robot, the three-dimensional reconstruction of the observed scene is carried out from the model obtained. The three-dimensional reconstruction yields a point cloud that holds the colour information of the image and the computed depth, which is used in computing the translation.
Next, the interest points of each camera of the stereo vision rig are extracted, which makes it possible to relate two scenes as long as they share part of the scene.
The image matching algorithm is the basis for computing the transformation between two scenes, captured in the fundamental matrix. From the fundamental matrix, together with the calibration parameters, the essential matrix is extracted, which is indispensable for computing the robot's rotation. In addition, image matching is also used to compute the translation as the difference between the point cloud values for each of the matched points.
Finally, the methodology used to process the data obtained and thus compute the odometry from the points set out here is established.
4.1. Three-dimensional reconstruction
Once the model that defines the stereo vision rig has been determined, the three-dimensional reconstruction of the captured scene is carried out. Starting from the reference image, the left one, the corresponding pixel is searched for in the right image. The difference between the coordinates of corresponding pixels is known as the disparity and is expressed in pixels.
The corresponding pixel is assumed to lie on the epipolar line because the images are rectified; that is, starting from a pixel (x, y) in the reference image, the corresponding pixel satisfies the relation x' = x + d(x, y), where d(x, y) is the disparity between the two pixels.
Once the computation has been restricted to the pixels on the epipolar line, the disparity is calculated. To do so, the difference in intensity between a pixel and its corresponding pixel is computed for a candidate disparity d, together with that of the neighbouring pixels and their corresponding pixels, assuming the same disparity value. This gives the intensity difference over a whole window W around the pixel under study.
The sum of the intensity differences obtained in this window is the cost function to be minimised for each pixel of the image, and its result is the optimal disparity for each of them. This method is known as SAD, the sum of absolute intensity differences.
Once the optimal disparity has been found, the depth of each pixel is computed, knowing that disparity is inversely proportional to the depth at which the objects captured in the image are located. After determining the depth z, the x and y coordinates are computed from the camera equations. Finally, the disparity-to-depth matrix Q gathers these calculations, thus summarising the transformation from disparity to depth.
4.2.
La transformacion
´ escogida como solucion
´ es expresada
en la forma de matriz fundamental que relaciona un pı́xel de
la primera imagen con su correspondiente lı́nea epipolar en la
segunda imagen mediante una matriz 3x3 de rango 2. A partir
de esta matriz y conjuntamente con la informacion
´ intrı́nseca
de las cámaras se ha obtenido la matriz esencial mediante la
que se ha realizado la descomposicion
´ en los valores singulares pudiendo extraer ası́ el valor de la matriz de rotacion.
´
5.
Calculo
´
de la odometrı́a
Los datos de odometrı́a están formados por la informacion
´
de posicion
´ y de orientacion
´ del robot. Debido a la naturaleza
del movimiento del robot, un robot cuadrupedo
´
que carece de
sistemas óptimos para el cálculo de su movimiento, quedan
descartadas las soluciones comunes para el cálculo de la odometrı́a. Entre las opciones descartadas se encuentra el cálculo
del movimiento a partir de la integracion
´ de los datos obtenidos de los encoders en el caso de los robots rodados o a partir
de la informacion
´ tridimensional obtenida de sensores como
la kinect. En el caso de estudio se ha optado por un sistema
de estéreo vision
´ para la reconstruccion
´ tridimensional del entorno y aprovechar ası́ la informacion
´ de ambas cámaras para
el cálculo de la odometrı́a.
Se ha modelizado el sistema de navegacion
´ a partir del propio sistema utilizado por el ser humano que se basa en la vision
´ para desplazarse por su entorno, tal y como
´
se establece
en [Pérez-Sala et al., 2011]. Siendo la vista el sentido que determina a qué distancia se encuentra un objeto, como
´
se ha
movido éste y el desplazamiento realizado respecto a dicho
objeto.
Si se analiza el protocolo de movimiento de una persona se
observa que antes de girar en una direccion
´ se produce una
observacion
´ del medio y el establecimiento de la direccion
´
objetivo. Por tanto para la realizacion
´ de un giro en el caso de
una persona con el sentido de la vista correctamente desarrollado primeramente orientarı́a la cabeza en la direccion
´ deseada estableciendo ası́ la meta y posteriormente girarı́a el resto
del cuerpo hasta alinear cabeza y tronco. En el caso de una
traslacion
´ se utiliza la percepcion
´ de la profundidad obtenida
gracias a la vision
´ binocular que permite gracias al fenomeno
´
de triangulacion
´ realizar la reconstruccion
´ tridimensional del
entorno.
A partir de dicho modelo se ha definido el sistema de cálculo de odometrı́a donde la percepcion
´ de la profundidad se obtiene a través del conjunto de vision
´ estéreo y la rotacion
´ se
computará a partir de los puntos de interés correspondientes
entre dos imágenes.
Se debe remarcar que para mayor eficacia del método se
ha dividido el movimiento del robot en dos fases: rotacion
´ y
traslacion.
´ Cuando el robot recibe una orden de teclado para
moverse el programa procesa primero las órdenes de rotacion
´
y una vez finalizada la rotacion
´ pasa a procesar las órdenes
de traslacion.
´ Posteriormente publica los datos de odometrı́a
calculados a través de ROS.
Correspondencia estéreo robusta
La correspondencia estéreo consiste en la busqueda
´
de un
punto correspondiente a la primera imagen en la segunda imagen. Dicha busqueda
´
se ha limitado a aquellos pı́xeles que
cumplan una serie de caracterı́sticas de diferenciabilidad en
la imagen y son nombrados puntos de interés.
Se entiende por puntos de interés aquellos puntos caracterı́sticos de cada imagen que sean fácilmente distinguibles y
detectables frente a cambios de escala y deformaciones. La
informacion
´ se obtiene en la forma de descriptores locales
que contienen la informacion
´ de los puntos de interés de una
imagen consistente en un vector que recoge la informacion
´
del punto de interés y de dicho punto con los puntos de interés contiguos.
Se ha aplicado la metodologı́a SURF [Herbert Bay and
Van Gool, 2006], algoritmo introducido por Herber Bay en
2006, para la extraccion
´ de los puntos de interés. Esta metodologı́a está basada en el método SIFT [Lowe, 2004] publicado por David Lowe en 1999.
Una vez detectados los puntos de interés mediante el algoritmo de SURF se procede al emparejamiento robusto por
fases entre los puntos de interés de la primera y la segunda
imagen. Mediante este procedimiento se eliminarán aquellos
puntos considerados como falsas correspondencias.
Once the transformation has been determined, it is applied to all the matched points of the first image, and the distance between each transformed point and its corresponding point in the second image is checked against a tolerance. Points within the tolerance are called inliers. If the inliers exceed 99 % of the total number of points, the solution is accepted and refined using all the inliers. Otherwise, a new random set of points is selected and the process is repeated until a motion model is found that correctly represents 99 % of the points.
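The accept-or-resample loop described above can be sketched as follows. This is only an illustration under stated assumptions: the motion model is simplified to a 2D translation (the text does not fix the model at this point), and the function name, tolerance value and iteration limit are hypothetical.

```python
import numpy as np

def ransac_translation(pts1, pts2, tol=2.0, min_inlier_ratio=0.99,
                       max_iters=1000, seed=0):
    """RANSAC-style loop; the motion model is ASSUMED to be a 2D translation."""
    rng = np.random.default_rng(seed)
    for _ in range(max_iters):
        # Minimal random sample: one correspondence fixes a translation.
        i = rng.integers(len(pts1))
        t = pts2[i] - pts1[i]
        # Apply the model to every matched point and measure the residual.
        dist = np.linalg.norm(pts1 + t - pts2, axis=1)
        inliers = dist < tol
        if inliers.mean() >= min_inlier_ratio:
            # Accept the solution and refine it using all the inliers.
            t_refined = (pts2[inliers] - pts1[inliers]).mean(axis=0)
            return t_refined, inliers
    # No model reached the required inlier ratio.
    return None, None
```

With a 99 % threshold, as in the text, the loop only terminates successfully when almost all matches agree with one model, so the earlier matching stages must already have removed most false correspondences.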
5.1. Computing the rotation
A standard turn of ±45° has been defined, and the turning angle has been limited in order to minimise the error introduced by the method, since this error cannot be corrected with the robot's navigation data: there is no sensor that provides that information. The proposed turning protocol starts from an initial state with the head and the torso aligned, θ = 0°, where θ is the angle between the head and the torso. The target direction is then set by turning the head towards the desired direction by an angle θ = ±45°, depending on whether the turn is to the left (+) or to the right (−). Once the target has been set, an iterative process made up of three stages follows.
When the robot gets lost, the direction of rotation is consequently reversed in the next iteration until a known scene is found again.
Sign change: The solution is considered correct when the angle between the robot's torso and head is within ±5°. However, the robot may overshoot the target direction because of an excessive rotation. The robot has therefore been given a mechanism that, on detecting a change in the direction of rotation (a sign change in the angle of the head with respect to the torso), reverses the direction of rotation until the error is corrected.
1. The robot turns by an unknown angle, α.
2. The turned angle, β, is computed from the transformation of the scene observed by the cameras.
3. The robot's head turns by an angle β in the direction opposite to the body's turn, so that θ = θ − β, until the target is in view again.
5.2. Computing the translation
When computing the odometry, the following equivalence must be established: the case in which the camera undergoes a translation T while the observed points remain static in the three-dimensional scene is equivalent to the case in which the camera remains static and the points undergo a translation −T. This equivalence is very useful for extracting the actual translation performed by the robot.
These three stages are repeated until the head and the torso are aligned within a margin of error of ±5°. Finally, the turned angle is corrected, given that the final angle equals the initial target angle, ±45°, minus the final angle θ between head and torso; lastly, head and torso are aligned, θ = 0°. Once they are aligned, any further change of direction starts again from the initial state, with head and torso aligned, and the process is repeated.
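The bookkeeping of the turning protocol can be sketched as below. This is a minimal simulation, not the authors' implementation: the function name is hypothetical, the body turns β are passed in as already-measured values (in the real system they come from the vision stage), and the numbers in the usage note are chosen only for illustration.

```python
def run_turn_protocol(target_deg, measured_betas, tol_deg=5.0):
    """Replay the head/torso turning protocol for a list of measured body turns.

    target_deg     -- initial head angle, +45 (left) or -45 (right)
    measured_betas -- turned angles (stage 2), one per iteration (assumed inputs)
    tol_deg        -- alignment tolerance between head and torso
    """
    theta = target_deg                 # head points at the target direction
    for beta in measured_betas:
        theta -= beta                  # stage 3: head counter-rotates by beta
        if abs(theta) <= tol_deg:      # head and torso aligned within tolerance
            break
    turned = target_deg - theta        # final correction of the turned angle
    return theta, turned
```

For example, `run_turn_protocol(45.0, [20.0, 18.0, 5.0])` leaves a residual head angle of 2° and reports a body turn of 43°.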
A standard turn has been specified for the first stage: a turn lasting one second, with a step period of one second. This raises a problem, because the effect of a given turn command on the AIBO varies widely and also depends on the type of terrain, the initial position, and so on. Consequently, before each turn the AIBO starts from a position with its legs fully extended, in order to minimise variations between sequences. This also ensures that, whenever the rotation between two time instants is computed, both scenes have been captured at the same height, so that the variation observed in the scene is only the one introduced by the rotation.
To determine the turned angle, β, the interest points of the images taken before and after the turn are computed. Robust stereo correspondence is then performed between the two images, which yields the fundamental matrix. The rotation is computed from the essential matrix, which is in turn obtained from the fundamental matrix. This method has been tested in a series of experiments to check its validity.
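The step from fundamental matrix to rotation can be illustrated with the standard textbook construction, sketched here with NumPy. The assumptions are that both images share one known intrinsic matrix K and that the rotation is about the vertical (y) axis; the function names are hypothetical, and the selection between the two candidate rotations (normally done by checking that points lie in front of both cameras) is left out.

```python
import numpy as np

def essential_from_fundamental(F, K):
    """E = K^T F K, assuming both images share the intrinsic matrix K."""
    return K.T @ F @ K

def rotation_candidates(E):
    """Return the two rotation matrices compatible with E (SVD decomposition)."""
    U, _, Vt = np.linalg.svd(E)
    # Force proper rotations (determinant +1).
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0],
                  [1.0,  0.0, 0.0],
                  [0.0,  0.0, 1.0]])
    return U @ W @ Vt, U @ W.T @ Vt

def yaw_degrees(R):
    """Angle of a rotation about the y axis, R = [[c,0,s],[0,1,0],[-s,0,c]]."""
    return float(np.degrees(np.arctan2(R[0, 2], R[0, 0])))
```

Only one of the two candidates corresponds to the physical motion; for the small planar turns considered here, the candidate with the smaller angle is usually the plausible one, but that heuristic is an assumption, not part of the method described above.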
A number of singular situations that may arise while computing the robot's rotation have been considered, and an action protocol has been created for them in order to prevent program failures.
6. Experimentation
6.1. Rotation
The rotation experiments are divided into two phases. In the first phase, only the stereo vision rig was used, performing rotations about the y axis, since the robot moves in a plane and the main rotation detected is about this axis. This phase was also used to study the optimal values of the parameters of the image correspondence computation. In the second phase, the experiments were run on the AIBO itself, applying the turning protocol to check whether it behaves as expected.
Phase I: Pure rotation
The first experiment idealises planar motion, in which case the AIBO rotates only about the y axis, understood as the vertical axis of the image plane.
The actual rotation was compared with the estimated rotation for different values of the stereo correspondence parameters, in order to obtain their optimal values. The parameters studied include the distance ratio, used in step 2 of the stereo matching, and the maximum distance to the epipolar line used in the RANSAC test, step 4 of the stereo matching.
Rotation computed as a function of the distance ratio. The computation of the rotation was studied as a function of the ratio. Specifically, ratio values from 0.2 to 3.0 in increments of 0.2 were used. The best fit to the actual rotation is observed for a maximum ratio of 0.7.
The robot gets lost: If the robot turns by an excessively large angle, it loses sight of the reference points of the previous iteration and therefore gets lost. This behaviour must be detected and corrected within the turning protocol.
Specifically, it has been established that if the robot gets lost, this is due to an excessively large rotation.
Rotation computed as a function of the distance to the epipolar line. The distance to the epipolar line matters in the RANSAC test: it defines the maximum distance from its epipolar line at which a point may lie to be considered valid. The best fit to the actual rotation is observed for a maximum distance to the epipolar line of 2.6 pixels.
These tests were carried out in an environment in which the interest points lie between 1 and 1.5 m away, since the application is intended for indoor spaces. In this case, therefore, the optimal range for computing the rotation through vision was found to lie between 0° and 15°. If the detected interest points were farther away, the optimal rotation range would increase, since the points would remain within the field of view when turning.
State 6: State resulting from turning the head by an angle of −β1.
The angle between the head and the body after this iteration, pan, is pan = 45° − β0 − β1 = −3°. It therefore lies within the limits |pan| ≤ 5°.
As a result of the previous study on computing the rotation in an indoor space, the optimal turning range is known to lie between 0° and 15°. The AIBO's turning protocol was therefore studied in order to select the step frequency and turn duration that produce a turn of approximately 15°.
The turn is more precise when its duration is twice the robot's step period. Because the duration is a multiple of the step period, the action is completed when the turn ends, that is, the robot finishes in a posture similar to the initial one. This particular value was chosen because larger multiples correspond to excessively large turns.
Phase II: Rotation on the AIBO
Initial state: The target direction is set by turning only the head 45° in the desired direction (clockwise in this case).
State 1: The robot rotates by an unknown angle α0 (clockwise in this case).
6.2. Translation
The study is divided into two stages. In the first, the translation is computed from the three-dimensional reconstruction obtained by the stereo vision rig at each time instant. In the second experiment, an element of known dimensions is placed in the environment, so that the translation is computed from the change in how that object is perceived between the two sequences, together with the focal length. Each of the experiments is detailed below.
The robot does not detect any re-observed interest point and therefore undoes the turn to return to the initial state.
State 1: The robot is lost.
State 2: The robot turns by an unknown angle α1 (counter-clockwise in this case).
It compares the interest points of State 2 with those of the initial state and confirms that it is now in a known state.
Phase I: Translation from the three-dimensional reconstruction
Here the translation is computed from the three-dimensional information obtained by the stereo vision rig at two time instants.
The computation covers approximately 1000 experiments for translations along the z axis from 0.5 cm to 3 cm in increments of 0.5 cm. The computed translation shows great variability, which increases for translations below 1.5 cm because the scene changes very little. This observation led to a further study for larger translations, in the range from 3 to 6 cm.
The previous behaviour is repeated, and the error clearly decreases for translations above 1 cm. The observed error may be due to a fault in one of the computed calibration parameters. Possible errors in the image correspondence stage must also be considered, since an error of 1 cm in the computed translation may result from an error of 7.604 pixels in the detection stage, a negligible error given the image size of 640x480 pixels.
State 2: Starting state for extracting the rotated angle from the image.
State 3: State resulting from a turn by an unknown angle α2 (clockwise in this case).
The interest points of State 2 are compared with those of State 3, and the turned angle β0 is computed.
State 4: State resulting from turning the head by an angle of −β0.
The angle between the head and the body after this iteration, pan, is pan = 45° − β0 = −24°.
State 4: Starting state for extracting the rotated angle from the image.
The interest points of State 4 are compared with those of State 5, and the turned angle β1 is computed.
Phase II: Translation from an object of known size
An element of known dimensions is placed in the environment and observed at two time instants. Knowing the actual size of the object and the focal length, the depth at which the object lies in each scene can be computed, and hence the change in depth.
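The depth computation described above follows from the pinhole camera model, Z = f·H/h, where H is the real size of the object and h its apparent size in the image. A minimal sketch, assuming a fronto-parallel object and a focal length expressed in pixels; the function names and the numbers in the usage note are hypothetical.

```python
def depth_from_size(focal_px, real_size_cm, image_size_px):
    """Pinhole model: depth Z = f * H / h (same length unit as real_size_cm)."""
    return focal_px * real_size_cm / image_size_px

def translation_from_known_object(focal_px, real_size_cm,
                                  size_px_before, size_px_after):
    """Forward translation as the change in depth between two observations."""
    z_before = depth_from_size(focal_px, real_size_cm, size_px_before)
    z_after = depth_from_size(focal_px, real_size_cm, size_px_after)
    return z_before - z_after
```

With an assumed focal length of 700 pixels, a 10 cm object seen at 50 pixels and then at 56 pixels gives depths of 140 cm and 125 cm, that is, a translation of 15 cm. Since depth is inversely proportional to h, a change of a few pixels in h already shifts the estimate by centimetres, which is consistent with the sensitivity to correspondence errors reported in this section.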
In view of the results of the previous experiment, the translation was studied for values between 0.5 and 6 cm. Compared with the error computed previously, the error decreases by two orders of magnitude for translations of 1 cm or less and by one order of magnitude for larger translations. Despite this decrease, the error is of the same order of magnitude as the translation itself, so the measurements are of little reliability.
It follows that the fit of the calibration parameters could be improved: in this experiment, which depends only on the focal length, the results improve, so the results of Phase I could also improve with a better fit of the calibration values. Finally, the conclusion of the previous phase applies again: an error of 1 cm in the translation may be caused by an error of 7.604 pixels in the image correspondence stage. Such an error is plausible, which explains why the error in the translation is of the same order of magnitude as the translation itself.
7. Conclusions and future work
This work has explained the methodology applied to compute odometry from visual information, establishing each of the required stages and their tuning parameters.
The images captured by the AIBO were successfully published through ROS, but their poor quality, together with drops in data transmission, led to the assembly of a stereo vision rig.
The three-dimensional reconstruction and the computation of the rotation were tuned correctly, which gives optimal control over the rotation phase of the odometry computation. Although the correct operation of these stages was confirmed, faults were detected in the computation of the translation, as shown by the great variability of the data obtained.
As justified earlier for each of the translation experiments, the observed variability is explained by small variations in the image correspondence: a variation of a few pixels becomes a variation of centimetres in the computed translation, an error of the same order of magnitude as the translation itself. The error in the results is therefore significant in both experiments, and the extracted translation is not conclusive.
Finally, the computed rotation is found to be correct, so the AIBO is able to extract from the environment the information it needs to perform the desired rotation, even though the extracted translation does not match the real one. Consequently, if the work is limited to extracting the rotation, all the objectives set at the start of the project have been met, including interaction with the environment and autonomous behaviour. In the translation phase, however, the requirements set were not met.
The work could be extended by creating a database of the captured images. The robot would thus build a representation of the environment from the captured images; the more captures it makes, the better its description of the environment, so the number of unknown states would decrease.
Another line of work, once the database has been built, would be to apply SLAM, which allows a map of the navigation environment to be created from the analysed information, together with the robot's location on the map.
As a future consideration, the AIBO could incorporate an obstacle detection system that defines the direction of movement instead of receiving it from the computer keyboard, bringing it closer to artificial intelligence.
Finally, the various processes could also be optimised to increase processing speed.
References
[Baillie, 2005a] Jean-Christophe Baillie. URBI Language Specification. 2005.
[Baillie, 2005b] Jean-Christophe Baillie. URBI: Towards a universal robotic low-level programming language. In 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS, pages 3219–3224, 2005.
[Bradski, 2000] G. Bradski. The OpenCV Library. Dr. Dobb's Journal of Software Tools, 2000.
[Decuir et al., 2004] John D. Decuir, Todd Kozuki, Victor Matsuda, and Jon Piazza. A friendly face in robotics: Sony's AIBO entertainment robot as an educational tool. Computers in Entertainment, 2(2):14, 2004.
[Flores and Braun, 2011] Pablo Flores and Juan Braun. Algoritmo SIFT: fundamento teórico. Pages 1–5, 2011.
[Gostai, 2014] Gostai. URBI Doc for Aibo ERS2xx ERS7 and URBI 1.0, 2014. Accessed: 6 March 2014.
[Herbert Bay and Van Gool, 2006] Herbert Bay, Tinne Tuytelaars, and Luc Van Gool. SURF: Speeded up robust features. In Computer Vision – ECCV 2006, pages 404–417. Springer, 2006.
[Hohl et al., 2006] Lukas Hohl, Ricardo Tellez, Olivier Michel, and Auke Jan Ijspeert. Aibo and Webots: Simulation, wireless remote control and controller transfer. Robotics and Autonomous Systems, 54(6):472–485, June 2006.
[Huang and Dissanayake, 2007] Shoudong Huang and Gamini Dissanayake. Convergence and Consistency Analysis for Extended Kalman Filter Based SLAM. IEEE Transactions on Robotics, 23(5):1036–1049, October 2007.
[Kertész, 2013] Csaba Kertész. Improvements in the native development environment for Sony AIBO. International Journal of Interactive Multimedia and Artificial Intelligence, 2(3):51, 2013.
[Kolovrat, 2013] Stipe Kolovrat. Development of Android Application for Sony Aibo ERS-7 robot. Master's thesis, Universitat Politècnica de Catalunya, 2013.
[Lowe, 2004] David G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, volume 60, pages 91–110. Springer, 2004.
[Pérez et al., 2010a] Xavier Pérez, Cecilio Angulo, and Sergio Escalera. Vision-based Navigation and Reinforcement Learning Path Finding for Social Robots. PhD thesis, Universitat Politècnica de Catalunya, 2010.
[Pérez et al., 2010b] Xavier Pérez, Cecilio Angulo, Sergio Escalera, and Diego E. Pardo. Vision-based navigation and reinforcement learning path finding for social robots. In Jornadas ARCA 2010, June 2010.
[Pérez-Sala et al., 2011] Xavier Pérez-Sala, Cecilio Angulo, and Sergio Escalera. Biologically inspired turn control for autonomous mobile robots. In CCIA, pages 189–198, 2011.
[Quigley et al., 2009] Morgan Quigley, Ken Conley, Brian P. Gerkey, Josh Faust, Tully Foote, Jeremy Leibs, Rob Wheeler, and Andrew Y. Ng. ROS: an open-source robot operating system. In ICRA Workshop on Open Source Software, 2009.
[Riisgaard and Blas, 2005] Søren Riisgaard and Morten Rufus Blas. SLAM for dummies: A tutorial approach to simultaneous localization and mapping. Technical report, 2005.
[Tek, 2014] Tekkotsu, 2014. Accessed: 18/06/2014.
[Téllez, 2004] Ricardo A. Téllez. Aibo Quickstart Manual. Technical report, Grup de Recerca en Enginyeria del Coneixement (GREC), 2004.
[Zhang and Chen, 2007] Jiaqi Zhang and Qijun Chen. Learning based gaits evolution for an AIBO dog. 2007 IEEE Congress on Evolutionary Computation, pages 1523–1526, September 2007.
A statistical study of the e-mails received by a present-day researcher in the knowledge areas of computing and bioengineering
Luis González-Abril (a) and Yenny Leal (b)
(a) Departamento de Economía Aplicada I. Universidad de Sevilla. Avda. Ramón y Cajal 1, 41018 Sevilla (Spain)
(b) Department of Information Engineering, University of Padova. Via G. Gradenigo 6/B, 35131 Padova (Italy)
Abstract
This article analyses in depth the correspondence received by e-mail by a university researcher over a period of two months. It reports the number of invitations to conferences and/or journals ("calls for papers") and to other scientific events, and assesses the quality of these calls for participation in standard terms (the conference's position in reference rankings and the indexing of the journals). Particular attention is paid to invitations that require a payment in order to obtain a publication. Other types of correspondence that are not aimed at a publication but belong to the same environment are also presented.
1 Introduction
Nowadays, the first thing a researcher usually does on sitting down in the office is to check e-mail and deal with the messages in the inbox.
Although most scientific institutions have a firewall on their institutional mail that limits the entry of unwanted (or "junk") mail, many messages escape these controls and go straight to the inbox, or are not rejected outright by the system and are placed in a folder usually called SPAM (footnote 1). Even when the user, drawing on previous experience of unwanted mail, has set up filters to reject requests of various kinds, such as "unwanted advertising" (commercial ads, pornography ads, get-rich schemes, services of doubtful legality, travel, ...) or "non-commercial advertising" (so-called JUNK-MAIL: chain letters, fake virus warnings, fake prizes for forwarding messages, fake requests for access passwords, ...), full control is not achieved.
Obviously, the percentage of junk mail depends on each researcher, but in no case is it negligible. It wastes resources such as mail storage space, bandwidth and, above all, working hours. In addition, the arrival of these messages increases the risk of letting in some of the viruses that constantly appear on the internet.
In general, junk mail is a way of flooding the Internet with many copies of the same message, in an attempt to reach people who would otherwise never receive or read it. However, most of these e-mails do not get past the institution's filter, since their behaviour patterns are known and, when they are not yet known, they are usually detected promptly and their spread is cut off.
The aim of this work is not to focus on traditional spam, but on unwanted e-mails that refer to research topics yet are far from the recipient's field or, even when related, are of no interest to the researcher at that moment.
For example, an invitation to take part in a scientific event on control theory is not a relevant invitation when the researcher's line of work concerns social inequality.
Another generally unwanted invitation is one that entails a high cost in money and/or time. An example is "The 2015 international conference on new energy and renewable resources will be held from July 4-5, 2015, in Guangzhou" for a researcher from Cádiz: the two cities have poor air connections and, because of the local language, getting around at the destination is very difficult, which means time spent arranging the trip. Moreover, once the preparations are complete, the journey alone would take more than 4 full days and the cost would certainly not be below 1500 euros.
In this work, the correspondence received by e-mail by a university researcher over a period of two months is analysed.
The rest of the article is structured as follows: Section 2 studies some significant types of e-mail. Section 3 presents statistics on the different types of e-mail received. The article ends with a section of conclusions.
Footnote 1: This word dates from the Second World War, when the relatives of soldiers at war sent them canned food; among these was a canned meat called spam, which is very common in the United States.
2 Data analysis
This study covers an uninterrupted period of two months, from 4 March to 3 May 2015, during which all the e-mails received by a university researcher in their institutional mail account were collected, and a descriptive statistical study of them was carried out.
The total number of e-mails received in that period was 4126. Given that the period analysed spans 61 days, this gives a mean of 67.64 e-mails per day. Of these, 1032 were invitations to scientific events, journal publications, conferences, workshops, keynote talks, associate editorships, products related to the researcher's work, and so on, which amounts to 25.01% of all the e-mails received. Figure 1 shows an example of this type of e-mail: an advertisement offering a health-related product, a topic the researcher has addressed in some scientific publications. In other e-mails the sender's insistence is striking: for example, the sender EAI Events (footnote 2) alone accounts for 45 e-mails, that is, almost one a day.
Figure 1: Example of an unwanted e-mail.
First, let us estimate the time devoted to this type of e-mail. The senders are generally very experienced, so these e-mails cannot always be detected at first sight, that is, just by looking at the subject line. The usual process is therefore, once a message is in the inbox, to open it, read its most significant part and, if it is of no interest, close it and delete it. Let us assume that this takes between 30 and 60 seconds. This value was estimated after consulting not only the researcher analysed but also other researchers in the same or related areas.
Obviously, the great majority are discarded in under 30 seconds but, as statistics tells us, on some others much more time is lost, since simply reading them is not enough. In certain cases one has to analyse whether the e-mail really is of interest, and it often stays for some time in a suitable mail folder until it is finally deleted or (less often) taken into consideration.
In many other cases, as happened to the researcher analysed, after receiving several e-mails of the same type and wanting the system itself to reject them automatically, the user has had to set up a filter in the browser for that purpose, with the resulting cost in time and in slowing down the mail system, which must check whether each incoming e-mail matches the filter (for example, with the e-mail that motivated Figure 1).
It should be noted here that some e-mails include in the message body the option to stop receiving mail, that is, to unsubscribe. However, these e-mails are sent automatically, and in many cases the sender does not know whether the address (that is, the address of the researcher analysed) is valid. Therefore, replying with an unsubscribe request directly confirms that the address is valid, and the recipient is then "on file" for good, or until the e-mail address is changed.
Returning to the time estimate: if 45 seconds are lost on average and 1032 e-mails were received in the two months of the study, the total is 774 minutes, that is, 12 hours and 54 minutes. Thus, taking a normal working day of 8 hours, in just two months more than a day and a half would be lost merely reading and rejecting this type of e-mail.
Figure 2: Invitation to the 6th Workshop on Soft Computing in Image Processing and Computer Vision (SCIPCV).
Figure 3: The selected EANN articles will be considered for publication in a special issue of the Springer journal "Neural Computing and Applications", with impact factor 1.763, and in the Springer journal "Evolving Systems".
Footnote 2: The European Alliance for Innovation is a dynamic ecosystem for fostering ICT innovation to improve European competitiveness and to benefit society.
Topics
Suggested topics include, but are
not limited to, the following:
Neural networks techniques
Learning theory
Evolutionary architectures
Unsupervised Learning
Reinforcement Learning
Adaptive architecture
Hybrid systems and Modeling
Hardware development
Low cost architectures
Research areas
Computer vision
Pattern Recognition Classification
Colour, Motion analysis
Image Video Processing
Signal Processing
Fusion
Telecommunications
Robotics
Intelligent Transportation Systems
Financial Forecasting
Time Series Analysis
Data mining
Adaptive Control
Modelling and identification
Prediction
Process Monitoring and Diagnosis
Intelligent Agents
Multi Agent Systems
Real Time Intelligent Systems
Bioinformatics
Fuzzy logic and systems
Support vectors machines
Humanistic Data Mining
Optimization
Genetic Algorithms and Optimization
Feature Minimization
Filters
Clustering - Fuzzy Clustering
As for the persistence in sending invitations, for the "6th Workshop on Soft Computing in Image Processing and Computer Vision (SCIPCV)", to be held in Las Vegas (USA) in July 2015, 4 e-mails were received in little more than a month (see Figure 2). The same happens with the "16th EANN 2015 Engineering Applications of Neural Networks", to be held on the island of Rhodes (Greece) in September 2015, for which 5 invitations to participate were received. Let us focus, as a reference, on this last invitation, since it is an "attractive e-mail" in that it states that the selected articles will be considered for publication in a high-impact journal (see Figure 3). First, one must check that one actually has a piece of work that fits the topics of the conference. This is generally not a handicap, since the range of topics proposed is very wide, in order to reach as many researchers as possible (see Figure 4). Specifically, this conference lists 3 areas and 50 items in total.
However, reading the conference fees (see Figure 5), together with the travel and accommodation costs, makes one not only reconsider the matter but, in most cases, drop it.
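The descriptive figures quoted in this section (daily mean, share of invitations, and time lost) follow from simple arithmetic on the reported counts, which can be checked with a short sketch. The variable names are illustrative, and the 45 seconds per e-mail is the average assumed in the text.

```python
# Counts reported in the study (4 March to 3 May 2015).
total_emails = 4126
days = 61
invitations = 1032
seconds_per_email = 45          # assumed average, midpoint of 30-60 s
working_day_hours = 8

mean_per_day = total_emails / days                 # e-mails per day
share = 100 * invitations / total_emails           # % of invitations
minutes_lost = invitations * seconds_per_email / 60
hours, minutes = divmod(int(minutes_lost), 60)
working_days = minutes_lost / 60 / working_day_hours

print(f"{mean_per_day:.2f} e-mails/day, {share:.2f}% invitations")
print(f"{minutes_lost:.0f} min = {hours} h {minutes} min "
      f"= {working_days:.2f} working days")
```

The result depends linearly on the assumed reading time: at the lower bound of 30 seconds the loss would be 516 minutes, and at the upper bound of 60 seconds, 1032 minutes.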
Open-access journals (academic journals that are available online to the reader) are often advertised by e-mail, much as products are advertised on TV, and the doubtful reputation (footnote 3) of some of them should be noted. So much so that there are even lists of journals and institutions available to lecturers and researchers, such as Jeffrey Beall's list of Predatory Publishers (footnote 4). According to this list, in 2013 a total of 242 publishers were catalogued as "Predatory Publishers" and a total of 126 journals as "Individual Journals". Moreover, these journals generally charge a publication fee that in some cases is far from cheap; see, for example, the cases shown in Figure 6.
There is another type of e-mail that is initially flattering to the reader: one inviting the recipient to be an associate editor of a journal, to join the technical committee of a conference, or to give a keynote talk, even if it has little to do with the recipient's line of research (see, for example, Figure 7).
Engineering Applications
Civil Engineering
Fuzzy Inference Systems
Medical and Biomedical Engineering
Decision Making
Industrial -Manufacturing Engineering Applications
Computer science
Thermal Engineering
Financial Engineering
General Engineering Applications
Environmental Engineering
Natural Disasters Applications
Risk Modeling
Social Media Applications
Chaos
3 An interesting article can be found at http://multipliciudades.org/2014/06/24/cuidado-con-los-timos-enrevistas-open-access/
4 http://scholarlyoa.com/2012/12/06/bealls-list-of-predatorypublishers-2013/
Figure 4: Non-exclusive list of topics of the 6th Workshop on Soft Computing in Image Processing and Computer Vision (SCIPCV)
Registration Fees
Registration type            Early        Late
Full registration            500 Euros    550 Euros
Full - ENNS/INNS member      450 Euros    500 Euros
Student registration         300 Euros    300 Euros
Student-ENNS/INNS member     250 Euros    250 Euros
Extras
Over-length Fee:             100 Euros per page
Additional Paper Fee:        350 Euros per paper
Figure 5: Fees of the 6th Workshop on Soft Computing in Image Processing and Computer Vision (SCIPCV)

British Journal of Economics, Management & Trade (ISSN: 2278-098X) is an OPEN peer reviewed INTERNATIONAL journal. We offer both Online publication as well as Hard copy options. Publication Charge is only 100 USD as per present offer. This journal is now publishing Volume 7.

We have initiated a Journal called Jacobs Journal of Physiotherapy & Exercise. We are planning to release the edition in the month of June, 2015. On this Occasion, we request you to submit your Research Work as manuscripts to our journal; we would request you to invite your colleagues, experts to submit their manuscripts. Authors are requested to pay 199 USD as publication charges for Inaugural edition.
Figure 6: Online journals that charge for making articles visible.

Dear XXXXX,
We would like to invite you as Guest (Invited) Speaker on Economics and Business in one of our conferences www. wseas. org
a) in Konya, Turkey, May 20-22, 2015 (a town with a great number of monuments of anatolian, ancient greek, hellenistic, roman, byzantine, seltzuk and ottoman civilization as well as modern monuments and museums -- google it) www. wseas. org
and /or
b) in Salerno, Italy, June 27-29, 2015 www. wseas. org , a town with long history and important roman and italian monuments and natural beauties, 1 hour away from Ancient Pompei and Volcano of Vesuvius.
Figure 7: Invitation to give an inaugural talk.

Note how this e-mail emphasises the venue of the event, giving some relevant details about the host city. This rests on the assumption that attendance at a conference is strongly influenced by where it is held, something that is generally accepted in the research community; in other words, work and leisure are not antagonistic.

3 Statistics of the e-mails
The 1032 e-mails received cover a wide variety of invitations: to scientific events, journal publications, conferences, workshops, inaugural talks, associate editorships, products related to the researcher's work, and so on. They have been classified as shown in Table 1, which distinguishes between invitations from journals, invitations to conferences, and others.

3.1 Journals
With regard to journals, the most significant finding is that, of the 286 invitations to contribute, which represent 27.71% of all invitations, more come from open access journals than from journals that are not open access: 196 versus 90. Moreover, among the former, journals that charge a fee outnumber the rest by more than two to one (133 versus 63).
Note that the count of journals recorded as free of charge contains a non-negligible error, because validating each one would take too much time. When one of them looks interesting and you look into it further by going directly to the journal's website, you find that in some cases the information is fraudulent and you do have to pay, in others you have to make a donation (without which there is no possibility of publishing), and in others you are asked for a “subscription” for a period of time in order to offset the costs inherent in publication.
On the other hand, among the invitations from journals that are not open access (90 of 286), the ratio of journals not indexed in the Journal Citation Report (JCR) to those that are indexed is roughly 7 to 1 (79 versus 11). Moreover, of the 11 invitations from JCR journals, 5 come from open access journals with a publication fee. One example is the journal Sensors (see footnote 5), with an impact factor of 2.048 in 2015, whose website states the following in its “Publication fee” section:
For Sensors (ISSN 1424-8220), authors are asked to pay a fee of 1800 CHF (Swiss Francs) per processed paper, but only if the article is accepted for publication in this journal after peer-review and possible revision of the manuscript. Note that many national and private research funding organizations and universities explicitly cover such fees for articles originated in funded research projects. Discounts are available for authors from institutes that participate with MDPI’s membership program.

3.2 Conferences
The number of invitations to conferences is by far the largest of all, since they account for almost 60% of the total.
The main lure of conferences is the possibility of a subsequent publication in a journal, and if that journal is in the JCR the appeal is greater, as shown in the previous section. However, it is not usually easy to get a journal with an impact factor involved in a call for papers for a special issue, for various reasons (the selection process, the quality of the papers, the involvement of the conference TPC (see footnote 6) in the subsequent editing process, and so on). In this study only 18 of the 608 conference invitations are of this kind.
In contrast, 518 conference invitations have no journal behind them. The most notable thing about these, as mentioned in the previous section, is that alongside the conference topic what is most strongly emphasised is the venue; the European destinations centre on Lisbon, London, Paris, Rome, the Greek islands, and so on.

3.3 Others
The remaining invitations concern matters peripheral to research. Invitations of this kind account for 13.37% of the total, that is, a significant share of all e-mails.
These invitations are very varied; they include:
• Proof-reading offers. These invitations stress the need for a good technical level of written English in papers and offer their services to polish your research before you submit it to relevant journals and/or conferences.
• Invitation to join the TPC. After some brief flattery about your merits, you are offered a place on the conference programme committee, and both you and the rest of the team you belong to are encouraged to promote the event and to contribute one or more papers. In this way they try to secure an audience and the success of the conference.
• Invitation to give an opening or closing talk. This pursues the same aim as the previous item and could be regarded as a higher form of recognition.
• Invitation to edit a book on a given topic. In this case a publisher contacts you to encourage you to put a book together. You are asked to find other researchers to complete the book, although in most cases those researchers are charged a “fee” for publishing their chapter. The issue of payment comes up again, since it is the authors themselves who cover the cost of publishing the book.
• Invitations to attend courses, seminars, talks, ...
• ....

4 Conclusions
From the analysis of this researcher's e-mails that relate in one way or another to the research carried out professionally, the most significant finding is the time devoted to them, which, extrapolated to a full year, amounts to 77 hours and 24 minutes, that is, almost 10 working days of 8 hours.

Acknowledgements
This work is partially funded by the HERMES project (TIN2013-46801-C4-1-r) of the Ministerio de Economía y Competitividad and by the Simon project (TIC8052) of the Consejería de Economía, Innovación y Ciencia of the Junta de Andalucía.

5 Sensors (ISSN 1424-8220; CODEN: SENSC9) is the leading international, peer-reviewed, open access journal on the science and technology of sensors and biosensors. Sensors is published monthly online by MDPI.
6 Technical Programme Committee

Table 1: Statistics of the e-mails received. Journal Citation Report (JCR). (*1) denotes invitations to serve as technical programme committee (TPC) member, book editor, speaker, etc., and (*2) invitations to attend summer schools and courses, proof-reading offers, competitions, ...
Category      Subcategory        Type          Count    Share
Journals      Open access        With fee        133    27.71%
                                 Without fee      63
              Not open access    JCR              11
                                 Not JCR          79
Conferences   Without journals                   518    58.91%
              With journals      JCR              18
                                 Not JCR          72
Others        (*1)                               102    13.37%
              (*2)                                36
Total                                           1032   100.00%
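The shares reported in Table 1 and the working-day figure given in the conclusions can be re-derived from the raw counts. What follows is a minimal Python sketch and not part of the original study: the names TABLE_1, category_totals, shares and working_days are hypothetical, the row labels are translations, and the assignment of 102 to (*1) and 36 to (*2) assumes the order in which the two values appear in the table.

```python
# Counts transcribed from Table 1. The labels are illustrative translations,
# and the (*1)/(*2) split is assumed from the order shown in the table.
TABLE_1 = {
    "Journals": {
        "open access, with fee": 133,
        "open access, without fee": 63,
        "not open access, JCR": 11,
        "not open access, not JCR": 79,
    },
    "Conferences": {
        "without journals": 518,
        "with journals, JCR": 18,
        "with journals, not JCR": 72,
    },
    "Others": {"(*1)": 102, "(*2)": 36},
}


def category_totals(table):
    """Sum the counts of every row inside each top-level category."""
    return {name: sum(rows.values()) for name, rows in table.items()}


def shares(table):
    """Return each category's percentage of all invitations."""
    totals = category_totals(table)
    grand_total = sum(totals.values())
    return {name: 100.0 * n / grand_total for name, n in totals.items()}


def working_days(hours, minutes, hours_per_day=8):
    """Convert a duration into working days of `hours_per_day` hours."""
    return (hours + minutes / 60) / hours_per_day


if __name__ == "__main__":
    totals = category_totals(TABLE_1)
    print(totals, "total:", sum(totals.values()))
    for name, pct in shares(TABLE_1).items():
        print(f"{name}: {pct:.2f}%")
    print(f"{working_days(77, 24):.3f} working days")
```

Summing the rows this way also makes the open access subtotal explicit (133 + 63), which is the figure to compare against the 90 invitations from journals that are not open access.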
Author Index
Agell, Núria ..... 14
Álvarez-García, Juan Antonio ..... 53, 57, 63
Angulo, Cecilio ..... 67, 75
Arcos García, Álvaro ..... 53, 57
Arias Fisteus, Jesús ..... 53, 63
Arias-Sánchez, Pedro ..... 57, 63
Català Mallofré, Andreu ..... 22
Córcoba Magaña, Víctor ..... 45, 53, 63
Datko, Patrick ..... 18
Díaz Boladeras, Marta ..... 22
Eberlei, Andreas ..... 18
Falomir, Zoe ..... 1, 6, 65, 73
Fernández Cerero, Damián ..... 11
Fernández Montes, Alejandro ..... 11, 65
Fernández Rodríguez, Jorge Yago ..... 53, 57, 63
González Abril, Luis ..... 6, 65, 73, 83
Herzberg, Manuel ..... 18
Kaltenbach, Jonas ..... 18
Kothakota, Sai Kishor ..... 67
Ladra, Susana ..... 63
Lázaro, Juan Pablo ..... 26
Lealb, Yenny ..... 83
Lillo Fantova, Lucía ..... 75
Luaces, Miguel ..... 63
Martí, Isabel ..... 26
Martínez Madrid, Natividad ..... 33, 34
Martínez, Ángel ..... 26
Muñoz Organero, Mario ..... 45, 53, 63
Museros, Lledó ..... 6, 73
Nguyen, Jennifer ..... 14
Ortega Ramírez, Juan Antonio ..... 11, 32, 33, 34, 39, 53, 57, 63
Perugia, Giulia ..... 22
Ringgeler, Dominik ..... 18
Riveiro, Belén ..... 57, 63
Sánchez Fernández, Luis ..... 53
Sánchez Hernández, Germán ..... 14
Sanz, Ismael ..... 6, 73
Scherz, Wilhelm D. ..... 18, 32, 39
Seepold, Ralf ..... 18, 32, 39
Soilán, Mario ..... 57
Soria Morillo, Luis Miguel ..... 32, 33
Thimm, Tatjana ..... 18
Torres, Jesús ..... 53
Velasco, Manel ..... 75
Yay, Emre ..... 33, 34
