RETOS INVESTIGACIÓN - grfia

Subdirectorate General for Research Projects
Call for grants for R&D&I Projects
«RETOS INVESTIGACIÓN»
R&D&I PROGRAMME ORIENTED TO THE CHALLENGES OF SOCIETY
2013
SCIENTIFIC-TECHNICAL REPORT FOR COORDINATED PROJECTS
INSTRUCTIONS FOR COMPLETING THE SCIENTIFIC-TECHNICAL REPORT FOR
“RETOS INVESTIGACIÓN” R&D&I PROJECTS OF THE STATE R&D&I PROGRAMME
ORIENTED TO THE CHALLENGES OF SOCIETY
Read these instructions carefully in order to complete the scientific-technical report correctly.
1. This report template has a restricted maximum length, so entries must be kept within the spaces
indicated.
2. Reports may be completed in Spanish or in English, except for section 1, SUMMARY OF THE PROPOSAL,
which must be completed in both languages.
3. It is recommended to complete the report on a PC running Windows, using MS Word (MS Office) as the
word processor.
4. To enter text, place the cursor in the shaded areas. 4000 characters are approximately one page.
5. Once the report is finished, save the file in PDF format (no larger than 4 MB) and attach it to the
project's online application under “Añadir documentos” (Add documents).
6. Because this form is designed to hold only text in a fixed typeface, any formulas, chemical reactions,
mathematical expressions, etc., or explanatory figures must be placed in Annexes I and II, respectively,
after being cited in the body of the text. The indicated length must not be exceeded.
7. The form supports “copy and paste”.
IMPORTANT NOTICE
Pursuant to Article 11 of the call, SCIENTIFIC-TECHNICAL REPORTS that are not submitted in this format
WILL NOT BE ACCEPTED AND CANNOT BE CORRECTED.
1. SUMMARY OF THE PROPOSAL
(It should also be completed in English)
PRINCIPAL INVESTIGATOR 1 (Name and surname):
José Manuel Iñesta Quereda
PRINCIPAL INVESTIGATOR 2 (Name and surname):
Rafael Ramírez Melendez
TITLE OF THE COORDINATED PROJECT (Spanish):
Tecnologías interactivas para el aprendizaje de música
ACRONYM OF THE COORDINATED PROJECT:
TIMuL
TITLE OF THE COORDINATED PROJECT (English):
Technologies for interactive music learning
SUMMARY OF THE COORDINATED PROJECT (translated from the Spanish version)
It must contain the most relevant aspects, the proposed objectives and the expected results.
The project summary must also be completed in the application form. Its content may be published for
dissemination purposes if the project is funded under this call.
Maximum 3500 characters
Reaching a high level of expertise in music requires a long path of musical education, learning and intensive
practice. Learning to play an instrument is based fundamentally on the master-apprentice model, in which the teacher
essentially provides verbal feedback on the student's performance. In such a learning model, technology is rarely
used and almost never goes beyond audio and video recording. In addition, the students' interaction and socialization
are often restricted to brief, occasional contact with the teacher, followed by long periods of study and practice
alone, which often turn musical learning into a solitary experience and result in high dropout rates. As in other
disciplines such as sport, where technology is routinely used to improve the training and performance of athletes,
this project proposes to incorporate the latest technological advances into music training, in order to define
optimal pedagogical methods and tools that facilitate learning and make music learning a more interactive and social
process.
The main objective of the project is to study how we learn music performance from a pedagogical and scientific
perspective, and to create new assistive, multimodal, interactive and socially aware systems that complement
traditional teaching. Music performance is not just playing the right note at the right time. Our project aims to
investigate and explore all the relevant aspects in order to produce methods and tools for music education with
innovative pedagogical paradigms, taking into account key factors such as expressivity, human-machine interactivity,
gesture control and cooperative work among participants.
As a result of a tightly coupled interaction between the participating partners, the project will try to answer
questions such as "What will music learning environments be like in 5 to 10 years?" and "What impact will these new
musical environments have on music learning in general?"
The general objectives of the project are: (1) to design and implement new multimodal interaction paradigms for
learning and training, based on audio and music analysis tools and pattern recognition techniques, (2) to evaluate
the effectiveness of these new paradigms from a pedagogical point of view, (3) based on the evaluation results, to
develop interactive multimodal prototypes for music learning, creating learning scenarios both for solo study and
for collaboration between students, and (4) to create publicly available reference databases of music recordings
with multimodal information for cooperative learning and subsequent preservation. The results of the project will
serve as a basis for the development of the next generation of music learning systems, improving current
student-teacher interaction and the student's solo practice. In addition, they will provide the potential to make
music education accessible to a wider public.
KEYWORDS
Maximum 200 characters
Music technology, music learning, machine learning, pattern recognition, human-machine interaction,
multimodality, pedagogy, individual learning, collective intelligence
SUMMARY OF THE COORDINATED PROJECT
It should contain the most relevant topics of the project, the objectives and the expected results.
The summary should also be completed in the electronic application. It may be published for dissemination purposes
if the project is funded under this call.
Maximum 3500 characters
Attaining a high level of musical expertise requires a long learning trajectory and intensive practice. Learning
to play music is mostly based on the master-apprentice model, in which the teacher mainly gives verbal feedback on the
student’s performance. In such a learning model, modern technologies are rarely employed and almost never go
beyond audio and video recording. In addition, the student’s interaction and socialization are often restricted to
brief, occasional contact with the teacher followed by long periods of self-study, which often makes musical learning
a lonely experience and results in high dropout rates. As in other disciplines such as sport, where technology is
commonly used to improve the training and performance of athletes, this project proposes to incorporate the latest
technological advances into music training in order to define optimal pedagogical methods and tools, to facilitate
learning, and to make music learning a more interactive and social process.
The main aim of the project is to study how we learn music performance from a pedagogical and scientific perspective
and to create new assistive, multimodal, interactive, and socially aware systems complementary to traditional teaching.
Music performance is not simply playing the right note at the right time.
Our project aims to investigate and explore all the relevant aspects in order to produce methods and tools for music
education with innovative pedagogical paradigms, taking into account key factors such as expressivity, interactivity,
gesture control, and cooperative work among participants. As a result of a tightly coupled interaction between the
participating partners, the project will try to answer questions such as "What will music learning environments look
like in 5-10 years’ time?" and "What impact will these new musical environments have on music learning as a whole?"
The general objectives of the project are: (1) to design and implement new multi-modal interaction paradigms for music
learning and training based on state-of-the-art audio processing, music analysis and pattern recognition techniques, (2)
to evaluate from a pedagogical point of view the effectiveness of such new paradigms, (3) based on the evaluation
results, to develop new multimodal interactive music learning prototypes for student-teacher, student only, and
collaborative learning scenarios, and (4) to create a publicly available reference database of music recordings with
multimodal information for cooperative learning. The results of the project will serve as a basis for the development of
next generation music learning systems, thereby improving on current student-teacher interaction, student-only practice,
and furthermore providing the potential to make music education accessible to a substantially wider public.
KEY WORDS
Maximum 200 characters
Music technology, music learning, machine learning, pattern recognition, human-computer interaction, multimodality,
pedagogy, individual learning, collective intelligence
2. INTRODUCTION
2.1. Background and current state of scientific-technical knowledge on the specific subject of the project,
including, where applicable, previous results of the research team and of other groups working on the same subject,
and the relationship, where applicable, between the applicant group and other national and foreign research groups;
- if the project is a continuation of a previously funded one, the objectives and the results achieved must be
clearly stated so that the real progress proposed in the requested project can be assessed.
- if the project addresses a new topic, the background and previous contributions of the research team must be
stated in order to justify its capacity to carry out the new project.
Maximum 16 000 characters
Background
Playing a musical instrument is a highly complex activity. It requires a combination of mental and sensory-motor
skills, which are acquired over a long learning trajectory. Music pedagogy is a long-standing tradition, mostly
based on a master-apprentice model in which the student observes and imitates the teacher, the teacher
provides verbal feedback on the student’s performance, and the student engages in long periods of self-study
without teacher supervision. However, learning under the master-apprentice model is difficult because it requires the
student to mirror the teacher’s body language, and verbal feedback remains open to ambiguous interpretation. In
addition, the time lag between the student’s performance and the teacher’s feedback dissociates the feedback from
the online proprioceptive and auditory sensations accompanying the performance (Welch 1985). This is especially
relevant since most of the student’s practice takes place long after the teacher’s feedback.
The proposed project addresses these problems by designing and implementing new multi-modal interaction
paradigms for music learning and by developing assistive, interactive and multimodal environments complementary to
traditional teaching.
The master-apprentice model works well for teaching and learning artistic values and conventional performance
practices; however, approaches to teaching the biomechanics of musical performance may be based on subjective and
vague perception rather than on an accurate understanding of the principles of human movement (Brandfonbrener 2003).
Another characteristic of the master-apprentice model is that it often consists of brief, scattered interaction
between the teacher and the student followed by long periods of private study by the student. This frequently makes
learning a musical instrument a rather harsh and solitary experience, resulting in high dropout rates (Aróstegui
2011). Other factors, such as the belief that long hours of repetitive practice are needed to acquire technical
skills, or the competitive nature of music performance, often lead to practice routines that pay little attention to
their own efficacy. For the student, the consequences of such an uninformed pedagogy can range from a frustrating
lack of progress to chronic problems, or even forced breaks from performing for extended periods. Despite many
individual success stories, documented rates of injury among musicians call into question the success of traditional
approaches (Zaza 1998). Pedagogy informed by science has the potential to make the acquisition of performance skills
more efficient and effective. Only recently have researchers begun to explore scientific approaches in music-related
research. Musical performance shares many characteristics with other skill-oriented activities (Robson 2004). For
example, the commonalities between sports and musical performance are evident, particularly in the area of
biomechanics. It is therefore reasonable to postulate that some methodologies already used successfully in sports
could be useful in studying aspects of musical performance. These include motor learning theory, learning models, and
the use of technology to analyze and validate the effects of training (Hay 1993; Magill 2001).
This project proposal addresses all the issues mentioned above from a pedagogical and scientific perspective. As a
result of a tightly coupled interaction between its partners, the project aims (1) to design and implement new multi-modal
interaction paradigms for music learning and training based on state-of-the-art audio processing, music analysis and
pattern recognition techniques, (2) to evaluate from a pedagogical point of view the effectiveness of such new
paradigms, (3) based on the evaluation results, to develop new multi-modal interactive music learning prototypes for
different learning frameworks, and (4) to create a publicly available reference database of recordings for browsing,
visualising, annotating and exchanging multimodal information. An example of a typical scenario in the project is
represented in Figure 1.
Sample Scenario
To illustrate the project application, consider the following scenario: a student is learning at home without a
teacher’s guidance. The student has access to a public database of reference performances and to the augmented
feedback system, relying only on basic computer microphone and webcam capabilities. For instance, the student may
want to practice vibrato. The system may then show a curve representing the student’s fundamental frequency (pitch)
computed in real time (Figure 2). The student can load recorded vibratos from the reference database and compare
them with her own. Additionally, the system computes several vibrato parameters, such as the rate (how fast) and
depth (how intense), and compares them to the parameters of the reference vibratos.
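The vibrato rate and depth mentioned in this scenario can be estimated from a fundamental-frequency contour. The following is a minimal Python sketch under stated assumptions, not the method used by the project: it assumes that an f0 contour is already available as one value in Hz per analysis frame, and the function names hz_to_cents and vibrato_parameters, the zero-crossing rate estimate and the half peak-to-peak depth measure are all illustrative choices that do not come from the proposal.

```python
import math


def hz_to_cents(f0_hz, ref_hz):
    """Convert a frequency in Hz to cents relative to ref_hz."""
    return 1200.0 * math.log2(f0_hz / ref_hz)


def vibrato_parameters(f0_hz, frame_rate):
    """Estimate vibrato rate (Hz) and depth (cents) from an f0 contour.

    f0_hz      -- list of fundamental-frequency estimates in Hz, one per frame
    frame_rate -- number of f0 frames per second (hypothetical parameter)
    Returns (0.0, 0.0) when fewer than two vibrato cycles are found.
    """
    # Use the geometric mean as centre pitch so deviations are symmetric in cents.
    mean_log = sum(math.log2(f) for f in f0_hz) / len(f0_hz)
    centre_hz = 2.0 ** mean_log
    cents = [hz_to_cents(f, centre_hz) for f in f0_hz]

    # Each upward zero crossing of the deviation marks the start of one cycle.
    crossings = [i for i in range(1, len(cents))
                 if cents[i - 1] < 0.0 <= cents[i]]
    if len(crossings) < 2:
        return 0.0, 0.0

    # Rate: whole cycles between first and last crossing, divided by elapsed time.
    cycles = len(crossings) - 1
    duration_s = (crossings[-1] - crossings[0]) / frame_rate
    rate_hz = cycles / duration_s

    # Depth: half of the peak-to-peak pitch deviation.
    depth_cents = (max(cents) - min(cents)) / 2.0
    return rate_hz, depth_cents
```

A student's values could then be compared with those of a reference vibrato simply by taking the differences in rate and depth; a real system would also need to handle unvoiced frames and slow pitch drift, which this sketch ignores.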
The student may also want to practice timbre. The system may provide visual representations of the timbre, for
instance a spectrogram with a linear regression of the spectral peaks (Figure 3) and the so-called Schelleng
diagram, a plot of bow force against bow-bridge distance in which two lines delimit the only region where the balance
between force and distance produces what is accepted as a "good" sound in traditional playing. Alternatively, the
student may wish to improve his/her expression in a particular music piece. In this case, the system may provide
visual and auditory representations of expressive performances of the same piece by experts, either because the
database already contains such performances or by applying precomputed expressive performance models of experts to
the score of that piece (Ramirez 2012; Ramirez 2010). Students and teachers may decide to upload their recorded
audio, video, and/or feedback to a shared database, or access recorded feedback uploaded by other students, and
comment on or receive comments about their performances. Students and teachers will be able to explore and reason
about educational data in the shared database in order to better understand the students' knowledge, assess their
progress and evaluate the environments in which they learn.
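The linear regression of spectral peaks mentioned for the timbre display can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: it assumes a single magnitude spectrum (in dB) has already been computed for one analysis frame, it treats every local maximum as a peak, and the function names pick_peaks, fit_line and spectral_slope are hypothetical and do not come from the proposal.

```python
def pick_peaks(freqs_hz, mags_db):
    """Return (frequency, magnitude) pairs at local maxima of the spectrum."""
    peaks = []
    for i in range(1, len(mags_db) - 1):
        # A bin is a peak if it rises above its left neighbour
        # and is not exceeded by its right neighbour.
        if mags_db[i - 1] < mags_db[i] >= mags_db[i + 1]:
            peaks.append((freqs_hz[i], mags_db[i]))
    return peaks


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two points of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def spectral_slope(freqs_hz, mags_db):
    """Fit a regression line through the spectral peaks of one frame.

    Returns (slope in dB per Hz, intercept in dB).
    """
    peaks = pick_peaks(freqs_hz, mags_db)
    xs = [f for f, _ in peaks]
    ys = [m for _, m in peaks]
    return fit_line(xs, ys)
```

Under these assumptions, a steeper negative slope means that the higher partials are weaker relative to the lower ones, so the slope can be tracked from frame to frame as a rough, single-number descriptor of brightness.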
Previous results
In the past, the MTG-UPF group has participated in several projects related to the topic of the current proposal.
The OpenDrama project aimed at developing a novel platform to author and deliver rich cross-media digital objects of
lyric opera, opening this heritage to learning, exploration and entertainment. The MODEM project was also closely
related to the ideas of TIMuL: it sought to develop a virtual learning website for carrying out creative web music
projects, offering advanced training courses for the production and exchange of those projects inside the virtual
learning area. The MTG-UPF group was also in charge of implementing the SingingTutor, a prototype that interpolates
voice attributes between amateur and professional singers and aligns the user’s utterance with the song lyrics,
focusing on singer training rather than on entertainment. Another related project is the Cost287-ConGAS Action,
which aimed at developing musical gesture data analysis and capturing aspects connected to the control of digital
sound and music processing. Finally, Classical planet, also developed by MTG-UPF, is a platform where
users can freely enjoy and interact with audiovisual recordings of master classes produced by the Reina Sofía School
of Music and Albéniz Foundation.
The gRFIA-UA group also has broad experience in projects related to the topic of this proposal. The DRIMS project
and its predecessor, the PROSEMUS project, which can be considered seminal for the present one, were national
projects applying pattern recognition and signal theory methods to sound signals and symbolic music sequences. In the
GRE09-32 project, interactive software for chordal segmentation of symbolic music sequences was designed and tested.
In GRE-12-34, an interactive framework for representing early Spanish music is being developed. The MIPRCV project,
within the CONSOLIDER-INGENIO program, involved 13 research centers and more than 100 researchers. It was a major
effort to coordinate the skills of experienced research groups in order to advance multimodal and interactive
applications of pattern recognition and machine learning, some of them in the music domain. Finally, the gRFIA-UA
group plans to incorporate results from the PROMETEO/2012/017 project, which focuses on exploiting user feedback in
interactive pattern recognition tasks, into the development of the prototypes in this proposal.
2.2. Most relevant bibliography.
Maximum 8000 characters
(Arlandis 2002) Arlandis, J.; Perez-Cortes, J.C.; Cano, J. (2002), “Rejection Strategies and Confidence Measures for a kNN Classifier in an OCR Task”, 16th. Int. Conf. on Pattern Recognition, pp 576–579, vol 1.
(Aróstegui 2011) Aróstegui, J. L. (2011), “Evaluating Music Teacher Education Programmes”. In Educating Music
Teachers for the 21st Century, pp. 1-14.
(Bernado 2007) Bernado-Mansilla, E.; Maciá-Antolínez, N. (2007), “Modeling problem transformations based on data
complexity”, Artificial Intelligence Research and Development, pp. 133-140.
(Brandfonbrener 2003) Brandfonbrener, Alice G (2003), "Advice from one musician to another: beware of friends bearing
gifts", Medical Problems of Performing Artists 18.3 (2003): 89.
(Bresson 2012) Bresson, J.; Pérez-Sancho, C. (2012), "New Framework for Score Segmentation and Analysis in
OpenMusic", Proc. of the 9th Sound and Music Computing Conference, pp. 506-513.
(Breukelen 1998) van Breukelen, M.; Duin, R.P.W.; Tax, D.M.J.; den Hartog, J.E. (1998), "Handwritten digit recognition
by combined classifiers", Computational Linguistics, pp. 381–386, vol. 34.
(Calvo-Zaragoza 2013a) Calvo-Zaragoza, J; Oncina, J; Iñesta J. M. (2013), "Recognition of Online Handwritten Music
Symbols", Proc. of the 6th Int. Workshop on Machine Learning and Music.
(Calvo-Zaragoza 2013b) Calvo-Zaragoza, J.; Oncina, J. (2013), "Human-Computer Interaction for Optical Music
Recognition tasks". Actas del III Workshop de Reconocimiento de Formas y Análisis de Imágenes, pp. 9-12.
(Garcia 2008) García, V.; Mollineda, R.A.; Sánchez, J.S. (2008), "On the k-NN performance in a challenging scenario of
imbalance and overlapping", Pattern Analysis and Applications, 11(3-4):269-280.
(Garcia 2012) García, S.; Derrac, J.; Cano, J.; Herrera, F. (2012), “Prototype selection for nearest neighbor
classification: Taxonomy and empirical study”, IEEE Trans. on Pattern Analysis and Machine Intelligence, 34(3):417-435.
(Hay 1993) Hay, J. G. (1993), “The biomechanics of sports techniques”, Englewood Cliffs, NJ: Prentice-Hall. pp.
(Illescas 2011) Illescas, P.R.; Rizo, D.; Iñesta, J.M.; Ramirez, R. (2011), "Learning melodic analysis rules", Proc. of the
4th Int. Workshop on Music and Machine Learning.
(Ko 2008) Ko, M. C., Kim, J. H., Kang, H. K., & Lee, J. O. (2008). Development of the Interactive Conference System in
Ubiquitous Computing Environment. In Information Science and Security, Int. Conf. on (pp. 177-180). IEEE.
(Kuncheva 2004) Kuncheva, L. (2004), “Combining Pattern Classifiers. Methods and Algorithms”, Ed. Wiley.
(Lindström 2003) Lindström, E., Juslin, P. N., Bresin, R., & Williamon, A. (2003). “Expressivity comes from within your
soul”: A questionnaire study of music students' perspectives on expressivity. Research Studies in Music Education,
20(1), 23-47.
(Lidy 2010) Lidy, T.; Mayer, R.; Rauber, A.; Ponce de León, P. J.; Pertusa, A.; Iñesta, J. M. (2010), “A cartesian
ensemble of feature subspace classifiers for music categorization”. Proc. of the 11th Int. Society for Music Information
Retrieval Conf., pages 279–284.
(Lisboa 2005) Lisboa, T., Logan, T., Chaffin, R., & Gurung, K. (2005). Thought, action and self in cello performance.
In Proc. of ESCOM2005 Performance Matters.
(Magill 2001) Magill, R. A. (2001), “Augmented feedback in motor skill acquisition”, Handbook of sport psychology, 86-114.
(Mayer 2010) Mayer, R.; Rauber, A.; Ponce de León, P. J.; Pérez-Sancho, C.; Iñesta, J. M. (2010), “Feature selection in
a cartesian ensemble of feature subspace classifiers for music categorisation”. Proc. of the 3rd Int. Workshop on
Machine Learning and Music.
(Micó 2012) Micó, L.; Oncina, J. (2012), "A log square average case algorithm to make insertions in fast similarity
search", Pattern Recognition Letters, vol. 33, pp. 1060–1065.
(Moreno-Seco 2006) Moreno-Seco, F.; Iñesta, J. M.; Micó, M. L.; Ponce de León, P. J. (2006), “Comparison of Classifier
Fusion Methods for Classification in Pattern Recognition Tasks”, LNCS, 705-713.
(Ng 2007) Ng, K., Larkin, O., Koerselman, T., & Ong, B. (2007), “i-Maestro gesture and posture support: 3d motion data
visualisation for music learning and playing”, In Proc. of EVA 2007 London Int. Conf., pp. 11-13.
(Oncina 2009) Oncina, J. (2009), "Optimum Algorithm to Minimize Human Interactions in Sequential Computer Assisted
Pattern Recognition", Pattern Recognition Letters, vol. 30, pp. 558-563.
(Patel 2007) Patel, A. D., Iversen, J. R. (2007), “The linguistic benefits of musical abilities”, Trends in Cognitive Sciences
11, pp. 369–72.
(Pertusa 2013) Pertusa, A.; Gallego, A.-J.; Bernabeu, M. (2013). "MirBot: A multimodal interactive image retrieval
system", LNCS, vol. 7887, pp. 197-204.
(Piccolli 2013) Piccoli, H. C. B.; Silla, C. N.; Ponce de León, P. J.; Pertusa, A. (2013), “An evaluation of symbolic feature
sets and their combination for music genre classification”. Proc. of the IEEE Int. Conf. on Systems, Man, and
Cybernetics (in press).
(Marinescu 2012) Marinescu, M.; Ramirez, R. (2012), “Learning singer-specific performance rules”, International Journal
of Modeling and Optimization, 2(2): 97-102.
(Ramirez 2010) Ramirez, R., Maestre, E., Serra, X. (2010), “Automatic performer identification in commercial
monophonic Jazz performances”, Pattern Recognition Letters, 31: 1514-1523.
(Ramirez 2012) Ramirez, R., Maestre, E., Serra, X. (2012), “A Rule-Based Evolutionary Approach to Music Performance
Modeling”, IEEE Transactions on Evolutionary Computation, 16(1): 96-107.
(Rico-Juan 2007) Rico-Juan, J. R., Iñesta, J.M. (2007), “Normalisation of Confidence Voting Methods Applied to a
Fast Handwritten OCR Classification”, In Computer Recognition Systems 2, number 45 in Advances in Soft Computing,
pages 405-412.
(Socorro 2011) Socorro, R.; Micó, L.; Oncina, J. (2011), "A fast pivot-based indexing algorithm for metric spaces",
Pattern Recognition Letters, vol. 32, pp. 1511-1516.
(Rico-Juan 2012a) Rico-Juan, J. R., Iñesta, J.M. (2012), “Confidence voting method ensemble applied to off-line
signature verification”, Pattern Analysis and Applications, 15(2):113-120.
(Rico-Juan 2012b) Rico-Juan, J. R.; Iñesta, J.M. (2012), “New rank methods for reducing the size of the training set
using the nearest neighbor rule”, Pattern Recognition Letters, 33(5):654-660.
(Ritchie 2013) Ritchie, L., & Williamon, A. (2013). Measuring Musical Self-Regulation: Linking Processes, Skills, and
Beliefs. Journal of Education and Training Studies, 1(1), p106-117.
(Robson 2004) Robson, B. E. (2004), “Competition in sport, music, and dance”, Medical Problems of Performing Artists,
19(4), 160-166.
(Sala 2008) Sala, D. (2008), “ePortfolios for Developing Research Skills in ICT Engineering Disciplines”, ePortfolio Int.
Conferences, 22-24 October 2008.
(Schoonderwaldt 2005) Schoonderwaldt, E., Askenfelt, A., & Hansen, K. F. (2005), “Design and implementation of
automatic evaluation of recorder performance in IMUTUS”. In Proc. of the Int. Computer Music Conf.
(Thompson 2003) Thompson, S., & Williamon, A. (2003). Evaluating evaluation: Musical performance assessment as a
research tool. Music Perception: An Interdisciplinary Journal, 21(1), 21-41.
(Thompson 2007) Thompson, L. K. (2007). Considering beliefs in learning to teach music. Music Educators Journal,
93(3), 30-35.
(van der Linden 2011) van der Linden, J., Schoonderwaldt, E., Bird, J., & Johnson, R. (2011), “MusicJacket—Combining
Motion Capture and Vibrotactile Feedback to Teach Violin Bowing”, IEEE Transactions on Instrumentation and
Measurement, 60(1), 104–113.
(Welch 1985) Welch, G. F. (1985), “Variability of practice and knowledge of results as factors in learning to sing in tune”,
Bulletin of the Council for Research in Music Education, vol. 85, pp. 238-247.
(Zaza 1998) Zaza, C. (1998), “Playing-related musculoskeletal disorders in musicians: a systematic review of incidence
and prevalence”, Canadian medical association journal, 158(8), 1019-1025.
2.3. Purpose of the project, opportunity to carry it out in the context of the chosen challenge, and its fit with the
Spanish Strategy for Science, Technology and Innovation and, where applicable, with Horizon 2020 or any other
national or international R&D&I strategy.
Maximum 8000 characters
Alignment with challenge ‘Societal Challenge 6: Inclusive, innovative and reflective European societies’
The text of the challenge clearly states the urgent need to expand education beyond how it is currently understood. In
the proposed project we aim to develop novel paradigms of technology-enhanced learning of instrumental performance
skills. By implementing technology-enhanced learning methods in music education we aim to make the educational
process more efficient by increasing learners' interest and motivation, and by raising their level of involvement,
their understanding, their means of access, and their level and span of attention.
Interaction with the user is crucial for a successful pedagogical implementation. In this project we will provide
several alternative interfaces, depending on the purpose of the application and the age and level of the student. The
challenge here is to address complex performance skills in an intuitive way. This will be achieved through novel
displays for visualizing performance features, which will allow students to explore their own performance and extend
their possibilities.
An additional criterion for successful pedagogical integration is that the applications offer teachers and students
alternative ways of thinking about their skills and performance. It has been shown that the use of technology can
extend the vocabulary of teachers and students when talking about their performance (van der Linden et al. 2011). The
approach taken by our project has the potential to achieve this, judging by the interest shown by the playing and
teaching community in the research and technologies that underlie the project.
Additionally, the TIMuL project is expected to influence the way society thinks, making it more inclusive and
reflective. Specifically, the proposed project is expected to have an impact in the following areas:
• Unlocking the potential of the individual through stronger and smarter adaptation and personalization of
educational technologies. The technologies developed in TIMuL will be adapted to the specific type of musical
instrument, and the pedagogic methods will be designed around the different skills of the students.
A feedback system in which students can observe representations of their own performances in real time, and can
compare them with and follow pre-recorded performances by masters, gives students considerable autonomy and capacity
for self-monitoring; they are expected to interact, adapt and explore.
• Significantly higher level of effective, personalised, ICT-based tutoring, leading to its wide-spread penetration in music
schools and at home. One of the project outcomes is a personalised interactive and multimodal ICT-based music
tutoring system capable of detecting and adapting the learning experience depending on the user group level. The
project is partially dedicated to the development of low-cost tools that can be used at home without the need of a human
tutor and using any device and low-cost sensor solutions.
• Faster, more timely and more cost-effective up/re-skilling through learning technologies and their sustained adoption by
SMEs. The project output will increase the distribution of culture by making music education more versatile and efficient. The
project will make both music tuition and music materials available outside academic institutions in a cost-effective
manner.
• Emergence of new learning models, including models invoking creativity. The use of technology for measuring musical
performances and the audiovisual feedback of empirical measures constitute an absolute novelty in music pedagogy, and
therefore new learning models will emerge based on this new way of interaction. The project is expected to provide
new tools for innovative music learning paradigms.
• Affordability and widespread availability of tools and services for releasing the economic potential of cultural heritage in
digital form and for adding value to cultural content in educational, scientific and leisure contexts. Also, part of the
developed tools are meant to be used in a home scenario, widening the target group of users. Furthermore, the project is
expected to produce low-cost platforms available to a very large number of users.
• Wider range of users of cultural resources in diverse real and virtual contexts and considerably enhanced ways to
experience culture in more personalised and adaptive interactive settings.
Last but not least, TIMuL is completely aligned with the strategy behind the Creative Europe initiative. This new
programme includes a Culture sub-programme, supporting performing and visual arts, heritage and other areas. The
European cultural and creative sectors represent up to 4.5% of EU GDP and employ more than 8 million people.
Creative Europe will help the creative and cultural industry (CCI) to contribute even more to the European economy by
seizing the opportunities created by globalisation and the digital shift. The TIMuL project will help the CCI sector to seize the
opportunities of the ‘digital age’ by (i) providing them with highly innovative educational tools that can also be adapted for
gaming purposes, (ii) allowing a broader access to cultural contents and (iii) contributing to the preservation of cultural
heritage.
Contribution to the strategic goals of Spanish strategy for science, technology and innovation 2013-2020:
- Captivating talent: TIMuL will involve a team of 31 researchers, including both senior and young researchers and also
technical staff. Furthermore, additional young researchers will be able to join the project team, and master's students
may write their theses within the framework of our project, thus making research careers more appealing to young people.
- Putting Spain at the forefront of knowledge and promoting excellence in scientific research: the applicant research
groups are leaders in the sound and music computing research field and recognised at international level, promoting
initiatives like the SMC network, with a well-consolidated network of collaborators and active in organising outreach
activities in order to maximise the impact associated to the research and academic activities.
- Promoting a highly competitive business environment: MTG-UPF is very active in technology transfer activities. It has
created 3 spin-off companies and usually carries out joint research projects with large companies like Yamaha Corp.,
Microsoft Research, Telefonica, etc. Both research groups are actively participating in the identification of companies’
priorities and needs in order to create demand for new scientific and technical developments that are able to bring
competitive advantages to the Spanish and European market and increase their competitiveness.
- Set the conditions for the creation and dissemination of science and technology. There are several dissemination
activities included in the project work plan which are addressed to different targets including the academic audience,
main players and practitioners within the music pedagogy field and also the general society so they can also participate
and contribute with their expertise. The educational toolkit expected from TIMuL will be easy to integrate into everyday life and will
contribute to creating a favorable environment to sustain a society capable of valuing and being taught at all educational
levels and at all ages.
3. HYPOTHESIS AND OBJECTIVES OF THE PROJECT
Describe the reasons why this research is considered relevant, the starting hypothesis
and the general objectives pursued in the coordinated project. List briefly, clearly, precisely
and realistically (in line with the planned duration of the project) the specific objectives
of each subproject, expressly indicating the principal investigator (PI) responsible in those subprojects
with two principal investigators.
Maximum 12 000 characters
Starting Hypothesis
Music is a very important part of education and has been shown to bring benefits at all levels (e.g. Patel 2007).
However, we are not currently able to provide music education to the majority of people. In recent years, information
technology has improved the efficiency of education in several areas. However, music education is still very much based
on the traditional model of teacher-pupil instruction, which is inherently non-scalable, time-consuming and labour
intensive. This is mainly due to the complexity of creating a unified educational model that integrates different
methodologies and takes into account the relevant cultural, physiological, emotional, cognitive and
perceptual aspects. This challenge can be addressed with the support of suitable pedagogical models and technology solutions.
Technology can be exploited as a means to enhance the learning models and the efficiency of the educational process by
increasing the interest, motivation and social interaction of the student.
General Objectives
The main scientific and technological objectives of the project are the following:
O1- This objective aims to acquire and evaluate multimodal data for the pattern recognition and machine learning
techniques of O2. These data include audio, gesture, physiological, symbolic, image, and video information. In order to
obtain these features, it is necessary to develop analysis techniques to extract them from different sources (for example,
audio analysis needs to be performed to get high level descriptors such as pitch).
O2- Due to the multimodal and adaptive nature of the proposed tasks, it is necessary to apply and develop pattern
recognition and machine learning techniques that could take advantage of the interaction with users or their
environment. The gRFIA group has extensive experience in this topic, which has resulted in both basic science publications
and multimodal interactive systems, mainly within the scope of the Consolider Ingenio 2010 project "Multimodal Interaction in
Pattern Recognition and Computer Vision". Also, the MTG group has recently developed multimodal systems combining
different information sources. This objective aims to investigate, compare, and design techniques for integrating the
descriptions coming from the different information sources involved in multimodal problems, both at feature level (fusion
of descriptors), and decision level (combination of classifiers). Also, interactive and adaptive methodologies will be
studied and developed in order to make the most of the user feedback. The resulting techniques will be used for
prototype development in O3.
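As an illustration only, the two integration levels named in O2 can be sketched as follows. This is a minimal sketch, not the techniques the project will develop: the function names `fuse_features` and `fuse_decisions`, the weighted-average combination rule and the example values are all assumptions made for this example.

```python
from typing import Dict, List, Optional, Sequence


def fuse_features(*descriptors: Sequence[float]) -> List[float]:
    """Feature-level fusion: concatenate the descriptor vectors of each modality."""
    fused: List[float] = []
    for vector in descriptors:
        fused.extend(vector)
    return fused


def fuse_decisions(
    posteriors: Sequence[Dict[str, float]],
    weights: Optional[Sequence[float]] = None,
) -> Dict[str, float]:
    """Decision-level fusion: weighted average of per-classifier class posteriors.

    The weighted average is one common combination rule, chosen here as an assumption.
    """
    if weights is None:
        weights = [1.0] * len(posteriors)
    total = sum(weights)
    combined: Dict[str, float] = {}
    for probs, weight in zip(posteriors, weights):
        for label, p in probs.items():
            combined[label] = combined.get(label, 0.0) + weight * p / total
    return combined


# Hypothetical descriptors from two modalities (audio and gesture).
audio_descriptor = [0.1, 0.2]
gesture_descriptor = [3.0]
fused_vector = fuse_features(audio_descriptor, gesture_descriptor)

# Hypothetical class posteriors produced by one classifier per modality.
audio_posterior = {"C major": 0.6, "G major": 0.4}
gesture_posterior = {"C major": 0.2, "G major": 0.8}
combined = fuse_decisions([audio_posterior, gesture_posterior])
best_label = max(combined, key=combined.get)
```

In the first case a single classifier sees all modalities at once; in the second, each modality keeps its own classifier and only the outputs are merged, which makes it easier to weight or drop a modality.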
O3- Implement novel multi-modal interactive music learning prototypes compatible with traditional teaching methods for
different learning tasks, based on audio, image and motion acquisition, and augmented feedback systems (real-time
visualizations and sound). Five interactive multimodal prototypes will be developed for pedagogical music learning and
training. Music transcription will help musicians to get scores from audio. Interactive music analysis is useful for obtaining chordal
and tonal information from scores for academic analysis. A performance database will be developed to help musicians
to learn how to play musical instruments (like in the Fig. 1 example). The expressive performance prototype can
generate human-like performances. Optical music recognition supports musicians who write music notation by hand, helping
them to write the notes correctly.
O4- Pedagogical implementation and validation of the results using novel techniques for human assessment,
incorporation and evaluation of the benefits of collaborative learning in a student’s network environment, and building a
public database of musical performances. The project will develop a student network platform in which performances can
be uploaded, downloaded, examined and commented, and the benefits of collaborative learning in this setting will be
assessed. This objective will be achieved by a close collaboration between the involved institutions. In the initial phase of
the project systematic discussions will be organized, which will lead to a clear formulation of working concepts and use
scenarios. In the second phase of the project early prototypes will be tested by teachers and students, providing
feedback for further improvements. Finally, the systems will be formally evaluated in field studies, focusing on usability
aspects and effect on performance.
O5- Monitoring, control and evaluation of the different stages of the project. Adaptation of the working plan to possible
eventualities. Organization and co-ordination of publication, diffusion, and exploitation activities. This objective will be
jointly executed by the gRFIA-UA and MTG-UPF centres.
Concrete measurable outcomes of the project include:
- Low-cost data acquisition prototypes based on sound and video analysis.
- Library of performance features computation algorithms, for sound and control.
- Online performance database prototype capable of uploading, storing, sharing and retrieving students' and teachers'
multimodal performance information.
- Software prototypes for the different measuring systems, able to compute features and give informative feedback in
real time.
- Definition of metrics for automatic evaluation and progress assessment of students' performance skills.
- Quantitative and pedagogical evaluation of novel learning paradigms.
- Project web portal.
Specific objectives of subproject 1 (gRFIA-UA)
The gRFIA group has experience with basic science pattern recognition and machine learning techniques that have also
been applied to music problems, mainly using symbolic data (music scores). Their objectives in the proposed project
include:
- To acquire labelled symbolic data annotated with their genre, melody track, tonal and functional analysis, and chord
sequences
- To develop music analysis algorithms to process these data
- To develop novel efficient machine learning and pattern recognition techniques, focused on multimodal interactive
tasks
- To develop an interactive multimodal music transcription prototype
- To develop interactive interfaces for symbolic music analysis
- To develop prototypes for optical music recognition
Specific objectives of subproject 2 (MTG-UPF)
The MTG group is an international reference for music technology and has extensive experience with music-related tasks,
mainly with audio and gesture data. Their objectives in the proposed project include:
- To acquire audio music data and develop analysis techniques for obtaining high-level descriptors useful for the aims of
the project
- To acquire gesture data and develop algorithms for their analysis
- To build a multimodal performance database framework for browsing, recording, annotating, visualizing, and sharing
music performances
- To develop an expressive music performance system prototype
- To design appropriate performance exercises and evaluation procedures from a pedagogical point of view to evaluate
the prototypes developed in the project
- To define and implement a portfolio framework for independent learning of music using interactive technologies.
4. JUSTIFICATION OF THE COORDINATION
Indicate:
- The need for the proposed coordination
- The added value expected from the coordination
- The interaction between the different objectives, tasks and subprojects
- The coordination mechanisms planned for the effective execution of the project
- The need for all the subprojects that make up the coordinated project.
Maximum 8000 characters
The TIMuL research groups (i.e. MTG-UPF and gRFIA-UA) are leaders in Spain and in Europe in the strategic
fields of interest for the project. The participating institutions complement each other in
covering the research challenges that need to be faced in order to develop the TIMuL models and prototypes
and, together, provide all the skills and expertise necessary to successfully address such research
challenges. The project is composed of 2 scientific partners and 2 collaborating music education institutions:
Escola Superior de Música de Catalunya (ESMUC), and Conservatorio Superior de Música “Manuel Massotti
Littel” (CM), all of them supported by the BMAT company, which has already expressed its interest in the
expected outcomes to be obtained at the end of the TIMuL project. This combination covers the whole value
chain: from research and development of new and challenging ideas to the end users’ involvement for gathering
requirements and validating solutions and also including the exploitation stage.
We will invest in research for obtaining real advancement and working prototypes to be deployed in public and
private educational institutions (e.g. in conservatories). The project participants cover various aspects required for
the whole project ranging from pedagogical aspects of music education, performance modeling, system
development, assessment models and management, dissemination and exploitation capabilities. They represent
excellence in their respective roles, and the project partners have already managed large multi-partner national
and international projects. Figure 4 shows the main areas of research necessary to achieve the objectives of the
TIMuL project and the expertise of each of the participating institutions.
The general interaction between objectives can be seen in Fig. 5. First, multimodal information (audio, symbolic,
image, and video) will be collected and analyzed in O1. In parallel, pattern recognition and machine learning
techniques will be developed and evaluated in O2. Tasks from objectives O1 and O2 can initially be done
independently, but they will eventually meet, adapting basic science techniques developed in O2 to the features
obtained in O1. In particular, data analysis tasks T1.2-T1.5 (see Sec. 5.2) only depend on corpora generation and
acquisition (T1.1), whereas learning tasks T2.1-T2.4 can be executed in parallel from the beginning of the
project.
Then, five prototypes will be developed in O3 using the data from O1 and the learning techniques from O2 for:
- Interactive music transcription. In order to convert audio into musical scores, the results from audio analysis
(T1.2), and the learning techniques from O2 will be used.
- Interactive music analysis: This task consists of providing a symbolic music analysis prototype for music
learning, and it depends on symbolic music analysis (T1.3) and the learning techniques from O2.
- Public performances database: The development of a public database of performance recordings of multimodal
time-aligned data depends on all the O1 tasks.
- Expressive performance: This prototype will be able to generate human-like music performances based on
multimodal (i.e. audio and gesture) information, and it depends on audio analysis (T1.2), video analysis (T1.5), and
the learning techniques from O2.
- Interactive optical music recognition: This prototype for recognizing written music notation using images or the
input from a tablet depends on the image analysis task (T1.4), and the learning techniques from O2.
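The dependencies listed above can be written down as a small graph and checked for a valid execution order. The sketch below is only illustrative: it assumes that the five prototypes correspond to tasks T3.1 to T3.5 in the order of the task list in Sec. 5.2, and it simplifies "the learning techniques from O2" to mean all of T2.1 to T2.4.

```python
from graphlib import TopologicalSorter

# Analysis tasks of O1 (all depend only on T1.1) and learning tasks of O2.
O1_ANALYSIS = ["T1.2", "T1.3", "T1.4", "T1.5"]
O2_LEARNING = ["T2.1", "T2.2", "T2.3", "T2.4"]

# Each task maps to the set of tasks that must be available before it.
dependencies = {
    **{task: {"T1.1"} for task in O1_ANALYSIS},
    **{task: set() for task in O2_LEARNING},      # can start from the beginning
    "T3.1": {"T1.2", *O2_LEARNING},               # interactive music transcription
    "T3.2": {"T1.3", *O2_LEARNING},               # interactive music analysis
    "T3.3": {"T1.1", *O1_ANALYSIS},               # public performances database
    "T3.4": {"T1.2", "T1.5", *O2_LEARNING},       # expressive performance
    "T3.5": {"T1.4", *O2_LEARNING},               # interactive optical music recognition
}


def execution_order(graph):
    """Return one ordering of the tasks that respects every dependency."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(dependencies)
```

Any ordering returned by `execution_order` places T1.1 before the O1 analysis tasks and places every prototype after the tasks it relies on; `TopologicalSorter` would raise a `CycleError` if the plan contained a circular dependency.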
Eventually, pedagogical and scientific evaluation (O4) will be performed to assess the quality of the prototypes.
From the beginning of the project, a set of models for music training will be proposed (T4.1). Simultaneously,
learning and guidance process and measures of assessment (T4.2) will be defined to provide an independent
framework for students to practice and study their performance skills. The learning performance of students will
be measured quantitatively and qualitatively (T4.3) using the information from T4.1 and T4.2, and finally all the
prototypes developed in O3 will be evaluated using pedagogical and scientific measures (T4.4). The evaluation
in the later stages may influence the prototype development, requiring slight readaptation if necessary.
To carry out these goals, the MTG-UPF group has proven and internationally recognised experience in audio,
music and sound analysis, and synthesis techniques, as well as in music gesture analysis and expressive music
performance modeling. On the other hand, the gRFIA-UA group has a well-established experience in machine
learning and pattern recognition applications, mainly in the music field. The technologies from both groups are
complementary for approaching the challenges in the proposed research field in an effective way, as well as in
developing prototypes. We therefore consider the collaboration of both groups in this project to be necessary and
well justified. Furthermore, collaboration between the two groups has already proved to be fruitful and effective in
previous projects like PROSEMUS and DRIMS.
Tracking, control, and evaluation of all project stages are also part of the project, as well as the working plan
adaptation to eventualities taking place during the execution of the project. For that, a specific objective is
proposed (O5). The project tracking will be done from both a scientific viewpoint and the coordination of
publication and exploitation activities. Participating entities will be supported in a proactive way and formal
controls will be established in order to guarantee optimal interrelations. It is planned that each group will send a
report every six months to the project supervisor about the progress of their respective tasks, the problems
encountered and potential schedule modifications. Additionally, the two participant research groups will meet
periodically -also using virtual means- in order to update each other and jointly define future directions based on
the knowledge gained.
The coordinator of the project must produce a yearly progress report and will also be in charge of organizing
coordination meetings with all the project participants.
Also, a website for the project will be developed, with a private area for databases, documentation hub, progress
reports and internal demos, and a public area where the project will be publicized and where the contributions
and demonstrations resulting from the project work will be accessible to the general audience.
5. METHODOLOGY, WORK PLAN
AND EXPECTED RESULTS
5.1. Description of the materials, infrastructures and singular equipment available to the coordinated
project that make it possible to carry out the proposed methodology.
Maximum 8000 characters
We describe the main facilities available to both groups. Most of them can be shared when required by a specific
task.
The MTG-UPF group has:
• Centralized storage system: a NetApp FAS3140, with 100 TB of raw disk space, providing storage services to all
systems, servers (both physical and virtual) and users, serving the data over the CIFS and NFS protocols through a dual 10 Gb
connection.
• Virtualization services: a farm of 14 servers, all of them running ESXi. These servers can provide virtualization services
to the project. This virtualization farm provides 230 CPUs and 1.10 TB of RAM.
• Supercomputing cluster: provides computing power to the project for heavy, long-running jobs with high CPU and RAM demands.
This cluster is equipped with 11 AMD Opteron 4-way servers and a head node, featuring 728 CPUs and 2.8 TB of RAM.
• Backup: in addition to the local backups performed by the FAS3140 storage, a secondary disk array is placed outside
the main data centre. Every night the data is backed up to this remote filer, giving disaster recovery capabilities to the services
hosted here.
• Recording studio equipment.
• Gesture and motion capture equipment (Polhemus).
• Electroencephalograph signal equipment.
• Audio database that has been used in other projects (both European and Spanish funded). Some of this data will be
organized to be usable in this project.
In addition, the project will make use of the repoVizz framework, developed by the Music Technology Group in the
context of the EU FET project SIEMPRE, as the platform for storing and browsing/visualizing the public database of
recordings. repoVizz (http://repovizz.upf.edu/) is an integrated solution for structured formatting and remote storage,
browsing, exchange, annotation, and visualization of synchronous multi-modal, time-aligned data. repoVizz is motivated
by counterposing the potential offered by collaborative, data-focused, exchange-driven research and the difficulties often
found for sharing or browsing large data sets of multi-modal, heterogeneous data. Further repoVizz functionalities will be
developed within the proposed project in order to adapt the framework to a more pedagogical context including:
- Integration within the proposed project descriptors and prototypes.
- High level visualization of analysis descriptors to show feedback.
- User configurable visualizations (e.g. layout).
- Collaborative User Annotation system.
- Web API to access and download the individual data streams.
- Tools for guiding, analysing and evaluating student performances and learning progress.
The gRFIA-UA group has:
• Musical databases acquired with funds from previous projects (RWC and RISM databases).
• A wide collection (more than 200,000 items downloaded from publicly available Internet sites) of symbolic music files
that still needs to be structured and tagged with metadata relevant to the categories we want to infer in our models.
Another source of metadata is the annotated lyrics compiled by the group of the Technical University of Vienna, with
which a recent integrated action has been carried out. There are also Internet sites open to researchers, like that of the
Center for Computer Assisted Research in the Humanities at Stanford University (www.ccarh.org), which provides a large
amount of symbolic music data of high value for this research.
• Melody track identification system developed in former works.
• Interactive interfaces for music analysis, in JavaFX2 and OpenMusic.
• MIDI keyboard, digitizing tablet.
• Access to an HPC cluster composed of 26 HP ProLiant SL390s G7 compute nodes with the following libraries
installed: MPI, MVAPICH, MVAPICH2 and Intel MPI, all supporting a low-latency InfiniBand interconnection network.
5.2. Methodology. Detail the proposed methodology in accordance with the objectives of section 3.
- The methodological feasibility of the tasks must be indicated and the planned milestones or deliverables outlined. If
necessary, a critical assessment of the possible difficulties of a specific objective and a contingency plan
to resolve them shall be included.
The staff involved in each task must be specified in the schedule.
- If you request funding for hiring staff, clearly justify the need for it and the tasks the staff will
carry out.
Maximum 32 000 characters
Objective 1: MUSIC ANALYSIS
● Task T1.1: Corpora generation and acquisition
● Task T1.2: Audio analysis
● Task T1.3: Symbolic music analysis
● Task T1.4: Image analysis
● Task T1.5: Video analysis
Objective 2: PATTERN RECOGNITION AND MACHINE LEARNING
● Task T2.1: Preprocessing multimodal information
● Task T2.2: Active learning for interactive systems
● Task T2.3: Efficient classification methods for dynamic environments
● Task T2.4: Multimodal classification techniques
Objective 3: PROTOTYPE DEVELOPMENT
● Task T3.1: Interactive music transcription prototype
● Task T3.2: Prototype for interactive music analysis
● Task T3.3: Prototype public performances database
● Task T3.4: Expressive performance system prototype
● Task T3.5: Prototype for interactive optical music recognition
Objective 4: PEDAGOGICAL AND SCIENTIFIC EVALUATION
● Task T4.1: Models for music training
● Task T4.2: Definition of learning and guidance process and measures of assessment
● Task T4.3: Models for assessment of performance skills
● Task T4.4: Prototype evaluation
Objective 5: COORDINATION AND DISSEMINATION
● Task T5.1: Evaluation and coordination meetings
● Task T5.2: Communication channels: website
● Task T5.3: Publications and conferences
O1: MUSIC ANALYSIS
Task T1.1: Corpora generation and acquisition
Execution centre: gRFIA-UA and MTG-UPF
Goal: To record, acquire and organize musical audio, gesture, physiological, and symbolic data of interest to the
project.
Description: This task is devoted to recording, acquiring, organizing, and tagging multimodal music-related data for the
development and testing of the algorithms developed in other tasks in O1 and O2. Some of these data (audio and
gesture) will be obtained using existing MTG-UPF equipment (e.g. MTG-UPF audio recording studio). This task will
provide a data repository for structured storage and user-friendly browsing of the obtained multimodal recordings. We
plan to store data streams as single-channel PCM files, annotations as text files and musical scores as MusicXML. Data
will be structured by customizable XML skeleton files enabling meaningful retrieval.
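To make the planned layout more concrete, the following sketch builds a small XML skeleton that references PCM streams, text annotations and a MusicXML score, and then retrieves files by type. It is a hypothetical illustration: the element and attribute names (`recording`, `stream`, `annotation`, `score`, `label`, `file`) and the file names are assumptions for this example and do not reproduce the actual repository schema.

```python
import xml.etree.ElementTree as ET


def build_skeleton(name, streams, annotations, score_file):
    """Build a skeleton describing one multimodal recording.

    streams and annotations map a human-readable label to a file name.
    All tag and attribute names here are hypothetical.
    """
    root = ET.Element("recording", {"name": name})
    for label, filename in streams.items():
        # One single-channel PCM file per data stream.
        ET.SubElement(root, "stream", {"label": label, "format": "pcm", "file": filename})
    for label, filename in annotations.items():
        ET.SubElement(root, "annotation", {"label": label, "format": "text", "file": filename})
    ET.SubElement(root, "score", {"format": "musicxml", "file": score_file})
    return root


def find_files(root, tag):
    """Retrieve the file names of every element of the given type."""
    return [node.get("file") for node in root.iter(tag)]


skeleton = build_skeleton(
    name="take01",
    streams={"audio": "audio.wav", "bow force": "bow_force.wav"},
    annotations={"note onsets": "onsets.txt"},
    score_file="score.xml",
)
xml_text = ET.tostring(skeleton, encoding="unicode")
stream_files = find_files(skeleton, "stream")
```

Keeping the skeleton separate from the data files means the same recordings can be regrouped or relabelled by editing only the XML.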
Also, since some of the algorithms to be developed are symbolic, the rest of the data needed will be symbolic as well: digital scores in
different formats and metadata will be the symbolic data initially considered. Besides publicly available data, an
important part of the databases will be self-compiled because of the copyright issues inherent to the data used. A
wide symbolic music dataset (basically, digital scores in MIDI and MusicXML format) will be created, organized, indexed,
and tagged by experts to build corpora both for the rest of the O1 tasks and for some of those described in O2.
Tagging very large databases of music files in symbolic format is an open problem, and it is not possible to organize
them by hand without the use of interactive helping tools. In the prototype development stage (O3) it is planned to
develop these tools, integrating some of the techniques developed by our group in previous projects, for classifying
genres, computing similarity, extracting and processing metadata. We will also count on the advice and knowledge of
musicology experts who are already collaborating with our group, some of them while working on their PhDs. Involving this staff in
this stage of the project is key to the success of the next stages.
In order to acquire a multimodal corpus of music transcriptions, a national jazz transcription contest will be organized in
cooperation with some of the music education institutions that expressed interest in the outcomes of this project. As far
as we know, this is a novel approach to music data acquisition that will benefit both the music information retrieval
research community and the music education institutions participating in the project. The contest will make it possible to take
advantage of the transcription work that jazz students regularly do during their career development.
Deliverables: Multimodal transcription corpus of jazz solos. A symbolic dataset annotated with genre, melody track,
tonal and functional analysis, and chord sequences, in MIDI and MusicXML format. A repository of synchronized
multimodal recordings relevant to the project.
Task T1.2: Audio analysis
Execution centre: MTG-UPF
Goal: To develop a library of music audio description components appropriate to the problems tackled in the project.
Description: The developed library will provide the signal processing tools that are needed for analyzing musical audio
signals and extracting music descriptors such as instantaneous pitch, energy, note onsets, spectral descriptors, and
harmonic pitch class profiles (HPCP), as well as higher level descriptors such as melodic, harmonic, timbral, and
structural descriptors. The developed library will be tested with the corpora generated by task T1.1.
Deliverables: A library of music audio content descriptors
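As a rough illustration of the kind of low-level descriptors mentioned in T1.2, the sketch below computes the RMS energy of a frame and a simplified pitch class profile from spectral peaks. It is not the HPCP algorithm itself, which additionally uses weighting windows and harmonic contributions; the function names, the 440 Hz reference and the example peaks are assumptions for this example.

```python
import math


def rms_energy(frame):
    """Root-mean-square energy of one frame of audio samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def frequency_to_midi(freq_hz, a4_hz=440.0):
    """Map a frequency in Hz to a (fractional) MIDI note number, with A4 = 69."""
    return 69.0 + 12.0 * math.log2(freq_hz / a4_hz)


def pitch_class_profile(peaks, a4_hz=440.0):
    """Simplified pitch class profile from (frequency_hz, magnitude) spectral peaks.

    Each peak adds its squared magnitude to the nearest of the 12 pitch classes
    (0 = C, 1 = C#, ..., 11 = B). The result is normalised to a maximum of 1.
    """
    profile = [0.0] * 12
    for freq_hz, magnitude in peaks:
        pitch_class = round(frequency_to_midi(freq_hz, a4_hz)) % 12
        profile[pitch_class] += magnitude ** 2
    peak = max(profile)
    return [value / peak for value in profile] if peak > 0 else profile


# Hypothetical spectral peaks close to C4, E4 and G4 (a C major triad).
example_peaks = [(261.63, 1.0), (329.63, 0.5), (392.0, 0.5)]
profile = pitch_class_profile(example_peaks)
energy = rms_energy([1.0, -1.0, 1.0, -1.0])
```

For the example peaks, the energy ends up in pitch classes 0, 4 and 7 (C, E and G), which is the sort of evidence a chord or key estimator would build on.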
Task T1.3: Symbolic music analysis
Execution centre: gRFIA-UA
Goal: Development of a framework library for the automatic analysis of musical works
Description: The harmonic analysis of a work can be approached from two different perspectives: chordal analysis and
academic tonal analysis. Our group currently has an implementation of each approach: tonal (Illescas, 2011), and chordal
analysis (Bresson 2012). The chordal analysis prototype has been developed in collaboration with the Musical
Representations Team at Ircam, and its core has been integrated in the latest release of their software OpenMusic
(http://repmus.ircam.fr/openmusic/home), version 6.6.
These systems should be integrated in a common library to be able to answer questions that depend on music analysis
such as the current tonality or the role of a given note. At their current stage, the analysis algorithms that will be
integrated in this library run fully automatically. To allow human interaction, they must be adapted,
including the ability to benefit from human feedback. These techniques will be developed and tested using the
corpora developed in task T1.1, and models from tasks T2.2 and T2.3.
Deliverables: A library that performs harmonic, tonal, and functional music analysis. Report: Chordal and melodic
tonal analysis.
Task T1.4: Image analysis
Execution centre: gRFIA-UA
Goal: Evaluation of image analysis methods and descriptors
Description: An interactive prototype for Optical Music Recognition (OMR) will be developed in O3 using the image
analysis techniques developed in this task. Several preprocessing and feature extraction methods will be tested on the
resulting prototypes in order to find the most adequate for each modality. The aim is not to develop novel image descriptors,
but to evaluate existing ones for music-related and multimodal tasks. The gRFIA group has experience in this topic, as
some of its members have recently developed a multimodal interactive image retrieval prototype (Pertusa 2013).
Milestones: Report: First versions of image descriptors.
Deliverables: Report: Review of preprocessing and feature extraction techniques for OMR.
Task T1.5: Video analysis
Execution centre: MTG-UPF
Goal: This task aims to develop methods for recognising performers' gestures.
Description:
For some of the interactive multimodal prototypes described in objective 3 (e.g. the expressive performance system
prototype), we will develop algorithms to recognise full-body gestures related to the performance of musicians. From
visual data acquired by a camera, we need to recognise the typical expressive movements of performers or ensemble
conductors.
This task will involve:
- recording visual data in a live performance environment
- analysing the key movements involved in the activity and manually annotating some recordings as ground truth
- developing algorithms to extract these key movements
- testing the algorithms against several recorded live sessions
Deliverables: Algorithms for the recognition of performer gestures
Success criteria: Accuracy of the methods, tested and quantified on ground-truth data consisting of manually
annotated recordings.
O2: PATTERN RECOGNITION AND MACHINE LEARNING
Task 2.1: Preprocessing multimodal information
Execution centre: gRFIA-UA and MTG-UPF
Goal: Study and development of preprocessing techniques for prototypes and feature reduction.
Description: In music, different features can be integrated, such as audio, textual, and physiological features for music
recommendation, or audio and symbolic features for chord recognition. Obtaining and using datasets to train
interactive multimodal applications is a complex task, because the input data are highly variable, only a small
subset of the data is labeled, and the data can sometimes be noisy. Classical prototype selection and reduction techniques
(García 2012), analysis and balance correction of datasets (Garcia 2008), and the study of data complexity (Bernado
2007) are widely known techniques that can be applied to the interactive multimodal paradigm.
In addition, dimensionality reduction and feature selection are particularly relevant for multimodal datasets, as the
number of features is typically large. In this task, dimensionality reduction techniques using medians of datasets or
feature transformations will be studied and adapted to dynamic systems. Also, early fusion (combination of features)
methods will be considered as an alternative to the combination of classifiers for multimodal classification, as in (Piccoli
2013).
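Early fusion, as mentioned above, can be sketched as follows. This is only an illustrative example under simple assumptions: every modality provides one numeric feature vector per sample, each feature is standardised so that no modality dominates by scale, and the function names are hypothetical. It is not the method of (Piccoli 2013).

```python
import statistics


def zscore_columns(rows):
    """Standardise each feature column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # A constant column has zero deviation; use 1.0 to avoid dividing by zero.
    stdevs = [statistics.pstdev(col) or 1.0 for col in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, stdevs)] for row in rows]


def early_fusion(*modalities):
    """Concatenate the standardised feature vectors of all modalities, sample by sample."""
    normalised = [zscore_columns(rows) for rows in modalities]
    return [sum(parts, []) for parts in zip(*normalised)]
```

The fused vectors can then be passed to a single classifier, in contrast with late fusion, where one classifier per modality is trained and their outputs are combined.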
Deliverables: Techniques and algorithms for integrating the multimedia data
Task 2.2: Active learning for interactive systems
Execution centre: gRFIA-UA
Goal: Application, adaptation, and development of active learning techniques for multimodal tasks
Description: Active learning methods take advantage of the user interactions such as error corrections, access
repetitions, etc. to minimize the number of operations with respect to direct interaction. Interactive systems have only
recently been explored (Ko 2008), but they are currently a very active topic. Projects such as "A unified framework for
multimodal content SEARCH (I-SEARCH)", the network of excellence "Multimodal Computing and Interaction", or the
Consolider Ingenio project "Multimodal Interaction on Pattern Recognition and Computer Vision", in which the gRFIA
group has participated, include active learning as one of their main topics.
In this task, active learning techniques will be applied, adapted, and developed for multimodal systems, based on the
users' interactions or their environment. For example, interaction can be exploited for polyphonic music transcription,
making use of user feedback on the initial onset computation, beats, meter estimation, and notes in a left-to-right
validation approach. This means that any user interaction validates what is at the left-hand side of its position, and the
interactions are used to re-compute the rest of the output hypothesis (Oncina 2009). Moreover, the system can be
retrained in order to take into account the new validated data. A similar scheme can be used for tonal analysis, music
synthesis, and optical music recognition, as can be seen in the description of the O3 prototypes.
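The left-to-right validation scheme described above can be simulated with a short sketch. It assumes that the output hypothesis is a sequence of symbols (e.g. MIDI pitches), that the user always corrects the first wrong symbol, and that a hypothetical callable `recompute_suffix` stands in for whatever recognition model re-estimates the unvalidated part. None of these names come from the project.

```python
def interactive_transcription(initial, reference, recompute_suffix):
    """Simulate left-to-right validation against a reference transcription.

    Returns the final hypothesis and the number of user corrections.
    """
    hypothesis = list(initial)
    corrections = 0
    position = 0
    while position < len(reference):
        if position < len(hypothesis) and hypothesis[position] == reference[position]:
            position += 1  # correct symbol: implicitly validated
            continue
        # The user corrects the first wrong symbol; everything to its left is validated.
        prefix = hypothesis[:position] + [reference[position]]
        corrections += 1
        # The system re-computes only the part to the right of the correction.
        hypothesis = prefix + list(recompute_suffix(prefix))
        position += 1
    return hypothesis[:len(reference)], corrections


# Toy data: the initial hypothesis has an octave error from the third note onwards.
initial = [60, 62, 76, 77, 79]
reference = [60, 62, 64, 65, 67]


def shift_suffix(prefix):
    """Toy model: propagate the offset revealed by the last correction."""
    offset = prefix[-1] - initial[len(prefix) - 1]
    return [note + offset for note in initial[len(prefix):]]
```

With this toy data, `interactive_transcription(initial, reference, shift_suffix)` needs one correction, whereas keeping the original suffix unchanged (`lambda prefix: initial[len(prefix):]`) needs three. The number of corrections is the quantity an interactive system tries to minimise.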
The quality of the prototypes (samples in the training set) has been a widely discussed problem. Recent studies of off-line systems such as (García 2012) summarise selection methods, while (Rico-Juan 2012b) proposes some effective
ranking methods. One of the goals of this task is also to adapt the latter methods to online learning, because
they offer more information about each prototype, which can be used to adapt the system online and minimize human corrections. A
second step could be to study prototype states as they arrive, in order to identify which classes are more stable and
thus obtain better predictions.
Milestones: Report: First versions of active learning methods and context review.
Deliverables: Report: Evaluation of active learning techniques in interactive systems. Basic science and applied
techniques to exploit interaction in active learning systems.
Task 2.3: Efficient classification methods for dynamic environments
Execution centre: gRFIA-UA
Goal: Adaptation and development of efficient algorithms for similarity search and reduction techniques in dynamic
environments.
Description: Real-time applications such as the music training scheme in Fig. 1 require efficient classification
techniques, many of which are based on similarity search. The gRFIA group has extensive experience in this topic,
especially in the application of efficient similarity search techniques in metric spaces (Micó 2012) (Socorro 2011). These
techniques are very appealing when we have to search in large databases and/or the knowledge of the internal structure
of the objects (points) is irrelevant. To apply these techniques, only the dissimilarity measure is used. To speed up the
search, these methods exploit some metric properties of the search space. Several types of search are allowed
depending on the problem to solve (k-NN search, range search, reverse k-NN search, similarity join, etc.). Reduction
techniques can take advantage of these methods to improve the efficiency.
In this task, exact and approximate similarity search methods with efficient index updates will be developed and
evaluated, and editing and condensing techniques will be adapted and developed for dynamic environments.
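To make the idea of exploiting metric properties concrete, the sketch below shows a generic pivot-based nearest-neighbour search that uses the triangle inequality to skip distance computations. It is a textbook-style illustration, not the group's algorithms: the class name, the choice of pivots, and the `add` method used as an example of a cheap index update are all assumptions.

```python
import math


class PivotIndex:
    """Exact nearest-neighbour search that prunes with the triangle inequality."""

    def __init__(self, points, pivots, dist):
        self.points = list(points)
        self.pivots = list(pivots)
        self.dist = dist
        # Precompute the distance from every stored point to every pivot.
        self.table = [[dist(p, v) for v in self.pivots] for p in self.points]

    def add(self, point):
        """Index update for a dynamic setting: one new row of pivot distances."""
        self.points.append(point)
        self.table.append([self.dist(point, v) for v in self.pivots])

    def nearest(self, query):
        """Return (best_point, best_distance, number_of_distance_computations)."""
        query_to_pivots = [self.dist(query, v) for v in self.pivots]
        computed = len(self.pivots)
        best, best_d = None, math.inf
        for point, row in zip(self.points, self.table):
            # For every pivot v: |d(q, v) - d(p, v)| <= d(q, p).
            lower_bound = max(
                (abs(qv - pv) for qv, pv in zip(query_to_pivots, row)), default=0.0
            )
            if lower_bound >= best_d:
                continue  # cannot improve on the current best: skip the real distance
            d = self.dist(query, point)
            computed += 1
            if d < best_d:
                best, best_d = point, d
        return best, best_d, computed
```

Only the dissimilarity function `dist` is needed, so the same index works for vectors, strings, or trees, as long as `dist` satisfies the metric axioms. For example, `PivotIndex(points, pivots, math.dist)` indexes points in Euclidean space.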
Milestones: Development of efficient classification techniques for dynamic environments.
Deliverables: Report: Dynamic similarity search and reduction techniques and evaluation.
Task 2.4: Multimodal classification techniques
Execution centre: gRFIA-UA
Goal: Development of new classifier combination methods. Adapt and apply classifier combination methods to
multimodal classification tasks in the music education field.
Description: Traditional classification techniques extract features and obtain a final decision with
a single classifier. An evolution of this scheme combines the individual outcomes of several weak classifiers
to obtain more robust decisions (Kuncheva 2004). These ensembles summarise several viewpoints or modes
in multimodal problems.
In particular, combination based on per-class confidence measures (Breukelen 1998) makes it possible to build a more
precise final decision. (Arlandis 2002) proposed an interesting method to estimate a posteriori probabilities using the k-NN rule,
and an enhanced formula was proposed by (Rico-Juan 2007). Good results have been obtained with these approaches applied
to signature verification (Rico-Juan 2012a) and OCR (Rico-Juan 2007) problems.
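As a rough illustration of combining per-class confidences, the sketch below estimates class posteriors with the plain k-NN vote fraction and merges the estimates of several modalities with a weighted sum rule. This is an assumption-laden simplification: it does not reproduce the formulas of (Arlandis 2002) or (Rico-Juan 2007), and all function names are hypothetical.

```python
from collections import Counter


def knn_posteriors(query, training, k, dist, classes):
    """Estimate P(class | query) as the fraction of the k nearest neighbours in each class."""
    neighbours = sorted(training, key=lambda item: dist(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbours)
    return {c: votes[c] / len(neighbours) for c in classes}


def combine_sum_rule(posteriors_per_modality, weights=None):
    """Weighted average of the per-modality class posteriors (sum rule)."""
    weights = weights or [1.0] * len(posteriors_per_modality)
    total = sum(weights)
    classes = posteriors_per_modality[0].keys()
    return {
        c: sum(w * p[c] for w, p in zip(weights, posteriors_per_modality)) / total
        for c in classes
    }


def decide(posteriors):
    """Pick the class with the highest combined confidence."""
    return max(posteriors, key=posteriors.get)
```

Each modality (for instance audio and symbolic features) gets its own `knn_posteriors` call with its own training set and distance; `combine_sum_rule` then merges the resulting dictionaries, optionally weighting the more reliable modality more heavily.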
One of the objectives of this task is to apply and adapt our experience with these classifier combination methods for
multimodal problems to the music education field. Previous work in the music field includes (Lidy 2010) and
(Mayer 2010), which show promising results using a late fusion approach applied to a multimodal music genre
recognition problem. Prototypes developed in O3, like interactive music transcription, multimodal expressive
performance modeling, or multimodal interactive learning will benefit from the adaptation of classifier combination
techniques to classification problems related to music education systems.
Milestones: Report: First classifier combination versions and context review. Report: Developed classifier combination
techniques.
Deliverables: Classifier combination methods for O3 using multimodal data sources.
O3: PROTOTYPE DEVELOPMENT
Task T3.1: Interactive music transcription prototype
Execution centre: gRFIA-UA
Goal: Design and implementation of a research platform for developing and applying interactive and multimodal
techniques to monotimbral music transcription.
Description: Music transcription transforms an audio signal of a music performance into a symbolic representation (MIDI or
score). State-of-the-art techniques are far from being accurate, especially in the case of polyphonic sounds. Nothing
even close to 100% accuracy can be expected from available systems, so user corrections are needed.
The task will apply multimodal solutions and user feedback interactions. These aspects are explained in
more detail below:
Multimodality: different sources of information will be used to detect notes in a musical audio excerpt: the energies in the
signal spectrogram, note onset times, rhythm information (beat and meter), and chromagrams. Classical
approaches usually use only spectral information for this task, but the cooperative use of the other proposed
information sources is expected to improve its performance dramatically.
Interactive: the prototype will be designed to make use of user feedback on the initial onset computation, beats, meter
estimation, and notes in a left-to-right validation approach. This means that any user interaction validates what remains
at the left-hand side of its position, and the interactions are used to re-compute the rest of the output hypothesis.
Deliverables: Interactive multimodal music transcription prototype accessible for on-line operation
Task T3.2: Prototype for interactive music analysis
Execution centre: gRFIA-UA
Goal: Development of an interactive prototype for testing and demonstrating the analysis algorithms.
Description: Prototypes that perform chordal analysis (Bresson 2012) and academic tonal analysis (Illescas 2011)
have already been developed. They must now be provided with interaction features that help the user better understand the
output of an automatic analysis, and that allow the user to change that analysis, either by correcting wrong decisions or by
experimenting with new heuristics, rules, weights, etc. In addition, this tool can help in the creation of corpora by
exporting the annotated analyses as ground truth that can be used in further experiments.
The graphical interface of this prototype can be developed using both the JavaFX2 and OpenMusic frameworks, while
sharing the same core libraries developed in task T2.4. The JavaFX2 version makes it possible to integrate this system with the
polyphonic transcription prototype, in order to provide chordal or tonal annotations for the transcribed scores, while the
OpenMusic version increases the visibility of these tools thanks to the large community of users (researchers and
composers) of OpenMusic.
Deliverables: Interactive interfaces for symbolic music analysis, in JavaFX2 and OpenMusic
Task T3.3: Prototype for public performances database
Execution centre: MTG-UPF
Goal: Development of a public database of performance recordings.
Description:
Based on the MTG-UPF group’s previous work on repoVizz, this task will develop a database for remote storage,
browsing, exchange, annotation, and visualization of synchronous multi-modal, time-aligned data. Specifically, this task
will adapt the repoVizz functionalities to a more pedagogical context, including:
- Integration with the descriptors and prototypes proposed in the project.
- High-level visualization of analysis descriptors to show feedback.
- User-configurable visualizations (e.g. layout).
- Collaborative user annotation system.
- Web API to access and download the individual data streams.
- Tools for analysing and evaluating student performances and learning progress.
Deliverables: Public performance database for browsing, recording, annotating, and visualizing music performances.
Task T3.4: Expressive performance system
Execution centre: MTG-UPF
Goal: Development of an expressive music performance system prototype in the context of
music generation.
Description: The developed prototype will be able to generate expressive (i.e. human-like) music
performances of monophonic melodies based on multimodal (i.e. audio and gesture) information. The
prototype will be based on a predictive model trained using pattern recognition techniques applied to a database of
performances annotated with high-level audio features and gesture information. The prototype will consist of three
components: (a) a melodic transcription component, (b) a pattern recognition component which automatically learns an
expressive transformation model, and (c) a melody synthesis component which generates expressive monophonic audio
output from inexpressive melody descriptions, i.e. music scores.
Deliverables: An expressive music performance prototype
Task T3.5: Prototype for interactive optical music recognition
Execution centre: gRFIA-UA
Goal: Development of an online and an offline Optical Music Recognition prototype.
Description:
Two different prototypes will be developed in this task:
1) Offline Optical Music Recognition: The aim of the musician is to transcribe a score from a sheet of paper to an
electronic format. The considered device is a tablet. The musician takes a high-resolution photo of the sheet and the
system provides an initial transcription. Then, the musician interacts with the tablet in order to correct the unavoidable
transcription mistakes. In order to ease this task (minimise the number of interactions), techniques developed in previous
work packages will be applied. A longer description of the system can be found in (Calvo-Zaragoza 2013b).
2) Online Optical Music Recognition: This prototype can be used for dictations, correcting music apprentices at
notation and calligraphic levels. In this modality the musician is interested in writing music using a tablet. This case
poses fewer problems than offline Optical Music Recognition, since some steps can be skipped (most of the preprocessing and
the staff removal) or greatly simplified (symbol segmentation). The system can also use error-correcting and user
adaptation techniques developed in previous objectives. Some preliminary results of this approach can be found in (Calvo-Zaragoza 2013a).
Deliverables: Prototypes for offline and online optical music recognition.
O4: PEDAGOGICAL and SCIENTIFIC EVALUATION
Task 4.1: Models for music training
Execution centre: MTG-UPF, gRFIA-UA
Goal:
The goal of this task is to gather knowledge in order to create a database of current methods, their commonalities and
differences with particular respect to the way posture and movement are handled.
Description:
This task will be conducted as follows:
- Review of traditional music teaching models: we will conduct a literature review examining pedagogy models
in general instrumental teaching.
- Questionnaire and Interview Design. Based on the knowledge gathered through the literature review, questionnaires
concerning teaching of instrument techniques will be designed to be distributed to teachers at Escola Superior de Música
de Catalunya (ESMUC), and Conservatorio Superior de Murcia. The aims for the questionnaires will be to gather
knowledge on current methods, including detailed information on concerns of the three levels of sound, controls and
body motion. These will be broken down into groups of questions on producing sound and posture.
- Propose novel pedagogic technology-enhanced methods. Sessions at the music institutions participating in the project
(ESMUC and CM) will be conducted with the aim of demonstrating the technologies developed in the
other objectives to practicing teachers, in order to get their feedback on usefulness and visualisation design, and ultimately
to aid in proposing new pedagogic methods that could incorporate the use of these technologies.
Milestones: teachers confirming the benefit of incorporating technology in the training process in the context of different
types of skills and suggesting novel pedagogic methods.
Deliverables: novel pedagogic technology-enhanced methods.
Task T4.2: Definition of learning and guidance process and measures of assessment
Execution centre: MTG-UPF
Goal:
To provide an independent learning framework for the student to practice and study the performance skills (playing on
time, playing with correct tuning and playing expressively), tracking the practice so that it can be assessed and guided by a
tutor.
Description:
For each skill evaluated we will develop both automatic (e.g. the degree of similarity between the dynamic interpretation
of the student and expert musician) and human expert evaluations (in collaboration with teachers) of success. In
conjunction with music teachers of the ESMUC and CM we will define a set of exercises for different performance skills:
playing on time, playing with correct tuning and playing expressively. Once the specific exercises for each skill have
been defined, this task will involve recording expert musicians and teachers to build a model of what is a successful
execution of each of the exercises.
These goals will be managed using the portfolio-based methodology (Sala 2008). The portfolio is the collection of
evidence of the student learning process, which can be used to assess, guide and mentor the student. This task will
define the tools to materialize the objective goals and exercises in the definition of a portfolio for independent learning of
music using interactive technologies. We expect to define a semi-structured portfolio with a guided part containing the
exercises defined above and a free-style part where students take the initiative to decide what practice they do and
hence what additional evidence they provide towards the objective goals proposed in the portfolio.
Milestones:
Definition of models for the assessment of the interactive technologies and for the assessment of the student learning
process.
Deliverables:
A set of music performance exercises and corresponding expert musicians’ recordings.
The definition of the portfolio for independent learning of music using interactive technologies.
Task 4.3: Models for assessment of performance skills
Execution centre: MTG-UPF, gRFIA-UA
Goal: This task will assess the technologies by measuring the learning performance of students quantitatively and
qualitatively using the tools and comparing this performance with students not using the tools.
Description:
This task involves experiments with users, with the goal of assessing how well students perform when using the
developed technologies. This will be conducted using two groups of students for each skill: one group that uses the
technology and one that continues with traditional teaching methods. Evaluation models will be developed to assess the
usefulness of the technologies in helping students achieve their well-defined goals. These will contain two types of
measures: numerical evaluations such as those determined by the percentage of correct notes performed, or the
correlation of vibrato curves of the student and the master, and evaluations of the success of students who use these
new technologies compared to students who learn through traditional teaching methods alone.
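The two numerical measures mentioned above can be sketched as follows. The sketch assumes that the performed and expected note sequences are already aligned position by position, and that the student's and the master's vibrato curves are sampled at the same time instants. The function names are hypothetical.

```python
import math


def percent_correct_notes(performed, expected):
    """Percentage of score positions where the performed note matches the expected one."""
    if not expected:
        raise ValueError("expected must not be empty")
    hits = sum(1 for p, e in zip(performed, expected) if p == e)
    return 100.0 * hits / len(expected)


def pearson(x, y):
    """Pearson correlation of two equally long, time-aligned curves."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("curves must have the same length (at least 2 samples)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant curve")
    return cov / (dev_x * dev_y)
```

A correlation close to 1 means that the student's vibrato follows the shape of the master's, regardless of its depth. In practice an alignment step (for example dynamic time warping) would probably be needed before either measure is applied to real performances.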
Milestones:
Field trial with students and teachers for a period of time
Deliverables:
Assessment report of the success of the proposed technologies and learning techniques based on the students
participating in the field trial.
Task T4.4: Prototype evaluation
Execution centre: gRFIA-UA + MTG-UPF
Goal: Measure the success rate of the prototypes developed in O3.
Description:
Prototypes must be evaluated from a computational point of view in order to get a measurable success rate to improve
the quality of the models. This can be done using classical pattern recognition and machine learning measures taking
into account true positives, false positives and false negatives, but also, in the case of interactive (active learning)
prototypes such as the polyphonic music transcription system and the online OMR, considering that the number of
corrections made by a human should be as low as possible.
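A minimal sketch of the two kinds of measures follows. It assumes that detected and reference events can be compared exactly (for example as (onset index, pitch) pairs), with no tolerance window, and it uses corrections per reference symbol as one possible interaction measure. Both function names are hypothetical.

```python
def precision_recall_f1(detected, reference):
    """Precision, recall and F-measure from true/false positives and false negatives."""
    detected, reference = set(detected), set(reference)
    tp = len(detected & reference)   # events found and correct
    fp = len(detected - reference)   # events found but wrong
    fn = len(reference - detected)   # events missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def correction_rate(corrections, reference_length):
    """User corrections per reference symbol (lower is better)."""
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    return corrections / reference_length
```

The first measure evaluates the automatic output alone; the second evaluates the interactive system, since a prototype with a modest F-measure may still need few corrections if each correction fixes several later errors.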
Deliverables: Report: Evaluation of prototypes with a classical scientific methodology and using active learning metrics
O5: COORDINATION AND DISSEMINATION
Task T5.1: General coordination of the project
Goal: Efficient project coordination and timely achievement of results.
Description: This task is dedicated to all control aspects of the project, including the scientific work, the dissemination and
exploitation activities, and reporting. Participant entities will be supported and formal controls will be established to
ensure an optimum interrelationship between all of them. In particular, a project web site will be developed, where the
databases used, reports on the project development, demonstrations, and publications will be stored once they are
public.
Each participating group must inform the project supervisor every six months of the progress of its respective
modules and the effort devoted, as well as of any problems found and calendar modifications, if appropriate. The project
coordinator must prepare a yearly progress report, and will also be in charge of convening the coordination meetings needed
between all participants and of preparing the necessary contracts.
This task will also involve the establishment of the pertinent communication mechanisms with the Education Department
and other stakeholders and policy-makers, as well as the alignment with other national and international ongoing
initiatives.
Milestones: yearly and final reports.
Task T5.2: Dissemination
Goal: Disseminate the project results and maximise the impact of the project
Description: The participating research groups will disseminate the results as described in Secs. 6 and 7. Besides, they
will organize international meetings (i.e. workshops and conferences) in the project area, as a continuation of the events
they have organized in the past, like the Sound and Music Computing Conference (SMC 2010) (Barcelona), Int. Conf. on
Multimodal Interaction (ICMI 2011) and the previous editions of the Int. Workshop on Music and Machine Learning
(MML), co-located with conferences like IJCAI, ICML, ECML, ACM Multimedia, or NIPS. General outreach activities,
such as events within the ESMUC festival, will also be carried out.
Deliverables: Publications and conferences, general outreach activities
Task T5.3: Communication channels: website
Goal: Facilitate dissemination and exploitation of the project results. Provide visibility to the project.
Provide an easy and fast communication and data interchange between both groups.
Description: A project web site will be developed, with an intranet devoted to storing the databases used,
internal reports on the project development and state-of-the-art demonstrations. Part of the web site will be
made public as a way to disseminate available publications and prototypes developed in the project.
This task will also involve the establishment of the pertinent communication mechanisms with the Dirección
General de Enseñanza Superior e Investigación Científica, for the monitoring and control of the project, and
the technical and scientific reports that need to be prepared yearly.
Deliverables: A web site for the project
5.3.1. Schedule. For each objective, indicate: the researcher responsible for it, the participants
involved, the execution period (expressed in quarters), and the expected milestones and deliverables, with the
quarter (Tx) in which each is expected to be achieved.
Maximum 32 000 characters
Example:
O1: Brief description of objective 1
Responsible: Name and surname
Participants: Name and surname P1; Name and surname P2; Name and surname P3; Name and surname P4;
Name and surname Pn
Execution period (in quarters); e.g., T1 – T2
H1 – Brief description of milestone (hito) 1 and expected quarter, e.g., T1
H2 – Brief description of milestone 2 and expected quarter, e.g., T2
Hn – Brief description of milestone n …
E1 – Brief description of deliverable (entregable) 1 and expected quarter, e.g., T2
En – Brief description of deliverable n…
The objectives are described next, with their related tasks in parentheses.
O1: Develop music analysis techniques for the extraction of useful descriptors based on multimodal music information,
including audio, symbolic, image, video, gesture and physiological information.
Responsible: Rafael Ramírez Melendez (UPF)
Participants: Rafael Ramírez-Meléndez (P1), José Manuel Iñesta Quereda (P2), Melissa Mercadal Coll (P3), Manel
Palencia-Lefler (P4), Jose Oncina Carratalá (P5), Carlos Pérez Sancho (P6), David Rizo Valero (P7), Pedro Ponce de
León Amador (P8), Antonio Pertusa Ibañez (P9), Esteban Maestre (P10), Alfonso Antonio Pérez Carrillo (P11), Javier
Gallego-Sánchez (P12), Placido Illescas Casanova (P13), María Hontanilla Alfonso (P14), Jose Francisco Bernabeu
Briones (P15), Maria Luisa Bernabeu Lledó (P16), José Javier Valero Mas (P17), Jorge Calvo Zaragoza (P18), Javier
Sober Mira (P19), Oscar Mayor Soto (P20), Panagiotis Papiotis (P21), Sergio Iván Giraldo Méndez (P22), Zacharias
Vamvakousis (P23), Alvaro Sarasúa Berodia (P24), requested technicians for UPF and UA-1
Execution period: T1...T4
H1: Report: First versions of image descriptors (1.4), T3
E1: Multimodal transcription corpus of jazz solos (1.1), T4
E2: A symbolic dataset annotated with their genre, melody track, tonal and functional analysis, and chord sequences, in
MIDI and MusicXML format (1.1), T4
E3: A repository of synchronized multimodal recordings relevant to the project (1.1), T4
E4: Library of music audio content description (1.2), T4
E5: Report: chordal and melodic tonal analysis (1.3), T4
E6: Library that performs harmonic, tonal, and functional music analysis (1.3), T4
E7: Report: Review of preprocessing and feature extraction techniques for OMR (1.4), T4
E8: Algorithms for the recognition of performer gestures (1.5), T4
O2: Design and implement basic science machine learning and pattern recognition techniques. These techniques will be
used to build models for different aspects of music processing, obtaining new multi-modal interactive paradigms for
music learning and training.
Responsible: Jose Oncina Carratalá (UA)
Participants: P1, P5, P8, P9, P10, P15, Maria Luisa Micó Andrés (P25), Juan Ramón Rico Juan (P26), Jorge Calera
Rubio (P27), Áureo Serrano Díaz-Carrasco (P28), requested technician for UA-2.
Execution period: T1...T8
H1: Report: First versions of active learning methods and context review (2.2), T4
H2: Report: First versions of efficient similarity search methods and context review (2.3), T4
H3: Development of efficient classification techniques for dynamic environments (2.3), T8
H4: Report: First classifier combination versions and context review (2.4), T4
H5: Developed classifier combination techniques (2.4), T8
E1: Techniques and algorithms for integrating the multimedia data (2.1), T6
E2: Report: Evaluation of active learning techniques in interactive systems (2.2), T6
E3: Basic science and applied active learning techniques to exploit interaction (2.2), T5
E4: Report: Dynamic similarity search and reduction techniques and evaluation (2.3), T7
E5: Report: Classifier combination methods for O3 using multimodal data sources (2.4), T8
O3: Development of interactive multimodal prototypes for music learning and training. All the prototypes should be
available by T10, but during the last semester they can be enhanced with the evaluation feedback received from O4.
Responsible: María Luisa Micó Andrés (UA)
Participants: P1, P2, P8, P9, P10, P11, P18, P19, P20, P21, P22, P25, P26, Dolors Sala Batlle (P29), requested
technicians for UPF and UA-1
Execution period: T5...T12
E1: Interactive multimodal music transcription prototype accessible for on-line operation (3.1), T10
E2: Interactive interfaces for symbolic music analysis, in JavaFX2 and OpenMusic (3.2), T8
E3: Public performance database for browsing, recording, annotating, and visualizing music performances (3.3), T7
E4: Expressive music performance prototype (3.4), T10
E5: Prototypes for offline and online optical music recognition (3.5), T10
O4: The resulting prototypes will be evaluated from a pedagogical point of view and also using classical scientific
evaluation criteria.
Responsible: Melissa Mercadal Coll (UPF, ESMUC)
Participants: P1, P3, P4, P8, P10, P11, P22, P23, P29, Maria Cristina Marinescu (P30), requested technicians for UPF
and UA-2
Execution period: T1-T12
H1: Teachers’ confirmation of the benefit of incorporating technology in the training process (4.1), T4
H2: Definition of models for the assessment of the interactive technologies and for the assessment of the student
learning process (4.2), T5
H3: Field trial with students and teachers for a period of time (4.3), T7
E1: Report: Novel music pedagogy technology-enhanced methods. (4.1), T6
E2: Report: A set of music performance exercises and corresponding expert musicians’ recordings. (4.2), T5
E3: Report: Definition of the portfolio for independent learning of music using interactive technologies. (4.2), T6
E4: Report: Assessment of the success of the proposed technologies and learning techniques based on the students
participating in the field trial. (4.3), T12
E5: Report: Evaluation of prototypes with a classical scientific methodology and using active learning metrics (4.4), T12
O5: In order to carry out the previous objectives, it is necessary to coordinate the project and to oversee the scientific work and
the dissemination and exploitation activities.
Responsible: José Manuel Iñesta Quereda (UA)
Participants: P1, P2, requested technicians for UPF, UA-1 and UA-2
Execution period: T1-T12
H1: Yearly reports of the working groups (5.1)
E1: Project website (5.2)
E2: Publications and conferences, general outreach activities (5.3)
The dependencies between objectives can be seen in Fig. 5, and the chronogram of the tasks in Fig. 6.
Besides developing the website, the requested technicians will carry out the following tasks:
- UPF technician: In the first stage of the project, his/her work will be centered on the music analysis objective (O1), focusing
on the subtasks carried out by UPF (audio and gesture acquisition, audio analysis, and video analysis). Then, he/she will
contribute to developing the O3 prototypes for the public performance database and expressive performance. Finally, he/she
will assist with the pedagogical evaluation in O4.
- UA-1 technician: In the first stage, his/her work will be centered on the music analysis objective (O1), focusing on the
subtasks carried out by UA (symbolic acquisition, symbolic analysis, and image analysis). Then, he/she will contribute to
developing the O3 prototypes for interactive music transcription, interactive music analysis, and optical music recognition.
- UA-2 technician: During the first part of the project, his/her work will be focused on applying and developing pattern
recognition and machine learning techniques (O2). Then, he/she will concentrate on the scientific evaluation of the prototypes
in O4.
If the funding for any technician is not included in the final resolution, the objectives associated with their tasks may
be revised.
5.3.2. Chronogram (graphic). In the chronogram, mark the duration of each OBJECTIVE with an X and indicate the MILESTONES with H1...Hx and the DELIVERABLES with E1...Ex, where applicable, for
each objective
[Chronogram chart: objectives 1 to 5 plotted against the quarters of Years 1 to 4. The chart layout is not recoverable from the extracted text; the milestones and deliverables of each objective, with their quarters, are listed under the objectives above.]
6. EXPECTED IMPACT OF THE RESULTS
Briefly describe the scientific-technical, international, social or economic impact expected from the project results
on the chosen challenge.
The expected impact of the results must also be completed in the application form. Its content may be
published for dissemination purposes if the project is funded in this call (footnote 1).
Maximum 3500 characters
The core beneficiaries of the project are expected to be music students and conservatories. In particular, ESMUC,
BMAT, and the conservatories of País Vasco, Alicante, Murcia and Valencia will benefit as EPO from this project.
Music learning and the enjoyment of performance are of key interest to a multitude of stakeholders and communities:
scientific research, technology development, creative industries and professionals, music archive owners, collaborative
music environments, broadcasting and media, music production, etc. For this reason, we expect the project to have an
impact on music pedagogy, science, technology, the economy and society:
• The project will produce novel pedagogic methods based on the capabilities opened up by the developed technologies,
such as self-monitoring through real-time feedback and augmented information, based on signal processing techniques, that
cannot be perceived with traditional methods.
• The project will also have an impact on technologies such as sound analysis, computer vision (of gestures and optical music
recognition) and multimodal data storage and sharing.
• The research outcomes will also have scientific impact through a multidisciplinary research agenda at the intersection
of pedagogy, kinematics, signal processing, visualisation and music performance. From the onset of the project,
experimental results and datasets with the recordings will be available to the music teaching communities and also to the
scientific community. Additionally, the dissemination plan includes participation in the most relevant conferences
on topics related to TIMuL, as well as publication in top-level journals.
• The project will also be in a position to impact society as a whole, e.g. through the release of applications (e.g. on mobile
platforms).
• The project will contribute to a higher level of personalised, technology-based tutoring, potentially leading to its widespread penetration in schools and at home.
• The project will contribute an extensive open source library of music analysis and processing tools for the music
education context. When possible, all technological outcomes will be released under open licenses in order to increase
the popularity of the TIMuL platform among scientists and developers.
Contribution to society
The impact of the proposed project on technology-enhanced music pedagogy is expected to increase in the long term. A
targeted effort is required to address future education needs and broaden the impact to achieve a greater societal
impact. By testing novel technologies for learning, the project aims at sparking creativity at both professional and
amateur level and stimulating new teaching methodologies. The feedback and observations from the project testbeds will
make it possible to capture data on workflows and to gain a deeper understanding of the learning processes. The data and
knowledge gained about the individual and collective learning processes have the potential for a major impact on
education in general (general pedagogic methods), and on the clarity of information conveyed during the learning
processes. Further societal benefits will include new ways of interacting with technologies and media content, and new
communication systems in and outside of education.
1 If the project is funded in this call, the granting body may request its conclusions and results,
which may be published for dissemination purposes.
7. DISSEMINATION AND TRANSFER OF THE RESULTS
7.1. DISSEMINATION PLAN.
Maximum 4000 characters
The proposed project will prepare and implement a dissemination plan whose main objective is to ensure that the
project’s research and development outcomes are disseminated widely and in a timely manner to the music community, the scientific
community, the industry, and the general public. The active participation of the ESMUC and the CM guarantees a proper
gathering of user requirements, the pedagogic validation of the technologies to be developed in realistic situations, and
thus the focus on applicable technologies for the consumer market in music conservatories. Additionally, the project will
encourage third parties (institutions, companies, end users) interested in evaluating and exploiting results from the
project to further disseminate and circulate information about the project and its outcomes. Moreover, the TIMuL team will
join efforts with the PHENICX EC project, coordinated by MTG-UPF, which aims at improving the user experience in the
frame of classical music concerts and is recording a great number of concerts and rehearsals in music schools.
Multiple dissemination strategies will be adopted. These include (1) an Internet dissemination strategy with multiple targets
(academia, industry, and the general public), implemented through a web portal and social networks; (2)
scientific publications addressed to the academic community, implemented through contributions to primary international
journals and conferences; and (3) the organization of public events as satellite events of the ESMUC Festivals for broad
dissemination addressing both expert musicians and non-expert users.
The Internet will be a major dissemination channel for the project. An Internet web portal will be developed at the
beginning of the project. It will include thematic areas aimed at different kinds of users and stakeholders,
e.g., an area with descriptions of research results and links to publications, an area for the hardware and software
developments with the possibility of downloading material, and an area devoted to testbed applications and possible exploitation.
Scientific publications are mainly targeted at dissemination within the scientific and academic community. Publications will
include:
• Preparation of one or two special issues in an international scientific journal: the project partners will apply for special
issues devoted to the research challenges addressed by the project.
• Submission of scientific papers to primary international scientific journals: the papers will introduce the research
challenges and will present the results achieved by the project. The journals considered as possible targets for
publications include IEEE Transactions, SPIE journals, ACM Transactions (e.g., on Multimedia), and International
Society for Music Education (ISME) journals. Conferences include IEEE and ACM conferences related to the project focus.
• Organization of workshops and special sessions at primary scientific conferences: Some of the researchers
participating in this project regularly organise workshops related to the research challenges addressed by the project.
• Submission of scientific papers to primary international conferences: the papers will introduce the research challenges
and will present the results achieved by the project.
Public events will be organized throughout the project, with a special focus on
particularly relevant project phases:
• At the beginning of the project, to establish the dissemination framework, processes, and infrastructure, and to
produce a detailed dissemination plan. A start-up event is planned during the first 6 months.
• At mid-execution, to launch the first technological results, with the purpose of validating those outcomes with
non-expert users.
• At the end of the project, a final public event, including public installations and demonstrations, will be organized for broad
dissemination of the final results and the sample applications developed.
7.2. TRANSFER AND EXPLOITATION PLAN, where applicable, for the project results, including those
entities interested in the results of the coordinated project, specifying their participation in and/or contributions to
its development.
Maximum 4000 characters
Exploitation channels for the project results are planned in the following sectors:
(i) Exploitation of the project's systems toward the consumer market of music learning, especially in music
conservatories and universities. To this end, we will consider proposing the project results to various business
organisations (e.g. Yamaha, with whom the MTG-UPF has had a long R&D relationship) for possible exploitation;
(ii) Exploitation towards other market niches, including the computer games industry. Many of the outcomes of the
project could be applied to the development of music-based games in the line of Guitar Hero. The MTG-UPF has strong
links with the company BMAT, which has shown interest in the results of the project;
(iii) Separate exploitation of the individual project plug-ins and libraries (e.g. on music synthesis or gesture processing, as
well as the management of big content repositories), possibly in different market niches;
(iv) Exploitation of the project's multimodal data for scalable music content fruition in other kinds of music applications
(e.g., Music Information Retrieval).
8. PUBLIC AND PRIVATE FUNDING (R&D&I PROJECTS AND CONTRACTS)
OF THE RESEARCH TEAM
Indicate only the funding received in the last 5 years (2009-2013), whether regional, national or
international. Also include applications pending resolution.
You must indicate: reference, title, principal investigator, funding body, duration and amount of the grant,
as well as the following keys: 0 = same topic; 1 = closely related; 2 = somewhat related; 3 = unrelated;
C or S depending on whether it is a grant awarded (concesión) or an application (solicitud).
Maximum 8 projects or contracts per subproject of the coordinated project.
Maximum 28 000 characters
Example:
1. Reference XXX2009-nnnnn. “Title”, “Principal investigator”, MICINN, 01/2010-12/2012. xx.xxx €. 1-C
2. Reference XXX2012-nnnnn. “Title”, “Principal investigator”, MINECO, 01/2013-12/2015. xxx.xxx €. 3-C
3. FP7-NMP-20xx-SMALL-x. “Title”, “Principal investigator”, EU, 01/2013-12/2015. xxx.xxx €. 2-C
4. FP7-PEOPLE-2012-ITN. “Title”, “Principal investigator”, EU, 01/2013-12/2015. xxx.xxx €. 1-S
…
gRFIA-UA group:
1. CSD2007-00018. “Multimodal Interaction in Pattern Recognition and Computer Vision” (MIPRCV), Jose Oncina,
Ministerio de Ciencia e Innovación, 10/2007-09/2012. 360.000€, 1-C
2. TIN2009-14247-C02-02. “Description and retrieval of music and audio information / Descripción y recuperación de
información musical y sonora” (DRIMS), Jose Manuel Iñesta, Ministerio de Ciencia e Innovación, 01/2010-12/2013.
135.100€, 1-C
3. CICYT TIN2006-14932-C02-02. “Semantic processing of digital music” (PROSEMUS), José Manuel Iñesta, Ministerio
de Ciencia e Innovación, 10/2006-10/2009. 90.000€, 1-C
4. CICYT TIN2009-14205-C04-C1. “Interactive and adaptive techniques for machine learning, recognition and
perception”, María Luisa Micó Andrés, Ministerio de Ciencia e Innovación, 01/2010-12/2012. 114.100€, 2-C
5. PROMETEO/2012/017. “Explotación de la realimentación en tareas interactivas de reconocimiento de formas”,
Jose Oncina, Conselleria de Educación Comunitat Valenciana, 01/2012-12/2013. 29.335€ (2012) + 47.300€ (2013) +
two further years (renewed annually), 2-C
6. C 2012-2013. “Pascal 2 (European Excellence Network)”, Jose Oncina, VII Framework Program, 03/2008-03/2013.
Funding depends on merits, 2-C
7. DPI2006-15542-C04-01. “Pattern Recognition Applications to Automation of Quality Control and other Industrial
Processes”, Jorge Calera Rubio, Ministerio de Ciencia e Innovación, 10/2006-09/2009. 75.000€, 3-C
8. GRE-12-34. “Desarrollo de un entorno interactivo para la representación de música antigua española”, David Rizo
Valero, Universidad de Alicante, 09/2013-08/2015. 5.400€, 1-C
MTG-UPF group:
1. FP7-2011.8.2-601166, “PHENICX: Performances as Highly Enriched and Interactive Concert Experiences”, Emilia
Gómez, European Commission, 01/02/2013 - 31/01/2016. 561.400€. 1-C
2. “Vocaloid 5”, Jordi Bonada, Yamaha, 01/04/2013 - 31/03/2014. 136.000€. 1-C
3. EC / FP7-ICT-2009-C-250026, “SIEMPRE”, Xavier Serra, 01/05/2010 - 30/04/2013. 202.500€. 1-C
4. TIN2009-14247-C02-01, “DRIMS”, Xavier Serra, MICINN, 2009-2012. 131.500€. 1-C
5. Reference 002345, “Music Therapy versus Recreation/Reminiscing Group Sessions on Behaviors, Cognition, and
Salivary Cortisol Levels of Older Adults in the Middle to Later Stages of Alzheimer’s Disease and Other Related
Dementia (ADRD): A Comparison Across Two Cultures”, “Andrea Cevasco, Melissa Mercadal”, University of Alabama,
Sanitas and URLL-Universitat Ramon Llull, 1.10.2013-31.8.2014. 6.200€. 1-S.
6. TSI-070100-2008-318, MUSICA 3.0, Xavier Serra, 01/10/2008-31/12/2010. 442.756€. 2-C
7. FP7-215749, SAME, Xavier Serra, 01/01/2008-31/12/2010. 315.000€. 2-C
8. Reference 002345, “Qualitat de Vida de les Persones amb Demència: Contribucions de la Música”, “Carme Solé
Resano”, URLL - Universitat Ramon Llull, 1.9.2009-31.8.2010. 2.000€. 2-C.
9. LIST OF THE PEOPLE WHO MAKE UP THE WORK TEAM
List the members of the work team who will participate in the execution of the research project (in accordance
with article 18.9 of the call). Remember that the curriculum vitae of the doctors included here must be provided in
the project application form.
Maximum 12 000 characters
Indicate FULL NAME and the following keys as appropriate:
QUALIFICATION: Doctor (D); Licentiate or engineer (L); Graduate (G); Master (M); Vocational training (FP); Other (O)
CONTRACT TYPE: In training (F); Contracted (C); Technician (PT); Foreign entity (EE); Other (OC)
CONTRACT DURATION: Permanent (I); Temporary (T)
1. Full name. G-F-T
2. Full name. FP-PT-I
3. Full name. EE
SP1: gRFIA-UA group
• Research team:
1. José Manuel Iñesta Quereda*. D-C-I
2. Jose Oncina Carratalá. D-C-I
3. María Luisa Micó Andrés. D-C-I
4. Juan Ramón Rico Juan. D-C-I
5. Jorge Calera Rubio. D-C-I
6. Carlos Pérez Sancho. D-C-I
7. Pedro Ponce de León Amador. D-C-I
8. Antonio Pertusa Ibañez. D-C-I
• Work team:
1. Javier Gallego-Sánchez. D-C-T
2. David Rizo Valero. D-C-T
3. Placido Illescas Casanova. M-C-I
4. María Hontanilla Alfonso. L-C-I
5. Jose Francisco Bernabeu Briones. M-O-I
6. Maria Luisa Bernabeu Lledó. M-C-I
7. Aureo Serrano Díaz-Carrasco. M-PT-T
8. José Javier Valero Mas. M-F-T
9. Jorge Calvo Zaragoza. M-F-T
10. Javier Sober Mira. O-PT-T
SP2: MTG-UPF group
• Research team:
1. Rafael Ramírez-Meléndez*. D-C-I
2. Melissa Mercadal Coll. D-C-I
3. Dolors Sala Batlle. D-C-I
4. Manel Palencia-Lefler. D-C-I
• Work team:
1. Esteban Maestre. D-C-T
2. Alfonso Antonio Pérez Carrillo. D-C-T
3. Maria Cristina Marinescu. D-C-T
4. Oscar Mayor Soto. L-C-T
5. Panagiotis Papiotis. L-F-T
6. Sergio Iván Giraldo Méndez. L-F-T
7. Zacharias Vamvakousis. L-F-T
8. Alvaro Sarasúa Berodia. L-F-T
* Principal investigators
BRIEF DESCRIPTION OF THE PARTICIPANT GROUPS
SP1: gRFIA-UA group
The Pattern Recognition and Artificial Intelligence Group is a research group of the Department of Software and
Computing Systems at the University of Alicante. gRFIA is specialized in computer music applications of pattern
recognition and machine learning algorithms. The gRFIA-UA group was born in the 90s from the eponymous group of
the Polytechnic University of Valencia. Currently, gRFIA-UA comprises 20 researchers, including 12
Ph.D. holders.
The group has produced 8 Ph.D. theses and over 100 articles in journals with ISSN. These include prestigious
international journals such as Machine Learning, IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE
Transactions on Systems Man and Cybernetics, and Pattern Recognition. Moreover, the group has published over 50 book
chapters and has participated in more than 100 international conferences. It has also edited 7 national and 4 international
books.
The research activity of the gRFIA-UA group is currently focused on the following research areas:
● Pattern Recognition and Computer Vision, working specifically on classification techniques based on the geometric and
syntactic approaches. Applications in the food industry and in other aspects of industrial robotic handling are being developed.
● Machine Learning. In this field, the group works on learning algorithms that build correct models (finite
automata, stochastic automata, tree grammars, etc.) from language samples (which may be
strings or trees) for classification tasks, and on metric inference for probabilistic edit distance measures.
● Audio and music signal and sequence analysis. This is the research area mainly involved in
the present project, and most of its current interests are part of this report.
SP2: MTG-UPF group
The Music Technology Group (MTG) is a research group of the Department of Information and Communication
Technologies in Universitat Pompeu Fabra, specialized in sound and music computing. With more than 50 researchers
coming from different and complementary disciplines, the MTG aims at finding a balance between basic and applied
research and, at the same time, promotes interdisciplinary approaches that incorporate knowledge from both
scientific/technological and humanistic/artistic disciplines. The group carries out research on topics such as sound
processing and synthesis; music content description; interactive music systems; computational models of perception and
music cognition; expressive performance modeling; and the technologies related to music social networks.
The MTG-UPF has broad experience in collaborative research projects at national and international levels and
has participated in European projects since the 4th Framework Programme. MTG leads several undergraduate and
postgraduate academic programmes, such as the Master and PhD in Sound and Music Computing. The group has
produced more than 28 PhD theses and has a long record of international research publications.
MTG’s mission is to contribute to the improvement of the technologies related to sound and music communication,
carrying out competitive research at the international level and at the same time transferring its results to society. To that
end, the MTG regularly organizes dissemination events involving academic, artistic and industrial audiences at
international level, as well as other outreach activities (open door days, summer schools, Music Hack Days, etc.). In terms of
technology transfer, the MTG has carried out R&D projects in collaboration with private entities for more than 15 years,
some of them international companies such as Yamaha Corp. in Japan, and to date has created 3 spin-off
companies devoted to the exploitation of mature technologies developed by the research group: Barcelona Music and
Audio Technologies (BMAT), Reactable Systems and Voctro Labs. As a result of these technology transfer activities, 29
patents have been filed, and some of them have been extended internationally.
10. TRAINING CAPACITY OF THE APPLICANT TEAMS
Complete only if inclusion of any of the subprojects of the coordinated project is requested
in the call for “Predoctoral contracts for the training of doctors” (footnote 2).
For each subproject, specify:
- identification of the PI (or PIs) of the subproject;
- training plan envisaged in the context of the coordinated project;
- list of theses completed or in progress, indicating the name of the doctoral candidate, the thesis title and the date the
doctorate was obtained or the expected thesis defence date, for the requested subproject (last
10 years).
Maximum 12 000 characters.
Subproject 1: gRFIA-UA group
Some of the researchers involved in this subproject lecture in the PhD program of the department, specifically in the
areas related to the goals of this project (machine learning and computer music). This PhD program has historically been
awarded the quality mention by the corresponding Spanish Agency (course 2005-2006: BOE 14/07/2005 Ref.
MCD-2005 00095, course 2006-2007: BOE 30/08/2006 Ref. MCD-2005 00095, courses 2007-2008 and 2008-2009:
BOE 12/10/2007 Ref. MCD2005-00095). The gRFIA group is currently supervising 4 PhD theses.
Moreover, the UA group is involved in a master course run by the Informatics Research Institute (IUII) and the Jean Monnet
University (Saint Etienne) that has subsumed the contents formerly offered by the doctoral courses cited above. We are
confident that, given the same or even better quality indicators of the personnel involved, these master
courses provide an excellent training environment.
Subproject 2: MTG-UPF group
This subproject involves research at several levels in topics in which the MTG group has recognised international
experience (e.g. signal processing, expressive music performance modeling, music interaction, multimodal music
interfaces, singing voice processing). Currently, MTG members involved in this proposal are supervising 3 PhD theses,
of which 2 are to be finished during 2013.
Furthermore, the MTG-UPF currently runs a Master in Sound and Music Computing (SMC) in which some of the
researchers participating in this proposal teach courses. The mission of the SMC Master is to train the researchers and
professionals that will push forward the sound and music technologies of the new information society. It combines
practical and theoretical training in topics such as computational modeling, audio engineering, perception, cognition, and
interactive systems. More specifically, the Master trains students in the technologies for the analysis, description,
synthesis, transformation and production of sound and music, and also in the technologies and processes that support
music interaction. The SMC master together with the UPF doctoral program provides an optimal environment for the
training of new researchers in the field of the proposed project.
These two facts place the MTG in a perfect position to accept grant holders from the Spanish Researcher Training
Program (FPI).
2 Inclusion in the call for “Predoctoral contracts for the training of doctors” will only be possible for a
limited number of the approved projects.
11. ETHICAL AND/OR BIOSAFETY IMPLICATIONS
OF THE PROPOSED RESEARCH
If you answered affirmatively in the electronic application form to any of the aspects related to
ethical or biosafety implications listed there, explain the ethical aspects of the proposed
research, the considerations, procedures or protocols to be applied in compliance with current regulations, as well as
the facilities and mandatory authorisations available for the execution of the coordinated project.
Maximum 8000 characters
Any data collected for user or context modelling will be strictly anonymous unless the users of the TIMuL consortium
choose to waive this identity protection procedure. In all cases, the personal identity associated with the data will be strictly
protected from third parties, and the data will only be used for testing purposes within the project. TIMuL will comply with data protection
acts, directives, and opinions, at both European and national level. These include:
• Directive 95/46/EC of the European Parliament and the Council of 24 October 1995 on the protection of individuals with
regard to the processing of personal data and on the free movement of such data.
• The Charter of Fundamental Rights of the European Union, specifically the article concerning the protection of personal
data.
• The opinions of the European Group on Ethics in Science and New Technologies in their report “Citizens Rights and
New Technologies: A European Challenge” on the Charter on Fundamental Rights related to technological innovation.
• In particular, the recommendations related to ICT concerning data protection and individuals' freedom and autonomy.
TIMuL will perform user studies and tests. Following best practice for ethics in Human-Computer Interaction (Ethics
in HCI and Usability, 2010), the data collected during the user evaluations will be automatically anonymised, or at least
pseudonymised, and used for research purposes only. They will not be transmitted to third parties, and any
information given for community or personalization benefits will be given voluntarily. The data may include, but are not
limited to, personal information about the user such as name, date of birth, interests, location, images, or relations to
other users. In no case will they include sensitive personal data (e.g. health, sexual lifestyle, ethnicity, political
opinion, religious or philosophical conviction). Even though participation is voluntary and may even be anonymous,
informed consent is necessary and will be sought from each individual user before his or her data are stored. This
will be accomplished by formulating terms of use that users must accept. Depending on how far-reaching the data
collection is, informed consent will be requested at several levels of agreement (e.g. people may agree that TIMuL
analyses the data they upload, but not their user interactions, because this may intrude deeper into their privacy).
The terms of use will also inform users about the legal aspects of obtaining information for evidential purposes.
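The pseudonymisation and multi-level consent described above could be sketched as follows. This is only an illustrative sketch, not the project's implementation: the field names, the consent labels ("uploads", "interactions") and the choice of a keyed SHA-256 hash are assumptions made for the example, since the proposal does not define a data schema or a specific technique.

```python
import hashlib
import hmac

# Hypothetical field names: the proposal does not define a data schema.
SENSITIVE_FIELDS = {"health", "sexual_lifestyle", "ethnicity",
                    "political_opinion", "religion"}
DIRECT_IDENTIFIERS = {"user_id", "name", "date_of_birth"}


def pseudonymise(record: dict, secret_key: bytes) -> dict:
    """Return a copy of `record` without identifiers or sensitive fields.

    The user id is replaced by a keyed hash (HMAC-SHA256), so the same user
    always maps to the same pseudonym, but the mapping cannot be recomputed
    without the secret key.
    """
    user_id = str(record["user_id"]).encode("utf-8")
    pseudonym = hmac.new(secret_key, user_id, hashlib.sha256).hexdigest()
    dropped = SENSITIVE_FIELDS | DIRECT_IDENTIFIERS
    cleaned = {k: v for k, v in record.items() if k not in dropped}
    cleaned["pseudonym"] = pseudonym
    return cleaned


def may_store(consent_levels: set, data_kind: str) -> bool:
    """Allow storage only for the kinds of data the user has agreed to."""
    return data_kind in consent_levels
```

For example, a user who agreed only to the analysis of uploaded data would have `may_store({"uploads"}, "interactions")` evaluate to `False`, so interaction logs would not be stored for that user.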
At the technical level, reasonable measures to secure personal data will be applied; for
instance, personal data will be transmitted over open communication channels in encrypted form only. Another
aspect of privacy is protection from spamming, for which appropriate tools will be devised.
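As one possible illustration of the "encrypted form only" rule, a client could refuse any non-HTTPS endpoint and use a TLS context with certificate verification. This is a minimal sketch using Python's standard library; the function names and the TLS 1.2 minimum are assumptions for the example, not requirements stated in the proposal.

```python
import ssl
from urllib.parse import urlparse


def make_tls_context() -> ssl.SSLContext:
    """Create a client-side TLS context with certificate and hostname checks."""
    ctx = ssl.create_default_context()
    # Assumed policy for this sketch: reject protocol versions older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def require_encrypted(url: str) -> str:
    """Return `url` unchanged if it uses HTTPS, otherwise raise ValueError."""
    if urlparse(url).scheme != "https":
        raise ValueError("personal data may only be sent over an encrypted channel")
    return url
```

The returned context can be passed to `urllib.request.urlopen(url, context=ctx)` or to `http.client.HTTPSConnection`.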
With respect to ethical and other issues, the consortium states the following:
• Participants within the consortium conform to current legislation and regulations in the countries where the research will
be carried out.
• Participants conform to EU legislation, international conventions and declarations.
• No issues related to protection of animals exist.
The partners in this project will ensure that the workplan is drawn up on the basis of the principles described above,
so that all partners formally accept and adhere to them.
ANNEX I
Include the formulas, chemical reactions, etc., that could not be inserted into the form text because of its typeface.
Each equation number must match its reference in the text.
Maximum 3 pages
Respect the maximum length indicated. Remember that, under article 11 of the call, SCIENTIFIC-TECHNICAL REPORTS
not submitted in this format WILL NOT BE ACCEPTED AND CANNOT BE CORRECTED.
ANNEX II
Only where considered necessary to clarify certain aspects of the project, include the images or figures (TIFF, JPEG or
GIF format), up to a maximum of 10, that are referred to in the text.
Respect the maximum number of figures indicated. Remember that, under article 11 of the call, SCIENTIFIC-TECHNICAL
REPORTS not submitted in this format WILL NOT BE ACCEPTED AND CANNOT BE CORRECTED.
Figure 1. Typical scenario involving the project technologies. The musician is provided with real-time visual and auditory
feedback with information about the sound and instrument control parameters of the performance. The performance
information may be stored in or compared with performances in a database.
Figure 2. Example of vibrato widget. The same exercise played by different famous performers is shown together with
the student performance.
Figure 3. Timbre representations based on two different types of features (sound and control). Left: based on sound
features, a spectrogram with a linear regression of the spectral peaks. Right: based on controls, the so-called Schelleng
diagram, showing in red the trajectory of the controls in that 2D space.
Figure 4. Main areas of research in the TIMuL project, objectives and tasks, and the expertise of participating
institutions.
Figure 5. Dependencies of the project objectives. Objectives O1 and O2 are closely related, and their results will be
used for prototype development (O3), which may in turn require them to be adapted. Evaluation (O4) will influence the
prototype development stage, and coordination and dissemination (O5) will be carried out throughout the project.
Figure 6. Chronogram of tasks