Unifying Semantical Annotation and Querying in Biomedical Image Repositories
One Solution for Two Problems of Medical Knowledge Engineering
Daniel Sonntag, Manuel Möller
Santiago Redondo Salvo
Sistemas de Información en Medicina
Máster en Ingeniería Biomédica

1. Introduction
• What happens when a patient comes to the doctor?
http://www.elrincondelamedicinainterna.com/2014_01_11_archive.html

1. Idea
• So far, images provide no additional explicit information about their content, and when they do, they do not share a common semantics
• In reality they all belong to the medical domain and share common rules and concepts
• Use medical images annotated with semantic content as the basis for computer-aided diagnosis and clinical decision support

1. Problem
• Getting this medical knowledge about the image content into the information system
– Huge amounts of data => automate
– Supervision and validation
– How to store it so that it is useful

1. Solution
• THESEUS-MEDICO project: a computer system that uses ontologies to unify
– The semantic annotation of the images (through a GUI)
– The retrieval of this information (natural-language dialogue)

Embedded slide (THESEUS-MEDICO, Knowledge Engineering in the Medical Domain):
Problem: Many individual medical applications, no common semantics (diagram labels: Images, Clinical Records, Treatment Plans)
BUT: Queries are not arbitrary, instead based on anatomy, physiology, pathology…
– e.g. only heart has left ventricle
– e.g. certain spatial relations between the organ regions/segments
Proposed solution: “Show me the CT scans & records of patients with an enlargement in the dimension of the lymph node in the neck”
Image Type: CT Scan. Region: left ventricle part_of heart
1. Outline
• Challenges in medical knowledge engineering
• Techniques for analysing and querying sets of images annotated with semantic content
• Interaction through a multimodal query interface based on natural language
• Description of the annotated image repository
• Related work and conclusions

1. MEDICO system
Figure 6. Generic Architecture of a Multimodal Dialogue System.
Figure 7. Overall MEDICO Semantic Search Architecture.

2. Challenges
• Starting from low-level information: medical images, tests and other analyses
• Obtain and annotate information at a high semantic level: signs and symptoms
• That help to provide a medical diagnosis

Excerpt from the paper, section 2, Challenges:
Various challenges exist in medical knowledge engineering, all of which arise from the requirements of the clinical reporting process. The clinical reporting process focuses on the general question What is the disease? (or, as in the lymphoma case, Which lymphoma?). To answer these questions, semantic annotations on medical image contents are required. These are typically anatomical parts such as organs, vessels, lymph nodes, etc. Image parsing and pattern recognition algorithms can extract the low-level image feature information. The low-level information is used to produce higher-level semantic annotations to support tasks such as differential diagnosis. For this purpose, we envision a flexible and generic image understanding software for which image semantics, which are expressed using concepts from existing medical domain ontologies, play a major role for access and retrieval. Unfortunately, although automatic detection of image semantics seems to be technically feasible (e.g., see (Kumar et al., 2008)), it is too error-prone (at least on the desired annotation level where multiple layers of tissue have to be annotated at different image resolutions). Accordingly, one of the major challenges is the so-called knowledge acquisition bottleneck. We cannot easily …

2. Challenges
• Communication-related:
– Knowledge acquisition bottleneck or knowledge elicitation
• Related to the engineering of medical ontologies:
– The knowledge is opaque to the engineer because of the specialised vocabulary
– Existence and/or creation of complete and comprehensive medical ontologies
– Multiple hierarchies, complex and with many relations
– High-risk environment with no margin for error; the knowledge must be precise
– Systems are hard to model because the engineer lacks command of the subject
Based on FOIS2008-Wennerberg.pdf

Excerpt:
… detecting 19 body landmarks very quickly and robustly in about 20 seconds. By forming an anatomical network, the landmarks can be used to restrict the search area in the context of organ detection. New anatomy can be easily incorporated since the framework can be trained and handles the segmentation of organs and the detection of landmarks in a unified manner. The detected landmarks and segmented organs are used in multiple ways. First, they facilitate the semantic navigation inside the body (see Figure 2, left), and second, they are used for the generation of semantic annotations such as “spleen” or “splenomegaly”.

3. Image annotation tool
Figure 1: Graphical User Interface of the Annotation Tool
Figure 2. MEDICO application that integrates automatic landmark and organ detection with manual image annotations.
Figure 1 shows the graphical user interface of the annotation tool. Images can be segmented into regions of …
3. Image annotation tool
• Lets clinicians annotate images, reusing several reference terminologies and ontologies:
– FMA (Foundational Model of Anatomy) for body anatomy (75k concepts and 2.1M relations)
– RadLex to express how particular anatomical features or related diseases manifest in an image
– ICD-10 (International Classification of Diseases) to classify diseases
• Searches PubMed (and other sources) for related information
• Stores the history of the clinician's interactions with the system
• Maintains a remote RDF repository with the images, their corresponding semantics and the rest of the clinical record

3. RDF
• RDF (Resource Description Framework) is a family of W3C specifications. Originally designed as a data model for metadata, it is used as a general method for the conceptual description or modelling of information implemented in web resources
• It is based on statements about resources in the form of subject-predicate-object expressions, or triples
• E.g.: the idea "The sky has the colour blue" is expressed in RDF as a triple with a subject denoting "the sky", a predicate denoting "has the colour" and an object denoting "blue"
http://en.wikipedia.org/wiki/Resource_Description_Framework

Excerpt:
3. In the clinical staging and patient management process the general concern is with the next steps in the treatment process. The results of the clinical staging process influence the decisions that concern the patient management process in a later phase.

3. Annotations
Figure 3. MEDICO semantic annotation scheme.

4. Query interface
• Context-dependent interaction
• Multimodal
– Text to speech
– Touchscreen
Example utterance: “Show me the internal organs: lungs, liver, then spleen and colon.”
Figure 3: Multimodal Touchscreen Interface. The clinician can touch the items and ask questions about them.
4. Dialogue example
• U: “Show me the CTs, last examination, patient XY.”
• S: Shows corresponding patient CT studies as DICOM picture series and MR videos.
• U: “Show me the internal organs: lungs, liver, then spleen and colon.”
• S: Shows corresponding patient image data according to referral record.
• U: “This lymph node here (+ pointing gesture) is enlarged; so lymphoblastic. Are there any comparative cases in the hospital?”
• S: “The search obtained this list of patients with similar lesions.”
• U: “Ah okay.” Our system switches to the comparative records to help the radiologist in the differential diagnosis of the suspicious case, before the next organ (liver) is examined.
• U: “Find similar liver lesions with the characteristics: hyper-intense and/or coarse texture ...”
• S: Our system again displays the search results ranked by the similarity and matching of the medical ontology terms that constrain the semantic search.

4. Dialogue example
Excerpt (Daniel Sonntag, Martin Huber, Manuel Möller et al., Design and Implementation of a Semantic Dialogue System…), 4.3. Speech and Touchscreen Interaction Design (Surface plane):
This plane deals with the logical arrangements of the design elements. In the case of a multimodal dialogue system, the logical arrangement results in a user-system natural dialogue whereby the user input is speech and touch and the system output is generated speech or the generation of SIEs which display windows for images, image regions, or other supported interaction elements. The implemented clinical workflow is best explained by example. Consider a radiologist (R) at his daily work of the clinical reporting process (also cf. section 3.1) with the speech-based semantic dialogue shell (S). The potential application scenario (provided by Siemens AG) includes a radiologist who treats a lymphoma patient; the patient visits the doctor after chemotherapy for a follow-up CT examination.
R: “Show me my patient records, lymphoma cases, for this week.”
S: Shows corresponding patient records.
R: “Open the images, internal organs: lungs, liver, then spleen and colon of this patient (+ pointing gesture (arrow)).”
S: Shows corresponding patient image data according to referral record. The presentation planner of the dialogue system rearranges the semantic interface elements (SIEs). The top-most picture frame, showing the patient information in the header, is interactive; when touching it, special image regions and region annotations are highlighted (two arrows).
R: Switches to the 5th image and clicks on a specific region (automatically determined).
S: The system rearranges the semantic interface elements (SIEs) to signal that the dialogue focus is on regions.
R: “This lymph node here (+ pointing gesture), annotate Hodgkin-Lymphoma.”
S: Annotates the image with RDF annotations (cf. Figure 3, highlighted pathological part) and displays a label for the recognized ICD-10 term.
R: “Find similar lesions with characteristics: hyper-intense and/or coarse texture.”
S: MEDICO displays the search results in the record table (also see first screenshot) ranked by the similarity and match of the medical terms that constrain the semantic search (left) and opens the first hit, Peter Maier (arrow), the record, and his images that correspond to the search. The system rearranges the SIEs for the two patients for a comparison.
R: “Get the findings of this patient”
S: Opens the findings (text) and highlights the medical terms in different groups.

4. Dialogue example
One of the radiologist’s goals is to estimate the effectiveness of the administered medicine. In order to finish the reading / pathology, additional cases have to be taken into account for comparison. We try to find these cases by matching the medical RDF annotations (FMA, RadLex, ICD-10) of …

4. Dialogue modelling
• KEMM: A Knowledge Engineering Methodology in the Medical Domain, Wennerberg et al. 2008
Figure 4. Usability planes and corresponding design issues for implementation.
Defining the users and their needs on the strategic planes is the first step in …

4. Dialogue modelling
• Query Pattern Derivation
• Ontology Identification
• Ontology Modularization and Pruning
• Ontology Customization
• Ontology Alignment
• Reasoning-Based Ontology Enhancement
• Testing and Deployment
P. Wennerberg, S. Zillner, M. Möller, P. Buitelaar, M. Sintek, FOIS 2008, Saarbrücken

4. Dialogue modelling
Data Processing: RadLex & FMA
Corpora: Anatomy Corpus, Radiology Corpus
Steps:
– run all text sections of each corpus through the TnT part-of-speech tagger (Brants, 2000)
– extract all nouns in the corpus
– compute a relevance score (chi-square) for each
– …by comparing anatomy & radiology frequencies respectively with those in the British National Corpus

4. Technical architecture
• Distributed architecture: scalability and use of mobile devices
Figure 2: Architecture of the Dialogue System, where external components, such as automatic speech recognition (ASR), natural language understanding (NLU), and text-to-speech synthesis (TTS), are integrated.

Excerpt from the paper (gaps marked […]):
[…] a central Triple Store (see section 5.2). […] annotations of an image can also be used to […] online resources on the web such as PubMed (www.ncbi.nlm.nih.gov/pubmed) and Clinical[…] (http://clinicaltrials.gov) for similar cases.
MULTIMODAL INTERFACE. […] multimodal query interface implements a […]aware dialogue shell for semantic access to media, their annotations, and additional tex[…]rial. It enhances user experience and us[…] providing multimodal interaction scenarios, […] speech-based interaction with touchscreen instal[…] the health professional.
Medical Dialogue. […] recommendations can support building up […]ying new medical knowledge repositories? […] knowledge engineering methodology (Wennerberg […] 2008) helped us to formalize these require[…]

4. Technical architecture
• Multimodal interface
– Dedicated touch window manager (similar to those seen on tablets)
• Dialogue system
– Middleware
– Communicates in SPARQL with the backend services
– NLU directly provides ontology terms
• Event bus
– Passes messages between the different actors

5. Backend services
Figure 6. Generic Architecture of a Multimodal Dialogue System.
Figure 7. Overall MEDICO Semantic Search Architecture.
• Triple Store
• Semantic Search
• Semantic Navigation

5. Triple store
• RDF: Sesame was selected as the implementation for its simple online deployment and its fast persistence strategy for storage.
• Central system for storing and retrieving information about the medical domain, clinical practice, patient metadata and the semantic annotations of the images

5. Triple store
• At a low level, the RDF data is accessed directly through the SPARQL query language

Excerpt from the paper:
The system also allows us to perform a semantic query expansion based on the information in the medical ontologies. Accordingly, a query for the anatomical concept lung also retrieves images which are not annotated with “lung” itself but parts of the lung. The query expansion technique is implemented in Java and provided as an API. Below we show a SPARQL query example, according to our query model in the semantic search layer in figure 4, which retrieves all images of patient XY annotated with the FMA concept “lung”.

    SELECT ?personInstance ?patientInstance ?imageRegion ?imageURL
    WHERE {
      ?personInstance surname ?var0 .
      FILTER (regex(?var0, "XY", "i")) .
      ?patientInstance referToPerson ?personInstance .
      ?patientInstance participatesStudies ?studyInstance .
      ?seriesInstance containedInStudy ?studyInstance .
      ?seriesInstance containsImage ?mdoImageInstance .
      ?mdoImageInstance referenceFile ?imageURL .
      ?imageRegion hasAnnotation ?imageAnnotation1 .
      ?imageAnnotation1 hasAnatomicalAnnotation ?medicalInstance1 .
      ?medicalInstance1 rdf:type fma:Lung .
      ?imageInstance hasComponent ?imageRegion .
      ?imageInstance hasImageURL ?imageURL .
      ?mdoImageInstance referenceFile ?imageURL .
    }

Note that this query spans across patient metadata (the name, automatically extracted from the image header) and anatomical annotations (manually added …

5. Relationship between the triple store and semantic search
Figure 9. Three Tier Search Architecture.

5. Semantic search
• Manual annotation and the semantic search application use the same RDF repository, simultaneously
• Search using very complex functions
• Query expansion based on the information in the ontologies
• Data manipulation operations through dedicated libraries

5. Semantic navigation
• Semantic navigation of anatomical concepts, available to all actors in the system
• Accessed through an XML-RPC / Java interface
Figure 8. Semantic Navigation Interface Element.
Semantic Navigation shows anatomical concepts in a browser window. This window can be accessed by the dialogue shell through the XML-RPC / Java interface. In this way, additional information relevant to the clinical reporting process can be accessed by the radiologist (Figure 8).

6. Related work
• Aggregation of data with medical images and ontologies: Cancer Biomedical Informatics Grid (https://cabig.nci.nih.gov)
• Additionally using Semantic Web technologies: myGrid (http://www.mygrid.org.uk)
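
Sketch: ontology-based query expansion
The slides state that a query for the anatomical concept lung also returns images annotated only with parts of the lung, and that MEDICO implements this in Java. The Python sketch below is not that implementation. It is a minimal illustration of the idea, assuming that annotations and part-of relations are held as plain subject-predicate-object tuples. The "ex:" concept names, the region identifiers and the helper functions expand_concept and find_regions are hypothetical; only fma:Lung, part_of and hasAnatomicalAnnotation are names that appear in the slides.

```python
from collections import defaultdict

# Hypothetical toy data: (subject, predicate, object) triples.
# The "ex:" concepts and the region names are illustrative only.
TRIPLES = [
    ("ex:UpperLobeOfLeftLung", "part_of", "ex:LeftLung"),
    ("ex:LeftLung", "part_of", "fma:Lung"),
    ("ex:RightLung", "part_of", "fma:Lung"),
    ("ex:LeftVentricle", "part_of", "fma:Heart"),
    ("region1", "hasAnatomicalAnnotation", "ex:UpperLobeOfLeftLung"),
    ("region2", "hasAnatomicalAnnotation", "fma:Lung"),
    ("region3", "hasAnatomicalAnnotation", "ex:LeftVentricle"),
]


def expand_concept(concept, triples):
    """Return the concept plus every concept that is transitively part_of it."""
    parts_of = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "part_of":
            parts_of[obj].add(subject)
    expanded, stack = {concept}, [concept]
    while stack:
        for part in parts_of[stack.pop()]:
            if part not in expanded:
                expanded.add(part)
                stack.append(part)
    return expanded


def find_regions(concept, triples):
    """Find image regions annotated with the concept or any of its parts."""
    wanted = expand_concept(concept, triples)
    return sorted(
        subject
        for subject, predicate, obj in triples
        if predicate == "hasAnatomicalAnnotation" and obj in wanted
    )


print(find_regions("fma:Lung", TRIPLES))  # ['region1', 'region2']
```

In this toy data, region1 is annotated only with a lobe of the left lung, yet it is returned for the query "fma:Lung" because the search set is first widened along part_of. A real deployment would more likely run the expansion against the triple store itself rather than over an in-memory list.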