2017 BioMedical and Health Informatics Workshop

Projects

Engineering the patient-provider experience

It is essential to design technologies and systems that promote appropriate interactions between physicians and patients. This study explored how physicians interact with Electronic Health Records (EHRs) to understand the qualities of the interaction between the physician and the EHR that may contribute to positive physician-patient interactions.

Revolutionizing medicine with machine learning

Machine learning is on the cusp of revolutionizing medical diagnosis. Diagnostic applications of machine learning are rapidly moving from the theoretical to the real world. The transformational potential of these applications is hard to overstate, ranging from an at-home tool for early detection to an instant “second opinion” for a complex diagnostic case. Machine learning as a diagnostic tool is expected to generate significant efficiencies and cost savings for patients, doctors, and hospitals and, most importantly, to save lives.

In a quest to build more trustworthy Computer-Aided Diagnosis (CAD) systems for lung cancer, the CDM Medical Informatics Lab and the Imaging Institute at the University of Chicago have been collaborating for over a decade to build the next-generation CAD system with advanced imaging analytics and reasoning capabilities that can assist in the clinical decision-making process. The collaboration involves three stages of research: 1) predictive modeling for high-level diagnostic interpretation derived from low-level image data, 2) learning the human visual perception of similarity using low-level image features and expert-in-the-loop feedback, and 3) evaluating the effects of smart capabilities on traditional CAD systems and medical experts' performance.

I3RIS: Interactive, Iterative, Integrated Radiology Image Search

Advances in medical imaging technologies have generated billions of images that are digitally stored and indexed in data repositories worldwide. The search mechanisms and query tools currently used to access these images in clinical practice are text-based only and are not sophisticated enough to handle the types of queries that clinicians need. Leveraging the richness of the medical data, the long-term objective of this interdisciplinary effort between DePaul University and the University of Chicago is to provide the most useful information, the best images, and the most relevant data sources to clinicians at the point of care.

Our specific goals are to design, develop, and evaluate a hybrid search engine that unlocks valuable information from onsite and online radiology data sources (in-house proprietary teaching files and publicly available online peer-reviewed teaching files, radiology journals, and imaging-related textbooks) to provide radiologists with the most relevant information at the time of patient care. Our central hypothesis is that a search mechanism that maps naturally from the user’s limited internal memory of observed cases to the wealth of examples available onsite and online would allow clinicians to make faster, more confident, and more accurate diagnoses by reducing the error caused by the limits of human memory. To test the central hypothesis, we propose to 1) create a hybrid text and image distributed database by integrating radiology teaching files, textbooks, and journals, 2) extract knowledge from the integrated data sources to augment medical decision making, and 3) develop a domain-specific interactive user interface with iterative query refinement.

Comparison of NLP vendors mapping CT codes from HIE sites to LOINC

We are mapping CT test names from sites in Healthix, a health information exchange (HIE) serving the New York metropolitan region, to LOINC® to support a system that will alert providers at the point of order entry to prior CT exams performed anywhere across the HIE. In this study we compared manual mapping with two commercially available natural language processing (NLP) tools. Manually mapping CT codes can be laborious, requiring effort and domain knowledge. LOINC’s existing mapping tool, RELMA®, can accelerate the process by providing a list of the best-matching LOINC codes for each local code, but it maps only one local code at a time and requires manual oversight. We sought a more automated method that uses these two NLP tools to generate a list of local codes mapped to LOINC that can be reviewed in batches rather than one code at a time.

Phylogenetic trees through compression

A phylogenetic tree on a set of species is a hypothesis about how genetically close the various species are to each other. The data used to build it are DNA/RNA sequences. The methods biologists traditionally use are almost always statistically based and computationally expensive. Our work instead determines the distance between sequences using normalized compression distance, a technique based on the theoretical notion of Kolmogorov complexity. Because the Kolmogorov complexity of a string is not computable, we approximate it with a compression algorithm (e.g., Huffman, Lempel-Ziv). One outcome is that our approach is much less expensive computationally. One current goal of the project is to investigate how well this approach works with some of the many recent compression algorithms designed specifically for DNA/RNA sequences. We do this by generating trees with our method from sets of sequences for which there are published trees generated using the methods traditionally used in biology. Another goal is to scale up the technique. We have applied it to sets of 20 to 30 species; we would like to apply it to sets at least an order of magnitude larger.
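As a rough illustration of the distance computation described above, the following minimal sketch uses the standard definition NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(s) is the compressed size of s. It assumes Python's general-purpose zlib compressor (a Lempel-Ziv-based method) as a stand-in; the project's actual compressors, including the DNA/RNA-specific ones, are not shown here, and the function names are hypothetical.

```python
import zlib
from itertools import combinations


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed input (assumed stand-in compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)  # compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


def distance_matrix(sequences: dict[str, bytes]) -> dict[tuple[str, str], float]:
    """NCD for every unordered pair of named sequences.

    Keys are (name_a, name_b) pairs in the input's insertion order.
    """
    return {
        (a, b): ncd(sequences[a], sequences[b])
        for a, b in combinations(sequences, 2)
    }
```

The resulting pairwise distances could then be passed to any distance-based tree-building method. Note that zlib only finds repeats within a 32 KB window, so for long genomic sequences a compressor with a larger context would likely be needed.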

Functional neural mapping for behavior modeling using big data computing

A major goal in neuroscience research is to understand behavior at the level of neural networks. While many studies have attempted to tackle this goal, either their resolution does not reach the single-neuron level or their scope is not broad enough to make a concrete connection between behavior and neural networks. Caenorhabditis elegans offers clear advantages for overcoming both of these challenges because of its simple nervous system and completely deciphered anatomical neural map. Moreover, C. elegans exhibits behaviors found in higher organisms, including food search behavior. In this interdisciplinary collaborative project between DePaul University and Rosalind Franklin University Medical School, we will use C. elegans to build functional networks of interneurons for food search behavior.

We propose to perform in-depth research and develop new, powerful, and scalable image processing, indexing, and data mining methods for efficient and effective analysis-based mapping of neural networks to locomotory search behaviors. Our proposed study will work on neuron-ablated C. elegans image datasets and focus on (1) extracting representations of movement characteristics, (2) discovering and indexing behavior patterns in large sequential image data, (3) modeling search behavior similarity based on the discovered patterns, and (4) learning functional neural networks from combinations of behavioral models. The study will generate data in the petabyte range, making it crucial to employ cutting-edge big data computing techniques on advanced large-scale distributed systems to keep the study tractable.