A Method and System for Automatically Selecting Patient Cohorts from Electronic Health Records using Flexible Search and an Automatic Longitudinal Patient Record formation

IP.com Disclosure Number: IPCOM000241430D
Publication Date: 2015-Apr-25
Document File: 5 page(s) / 129K

Publishing Venue

The IP.com Prior Art Database

Abstract

A method and system is disclosed for automatically selecting one or more patient cohorts from one or more electronic health records using flexible search and automatically forming a longitudinal patient record (LPR).

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 30% of the total text.

Disclosed is a method and system for automatically selecting one or more patient cohorts from one or more electronic health records using flexible search and automatic formation of a longitudinal patient record (LPR). First, the patient data records are acquired, de-identified, and analyzed using advanced text analytics to extract concepts. The concepts are extracted by text mining based on medical vocabularies derived from sample clinical content, with prefix matching for increased flexibility. The results of this analysis are then used to create the LPR incrementally with an appropriate search query formulation, such as, but not limited to, Structured Query Language (SQL). Specific clinical protocols, such as, but not limited to, Health Level-7 (HL7), are employed for ingesting and presenting sensitive patient data, and the HL7 messages are parsed to extract relevant clinical data about the patients. Finally, the one or more patient cohorts are expanded through patient similarity.
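As a rough illustration of incremental LPR formation with an SQL-based cohort query, the following minimal sketch uses an in-memory SQLite database. The table name `lpr_events`, its columns, and the helper functions are assumptions made for this example and are not taken from the disclosure.

```python
import sqlite3


def open_lpr_store():
    """Create an in-memory store; one row per (patient, time, concept) event."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE lpr_events (
            patient_id   TEXT NOT NULL,
            event_time   TEXT NOT NULL,
            concept_type TEXT NOT NULL,
            concept      TEXT NOT NULL,
            UNIQUE (patient_id, event_time, concept_type, concept)
        )
        """
    )
    return conn


def add_events(conn, patient_id, event_time, concepts):
    """Append newly extracted concepts; re-ingested duplicates are ignored."""
    conn.executemany(
        "INSERT OR IGNORE INTO lpr_events VALUES (?, ?, ?, ?)",
        [(patient_id, event_time, ctype, name.lower()) for ctype, name in concepts],
    )
    conn.commit()


def longitudinal_record(conn, patient_id):
    """Return one patient's events in chronological order."""
    return conn.execute(
        "SELECT event_time, concept_type, concept FROM lpr_events "
        "WHERE patient_id = ? ORDER BY event_time",
        (patient_id,),
    ).fetchall()


def select_cohort(conn, required_concepts):
    """Return patients whose record contains every required concept."""
    required = sorted({c.lower() for c in required_concepts})
    if not required:
        return []
    placeholders = ",".join("?" * len(required))
    rows = conn.execute(
        f"SELECT patient_id FROM lpr_events WHERE concept IN ({placeholders}) "
        "GROUP BY patient_id HAVING COUNT(DISTINCT concept) = ? "
        "ORDER BY patient_id",
        required + [len(required)],
    ).fetchall()
    return [row[0] for row in rows]
```

In this sketch, each incoming message adds rows for a patient, so the record grows over time without being rebuilt, and a cohort is simply the set of patients matching a query over the accumulated events.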

The major steps involved in the processing are explained in detail below.


1. Automatic patient data acquisition

In this step, HL7 exchanges are employed to intercept the messages flowing between systems. The HL7 messages are then parsed to extract relevant clinical data about the patients. The HL7 headers indicate the nature of the content captured in the payload. For example, an Admission, Discharge, and Transfer (ADT) message contains demographics, admission, and discharge information, which can be useful in formulating clinical protocol queries that target certain populations by demographics or by length of stay during hospitalization. To form a longitudinal record, the subsets of HL7 message types that capture clinical content are processed to extract relevant patient data.
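The parsing step can be sketched as follows for a pipe-delimited HL7 v2 message. This is a minimal illustration rather than the disclosed implementation: the sample message and its values are invented, the helper names are hypothetical, and the sketch assumes default delimiters, second-precision timestamps, and the usual v2 field positions (MSH-9 message type, PID-3 identifier, PID-7 birth date, PID-8 sex, PV1-44 and PV1-45 admit and discharge times).

```python
from datetime import datetime

# Invented sample ADT^A03 (discharge) message; segments are separated by "\r".
_PV1 = ["PV1", "1", "I"] + [""] * 41 + ["20150105083000", "20150110113000"]
SAMPLE_ADT = "\r".join([
    "MSH|^~\\&|ADT_APP|HOSP_A|COHORT_APP|HOSP_A|20150110120000||ADT^A03|MSG0001|P|2.3",
    "PID|1||123456^^^HOSP_A||DOE^JANE||19600215|F",
    "|".join(_PV1),
])


def parse_hl7(message):
    """Split a message into {segment_id: fields}, keeping the first of each segment."""
    segments = {}
    for line in message.replace("\n", "\r").split("\r"):
        if not line.strip():
            continue
        fields = line.split("|")
        if fields[0] == "MSH":
            # MSH-1 is the field separator itself; re-insert it so that
            # list index == HL7 field number in every segment.
            fields.insert(1, "|")
        segments.setdefault(fields[0], fields)
    return segments


def field(segment, index):
    """Return field `index` of a segment, or "" if it is absent."""
    return segment[index] if index < len(segment) else ""


def extract_adt(message):
    """Pull demographics and stay information out of an ADT message."""
    seg = parse_hl7(message)
    message_type = field(seg.get("MSH", []), 9)  # header says what the payload is
    if not message_type.startswith("ADT"):
        return None
    pid, pv1 = seg.get("PID", []), seg.get("PV1", [])
    record = {
        "message_type": message_type,
        "patient_id": field(pid, 3).split("^")[0],
        "birth_date": field(pid, 7),
        "sex": field(pid, 8),
        "admitted": field(pv1, 44),
        "discharged": field(pv1, 45),
    }
    if record["admitted"] and record["discharged"]:
        fmt = "%Y%m%d%H%M%S"  # assumes second-precision timestamps
        stay = (datetime.strptime(record["discharged"], fmt)
                - datetime.strptime(record["admitted"], fmt))
        record["length_of_stay_days"] = stay.days
    return record
```

Calling `extract_adt(SAMPLE_ADT)` yields a dictionary whose `message_type` and `length_of_stay_days` entries could feed the demographic and length-of-stay queries described above.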

Thereafter, limited de-identification is performed so that the collection of information used for clinical search is de-identified. The de-identification algorithm goes through both the structured and unstructured sections of each HL7 message and takes the corrective action appropriate for each field.
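The disclosure does not spell out the de-identification rules, so the following is only a sketch of the general idea: a field-specific action for structured data and pattern-based scrubbing for free text. The field names, the salt, the regular expressions, and the replacement tokens are all assumptions chosen for illustration.

```python
import hashlib
import re

SALT = "example-salt"  # hypothetical; a real deployment would keep this secret


def pseudonymize(value):
    """Replace an identifier with a stable, non-reversible token."""
    return hashlib.sha256((SALT + value).encode("utf-8")).hexdigest()[:12]


# Structured fields: one corrective action per field (illustrative choices).
STRUCTURED_RULES = {
    "patient_id": pseudonymize,      # keep linkability, hide the real ID
    "name": lambda value: "",        # drop entirely
    "birth_date": lambda value: value[:4],  # keep the year only
}

# Unstructured text: simple patterns for dates and phone numbers.
TEXT_PATTERNS = [
    (re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"), "[DATE]"),
    (re.compile(r"\b\d{3}-\d{3}-\d{4}\b"), "[PHONE]"),
]


def deidentify_structured(record):
    """Apply the matching rule to each field; unlisted fields pass through."""
    return {
        key: STRUCTURED_RULES.get(key, lambda value: value)(value)
        for key, value in record.items()
    }


def deidentify_text(text, known_names=()):
    """Mask dates, phone numbers, and any names known from structured fields."""
    for pattern, token in TEXT_PATTERNS:
        text = pattern.sub(token, text)
    for name in known_names:
        if name:
            text = re.sub(r"\b" + re.escape(name) + r"\b", "[NAME]",
                          text, flags=re.IGNORECASE)
    return text
```

Because the patient identifier is hashed rather than removed, records from different messages can still be linked into one longitudinal record after de-identification.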


2. Extraction of clinical concepts from unstructured reports

In this step, one or more clinical concepts such as, but not limited to, medications and procedures are extracted from reports using vocabulary-driven concept extraction. First, a list of vocabulary words is derived for each concept to be detected in textual reports. Examples of concepts include, but need not be limited to, disease names, diagnoses, symptoms or conditions, measurements, exams, procedures and medication names. The vocabularies can either be derived from standard vocabularies or be generated from reports through data mining. The reports are then pre-processed to isolate th...
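Vocabulary-driven extraction with prefix matching might look roughly like the sketch below. The vocabulary entries, the `MIN_PREFIX` threshold, and the matching rule (a report word matches a vocabulary term when their leading characters agree for at least the threshold length) are assumptions for illustration; the abbreviated text does not define the exact rule, and multi-word terms are not handled here.

```python
import re

# Hypothetical vocabulary: concept type -> vocabulary terms.
VOCABULARY = {
    "medication": ["metformin", "atorvastatin"],
    "procedure": ["echocardiography", "angioplasty"],
}
MIN_PREFIX = 5  # assumed number of leading characters that must agree


def common_prefix_len(a, b):
    """Count how many leading characters two words share."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def extract_concepts(report):
    """Return sorted (concept_type, term) pairs detected in a report."""
    found = set()
    for token in re.findall(r"[a-z]+", report.lower()):
        for concept_type, terms in VOCABULARY.items():
            for term in terms:
                needed = min(MIN_PREFIX, len(term))
                if common_prefix_len(token, term) >= needed:
                    found.add((concept_type, term))
    return sorted(found)
```

With this rule, a report that mentions "echocardiogram" is still mapped to the vocabulary term "echocardiography", which is the kind of flexibility prefix matching is meant to provide.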