Doctors write information in free text fields. They are rich in detail, but poorly arranged for a machine. Photo credit: Logoboom / Shutterstock
Medical records are a rich source of health information. Combined, the information they contain can help researchers better understand diseases, including COVID-19, and treat them more effectively. To unlock this abundant resource, however, researchers must first be able to read it.
We may have moved on from the days of handwritten medical notes, but the information recorded in modern electronic health records can be just as difficult to access and interpret. It's an old joke that doctors' handwriting is illegible, but it turns out that their typing isn't much better.
The sheer volume of information in health records is breathtaking. Every day, health care workers in a typical NHS hospital generate so much text that it would take a person an age just to search through it, let alone read it. Using computers to analyze all of this data is an obvious solution, but it is far from easy. What makes perfect sense to a human can be very difficult for a computer to understand.
Our team uses artificial intelligence to bridge this gap. By teaching computers how to understand human doctors' notes, we hope they will uncover insights into how to fight COVID-19 by finding patterns across many thousands of patient records.
Why health records are difficult
A significant part of a health record consists of free text that is written in narrative form like an e-mail. This includes the patient's symptoms, a history of their illness, and information about any pre-existing medical conditions and medications they are taking. It may also include relevant information about family members and lifestyle. And because this text was entered by busy doctors, there are also abbreviations, inaccuracies, and typos.
This type of information is known as unstructured data. For example, a patient record might say: "Mrs. Smith is a 65-year-old woman with atrial fibrillation and had CVA in March. She had a history of #NOF and OA. Family history of breast cancer. She has been prescribed apixaban. No history of bleeding."
This very compact paragraph contains a great deal of data on Mrs. Smith. Another human reading the notes would know what information is important and extract it in seconds, but a computer would find the task extremely difficult.
Teaching machines to read
To solve this problem, we use what is known as Natural Language Processing (NLP). Based on machine learning and AI technology, NLP algorithms translate the language used in free text into a standardized, structured set of medical terms that can be analyzed by a computer.
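As a rough illustration of what "translating free text into structured terms" means, here is a minimal Python sketch that expands the abbreviations in the Mrs. Smith note. It is a hand-written lookup, not the machine-learning approach described in this article, and the `ABBREVIATIONS` table and `extract_terms` function are hypothetical names invented for the example.

```python
import re
from typing import List

# Hypothetical, hand-written lexicon for illustration only. Real systems map
# text to large standardized clinical terminologies, not a three-entry table.
ABBREVIATIONS = {
    "CVA": "cerebrovascular accident (stroke)",
    "#NOF": "fractured neck of femur",
    "OA": "osteoarthritis",
}

def extract_terms(note: str) -> List[str]:
    """Return the expanded terms found in the note, in order of appearance."""
    found = []
    for abbreviation, meaning in ABBREVIATIONS.items():
        # Lookarounds stand in for word boundaries so that "#NOF" also matches
        # and "OA" is not picked up inside a longer word.
        pattern = r"(?<!\w)" + re.escape(abbreviation) + r"(?!\w)"
        for match in re.finditer(pattern, note):
            found.append((match.start(), meaning))
    return [meaning for _, meaning in sorted(found)]

note = (
    "Mrs. Smith is a 65-year-old woman with atrial fibrillation and had CVA "
    "in March. She had a history of #NOF and OA."
)
print(extract_terms(note))
```

A fixed lookup like this breaks down quickly, because the same abbreviation can mean different things in different specialties, which is one reason learned models are used instead.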
These algorithms are extremely complex. They need to understand context, long strings of words, and medical concepts, distinguish current events from historical ones, identify family relationships, and much more. We teach them to do this by first feeding them existing written text, in this case publicly available English text from the internet, so that they can learn the structure and meaning of the language. We then use real medical records for further improvement and testing.
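To see why context matters, consider a deliberately simple rule-based sketch in Python. It labels a sentence as family history, negated, historical, or current using a few cue phrases. This is a stand-in for illustration, not the learned models described above, and the cue lists and the `classify_context` function are assumptions made up for this example.

```python
# Hypothetical cue phrases chosen only for this example; a learned model
# infers this kind of context from data rather than from fixed lists.
FAMILY_CUES = ("family history",)
NEGATION_CUES = ("no history of", "no evidence of", "denies")
HISTORY_CUES = ("history of", "previously")

def classify_context(sentence: str) -> str:
    """Label one sentence as 'family', 'negated', 'historical' or 'current'."""
    text = sentence.lower()
    # Order matters: "family history of" and "no history of" both contain
    # "history of", so the more specific cues are checked first.
    if any(cue in text for cue in FAMILY_CUES):
        return "family"
    if any(cue in text for cue in NEGATION_CUES):
        return "negated"
    if any(cue in text for cue in HISTORY_CUES):
        return "historical"
    return "current"

for sentence in (
    "She had a history of #NOF and OA.",
    "Family history of breast cancer.",
    "She has been prescribed apixaban.",
    "No history of bleeding.",
):
    print(classify_context(sentence), "|", sentence)
```

Without this step, a naive search for "breast cancer" or "bleeding" would wrongly attribute both conditions to the patient.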
Using NLP algorithms to analyze and extract data from health records has great potential to transform health care. Much of what is recorded as narrative text in a patient's notes is never seen again. This could include important information, such as the early warning signs of serious illnesses like cancer or stroke. The ability to automatically analyze and flag important issues could help improve care and avoid delays in diagnosis and treatment.
Finding ways to fight COVID-19
Having developed these tools using health records, we are now applying them to identify patterns relevant to the pandemic. For example, we recently used them to find out whether drugs commonly prescribed to treat high blood pressure, diabetes, and other conditions, called angiotensin-converting enzyme inhibitors (ACEIs) and angiotensin receptor blockers (ARBs), increase the likelihood of severe illness with COVID-19.
The virus that causes COVID-19 infects cells by attaching to a molecule on the cell surface called ACE2. Both ACEIs and ARBs are thought to increase the amount of ACE2 on the surface of cells, raising concerns that these drugs could increase people's risk from the virus.
However, the information needed to answer this question (how many seriously ill COVID-19 patients were taking these drugs) can be recorded in their medical records either as structured prescriptions or as free text. The free text must be converted into a computer-searchable format before a machine can answer the question.
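The point about two sources can be sketched in a few lines of Python: a patient counts as exposed if a drug name appears either in the structured prescription list or in the free-text note. The short drug lists and the `takes_acei_or_arb` function are assumptions for illustration, not the tooling used in the study.

```python
import re
from typing import List

# Short illustrative lists; real formularies contain many more names,
# including brand names and common misspellings.
ACEIS = {"ramipril", "lisinopril", "enalapril", "perindopril"}
ARBS = {"losartan", "candesartan", "valsartan", "irbesartan"}
DRUGS = ACEIS | ARBS

def takes_acei_or_arb(prescriptions: List[str], free_text: str) -> bool:
    """True if a listed drug appears in the prescriptions or in the note."""
    combined = " ".join(prescriptions) + " " + free_text
    # Split into lowercase alphabetic words so "Ramipril 5mg" yields "ramipril".
    words = set(re.findall(r"[a-z]+", combined.lower()))
    return bool(DRUGS & words)

print(takes_acei_or_arb(["Ramipril 5mg"], ""))
print(takes_acei_or_arb([], "Pt on losartan for HTN"))
print(takes_acei_or_arb(["apixaban"], "No history of bleeding."))
```

A plain keyword match like this would also count "ramipril stopped last year" as current use, which is exactly the kind of mistake that context-aware NLP is meant to avoid.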
With our NLP tools, we were able to analyze the anonymized records of 1,200 COVID-19 patients and compare clinical outcomes according to whether or not patients were taking these drugs. Reassuringly, we found that people who were prescribed ACEIs or ARBs were no more likely to become seriously ill than those who were not taking the medication.
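Once each record has been reduced to structured fields, the comparison itself is simple arithmetic. The Python sketch below computes the share of severely ill patients in each group. The eight records are invented purely to make the code runnable and are not data from the study, and `severe_rate` is a hypothetical helper; a real analysis would normally also adjust for factors such as age and existing conditions.

```python
from typing import Dict, List

def severe_rate(patients: List[Dict[str, bool]], on_drug: bool) -> float:
    """Fraction of patients in the chosen exposure group with severe illness."""
    group = [p for p in patients if p["on_drug"] == on_drug]
    if not group:
        raise ValueError("no patients in this group")
    return sum(p["severe"] for p in group) / len(group)

# Invented toy records, NOT study data: four patients per group.
patients = [
    {"on_drug": True, "severe": True},
    {"on_drug": True, "severe": False},
    {"on_drug": True, "severe": False},
    {"on_drug": True, "severe": False},
    {"on_drug": False, "severe": True},
    {"on_drug": False, "severe": False},
    {"on_drug": False, "severe": False},
    {"on_drug": False, "severe": False},
]

print("on ACEI/ARB:", severe_rate(patients, on_drug=True))
print("not on ACEI/ARB:", severe_rate(patients, on_drug=False))
```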
We're now expanding the use of these tools to find out who is most at risk from COVID-19. For example, we used them to examine the links between ethnicity, pre-existing health conditions, and COVID-19. This has revealed some notable things: if you are black or of mixed ethnicity, you are more likely to be hospitalized with the disease, and Asian patients in hospital are at greater risk of being admitted to the intensive care unit or dying of COVID-19.
We have also used these tools to assess the early warning scores that predict which patients admitted to hospital are most likely to become seriously ill, and to suggest what additional measures could be used to improve those scores. We are also using the technology to predict upcoming spikes in COVID-19 cases based on patients' symptoms as recorded by doctors.
This article is republished from The Conversation under a Creative Commons license.
Teaching Computers to Read Health Records Helps Combat COVID-19 (2020, October 19), accessed October 19, 2020.