Machine learning has found a place in a variety of healthcare applications, including medical imaging and diagnosis, natural language processing (NLP), and deep learning in clinical genomics. NLP in particular is expected to exhibit significant growth in the industry, owing to the large amount of unstructured narrative text generated in the healthcare sector. This wealth of data presents a significant opportunity for the development of NLP solutions.
NLP is a field of artificial intelligence that deals with the interaction between computers and humans using natural language. The ultimate goal of NLP is to read, understand and make sense of human languages in a way that is valuable. Due to the complexity of natural languages, this is understandably a difficult task: a comprehensive understanding of any language requires understanding how words and concepts are connected to deliver a message.
The main techniques used for extracting information from human language are syntactic analysis and semantic analysis. Syntactic analysis is based on sentence syntax and is used to assess whether language adheres to grammatical rules. Semantic analysis refers to sentence semantics – the meaning conveyed by a text. Semantic analysis is understandably the more difficult of the two, since it involves interpreting abstract language concepts rather than applying fixed grammatical rules.
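To give a flavour of the syntactic side, the toy sketch below tokenises a sentence and checks it against a single hand-written grammatical rule ("a" versus "an" before a vowel). Real syntactic analysers parse entire grammars; this miniature rule, and the example sentences, are invented purely for illustration.

```python
import re

def tokenize(sentence: str) -> list:
    """Split a sentence into lowercase word tokens (a first syntactic step)."""
    return re.findall(r"[a-zA-Z']+", sentence.lower())

def article_agreement_ok(sentence: str) -> bool:
    """Check one toy grammatical rule: 'a' before consonants, 'an' before vowels.
    (Deliberately naive - e.g. 'an hour' would be wrongly flagged.)"""
    tokens = tokenize(sentence)
    vowels = set("aeiou")
    for word, following in zip(tokens, tokens[1:]):
        if word == "a" and following[0] in vowels:
            return False
        if word == "an" and following[0] not in vowels:
            return False
    return True

print(article_agreement_ok("The patient has a ulcer"))   # False
print(article_agreement_ok("The patient has an ulcer"))  # True
```

Semantic analysis, by contrast, cannot be reduced to checks like this; it requires modelling what the words jointly mean, which is why it is the harder problem.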
Everyday examples of NLP include smart assistants such as Siri and Cortana, predictive text like autocorrect or autocomplete, and language translation such as Google Translate. All of these examples involve processing human language, whether in text or audio format, to extract valuable information and perform a specific operation with that information. Can you think of how these techniques might be applied in the healthcare sector?
A survey of 4 720 doctors working in direct patient care in the USA found that the average doctor spends 8.7 hours a week on administration, including the transcription of patient notes into electronic health record (EHR) data. The application of speech recognition in healthcare has allowed professionals to transcribe notes quickly for EHR data entry. Accurate, medically tailored note transcription frees up medical professionals' time, allowing them to spend more of it with patients and increasing physician efficiency.
A speech recognition system trained on over 270 hours of medical speech data and 30 million tokens of text was capable of transcribing speech with a word error rate below 16% in a clinical use case – an error rate similar to that of human transcriptionists. The application of deep learning algorithms to these systems can result in significant accuracy and speed improvements.
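The word error rate quoted above is the standard metric for speech recognition quality: the number of word substitutions, insertions and deletions needed to turn the transcript into the reference, divided by the reference word count. A minimal pure-Python sketch (the example sentences are invented):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + insertions + deletions) / reference length,
    computed with a standard Levenshtein edit-distance dynamic programme."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(ref)][len(hyp)] / len(ref)

# One substituted word in a five-word reference gives WER = 0.2 (20%).
print(word_error_rate("patient denies chest pain today",
                      "patient denies chest pane today"))  # 0.2
```

A WER below 16% therefore means fewer than 16 word-level errors per 100 reference words.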
Medical triage refers to the process of prioritising patients' need for care in a hospital or pre-hospital setting. Given the heavy workload doctors find themselves under, it is understandable that they need to prioritise their patients by severity. Several companies, such as Babylon Health, HealthTap, and Your.MD, have developed 'AI doctors'. These services provide health advice to patients with common symptoms, freeing up primary care access for more complex cases. Babylon in particular has reported diagnostic accuracy comparable to that of human doctors.
These services invariably utilise NLP to decipher patient inputs into useful information that can be fed into other machine learning models for diagnosis. These systems aim to serve the needs of patients and doctors alike. Patients who would otherwise require doctors' attention for low-priority care can utilise these services on demand. This ultimately reduces the load on doctors, allowing them to focus more time on high-priority patients.
The electronic storage of health records in EHR systems has become the standard in many hospitals. The collection of health data in a standardised format will aid in increasing hospital efficiency, improving patient outcomes, and reducing physician burnout. A large step in this process, however, is the transfer of existing physical documentation into electronic form.
Much as NLP supports speech recognition for clinical documentation, it can also support this digitisation effort, allowing medical information to be rapidly scanned and integrated into an electronic database. Once digitised, this data can be used for various other purposes: by physicians in treating patients, for research, or for the development of other machine learning models that can be applied to a clinical environment.
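The core of such digitisation is turning free text into structured fields. The sketch below illustrates the simplest rule-based form of this, extracting vital signs from a note with regular expressions; the note format and field names are hypothetical, and production systems use far more robust NLP pipelines.

```python
import re

# Hypothetical free-text note, invented for illustration only.
NOTE = "Pt: Jane Doe. BP 128/82 mmHg, HR 76 bpm. Dx: hypertension."

def extract_vitals(note: str) -> dict:
    """Pull blood pressure and heart rate out of a free-text note
    using simple pattern rules (a toy stand-in for a real NLP pipeline)."""
    fields = {}
    bp = re.search(r"BP\s+(\d{2,3})/(\d{2,3})", note)
    if bp:
        fields["systolic"] = int(bp.group(1))
        fields["diastolic"] = int(bp.group(2))
    hr = re.search(r"HR\s+(\d{2,3})", note)
    if hr:
        fields["heart_rate"] = int(hr.group(1))
    return fields

print(extract_vitals(NOTE))
# {'systolic': 128, 'diastolic': 82, 'heart_rate': 76}
```

The appeal of structured output like this is that it can be queried, aggregated across patients, and fed directly into downstream models.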
The previous examples mainly discussed the application of NLP to assist with administrative tasks in the health sector. However, NLP also has a significant capacity to improve medicine and medical research. Various studies involving NLP for the prediction of specific conditions have been performed with great results.
A study was performed in 2020 to investigate the capacity of NLP to identify pulmonary embolism (PE) events compared to the currently used ICD-10 codes. 1000 randomly selected healthcare encounters with a CT pulmonary angiogram were reviewed and reported on by two independent observers. NLP, ICD-10 codes, and manual report review were performed on the reports and the results compared. PE events were found in 13% of reports. NLP outperformed ICD-10 codes in both sensitivity (96% compared to 92.9%) and specificity (97.7% compared to 91%).
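Sensitivity and specificity, the two metrics compared above, both come straight from the confusion matrix. A generic sketch of their definitions follows; the example counts are invented for illustration and are not the study's data.

```python
def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: of all real PE events, the fraction detected."""
    return tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    """True negative rate: of all non-events, the fraction correctly cleared."""
    return tn / (tn + fp)

# Illustrative counts only: 130 true events, 870 non-events.
print(round(sensitivity(tp=125, fn=5), 3))   # 0.962
print(round(specificity(tn=850, fp=20), 3))  # 0.977
```

High sensitivity matters most when missing a PE is dangerous; high specificity keeps false alarms from flooding clinicians.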
Another study aimed to use NLP to predict the outcome of ovarian cancer surgery. Specifically, it examined whether NLP (with machine learning) applied to preoperative CT scan reports improves the ability to predict postoperative complication and hospital readmission when compared to discrete data predictors alone. The results showed that discrete features alone predicted postoperative readmission with an area under the curve (AUC) of 0.56. This improved to 0.70 with the addition of NLP of the preoperative CT scan reports.
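The AUC values above can be read probabilistically: the AUC is the chance that a randomly chosen readmitted patient received a higher risk score than a randomly chosen non-readmitted patient (ties counting half). A small sketch of that definition, using invented scores and labels rather than any study data:

```python
def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs that the model
    ranks correctly (ties count as half a correct ranking)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Perfect separation of the classes gives 1.0 ...
print(auc([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]))  # 1.0
# ... while a model that ranks half the pairs wrongly sits at 0.5 (chance).
print(auc([0.9, 0.2, 0.8, 0.3], [1, 1, 0, 0]))  # 0.5
```

On this scale, 0.56 is barely better than chance, while 0.70 represents a genuinely useful, if imperfect, predictor.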
A study applied NLP as a supplement to ICD-9 codes and laboratory values to better define and risk-stratify patients with cirrhosis. NLP was applied to 5 343 primary care patients with ICD-9 codes indicating chronic liver disease. 168 of these were manually reviewed at random as a benchmark. The algorithm performed with a positive predictive value (PPV), negative predictive value (NPV), sensitivity, and specificity of 91.78%, 96.84%, 95.71%, and 93.88% respectively. The study states that the NLP component of the algorithm was the most important. This component alone achieved a PPV, NPV, sensitivity, and specificity of 98.44%, 93.27%, 90.00%, and 98.98% respectively.
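PPV and NPV complement sensitivity and specificity: rather than asking how well real cases are detected, they ask how trustworthy the model's predictions are. A generic sketch of the two definitions (the example counts are invented, not the study's data):

```python
def ppv(tp: int, fp: int) -> float:
    """Positive predictive value: of all positive predictions,
    the fraction that are truly positive."""
    return tp / (tp + fp)

def npv(tn: int, fn: int) -> float:
    """Negative predictive value: of all negative predictions,
    the fraction that are truly negative."""
    return tn / (tn + fn)

# Illustrative counts only.
print(round(ppv(tp=90, fp=8), 3))  # 0.918
print(round(npv(tn=92, fn=3), 3))  # 0.968
```

Unlike sensitivity and specificity, PPV and NPV depend on how common the condition is in the screened population, which is why studies typically report all four together.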
Although NLP offers a significant amount of potential development in the health sector, this does not come without its challenges.
Machine learning models, although objective, are a product of the data used for their training. As a result, biases in training data result in biases in machine learning models, and NLP models are no exception. These biases arise when training sets are not fully representative of the population of interest, contain missing or misclassified data, or suffer from measurement error. Improving population representation in EHR data will help to reduce model bias and improve model performance in general.
The training of an NLP model involves the definition of outcomes against which the model can be validated. Once the outcomes are set, the model is tested on its ability to predict those specific outcomes. This is powerful when specific outcomes can be defined, but it is clear that this has limitations for the prediction of long-term risk factors associated with chronic diseases.
Many of the NLP applications discussed above involve training the model on EHR data. In order to ensure model accuracy, this EHR data must be as consistent as possible. However, due to the nature of the free text used to describe patient interactions and assessments, a large amount of variation can exist, particularly between different medical specialisations and professionals. The standardisation of these reports will assist with model development and accuracy. SNOMED CT has been adopted as a standard vocabulary among members of the NHS. The resultant challenge, however, is allowing doctors to describe patient subtleties and intricacies while using a highly structured vocabulary.
NLP has the capacity to make a significant impact in the healthcare sector. It can reduce the amount of time that doctors spend on administrative tasks, reduce the load of low-risk patients on doctors, and assist with medical research. However, this potential for significant change does not come without its challenges. The development of these models is subject to model bias, narrowly defined outcomes, and raw data inconsistencies. It is the responsibility of those involved in model development to take these factors into account, to ensure that these models apply to whole populations with reasonable accuracy.