The Ethics of Machine Learning in Medicine

A high-level discussion of the ethics surrounding the inclusion of AI/ML in medicine. This includes informed consent, safety, bias, and data privacy concerns.

Artificial intelligence (AI) and machine learning (ML) are becoming ever more ingrained in our day-to-day lives. These algorithms power targeted advertising, GPS navigation systems like Google Maps, and voice assistants such as Alexa and Siri. It is no surprise that machine learning algorithms are also finding their way into the healthcare sector. This integration has generated a significant amount of contention, specifically surrounding the ethics of using computers in a deeply ‘human’ industry.

Applications of ML in Medicine

ML algorithms have a wide range of applications in the healthcare sector. These include, but are not limited to:

  1. The analysis of medical research papers using natural language processing (NLP) to extract meaningful information regarding diagnosis and treatment
  2. The automation of administrative tasks, such as the transcription of medical reports, also using NLP
  3. Image recognition applied to the processing and interpretation of radiological scans
  4. Robotic surgery to assist surgeons with operations
  5. Assistive robots that can serve as companions for the elderly or disabled
  6. Genome sequencing

With each of the examples provided above, it is important to remember that a large amount of personal data is required for the training of these algorithms. The generation, acquisition, storage, and utilisation of this data as well as the eventual application of the resulting machine learning models all have associated ethical concerns.

Ethics and Machine Learning

There are four main ethical challenges pertaining to the implementation of AI in healthcare: (1) informed consent, (2) safety and transparency, (3) algorithmic fairness and biases, and (4) data privacy.

Informed Consent

What is informed consent?

The ethics surrounding informed consent will be one of the most immediate challenges pertaining to the integration of AI into clinical practice. Informed consent is the process of informing the patient about the proposed test, treatment, or procedure. This includes a discussion of the benefits and risks as well as alternative options. This knowledge allows the patient to make an informed decision to consent or not consent to the plan recommended by the medical professional.

The application of AI or ML technologies to medical procedures adds another layer of complexity to this process, because the medical professional must convey more information to the patient. The patient must then weigh this information in their decision-making process. The ethical considerations regarding AI/ML and informed consent concern how much medical professionals are expected to know about these systems and how much of that they are expected to share with the patient.

Informed consent pertaining to ethics of AI/ML

For medical professionals to obtain genuinely informed consent, they need to be knowledgeable about the operation of the AI/ML models that they are using. This is a difficult task owing to the complexity and black-box nature of these algorithms. However, professionals should be able to:

  • Provide a basic explanation of how the AI/ML system works
  • Explain their experience using this system
  • Describe the risks vs benefits of using the AI technology as a supplement instead of only human input
  • Distinguish the human and machine roles and responsibilities in diagnosis, treatment, and procedures for the patient
  • Detail the safeguards that have been implemented
  • Explain any issues relating to data confidentiality and privacy

“For an informed consent process to proceed appropriately, it requires physicians to be sufficiently knowledgeable to explain to patients how an AI device works” - Schiff et al.

The points above illustrate that the onus falls on the medical professional to become as well informed as possible on new technologies. This will improve their understanding of the operation, risks, and benefits of the AI/ML solutions that they interact with. This does not, however, absolve data scientists and machine learning experts from providing models and systems that are transparent enough to allow for this understanding. The impact on patient informed consent must therefore be deeply considered in the development of these systems, from the data generation stage up to model deployment.

Safety and Transparency


The safety of the implementation of these models is obviously of utmost importance. These AI/ML models will be assisting medical professionals in making decisions that will affect people's lives. Therefore, it is the responsibility of AI developers to ensure that models are as safe and reliable as possible. Developers must also ensure that there is a level of transparency that makes the model’s shortcomings known. This goes hand-in-hand with the ethics of informed consent. It is the responsibility of the system developer to ensure that the healthcare professional understands the system. Thereafter, the medical professional will be able to adequately inform the patient.

System developers can ensure safety in the following ways. Firstly, by being cognizant of the reliability and validity of datasets used in development. AI/ML stakeholders must take care to validate the quality of their input data. The quality of this data has a significant impact on the performance of the resulting model. The completeness, correctness, and appropriateness of the data used for model training must be evaluated in the context of the problem that is being solved.
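As a rough illustration of what evaluating completeness and correctness might look like in practice, the sketch below audits a list of patient records. The field names (`age`, `sex`, `diagnosis`) and the plausible age range are hypothetical assumptions made for this example, not a clinical standard, and a range check is only a crude proxy for correctness. Appropriateness for the problem at hand still requires human judgement.

```python
from typing import Any

# Hypothetical schema: these field names and ranges are illustrative assumptions.
REQUIRED_FIELDS = ("age", "sex", "diagnosis")
VALID_RANGES = {"age": (0, 120)}


def audit_records(records: list[dict[str, Any]]) -> dict[str, float]:
    """Return the fraction of records that are complete and within range."""
    total = len(records)
    if total == 0:
        return {"complete": 0.0, "in_range": 0.0}

    complete = 0
    in_range = 0
    for record in records:
        # Completeness: every required field is present and not None.
        if all(record.get(field) is not None for field in REQUIRED_FIELDS):
            complete += 1

        # Correctness (crude proxy): checked values fall inside a plausible range.
        plausible = True
        for field, (low, high) in VALID_RANGES.items():
            value = record.get(field)
            if value is None or not (low <= value <= high):
                plausible = False
        if plausible:
            in_range += 1

    return {"complete": complete / total, "in_range": in_range / total}
```

A report like this could be shared alongside the model so that stakeholders can see how much of the training data passed each check.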

A public example of how poor-quality input data can result in a poorly performing medical AI/ML model is IBM’s Watson for Oncology. It was designed as a tool to assist oncologists with the recommendation of cancer treatment options for patients. The system reportedly provided unsafe and incorrect recommendations for cancer treatment during testing. This was attributed to the system being trained on ‘synthetic’ cancer cases created by doctors at the Memorial Sloan Kettering Cancer Center rather than on real patient data.

While the data used for model training has an impact on model safety, the ‘black-box’ nature of ML and AI models has led to significant concern regarding transparency. The detailed inner workings of ML/AI models are typically opaque, even to their developers. As such, the rationale behind their decisions cannot be fully evaluated, which can reduce their trustworthiness. The operation of these models must therefore be explained as far as possible to give users the confidence to accept these technologies.

Furthermore, it is the responsibility of developers to educate end-users on any shortcomings or limitations that might be present in the resulting model. For example, the developers of Watson for Oncology could have reported on the type of data used for model training. This would have provided stakeholders with the necessary context to discern why the model was not performing as expected.

There are, however, caveats that exist with providing absolute transparency on model development, operation, and pitfalls. These include the protection of investments and intellectual property as well as the protection of data and cybersecurity. A balance must, therefore, exist where sufficient transparency is provided to ensure safety in operation without compromising on security – a difficult feat to accomplish.

Algorithmic Fairness and Biases

Stemming from safety and transparency is the concern regarding the ethics of model fairness and bias. The capacity for AI to globalise healthcare, rehumanise medicine, and improve healthcare in lower-income settings is again determined by the data used for model development. There is unfortunately significant scope for bias, and thus discrimination, to be worked into these ML solutions. This could be the result of a wide range of factors, including non-representative data being used for training, how data scientists process the data, and the context in which AI is used.

Gender imbalance

Simply put, algorithms that are trained on unrepresentative data often perform worse on underrepresented groups. For example, one study investigated the effect of gender imbalance on 3 deep learning algorithms used for the diagnosis of thoracic diseases. The study found a consistent decrease in performance for the underrepresented gender when a minimum balance was not fulfilled.
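One simple way to surface this kind of gap is to report a metric separately for each group rather than a single overall figure. The sketch below computes per-group accuracy. The labels, predictions, and group values are made up purely for illustration and are not data from the study mentioned above.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Compute accuracy separately for each group label."""
    correct = defaultdict(int)
    count = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        count[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / count[group] for group in count}


# Made-up values for illustration only.
y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]
sex = ["M", "M", "M", "M", "M", "F", "F", "F"]

per_group = accuracy_by_group(y_true, y_pred, sex)
# Difference between the best- and worst-served group.
gap = max(per_group.values()) - min(per_group.values())
```

A large `gap` suggests the model serves one group noticeably worse than another, which is worth investigating before deployment.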

Geographical distribution

Another study investigated the geographical distribution of cohorts used for medical model training in the USA. The study found that a disproportionate number of deep learning models were trained on cohorts from only 3 states, with little or no representation from the remaining 47. These 3 states have economic, educational, social, ethnic, and cultural features that are not necessarily representative of the whole population of the USA. Models trained on this limited, unrepresentative data will therefore inherit the bias of the selected data.
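A quick check of how a training cohort is distributed can make this kind of skew visible early. The sketch below computes each region's share of a cohort and lists the expected regions that are missing entirely. The region labels are placeholders, not the states identified in the study.

```python
from collections import Counter


def cohort_shares(labels):
    """Return each label's share of the cohort as a fraction of the total."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def missing_groups(labels, expected):
    """Return the expected labels that never appear in the cohort."""
    return sorted(set(expected) - set(labels))


# Placeholder region labels, for illustration only.
cohort = ["region_a", "region_a", "region_b", "region_c"]
shares = cohort_shares(cohort)
absent = missing_groups(cohort, ["region_a", "region_b", "region_c", "region_d"])
```

Comparing `shares` against the make-up of the population the model is meant to serve would be the natural next step. That reference distribution has to come from an external source and is not shown here.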

The utilisation of non-representative datasets could be the result of limited data availability. In this instance, these issues might be resolved as more data becomes available. If data remains limited, however, it is the responsibility of developers to disclose the populations for which the algorithm is appropriate. The development of these models must also work to bridge the gap between high and low-income populations. If low-income groups are excluded from model data, this will exacerbate pre-existing inequalities.

Data Privacy and Ethics

The development of ML/AI relies on user data. Without a large amount of this data, the creation of these models would not be possible. Naturally, the ethics surrounding the acquisition, management, and commercialisation of this data are paramount. If patients and healthcare professionals do not trust AI/ML models on account of data-related issues, the adoption of these methods will ultimately fail.

Commercialisation of data

Data in the healthcare industry has the capacity to reach values in the billions of dollars. The knowledge that people's healthcare data is potentially generating wealth for companies is not necessarily a comforting thought. For example, in 2017, the Royal Free NHS Foundation Trust was found to be in breach of the UK Data Protection Act when it provided over 1.6 million patient records to Google DeepMind. The Trust had made a deal with DeepMind to exchange the data for the free use of DeepMind's app, Streams, for 5 years.

Reciprocity does not require ownership, but those that use patient data must show that they are adding value to the health of the very same patients whose data is being used. This delicate balance between the monetisation of patients’ data and the reciprocation of the value that this data has generated is paramount. This will ensure that professionals and patients trust AI/ML solutions and are willing to work to improve them to reap the benefits.

“the price of innovation does not need to be the erosion of fundamental privacy rights” - Elizabeth Denham (UK Information Commissioner’s Office)

Management of healthcare data

The adoption of ML/AI will invariably lead to a significant increase in the amount of data that is stored. This flood of data must be adequately managed, and it will become imperative to protect the privacy of patients outside of the doctor-patient relationship. Legislation will have to be developed to ensure that data is safeguarded and not shared with other companies. The sharing of this data could impact the insurance premiums, job opportunities, or even the personal relationships of patients.

The protection of data from malicious acts, such as hacking or theft, is beyond the scope of this discussion. However, patients' identities can be protected by anonymising the data. Although this seems like a simple solution, it can lead to a degradation in model performance. Consider, for example, a model trained on data from which the patient's age or gender has been removed: this could result in bias being unintentionally introduced into the model.
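The trade-off described above can be sketched in code. The example below is a minimal sketch of pseudonymisation, a weaker technique than full anonymisation, and is used here only for illustration. It drops direct identifiers, replaces the patient ID with a keyed hash, and coarsens age into a ten-year band instead of deleting it, so that some clinically relevant signal is retained. The field names (`patient_id`, `name`, `address`, `age`) are hypothetical assumptions.

```python
import hashlib
import hmac

# Hypothetical direct identifiers; a real schema would need its own review.
DIRECT_IDENTIFIERS = {"name", "address"}


def pseudonymise(record: dict, secret_key: bytes) -> dict:
    """Return a copy of the record with identifiers removed or coarsened."""
    # Keyed hash so the token cannot be recomputed without the secret key.
    token = hmac.new(
        secret_key, str(record["patient_id"]).encode("utf-8"), hashlib.sha256
    ).hexdigest()

    cleaned = {
        key: value
        for key, value in record.items()
        if key not in DIRECT_IDENTIFIERS and key != "patient_id"
    }
    cleaned["patient_token"] = token

    # Keep age as a ten-year band rather than dropping it entirely.
    if "age" in cleaned:
        decade = (cleaned.pop("age") // 10) * 10
        cleaned["age_band"] = f"{decade}-{decade + 9}"
    return cleaned
```

Coarsening rather than deleting a field is one possible compromise between privacy and model performance. Whether a ten-year band is fine-grained enough will depend on the clinical question.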

“The constraints are therefore more ‘human’ than ‘tech’”

In a world ideal for ML model development, user data would not require such strict management, security, and anonymisation. However, due to human constraints, strict management is required to ensure that people are protected from malicious entities. Unfortunately, this hinders the progress of AI/ML applications. A delicate balance must be struck so that AI/ML systems can be developed while the data is still used ethically.

Visual illustration of the effect of data regulation on ML model development

Conclusion on Ethics in Medicine

The integration of AI/ML solutions in the health sector has great capacity to bridge equality gaps, rehumanise medicine, and improve people's lives. However, the hype surrounding this integration must be appropriately managed. The ethics surrounding this amalgamation must be seriously considered if we are to maintain medicine as a human-centric industry. As such, models must be developed with people in mind.

The effect of these solutions on informed consent, safety and transparency, algorithmic fairness and biases, and data privacy must be considered. This consideration and hype-management will ensure that people are willing to adopt these technologies. It is only through people and the inclusion of ethics that these technologies will be allowed to flourish and revolutionise the world of medicine.

