Position Paper

Triage 4.0: On Death Algorithms and Technological Selection. Is Today’s Data-Driven Medical System Still Compatible with the Constitution?

Article: 1989243
Received 22 Sep 2021
Accepted 30 Sep 2021
Published online: 17 Nov 2021

ABSTRACT

Health data hold great promise for a healthier and happier life, but they also make us vulnerable. Making use of millions or billions of data points, Machine Learning (ML) and Artificial Intelligence (AI) are now creating new benefits. Harvesting Big Data can have great potential for the health system, too: it can support accurate diagnoses, better treatments and greater cost effectiveness. However, it can also have undesirable implications, often in the sense of undesired side effects, which may in fact be terrible. Examples of this, as discussed in this article, are discrimination, the mechanisation of death, and genetic, social, behavioural or technological selection, which may imply eugenic effects or social Darwinism. As many unintended effects become visible only after years, we still lack sufficient criteria, long-term experience and advanced methods to reliably rule out that things may go terribly wrong. Handing over decision-making, responsibility or control to machines could be dangerous and irresponsible. It would also be in serious conflict with human rights and our constitution.

“AS A MEMBER OF THE MEDICAL PROFESSION:

I SOLEMNLY PLEDGE to dedicate my life to the service of humanity;

THE HEALTH AND WELL-BEING OF MY PATIENT will be my first consideration;

I WILL RESPECT the autonomy and dignity of my patient;

I WILL MAINTAIN the utmost respect for human life;

I WILL NOT PERMIT considerations of age, disease or disability, creed, ethnic origin, gender, nationality, political affiliation, race, sexual orientation, social standing or any other factor to intervene between my duty and my patient … ”

So demands “The Physician’s Pledge” (the “Declaration of Geneva”1) in its 2017 revision, the modern successor of the Hippocratic Oath. But are the procedures used in medicine still compatible with it – or is our society now threatened by technological selection? Given the recent use of various forms of triage, among them algorithm-based and data-driven ones, serious concerns arise.

Illustration of the Problem

For thousands of years, in fact for the entire history of humanity, machines were never allowed to make autonomous, unsupervised life-and-death decisions. This taboo may soon be broken, or may have been broken already. Not only military machines, but also more mundane robots such as autonomous vehicles, may decide over lives in split seconds. There are also attempts to give this an ethical foundation, as the recent work on the “trolley problem” [1–6] and the “moral machine experiment” [7]2 illustrates. These new approaches, which may fundamentally change practices in hospitals and emergency medicine, must undergo constitutional scrutiny.

In cases of scarce medical resources, the decisive criterion for allocating these resources is typically the prospective patient’s probability of survival. Risk stratification normally comprises three groups: individuals who are unlikely to benefit from a treatment (e.g. because they are expected to die anyway); individuals who can wait for treatment (and will get it as soon as capacity becomes available – but perhaps never); and individuals who urgently require treatment in order to survive. But how can the chance of survival, or the membership in a certain risk group, be assessed? Besides frameworks such as the sequential organ failure assessment (SOFA) or the acute physiology and chronic health evaluation II (APACHE II) [8,9], where the task of risk assessment is currently performed by humans, it may also be performed by machines at some point in time.
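As a hedged illustration of this stratification logic (not any specific clinical protocol), the following Python sketch assigns patients to the three groups described above, assuming a survival probability has already been estimated, e.g. derived from a SOFA or APACHE II score; the thresholds and group labels are hypothetical.

```python
# Minimal sketch of threshold-based risk stratification, assuming a
# pre-existing survival-probability estimate (e.g. derived from a SOFA
# or APACHE II score). Thresholds and labels are purely illustrative,
# not taken from any clinical guideline.

def stratify(survival_probability: float) -> str:
    """Assign a patient to one of the three triage groups described above."""
    if survival_probability < 0.2:    # unlikely to benefit from treatment
        return "no treatment"
    if survival_probability > 0.8:    # can wait until capacity is available
        return "delayed treatment"
    return "immediate treatment"      # urgently requires treatment to survive

print(stratify(0.55))  # -> "immediate treatment"
```

Whether such a mapping is performed by a human or by a machine, the entire downstream decision hinges on the quality of the estimated probability, which is exactly where the problems discussed below arise.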

Machines, in particular AI systems based on machine learning (ML), are believed to augment or even surpass human capabilities in predicting the outcome of diseases or the likelihood of survival [10–12]. Based on this premise, medical AI tools are increasingly being heralded as a paradigm change in clinical practices. During the Covid-19 pandemic, AI tools became especially attractive, not just for purposes of early warning, tracking, and diagnosis, but also for prognosis [13].

Wynants et al. (2020) [14] listed an impressive number of 107 prognostic models for predicting progression to severe disease, intensive care unit admission, ventilation, intubation, length of hospital stay, and mortality risk (39 models). The sobering conclusion of the study is that all models – except for one [15] – were rated as having a high risk of bias, as prone to overfitting due to small or modest sample sizes, and as exaggerating their reported predictive performance. While the predictive performances of the models were reported to be moderate to excellent, the high risk of bias implies that out-of-sample model performance will be lower, i.e. the generalisability to real-life settings is rather questionable. Hence, the researchers concluded: “We cannot yet recommend any of the identified prediction models for widespread use in clinical practice” [16]. Other studies have yielded similarly daunting results [17,18].
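The mechanism behind such exaggerated performance reports is easy to demonstrate in a toy setting. The following sketch uses purely synthetic data (an invented assumption, with no clinical content): a model fitted on a small sample with many predictors can look impressive in-sample even though the outcome is pure noise, while its out-of-sample accuracy remains at chance level.

```python
# Sketch of small-sample overfitting: apparent (in-sample) accuracy vs
# accuracy on unseen data. All data here are synthetic noise.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n_train, n_test, n_features = 40, 10000, 20   # small study, many predictors
X_tr = rng.normal(size=(n_train, n_features))
y_tr = rng.integers(0, 2, n_train)            # outcome is pure noise
X_te = rng.normal(size=(n_test, n_features))
y_te = rng.integers(0, 2, n_test)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print(f"apparent accuracy:      {model.score(X_tr, y_tr):.2f}")  # typically well above chance
print(f"out-of-sample accuracy: {model.score(X_te, y_te):.2f}")  # close to 0.50, i.e. chance level
```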

A general conclusion from the studies is that predictive models for assessing mortality risk may perform well on the patient cohorts the studies were based on, but will not necessarily perform well on other patients. Therefore, models that have only been assessed in terms of calibration performance, while a validation with independent data samples (from different locations, population compositions, etc.) is lacking, should not be considered for use. Moreover, machine learning (ML) models for Covid-19 prognosis, like ML models in general, face a trade-off between accuracy and fairness. Take, for example, the role of age in ML models. One can either use age-sensitive models (since age is a significant predictor of Covid-19 outcomes), and thus be relatively accurate, or one can use fair models, which are not age-sensitive, but also less accurate.
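This trade-off can be made tangible with a small synthetic experiment (all variables and effect sizes below are invented assumptions, not clinical estimates): an outcome is generated to depend on age and one biomarker, and an age-blind model is then compared with an age-sensitive one.

```python
# Sketch of the accuracy-fairness trade-off: an age-sensitive model vs
# an age-blind ("fair") model, on synthetic data. Illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000
age = rng.uniform(20, 90, n)
biomarker = rng.normal(0, 1, n)
# Synthetic assumption: the outcome depends on both age and the biomarker.
logit = 0.06 * (age - 55) + 0.8 * biomarker
y = rng.random(n) < 1 / (1 + np.exp(-logit))

X_sensitive = np.column_stack([age, biomarker])  # uses age as a predictor
X_blind = biomarker.reshape(-1, 1)               # deliberately ignores age

for name, X in [("age-sensitive", X_sensitive), ("age-blind", X_blind)]:
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    acc = LogisticRegression().fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{name}: accuracy = {acc:.3f}")
# The age-sensitive model is typically more accurate here, but its
# recommendations then depend directly on a patient's age.
```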

Models incorporating race face a similar conflict between accuracy and discrimination. For example, risk prediction models for lung disease, kidney disease, breast cancer, death after heart failure, and other illnesses assign lower scores to patients of colour (possibly because the current health system is biased against them). This may result in fewer or worse ML-based treatment recommendations compared to white patients [19]. In other words: discrimination effects of the past, which are reflected in the treatment performance data used to train ML/AI systems, may perpetuate discrimination in the future. Concretely, this could mean that algorithm-based triage would put a person of colour into the no-treatment group where a white person would be treated, or into a delayed-treatment group where a white person would be treated immediately, despite comparable health conditions.

Discussion of Some ML-Related Issues in the Health Care System

In view of the possible or even actual use of algorithms for life-or-death decisions3 and the serious issues with the underlying predictive models, there looms a dangerous diffusion of responsibility with unforeseeable consequences for our society. To illustrate the problem, we describe below – in a simplified way – some crucial elements of the process chain in the health care system and how they interact with each other.

  • The reality in hospitals includes capacity constraints, as the COVID-19 crisis has revealed again. In such circumstances, prioritisation decisions are implicitly or explicitly made. In the worst case, these can be triage decisions resulting in the death of people who would have been saved under normal circumstances.

  • Prioritisation decisions are often serious and difficult. It cannot be assumed that everyone involved in such decisions always fully understands their ethical and legal implications, so mistakes can easily happen.

  • For such and other reasons, medical personnel are increasingly supported in their decisions by measured data and expert systems or even AI systems. This is often very helpful. However, for certain decisions, such decision support can be problematic, because such tools are often created by software developers who have limited knowledge of medical, legal, ethical, and societal issues.

  • The algorithms are typically subject to business secrecy and are frequently updated, such that their effects are partially unknown and, in some cases, not verifiable due to a lack of algorithm transparency. Accordingly, their results may change unpredictably. Nevertheless, algorithms may exert “epistemic authority”: compliance with algorithmic recommendations often appears advisable. It may also appear advantageous in case of a later lawsuit.

  • Business secrecy is not the only reason why expert or AI systems often lack transparency. Machine learning systems have often been characterised as “black boxes” [20], because, in contrast to what is common in science and in medical research, ML/AI-based systems are typically not based on transparent, validated, and reproducible causal relations. Moreover, as machine learning progresses, ML-based decision-making and its outcomes may change without any explanation given. Although machine learning programs have high potential for specific tasks such as skin cancer detection [21], they can also sometimes fail dramatically. “Black-box medical algorithms” should, therefore, be monitored closely; the accuracy of predictions is not guaranteed and requires continuous external validation.

  • In concrete applications, medical decisions are often based on criteria such as the expected effectiveness of the measures taken. In this context, decisions are recommended by algorithms that weigh different treatment options against each other – including the possibility of no treatment. The underlying machine decisions are typically based on mean values, such as those obtained in extensive medical studies. However, such mean values are not suited to making precise forecasts at the individual level, because the variation in the data is often large. In other words, a treatment that is better on average may actually be worse for certain individuals.

  • Triage decisions may also be made on the basis of data that was never collected or intended for triage purposes. As a result, the data quality may be insufficient or the application context inappropriate. The use of such data in a triage context is therefore likely to be problematic.

  • Data-driven decisions are often very sensitive to details of the algorithm used or the dataset evaluated. In other words, taking a data-driven decision with a different dataset or a different algorithm (as other research teams might provide them) may result in quite different priorities, as the sketch following this list illustrates. Consequently, the selection of patients who are disadvantaged by triage decisions or medically prioritised can vary greatly with the procedures, algorithms, or datasets used. Hence, data-driven decisions may suffer from an undesirable, but hidden, degree of randomness and arbitrariness, which is particularly worrying when it comes to life-and-death decisions. Therefore, “data-driven” or “evidence-based” does not automatically mean that the resulting decisions are scientifically objective and sound at the level one would require for grave decisions. Despite these methodological shortcomings, vital decisions are increasingly taken on the basis of algorithms.

  • The legitimacy and meaningfulness of the algorithms and personal data used in such contexts may be questioned as well. For example, some of these algorithms seem to be based on life expectancy,4 even though this may vary a lot.5 There are at least two undesirable consequences: (1) While life expectancy is actually quite variable [22],6 the application of algorithms to decide the level of life support may determine the lifespan much in the sense of a self-fulfilling prophecy. (Thereby, “good” and “bad luck” would be largely eliminated – and hope as well.) (2) While the expectation of a long life may be rewarded, the expectation of a short life would be almost like a “death sentence”, particularly when the prediction is incorrect.7

  • If algorithms were developed with the aim of meeting limited budgets or reducing costs,8 this could imply that AI-based technological selection might systematically shorten lifespans. This is a serious threat since, after all, Big Data-driven medical decisions are often also supposed to ensure the efficient use of available funds. Such a development could shift the focus away from human dignity and equality towards economic profit, and it may obfuscate the rationing of health-care resources.

  • If, in addition, individual insurance coverage (the treatments reimbursed by the respective health insurance) is taken into account, social selection could happen.9 Furthermore, as known from recent AI research, such discriminatory effects [23] can even occur where not intended at all, namely as a result of the opacity of the algorithms or the training data used.10

  • Despite all this, in the context of the current pandemic, triage decisions about life and death are increasingly being made.11 They are often presented as inevitable in view of insufficient medical capacities.12 Nevertheless, triage decisions are considered extremely problematic and acceptable only in wartime or extreme disasters.

In conclusion, the application of algorithms to life-and-death decisions can imply technological selection and a mechanisation of death13 [6], particularly if the algorithms are allowed to act autonomously.

  • Nevertheless, there is a trend to transfer problematic decisions to data-driven algorithms and autonomous systems. In some countries, algorithm-based triage decisions already seem to be in use.14 However, this makes human beings the objects of machine decisions, which is not compatible with human dignity. Besides, the transfer of human responsibility to machines undermines the principle of accountability.

  • Political decisions limiting the health-care budget will inevitably determine how many people are ultimately affected by triage decisions, even if this is not intended.15 However, algorithm-based triage will also depend on a person’s health condition and, hence, on the person’s social conditions and genetics. Therefore, algorithm-based life-and-death decisions, i.e. technological selection, may also imply social and eugenic selection.

  • The COVID-19 pandemic has demonstrated the threat of introducing (at least partially) automated procedures for life-and-death decisions into everyday clinical practice. Thereby, triage measures would no longer be limited to wars and serious disasters, but would become a “new normal”. We consider such uses of triage procedures unconstitutional; they must be prevented by political and legal oversight.
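The sensitivity issue flagged in the list above can be demonstrated with a hedged sketch on synthetic data (no clinical content): two reasonable models are trained on the same patients, and the sets of “highest-risk” individuals they single out only partially overlap.

```python
# Sketch: two plausible models, same synthetic training data, different
# risk rankings. Any patient in one top-risk set but not the other would
# be (de)prioritised purely depending on which algorithm is used.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 2000
X = rng.normal(size=(n, 5))  # five synthetic predictors
y = rng.random(n) < 1 / (1 + np.exp(-X[:, 0] - 0.5 * X[:, 1] * X[:, 2]))

X_train, y_train, X_new = X[:1500], y[:1500], X[1500:]

p_lr = LogisticRegression().fit(X_train, y_train).predict_proba(X_new)[:, 1]
p_rf = RandomForestClassifier(random_state=0).fit(X_train, y_train).predict_proba(X_new)[:, 1]

top_lr = set(np.argsort(p_lr)[-100:])  # 100 highest-risk patients per model
top_rf = set(np.argsort(p_rf)[-100:])
print(f"overlap of the two top-100 risk groups: {len(top_lr & top_rf)} / 100")
```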

Threats of Algorithmic Decisions

We have explained above why undesirable, potentially hidden, genetic and social selection effects are to be expected in a health-care system that is increasingly operated in a data-driven way. To minimise these effects, we suggest that medical institutions regularly produce statistical evaluations of any genetic, social, and behavioural selection effects. Moreover, the application of algorithms, as well as the algorithms’ functioning, should be regularly audited. Discrimination, as well as non-compliance with state-of-the-art quality standards, should be sanctioned. Otherwise, there is an acute danger that the people who need our medical support and solidarity most will be particularly disadvantaged by the data-driven systems emerging today.

In our opinion, current data-based approaches do not adequately meet constitutional requirements (the principle of equality, self-determination, the right to life) or policy goals (such as equity). They are also not scientifically sound enough, given the sensitivity of many data-analytics approaches. To unfold their full benefits, it must be ensured that Big Data, Artificial Intelligence, and other technological innovations are used in a fair way – and that problematic applications are avoided. Otherwise, it is to be feared that the democratic principle of equality will increasingly be replaced by discriminatory scoring systems, which would ultimately lead to a fundamentally different society.

Although it may be possible to develop constitutional and fair data-based methods, the methods currently in use do not meet all justified expectations. If one takes the principle of equality seriously, according to which no human life should count more than another, then the blessings of modern medicine would have to benefit all people equally. This fairness requirement would imply that health-disadvantaged people with a shorter life expectancy should get greater medical support per year, not less. This, however, would require a completely different use of data-based methods and different goal functions, as the sketch below indicates.
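As a purely hypothetical illustration of what a different goal function could look like (this is our sketch, not a proposal from the literature), consider replacing an efficiency objective that weights patients by expected life-years with one based on medical need alone:

```python
# Toy comparison of two allocation objectives; all values are invented.
patients = [
    {"id": "A", "life_expectancy": 40, "need": 0.9},
    {"id": "B", "life_expectancy": 10, "need": 0.9},  # health-disadvantaged
]

def efficiency_score(p):
    # Efficiency objective: expected life-years gained by treatment.
    return p["need"] * p["life_expectancy"]

def equality_score(p):
    # Equality-oriented objective: medical need alone; life expectancy
    # is deliberately excluded, so no life counts more than another.
    return p["need"]

for p in patients:
    print(p["id"], efficiency_score(p), equality_score(p))
# Under the efficiency objective, patient B is systematically
# deprioritised; under the equality objective, A and B rank equally.
```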

Note that our assessment is supported by the recent WHO Guidance on “Ethics and Governance of Artificial Intelligence for Health”, which points out:16 “Use of computerized decision-support programs – AI or not – to inform or guide resource allocation and prioritization for clinical care has long raised ethical issues. They include managing conflicts between human and machine predictions, difficulty in assessing the quality and fitness for purpose of software, identifying appropriate users and the novel situation in which a decision for a patient is guided by a machine analysis of other patients’ outcomes. In some situations, well-intentioned efforts to base decisions about allocations on an algorithm that relies only on a rules-based formula produce unintended outcomes”. In its section on “The ethics of resource allocation and prioritization” it also warns that “if an AI technology is trained to ‘maximize global health’, it may do so by allocating most resources to healthy people in order to keep them healthy and not to a disadvantaged population”. Last, but not least, “Use of AI tools for triage or rationing is one of the most compelling reasons for ensuring adequate governance or oversight. Although intentional harm is not ethically controversial – it is wrong – the possibilities of unintended bias and flawed inference emphasize the need to protect … people and processes from computational misadventure”.

Technological and Ethical Awareness Needed

AI systems will certainly play an increasing role in the medical profession. Augmented intelligence has large potential for assisting medical doctors, medical personnel, and patients in health-related decision-making. These systems can also raise the quality and cost-effectiveness of health care. However, there are many caveats concerning fairness, equality, and bias in the treatment of certain patient groups, as well as concerning the accuracy and validity of results, unintended side effects, and the transparency of algorithms. Therefore, the use of AI systems in practice will typically raise serious ethical questions.

Health care professionals must be aware of the underlying principles, possible weaknesses, unintended consequences, and ethical problems when applying AI systems. After all, they are responsible for the results. Thus, in a recent report, the US National Academy of Medicine demands: “Develop and deploy appropriate training and educational programs to support health care AI” [24]. These programs have to be integrated into medical education curricula, and they should become a high priority in the continued education of health care professionals.

AI courses for continued medical studies are already offered by certain medical schools and professional societies (e.g. Massachusetts Medical Society, The Radiological Society of North America, Mayo Clinic, Stanford University School of Medicine) [25]. However, courses in continued medical education should, besides focusing on engineering and technical issues, place greater emphasis on understanding the principles of algorithms and on reflecting on the related ethical problems. Consequently, not only medical experts and engineers should be responsible for teaching AI courses in medicine. As the report states, AI programs in medical studies and in continued education “must be multidisciplinary and engage AI developers, implementers, health care system leadership, frontline clinical teams, ethicists, humanists, patients, and caregivers” [24].

Disclosure Statement

No potential conflict of interest was reported by the author(s).

Notes

1 https://www.wma.net/policies-post/wma-declaration-of-geneva/ (last accessed on 18 August 2021).

2 This research has been reflected in this and many other newspaper articles: https://www.finanzen100.de/finanznachrichten/wirtschaft/ethische-probleme-im-auto-oma-oder-obdachlosen-ueberfahren-makabrer-mit-test-zeigt-dilemma-selbstfahrender-autos_H439792768_308609/. Note that the Moral Machine experiment has recently been seriously called into question by scientists [26,27]. Ethicists have also come to different conclusions; see for example (last accessed on 18 August 2021): https://www.bmvi.de/SharedDocs/DE/Publikationen/DG/bericht-der-ethik-kommission.pdf; https://www.researchgate.net/publication/318340461; https://www.zeit.de/gesellschaft/zeitgeschehen/2020-03/deutscher-ethikrat-coronavirus-behandlungsreihenfolge-infizierte; https://www.ethikrat.org/fileadmin/Publikationen/Ad-hoc-Empfehlungen/deutsch/ad-hoc-empfehlung-corona-krise.pdf

3 Triage-Software in Notaufnahmen: Der nächste Schnellschuss aus dem Hause Spahn [Triage software in emergency rooms: the next rash move from the Spahn ministry], Netzpolitik.org (23 March 2021) https://netzpolitik.org/2021/triage-software-in-notaufnahmen-der-naechste-schnellschuss-aus-dem-hause-spahn/ (last accessed on 18 August 2021).

4 Soliman T. Der Todesalgorithmus: Computer berechnet Lebenserwartung [The death algorithm: computer calculates life expectancy], Das Erste; 2017 Dec 14; Available from: https://daserste.ndr.de/panorama/archiv/2017/Der-Todesalgorithmus-Computer-berechnet-Lebenserwartung,todesalgorithmus112.html

5 In particular, the mean life expectancy varies considerably across countries. Thus, if such an algorithm is fed with international data, there is a risk of life-shortening decisions in countries where the mean life expectancy would actually be higher.

6 In: Nature Public Health Emergency Collection, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7585743/

7 Wilson, J. 23andMe raises questions about at-home genetic testing, CNN health; 2013 Nov 26; Available from: https://edition.cnn.com/2013/11/26/health/23andme-fda-genetic-testing/index.html

8 Bundesrat Berset auf Kollisionskurs mit der Verfassung? Seine Pläne für einen “Deckel” bei den Gesundheitskosten stossen auf scharfe Kritik [Federal Councillor Berset on a collision course with the constitution? His plans for a “cap” on health costs meet with sharp criticism], Neue Zürcher Zeitung (17 November 2020) https://www.nzz.ch/schweiz/bundesrat-berset-auf-kollisionskurs-mit-der-verfassung-seine-plaene-fuer-einen-deckel-bei-den-gesundheitskosten-stossen-auf-scharfe-kritik-ld.1587291

9 Ungleichbehandlung je nach Krankenkasse: Wo man versichert ist, kann über Leben oder Tod entscheiden [Unequal treatment depending on the health insurer: where one is insured can decide over life or death], Tagesanzeiger (6 March 2021) https://www.tagesanzeiger.ch/wo-man-versichert-ist-kann-ueber-leben-oder-tod-entscheiden-752186910381

10 Also note that ethnic, social, genetic, and/or behavioural factors may sometimes correlate.

11 “Weiche Triage wird bereits angewendet”: Wie Ärzte entscheiden, wer behandelt wird [“Soft triage is already being applied”: how doctors decide who gets treated], Focus (19 April 2021) https://www.focus.de/gesundheit/news/triage-auf-intensivstationen-dritte-welle-wenn-mediziner-entscheiden-muessen-wer-behandelt-wird_id_13195703.html

12 Even though some insiders have actually denied such capacity shortages: Corona und Krankenhäuser: “Ein Patient wird unter Umständen doppelt gezählt” [Corona and hospitals: “A patient may, under certain circumstances, be counted twice”], WELT (16 June 2021) https://www.welt.de/politik/deutschland/plus231872027/Corona-Krankenhaeuser-Ein-Patient-wird-unter-Umstaenden-doppelt-gezaehlt.html

13 Helbing D, Seele P. Death by algorithm? Project Syndicate. 2020 Nov 26. https://www.project-syndicate.org/commentary/artificial-intelligence-resilience-covid19-climate-change-by-dirk-helbing-and-peter-seele-2020-11.

14 Hao K. Doctors are using AI to triage COVID-19 patients. The tools may be here to stay. MIT Technol Rev. 2020 Apr 23; Available from: https://www.technologyreview.com/2020/04/23/1000410/ai-triage-covid-19-patients-health-care/

15 Apparently, emergency capacities may have been reduced because empty beds were profitable: RKI-Schreiben zu Intensivbetten: “Monetäre Anreize” für falsche Angaben [RKI letter on intensive-care beds: “monetary incentives” for false reporting], Tagesschau (17 June 2021) https://www.tagesschau.de/investigativ/wdr/intensivbetten-daten-101.html

16 World Health Organization. Ethics and governance of artificial intelligence for health: WHO guidance. 2021. p. 49f. Available at https://www.who.int/publications/i/item/9789240029200

References
