Operationalizing machine-assisted translation in healthcare
