A Method of Text Information Normalization of Electronic Medical Records of Traditional Chinese Medicine
Keywords: EMR of TCM, Normalization, Event extraction, Named entity recognition
Abstract
Electronic medical records (EMR) of Traditional Chinese Medicine (TCM) contain rich content, such as chief complaints, subcutaneous symptoms, history of present illness, and past medical history, which serves as an important reference for TCM diagnosis. However, because this information is usually written in natural language, its terminology and expressions are often irregular. In this paper, we propose a method for normalizing the textual information of TCM EMR. The main research objects are strongly narrative medical history texts (history of present illness and past medical history) and symptom texts (chief complaints and subcutaneous symptoms). Each text is processed according to its type. For symptom texts, named entity recognition is applied directly to extract symptom entities. For medical history texts, event extraction is performed first to divide the text into treatment events, and named entity recognition is then applied to extract the various entities. Finally, the extracted entities are stored in a database. Experiments with this method on the EMR of the orthopedic injury department of a hospital show a recognition rate of 92.28% for symptom entities in symptom texts and 89.86% for entities such as symptoms and diseases in medical history texts, which verifies the validity of the method. The method normalizes the natural-language part of the EMR and stores it in a structured form, which facilitates subsequent data analysis and mining and lays a solid foundation for the intellectualization of TCM.
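The type-dependent processing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the dictionary lookup in `recognize_entities` and the sentence splitting in `split_events` are deliberately simple stand-ins for the trained named entity recognition and event extraction models, and every name here (`LEXICON`, the section labels, the `entity` table and its columns) is a hypothetical choice made for illustration. Only the control flow follows the abstract: symptom texts go straight to entity recognition, medical history texts are first divided into events, and the resulting entities are stored in a database.

```python
import re
import sqlite3

# Hypothetical toy lexicon standing in for a trained NER model.
LEXICON = {
    "symptom": ["low back pain", "swelling", "numbness"],
    "disease": ["lumbar disc herniation", "fracture"],
}

# Hypothetical section labels for the two text types named in the abstract.
SYMPTOM_SECTIONS = {"chief_complaint", "subcutaneous_symptoms"}
HISTORY_SECTIONS = {"history_of_present_illness", "past_medical_history"}


def recognize_entities(text):
    """Placeholder for NER: case-insensitive dictionary lookup."""
    lowered = text.lower()
    found = []
    for label, terms in LEXICON.items():
        for term in terms:
            if term in lowered:
                found.append((label, term))
    return found


def split_events(text):
    """Placeholder for event extraction: treat each sentence as one event."""
    parts = re.split(r"[.;。；]\s*", text)
    return [p.strip() for p in parts if p.strip()]


def normalize(section, text):
    """Route a section by type and return (event_index, label, entity) rows."""
    if section in SYMPTOM_SECTIONS:
        # Symptom text: entity recognition is applied directly, no events.
        return [(None, label, ent) for label, ent in recognize_entities(text)]
    if section in HISTORY_SECTIONS:
        # Medical history text: divide into events, then recognize entities.
        rows = []
        for idx, event in enumerate(split_events(text)):
            rows.extend((idx, label, ent)
                        for label, ent in recognize_entities(event))
        return rows
    raise ValueError(f"unknown section type: {section}")


def store(conn, record_id, section, rows):
    """Persist extracted entities in a (hypothetical) SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entity ("
        "record_id TEXT, section TEXT, event_index INTEGER, "
        "label TEXT, entity TEXT)"
    )
    conn.executemany(
        "INSERT INTO entity VALUES (?, ?, ?, ?, ?)",
        [(record_id, section, idx, label, ent) for idx, label, ent in rows],
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    text = "Low back pain after a fall. Diagnosed with lumbar disc herniation."
    store(conn, "r1", "history_of_present_illness",
          normalize("history_of_present_illness", text))
    for row in conn.execute("SELECT * FROM entity"):
        print(row)
```

Keeping the event index alongside each entity is one way to preserve which treatment event an entity belongs to; symptom texts carry no event, so the index is left empty for them.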
Copyright (c) 2022 Can Li, Dan Xie
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).