Front. Med. 2014, Vol. 8, Issue 3: 347-351
Extracting terms from clinical records of traditional Chinese medicine
Cungen Cao1,*, Meng Sun2, Shi Wang1
1. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
2. Software College of Beihang University, Beijing 100191, China
Health records of traditional Chinese medicine contain valuable clinical information that can be used to improve disease treatment and to support medical research. In this paper, we present a practical iterative method for extracting terms from these records. The method is based on a set of extraction rules, the Mesh, and the likelihood ratio technique; it achieved a precision of 88.18% and a recall of 94.21%.

Keywords: term extraction; rule-based; likelihood ratio
27 August 2014    Issue Date: 09 October 2014
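
The abstract names the likelihood ratio technique without giving its formula. As a rough illustration only, the sketch below computes one widely used variant, Dunning's log-likelihood ratio (G²), over a 2×2 contingency table of co-occurrence counts for two adjacent strings. Whether the paper uses this exact statistic is an assumption; the function name `log_likelihood_ratio` and the count names `k11` to `k22` are hypothetical.

```python
import math


def log_likelihood_ratio(k11: int, k12: int, k21: int, k22: int) -> float:
    """Dunning's G^2 statistic for a 2x2 contingency table (illustrative).

    k11: count of A immediately followed by B
    k12: count of A followed by something other than B
    k21: count of B preceded by something other than A
    k22: count of all remaining adjacent pairs
    These count definitions are an assumption, not taken from the paper.
    """
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))

    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = rows[i] * cols[j] / n
                g2 += o * math.log(o / expected)
    return 2.0 * g2


# Independent counts score 0; strongly associated counts score high.
print(log_likelihood_ratio(10, 10, 10, 10))
print(log_likelihood_ratio(10, 0, 0, 10))
```

Under this reading, a high score suggests that two strings co-occur far more often than chance, which would make the combined string a candidate term.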
| Procedure | Terms found | Percentage of total | Correct terms | Precision | Examples |
| --- | --- | --- | --- | --- | --- |
| Preprocessing | 155 | 38.18% | 145 | 93.55% | 腹部毕满(abdominal distention); 口臭纳差(ozostomia and anorexia); 脑血栓形成(cerebral thrombosis) |
| 1st iteration | 127 | 31.28% | 113 | 88.98% | 舌淡胖(pale and enlarged tongue); 脉细弱(thready and weak pulse); 素患高血压(hypertension all along); 进食发呛(choking when eating); 胸痛偏左(pain on left chest); 下肢不能活动(lower limbs cannot move) |
| 2nd iteration | 84 | 20.69% | 68 | 80.95% | 神识模糊(coma); 腰腿酸软(debility of the loins and legs); 脉濡或沉滑(soft, deep and slippery pulse); 舌尖边有瘀斑瘀点(tip of tongue with ecchymosis); 没有表情(no expression) |
| 3rd iteration | 10 | 2.46% | 7 | 70.00% | 脉两寸、两尺均无力(weak cun-pulse and chi-pulse); 不省人事(unconsciousness); 下床欲解大便(want to get out of bed to defecate); 时有憋气(sometimes feeling suffocated) |
| Result optimization | 30 | 7.39% | 25 | 83.33% | See Table 3 |
Tab.1  Precision of extraction via different iterations
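
The percentages in Table 1 follow directly from the counts, and summing the stages reproduces the overall precision of 88.18% reported in the abstract. The short sketch below redoes that arithmetic; the counts are copied from the table, while the names `STAGES`, `precision`, and `summarize` are illustrative only.

```python
# (stage, terms found, correct terms), copied from Table 1
STAGES = [
    ("Preprocessing", 155, 145),
    ("1st iteration", 127, 113),
    ("2nd iteration", 84, 68),
    ("3rd iteration", 10, 7),
    ("Result optimization", 30, 25),
]


def precision(correct: int, found: int) -> float:
    """Fraction of extracted terms that are correct."""
    return correct / found if found else 0.0


def summarize(stages):
    """Return total found, total correct, and per-stage (name, share, precision)."""
    total_found = sum(found for _, found, _ in stages)
    total_correct = sum(correct for _, _, correct in stages)
    rows = [
        (name, found / total_found, precision(correct, found))
        for name, found, correct in stages
    ]
    return total_found, total_correct, rows


total_found, total_correct, rows = summarize(STAGES)
for name, share, prec in rows:
    print(f"{name:<20} share={share:6.2%}  precision={prec:6.2%}")
print(f"overall: {total_correct}/{total_found} = "
      f"{precision(total_correct, total_found):.2%}")
```

Recall cannot be recomputed the same way, because the total number of true terms in the records is not given in the table.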
| Body word | Source | Body word | Source | Body word | Source |
| --- | --- | --- | --- | --- | --- |
| 语(speech) | 失语(aphasia) | 步履(walk) | 步履困难(walks with difficulty) | 治疗(treatment) | 治疗病情(treatment of the disease) |
| 气(Qi) | 气短(shortness of breath) | 苔(tongue fur) | 少苔(little tongue fur) | 物(thing) | 视物不清(blurred vision) |
| 言语(speech) | 言语尚清(clear speech) | 寝(sleep) | 寐寝不安(sleep disorder) | 小时(hour) | 两小时(two hours) |
Tab.2  Some body structure words found by descriptive words
| Rule’s type | Examples |
| --- | --- |
| Improved by covering | 苔黄腻而干(dry tongue with yellow and greasy fur); 狂躁呼叫(manic shout); 两小时后软瘫不用(flaccid paralysis after two hours); 无痒痛感(no itching feeling); 口溢稀涎(salivation of the labial angle); 中风(stroke); 左侧偏瘫(hemiplegia of left body); 语强(stammering); 自汗(spontaneous perspiration); 尚能行走(still can walk); 呼吸不规灼手(irregular breathing burning hand) |
| Edit distance | 语言蹇塞(stammering); 语言蹇涩(stammering) |
Tab.3  Some results of result optimization
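
The "Edit distance" row in Table 3 pairs two variants that differ in a single character. As a minimal sketch, assuming the standard Levenshtein distance and a hypothetical threshold of 1 (the paper's exact procedure and threshold are not stated here), such near-duplicates could be detected as follows. The helper names `edit_distance` and `near_duplicates` are illustrative.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute or match
            ))
        prev = curr
    return prev[-1]


def near_duplicates(terms, max_distance=1):
    """Return pairs of terms within max_distance (assumed threshold)."""
    pairs = []
    for i, a in enumerate(terms):
        for b in terms[i + 1:]:
            if edit_distance(a, b) <= max_distance:
                pairs.append((a, b))
    return pairs


terms = ["语言蹇塞", "语言蹇涩", "中风"]
print(edit_distance("语言蹇塞", "语言蹇涩"))
print(near_duplicates(terms))
```

Python strings are sequences of Unicode code points, so each Chinese character here counts as one edit unit.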
| Term | Term |
| --- | --- |
| 呱声大响(loud quack) | 夜寐不宁(sleep disorder) |
| 弛缓性瘫痪(flaccid paralysis) | 昏仆(faint) |
| 急躁易怒(impatient and irritable) | 步履艰难(walks with difficulty) |
| 讲话欠利(stammering) | 夜寐欠佳(not sleep well) |
| 瞳孔尚等大等圆(equal pupil) | 不能站立(cannot stand) |
| 对光反射迟钝(dull reflection to light) | 舌强语睿(stiffness of tongue and stammering) |
| 痰声渡渡(phlegm sound) | 痰声浓液(phlegm sound like concentrated liquid) |
| 血压急剧升高(sharp rise in blood pressure) | 脘胀泛恶(gastric cavity swelling and nausea) |
Tab.4  Undiscovered terms
