AWS Comprehend NLP Service



Comprehend Medical - AWS Service


Comprehend Medical is an NLP(Natural Language Processing) service that comes under the AWS Analytics Services which helps to extract the medical-related information from the text files using the ML algorithm. There is a separate component called “Comprehend” that also makes the text analytics for non-medical related use cases such as customer service data, review data and so on. 


Using Comprehend Medical service, we can extract data such as medical condition, type, dosage amount, frequency, forms., etc and helps to prepare the medical report. This leverages many ways to improve patient care, speed up the medical checks, analyze the insights of patient history and medication usage.





Firstly, let us assume that the text files(perhaps generated by Transcribe or any other sources) are placed in the S3 bucket. For instance, below is the text file for this case


Mrs. Abc is a 52-year-old self-employed, independent consultant for DEMILEE-USA. She used to have allergies when she lived in Seattle but she thinks they are worse here. In the past, she has tried Claritin and Zyrtec. Both worked for a short time but then seemed to lose effectiveness. She has used Allegra also. She used that last summer and she began using it again two weeks ago. It does not appear to be working very well. She has used over-the-counter sprays but no prescription nasal sprays. She does have asthma but does not require daily medication for this and does not think it is flaring up.


If you use lambda function or sagemaker notebook or any other python IDE where the AWS Environment has configured, go ahead on the further steps


Initiate the API service by invoking the below command,


comprehend = boto3.client(service_name='comprehendmedical')


Here I have taken the sample text file directly assigned on the variable. In other cases, perhaps we might extract from the S3 service. 


text_input=’’’Mrs. Abc is a 52-year-old self-employed, independent consultant for DEMILEE-USA. She used to have allergies when she lived in Seattle but she thinks they are worse here. In the past, she has tried Claritin and Zyrtec. Both worked for a short time but then seemed to lose effectiveness. She has used Allegra also. She used that last summer and she began using it again two weeks ago. It does not appear to be working very well. She has used over-the-counter sprays but no prescription nasal sprays. She does have asthma but does not require daily medication for this and does not think it is flaring up.
’’’


Pass the text file into the comprehend API call, which will perform the text analytics and extract the medical information into the JSON file. Here I have extracted into the list and refine only the needed information for an instance name, age, and allergies. 


entity_list = comprehend.detect_entities(text_input)['Entities']


#gather name field if mentioned in the text file
        for entity in entity_list:
            if entity['Type'] =='Name':
                name = entity['Text']
                break


#gather age field if mentioned in the text file
        for entity in entity_list:
            if entity['Type'] =='AGE':
                age = entity['Text']
                break


#gather the list of allergies if mentioned in the text file
        allergies = []
        for entity in entity_list:
            if ((entity['Type'] == 'BRAND_NAME' or entity['Type'] == 'GENERIC_NAME') and \
                len(entity['Traits'])!=0 and entity['Traits'][0]['Name']=='NEGATION'):
                allergies.append(entity['Text'])
        


Preferably the results may be stored in the persistent storage component like dynamo DB, S3 or RDS. Furthermore, the data will be processed with other analytical services like glue, quick sight in AWS. However, below the result of extracted content from Comprehend Medical


print(name, age, allergies)


Abc, 52, [‘Allegra’, ‘asthma’]


Hope this would be informative for you!. Thanks for reading the post. Please do follow me for more topics. 

Recent Posts