
2018 Spring Course
<Bio-medicine Text Mining and Knowledge Discovery>

Graduate course "Bio-medicine Text Mining and Knowledge Discovery"




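The 2018 offering largely follows the 2017 syllabus, with minor additions and deletions. The semester is built around 7 case studies that guide students step by step through using code to solve problems. Chapter 1 borrows Dr. Alex Chengyu Fang's word computation case; Chapter 2 introduces word cloud computation; Chapter 3 introduces GO; Chapter 4 covers the use of datasets such as HPO, SIDER, and DRUGBANK; Chapter 5 covers entity recognition for mutations, proteins, and compounds, together with Shell programming; Chapter 6 covers GloVe word embeddings and semantic computation; Chapter 7 covers event extraction; Chapter 8 is an overall review.


Teaching notes:
Judging from how the course ran this year, the 7 problems follow the principle of increasing difficulty, and the design is reasonable. Because of limited class hours, some new NLP theories and methods could not be covered in depth; adding class hours will be considered for future offerings.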


                                       
   Outline 


o Week 1: (9th, Mar)
• Chapter 0. Introduction to BioNLP and This Course
• Chapter 1. First Class on Linux and Lexicon Analysis - Alex's Case (Instructor's slides, download here.)
Data: Brown Corpus Sample.
史志茹: "Token/Term Ratio" (Slides download here.)
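
For readers who want to try the Week 1 topic, here is a minimal sketch of a token/term ratio, assuming "tokens" means all word occurrences and "terms" means distinct lowercase word forms. The tokenizer regex and the function name `token_term_ratio` are illustrative assumptions, not the code used in class, and the sample sentence stands in for the Brown Corpus sample.

```python
import re
from collections import Counter


def token_term_ratio(text):
    """Return (n_tokens, n_terms, tokens per distinct term) for a text.

    Assumption: a token is a run of ASCII letters, optionally with one
    internal apostrophe (e.g. "alex's"); case is ignored.
    """
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    counts = Counter(tokens)          # term -> number of occurrences
    n_tokens = len(tokens)
    n_terms = len(counts)
    ratio = n_tokens / n_terms if n_terms else 0.0
    return n_tokens, n_terms, ratio


# Stand-in text; in the course the input would be the corpus sample.
sample = "The cat sat on the mat. The dog sat on the log."
n_tokens, n_terms, ratio = token_term_ratio(sample)
print(n_tokens, n_terms, round(ratio, 3))
```

A higher ratio means each term is reused more often; comparing the ratio across texts of similar length gives a rough measure of lexical variety.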

o Week 2: (16th, Mar)
• Chances to earn points: 3 talks (8-10 mins each)
• Chapter *. Idea of this Course
• Chapter 2. R Programming & Word Cloud (Slides download here)
杨晨: "Word Cloud Code Walkthrough" (Slides download here.)
郭梦瑶: "Word Cloud Applications" (Slides download here.)
钱胜: "Rapid hybrid speciation in Darwin's finches" (Slides download here.)

o Week 3: (23rd, Mar)
• Chances to earn points: 3 talks (8-10 mins each)
• Chapter 3. Gene Ontology (Dataset, Text Retrieval, and Ontology) (Slides download here)

Suggested link: ftp://ftp.geneontology.org/go/www/GO.tools_by_type.term_enrichment.shtml
Suggested paper: AgriGO - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2896167/
付海涛: "GO Enrichment Analysis" (Slides download here.) (Codes and Data download here.)
万靖: "GO Analysis and Text Mining of Circadian Rhythm" (Slides download here.)
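
As background for the GO enrichment talks, the following is a minimal sketch of the one-sided hypergeometric test that is commonly used for term enrichment. It is not the code distributed in class; the function name `hypergeom_enrichment_p` and all gene counts in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(N, K, n, k):
    """P(X >= k) for X ~ Hypergeometric.

    N: genes in the background set
    K: background genes annotated with the GO term
    n: genes in the study list
    k: study-list genes annotated with the GO term
    """
    lower = max(k, 0)
    upper = min(K, n)
    total = comb(N, n)
    # Sum the probability of seeing lower..upper annotated genes in the list.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(lower, upper + 1))
    return tail / total


# Hypothetical counts, for illustration only.
p = hypergeom_enrichment_p(N=20000, K=200, n=100, k=8)
print(f"enrichment p-value: {p:.3g}")
```

In practice one such p-value is computed per GO term and then corrected for multiple testing, which this sketch leaves out.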

o Week 4: (30th, Mar)
• Chances to earn points: 3 talks (8-10 mins each)
• Chapter 4. Side Effect and Human Phenotype Ontology (Case study for drug repurposing)
HPO disease-gene-phenotype link reference (download here).

许璇: "Psoriasis Drug Repurposing" (Slides download here.)
佟馨宇: "Case Study for Drug Repurposing" (Slides download here.)

o Week 5: (8th, Apr)
• Chances to earn points: 3 talks (8-10 mins each)
• Chapter 5. Clinical Trial and Its R Package Application (Download the slides)
• Chapter 5'. NER and Shell Programming. (The shell code was distributed in class. Please use the curl command cautiously. Please...)
薛雅文: "Implementing NLP with Shell Programming" (Slides download here.)
廖璇: "Shell Programming for NER" (Slides download here.)

o Week 6: (13th, Apr)
• Chances to earn points: 3 talks (8-10 mins each)
• Chapter 6'. GloVe, a starting point for semantic computation.
丁可: "GloVe, Global Vectors for Word Representation" (Download the slides)
章胜: "TensorFlow Implementation of GloVe" (Codes and data)

李桐: "The GloVe Idea and Codon Usage Bias" (Slides download here.)
周开银: "The Impact of Word Vectors on Data Mining Tasks" (Slides download here.)
张亚亮: "GloVe for PubMed Texts" (Slides, Codes)
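
To make "semantic computation" concrete, here is a minimal sketch of reading word vectors in the plain-text GloVe format (one word per line, followed by space-separated floats) and ranking neighbours by cosine similarity. The helper names and the two-dimensional toy vectors are made up for illustration; they are not trained embeddings and not the code from the talks.

```python
import math


def load_vectors(lines):
    """Parse GloVe-style text lines into a dict of word -> list of floats."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest(vectors, word, topn=3):
    """Return the topn (word, similarity) pairs closest to the given word."""
    target = vectors[word]
    scored = [(w, cosine(target, vec)) for w, vec in vectors.items() if w != word]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]


# Toy vectors, invented for this example. With a real file, pass an open
# file handle instead, e.g. load_vectors(open(path, encoding="utf-8")).
toy = ["aspirin 1.0 0.0", "salsalate 0.9 0.1", "finch 0.0 1.0"]
vectors = load_vectors(toy)
print(nearest(vectors, "aspirin", topn=2))
```

With real embeddings trained on PubMed text, the same ranking step is what lets related drugs, genes, or phenotypes surface as neighbours of a query word.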

o Week 7: (20th, Apr)
• Chances to earn points: 3 talks (8-10 mins each)
• Chapter 6. Event Extraction and Knowledge Discovery (a little Python coding and parse trees)
Text data set 1: SE for salsalate (Download here.)
Text data set 2: SE for Viagra (Download here.)

毛盛强: "Event Extraction and Knowledge Discovery" (Slides download here.)
王书言: "Discovering the Side Effects of Salsalate and Viagra with MetaMap and SemRep" (Slides download here.)

o Week 8: (27th, Apr)
• Chances to earn points: 2 talks (8-10 mins each)
• Conclusion Talk <History and Future of Bio Text Mining>

​


History links:
2017 Spring Course
2016 Spring Course
Alex's related course <Corpora based text processing for bioinformatics>
