Of patterns and coreference resolution. Figure 4 illustrates an example in which our system missed two PPIs because it lacks the coreference information needed to infer them. In this example, our system can detect an NP pair (a novel factor, PGDF alpha) according to Pattern 5. However, the system could not identify any relation because the first NP does not contain any entity. In fact, there are two PPIs between ‘PGDF alpha’ and the two coreferents of ‘a novel factor’, namely ‘Platelet-derived growth factor’ and ‘PDGF-C’. We investigated 100 false negative PPIs on the AIMed corpus and found that 21 of them (21%) were caused by this error. Clearly, if PASMED could perform accurate coreference resolution, it would cover more interactions.

Another solution would be to create more patterns to capture interaction expressions, such as ‘an interaction between A and B’, ‘a complex of/between A and B’, ‘A-B complex’, and ‘A-B binding’. Of the 100 false negatives, 28 match these expressions. However, these patterns are not general enough for all types of relations; they are specific to PPI and DDI.

Semantic relations in MEDLINE

PASMED has been applied to the whole of MEDLINE and has extracted more than 137 million semantic relations in the format (entity 1, relation phrase, entity 2). The ten most frequent types of relations are listed in Table 9. The most common semantic relation type is the relation between ‘Amino Acid, Peptide or Protein’ entities, which number up to 3.4 million. This partially explains why PPI has been attracting considerable attention in the BioNLP community. Many previous studies focus on improving PPI performance [3-5]. There are also many large-scale databases constructed from MEDLINE that focus on PPI, e.g., MedScan [45], AliBaba [46], and the one by Chowdhary et al. [47].
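As an illustration of how a ranking of relation types such as the one in Table 9 could be derived from extracted triples, the following is a minimal sketch. It is not the PASMED implementation. The `Relation` record, its two semantic-type fields, the function `rank_relation_types`, and the choice to treat a type pair as unordered are all assumptions made for this example, and the input is toy data rather than actual extraction output.

```python
from collections import Counter
from typing import NamedTuple


class Relation(NamedTuple):
    """One extracted relation (entity 1, relation phrase, entity 2).

    The two *_type fields are hypothetical additions holding the
    semantic type assigned to each entity.
    """
    entity1: str
    entity1_type: str
    phrase: str
    entity2: str
    entity2_type: str


def rank_relation_types(relations, top_n=10):
    """Count relations per pair of semantic types and return the top_n.

    Assumption: a type pair is unordered, so (Cell, Protein) and
    (Protein, Cell) are counted as the same relation type.
    """
    counts = Counter(
        tuple(sorted((r.entity1_type, r.entity2_type))) for r in relations
    )
    return counts.most_common(top_n)


PROTEIN = "Amino Acid, Peptide or Protein"
CELL = "Cell"

# Toy input for illustration only; not taken from the PASMED output.
toy_relations = [
    Relation("protein A", PROTEIN, "binds to", "protein B", PROTEIN),
    Relation("protein C", PROTEIN, "interacts with", "protein D", PROTEIN),
    Relation("protein E", PROTEIN, "activates", "protein F", PROTEIN),
    Relation("cell X", CELL, "expresses", "protein A", PROTEIN),
    Relation("protein B", PROTEIN, "is localized in", "cell Y", CELL),
]

ranking = rank_relation_types(toy_relations)
for (type_a, type_b), count in ranking:
    print(f"{type_a} / {type_b}: {count}")
```

Sorting each type pair before counting is a design choice of this sketch; if the direction of a relation matters, the pair can be kept in its original order instead.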
Another type of relation that is also extensively studied in the community is the relation between genes and proteins, which is ranked third in Table 9. As with PPI, there are many studies and databases related to this type of relation, such as Chilibot [48], MEDIE [49], EVEX [50] and the BioNLP Shared Task [9].

Figure 3 Examples of correctly extracted relations that are treated as false positives according to the annotated PPI and DDI corpora. (a) ‘associated_with’ relation. (b) ‘is_a’ relation.

Nguyen et al. BMC Bioinformatics (2015) 16:Page 9 of

Figure 4 An example of two PPIs that need coreference information to be identified. Our system can detect an NP pair according to Pattern 5 but cannot extract any relations.

The second most common type of relation in our extraction results is the one between cell and protein entities, which appeared more than 3.1 million times in MEDLINE. This type of relation includes many localization and whole-part relations, which are potentially very useful in biology. These relations are covered partially by localization events in the GENIA corpus. The events are represented as ‘Localization of Protein to Location’, where Location can be a cell. Recently, the CG task [51] has also targeted events of the form ‘Localization of Proteins at/from/to Cells’. Somewhat unexpectedly, the relations between genes and diseases, which are another important type of biomedical relation [52], turned out to be much less common than PPIs. More specifically, they ranked 41st, and the number of such relations extracted from MEDLINE was about 583,000. The last column in Table 9 shows tha.