An important observation made by Toutanova & Chen (2015)[1]: simple observed features form a strong baseline for knowledge base completion. They experimented with features as simple as whether the subject and object co-occur in a known triple, and whether the subject and relation do.
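The observed-features idea can be sketched in a few lines. This is a minimal illustration, not Toutanova & Chen's actual feature set; the toy triples and feature names are assumptions made up for the example.

```python
# Minimal sketch of observed features for KBC, in the spirit of
# Toutanova & Chen (2015). Triples and feature names are illustrative.

known_triples = {("obama", "born_in", "hawaii"),
                 ("obama", "president_of", "usa")}

subj_obj_pairs = {(h, t) for h, _, t in known_triples}
subj_rel_pairs = {(h, r) for h, r, _ in known_triples}

def observed_features(h, r, t):
    """Binary features: has (h, t) co-occurred in any known triple,
    and has h appeared with relation r?"""
    return {
        "subj_obj_seen": (h, t) in subj_obj_pairs,
        "subj_rel_seen": (h, r) in subj_rel_pairs,
    }

print(observed_features("obama", "citizen_of", "usa"))
# -> {'subj_obj_seen': True, 'subj_rel_seen': False}
```

A linear model over such features is the kind of baseline the paper shows is hard to beat.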

Nickel et al. (2015)[2] survey various approaches.


Knowledge base (in)completeness

[Figure: Table 1 in Min et al. (2013)]

Min et al. (2013)[3] presented a valuable analysis of the completeness of knowledge bases. Table 1 (see figure) shows that many important attributes of people are missing. This is especially problematic when the KB itself is used for evaluation. The authors show that a distant supervision algorithm suggests many (unlabeled) relations that are absent from the KB but actually correct.

Approaches

Probabilistic logics

One can cast KBC as logical inference and apply general frameworks such as Markov Logic Networks, Probabilistic Inductive Logic Programming, and Probabilistic Soft Logic.

Graph-based

Path Ranking Algorithm (Lao et al., 2011)[4]
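The core idea behind PRA can be sketched as follows: relation paths connecting a head to a tail entity serve as features for predicting a target relation. Full PRA estimates path probabilities with random walks; this toy version, whose graph and entity names are made up for illustration, only enumerates short paths.

```python
# Hedged sketch of the idea behind the Path Ranking Algorithm
# (Lao et al., 2011): relation paths from h to t become features.
# The toy graph below is an illustrative assumption.

from collections import defaultdict

edges = defaultdict(list)   # entity -> [(relation, entity)]
for h, r, t in [("alice", "born_in", "paris"),
                ("paris", "capital_of", "france"),
                ("alice", "works_for", "acme")]:
    edges[h].append((r, t))

def relation_paths(h, t, max_len=2):
    """All relation sequences leading from h to t, up to max_len hops."""
    paths = []
    stack = [(h, [])]
    while stack:
        node, path = stack.pop()
        if node == t and path:
            paths.append(tuple(path))
        if len(path) < max_len:
            for r, nxt in edges[node]:
                stack.append((nxt, path + [r]))
    return paths

print(relation_paths("alice", "france"))
# -> [('born_in', 'capital_of')]
```

In PRA proper, each path type gets a weight learned by logistic regression, and its feature value is the random-walk probability of reaching t from h along that path.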

Distributed/neural network

compositional training (COMP) (Guu et al., 2015), RNN model (Neelakantan et al., 2015) and PTransE (Lin et al., 2015a)
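These models build on translation-based embedding scores. A minimal sketch of the TransE-style score that PTransE and COMP extend, using random stand-in vectors rather than trained embeddings:

```python
# Hedged sketch of a translation-based scoring function in the TransE
# family: a triple (h, r, t) is plausible when the head embedding,
# translated by the relation vector, lands near the tail embedding.
# The embeddings below are random stand-ins, not trained vectors.

import numpy as np

rng = np.random.default_rng(0)
dim = 50
entity_emb = {e: rng.normal(size=dim) for e in ["obama", "usa", "hawaii"]}
relation_emb = {r: rng.normal(size=dim) for r in ["president_of"]}

def transe_score(h, r, t):
    """Negative L2 distance -||h + r - t||: higher means more plausible."""
    return -np.linalg.norm(entity_emb[h] + relation_emb[r] - entity_emb[t])
```

PTransE extends this by composing relation vectors along multi-hop paths; COMP trains embeddings so such compositions answer path queries directly.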

TODO: neighborhood mixture model (Nguyen et al., 2016)[5]

Evaluation

Link (argument) prediction

From Lin et al. (2015)[6]: "Link prediction aims to predict the missing h or t for a relation fact triple (h, r, t), used in (Bordes et al. 2011; 2012; 2013). In this task, for each position of missing entity, the system is asked to rank a set of candidate entities from the knowledge graph, instead of only giving one best result. [...] In testing phase, for each test triple (h, r, t), we replace the head/tail entity by all entities in the knowledge graph, and rank these entities in descending order of similarity scores calculated by score function fr. Following (Bordes et al. 2013), we use two measures as our evaluation metric: (1) mean rank of correct entities; and (2) proportion of correct entities in top-10 ranked entities (Hits@10). A good link predictor should achieve lower mean rank or higher Hits@10. In fact, a corrupted triple may also exist in knowledge graphs, which should be also considered as correct. However, the above evaluation may underestimate those systems that rank these corrupted but correct triples high. Hence, before ranking we may filter out these corrupted triples which have appeared in knowledge graph. We name the first evaluation setting as “Raw” and the latter one as “Filter”."
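The protocol above can be sketched directly. This is a schematic implementation for tail prediction under stated assumptions: `score` stands in for the model's scoring function fr, and the function names are invented for the example; head prediction is symmetric.

```python
# Hedged sketch of link-prediction evaluation (mean rank, Hits@10) with
# the "Raw" and "Filter" settings described by Lin et al. (2015).
# `score` is any function (h, r, t) -> real; names are illustrative.

def evaluate_tail(test_triples, all_entities, known_triples, score,
                  filtered=False):
    ranks = []
    for h, r, t in test_triples:
        candidates = []
        for e in all_entities:
            # "Filter" setting: drop corrupted triples that are
            # themselves known to be correct (keep the test tail).
            if filtered and e != t and (h, r, e) in known_triples:
                continue
            candidates.append(e)
        # Rank candidates by descending score; record the true tail's rank.
        ranked = sorted(candidates, key=lambda e: score(h, r, e),
                        reverse=True)
        ranks.append(ranked.index(t) + 1)
    mean_rank = sum(ranks) / len(ranks)
    hits_at_10 = sum(1 for rk in ranks if rk <= 10) / len(ranks)
    return mean_rank, hits_at_10
```

Real implementations vectorize the scoring over all entities at once, but the ranking logic is the same.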

Triple classification

From Lin et al. (2015)[6]: "Triple classification aims to judge whether a given triple (h, r, t) is correct or not. This is a binary classification task, which has been explored in (Socher et al. 2013; Wang et al. 2014) for evaluation."
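In practice this task is usually scored with a relation-specific threshold tuned on development data: a triple is accepted when its score exceeds the threshold for its relation. A minimal sketch, assuming any scoring function and using a brute-force threshold search (the function names and data are illustrative, not from the cited papers):

```python
# Hedged sketch of triple classification with per-relation score
# thresholds, the usual protocol for this task. Names are illustrative.

def tune_threshold(pos_scores, neg_scores):
    """Pick the cutoff (among observed scores) maximizing dev accuracy."""
    best_thr, best_acc = 0.0, -1.0
    total = len(pos_scores) + len(neg_scores)
    for thr in pos_scores + neg_scores:
        acc = (sum(s >= thr for s in pos_scores) +
               sum(s < thr for s in neg_scores)) / total
        if acc > best_acc:
            best_thr, best_acc = thr, acc
    return best_thr

def classify(h, r, t, score, thresholds):
    """Accept (h, r, t) iff its score clears the threshold for r."""
    return score(h, r, t) >= thresholds[r]
```

Negative examples are typically generated by corrupting the head or tail of gold triples, since KBs contain no explicit negatives.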

See also

References

  1. Toutanova, Kristina, and Danqi Chen. "Observed versus latent features for knowledge base and text inference." Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality. 2015.
  2. Nickel, Maximilian, Kevin Murphy, Volker Tresp, and Evgeniy Gabrilovich. "A Review of Relational Machine Learning for Knowledge Graphs." arXiv preprint. 2015.
  3. Min, Bonan, Ralph Grishman, Li Wan, Chang Wang, and David Gondek. "Distant supervision for relation extraction with an incomplete knowledge base." Proceedings of NAACL-HLT. 2013. Pages 777–782.
  4. Lao, Ni, Tom Mitchell, and William W. Cohen. "Random walk inference and learning in a large scale knowledge base." Proceedings of the Conference on Empirical Methods in Natural Language Processing. 2011. Pages 529–539.
  5. Nguyen, Dat Quoc, Kairit Sirts, Lizhen Qu, and Mark Johnson. "Neighborhood Mixture Model for Knowledge Base Completion." 2016.
  6. Lin, Yankai, Zhiyuan Liu, Maosong Sun, Yang Liu, and Xuan Zhu. "Learning Entity and Relation Embeddings for Knowledge Graph Completion." Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence. 2015. Pages 2181–2187.

External links