
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
2018-10-11
transformers, nlp
Abstract
This paper introduces BERT (Bidirectional Encoder Representations from Transformers), a language representation model pre-trained on unlabeled text by jointly conditioning on left and right context, using a masked language modeling objective together with a next-sentence prediction task. The pre-trained model can be fine-tuned with a single additional output layer for a wide range of tasks, and the paper reports empirical results on benchmarks such as GLUE and SQuAD that helped shape subsequent work in transformers and NLP.
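To make the masked language modeling objective concrete, here is a minimal sketch of BERT-style input corruption: roughly 15% of token positions are selected, and of those, 80% are replaced with a `[MASK]` token, 10% with a random token, and 10% are left unchanged. The toy vocabulary and the `mask_tokens` helper are illustrative assumptions, not code from the paper.

```python
import random

MASK = "[MASK]"
VOCAB = ["the", "cat", "dog", "sat", "on", "mat"]  # toy vocabulary (assumption)

def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """BERT-style MLM corruption (illustrative sketch).

    Selects ~mask_prob of positions; of those, 80% become [MASK],
    10% become a random vocabulary token, 10% stay unchanged.
    Returns (corrupted, labels): labels hold the original token at
    selected positions and None elsewhere (only selected positions
    contribute to the prediction loss).
    """
    rng = rng or random.Random(0)
    corrupted, labels = list(tokens), [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            labels[i] = tok
            r = rng.random()
            if r < 0.8:
                corrupted[i] = MASK
            elif r < 0.9:
                corrupted[i] = rng.choice(VOCAB)
            # else: token kept as-is, but still predicted
    return corrupted, labels
```

During pre-training, the model is trained to predict the original token at each labeled position from the corrupted sequence, which is what forces the encoder to use bidirectional context.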