Bidirectional LMs like T5 learn superior representations, but the field mostly trains unidirectional LMs like GPT-3 since the "emergent" property of prompting was never seen in T5. We show that T5 can be prompted, outperforming GPT-3 with 50% fewer params. arxiv.org/abs/2209.14500
Work by Ajay Patel, @bryanlics, and @ccb from @upennnlp, with collaborators @colinraffel, @noahconst, and @rasoolims
@upennnlp A quick summary of this paper. 42papers.com/p/bidirectiona…