Llama 3 was trained using intra-document causal masking, as suggested by @yuzhaouoe's paper "Analysing The Impact of Sequence Composition on Language Model Pre-Training"! 🚀🚀🚀 arxiv.org/abs/2402.13991
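For anyone unfamiliar with the technique: when multiple documents are packed into one training sequence, intra-document masking restricts attention so tokens cannot attend across document boundaries. A minimal sketch below (function name, shapes, and the `doc_ids` representation are illustrative assumptions, not Llama 3's actual implementation):

```python
import torch

def intra_document_causal_mask(doc_ids: torch.Tensor) -> torch.Tensor:
    """Build an attention mask for a packed sequence of documents.

    doc_ids: (seq_len,) tensor where doc_ids[i] is the index of the
    document that token i belongs to. A token may attend only to
    earlier tokens (causal) within the same document (intra-document).
    Returns a (seq_len, seq_len) boolean mask; True = attention allowed.
    """
    seq_len = doc_ids.shape[0]
    causal = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
    same_doc = doc_ids.unsqueeze(0) == doc_ids.unsqueeze(1)
    return causal & same_doc

# Example: three documents packed into one 8-token sequence.
doc_ids = torch.tensor([0, 0, 0, 1, 1, 2, 2, 2])
mask = intra_document_causal_mask(doc_ids)
# mask[4, 3] is True  (token 4 attends to token 3, same document)
# mask[4, 2] is False (token 2 belongs to a different document)
```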
@PMinervini @yuzhaouoe Segmentation masks are pretty fundamental stuff that a codebase should have, though...
@PMinervini @yuzhaouoe Nice observation, this is indeed helpful -- but isn't this just a basic feature? I thought NLP researchers were doing this all the time, maybe not as efficiently as we did for Llama 3 in terms of implementation, e.g., using padding :)
@PMinervini @yuzhaouoe It’s wild that this isn’t standard