Leo Boytsov @srchvrs, Twitter Profile

Leo Boytsov @srchvrs

2 weeks ago

This is probably the first encoder-decoder model in the recent model race.

Reka @RekaAILabs

2 weeks ago

Xin Eric Wang @xwang_lk

2 weeks ago

@srchvrs Most of the multimodal LLMs I know are enc-dec models, which I believe include Gemini and GPT-4V.

Leo Boytsov @srchvrs

2 weeks ago

@xwang_lk I didn't realize this. Thank you!

Deen Kun A. @sir_deenicus

2 weeks ago

@srchvrs @xwang_lk I don't think I agree with this, since this looks like a T5-type architecture. Enc-dec from start/text, which is not common.

Leo Boytsov @srchvrs

2 weeks ago

@sir_deenicus @xwang_lk Agree it's not common. I have double-checked that Gemini is decoder-only.

Xin Eric Wang @xwang_lk

2 weeks ago

@srchvrs @sir_deenicus Uncommon for text-only LLMs but common for multimodal LLMs. I think you are referring to text-only Gemini.

@srchvrs @sir_deenicus By Gemini. Currently, it is very difficult to build a decoder-only model that encodes video, image, text, and audio all within the same sequence. It is worth researching but quite challenging.

ëugene kharitonov 🏴‍☠️ @n0mad_0

2 weeks ago

@xwang_lk @srchvrs @sir_deenicus google-research.github.io/seanet/audiopa… AudioPaLM is text-audio decoder-only
