How should the AI research community broadly refer to the family of large, pre-trained, self-supervised, scalable models applicable to a fairly broad range of downstream tasks without fine-tuning? E.g., GPT-3, PaLM, Chinchilla, OPT, DALL-E, Flamingo, Gato, etc.
h/t @mmitchell_ai who prompted a new discussion on the de facto adoption of "foundation models" by @StanfordHAI, and suggested "base model" as a more neutral alternative with some precedents in the community. x.com/mmitchell_ai/s…
@raphaelmilliere "Academic politics is the most vicious and bitter form of politics, because the stakes are so low."
FWIW the most neutral yet informative alternative I can think of would simply be "large pre-trained models", but it's not as succinct, and it doesn't reflect the full range of properties these models share. This trade-off is probably inevitable.
@raphaelmilliere Large language models or large multi-modal models. Large pre-trained models also works.
@raphaelmilliere I don't think a single one of these models you cited is self-supervised.
@raphaelmilliere Beyond my area of expertise, but I am reminded of George Box's truism "All models are wrong but some are useful". Seems there's more than just models going on here. How about Autonomous Dynamic Algorithmic Models (or ADAM for short?) 😆
@raphaelmilliere In semiconductors there was LSI, VLSI, etc., which were all descriptive without being evocative. Could be just large pre-trained nets, very large pre-trained nets, VVLPN, etc.