[ML News] Llama 3 changes the game
Meta’s Llama 3 is the latest AI model to hit the market, and it comes with a slew of new features that make it an exciting option for developers and researchers alike. With a new model, new license, and new opportunities, Meta’s Llama 3 is poised to revolutionize the field of artificial intelligence.
One of the key selling points of Meta’s Llama 3 is its advanced capabilities. The model has been trained on a vast amount of text data — Meta reports over 15 trillion tokens — allowing it to generate high-quality output across a wide range of language tasks. Whether you’re working on summarization, question answering, code generation, or any other language-related project, Meta’s Llama 3 has you covered.
In addition to its impressive performance, Meta’s Llama 3 also offers a new licensing structure that makes it easier than ever to access and use the model. The updated license allows for greater flexibility in how the model can be deployed and used, opening up new opportunities for collaboration and innovation in the AI community.
Furthermore, Meta’s commitment to trust and safety is evident in their responsible AI practices. They have developed a suite of tools and guidelines aimed at ensuring that AI technologies are used ethically and responsibly. By prioritizing these principles, Meta is setting a new standard for ethical AI development.
Overall, Meta’s Llama 3 represents a significant advancement in the field of artificial intelligence. With its cutting-edge technology, flexible licensing options, and commitment to responsible AI practices, this model is sure to drive innovation and progress in the industry. If you’re looking to take your AI projects to the next level, Meta’s Llama 3 is definitely worth considering.
6:23 "There is enough research to show that once you are capable at one language, you only need quite little data on another language to transfer that knowledge back and forth"
Can anyone point me to papers related to this claim? I am interested in cross-lingual transfer in language models.
I really like the way you frame Meta/Zuck making Llama 3 open source. They chose the option that is best for the company, but what's best changes over time: for research and optimization an open-source model is better, while for profit a closed-source one is better.
What they do just depends on what is best at the moment, but I like that Llama 3 is open source right now and hope it will stay that way!
8:25
I'm not sure why people have reservations about Phi specifically. We don't know what data were used to train the other models either, or to what extent their performance relies on "fitting to the test dataset". Did OpenAI ever disclose what role the human-curated part of their training dataset plays in the model's performance?
"Extremely cool"?
Nah man. It's just a ploy to sink a market they know they can't compete in. It has nothing to do with being cool; it's a strategic move coated in a nice marketing narrative.
What they are doing is destroying the market while increasing the risk of misuse.
Udio is really cool, you guys should check it out. Made me feel like I'm a producer/songwriter.
It's very easy to get models to hallucinate when asking for music recommendations. Llama is no different.
If the training text were plain ASCII and the average token length 4 characters, the training dataset would have been ~55 terabytes of plain ASCII. Wow!
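A quick back-of-envelope check of that figure (a sketch; it assumes Meta's reported ~15 trillion training tokens and 1 byte per ASCII character):

```python
# Rough size of Llama 3's training data if stored as plain ASCII.
# Assumptions: ~15e12 tokens (Meta's reported figure), ~4 chars/token,
# 1 byte per ASCII character.
tokens = 15e12
bytes_total = tokens * 4                 # 6.0e13 bytes
print(f"{bytes_total / 1e12:.0f} TB")    # ~60 TB (decimal)
print(f"{bytes_total / 2**40:.1f} TiB")  # ~54.6 TiB, i.e. the ~55 TB above
```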
No, it does not. It's just another player, and not the best of them all.
Obsessively blaming the safety crowd is, IMO, kinda cringe and lame. It's obvious why OpenAI and Anthropic don't open-source their models: profitability. They don't even pretend otherwise, and they don't use safety as an excuse. Constantly blaming people who care about safety is gonna lead to a rude awakening when Facebook realizes it's tanked enough competitor market share and announces its own fully closed-off monetized models.
> The good that's come from these models far outweighs the bad
Really? Don't get me wrong, I think language models are great but I know people have lost their jobs over this, we've seen data breaches, people are falling in love with AI personas, one guy was driven to suicide, scams are on the rise… I have no shortage of bad things to mention that have come out of AI, but I can't think of anything truly good. I mean I'm sure a good number of people are a bit more productive in their work, but that doesn't seem like a worthy tradeoff to me.
I also disagree with your cavalier attitude towards safety based on past experience. It seems possible to me that as these models become more powerful, we may attain the AI singularity (the ability to self-improve). Once that happens, past experience will have very little wisdom to impart to us regarding what will happen next. It's very possible that we're worried for nothing, but given the scale of what's at stake, it only makes sense to be cautious.
Those t-shirt stripes are an example of reverse CAPTCHA – it spins humans right into dizziness and blackout, but AIs? They just keep watching and learning.
@13:16 "and with the past with Llama 2 we've already seen that all these people who have announced how terrible the world is going to be if we open-source these models have been wrong — have been plainly wrong. The improvement in the field, the good things that have happened undoubtedly, massively, outweigh any sort of bad things that happen, and I don't think there's a big question about that. It's just that the same people now say 'Well okay not this model, but the next model… is really dangerous to release openly.' So this is the next model, and my prediction today is it's going to be just fine, in fact it's going to be amazing releasing this."
@Yannic, that's quite a set of claims. What are all "the good things that have happened" beyond technical advances like more efficient models? I'm sure millions of people are more productive and writing better (or at least spewing grammatically correct verbiage), but are there actual studies of the good things, both for AI in general and for open-source models in particular? Meanwhile, it's unclear how long it will be before we discover the awful uses of AI in the 2024 election cycles of major countries and in other disinformation campaigns.
I'm willing to believe your take, but some evidence for your optimism would be nice.
Do I really need a better LLM than Llama3 70B? If I have a good agent with search, RAG, and memory, isn't that good enough?
Regarding Llama3, Sam looked scared out of his mind in a recent video. ClosedAI sucks.
The scores are somehow high, and it makes me wonder whether they deliberately aligned the curated training data with the validation data when doing the instruction finetuning!!
Old news, there is Phi-3 now.
This has really changed my perspective (from pessimistic to a little more optimistic), both because of the dunking on the doomers and because, by releasing these models and being unapologetic about it, we can start to get rid of the mystique they've been given through the Wizard of Oz game OpenAI was playing. Letting people learn to deal with these systems by themselves and see what's under the hood will, I'm confident, lead to more efficient use of them, something that isn't achievable when the name of the game is just "MAKE MODEL BIGGA! MOAR DATA! MOAR COMPUTE!!!" The power of having generalized approximators is wasted if all you use them for is effectively brute force on a graph.
The thing about data quality cannot be overstated. If we can be rational adults for a second and drop the hype: calling these systems "artificial intelligence" and acting as if they're machine god doesn't change the fact that they're not intelligent, aren't doing anything close to it, and don't have any of the cognitive properties the hypers and the doomers keep attributing to them. They are just functions, literal f(x)'s (granted, big spicy ones). You're fitting a function to data under some optimization procedure.
The relevance of the data is that for neural networks (and their siblings) we have mathematical guarantees that they can fit anything (within reason): they're general-purpose approximators. That's a super useful thing to have! Quite powerful. You know what the weakness is, though? **You can fit anything.** Anything includes things you as a human don't want! And if the thing you don't want generates a signal that can be used to minimize the loss, then the system doing things you don't want is actually working as intended.
Being able to fit anything means that the function you're using to fit ceases to be of central importance, completely shifting the burden onto the data itself. Fitting these models (assuming you pulled it off) just moves the data distribution from an explicit data format into a functional representation.
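A toy illustration of the "you can fit anything" point, as a minimal numpy sketch with made-up data: give a flexible enough function class targets that are pure noise, and the training loss still goes to roughly zero. If the "noise" were undesirable text in a corpus, the model fitting it perfectly would be the optimizer working exactly as intended.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 10)
y = rng.normal(size=10)            # targets with no underlying structure

# A degree-9 polynomial has enough capacity to interpolate all 10 points.
coeffs = np.polyfit(x, y, deg=9)
y_hat = np.polyval(coeffs, x)

print("training MSE:", np.mean((y - y_hat) ** 2))  # ~0: the noise is fit perfectly
```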
Hopefully, this leads to a sobering of the field, and maybe an attempt is made to return to symbolic methods armed with the gains of these models; then maybe, just maybe, an artificial system could not just sound like a human, but reason like one.
Now imagine Llama 4 400b, running on 1-bit Mamba architecture, fully implemented in and optimized for Mojo… 🥹
(I'm basically free associating lol, no idea if that's possible)
Mixture of Depths is a promising direction for modularizing LLMs: you could basically use only part of the model for specific applications. A rough sketch of the routing idea is below.
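To make that concrete, here is a minimal PyTorch sketch of Mixture-of-Depths-style routing (my own simplified reading, not the paper's exact recipe): a per-token router sends only the top-k tokens through an expensive block, while the rest skip it on the residual path. All names and the capacity value are illustrative.

```python
import torch
import torch.nn as nn

class MoDBlock(nn.Module):
    """Simplified Mixture-of-Depths-style layer: only a fraction
    ('capacity') of the tokens are processed by the expensive block;
    the remaining tokens pass through unchanged on the residual path."""

    def __init__(self, d_model: int, capacity: float = 0.5):
        super().__init__()
        self.block = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.router = nn.Linear(d_model, 1)   # per-token scalar score
        self.capacity = capacity

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, D = x.shape
        k = max(1, int(T * self.capacity))
        scores = self.router(x).squeeze(-1)            # (B, T)
        top = scores.topk(k, dim=1).indices            # indices of routed tokens
        idx = top.unsqueeze(-1).expand(-1, -1, D)      # (B, k, D)
        chosen = x.gather(1, idx)                      # tokens sent through the block
        processed = self.block(chosen)
        # Weight the block's contribution by the router score so the
        # routing decision itself receives a gradient.
        w = torch.sigmoid(scores.gather(1, top)).unsqueeze(-1)
        return x.scatter(1, idx, chosen + w * (processed - chosen))

# Usage: route half the tokens of a toy batch through the block.
out = MoDBlock(d_model=512)(torch.randn(2, 16, 512))
print(out.shape)  # torch.Size([2, 16, 512])
```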
More paper reviews please
I don't know what everyone is hyped up about. Llama-3-8B (or at least the Dolphin versions) sucks compared to the best-performing Mistral-7B finetunes/merges (such as NeuralBeagle).
Cool to have great open LLMs. Unfortunately, the same isn't true for image generation models: none of the recent advanced ones, like SDXL or Photoshop's generative tools, are free for commercial use.
Interesting times in many ways.