Newsletter #1: 2022-12-07

Hi everyone! Welcome to the first and hopefully not the last of our little AICC “this week in machine learning” summaries.

A lot has been happening in the world of machine learning in the past few weeks. First, there was Meta/Facebook’s Galactica, a large language model trained specifically on scientific research. It was a bit of a technical and PR disaster for Meta. I’ll let Janelle Shane (of You Look Like a Thing… fame) explain a little bit more about how goofy it was during the whopping three days it was live.

Another interesting retrospective on Galactica came from the writer Eryk Salvaggio.

If you’d like some good, if snarky, viewing related to AI hype and missteps, there are the Distributed AI Research Institute’s videos here on this peertube instance. They have a video about Galactica that should be posted in the next couple of weeks, but for now there’s a solid few hours of AI Hype Theatre to watch!

The next big LLM result that has people talking is ChatGPT. It’s been called “GPT-3.5” by some folks. It’s been fine-tuned with human feedback in the loop, and it’s got a nice shiny interface that makes interacting with it feel like a dialogue! It’s led to a lot of discussion, though, about the role of LLMs in knowledge management, search, and education. Here are some interesting threads and posts I’ve found in the past week:

Along with this, Hugging Face has released a new course on using Stable Diffusion and their diffusers library. And they had a really cool live event talking about the technology and where it’s going. My favorite was the second talk, on tweaking Stable Diffusion in a variety of ways. Watch until near the end for a really cool demonstration of next-frame prediction, creating videos from a single image.
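If you want to poke at the diffusers library yourself before diving into the course, here’s a minimal text-to-image sketch. The checkpoint name and prompt are just illustrative; it assumes you have a CUDA GPU and have installed the diffusers, transformers, and torch packages.

```python
# A minimal text-to-image sketch using Hugging Face's diffusers library.
# The checkpoint below is one publicly available Stable Diffusion model;
# any compatible checkpoint should work the same way.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,  # half precision to fit on smaller GPUs
)
pipe = pipe.to("cuda")

# Generate a single image from a text prompt and save it to disk.
image = pipe("an astronaut riding a horse, watercolor").images[0]
image.save("astronaut.png")
```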

Now for a little bit of my own synthesis: I think LLMs could be really useful for education, research, &c., but these last two big releases show that the problem is how we’re approaching them. I don’t think they can ever be used as uncritical answer generators. They are, fundamentally, text generators, not knowledge generators. They can produce some surprisingly interesting and powerful artifacts, but they also tend to make up citations whole-cloth, attribute real papers to people who don’t exist, and reference research that never happened. They generate the form of knowledge but not the content.

What would it look like to interact with one of these devices in a way that is about assisting us rather than doing work for us?