MIT Technology Review’s What’s Next series looks across industries, trends, and technologies to give you a first look at the future.
Here’s their pick of AI trends to watch out for in 2024.
You get a chatbot! And you get a chatbot! In 2024, tech companies that invested heavily in generative AI will be under pressure to prove that they can make money off their products.To do this, AI giants Google and OpenAI are betting big on going small: both are developing user-friendly platforms that allow people to customize powerful language models and make their own mini chatbots that cater to their specific needs—no coding skills required. Both have launched web-based tools that allow anyone to become a generative-AI app developer.
In 2024, generative AI might actually become useful for the regular, non-tech person, and we are going to see more people tinkering with a million little AI models. State-of-the-art AI models, such as GPT-4 and Gemini, are multimodal, meaning they can process not only text but images and even videos. This new capability could unlock a whole bunch of new apps. For example, a real estate agent can upload text from previous listings, fine-tune a powerful model to generate similar text with just a click of a button, upload videos and photos of new listings, and simply ask the customized AI to generate a description of the property.
It’s amazing how fast the fantastic becomes familiar. The first generative models to produce photorealistic images exploded into the mainstream in 2022—and soon became commonplace. Tools like OpenAI’s DALL-E, Stability AI’s Stable Diffusion, and Adobe’s Firefly flooded the internet with jaw-dropping images of everything from the pope in Balenciaga to prize-winning art. But it’s not all good fun: for every pug waving pompoms, there’s another piece of knock-off fantasy art or sexist sexual stereotyping.
The new frontier is text-to-video. Expect it to take everything that was good, bad, or ugly about text-to-image and supersize it.
It’s no surprise that top studios are taking notice. Movie giants, including Paramount and Disney, are now exploring the use of generative AI throughout their production pipeline. The tech is being used to lip-sync actors’ performances to multiple foreign-language overdubs. And it is reinventing what’s possible with special effects. In 2023, Indiana Jones and the Dial of Destiny starred a de-aged deepfake Harrison Ford. This is just the start.
Away from the big screen, deepfake tech for marketing or training purposes is taking off too. For example, UK-based Synthesia makes tools that can turn a one-off performance by an actor into an endless stream of deepfake avatars, reciting whatever script you give them at the push of a button. According to the company, its tech is now used by 44% of Fortune 100 companies.
If recent elections are anything to go by, AI-generated election disinformation and deepfakes are going to be a huge problem as a record number of people march to the polls in 2024. We’re already seeing politicians weaponizing these tools.
Just a few years ago creating a deepfake would have required advanced technical skills, but generative AI has made it stupidly easy and accessible, and the outputs are looking increasingly realistic.
The coming year will be pivotal for those fighting against the proliferation of such content. Techniques to track and mitigate it content are still in early days of development. Watermarks, such as Google DeepMind’s SynthID, are still mostly voluntary and not completely foolproof. And social media platforms are notoriously slow in taking down misinformation. Get ready for a massive real-time experiment in busting AI-generated fake news.
Inspired by some of the core techniques behind generative AI’s current boom, roboticists are starting to build more general-purpose robots that can do a wider range of tasks.
The last few years in AI have seen a shift away from using multiple small models, each trained to do different tasks—identifying images, drawing them, captioning them—toward single, monolithic models trained to do all these things and more. By showing OpenAI’s GPT-3 a few additional examples (known as fine-tuning), researchers can train it to solve coding problems, write movie scripts, pass high school biology exams, and so on. Multimodal models, like GPT-4 and Google DeepMind’s Gemini, can solve visual tasks as well as linguistic ones.
The same approach can work for robots, so it wouldn’t be necessary to train one to flip pancakes and another to open doors: a one-size-fits-all model could give robots the ability to multitask. Several examples of work in this area emerged in 2023.
The problem is a lack of data. Generative AI draws on an internet-size data set of text and images. In comparison, robots have very few good sources of data to help them learn how to do many of the industrial or domestic tasks we want them to.
Lerrel Pinto at New York University leads one team addressing that. He and his colleagues are developing techniques that let robots learn by trial and error, coming up with their own training data as they go. In an even more low-key project, Pinto has recruited volunteers to collect video data from around their homes using an iPhone camera mounted to a trash picker. Big companies have also started to release large data sets for training robots in the last couple of years, such as Meta’s Ego4D.
This approach is already showing promise in driverless cars. Startups such as Wayve, Waabi, and Ghost are pioneering a new wave of self-driving AI that uses a single large model to control a vehicle rather than multiple smaller models to control specific driving tasks. This has let small companies catch up with giants like Cruise and Waymo. Wayve is now testing its driverless cars on the narrow, busy streets of London. Robots everywhere are set to get a similar boost.
This column does not necessarily reflect the opinion of overwrite.ai and its owners.
overwrite | real estate content creation, reimaginedFor Full Article: https://www.technologyreview.com/2024/01/04/1086046/whats-next-for-ai-in-2024/?_se=bWFyaWFtQG92ZXJ3cml0ZS5haQ%3D%3D