When AI meets Product: December’24 AI Product Updates

Keeping up to date with new AI models, products, ethics, and trends

Anna Via
7 min read · 3 days ago

Welcome to the December edition of “When AI Meets Product — AI Product Updates”. December was a very busy month in the field of AI, and in this edition you’ll find the most important updates:

  • GenAI Model Updates: almost all major players launched new models last month, some focused on advanced reasoning capabilities in large multimodal models, others on improving smaller text-only models.
  • AI Product Updates: updates on AI agent applications and AI-powered search features across a variety of digital products.
  • AI Ethics and Legislation Updates: covering the shift from evaluating models to the need to consider entire AI systems.
  • Industry Trends: key predictions for 2025 from leading voices in the tech and AI industry.

GenAI Model Updates

December brought a wave of new releases and updates from almost all of the GenAI model providers:

  • OpenAI celebrated Christmas in its own way, with a 12-day series of updates that included the launch of the o1 models, which use chain-of-thought reasoning and agentic loops to improve reasoning capabilities. A few days later, o3 and o3-mini were announced (skipping o2 due to trademark constraints with Telefonica), further extending o1’s reasoning capabilities. All the “o” versions are said to include new alignment strategies that should also make these models safer. In another fun turn of events, ChatGPT was made available through phone calls and WhatsApp. On a less festive note for users who want to access the most powerful models through the ChatGPT interface, there is a new subscription tier at $200/month. Finally, Sora, their video generation tool, became available to the wider public.
  • The relatively under-the-radar Hangzhou-based AI lab DeepSeek gained attention with the launch of DeepSeek-R1, a model that also adds reasoning steps to produce higher-quality answers. It is said to rival the performance of OpenAI’s o1 models.
  • Google introduced Gemini 2.0, designed to integrate multiple reasoning steps such as think, plan, remember, and take action. This new model is also focused on enabling multiple agents built on top of it. Their new video generation model (Veo 2) also generated a lot of interest, as its output looks more realistic than that of other models seen so far, as did their new foundation world model for games (Genie 2).
  • Amazon has finally entered the competitive landscape of AI model providers with the launch of their Nova models. They range from a small text-to-text model to a large multimodal model, alongside image and video generation models. Their pricing strategy is of special interest: they claim the models will be “at least 75 percent less expensive than the best performing models in their respective intelligence classes in Amazon Bedrock”. Another important Bedrock update is that it now enables the deployment of 83 open models from Hugging Face, further expanding the model choices available to AWS clients.
  • Meta launched Llama 3.3, a 70-billion-parameter text-only model designed to improve performance for text-based applications compared to previous versions of Llama.
  • Microsoft also launched a competitive small model, Phi-4 (14B parameters), which specializes in complex reasoning tasks.

Feeling overwhelmed by the multitude of AI model options on the market? You’re not alone!

When deciding which model to use, the best place to start is assessing general model capabilities, for example through model comparison platforms such as Chatbot Arena. On top of general capabilities, factors such as pricing, latency, and ease of integration should also be part of your evaluation. The best choice will always depend on the specific use case, so be sure to test your top model candidates against your own requirements, using the evaluations and metrics that matter for that use case.

Chatbot Arena, an example of a model comparison platform
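To make the selection process described above more concrete, here is a minimal Python sketch of one possible way to combine quality, pricing, and latency into a single weighted ranking. Everything in it is an assumption for illustration: the `Candidate` fields, the model names, the weights, and the numbers are placeholders, not real benchmark results or prices. In practice the quality score would come from your own evaluation set.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    quality: float      # score on your own evaluation set, 0..1 (higher is better)
    cost_per_1k: float  # price per 1,000 requests (lower is better)
    latency_s: float    # median response time in seconds (lower is better)


def normalize(values, higher_is_better):
    """Scale values to 0..1 so that 1 is always the best candidate."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]


def rank(candidates, weights):
    """Return (name, score) pairs sorted from best to worst weighted score."""
    quality = normalize([c.quality for c in candidates], True)
    cost = normalize([c.cost_per_1k for c in candidates], False)
    latency = normalize([c.latency_s for c in candidates], False)
    scores = [
        weights["quality"] * q + weights["cost"] * c + weights["latency"] * l
        for q, c, l in zip(quality, cost, latency)
    ]
    names = [c.name for c in candidates]
    return sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)


# Placeholder numbers for illustration only, not real measurements.
candidates = [
    Candidate("model-a", quality=0.90, cost_per_1k=10.0, latency_s=2.0),
    Candidate("model-b", quality=0.80, cost_per_1k=2.0, latency_s=0.5),
    Candidate("model-c", quality=0.60, cost_per_1k=1.0, latency_s=0.4),
]
# Hypothetical weights: this use case cares most about answer quality.
weights = {"quality": 0.6, "cost": 0.25, "latency": 0.15}

ranking = rank(candidates, weights)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

The weights are the part worth debating with your team: a customer-facing chat feature might weight latency more heavily, while an offline batch job might care almost only about quality and cost.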

AI Product Updates

General vs business specific models

As the race to build general-purpose models intensifies, many competitors are focusing on advancing reasoning capabilities to get closer to AGI. These strategies raise many questions: How many “really good” models will be available in the future? How will each competitor differentiate and create unique value? How effectively will companies apply and integrate these technologies into real-world use cases with positive ROI?

Some of these questions might help explain Cohere’s pivot towards business-specialized models, “driven by customer demand for models suited to specific tasks rather than broad, general-purpose models”. Companies will need to understand which use cases general models can handle, and when specialized or fine-tuned models (provided by companies like Cohere or trained in-house) are needed.

Agents everywhere

Recent foundation model updates make it clear that agents are emerging as a central trend in AI. Agents are used not only to help LLMs answer questions more effectively, by introducing structured reasoning steps and self-criticism, but also to unlock many automation opportunities.
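As a rough illustration of what “structured reasoning steps and self-criticism” can look like in practice, here is a minimal Python sketch of a draft, critique, and revise loop. It is a simplified assumption of the general pattern, not how any specific provider implements its models. The `llm` parameter is a hypothetical stand-in for whatever model call you use, and `fake_llm` is a toy function included only so the example runs offline.

```python
from typing import Callable


def answer_with_self_critique(
    question: str, llm: Callable[[str], str], max_rounds: int = 2
) -> str:
    """Draft an answer, ask the model to critique it, and revise if needed."""
    draft = llm(f"Think step by step, then answer:\n{question}")
    for _ in range(max_rounds):
        critique = llm(
            f"Question: {question}\nAnswer: {draft}\n"
            "List any errors in the answer, or reply OK if there are none."
        )
        if critique.strip().upper() == "OK":
            break  # the critique found nothing to fix
        draft = llm(
            f"Question: {question}\nPrevious answer: {draft}\n"
            f"Critique: {critique}\nWrite an improved answer."
        )
    return draft


def fake_llm(prompt: str) -> str:
    """Toy stand-in for a real model call (hypothetical, for illustration)."""
    if "Write an improved answer" in prompt:
        return "4"
    if "List any errors" in prompt:
        return "OK" if "Answer: 4" in prompt else "The sum is wrong."
    return "5"  # deliberately wrong first draft


result = answer_with_self_critique("What is 2 + 2?", fake_llm)
print(result)
```

With the toy model, the first draft is wrong, the critique flags it, and the revised answer passes the second critique. Real agents add tools, memory, and planning on top of loops like this one.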

One first mover in this area is Stripe, which recently launched an agent toolkit that gives agents access to financial services and tools. This enables features such as earning and spending funds, managing common support operations, and billing.

If you want to learn more about agents, two recommended resources are Anthropic’s best practices for building agents and the AI agents market map.

AI-enabled search

Another area where current AI progress makes it possible to solve problems that were previously impossible or very hard is, of course, search. Some examples of products that have greatly improved their search features by successfully integrating AI:

  • Figma, which has introduced multimodal search to address specific user needs, such as finding inspiration through related designs (functional overview and technical post).
  • Datadog, which is enabling natural language searches for service health and ownership through Bits AI.
  • Adevinta, which has also implemented natural language search, as described in “From Filters to Phrases: Our AI Revolution in Car Search”.

AI Ethics and Legislation Updates

From models to systems

As AI continues to become more popular and impactful, it’s increasingly clear that we need to shift our focus from evaluating models in isolation to considering AI systems as a whole, from the design and the data used to how humans interact with the system and are influenced by it.

The paper Human-AI coevolution discusses what these human-AI interactions mean and how they can evolve over time, especially in the context of recommender systems. It highlights risks like polarization and the need to address societal well-being alongside corporate goals.

When it comes to evaluating these systems, tools like datasheets for datasets or model cards remain valuable but are no longer sufficient. New proposals, such as “Use case cards: a use case reporting framework inspired by the European AI Act”, look promising for this broader, system-level evaluation.

Also in the context of defining ethical AI systems, Bring human values to AI is a great reference for principles to include in the end-to-end design (from human feedback and potential surprises to values and trade-offs).

Other references

  • #paid published the first guidelines for creators and brands for ethical and effective use of AI in content creation, including topics such as “help, not do”, “embrace AI openly” or “quality as a superpower”.
  • EU’s AI Act progress: the first draft for general-purpose AI has already been published, and will be iterated on until April 2025.
  • Time’s Model Safety Report on a variety of foundational GenAI models rates Anthropic’s Claude as the safest, while still noting room for improvement.
  • Interview with Arati Prabhakar, White House chief tech advisor, highlights real AI harms like deepfakes and image-based sexual abuse, in contrast to initial fears that turned out to be less risky (e.g. biological weapons), and the importance of building trust through privacy, fairness, and safety measures.
  • Interview with Kasia Chmielinski, former White House technologist, introduces the Data Nutrition Project, which provides “nutrition labels” for datasets to improve transparency and accountability. Surprisingly, this project has also helped improve data quality and has been used for educational purposes. On the product side, there were some interesting insights on the complexity of balancing business goals with inclusivity and ethical practices, and on the importance of designing processes thoughtfully, running robust evaluations, and red teaming.

Industry Trends

As 2024 ends, here’s a summary of key AI insights and predictions for 2025:

Benedict Evans: “2025: AI Eats the World”
Covers how the immense interest in Generative AI (GenAI) over the last year has already had a great impact in areas like coding, marketing, and customer support (despite the errors, hallucinations, and limitations it still has). However, key challenges remain to be solved through 2025: how far will this scale, how is this useful, and how do we deploy this.

Menlo Ventures: “2024: The State of Generative AI in the Enterprise”
A survey of 600+ US companies showing:

  • Shift from pilots to production with increased spending and application growth.
  • Top use cases: coding copilots, chatbots, enterprise search, meeting summarization.
  • Automation is still emerging, with growth expected in 2025.
  • A 50/50 buy vs. build approach and focus on value over quick wins.

Wrapping it up

That was it from When AI meets Product: December’24 AI Product Updates. I’m sure 2025 will bring us much more exciting progress in generative models, ethical frameworks, and real-world applications. Stay tuned!

Written by Anna Via

Machine Learning Product Manager @ Adevinta | Board Member @ DataForGoodBcn