When ML meets Product: May ’24 AI Product Updates
Keeping up to date with new AI products, use cases, trends, and resources

Welcome to the May edition of “When ML meets Product: AI Product Updates”. Because the field of Artificial Intelligence is advancing so rapidly, I dedicate quite some time to keeping up with its progress and finding insights into the business and product implications of these developments. This research usually includes reading a lot (!) of newsletters, Medium blog posts, and various news outlets and resources. In this update, I’ll cover the latest AI product-related news, use cases, trends, and resources that I’ve found most relevant in recent weeks:
- Product & Business trends: is GenAI causing a revolution in companies, or mostly just providing small, incremental value improvements?
- GenAI model updates: main updates from the bigger players’ foundational models, what is Apple up to (it has a lot of potential for digital products!), and other updates related to image, video, and voice generation products.
- Interesting AI use cases: focusing on two promising fields, AI hardware and AI coaching & teaching.
- Other relevant resources and future events.
Product & Business trends
GenAI revolution vs incremental value improvement
Are you having a hard time brainstorming on potential use cases or on how to achieve true innovation with GenAI in your company? You are not alone! According to a Bain analysis,
“Over 30% of Fortune Global 500 companies are investing in generative AI, but 85% of their announced initiatives are focused on incremental innovations”
In the April edition, I discussed some examples of very specific sectors where GenAI is truly revolutionizing the business: customer support and marketing. We are also seeing how people are becoming more efficient at specific tasks thanks to GenAI (brainstorming, writing, designing, coding…). However, in other sectors or use cases, we haven’t seen a big revolution yet, which links to the 85% of initiatives mentioned in Bain’s analysis. There, GenAI is being leveraged for relatively simple features that provide only incremental value. These types of features have usually been relatively obvious since the beginning of the GenAI hype, and there haven’t been many new ones since.
MIT IDE lays out “Two Reasons Why AI Isn’t a Tech Revolution Yet”, and this topic has also been discussed in the interview “Is AI a platform shift or a paradigm shift, with Benedict Evans”, and in his post “Looking for AI use-cases”. My main takeaways from all this are:
- For the sectors and problems that haven’t been revolutionized by GenAI yet, it is not clear whether that is because they never will be, or because something is still missing (time, technology, mindset changes…). To move beyond the obvious, incremental value use cases (specific features, cost optimization…), it is key to think about how GenAI can fundamentally change the business model. As also explained in “Six major GenAI trends that will shape 2024’s agenda”, the goal is to “re-imagine entire domains rather than isolated use cases”.
- The value of GenAI is highly dependent on where and how you implement it. General-purpose GenAI chatbots risk being too broad, which limits what we can actually do with them, or what we know we can do with them. GenAI implemented within specific features or for specific purposes risks being perceived as just a ChatGPT wrapper that provides no competitive advantage. If your competitors implement it, though, you might be forced to implement it too in order to keep up.
If GenAI hasn’t revolutionized your sector yet, I invite you to keep reading to stay up to date with GenAI updates that might inspire you to innovate! If GenAI has revolutionized your sector, you are welcome to keep reading too: you might be able to revolutionize it even further!
GenAI model updates
Bigger players updates
The war between the main GenAI model developers continues. In the last few weeks we have seen:
- Google releasing Gemini 1.5 Pro, and other important updates
- OpenAI releasing GPT-4 Turbo with Vision
- Meta releasing Llama 3
All these updates seem to be going in similar directions, though: more capabilities (better reasoning, better instruction following…), bigger and bigger context windows (the amount of text that the model accepts and “remembers” through the prompt), better performance on specific tasks (code generation, specific output formats like JSON…), full multimodality (including text, images, short videos, audio…), and cost efficiency (lowering costs while offering different sizes & price options for each model).
What is Apple up to?
There have been some reports that Apple might be in talks with OpenAI to reach an agreement to use its foundational models. While it is not clear whether that means Apple has decided not to build a competing GenAI foundational model, it is really interesting to see the recent papers published by some of Apple’s researchers (ReALM: Reference Resolution As Language Modeling, Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs). They explore something that has great potential for digital products: GenAI systems that can leverage the user’s screens and UIs.
Examples from the papers include being able to ask, about a given screen or webpage, questions like “What’s the function of this screen?” or “How do I purchase X?”. This is an interesting direction because it opens up the possibility of LLMs that are not limited to the context proactively provided by users, but can also draw on further context from their laptops, phones, apps, or even their physical surroundings (e.g. the room they are in, as recorded by the phone or laptop camera).
Images, video, and voice generation updates
- Cool Product #1: Last month I mentioned Google’s VLOGGER, and how it enables the creation of a talking human video from a picture and text. Now a similar product, VASA-1, has been released by Microsoft.
- Cool Product #2: Last month I also mentioned Suno, an amazing song generator that I used to create my first country song, “When the Algorithms Dance”. This month I discovered Udio, a similar product that is also incredible at generating songs (check out my “Metrics in Harmony”).

- Cool Product #3: Viggle AI allows style transfer on videos.
- Cool Product #4: Hume is an empathic voice interface that adapts responses based on the detected expression or mood of the person’s voice.
- Cool Product #5: LeiaPix takes images from 2D to 3D.
- Cool Product #6: Krea allows image & video generation, modification, and correction.
Interesting AI use cases
In this edition we’ll dive deep into two other fields where GenAI might have big potential: AI hardware and AI coaching & teaching.
AI hardware
The implications of the latest AI advancements for the future of the smartphone era remain uncertain. Smartphones have many advantages that are hard to surpass: they are lightweight and pocket-sized, and they can already leverage LLMs, potentially even running them locally. Nevertheless, there are some players trying to shake up the gadget market through AI:
- Humane’s AI Pin, wearable multi-modal device powered by an LLM to assist with sending messages, taking pictures, taking notes…
- Limitless, wearable that records and connects to Mac or Windows to assist mainly in work environment tasks.
- Plaud, phone “sticker” that records the voice and is powered by ChatGPT.
- Rabbit, pocket multi-modal device with a natural language interface powered by an LLM.
- Ray-Ban glasses, glasses powered by a camera, audio, and Meta AI.
- AI in a box, literally a box that can connect to your data and run an LLM locally to answer questions.
Ethan Mollick (Wharton professor who studies entrepreneurship, innovation and AI) recently reviewed most of these products in “Freeing the chatbot”; check out his newsletter if you are interested in knowing more.
Personally, I am really excited about this idea of being able to have a second brain, with secure access to all my digital and real-life data, so that I can ask questions, remember anything I need at any moment, and write with my “knowledge and style”. It is unclear whether a dedicated device would be needed or whether the same smartphone or laptop could be used to that end. Of course, this comes with risks, mainly related to data privacy and security. For me, it would be a must that the LLM runs locally rather than in the cloud, and that it has strong security protocols: I wouldn’t want my “second brain” input data or responses to be accessed by anyone other than me!
AI Coaching & teaching
Imagine a coach, teacher, or companion that can help you with specific learning tasks 24/7, won’t judge you, and has infinite patience. This is the opportunity for products in the coaching, teaching, companionship, and even therapy areas.
- Meeno, the loneliness cure: Tinder’s ex-CEO is building a product to help people develop social skills, especially those who “haven’t had the chance to practice them safely”. It is not intended as a substitute for therapy, nor as a “boy/girlfriend” companion, but as a tool to help people develop these skills and improve their relationships in the real world.
- Soul Machines: AI-powered digital avatars, available 24/7 and in 13 languages. One of their applications is using these avatars as teachers or coaches, to learn something new or to practice conversation in your own language or in a language you are learning. The product is discussed in a great podcast episode from Melissa Perri’s Product Thinking.
- Maybe we don’t need 3D moving avatars to learn and LLMs are enough, though! This idea is explored in another post from Ethan Mollick, where several specialized prompts are designed and reviewed to help users learn a specific topic. The use cases include role-playing, goal-playing, co-creation, and teaching the AI as learning strategies. The authors also published a paper with further technical details: Instructors as Innovators: a Future-focused Approach to New AI Learning Opportunities, With Prompts.
While all these products can offer great learning and coaching opportunities for all of us, there is a risk that similar “AI-avatar” products, which are harder to defend from an ethical and value point of view, succeed in the market and make people addicted to them. I’m talking about products that offer virtual boyfriends or girlfriends, the option to talk to a “version” of someone who died, or chatbots of influencers and famous people.
📝 Relevant resources
Stanford’s 2024 AI Index report, with key takeaways like:
- AI beats humans on some tasks, but not on all. It also makes workers more productive and leads to higher quality work.
- Industry continues to dominate frontier AI research, and frontier models get way more expensive
- Robust and standardized evaluations for LLM responsibility are seriously lacking
The 2024 MAD (Machine Learning, AI & Data) Landscape, with key takeaways like:
- The rise of AI Stack (vector databases, frameworks, guardrails…)
- Experiments vs reality (all major vendors rushing GenAI features, companies finding the obvious incremental value problems to solve)
- The death, or not, of traditional ML, and a future of LLMs, SLMs and hybrid approaches
- Where all the GenAI money is going (NVIDIA is a big winner, and so are consulting firms!) and who is paying for the subsidized use of all the super expensive foundational GenAI models.
Dropbox CEO Drew Houston wants you to embrace AI and remote work. An interesting conversation with Dropbox’s CEO. I found especially relevant his vision of how Dropbox can become the perfect storage for our digital life (from personal pictures to work documents) and, powered by LLMs, the perfect 24/7 assistant.
GenAI in Production: Challenges and Trends, Verena Weber. On the importance of starting from the problem (and looking for papers and inspiration from there), and of finding simple solutions and challenging them against the bigger ones. She also shares learnings from her experience working on Alexa at Amazon (the complexity of model robustness, and why it is needed to avoid retraining or updating models to maintain their performance on a huge variety of prediction requests).
🗓️ Future events


Wrapping it up
That was it from When ML meets Product — May’24 AI Product Updates. Hope you enjoyed the read! I’ll be happy to hear your thoughts, questions, or suggestions.