Artificial Intelligence in 2025 and Beyond

David Serrault
8 min read · Dec 30, 2024

An infinite forward tracking shot through the lights of a cyberpunk city, with the year 2025 written in neon lights in the middle.
The year 2025 — Midjourney + KlingAI

“We tend to overestimate the effect of a technology in the short run and underestimate the effect in the long run.” Roy Amara

A retrospective on the AI research publications that impressed me the most in 2024 and the trends they may herald for 2025 and the years to come.

Keeping up with the intense news cycle in the AI field — particularly regarding its impact on society — feels like a marathon.

However, beyond the numerous corporate press releases, reading researchers’ publications is an excellent way to avoid overestimating the short-term effects of technology and, above all, not underestimating its long-term consequences — thus proving Amara’s law wrong…

Research projects provide a more realistic overview of how technology and its applications are evolving. This is why I make a point of studying at least one AI research article in depth each week (summaries are posted on my LinkedIn feed and on my personalised GPT. Stay tuned!).

The year’s end is an opportunity to put together this selection, which in my opinion reflects a few of the profound transformations sparked this year that are likely to gain momentum in the coming years.

Welcome to 2025!

1 — Toward More Frugal AI

The deployment of compact and resource-efficient models will accelerate in order to protect personal data, reduce energy impact, and meet hardware constraints (smartphones, tablets, embedded devices). Collaborative approaches (Mixture-of-Agents) may also become widespread, allowing multiple specialized AIs to pool their expertise in more efficient architectures rather than relying on a single “giant” system to handle every possible task.

Reading list:

  • LLM in a flash: Efficient Large Language Model Inference with Limited Memory
    Research by a team at Apple on running models on smartphones or other “resource-constrained” devices. The emergence of lighter architectures meets critical needs: making AI accessible anywhere without always resorting to the cloud, while also reducing the risk of personal information being intercepted by executing processes locally on users’ devices.
  • Mixture-of-Agents Enhances Large Language Model Capabilities
    Another optimisation approach involves having multiple specialized models (Mixture-of-Agents) work together rather than endlessly inflating the size of a single, central model. Inspired by the principles of collective intelligence, this approach reduces computation costs and leverages each sub-model’s unique capabilities. In practice, user experience may be impacted (longer response times, additional coordination overhead), but the overall performance could surpass that of a single “giant” LLM, often at a lower cost.
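
To make the Mixture-of-Agents idea more concrete, here is a minimal sketch in Python. It is not the paper's implementation: the agents are hypothetical stub functions standing in for real LLM calls, and the names `make_stub_agent`, `build_prompt` and `mixture_of_agents` are my own. It only illustrates the layered pattern, in which each layer of agents sees the previous layer's answers and a final aggregator synthesises the result.

```python
from typing import Callable, List

# An "agent" is any function mapping a prompt to a text answer.
# In a real system each one would wrap a different LLM; these are stubs.
Agent = Callable[[str], str]


def make_stub_agent(name: str) -> Agent:
    """Return a toy agent that tags its answer with its own name (hypothetical)."""
    def agent(prompt: str) -> str:
        # Only echo the first line, i.e. the original question.
        return f"[{name}] answer to: {prompt.splitlines()[0]}"
    return agent


def build_prompt(question: str, previous: List[str]) -> str:
    """Append the previous layer's answers to the question as references."""
    if not previous:
        return question
    refs = "\n".join(f"- {p}" for p in previous)
    return f"{question}\nReference answers from other agents:\n{refs}"


def mixture_of_agents(question: str, layers: List[List[Agent]], aggregator: Agent) -> str:
    """Run each layer of agents in sequence, then let an aggregator synthesise."""
    previous: List[str] = []
    for layer in layers:
        prompt = build_prompt(question, previous)
        previous = [agent(prompt) for agent in layer]
    return aggregator(build_prompt(question, previous))


layers = [
    [make_stub_agent("coder"), make_stub_agent("writer")],
    [make_stub_agent("reviewer")],
]
final = mixture_of_agents("Explain frugal AI", layers, make_stub_agent("aggregator"))
print(final)
```

The sequential layers are also where the extra latency and coordination overhead mentioned above come from: every layer has to wait for the previous one to finish.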

2 — The World Generators

AI-powered generation of virtual worlds will revolutionise numerous fields, from the video game industry to industrial simulation or medical training. Models capable of interpreting or creating animated, even interactive scenes will advance further, aided by evolutions in hardware capabilities and algorithms.

Reading list:

  • Diffusion Models Are Real-Time Game Engines
    This team of Google researchers trained a neural network to produce real-time images of the vintage video game Doom. Although the results are far from competing with modern 3D engines (yet), this demonstration proves that diffusion models — initially designed for static images — can, with some tweaks and significant computing power, simulate a complex environment in real time. This work could also support other efforts aiming to give AI a deeper understanding of the laws of physics and spatial coherence.
  • Are Video Generation Models World Simulators?
    Now that we can convincingly generate images and text, the next frontier for AI is video generation or leveraging video sequences to better understand our environment. This is the challenge highlighted by Raphaël Millière, a lecturer in philosophy at Macquarie University in Sydney, who questions whether certain models (like OpenAI’s “Sora”) can become “world simulators.” These generated videos are likely not yet fully convincing, and the training data (possibly drawn largely from simulations built on Unreal Engine 5) remain limited. Nonetheless, this avenue illustrates the value of combining the power of LLMs (to understand context and narrate a scene) with methods from computer vision (for generating or analyzing images).

3 — The Action Models

AI agents will increasingly understand graphical user interfaces the way humans perceive them. Whether for automated testing, support systems for people with disabilities, or malicious “click-bots,” these “action models” will become more sophisticated. Designers will have to think about their interfaces for new types of usage… and new kinds of users…

Reading list:

All these publications describe systems capable of understanding and operating applications: clicking buttons, entering text, navigating interfaces. Far beyond the scripted bots used in software quality assurance, soon we might have “multimodal conversational agents” that can act as substitutes for humans in using graphical interfaces.

This is a strategic topic for the future of human-machine interaction, and research teams in both academia and industry are vying to outdo one another in creativity. Their common goal is to train models that can “observe” a smartphone or a computer screen and then interpret, click, type, scroll, or even speak at just the right moment. Various prototypes, such as Meta-GUI, AutoDroid, Ferret-UI, or UI-JEPA, demonstrate rapid advances in the field of multimodal interaction. The same dynamic is found in other research focusing on user behavior or on how to create “smarter” assistants using personalized LLMs.
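
As a rough illustration of the observe-then-act loop these systems share, here is a toy sketch in Python. Everything in it is an assumption made for the example: the `Element` and `Action` schemas, the `ToyLoginScreen` app, and above all the hand-written `rule_based_policy`, which stands in for the learned multimodal model that the papers actually train. None of it comes from Meta-GUI, AutoDroid, Ferret-UI or UI-JEPA.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Element:
    """A simplified on-screen widget (hypothetical schema)."""
    label: str
    kind: str          # "button" or "text_field"
    value: str = ""    # current content for text fields


@dataclass
class Action:
    kind: str                    # "click" or "type"
    target: str                  # label of the element acted on
    text: Optional[str] = None   # text to enter for "type" actions


class ToyLoginScreen:
    """Stand-in for a real application: a two-field login form."""

    def __init__(self) -> None:
        self.fields: Dict[str, str] = {"username": "", "password": ""}
        self.logged_in = False

    def observe(self) -> List[Element]:
        """What the agent 'sees': the widgets and their current values."""
        elements = [Element(k, "text_field", v) for k, v in self.fields.items()]
        elements.append(Element("Sign in", "button"))
        return elements

    def apply(self, action: Action) -> None:
        """Execute the agent's action on the interface."""
        if action.kind == "type" and action.target in self.fields:
            self.fields[action.target] = action.text or ""
        elif action.kind == "click" and action.target == "Sign in":
            self.logged_in = all(self.fields.values())


def rule_based_policy(elements: List[Element], goal: Dict[str, str]) -> Optional[Action]:
    """Hypothetical stand-in for a learned model: fill empty fields, then click."""
    for el in elements:
        if el.kind == "text_field" and not el.value and el.label in goal:
            return Action("type", el.label, goal[el.label])
    for el in elements:
        if el.kind == "button":
            return Action("click", el.label)
    return None


def run_agent(screen: ToyLoginScreen, goal: Dict[str, str], max_steps: int = 10) -> List[Action]:
    """Observe, decide, act, and repeat until the task is done."""
    history: List[Action] = []
    for _ in range(max_steps):
        if screen.logged_in:
            break
        action = rule_based_policy(screen.observe(), goal)
        if action is None:
            break
        screen.apply(action)
        history.append(action)
    return history


screen = ToyLoginScreen()
history = run_agent(screen, {"username": "ada", "password": "s3cret"})
print([f"{a.kind}:{a.target}" for a in history], screen.logged_in)
```

In the research prototypes, the observation is a screenshot or a view hierarchy rather than a tidy list of widgets, which is precisely what makes the problem hard.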

4 — AI at the Service of the Designer… for Better or for Worse

AI-powered design tools will deliver productivity gains and will potentially fuel creativity. However, they also pose a genuine risk of homogenising outputs and may raise manipulation techniques to a whole new level if ethical safeguards are not in place.

Reading list:

  • Using Generative AI and Semantic Diversity for Design Inspiration
    This research team focuses on how AI can support creativity in design. They examined the impact of AI-generated images on the visual ideation phase, highlighting how AI can help break “design fixation” by offering more options for creatives to explore. Their experimental application, DesignAID, allowed a hundred designers to create mood-boards with the help of AI and compare them with traditionally made mood-boards. Early results indicate a positive role for AI.
  • “Create a Fear of Missing Out” — ChatGPT Implements Unsolicited Deceptive Designs in Generated Websites Without Warning
    A model’s ability to design interfaces can lead to misuse: the ethical dimension is more crucial than ever given that some algorithms can propose questionable concepts — or even blatant dark patterns — as demonstrated in this experiment. Simply asking ChatGPT or another LLM to “design an effective web interface” can spawn multiple forms of deception (fake reviews, artificial urgency, etc.). This shows that without guardrails, the AI’s “statistical” reasoning can naturally draw on morally dubious strategies discovered in its training corpus.

5 — Ethical and Cognitive Issues

As the capabilities of language models and generative systems grow, so will concerns about their impact on human creativity, disinformation, and manipulation. Companies and regulators will have to implement new standards to evaluate and validate AI reliability, while individuals will likely have to hone their media literacy and critical thinking skills.

Reading list:

  • Beware of botshit: How to manage the epistemic risks of generative chatbots
    In this article, the researchers examine how chatbots and LLMs — sometimes called “stochastic parrots” — can produce “bullshit” or, in the absence of malicious intent, “botshit.” This term describes the tendency to generate incorrect, approximate, or misleading text without the system being aware of it. The problem arises when people rely on these models for tasks where truthfulness really matters and verification is complex. The authors propose an analytical framework to manage this risk, depending on how critical truth is and how difficult it is to confirm.
  • Human Creativity in the Age of LLMs
    Another major question is how humans will preserve (or not) their creative abilities when relying on AI. The authors of this article present findings from a study at the University of Toronto, comparing the creativity of groups using AI to a control group without AI. In the short term, AI boosts productivity and inspiration, but over the medium term, ideas appear less original, and divergent thinking seems to weaken. The authors therefore emphasize the importance of using AI in a balanced way: leveraging it as a tool to spark ideas while maintaining a critical mindset and nurturing one’s own capacity for innovation.
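
The two-axis logic of the “botshit” framework described above (how critical truth is, and how hard it is to verify) can be sketched as a small decision function. This is only an illustration under my own assumptions: the `Task` type, the `oversight_level` function and the four recommendation labels are hypothetical wording, not the terminology or the prescriptions of the original article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """A chatbot-assisted task described along the two axes (hypothetical type)."""
    name: str
    truth_is_critical: bool   # does an error have serious consequences?
    hard_to_verify: bool      # is checking the answer costly or complex?


def oversight_level(task: Task) -> str:
    """Map a task to an illustrative level of human oversight."""
    if task.truth_is_critical and task.hard_to_verify:
        return "expert review before use"
    if task.truth_is_critical:
        return "systematic fact-check"
    if task.hard_to_verify:
        return "treat as inspiration only"
    return "light spot-check"


tasks = [
    Task("legal risk assessment", truth_is_critical=True, hard_to_verify=True),
    Task("account balance lookup", truth_is_critical=True, hard_to_verify=False),
    Task("brainstorming product names", truth_is_critical=False, hard_to_verify=True),
    Task("rewording an internal memo", truth_is_critical=False, hard_to_verify=False),
]
for task in tasks:
    print(f"{task.name}: {oversight_level(task)}")
```

The point of the exercise is that the same chatbot output deserves very different amounts of scrutiny depending on where the task sits on these two axes.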

All articles cited

  1. Alizadeh, K., Mirzadeh, I., Belenko, D., Khatamifard, S. K., Cho, M., Del Mundo, C. C., Rastegari, M., & Farajtabar, M. (2024). LLM in a flash: Efficient Large Language Model Inference with Limited Memory. Apple. https://arxiv.org/pdf/2312.11514
  2. Wang, J., Wang, J., Athiwaratkun, B., Zhang, C., & Zou, J. (2024). Mixture-of-Agents Enhances Large Language Model Capabilities. https://arxiv.org/abs/2406.04692
  3. Valevski, D., Leviathan, Y., Arar, M., & Fruchter, S. (2024). Diffusion Models Are Real-Time Game Engines. Google. https://arxiv.org/pdf/2408.14837
  4. Millière, R. (2024). Are Video Generation Models World Simulators? Blog post on artificialcognition.net. https://artificialcognition.net/posts/video-generation-world-simulators/
  5. Sun, L., Chen, X., Chen, L., Dai, T., Zhu, Z., & Yu, K. (2022). META-GUI: Towards Multi-modal Conversational Agents on Mobile GUI. Shanghai Jiao Tong University. https://doi.org/10.48550/arXiv.2205.11029
  6. Zhou, X., & Li, Y. (2021). Large-Scale Modeling of Mobile User Click Behaviors Using Deep Learning. https://doi.org/10.48550/arXiv.2108.05342
  7. Wen, H., Li, Y., Liu, G., Zhao, S., Yu, T., Li, T. J.-J., Jiang, S., Liu, Y., Zhang, Y., & Liu, Y. (2023). Empowering LLM to use Smartphone for Intelligent Task Automation. Tsinghua University, University of Notre Dame, Microsoft Research Asia. https://doi.org/10.48550/arXiv.2308.15272
  8. You, K., Zhang, H., Schoop, E., Weers, F., Swearngin, A., Nichols, J., Yang, Y., & Gan, Z. (2024). Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs. Apple. https://doi.org/10.48550/arXiv.2404.05719
  9. Fu, Y., Anantha, R., Vashisht, P., Cheng, J., & Littwin, E. (2024). UI-JEPA: Towards Active Perception of User Intent through Onscreen User Activity. Apple. https://arxiv.org/html/2409.04081v1
  10. Lin, K. Q., Li, L., Gao, D., Yang, Z., Wu, S., Bai, Z., Lei, W., Wang, L., & Shou, M. Z. (2024). ShowUI: One Vision-Language-Action Model for GUI Visual Agent. National University of Singapore, Microsoft. https://doi.org/10.48550/arXiv.2411.17465
  11. Cai, A., Rick, S. R., Heyman, J., Zhang, Y., Filipowicz, A., Hong, M. K., Klenk, M., & Malone, T. (2023). DesignAID: Using Generative AI and Semantic Diversity for Design Inspiration. Collective Intelligence Conference (CI ’23). https://dl.acm.org/doi/pdf/10.1145/3582269.3615596
  12. Krauss, V., McGill, M., Kosch, T., Thiel, Y., Schön, D., & Gugenheimer, J. (2024). “Create a Fear of Missing Out” — ChatGPT Implements Unsolicited Deceptive Designs in Generated Websites Without Warning. Technical University of Darmstadt, University of Glasgow, Humboldt University of Berlin. https://doi.org/10.48550/arXiv.2411.03108
  13. Hannigan, T. R., McCarthy, I. P., & Spicer, A. (2024). Beware of botshit: How to manage the epistemic risks of generative chatbots. Business Horizons. https://doi.org/10.1016/j.bushor.2024.03.001
  14. Kumar, H., Vincentius, J., Jordan, E., & Anderson, A. (2024). Human Creativity in the Age of LLMs. University of Toronto. https://arxiv.org/html/2410.03703v1

Written by David Serrault

Design Director @BPCE GROUP, top French bank, specializing in Artificial Intelligence & scaling design. #AI | #UXDesign #Leadership