Leandro von Werra

The Open Source Evangelist

Background: Leads the research team at Hugging Face, the platform where developers collaborate on AI models and applications. Under his leadership, Hugging Face has become the central hub for open source AI, hosting nearly half a million datasets and 1.5 million models, with new uploads every 10 seconds. von Werra has witnessed and helped orchestrate the transformation of the AI landscape from a closed ecosystem dominated by proprietary models to an open source renaissance that is reshaping global AI development.

Key Perspective: Champions the fundamental shift from AI users to AI builders, advocating that sustainable competitive advantage emerges from specialized model development rather than reliance on generic APIs. He draws compelling parallels to software development history, referencing Marc Andreessen's observation that "software is eating the world" and extending this to AI: « Every company is becoming a software company on some level. If you look around now, every major company has a huge department building their own software stack. » von Werra argues that just as companies developed internal software capabilities for competitive advantage, organizations must now build specialized AI capabilities tailored to their specific business contexts.

He tracks the remarkable convergence between open source and proprietary AI performance, providing concrete evidence that the competitive gap is rapidly closing. « When GPT-4 came out, it took roughly a year for an open model to be as good as GPT-4. When DeepSeek came out, it matched the best closed model from like a month or two ago. So that gap has been shrinking quite quickly. » This acceleration reflects both improved methodologies and increased computational resources available to open source projects.

von Werra emphasizes the strategic importance of fully open source models over merely open-weights models. « The interesting thing about fully open source is that the whole community can immediately build on top of what you build. If OpenAI had released how you make ChatGPT, everybody would have started building better chat models immediately, and not have that warm-up of a year. » He advocates moving beyond the "data is the new oil" mentality, encouraging organizations to strategically share certain resources to advance the entire field while maintaining competitive advantages in specialized applications.

Market Analysis: von Werra provides unique insights into the startup ecosystem transformation enabled by open source AI. He highlights how Hugging Face's platform has created unprecedented accessibility for entrepreneurs: « If you're a startup and you want to build something for financial services or something, there's a very high chance there is a dataset or a model already out there that you can leverage. » This represents a fundamental shift from zero-sum competition to collaborative innovation, where startups can contribute to and benefit from community improvements rather than starting from scratch with limited resources.

He also addresses emerging concerns about synthetic data and model collapse with counterintuitive research findings. By analyzing web-scale datasets over time, his team discovered that high-quality synthetic data, particularly content that has undergone human curation, can actually improve training datasets. « We looked at what's the amount of ChatGPT data in web snapshots and you can see it started to go up quite a bit two years ago, and if you plot it against performance, you can see it goes up as well. »

Future Vision: von Werra embraces the vision of LLMs as potential operating systems for future computing paradigms while noting current limitations. « The models are really good at user interfaces, so you don't need applications anymore with predefined user interfaces. I want a dashboard showing these kinds of things. So the way we interact with computers in terms of LLMs, we're maybe more like at the terminal stage at the moment. » He envisions a future where AI interfaces evolve beyond current rudimentary text-based interactions to enable more sophisticated human-computer collaboration.

Key Achievement: von Werra demonstrated how the ecosystem effects of open source AI create non-zero-sum competitive dynamics, enabling startups to build on sophisticated foundations while contributing to community improvements. His leadership at Hugging Face has established the platform as the de facto standard for AI collaboration, fundamentally changing how AI models are developed, shared, and improved globally. He provided compelling evidence that open source AI development can match proprietary performance while enabling broader innovation and democratizing access to advanced AI capabilities.

von Werra is a natural language processing and reinforcement learning expert currently serving as Head of Research at Hugging Face, where he previously worked as a Machine Learning Engineer for nearly three years. Before joining Hugging Face, he worked as an NLP Data Scientist at Die Mobiliar while lecturing on Data Science and Visualization at Berner Fachhochschule BFH, teaching topics including the PyData stack, machine learning algorithms, and model evaluation techniques. Earlier in his career, he worked as a Data Scientist at spoud.io, where he developed anomaly detection algorithms for time series data, implemented language modeling for customer feedback classification, and improved prediction systems for logistics. He combines strong research capabilities with practical production experience in machine learning, with a particular focus on using AI advancements to create positive societal impact.