Here's a number that should bother you: roughly 1 to 2 percent of the world's data is openly available for AI training. The rest, the premium stuff, the proprietary reports, the specialized datasets, the information that would actually make AI useful in high-stakes domains, is locked behind paywalls, legal agreements, and institutional firewalls.
Redpine, a Stockholm-based startup founded in 2024, just raised EUR 6.8 million in seed funding to unlock it. The round was led by NordicNinja, with participation from Luminar Ventures and node.vc. Angel investors include Peter Sarlin (co-founder of Silo AI, which sold to AMD for $665 million), Patrik Tran (co-founder of Validio), Anna Nordell Westling (co-founder of Sana), and leaders from OpenAI, Perplexity, and Spotify.
That angel list tells you a lot about how the AI industry views this problem. The people building frontier models know that data quality is the bottleneck, and they're putting their own money behind a company trying to fix it.
The Spotify Playbook, Applied to Data
Co-founder and CTO David Osterdahl was part of the early team at Spotify, where he helped build the infrastructure that moved the music industry from declining sales and piracy to streaming. CEO Anders Hammarback, a former VC partner, saw the same pattern playing out in AI: companies scraping the same publicly available data, building products with no differentiation, and paying rights holders nothing.
Redpine's model is closer to Spotify than to a traditional data broker. The platform partners with content owners to license premium, non-public datasets, then makes them available through a compliant, usage-based infrastructure. Rights holders get paid. AI companies get data that isn't available to their competitors. Everyone's incentives align.
It's the kind of idea that seems obvious once someone says it out loud. Which is usually a sign that execution, not concept, is the hard part.
Built by a Team from Four Nordic Unicorns
The founding team draws from Spotify, Sana Labs, Zettle (acquired by PayPal), and Lunar, plus stints at McKinsey, CERN, and H&M. Founding data scientist Dr. Leonora Vesterbacka leads the development of Redpine's proprietary retrieval and reranking technology, the engine that makes diverse, licensed datasets actually usable by AI systems.
NordicNinja General Partner Marek Kiisa joins the board. The firm, backed by Japanese corporates including NTT, Panasonic, and Honda, has been increasingly active in Nordic AI infrastructure deals.
The Compliance Question Nobody Wants to Answer
The elephant in every AI data room is legality. Training models on scraped internet data has produced a parade of lawsuits, from the New York Times vs. OpenAI to Getty Images vs. Stability AI. The legal landscape is shifting fast, and companies that built their models on questionable data practices are going to face a reckoning.
Redpine positions itself as the compliant alternative. Every dataset is licensed. Every rights holder is compensated. In a world where the EU AI Act is tightening requirements around training data provenance, that compliance layer isn't a nice-to-have. It's a necessity.
The company is already working with leading international AI labs and US-based biotech research firm AsedaSciences. As AI agents become more autonomous and need access to specialized, verified information (not just whatever's on the open web), the demand for platforms like Redpine's should only grow.
A Bet That Data Is the New Oil, If You Actually Pay For It
EUR 6.8 million is a healthy seed round but modest compared to the scale of the opportunity. If even a fraction of the locked-up data economy opens up through licensed infrastructure, the company sitting at the center of those transactions will be enormously valuable.
Redpine isn't building another AI model. It's building the supply chain that every AI model needs but few want to talk about. That's either the most boring pitch in tech, or the most important one.
Probably both.
