The fight against distillation in AI is eerily similar to the music industry's battle against Napster. Shutting Napster down didn't stop piracy; it just forced the industry to evolve. Suing and banning won't stop model distillation either. Instead, it's an opportunity.

What Is Distillation and Why the Controversy?

Distillation is like a student learning from a teacher: instead of memorizing every detail, the student grasps the key concepts in simplified form. In AI, distillation trains smaller models to mimic larger ones, making them faster, cheaper, and easier to deploy. This is exactly why DeepSeek is in the spotlight. Reports suggest it may have distilled OpenAI's models, prompting an OpenAI investigation and calls from David Sacks, the "AI Czar," for legal action to shut it down. But history tells us that eliminating one player won't stop the underlying trend; it will just push distillation underground or into the hands of new startups. OpenAI might enforce stricter licensing, pursue legal action, or introduce technical countermeasures, but the demand for smaller, optimized models will persist.
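
To make the "student mimics teacher" idea concrete, here is a minimal sketch of the classic distillation objective: the student is trained to match the teacher's softened output distribution alongside the usual hard-label loss. The toy models, temperature, and loss weighting below are illustrative placeholders, not any lab's actual recipe.

```python
# Minimal sketch of knowledge distillation: the student matches the teacher's
# softened outputs (soft loss) in addition to the ground-truth labels (hard loss).
# Model sizes, temperature, and alpha are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    # Soft targets: KL divergence between temperature-softened distributions.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard targets: ordinary cross-entropy against the true labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1 - alpha) * hard_loss

# Toy stand-ins for a large teacher and a much smaller student.
teacher = nn.Sequential(nn.Linear(128, 512), nn.ReLU(), nn.Linear(512, 10))
student = nn.Sequential(nn.Linear(128, 32), nn.ReLU(), nn.Linear(32, 10))

x = torch.randn(16, 128)                 # dummy batch
labels = torch.randint(0, 10, (16,))
with torch.no_grad():                    # teacher is frozen; only its outputs are used
    teacher_logits = teacher(x)
loss = distillation_loss(student(x), teacher_logits, labels)
loss.backward()                          # gradients flow only through the student
```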

Startup idea: Distillation-as-a-Service

Rather than trying to destroy DeepSeek, why not use this shift to build your own startup? Distillation isn't going away, and openly licensed models like R1 don't prohibit it. There's a growing need for smaller models that are country-specific, tuned for regulated industries, and optimized for devices, and distillation is the fastest way to create them. Instead of going after DeepSeek, a Distillation-as-a-Service platform built on top of R1 could be an interesting startup: you help businesses legally fine-tune their own LLMs, tailored to their needs (a rough sketch of the workflow follows the list below). It's a market with real demand:
  1. Governments will want LLMs running within their own countries. You don’t expect them to store citizen data on U.S. servers, do you?
  2. Device-optimized models are crucial for on-device AI, reducing size and reliance on cloud infrastructure.
  3. Regulated industries need models that comply with strict data-privacy laws. For example, you could build HIPAA-compliant models for healthcare that keep personal health information from leaking, even when users make mistakes, potentially saving companies millions in compliance penalties.
  4. Enterprise AI players like IBM wouldn’t want to be left behind in the AI race. Helping their models stay on par with frontier LLMs could be a major value proposition.
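
In practice, the most common way to distill from a large open model is sequence-level: prompt the teacher, collect its completions, and fine-tune a smaller student on those pairs. Below is a minimal sketch under that assumption. The model IDs, prompt, and hyperparameters are illustrative placeholders (a 7B R1-distilled checkpoint stands in for the full R1 here), not a vetted production pipeline; verify each model's license before building on it.

```python
# Sketch of sequence-level distillation: collect teacher completions, then
# fine-tune a smaller student on them with a plain causal-LM loss.
# Model names, prompts, and hyperparameters are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

TEACHER = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"  # stand-in teacher for this sketch
STUDENT = "Qwen/Qwen2.5-0.5B"                        # hypothetical small student

teacher_tok = AutoTokenizer.from_pretrained(TEACHER)
teacher = AutoModelForCausalLM.from_pretrained(TEACHER, torch_dtype="auto")

prompts = ["Summarize the HIPAA privacy rule in one paragraph."]  # domain-specific prompts

# 1) Collect teacher completions (the supervision signal for the student).
texts = []
for p in prompts:
    inputs = teacher_tok(p, return_tensors="pt")
    with torch.no_grad():
        out = teacher.generate(**inputs, max_new_tokens=256)
    texts.append(teacher_tok.decode(out[0], skip_special_tokens=True))

# 2) Fine-tune the student on the collected prompt+completion text.
student_tok = AutoTokenizer.from_pretrained(STUDENT)
student = AutoModelForCausalLM.from_pretrained(STUDENT)
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-5)
for text in texts:
    batch = student_tok(text, return_tensors="pt", truncation=True, max_length=1024)
    loss = student(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

A real service would batch generation, filter and dedupe the collected data, and train with a proper training framework; this only shows the shape of the loop.
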
Instead of fighting the inevitable, be the Spotify, not the record label stuck in the past. In my best Garry Tan impression: "Winter batch for YC is open... wait, did I just say that? I'm not even in YC, but hey, if you're building this, it's time to apply!"