<aside>
Our website: https://www.sievedata.com/
</aside>
AI models can fundamentally change how we interact with technology, but there's still a huge gap between these models and what makes them useful in end products. We started Sieve to bridge this gap.
Programming, content creation, and search are all examples of areas where AI has made a mark. A lot of this progress, however, hasn't been as simple as an OpenAI API call. It's months of deep, applied R&D that involves evaluating dozens of models, setting them up on complex cloud infrastructure, and composing them into production-ready pipelines, all things only a few teams in the world know how to do.
Take ElevenLabs Dubbing Studio, for example, which combines speech transcription, LLM translation, text-to-speech, and audio separation with nuanced logic that enables quality, automated AI dubbing. Or Perplexity AI, which combines an LLM with search results to produce beautiful citations along with its summarized answers to questions, all in a few seconds.
If we want more teams to ship great AI products, models shouldn't be the only tool in the developer toolbox. Developers should have direct access to the tasks that models enable, such as dubbing, content moderation, and translation. A task is a specific use case that a model or set of models might enable. It handles the domain-specific legwork through prompting, relevant pre/post-processing steps, hand-coded logic, fine-tuning, and precise multi-model pipelining. It is packaged with a set of options designed to help developers customize and make tradeoffs without doing the legwork themselves. For example, translation is a task, while GPT-4 or SeamlessM4T are models that enable it.
Why reimplement and maintain bank API integrations when you could use Plaid? It's not that you couldn't learn how. It's that the months spent in the weeds of bank APIs aren't core to your differentiating product experience, and executing them well takes a long time.
Developers could integrate these tasks directly, make cost/quality/speed tradeoffs, or combine them to make even more complex tasks. Complex use cases would become trivial to string together, improvements in base models would constantly propagate through the stack, and meeting a certain set of constraints wouldn't mean having to keep up with every new model release or model provider.
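To make the idea concrete, here is a minimal sketch of what a packaged "translation" task with tradeoff options might look like. Everything here is illustrative: the names `translate` and `TaskOptions`, and the model-routing logic, are assumptions for the sake of the example, not Sieve's actual API.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TaskOptions:
    """Knobs a packaged task might expose so callers can trade off
    cost, quality, and speed without touching the models directly."""
    quality: str = "standard"            # e.g. "draft" | "standard" | "best"
    max_cost_usd: Optional[float] = None

def translate(text: str, target_lang: str,
              options: Optional[TaskOptions] = None) -> str:
    """A hypothetical 'translation' task: routes to an underlying
    model based on the requested tradeoffs. The routing is a stub."""
    opts = options or TaskOptions()
    # A real task might pick a cheaper model like SeamlessM4T for
    # drafts and a stronger model like GPT-4 for higher quality;
    # here we only illustrate the routing decision itself.
    model = "seamless-m4t" if opts.quality == "draft" else "gpt-4"
    return f"[{model} -> {target_lang}] {text}"
```

The point is the shape of the interface: the caller asks for a use case and states its constraints, and the task owns the model choice, the prompting, and the pre/post-processing behind it.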
That's what Sieve is, and it's why building AI products will become incredibly simple.
While our vision is broader than video, we believe video AI is a great niche to start with. Every video product today is being overwhelmed by a ton of new use cases that are enabled by AI. Video is unique in that it's much more compute- and data-intensive to process or generate than other modalities. This creates a ton of complexity around the ways models are run, the extra processing that needs to happen around them, and the pipelines that solve the most valuable use cases in the modality. To this end, Sieve's strongest and most immediate value proposition comes from being an AI toolkit that solves problems unique to video, unlike the generic AI developer tools that exist today. We do this in a few ways.