Sahara AI
From Paul Veradittakit
Sahara AI’s mission is to create a more open, equitable, and collaborative AI economy that is as easy as possible to participate in. By leveraging blockchain, Sahara ensures that all contributors (data contributors, labelers, model developers, etc.) are fairly compensated, that data and models maintain sovereignty, and that AI assets can be securely and permissionlessly created, shared, and traded.
Current AI Stack
The current AI stack has more or less settled into the following distinct layers:
- Data collection and labeling: Data is gathered from various sources (e.g. web scraping, public datasets, user-generated data) and must comply with licensing requirements to avoid legal issues. Data is labeled based on the task at hand (e.g. classification, object recognition).
- Model training and serving: Data is fed into the model, which adjusts its internal parameters (weights) to minimize errors. This is computationally expensive and time-consuming.
- AI agent creation and deployment: Creating an AI agent typically involves tools like TensorFlow and requires technical expertise.
- Compute resources: Model training requires expensive processing.
Each layer has seen intense competition and diversification, and in each one a single approach has largely proven most effective. For example, data collection is best done with large, public datasets (like books), supplemented by specialized data (like research papers) for fine-tuning. Model training is best done on specialized hardware, AI agents should be easy to build with plug-and-play resources so that a developer community can form, and compute resources should be distributed so that those who provide them are accurately rewarded. Integrating these pieces will lead to better AI models and a stronger community.
Web2 companies are trying to do this, but face severe limitations because of their centralized design. From both a corporate and technical perspective, these firms are designed to limit access and silo different parts of the stack, leading to different security standards, database designs, backend integrations, and monetization strategies. In effect, they are poorly designed for the shifting paradigm of the AI economy.
For example, OpenAI has built a very powerful foundational model and has begun embracing community builders with its permissionless GPT wrapper marketplace, but it only allows surface-level prompt customization rather than a rewiring of the underlying model. The company purchases all its computing resources using investors’ money and is on track to end this year with a $5 billion loss.
Collaborative AI Economy
Sahara’s platform serves as a one-stop shop for all AI development needs throughout the entire AI lifecycle: from data collection and labeling, to model training and serving, AI agent creation and deployment, multi-agent communication, AI asset trading, and crowdsourcing of AI resources. By democratizing the AI development process and lowering barriers to entry found in existing systems, Sahara AI provides equal access for individuals, businesses and communities to co-build the future of AI.
The figure above presents an overview of this user journey, illustrating how AI assets move from creation to utilization and user engagement within the Sahara AI ecosystem. Notably, all transactions within the platform are immutable and traceable, with ownership protected and the origins of assets recorded. This supports a transparent and fair revenue-sharing model, ensuring that both developers and data providers receive appropriate compensation for their valuable contributions whenever revenue is generated.
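To make the revenue-sharing idea concrete, here is a minimal sketch of a proportional split. It is not Sahara's actual attribution logic: the `Contribution` type, the integer `weight` field, and the `split_revenue` function are all hypothetical names chosen for illustration, and the rule "payout proportional to recorded contribution weight" is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contribution:
    """One recorded contribution to an AI asset (hypothetical structure)."""
    contributor: str  # e.g. a wallet address or user ID
    role: str         # e.g. "data provider", "labeler", "model developer"
    weight: int       # assumed attribution weight; higher means larger share


def split_revenue(revenue: int, contributions: list[Contribution]) -> dict[str, int]:
    """Split `revenue` (in smallest token units) proportionally to weight.

    Integer arithmetic is used so that the payouts always sum exactly
    to the revenue, with no rounding loss.
    """
    if revenue < 0:
        raise ValueError("revenue must be non-negative")

    # Aggregate weights per contributor (one person may contribute twice).
    weights: dict[str, int] = {}
    for c in contributions:
        if c.weight < 0:
            raise ValueError("weights must be non-negative")
        weights[c.contributor] = weights.get(c.contributor, 0) + c.weight

    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one contribution must have positive weight")

    # Floor each share, then hand out the leftover units one at a time,
    # starting with the largest contributors (ties broken by name).
    payouts = {name: revenue * w // total for name, w in weights.items()}
    leftover = revenue - sum(payouts.values())
    for name in sorted(weights, key=lambda n: (-weights[n], n))[:leftover]:
        payouts[name] += 1
    return payouts


contributions = [
    Contribution("alice", "data provider", 3),
    Contribution("bob", "labeler", 2),
    Contribution("carol", "model developer", 5),
]
payouts = split_revenue(1000, contributions)
```

In this sketch a sale worth 1000 units is divided 300/200/500 among the three contributors. An on-chain version would read the weights from recorded provenance rather than from an in-memory list.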
Sahara’s goal is to make it easy to participate in this AI economy. The steps developers and users take are:
Seasoned AI developers:
- Developers can use the Sahara SDK and API to interact with the Sahara Blockchain and with any layer of the AI stack, customizing computation power, data storage, and incentive structure to form their own Sahara AI Agent, which can be permissioned and monetized for others to use.
New AI developers:
- Through a no-code/low-code environment, developers can create and deploy AI assets through intuitive interfaces and pre-built templates.
AI training:
- To participate in AI model training, users will simply go to a website where they can complete AI training tasks, from solving basic math to describing short videos, and get compensated with tradeable tokens.
AI users:
- Users can easily use an AI agent through an intuitive UI. Users have the flexibility to purchase licenses for access and further development or even trade shares of the AI assets.
- Users will be able to create a “Knowledge Vault” of their own personalized data, and create specialized AI trained with the user’s own data. Like other AIs on the platform, it can be permissioned to allow access to others, while the training data remains completely private and secure.
Corporations:
- Corporations can also create AI Agents (or “Business Agents”) trained on their own proprietary data. Since the system runs on the Sahara Blockchain, this is significantly cheaper, thanks to decentralized AI agent generation and serving.
- Corporations can also pay to generate Sahara Data, which melds AI Automatic Labeling and Human Labeling in order to efficiently create high-quality, privacy-preserving, multimodal datasets.
The corporate-facing products are already being used by high-profile clients. All other features are unreleased, but have waitlists. Sign up for them here.
Technical Brief
The Sahara team has designed the system to be as simple to use as possible, abstracting away the complexity required to ensure every part of the AI stack is compatible, monetizable, and secure. Under the hood, countless innovations were developed by the Sahara team to make this possible. To list a few:
- Sahara Blockchain minimizes gas fees and is fully EVM compatible, and its Sahara Cross-chain Communication (SCC) Protocol enables secure, permissionless data transfer across blockchains, facilitating trustless interoperability.
- Sahara AI-Native Precompiles (SAPs) are precompiled smart contracts that optimize performance for AI tasks and reduce computational overhead, including Training Execution SAPs and Inference Execution SAPs.
- Sahara Blockchain Protocols (SBPs) manage AI assets to ensure accountability. They include AI Attribution to track contributions and distribute rewards, AI Asset Registry to manage registration and provenance of AI assets, AI Licensing, and AI Ownership.
- Data Management is split between on-chain and off-chain storage: AI asset metadata, commitments, and proofs are held on-chain, while large datasets, AI models, and supplemental information are held off-chain in order to optimize data retrieval, security, and data availability.
- Collaborative Execution Protocols allow for joint AI model development and deployment across AI training, aggregation, and serving. Additional components such as PEFT (parameter-efficient fine-tuning) support fine-tuning, Privacy Preserving Compute supports Differential Privacy, Homomorphic Encryption, and Secret Sharing, and Fraud Proofs allow fraud to be proven.
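The on-chain/off-chain split in the Data Management point can be illustrated with a simple hash commitment. This is only a sketch under stated assumptions: the record fields (`asset_id`, `owner`, `commitment`, `size_bytes`) and the function names are hypothetical, SHA-256 is swapped in as a stand-in commitment scheme, and the litepaper may specify something different.

```python
import hashlib
import hmac


def commit(data: bytes) -> str:
    """Return a SHA-256 digest that stands in for an on-chain commitment."""
    return hashlib.sha256(data).hexdigest()


def make_onchain_record(asset_id: str, owner: str, data: bytes) -> dict:
    """Build the small metadata record that would live on-chain.

    The dataset itself stays off-chain; only its digest and size are recorded.
    All field names here are hypothetical.
    """
    return {
        "asset_id": asset_id,
        "owner": owner,
        "commitment": commit(data),
        "size_bytes": len(data),
    }


def verify(record: dict, data: bytes) -> bool:
    """Check that off-chain data matches the commitment in the record."""
    return hmac.compare_digest(record["commitment"], commit(data))


# The dataset is stored off-chain; the record is what a chain would hold.
dataset = b"label,text\n1,hello\n0,world\n"
record = make_onchain_record("dataset-001", "alice", dataset)

untouched_ok = verify(record, dataset)             # data unchanged
tampered_ok = verify(record, dataset + b"1,x\n")   # data modified off-chain
```

Anyone holding the off-chain file can recompute the digest and compare it with the on-chain record, so tampering is detectable without putting the dataset itself on-chain. A plain hash does not hide low-entropy data, so a real system would likely need a stronger commitment or proof scheme.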
Fully Integrated AI Stack
The team is led by the duo Sean Ren, a USC tenured professor who was previously awarded MIT Technology Review Innovator 35 and 2023 Samsung Researcher of the Year, and Tyler Z, a UC Berkeley alumnus who was previously Investment Director at Binance Labs and worked in investments at finance firms. The rest of the team brings valuable expertise from Stanford, UC Berkeley, AI2, Toloka, Stability AI, Microsoft, Binance, Google, Chainlink, LinkedIn, Avalanche, and more.
Sahara is also advised by top AI native researchers and enterprise clients:
- Laksh Vaaman Sehgal (Vice Chairman, Motherson Group)
- Rohan Taori (Research Scientist, Anthropic)
- Teknium (Co-founder, Nous Research)
- Vipul Prakash (CEO, Together AI)
- Elvis Zhang (Founding Member, Midjourney)
Sahara AI is currently used by over 35 leading tech innovators and research institutions, including Microsoft, Amazon, MIT, Motherson Group, and Snap, for various AI services like Sahara Data (for data collection/labeling) and Sahara Agents (for personalized domain agents).
Generative AI is still very much in its infancy in tech and market size; the centralized chat and video tools of today are throttled by their limited reach because of the difficulty in integrating the entire AI stack into one product. Sahara AI is the only company solving this bottleneck through a modular design which uses blockchain as the backbone for permissionless access, token distributions, and security. The future of AI must be easily accessible and equitable in order for everyone to participate in it, and Sahara AI is the only company building towards this vision.
For a more detailed technical breakdown, read the litepaper here.
Join the waitlist here.
Contact Sahara to create and use a Sahara Business Agent here.
Pantera Capital co-led the $43M investment round into Sahara AI.
- Paul Veradittakit