Sledhouse is Bobsled’s cutting-edge data product lakehouse designed to transform how data providers create, fulfill, and govern data products. Built on a multi-cloud, zero-copy architecture, Sledhouse offers fine-grained product customization and fulfillment at a fraction of the cost and complexity of traditional data feeds.
Sledhouse replaces complex and inefficient home-built data feeds with a unified data fulfillment system. Leading data companies like Dun & Bradstreet, CoreLogic, ZoomInfo, and Deutsche Boerse rely on Bobsled to deliver the right data to the right customers where they want it, all while dramatically reducing the infrastructure and labor costs traditionally required to maintain data feeds.
In this white paper, we will cover Sledhouse’s:
Bobsled offers a powerful suite of features that free up engineering teams and empower product, sales and operations to deliver better experiences to customers:
Empower operations and sales teams to configure the columns and rows to share with customers. Pre-configure common slices or customize based on customer needs. All without creating a ticket.
From a single platform, securely deliver analytics-ready data products to customers in the platforms where they work. Supports FTP, cloud storage and native cloud data warehouse sharing. Complete coverage without pipelines, credentials or accounts to manage.
Easily track usage and entitlements across every customer. Get alerts when deliveries fail, explore user behavior with telemetry, and easily audit lineage even for customized products.
Customize and fulfill data products directly to customers without replicating data. Sledhouse’s zero-copy fulfillment engine leverages the growing adoption of open table formats to enable highly performant, egress-free fulfillment of data products directly to a customer’s cloud platform (e.g., Snowflake, Databricks).
Sledhouse is built on an egress-free, S3-compatible data lake so you only pay egress once for every data product.
Sledhouse is architected to radically reduce the cost of managing and supporting the end-to-end data fulfillment workflow while maintaining the flexibility and security modern data businesses require. By replacing disparate data stores and pipelines with a single, globally interoperable data product lakehouse, data providers can accelerate customers’ time-to-insight while reducing the incremental cost of fulfillment.
Sledhouse uses our Spark-based Ingestion and Table Maintenance Engines to seamlessly and cost-efficiently maintain Iceberg-based versions of select data assets.
Replicates data to Iceberg in R2 using multiple approaches depending on the use case, supporting both incremental updates and full overwrites
Manages background maintenance jobs to compact, update, and delete data, ensuring optimal performance and data integrity
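The two replication modes mentioned above (incremental updates and full overwrites) can be illustrated with a minimal sketch. This is not Bobsled's implementation, which the text describes as Spark-based and writing to Iceberg. It uses plain Python dictionaries as a stand-in for a table, and the function names, the `id` key column and the `_deleted` marker are all assumptions made for illustration.

```python
from typing import Dict, Iterable

Row = Dict[str, object]
Table = Dict[object, Row]  # rows keyed by primary key


def apply_full_overwrite(rows: Iterable[Row], key: str = "id") -> Table:
    """Build the table from scratch, discarding whatever existed before."""
    return {row[key]: dict(row) for row in rows}


def apply_incremental(table: Table, changes: Iterable[Row], key: str = "id") -> Table:
    """Return a new table with the changes merged in.

    Changed or new rows are upserted by key; rows carrying a truthy
    `_deleted` marker (a hypothetical convention) are removed.
    """
    merged = dict(table)  # leave the input table untouched
    for change in changes:
        pk = change[key]
        if change.get("_deleted"):
            merged.pop(pk, None)
        else:
            merged[pk] = {k: v for k, v in change.items() if k != "_deleted"}
    return merged


# Initial load, followed by one incremental batch.
current = apply_full_overwrite([
    {"id": 1, "name": "Acme"},
    {"id": 2, "name": "Globex"},
])
updated = apply_incremental(current, [
    {"id": 2, "name": "Globex Corp"},   # update
    {"id": 1, "_deleted": True},        # delete
    {"id": 3, "name": "Initech"},       # insert
])
```

An incremental batch touches only the changed keys, which is why it tends to suit frequently updated tables, while a full overwrite is simpler when the source is republished wholesale.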
Sledhouse’s Interoperable Data Lake lets teams store data products in a format that supports cross-platform fulfillment with minimal infrastructure costs.
All data in Sledhouse is stored in Cloudflare’s egress-free storage, R2. This dramatically reduces one of the major cost drivers for both zero-copy and distributed multi-cloud fulfillment.
Elimination of data transfer egress
Global distribution network, ensuring low-latency access for users worldwide across data platforms
S3-compatible API, facilitating seamless integration with existing tools and workflows
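Because R2 exposes an S3-compatible API, existing S3 tooling can usually be pointed at it by changing the endpoint. The sketch below only builds the connection settings. It assumes `boto3` as the client library (the text does not say which tools Bobsled or its customers use), and the account ID and credentials shown are placeholders.

```python
def r2_s3_client_kwargs(account_id: str, access_key_id: str, secret_access_key: str) -> dict:
    """Build keyword arguments for an S3 client pointed at Cloudflare R2."""
    if not account_id:
        raise ValueError("account_id is required")
    return {
        "service_name": "s3",
        # R2's S3 endpoint is scoped to the Cloudflare account.
        "endpoint_url": f"https://{account_id}.r2.cloudflarestorage.com",
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
        "region_name": "auto",
    }


# Placeholder values; real credentials would come from a secret store.
kwargs = r2_s3_client_kwargs("example-account-id", "EXAMPLE_KEY_ID", "EXAMPLE_SECRET")

# With boto3 installed, the same settings create a regular S3 client:
#   import boto3
#   s3 = boto3.client(**kwargs)
#   s3.list_objects_v2(Bucket="example-bucket")
```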
We employ Apache Iceberg as our open table format, providing interoperability across data platforms. Because major platforms can read Iceberg natively, the same tables can be queried performantly on each of them.
ACID compliant lakehouse table format
Optimized performance for large-scale data processing tasks
Snapshot, time travel and rollback features, enhancing data versioning and recovery options
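The snapshot, time travel and rollback features listed above all follow from one idea: every commit produces a new immutable table version, and "current" is just a pointer to one of them. The toy class below illustrates that idea only. It is not the Iceberg API or its metadata format, and the class and method names are hypothetical.

```python
class SnapshotTable:
    """Toy table that keeps every committed version as an immutable snapshot."""

    def __init__(self):
        self._snapshots = [()]  # snapshot 0 is the empty table
        self._current = 0

    def commit(self, rows):
        """Store a new snapshot, make it current, and return its id."""
        self._snapshots.append(tuple(rows))
        self._current = len(self._snapshots) - 1
        return self._current

    def read(self, snapshot_id=None):
        """Read the current snapshot, or an older one (time travel)."""
        sid = self._current if snapshot_id is None else snapshot_id
        return list(self._snapshots[sid])

    def rollback(self, snapshot_id):
        """Point the table back at an earlier snapshot without deleting any."""
        if not 0 <= snapshot_id < len(self._snapshots):
            raise ValueError(f"unknown snapshot id: {snapshot_id}")
        self._current = snapshot_id


table = SnapshotTable()
v1 = table.commit(["row-a"])
v2 = table.commit(["row-a", "row-b"])

latest = table.read()        # current version
as_of_v1 = table.read(v1)    # time travel to the first commit
table.rollback(v1)           # current now points at v1; v2 is still readable
```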
The Data Product Catalog is where users create and manage data products. Users can create new data products by generating logical views of core tables stored in the Data Lake. Fulfillment teams can further customize on a per-customer basis.
Zero-copy data product creation and management
UI and API-based customization
Detailed lineage for all products fulfilled to customers
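To make "logical views of core tables" concrete, the sketch below turns a small product definition (which columns, which rows) into a generic SQL view statement. It is an illustration under stated assumptions, not Sledhouse's API: `ProductSpec`, `build_view_sql` and the example table and column names are hypothetical, and the generated SQL is generic rather than tailored to any one warehouse dialect.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ProductSpec:
    name: str                          # name of the view to create
    source_table: str                  # core table the view reads from
    columns: Sequence[str]             # columns shared with the customer
    row_filter: Optional[str] = None   # SQL predicate selecting the rows to share


def _quote(identifier: str) -> str:
    """Quote an identifier, doubling any embedded double quotes."""
    return '"' + identifier.replace('"', '""') + '"'


def build_view_sql(spec: ProductSpec) -> str:
    """Render a CREATE VIEW statement; no data is copied, only a definition."""
    if not spec.columns:
        raise ValueError("a data product must expose at least one column")
    cols = ", ".join(_quote(c) for c in spec.columns)
    sql = f"CREATE VIEW {_quote(spec.name)} AS SELECT {cols} FROM {_quote(spec.source_table)}"
    if spec.row_filter:
        # The predicate is inserted verbatim, so it must come from trusted configuration.
        sql += f" WHERE {spec.row_filter}"
    return sql


spec = ProductSpec(
    name="emea_firmographics",
    source_table="companies",
    columns=["name", "country"],
    row_filter="region = 'EMEA'",
)
view_sql = build_view_sql(spec)
```

A per-customer customization would then be another `ProductSpec` over the same source table, which is what keeps product creation zero-copy.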
The Global Fulfillment engine powers fulfillment within Sledhouse to every supported destination (cloud data warehouses, cloud storage, FTP). The fulfillment engine offers zero-copy fulfillment (select locations) and localized fulfillment via Bobsled’s first-of-its-kind distribution network. (More detailed data flows for Zero-Copy Fulfillment can be found in the Data Flow section.)
Sledhouse implements a comprehensive access management system:
Data access to Iceberg tables via external tables and views in data warehouses
Fine-grained access control managed through data product views
Sledhouse is the first commercial platform to offer cross-platform zero-copy fulfillment. With Zero-copy Fulfillment, Bobsled uses external tables to generate logical views of Iceberg tables for customers natively within their cloud data warehouses. (Please see the Data Flows section for more details.)
No incremental storage or egress fees
No replica datasets to maintain
Performant queries due to native support of Iceberg by major cloud data warehouses
Sledhouse also offers the ability to localize data within the platform or region of a customer. This improves query performance for data warehouse destinations. In most cases, data is replicated within single-tenant staging accounts managed by Bobsled.
Improved query performance
Egress-free replication
Sledhouse permissions data products to customers using a network of fully-managed, single-tenant cloud accounts. These accounts enable Sledhouse to permission products using the native sharing protocols of each platform. Customers receive data as a “data share” accessible with no ETL.
Secure, single-tenant infrastructure that is fully transferable
Credential-free permissioning for cloud destinations
No ETL for customers
To streamline operations and enhance user experience, Sledhouse offers:
Intuitive UI and comprehensive REST API for high-level declarative configuration of Sledhouse tables and data products
Streamlined workflows for data product creation and distribution
This robust technical foundation lets Sledhouse deliver on-demand data product customization, global fulfillment and unified governance across every customer with limited technical support.
Below, we offer a step-by-step description of the data flow when sharing data via zero-copy fulfillment. Data is encrypted both in motion and at rest throughout the entire workflow.
Data is incrementally replicated from the source platform (e.g., Google BigQuery, Snowflake) using multiple replication patterns to Iceberg tables in R2
Iceberg tables are created and managed by a Spark-based Table Maintenance service
External tables are set up in destination platforms on top of the Iceberg tables
Data Product views are created on top of external tables and added to shares/listings
Data consumers are granted access to data shares or listings
Consumer queries in destination platforms (e.g. Snowflake) are served from Iceberg tables in R2
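The chain of objects in the steps above (Iceberg table, external table, data product view, share, consumer grant) can be modeled in a few lines to show why the flow is zero-copy: every view ultimately resolves to the same stored table. This is a conceptual sketch only. The class names, the `resolve` method and the storage path are hypothetical and do not correspond to Bobsled's or any warehouse's API.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass(frozen=True)
class IcebergTable:
    location: str  # object-store path where the table's files live


@dataclass(frozen=True)
class ExternalTable:
    name: str
    source: IcebergTable  # points at the Iceberg table; holds no data itself


@dataclass(frozen=True)
class ProductView:
    name: str
    base: ExternalTable   # logical view over the external table


@dataclass
class Share:
    name: str
    views: Dict[str, ProductView] = field(default_factory=dict)
    grantees: Set[str] = field(default_factory=set)

    def resolve(self, consumer: str, view_name: str) -> str:
        """Return the storage location that serves a consumer's query."""
        if consumer not in self.grantees:
            raise PermissionError(f"{consumer} has no access to share {self.name}")
        return self.views[view_name].base.source.location


# One replicated table, exposed to two customers through different views.
companies = IcebergTable("s3://example-bucket/warehouse/companies")  # hypothetical path
external = ExternalTable("companies_ext", companies)

share_a = Share("customer_a", {"emea": ProductView("emea", external)}, {"analyst_a"})
share_b = Share("customer_b", {"apac": ProductView("apac", external)}, {"analyst_b"})

location_a = share_a.resolve("analyst_a", "emea")
location_b = share_b.resolve("analyst_b", "apac")
```

Both consumers are served from the same location, so adding a customer adds a view and a grant rather than another copy of the data.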
Bobsled is trusted by the world’s largest data companies to power fulfillment of their most sensitive data.
SOC 2 and ISO 27001 are just the start. As an infrastructure company, Bobsled has security in its DNA.
Bobsled is compliant with major modern privacy standards and offers solutions for DSAR and consent requirements.
All data in Bobsled is processed through single tenant environments, isolated and owned by each customer.
Data in Bobsled is encrypted in motion and at rest throughout the entire platform.
The emergence of data feeds as one of the three pillars of data monetization, alongside SaaS apps and APIs, means that data companies now need the security, scalability and reliability offered by commercially built infrastructure. Sledhouse dramatically simplifies the fulfillment process for analytical data by providing a single platform to streamline the creation, fulfillment and governance of data products no matter where they are being consumed.
Engineering teams at companies like Dun & Bradstreet, CoreLogic, Deutsche Boerse and ZoomInfo rely on Bobsled because it helps improve the customer experience while reducing costs.
Get prospects and customers querying your data and discovering insights in minutes, no matter where they work.
Empower non-technical teams to get customers up and running on the data they want without engineering support.
Radically reduce the storage, egress and compute costs required to customize and fulfill data in the cloud.