Bulk Feeds
This guide explains how to access and retrieve datasets using Carbon Arc’s Bulk delivery system, designed for efficient, flexible, and scalable data access.
Overview
Each dataset is identified by a unique data identifier (for example, card_us_detail_data). Use the API to list all available identifiers and interact with specific data assets programmatically.
Metadata and Partition Filters
Metadata for each dataset defines its structure and available partition filters, which allow targeted file retrieval:
- brand_type
- product
- retailer_banner
- service
- NA
Partition filters help reduce the volume of data retrieved, improving performance and relevance.
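As a rough illustration of how partition filters narrow a file listing, the sketch below selects files whose partition values match a set of requested filters. It runs client-side on invented data: the entry layout, the matches_filters helper, and the partition values ("national", "delivery", and so on) are assumptions for illustration, not part of the Carbon Arc API, which may apply these filters on the server instead.

```python
from typing import Iterable, Mapping


def matches_filters(partitions: Mapping[str, str],
                    filters: Mapping[str, Iterable[str]]) -> bool:
    """True when every requested partition key has one of the allowed values."""
    return all(partitions.get(key) in set(allowed)
               for key, allowed in filters.items())


# Hypothetical file entries; the partition values are invented for illustration.
files = [
    {"name": "part-0.parquet",
     "partitions": {"brand_type": "national", "service": "delivery"}},
    {"name": "part-1.parquet",
     "partitions": {"brand_type": "private", "service": "delivery"}},
    {"name": "part-2.parquet",
     "partitions": {"brand_type": "national", "service": "pickup"}},
]

# Keep only files for one brand_type, across two service values.
wanted = {"brand_type": ["national"], "service": ["delivery", "pickup"]}
selected = [f["name"] for f in files if matches_filters(f["partitions"], wanted)]
print(selected)
```

A file is kept only if it satisfies every filter key, so adding keys narrows the selection while adding allowed values to a key widens it.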
Accessing Data Files
Use the manifest API to list downloadable files, each including:
- Download URL
- Format (e.g., Parquet)
- Record Count
- File Size
- Last Modified Time
Each file represents a partitioned slice of the dataset.
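The file attributes listed above map naturally onto a small typed record. The following is a minimal sketch, assuming the manifest is returned as a list of dictionaries; the key names (url, format, record_count, size_bytes, last_modified), the example.com URLs, and the assumption that file size is reported in bytes are all hypothetical and should be checked against the actual manifest response.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ManifestFile:
    download_url: str
    file_format: str
    record_count: int
    file_size: int          # assumed to be in bytes
    last_modified: datetime


def parse_entry(entry: dict) -> ManifestFile:
    """Convert one raw manifest entry (hypothetical key names) into a typed record."""
    return ManifestFile(
        download_url=entry["url"],
        file_format=entry["format"],
        record_count=int(entry["record_count"]),
        file_size=int(entry["size_bytes"]),
        last_modified=datetime.fromisoformat(entry["last_modified"]),
    )


# Invented sample data standing in for a manifest response.
raw_manifest = [
    {"url": "https://example.com/files/part-0.parquet", "format": "parquet",
     "record_count": 1200, "size_bytes": 52_000,
     "last_modified": "2024-05-01T08:00:00+00:00"},
    {"url": "https://example.com/files/part-1.parquet", "format": "parquet",
     "record_count": 800, "size_bytes": 31_000,
     "last_modified": "2024-05-02T08:00:00+00:00"},
]

manifest = [parse_entry(e) for e in raw_manifest]
total_records = sum(f.record_count for f in manifest)
total_bytes = sum(f.file_size for f in manifest)
print(f"{len(manifest)} files, {total_records} records, {total_bytes} bytes")
```

Summing record counts and sizes before downloading gives a quick estimate of how much data a given manifest represents.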
Incremental Data Delivery
The manifest API supports efficient updates through two timestamp filters:
- Updated Since: Files modified after a given timestamp
- Created Since: Files newly created after a given timestamp
These filters enable lightweight, timely incremental syncs.
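To make the difference between the two timestamp filters concrete, here is a small sketch that applies both to an invented file list and then derives the checkpoint for the next sync. The field names (created, modified) and the strict "after" comparison are assumptions; in practice the filtering would presumably be done by the manifest API rather than in client code.

```python
from datetime import datetime


def ts(text: str) -> datetime:
    """Parse an ISO 8601 timestamp that carries an explicit UTC offset."""
    return datetime.fromisoformat(text)


# Hypothetical entries; "created" and "modified" are assumed field names.
files = [
    {"name": "part-0.parquet",
     "created": ts("2024-04-01T00:00:00+00:00"),
     "modified": ts("2024-04-01T00:00:00+00:00")},
    {"name": "part-1.parquet",
     "created": ts("2024-04-01T00:00:00+00:00"),
     "modified": ts("2024-05-03T00:00:00+00:00")},
    {"name": "part-2.parquet",
     "created": ts("2024-05-02T00:00:00+00:00"),
     "modified": ts("2024-05-02T00:00:00+00:00")},
]

last_sync = ts("2024-05-01T00:00:00+00:00")

# "Updated since": anything modified after the last sync, new or old.
updated = [f["name"] for f in files if f["modified"] > last_sync]
# "Created since": only files that did not exist at the last sync.
created = [f["name"] for f in files if f["created"] > last_sync]

# Store the newest modification time as the checkpoint for the next run.
next_checkpoint = max((f["modified"] for f in files), default=last_sync)
print(updated, created, next_checkpoint.isoformat())
```

In this sketch a rewritten older file (part-1) appears only under "updated", while a brand-new file (part-2) appears under both, which is why the two filters suit different sync strategies.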
Summary
Carbon Arc’s bulk feed delivery offers:
- Unique identifiers for each dataset
- Metadata with partition filters for precision
- Manifest‑based file discovery and download
- Incremental sync via timestamp filters
For more resources, visit: