Bulk Data Access Information
This guide walks through the key workflows and functions available for accessing bulk datasets using the carbonarc
Python SDK.
All functionality is accessible via:
from carbonarc import CarbonArcClient
ca = CarbonArcClient(token="your_token_here")
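Rather than hard-coding the token, it can be read from the environment before the client is created. The following is a minimal sketch: the variable name CARBONARC_TOKEN and the helper load_token are illustrative assumptions, not part of the SDK.

```python
import os

# Hypothetical variable name chosen for this sketch; the SDK does not define it.
TOKEN_ENV_VAR = "CARBONARC_TOKEN"


def load_token(env=None):
    """Return the API token from the environment, or raise if it is unset."""
    env = os.environ if env is None else env
    token = env.get(TOKEN_ENV_VAR, "").strip()
    if not token:
        raise RuntimeError(f"Set {TOKEN_ENV_VAR} before creating the client")
    return token


# Usage (requires the carbonarc package and a valid token):
# from carbonarc import CarbonArcClient
# ca = CarbonArcClient(token=load_token())
```

Failing early with a clear message avoids passing an empty token to the client and debugging an authentication error later.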
Suggested Start
After installing the dependencies, you can explore the dataset library using the functions below.
1. Dataset Discovery
- List all datasets available to your account
→ ca.data.get_datasets()
- Retrieve metadata, schema, and sample fields for a dataset
→ ca.data.get_dataset_information(dataset_id)
- List available graph datasets and metadata
→ ca.data.get_graphs()
→ ca.data.get_graph_information(graph_id)
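The discovery calls above can be combined into one small report. This is a sketch under stated assumptions: the helper describe is hypothetical, the response shapes are not documented in this guide, and the stand-in client below returns placeholder payloads only so that the snippet runs without credentials.

```python
import json
from types import SimpleNamespace


def describe(client, dataset_id=None):
    """Collect discovery results into one dict without assuming their shape."""
    report = {"datasets": client.data.get_datasets()}
    if dataset_id is not None:
        report["dataset_information"] = client.data.get_dataset_information(dataset_id)
    return report


# Stand-in client with placeholder payloads (NOT the real response format).
fake = SimpleNamespace(
    data=SimpleNamespace(
        get_datasets=lambda: {"placeholder": "list of datasets"},
        get_dataset_information=lambda dataset_id: {"requested_id": dataset_id},
    )
)

report = describe(fake, "CA0042")
# default=str keeps json.dumps from failing on dates or other non-JSON values.
print(json.dumps(report, indent=2, default=str))
```

With a real client, pass ca in place of fake and inspect the printed output to learn the actual field names.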
2. Manifest Exploration
- Get the full list of files available for a dataset
→ ca.data.get_data_manifest(dataset_id)
- Filter manifest by date
→ Use drop_date or logical_date with operators like ("==", "202501")
Example: ca.data.get_data_manifest("CA0042", logical_date=("==", "202501"))
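To request several months, one filter tuple can be built per month. The sketch below assumes that the value "202501" is a YYYYMM month key, which this guide does not state explicitly, and it uses only the "==" operator shown above. The helpers month_keys and monthly_filters are hypothetical.

```python
from datetime import date


def month_keys(start, end):
    """Yield YYYYMM strings for every month from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield f"{year}{month:02d}"
        month += 1
        if month > 12:
            year, month = year + 1, 1


def monthly_filters(start, end):
    """Build one ("==", "YYYYMM") filter tuple per month in the range."""
    return [("==", key) for key in month_keys(start, end)]


filters = monthly_filters(date(2024, 11, 1), date(2025, 2, 1))

# Each tuple can then be passed as logical_date=... or drop_date=..., e.g.:
# manifest = ca.data.get_data_manifest("CA0042", logical_date=filters[0])
```

Issuing one manifest request per month keeps the sketch within the single operator this guide confirms; if the SDK supports range operators, a single call may be simpler.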
3. Data Purchasing
- Purchase a specific file from the manifest
→ ca.data.buy_data(dataset_id, file_urls=[...])
- Purchase all files from a manifest or filtered time range
→ Pass the full list of file_url values from the manifest to buy_data
Code Example
This example purchases every file in the manifest for the Card - EU Detailed dataset.
manifest = ca.data.get_data_manifest("CA0042")
urls = [entry["file_url"] for entry in manifest["manifest"]]
order = ca.data.buy_data("CA0042", file_urls=urls)
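Before purchasing, it can help to clean the URL list so that duplicate or missing entries are not sent to buy_data. The sketch below relies only on the manifest["manifest"] and "file_url" keys used in the example above; the helper collect_file_urls and the sample URLs are made up for illustration.

```python
def collect_file_urls(manifest):
    """Return unique file_url values from a manifest, preserving order."""
    seen = set()
    urls = []
    for entry in manifest.get("manifest", []):
        url = entry.get("file_url")
        # Skip entries without a URL and URLs already collected.
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


# Placeholder manifest with made-up URLs, used only to exercise the helper.
sample_manifest = {
    "manifest": [
        {"file_url": "https://example.com/files/a"},
        {"file_url": "https://example.com/files/b"},
        {"file_url": "https://example.com/files/a"},
        {},
    ]
}

urls = collect_file_urls(sample_manifest)

# With a real client and manifest:
# order = ca.data.buy_data("CA0042", file_urls=collect_file_urls(manifest))
```

Checking that the list is non-empty before calling buy_data avoids placing an order for a filter that matched no files.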
4. Data Downloading
- Download a file to local disk
→ ca.data.download_file(file_id, directory)
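When several files need to be fetched, the download call can be wrapped in a loop that records failures instead of stopping at the first one. This is a sketch under stated assumptions: download_all is a hypothetical helper, this guide does not say where file IDs come from, and the stand-in client below only simulates download_file so the snippet runs offline.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace


def download_all(client, file_ids, directory):
    """Download each file ID into directory; return (succeeded, failed)."""
    target = Path(directory)
    target.mkdir(parents=True, exist_ok=True)
    succeeded, failed = [], []
    for file_id in file_ids:
        try:
            # Passing the directory as a string is an assumption of this sketch.
            client.data.download_file(file_id, str(target))
            succeeded.append(file_id)
        except Exception as exc:
            # Record the error and continue with the remaining files.
            failed.append((file_id, str(exc)))
    return succeeded, failed


def _fake_download(file_id, directory):
    """Simulated download that fails for one made-up ID."""
    if file_id == "bad-id":
        raise ValueError("unknown file")


# Stand-in client so the example runs without network access or a token.
fake = SimpleNamespace(data=SimpleNamespace(download_file=_fake_download))

with tempfile.TemporaryDirectory() as tmp:
    ok, failed = download_all(fake, ["id-1", "bad-id"], Path(tmp) / "downloads")
```

With a real client, pass ca in place of fake and retry or log the entries returned in failed.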