Bulk Data Access Information

This guide walks through the key workflows and functions available for accessing bulk datasets using the carbonarc Python SDK.

All functionality is accessible via:

```
from carbonarc import CarbonArcClient
ca = CarbonArcClient(token="your_token_here")
```
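Hard-coding a token in source makes it easy to leak. One option, sketched here, is to read it from an environment variable. The variable name `CARBONARC_TOKEN` and the helper `load_token` are assumptions for illustration, not part of the SDK.

```python
import os


def load_token(env_var: str = "CARBONARC_TOKEN") -> str:
    """Return the API token stored in the given environment variable.

    The variable name is a hypothetical convention, not something the SDK defines.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your API token")
    return token


# Usage, with the client constructor shown above:
#   from carbonarc import CarbonArcClient
#   ca = CarbonArcClient(token=load_token())
```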

Suggested Start

After installing the dependencies, you can explore the dataset library with the functions below.

1. Dataset Discovery

List all datasets available to your account
→ ca.data.get_datasets()
Retrieve metadata, schema, and sample fields for a dataset
→ ca.data.get_dataset_information(dataset_id)
List available graph datasets and metadata
→ ca.data.get_graphs()
→ ca.data.get_graph_information(graph_id)
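This guide does not describe the return shape of these calls, so a first exploratory step might be to pretty-print whatever comes back. A minimal sketch, assuming the responses are JSON-like Python objects (dicts and lists); `preview` is a hypothetical helper, not an SDK function.

```python
import json


def preview(response, max_chars: int = 2000) -> str:
    """Format a JSON-like response for reading, truncated to max_chars."""
    # default=str keeps dates or other non-JSON values from raising.
    text = json.dumps(response, indent=2, default=str)
    if len(text) > max_chars:
        text = text[:max_chars] + "\n... (truncated)"
    return text


# Usage with the client created earlier ("CA0042" is the dataset ID used
# elsewhere in this guide):
#   print(preview(ca.data.get_datasets()))
#   print(preview(ca.data.get_dataset_information("CA0042")))
```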

2. Manifest Exploration

Get the full list of files available for a dataset
→ ca.data.get_data_manifest(dataset_id)
Filter the manifest by date
→ Pass drop_date or logical_date as an (operator, value) tuple, such as ("==", "202501")
Example:
```
ca.data.get_data_manifest("CA0042", logical_date=("==", "202501"))
```
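To collect manifests for several months, the equality filter can be applied once per month. The sketch below relies only on the call shape shown above. `manifests_by_month` is a hypothetical helper, and the "YYYYMM" value format is inferred from the "202501" example.

```python
def manifests_by_month(client, dataset_id, months, date_field="logical_date"):
    """Fetch one manifest per month using the ("==", "YYYYMM") filter.

    date_field may be "logical_date" or "drop_date", the two filters named above.
    """
    if date_field not in ("logical_date", "drop_date"):
        raise ValueError(f"Unsupported date field: {date_field}")
    results = {}
    for month in months:
        # Build the keyword dynamically so either date field can be used.
        date_filter = {date_field: ("==", month)}
        results[month] = client.data.get_data_manifest(dataset_id, **date_filter)
    return results


# Usage with the client created earlier:
#   manifests = manifests_by_month(ca, "CA0042", ["202501", "202502"])
```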

3. Data Purchasing

Purchase a specific file from the manifest
→ ca.data.buy_data(dataset_id, file_urls=[...])
Purchase all files from a manifest or filtered time range
→ Pass the full list of file_urls from the manifest to buy_data

Code Example

This example purchases every file in the manifest for the Card - EU Detailed dataset, referenced here as "CA0042".

```
manifest = ca.data.get_data_manifest("CA0042")
urls = [entry["file_url"] for entry in manifest["manifest"]]
order = ca.data.buy_data("CA0042", file_urls=urls)
```
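For large manifests it may be convenient to submit URLs in smaller groups. The sketch below is one way to do that, assuming the manifest has the shape used in the example above (a "manifest" key holding entries with a "file_url" field). Batching is an illustrative choice, not a documented SDK requirement, and `buy_in_batches` is a hypothetical helper.

```python
def buy_in_batches(client, dataset_id, manifest, batch_size=100):
    """Purchase every file in a manifest, batch_size URLs per buy_data call."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    urls = [entry["file_url"] for entry in manifest["manifest"]]
    orders = []
    for start in range(0, len(urls), batch_size):
        batch = urls[start:start + batch_size]
        # Each response is kept as returned; its structure is not assumed here.
        orders.append(client.data.buy_data(dataset_id, file_urls=batch))
    return orders


# Usage with the client created earlier:
#   manifest = ca.data.get_data_manifest("CA0042")
#   orders = buy_in_batches(ca, "CA0042", manifest, batch_size=50)
```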

4. Data Downloading

Download a file to local disk
→ ca.data.download_file(file_id, directory)
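A download loop might look like the sketch below. This guide does not cover how file IDs are obtained, so `file_ids` stands in for IDs you already hold. `download_all` is a hypothetical helper, and the return value of download_file is passed through without assumptions about its type.

```python
from pathlib import Path


def download_all(client, file_ids, directory="downloads"):
    """Download each file ID into directory, creating the directory if needed."""
    target = Path(directory)
    target.mkdir(parents=True, exist_ok=True)
    results = {}
    for file_id in file_ids:
        # The return value is stored as-is; its type is not specified in this guide.
        results[file_id] = client.data.download_file(file_id, str(target))
    return results


# Usage with the client created earlier (file_ids is a placeholder list):
#   download_all(ca, file_ids, directory="data/CA0042")
```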
