
Bulk Data Access Information

This guide walks through the key workflows and functions available for accessing bulk datasets using the carbonarc Python SDK.

All functionality is accessible via:

from carbonarc import CarbonArcClient
ca = CarbonArcClient(token="your_token_here")
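To keep the token out of source files, one option is to read it from an environment variable. The sketch below is an illustration only: the variable name CARBONARC_TOKEN and the helpers read_token and make_client are hypothetical and are not part of the SDK. Only the CarbonArcClient(token=...) call comes from this guide.

```python
import os

# Hypothetical variable name chosen for this sketch; the SDK does not define it.
TOKEN_ENV_VAR = "CARBONARC_TOKEN"


def read_token(environ=None):
    """Return the API token from the environment, or raise if it is missing."""
    env = os.environ if environ is None else environ
    token = env.get(TOKEN_ENV_VAR, "").strip()
    if not token:
        raise RuntimeError(f"Set {TOKEN_ENV_VAR} before creating the client")
    return token


def make_client(environ=None):
    """Build a client whose token is taken from the environment."""
    # Imported here so read_token stays usable without the SDK installed.
    from carbonarc import CarbonArcClient

    return CarbonArcClient(token=read_token(environ))
```

With the variable set in your shell, ca = make_client() replaces the hardcoded token shown above.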

Suggested Start

After installing the dependencies, you can explore the dataset library with the functions below.

1. Dataset Discovery

  • List all datasets available to your account
    ca.data.get_datasets()

  • Retrieve metadata, schema, and sample fields for a dataset
    ca.data.get_dataset_information(dataset_id)

  • List available graph datasets and metadata
    ca.data.get_graphs()
    ca.data.get_graph_information(graph_id)
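When comparing several datasets, it can help to gather their information in one pass. The helper below is a minimal sketch: collect_dataset_info is a hypothetical name, and it makes no assumption about the shape of the value returned by get_dataset_information; it stores each result as returned.

```python
def collect_dataset_info(client, dataset_ids):
    """Return {dataset_id: information} by calling get_dataset_information per ID."""
    return {
        dataset_id: client.data.get_dataset_information(dataset_id)
        for dataset_id in dataset_ids
    }
```

For example, collect_dataset_info(ca, ["CA0042"]) returns a dictionary with a single entry for that dataset.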


2. Manifest Exploration

  • Get the full list of files available for a dataset
    ca.data.get_data_manifest(dataset_id)

  • Filter manifest by date
    → Pass drop_date or logical_date as an (operator, value) tuple, for example ("==", "202501")
    Example:

    ca.data.get_data_manifest("CA0042", logical_date=("==", "202501"))
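The filter value in the example looks like a year and month in "YYYYMM" form. Assuming that reading is correct, a small helper can build the tuple and catch malformed months early. month_filter is a hypothetical name, and the only operator confirmed by this guide is "=="; any other operator passed in is untested here.

```python
def month_filter(year, month, operator="=="):
    """Build an (operator, "YYYYMM") tuple like the one in the example above."""
    if not 1 <= month <= 12:
        raise ValueError(f"month must be in 1..12, got {month}")
    return (operator, f"{year:04d}{month:02d}")


# Usage with a client created as shown earlier:
# ca.data.get_data_manifest("CA0042", logical_date=month_filter(2025, 1))
```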

3. Data Purchasing

  • Purchase a specific file from the manifest
    ca.data.buy_data(dataset_id, file_urls=[...])

  • Purchase all files from a manifest or filtered time range
    → Pass the full list of file_urls from the manifest to buy_data

Code Example

This example purchases every file listed in the manifest for the Card - EU Detailed dataset.

manifest = ca.data.get_data_manifest("CA0042")
urls = [entry["file_url"] for entry in manifest["manifest"]]
order = ca.data.buy_data("CA0042", file_urls=urls)
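For large manifests, you may prefer to submit the purchase in smaller groups rather than one long list. The sketch below assumes the manifest layout used in the example above (a "manifest" list whose entries carry a "file_url" key). The batching itself is an assumption, not an SDK requirement, and collect_file_urls, buy_in_batches, and the default batch size of 100 are hypothetical choices.

```python
def collect_file_urls(manifest):
    """Extract file URLs, assuming the manifest["manifest"][i]["file_url"] layout."""
    return [entry["file_url"] for entry in manifest.get("manifest", [])]


def buy_in_batches(client, dataset_id, urls, batch_size=100):
    """Call buy_data once per batch of URLs and return the list of responses."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    orders = []
    for start in range(0, len(urls), batch_size):
        batch = urls[start:start + batch_size]
        orders.append(client.data.buy_data(dataset_id, file_urls=batch))
    return orders
```

An empty manifest yields an empty URL list, so buy_in_batches makes no purchase calls in that case.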

4. Data Downloading

  • Download a file to local disk
    ca.data.download_file(file_id, directory)
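To download several files into one folder, the call can be wrapped in a loop. This is a sketch under assumptions: download_files is a hypothetical helper, this guide does not say where file IDs come from (so they are taken as an input), and passing the directory as a string path is assumed to be acceptable to download_file.

```python
from pathlib import Path


def download_files(client, file_ids, directory):
    """Create the target directory if needed, then download each file into it."""
    target = Path(directory)
    target.mkdir(parents=True, exist_ok=True)
    for file_id in file_ids:
        client.data.download_file(file_id, str(target))
    return target
```

Creating the directory up front avoids a failure on the first download if the folder does not exist yet.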
