Accessing Bulk Datasets via the Python SDK
This guide walks you through authenticating, retrieving, and downloading bulk data files from Carbon Arc using the Python SDK.
Prerequisites
Make sure you have:
- Python 3.10+
- The carbonarc SDK and python-dotenv installed
python3.10 -m venv .venv
source .venv/bin/activate
pip install git+https://github.com/Carbon-Arc/carbonarc
pip install python-dotenv
Environment Setup
Create a .env file in your working directory, or export API_AUTH_TOKEN in your environment. You can generate a token at https://platform.carbonarc.co/profile.
API_AUTH_TOKEN=<your API auth token>
Import Dependencies
Use the following to import the required dependencies and create the client:
# Import required dependencies
import os
from dotenv import load_dotenv
from carbonarc import CarbonArcClient
import pandas as pd
load_dotenv()
HOST = "https://api.carbonarc.co"
API_AUTH_TOKEN = os.getenv("API_AUTH_TOKEN")
ca = CarbonArcClient(host=HOST, token=API_AUTH_TOKEN)
Browse available bulk datasets and retrieve information
List all available bulk datasets:
## List datasets
datasets = ca.data.get_datasets()
datasets
Retrieve information for a given dataset ID.
Select a bulk dataset and retrieve its information. This example shows the Card - EU Detailed Panel data.
## Get information for a given dataset
dataset = ca.data.get_dataset_information(dataset_id="CA0042")
dataset
Retrieve manifest for dataset
Retrieve the manifest of available files for a dataset. To filter by logical_date, pass a tuple with an operator and the date value. The example below uses logical_date to view a manifest of the files for January 2025 (202501).
dm = ca.data.get_data_manifest("CA0042", logical_date=("==", "202501"))
dm
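The exact manifest schema can vary by dataset; as a sketch, assuming each entry carries a file_url key (the shape this guide relies on later when buying files), you could collect the file URLs like this. The entries below are hypothetical stand-ins for a real response:

```python
# Hypothetical manifest response, shaped like the structure used later in this guide
dm = {
    "manifest": [
        {"file_url": "/CA0042/part-00001.parquet", "logical_date": "202501"},
        {"file_url": "/CA0042/part-00002.parquet", "logical_date": "202501"},
    ]
}

# Collect the file URLs, e.g. to inspect or purchase later
file_urls = [entry["file_url"] for entry in dm.get("manifest", [])]
print(file_urls)
```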
The logical_date filter uses the following format:
- logical_date (tuple): A tuple in the format (operator, YYYYMM).
- Supported operators: ==, <, <=, >, >=
Purchase a specific file from a Dataset
If you have already inspected the manifest and picked a specific file, you can pass its file_url to purchase and download only that file.
dataset_id = "CA0042"
file_url = "/CA0042/00211-815-51afdecf-c129-4a6c-969a-e4dbd5c857fb-00002?drop_partition_id=1750267738&logical_week=202504"
Buy the data
This function submits the request and purchases the data from the Carbon Arc API. It also logs the order in your order history.
ca.data.buy_data(dataset_id=dataset_id, file_urls=[file_url])
An example response looks like the following:
{
    'order_id': 'YOUR ORDER ID',
    'total_price': 58.0,
    'total_records_count': 30350,
    'file_urls': ['https://api.carbonarc.co/v2/library/data/files/YOURID']
}
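The download steps below derive a file ID from each returned URL by taking its last path segment. Using the example response above ('YOURID' is a placeholder, not a real ID):

```python
# Example purchase response from the guide above ('YOURID' is a placeholder)
order = {
    "order_id": "YOUR ORDER ID",
    "file_urls": ["https://api.carbonarc.co/v2/library/data/files/YOURID"],
}

# The file ID is the last path segment of each download URL
file_ids = [url.split("/")[-1] for url in order["file_urls"]]
print(file_ids)  # ['YOURID']
```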
Download the Data
If you have a specific file you would like to download, use the code below to download it and convert it to a CSV.
import os
import pandas as pd
# List of full download URLs from your purchase
file_urls = ['https://api.carbonarc.co/v2/library/data/files/YOURIDHERE']
# Output directory
output_dir = "./downloads"
os.makedirs(output_dir, exist_ok=True)
# Loop through each file and download → convert
for url in file_urls:
    # Extract the file ID from the URL
    file_id = url.split("/")[-1]
    # Download the file using your client
    parquet_path = ca.data.download_file(file_id=file_id, directory=output_dir)
    # Convert to CSV
    df = pd.read_parquet(parquet_path)
    csv_path = parquet_path.replace(".parquet", ".csv")
    df.to_csv(csv_path, index=False)
    print(f"✅ Converted to CSV: {csv_path}")
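One caveat with the filename handling above: str.replace swaps every occurrence of ".parquet" in the path, not just the final extension. A slightly more robust sketch uses os.path.splitext:

```python
import os

parquet_path = "./downloads/data.parquet"
# splitext splits off only the final extension, leaving the rest of the path intact
csv_path = os.path.splitext(parquet_path)[0] + ".csv"
print(csv_path)  # ./downloads/data.csv
```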
Alternatively, you can purchase and download one or more files from the manifest in a single script:
import os
import pandas as pd
# Step 1: Set dataset ID and file(s) to buy (from manifest)
dataset_id = "CA0042"  # example dataset ID
file_urls_to_buy = [
    "/INSERT YOUR PARQUET FILE URL HERE",
    # Add more file paths from the manifest if needed
]
# Step 2: Purchase the data
order = ca.data.buy_data(dataset_id=dataset_id, file_urls=file_urls_to_buy)
# Step 3: Extract full download URLs
download_urls = order["file_urls"]
# Step 4: Prepare output directory
output_dir = "./downloads"
os.makedirs(output_dir, exist_ok=True)
# Step 5: Download each file and convert to CSV
for file_url in download_urls:
    try:
        # Extract file ID from full URL
        file_id = file_url.split("/")[-1]
        # Download to local disk
        parquet_path = ca.data.download_file(file_id=file_id, directory=output_dir)
        # Convert Parquet → CSV
        df = pd.read_parquet(parquet_path)
        csv_path = parquet_path.replace(".parquet", ".csv")
        df.to_csv(csv_path, index=False)
        print(f"✅ Converted to CSV: {csv_path}")
    except Exception as e:
        print(f"❌ Error processing {file_url}: {str(e)}")
If you would like to buy and download all the files from the manifest, you can use the example code below. It buys all the files and downloads the Parquet files without converting them to CSVs.
import os
# Step 1: Get manifest
dataset_id = "CA0042"
dm = ca.data.get_data_manifest(dataset_id)
# Step 2: Extract file_urls from the manifest
manifest_files = dm.get("manifest", [])
file_urls_to_buy = [entry["file_url"] for entry in manifest_files]
# Step 3: Buy all files
order = ca.data.buy_data(dataset_id=dataset_id, file_urls=file_urls_to_buy)
# Step 4: Extract download URLs
download_urls = order["file_urls"]
# Step 5: Set up output directory
output_dir = "./downloads"
os.makedirs(output_dir, exist_ok=True)
# Step 6: Download all Parquet files
for file_url in download_urls:
    try:
        file_id = file_url.split("/")[-1]
        parquet_path = ca.data.download_file(file_id=file_id, directory=output_dir)
        print(f"✅ Downloaded Parquet file: {parquet_path}")
    except Exception as e:
        print(f"❌ Error downloading {file_url}: {str(e)}")