CarbonArc Data Catalog Explorer (Python Guide)
Explore Carbon Arc data programmatically
This guide shows how to use the CarbonArc Python SDK to explore the CarbonArc data catalog, including:
- Browsing available datasets
- Viewing data dictionaries
- Exploring specific dataset topics
- Retrieving full dataset "tearsheets" (metadata)
Installation
pip install carbonarc
Configuration
from carbonarc import CarbonArcClient
API_AUTH_TOKEN = "your_api_token_here"
API_BASE_URL = "https://api.carbonarc.co"
client = CarbonArcClient(
    host=API_BASE_URL,
    token=API_AUTH_TOKEN
)
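Hard-coding a token works for a quick test, but a shared script or notebook would more typically read it from the environment. A minimal sketch, assuming a hypothetical environment variable named CARBONARC_API_TOKEN (the SDK does not define this name; pick whatever suits your setup):

```python
import os

# Hypothetical variable name; not something the CarbonArc SDK defines.
TOKEN_ENV_VAR = "CARBONARC_API_TOKEN"


def load_api_token(env_var: str = TOKEN_ENV_VAR) -> str:
    """Return the API token from the environment, or raise a clear error."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your API token.")
    return token
```

The returned value can then be passed as token= when constructing CarbonArcClient.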
Browse All Available Datasets
datasets = client.data.get_datasets()
print(f"Total Datasets Available: {datasets.get('size', 0)}")
# Print the first 10 datasets with their IDs, providers, and a short description
for idx, dataset in enumerate(datasets.get('datasources', [])[:10], 1):
    print(f"{idx}. {dataset.get('dataset_name', 'N/A')}")
    print(f" Dataset ID: {', '.join(dataset.get('dataset_id', []))}")
    print(f" Provider: {dataset.get('provider_name', 'N/A')}")
    print(f" Description: {dataset.get('description', 'N/A')[:80]}...")
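Once the catalog is loaded, it can be handy to search it locally rather than scanning the printed list. The sketch below assumes only the response fields used in the loop above (datasources, dataset_name, description); find_datasets is a hypothetical helper, not part of the SDK, and the example records are made up for illustration.

```python
def find_datasets(catalog: dict, keyword: str) -> list[dict]:
    """Return datasets whose name or description contains keyword (case-insensitive)."""
    needle = keyword.lower()
    matches = []
    for dataset in catalog.get("datasources", []):
        # `or ""` guards against fields that are present but set to None.
        name = dataset.get("dataset_name") or ""
        description = dataset.get("description") or ""
        if needle in name.lower() or needle in description.lower():
            matches.append(dataset)
    return matches


# Made-up records that only mimic the field names used in the loop above.
example_catalog = {
    "size": 2,
    "datasources": [
        {"dataset_name": "Card Spend", "dataset_id": ["X001"], "description": "Consumer payments"},
        {"dataset_name": "Web Traffic", "dataset_id": ["X002"], "description": None},
    ],
}

for match in find_datasets(example_catalog, "payment"):
    print(match["dataset_name"])
```

With a live client, the dictionary returned by client.data.get_datasets() would take the place of example_catalog.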
View a Dataset Data Dictionary
dataset_id = "CA0056"
data_dict = client.data.get_data_dictionary(dataset_id=dataset_id)
View a Specific Topic Data Dictionary
dataset_id = "CA0028"
entity_topic_id = 145
topic = client.data.get_data_dictionary(
    dataset_id=dataset_id,
    entity_topic_id=entity_topic_id
)[0]
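This guide does not show the shape of the objects returned by get_data_dictionary, so a generic way to look at them can help. The sketch below is a hypothetical helper (preview is not an SDK function) that pretty-prints any JSON-like value and truncates long output; it assumes the response is made of dicts and lists, and falls back to str() for anything else.

```python
import json


def preview(obj, max_chars: int = 1000) -> str:
    """Pretty-print any JSON-like object, truncated to max_chars characters."""
    # default=str keeps non-JSON values (dates, decimals, ...) from raising.
    text = json.dumps(obj, indent=2, default=str)
    if len(text) > max_chars:
        return text[:max_chars] + "\n... (truncated)"
    return text


# Placeholder object; substitute data_dict or topic from the calls above.
print(preview({"example_key": ["a", "b"]}))
```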
Get Detailed Dataset Information (Tearsheet)
dataset_id = "CA0056"
dataset_info = client.data.get_dataset_information(dataset_id)
Get Data Samples
You can pull real data samples directly from the API for validation and exploration.
Sample for a Specific Topic
For example, to pull a sample for a specific topic such as Core Panel vs By Payment:
client.data.get_data_sample(
    dataset_id="CA0028",
    entity_topic_id=146
)
Sample for All Topics in a Dataset
To retrieve a sample across all topics in a dataset:
client.data.get_data_sample(
    dataset_id="CA0056"
)
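When sampling several datasets in one run, a single failing ID should not stop the rest. The sketch below is a hypothetical wrapper (collect is not part of the SDK) that relies only on the dataset_id= keyword shown in the calls above. It catches exceptions broadly because the SDK's exception types are not documented in this guide, and it uses a stand-in function so it runs without network access.

```python
from typing import Any, Callable


def collect(fetch: Callable[..., Any], dataset_ids: list[str]) -> tuple[dict, dict]:
    """Call fetch(dataset_id=...) for each ID; return (results, errors) keyed by ID."""
    results, errors = {}, {}
    for dataset_id in dataset_ids:
        try:
            results[dataset_id] = fetch(dataset_id=dataset_id)
        except Exception as exc:  # broad on purpose: SDK exception types are unknown here
            errors[dataset_id] = str(exc)
    return results, errors


# Stand-in for client.data.get_data_sample so the sketch runs offline.
def fake_fetch(dataset_id: str) -> dict:
    if dataset_id == "BAD":
        raise ValueError("unknown dataset")
    return {"dataset_id": dataset_id}


results, errors = collect(fake_fetch, ["CA0028", "BAD"])
print(results)
print(errors)
```

With a real client, client.data.get_data_sample would be passed as fetch in place of fake_fetch.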
Summary
This guide enables developers to:
- Discover available datasets
- Inspect dataset schemas
- Drill into specific topics
- Retrieve full dataset metadata