Data Release Lifecycle

At Carbon Arc, every dataset is shaped, structured, and activated through an end-to-end pipeline designed for reliability, compliance, and usability. The pipeline has six distinct stages, from acquisition to delivery, so that each dataset is ready for decision-making, analysis, and delivery at scale.


1. Acquire

Each dataset begins with a structured onboarding process. Whether sourced from institutional providers or commercial partners, every asset is framed with clear rules of engagement:

  • Source documentation is reviewed (company background, data dictionary, delivery spec)
  • Usage rights and delivery formats are defined
  • Privacy and compliance considerations are scoped up front

2. Extract

Data is delivered via agreed methods (API, SFTP, flat file) and staged in Carbon Arc’s data separation model, isolating each source into secure, dedicated environments.

  • Security protocols are applied at ingestion
  • Global privacy standards are enforced before downstream processing begins

3. Clean & Transform

Raw inputs are converted into structured, ontology-aligned datasets through Carbon Arc’s internal transformation layer:

  • De-duplication and normalization
  • Entity mapping via an identity resolution framework
  • Alignment to Carbon Arc’s unified ontology to enable cross-dataset joins
  • Final outputs structured into clean, queryable tables
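The transformation steps above can be sketched in a few lines of Python. This is a minimal illustration, not Carbon Arc’s internal transformation layer: the column names (`company`, `date`, `amount`), the `ENTITY_IDS` lookup, and the choice of `(name, date)` as the de-duplication key are all assumptions made for the example.

```python
import re

def normalize_name(raw: str) -> str:
    """Lowercase, strip punctuation, and collapse runs of whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", raw.lower())
    return re.sub(r"\s+", " ", cleaned).strip()

# Hypothetical lookup from normalized names to ontology entity IDs.
ENTITY_IDS = {"acme corp": "ENT-001", "globex": "ENT-002"}

def clean_and_transform(rows):
    """Normalize, de-duplicate, and map raw rows to entity IDs."""
    seen = set()
    output = []
    for row in rows:
        name = normalize_name(row["company"])
        key = (name, row["date"])  # assumed de-duplication key
        if key in seen:
            continue  # drop later duplicates, keep the first occurrence
        seen.add(key)
        output.append({
            "entity_id": ENTITY_IDS.get(name),  # None when unresolved
            "company": name,
            "date": row["date"],
            "amount": float(row["amount"]),
        })
    return output

raw_rows = [
    {"company": "Acme Corp.", "date": "2024-01-01", "amount": "10.5"},
    {"company": "ACME  corp", "date": "2024-01-01", "amount": "10.5"},
    {"company": "Globex", "date": "2024-01-01", "amount": "4"},
]
clean_rows = clean_and_transform(raw_rows)
```

Here the two spellings of the same company collapse into one row after normalization, and each surviving row carries an entity ID that other datasets could join on.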

4. Certify

Transformed datasets pass through a certification layer that derives analytical outputs and validates the result:

  • Metrics, KPIs, cohorts, and share calculations are generated
  • Schema is validated for consistency, and performance benchmarks are run
  • Data is staged for final review before release
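As a rough sketch of what schema validation and a share calculation might look like, the Python below checks rows against an expected schema and computes each entity’s share of a total. The schema, the column names, and both function names are hypothetical; the actual certification checks are not described in this document.

```python
# Hypothetical schema: column name -> expected Python type.
EXPECTED_SCHEMA = {"entity_id": str, "date": str, "amount": float}

def validate_schema(rows, schema):
    """Return a list of human-readable schema violations (empty if none)."""
    errors = []
    for i, row in enumerate(rows):
        if set(row) != set(schema):
            errors.append(f"row {i}: columns {sorted(row)} do not match schema")
            continue
        for col, typ in schema.items():
            if not isinstance(row[col], typ):
                errors.append(f"row {i}: {col} is not {typ.__name__}")
    return errors

def share_by_entity(rows):
    """Compute each entity's fraction of the total amount."""
    totals = {}
    for row in rows:
        totals[row["entity_id"]] = totals.get(row["entity_id"], 0.0) + row["amount"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {entity: 0.0 for entity in totals}  # avoid division by zero
    return {entity: amount / grand_total for entity, amount in totals.items()}

rows = [
    {"entity_id": "ENT-001", "date": "2024-01-01", "amount": 30.0},
    {"entity_id": "ENT-002", "date": "2024-01-01", "amount": 10.0},
]
errors = validate_schema(rows, EXPECTED_SCHEMA)
shares = share_by_entity(rows)
```

Running validation before computing shares means a malformed row is reported as an error rather than silently skewing the derived metrics.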

5. Activate

Certified datasets are deployed to the Carbon Arc platform with full enablement materials:

  • Dataset tear sheets, documentation, and example use cases
  • Sample rows for QA and validation
  • Client alerts issued on new data availability

Compliance-managed accounts are notified so they can approve the dataset before activation.


6. Deliver

Data is made accessible through Carbon Arc’s delivery channels:

  1. Platform UI
    • Download CSV option for analysts
  2. Programmatic
    • Best for automated workflows using the carbonarc Python package
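For the CSV download option, an export can be read with Python’s standard `csv` module. The sketch below uses an in-memory string as a stand-in for a downloaded file, and the column names are made up for illustration; it does not use the carbonarc package, whose API is not covered here.

```python
import csv
import io

def load_export(handle):
    """Read a CSV export into a list of dicts, one per row."""
    return list(csv.DictReader(handle))

# Stand-in for an opened export file; the columns are hypothetical.
sample = io.StringIO("entity_id,date,amount\nENT-001,2024-01-01,30.0\n")
records = load_export(sample)
```

With a real export, `load_export` would be given a file opened with `open(path, newline="")`. Note that `csv.DictReader` returns every value as a string, so numeric columns need explicit conversion.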

Note:
All data—regardless of delivery method—flows through the same structured release process and transformation framework prior to client access.


Platform Commitments

  • ✅ Every dataset is fully transformed, certified, and ontology-aligned before activation
  • ✅ All datasets are accessible via API
  • ✅ All bulk datasets follow the same QA, governance, and transformation standards