data-engineering-skills

🥈Silver

Data engineering skills for analytics and data engineers working with dbt and Snowflake. Automate data pipelines, transform data, and generate SQL code. Integrates with Claude Code for AI-assisted data workflows.

Updated 2mo ago

Intermediate, 30 min to implement, automation

Saves ~300 min per use

Quick Install

git clone https://github.com/AltimateAI/data-engineering-skills.git

Works with:

Claude

Overview

About This Skill

Data Engineering Skills is a collection of Claude Code skills built for analytics and data engineers working with dbt and Snowflake. The skills encode proven workflows and best practices, transforming Claude into a capable data engineering assistant for tasks like model creation, error debugging, test schema design, documentation, SQL-to-dbt migration, refactoring, and query optimization. Benchmark results show 53% accuracy on real-world dbt tasks and 84% pass rate on Snowflake query optimization. Skills activate automatically based on your request and work standalone or integrate with Altimate's MCP server for real-time project and warehouse access.

How to Use

1. **Prepare Your Environment**: Ensure you have dbt installed and configured with Snowflake credentials. Run `dbt init [PROJECT_NAME]` if starting a new project.
2. **Create Source Files**: In your `models/` directory, create a `staging` subdirectory. Add a `schema.yml` file to define sources and tests.
3. **Generate Models**: Use the prompt template to create new dbt models. Replace [PLACEHOLDERS] with your specific requirements. For complex transformations, break them into multiple CTEs as shown in the example.
4. **Test Locally**: Run `dbt compile` to check for syntax errors, then `dbt run --select [MODEL_NAME]` to test the model against your Snowflake warehouse.
5. **Integrate with CI/CD**: Add the generated models to your dbt Cloud or GitHub Actions pipeline. Configure the `dbt test` command to run automatically after each deployment.

**Pro Tips:**
- Use `dbt docs generate` to create documentation for your models automatically
- For large transformations, consider breaking them into multiple models with clear dependencies
- Store frequently used transformations in macros to avoid repetition
- Build and test a single model against a specific profile with `dbt build --profile [PROFILE_NAME] --select [MODEL_NAME]`, then review the resulting queries in Snowflake's query history to monitor performance
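The tip about storing repeated transformations in macros can be sketched as follows. This is a minimal illustration, not part of the skill itself: the macro name `clean_string` and its file path are hypothetical, and the source and column names are borrowed from the example model further down the page.

```sql
-- File 1: macros/clean_string.sql (hypothetical macro name and path)
-- Trims whitespace and converts empty strings to NULL.
{% macro clean_string(column_name) %}
    NULLIF(TRIM({{ column_name }}), '')
{% endmacro %}

-- File 2: a staging model that reuses the macro
-- (source and columns assumed to match the stg_customers example)
SELECT
    customer_id,
    {{ clean_string('first_name') }} AS first_name,
    {{ clean_string('last_name') }} AS last_name
FROM {{ source('raw', 'customers') }}
```

Keeping the cleaning rule in one macro means a change to it, such as also lowercasing values, is made once rather than in every model that trims strings.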

Use Cases

Create dbt models following project conventions and best practices

Debug dbt compilation and runtime errors with systematic troubleshooting

Write and test dbt schema tests matching existing project patterns

Optimize expensive Snowflake queries using query history and performance analysis

Setup & Installation

Quick Install

No install command available. Check the GitHub repository for manual installation instructions.

Alternative Install (Git Clone)

git clone https://github.com/AltimateAI/data-engineering-skills

Requirements

Claude Code or compatible AI agent
Works with: Claude

Quick Start Guide

Install the Skill

Copy the `git clone` command above and run it in your terminal.

Open Your AI Agent

Launch Claude Code, Cursor, or your preferred AI coding agent.

Try It Out

Use the prompt template or examples below to test the skill.

Customize

Adapt the skill to your specific use case and workflow.

Usage Examples

Prompt Template

Act as an expert data engineer using dbt and Snowflake. For the project [PROJECT_NAME], write a dbt model to transform [SOURCE_TABLE] into [TARGET_TABLE] with these requirements: [TRANSFORMATION_RULES]. Include error handling for null values in [COLUMN_LIST]. Generate the SQL code and explain the dbt model structure. Add a test for data quality that checks [TEST_CONDITION].

Example Output

```sql
-- dbt model: stg_customers.sql
-- Transforms raw customer data into a clean staging table

{{
  config(
    materialized='table',
    schema='staging',
    tags=['customer']
  )
}}

WITH source_data AS (
  SELECT
    customer_id,
    first_name,
    last_name,
    email,
    signup_date,
    country,
    CASE
      WHEN email IS NULL THEN 'missing_email'
      ELSE email
    END AS cleaned_email,
    CASE
      WHEN country IN ('USA', 'United States') THEN 'US'
      WHEN country IN ('UK', 'United Kingdom') THEN 'UK'
      ELSE 'Other'
    END AS region
  FROM {{ source('raw', 'customers') }}
),

cleaned_data AS (
  SELECT
    customer_id,
    TRIM(first_name) AS first_name,
    TRIM(last_name) AS last_name,
    cleaned_email,
    signup_date::DATE AS signup_date,
    region,
    CURRENT_TIMESTAMP AS etl_loaded_at
  FROM source_data
  WHERE customer_id IS NOT NULL
)

SELECT * FROM cleaned_data

-- Data quality tests
-- not_null and accepted_values are built-in dbt generic tests. They are
-- declared in schema.yml under this model's columns, not in the model file:
--   customer_id: not_null
--   region: accepted_values with values 'US', 'UK', 'Other'
```

**Model Structure Explanation:**
This dbt model follows a clean staging pattern by:
1. First creating a CTE (`source_data`) that handles raw data cleaning including null handling for emails and standardizing country codes
2. Then creating a `cleaned_data` CTE that applies additional transformations like trimming whitespace and proper date casting
3. Finally selecting all columns from the cleaned data CTE

**Key Features Implemented:**
- Null handling for critical fields (customer_id, email)
- Data type conversion (signup_date to DATE)
- Standardization of categorical data (country to region)
- Timestamp for tracking load times
- Two data quality tests: one for null values in customer_id and another for valid region values

**Usage Notes:**
This model should be run as part of your dbt project's staging layer. Once the tests are declared in `schema.yml`, they run during `dbt test` (or `dbt build`) and fail if any records violate the constraints. The `etl_loaded_at` field helps track when data was last processed.
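The region check described above could also be written as a dbt singular test: a standalone SQL file in the project's `tests/` directory that passes when its query returns zero rows. The sketch below assumes the model is named `stg_customers`, as in the example; the test file name is hypothetical.

```sql
-- tests/assert_stg_customers_region_is_valid.sql (hypothetical file name)
-- Returns every row whose region falls outside the expected set.
-- dbt treats any returned row as a test failure.
SELECT
    customer_id,
    region
FROM {{ ref('stg_customers') }}
WHERE region NOT IN ('US', 'UK', 'Other')
   OR region IS NULL
```

A singular test like this is a reasonable fit for one-off business rules, while the built-in `not_null` and `accepted_values` tests are usually simpler to declare in `schema.yml`.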

Apply to these tools

dbt

SQL-first transformation framework for analytics engineering

Claude

AI assistant built for thoughtful, nuanced conversation

IronCalc

IronCalc is a spreadsheet engine and ecosystem

Swagger

Design, document, and generate code for APIs with interactive tools for developers.

ServiceNow

Enterprise workflow automation and service management platform

GPT for work

Automate your spreadsheet tasks with AI power
