Connect Amazon Redshift

Overview

Warehouse Native Experimentation allows you to run experiments on data that already lives in your data warehouse. By connecting Harness FME directly to your Amazon Redshift instance, you can securely query and analyze experiment data from your source of truth.

To begin, connect your Amazon Redshift instance as a data source through a direct connection or using IAM role-based authentication.

Prerequisites

Ensure that you have the following before getting started:

  • Access to your organization's Redshift cluster endpoint and database
  • An IAM role with appropriate read access to the database and schema containing experiment data, and write access to a results table
  • A designated results table where experiment results are stored in Amazon Redshift

Setup

Harness recommends the following best practices:

  • Use IAM Role authentication instead of static credentials.
  • Restrict database access to read-only privileges, granting write access only to the results table (see the example grants after this list).
  • Keep Redshift clusters within secure VPCs and use SSL connections.
  • Regularly audit IAM Roles and access policies.
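
To support the least-privilege recommendation above, the following Redshift GRANT statements are a minimal sketch of the database-level access the integration needs. The user fme_user, schema analytics, and results table fme_results are placeholder names; substitute the database identity your IAM Role maps to and the objects you actually use.

-- Placeholder identifiers: replace fme_user, analytics, and fme_results
-- with the database user, schema, and results table for your cluster.
GRANT USAGE ON SCHEMA analytics TO fme_user;
GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO fme_user;
-- Write access is needed only for the results table.
GRANT SELECT, INSERT ON analytics.fme_results TO fme_user;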

To integrate Amazon Redshift as a data warehouse for Warehouse Native Experimentation:

  1. Select Redshift as your data warehouse. In the Data Sources tab of your Harness FME project, select Redshift from the list of supported data warehouses.

  2. Enter the following connection details:

    Field | Description | Example
    Cluster Endpoint (Host) | The endpoint of your Redshift cluster. | redshift-cluster.analytics.us-east-1.redshift.amazonaws.com
    Port | The port number used by your Redshift instance (by default, 5439). | 5439
    Database | The database containing your experimentation data. | experiments
    Schema | The schema within your database containing your experiment or metric tables. | analytics
    IAM Role ARN | The IAM role with permissions to access your Redshift cluster. | arn:aws:iam::123456789012:role/FMEAccessRole
    Results Table Name | The name of the table where experiment results are stored. | FME_RESULTS
  3. Configure authentication. Harness FME supports IAM role-based authentication for secure, temporary access to Redshift.

    • Create or use an existing IAM role with permissions to access the cluster.
    • Attach a policy granting Redshift read access to relevant databases and schemas.
    • Provide the IAM Role ARN in Harness FME.
  4. Select a database and a schema. After authentication, Harness FME retrieves the list of accessible databases and schemas based on your IAM Role permissions. Select the database and schema containing your experiment exposure and event/metric data.

  5. Specify a results table. Designate a results table where Harness FME will write experiment analysis results. Ensure the following:

    • The table exists in your database.
    • The table's columns match the expected format for experiment results, as described in the field reference below.

    Field | Type | Description
    METRICRESULTID | VARCHAR | Unique identifier representing a specific calculation per metric, per experiment, per analysis run.
    TREATMENT | VARCHAR | The experiment variant (e.g., Control or Treatment) associated with the metric results.
    DIMENSIONNAME | VARCHAR | The name of the dimension being analyzed (e.g., country, platform).
    DIMENSIONVALUE | VARCHAR | The corresponding value of the analyzed dimension.
    ATTRIBUTEDKEYSCOUNT | BIGINT | Count of unique keys (users, sessions, etc.) attributed to this metric result.
    REQUESTTIMESTAMP | TIMESTAMP | Timestamp when the metric computation request occurred.
    MIN | FLOAT8 | Minimum observed value for the metric.
    MAX | FLOAT8 | Maximum observed value for the metric.
    COUNT | BIGINT | Total number of observations included in the metric calculation.
    SUM | FLOAT8 | Sum of all observed metric values.
    MEAN | FLOAT8 | Average (mean) of the metric values.
    P50 | FLOAT8 | 50th percentile (median) metric value.
    P95 | FLOAT8 | 95th percentile metric value.
    P99 | FLOAT8 | 99th percentile metric value.
    VARIANCE | FLOAT8 | Variance of the metric values.
    EXCLUDEDUSERCOUNT | BIGINT | Number of users excluded from the analysis (due to filters, SRM, etc.).
    ASOFTIMESTAMP | TIMESTAMP | Timestamp representing when the result snapshot was written.

    To create the results table with the correct structure, run the following SQL statement in Amazon Redshift:

    CREATE TABLE IF NOT EXISTS <DATABASE_NAME>.<SCHEMA_NAME>.<TABLE_NAME> (
        METRICRESULTID VARCHAR(256),
        TREATMENT VARCHAR(256),
        DIMENSIONNAME VARCHAR(256),
        DIMENSIONVALUE VARCHAR(256),
        ATTRIBUTEDKEYSCOUNT BIGINT,
        REQUESTTIMESTAMP TIMESTAMP,
        MIN FLOAT8,
        MAX FLOAT8,
        COUNT BIGINT,
        SUM FLOAT8,
        MEAN FLOAT8,
        P50 FLOAT8,
        P95 FLOAT8,
        P99 FLOAT8,
        VARIANCE FLOAT8,
        EXCLUDEDUSERCOUNT BIGINT,
        ASOFTIMESTAMP TIMESTAMP
    );
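
    After creating the table, you can optionally confirm its column layout with a query against information_schema. This is a quick, hedged check that assumes the table was created as fme_results in a schema named analytics; Redshift stores unquoted identifiers in lowercase, so match on the lowercase names.

    -- Verify the results table structure (placeholder schema and table names).
    SELECT column_name, data_type
    FROM information_schema.columns
    WHERE table_schema = 'analytics'
      AND table_name = 'fme_results'
    ORDER BY ordinal_position;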
  6. Test the connection by clicking Test Connection. If the test fails, verify the following:

    • The IAM Role has the correct trust policy and permissions (see the privilege checks after this list).
    • The Redshift cluster is publicly accessible (or within a connected VPC).
    • The correct database, schema, and port are entered.
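
    To help diagnose permission failures, the following Redshift queries check whether a given database user holds the privileges this integration relies on. The user fme_user, schema analytics, and table fme_results are placeholders for your own identities and objects.

    -- Placeholder identifiers: replace fme_user, analytics, and fme_results.
    SELECT HAS_SCHEMA_PRIVILEGE('fme_user', 'analytics', 'USAGE');
    SELECT HAS_TABLE_PRIVILEGE('fme_user', 'analytics.fme_results', 'SELECT');
    SELECT HAS_TABLE_PRIVILEGE('fme_user', 'analytics.fme_results', 'INSERT');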
  7. Save and activate. Once the test passes, click Save to create the connection.

Your Redshift data source can now be used to create assignment and metric sources for Warehouse Native Experimentation.
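
As an illustration of what an assignment source built on this connection queries, the sketch below selects exposure records from a hypothetical table. The table analytics.exposures and its columns (user_id, flag_name, treatment, exposure_time) are assumptions; map them to the tables and columns your assignment source configuration actually uses.

-- Hypothetical exposure table and columns; adjust to your own schema.
SELECT
    user_id,
    flag_name,
    treatment,
    exposure_time
FROM analytics.exposures
WHERE exposure_time >= DATEADD(day, -30, GETDATE());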

Example Redshift configuration

Setting | Example
Cluster Endpoint | redshift-cluster.analytics.us-east-1.redshift.amazonaws.com
Port | 5439
Database | experiments
Schema | analytics
IAM Role ARN | arn:aws:iam::123456789012:role/FMEAccessRole
Results Table | FME_RESULTS
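
With this example configuration, one hedged way to confirm that analysis results are being written is to query the results table directly while connected to the experiments database. Adjust the schema and table names if yours differ.

-- Inspect the most recent experiment results written by Harness FME.
SELECT treatment, dimensionname, dimensionvalue, mean, p50, p95
FROM analytics.fme_results
ORDER BY asoftimestamp DESC
LIMIT 20;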