Snowflake Openflow in Action – Part 2 – Setting up the Openflow Connector for SQL Server

Snowflake Openflow is a fully managed, cloud-native data integration service introduced in May 2025, built on Apache NiFi.

It enables organizations to connect diverse data sources and destinations, handling both structured and unstructured data types, and supports high-speed ingestion in batch and streaming modes. Openflow offers an open and extensible platform, allowing users to build and extend processors for various data sources and destinations. It provides enterprise-ready features such as built-in security, compliance, observability, and maintainability. Openflow supports a wide range of use cases, including ingesting data from unstructured sources like Google Drive and Box, replicating change data capture (CDC) from databases into Snowflake, ingesting real-time events from streaming services like Apache Kafka, and integrating data from SaaS platforms such as LinkedIn Ads for reporting and analytics.

In this blog post, we’ll guide you through setting up the Openflow Connector to replicate data from Microsoft SQL Server to Snowflake in near real-time using Change Data Capture (CDC).

Prerequisites

Before you begin, ensure you have the following:

  • A Snowflake account on AWS with ACCOUNTADMIN privileges.
  • An AWS account.
  • Openflow set up in your Snowflake account.
  • A Microsoft SQL Server instance (version 2017 or 2022) with administrative privileges.

For detailed setup instructions, please refer to our blog on Openflow Setup:

https://sandbox.cittabase.com/snowflake-openflow-setup/

Implementation Steps

Step 1: Enable Change Tracking in SQL Server

Change Tracking must be enabled on the SQL Server database and the tables you intend to replicate.

In this case, we have created a database called SALES and a table named ORDER_DETAILS, and enabled Change Tracking on both the database and the table.

CREATE DATABASE SALES;

ALTER DATABASE SALES SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);

CREATE TABLE SALES.DBO.ORDER_DETAILS 
(
        ORDER_ID INT IDENTITY(1, 1) PRIMARY KEY,
        CUSTOMER_NAME NVARCHAR(100),
        ORDER_AMOUNT DECIMAL(10, 2),
        ORDER_RECEIVED_DATETIME DATETIME DEFAULT GETDATE()
);

ALTER TABLE SALES.DBO.ORDER_DETAILS ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = ON);
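
To confirm that Change Tracking is active, you can query SQL Server’s built-in catalog views (sys.change_tracking_tables must be queried from within the SALES database):

USE SALES;

SELECT DB_NAME(database_id) AS DATABASE_NAME FROM sys.change_tracking_databases;

SELECT OBJECT_NAME(object_id) AS TABLE_NAME FROM sys.change_tracking_tables;
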
Step 2: Grant Necessary Permissions to the User in SQL Server

Grant the required permissions to the SQL Server user that Openflow will use to access the data. These permissions allow the user to read data and access change tracking information.

GRANT SELECT ON SCHEMA_NAME.TABLE_NAME TO USER_NAME;

GRANT VIEW CHANGE TRACKING ON SCHEMA_NAME.TABLE_NAME TO USER_NAME;
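
For the SALES database and ORDER_DETAILS table created in Step 1, and assuming a hypothetical SQL Server user named OPENFLOW_USER, the grants would look like this:

USE SALES;

GRANT SELECT ON DBO.ORDER_DETAILS TO OPENFLOW_USER;

GRANT VIEW CHANGE TRACKING ON DBO.ORDER_DETAILS TO OPENFLOW_USER;
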
Step 3: Configure Snowflake for Openflow

Set up a service user, role, database, and warehouse in Snowflake to facilitate data replication. This setup ensures that Openflow has the necessary access to replicate data into Snowflake.

CREATE DATABASE <destination_database>;

CREATE USER <openflow_user> TYPE=SERVICE COMMENT='Service user for automated access of Openflow';

CREATE ROLE <openflow_role>;

GRANT ROLE <openflow_role> TO USER <openflow_user>;

GRANT USAGE ON DATABASE <destination_database> TO ROLE <openflow_role>;

GRANT CREATE SCHEMA ON DATABASE <destination_database> TO ROLE <openflow_role>;

CREATE WAREHOUSE <openflow_warehouse>
    WITH
        WAREHOUSE_SIZE = 'MEDIUM'
        AUTO_SUSPEND = 300
        AUTO_RESUME = TRUE;

GRANT USAGE, OPERATE ON WAREHOUSE <openflow_warehouse> TO ROLE <openflow_role>;
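
Optionally, verify the setup by reviewing the privileges granted to the role and the roles granted to the user before moving on:

SHOW GRANTS TO ROLE <openflow_role>;

SHOW GRANTS TO USER <openflow_user>;
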
Step 4: Set Up Key-Pair Authentication

Generate a public-private key pair for secure authentication. Assign the public key to the Snowflake service user. For more information on how to generate a key pair and assign it to a user, see Key-pair authentication and key-pair rotation.

ALTER USER <openflow_user> SET RSA_PUBLIC_KEY = '<public_key>';
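
Note that the key value should be pasted without the BEGIN/END PUBLIC KEY delimiter lines. A quick way to confirm the key was registered is to describe the user and check that the RSA_PUBLIC_KEY_FP (fingerprint) property is populated:

DESC USER <openflow_user>;
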
Step 5: Install the Connector
  • Navigate to the Openflow Overview page. In the Featured connectors section, select View more connectors.
  • On the Openflow connectors page, find the SQL Server connector and select Add to runtime.
  • In the Select runtime dialog, select your runtime from the Available runtimes drop-down list.
  • Select Add.
  • Authenticate to the deployment with your Snowflake account credentials and select Allow when prompted to allow the runtime application to access your Snowflake account. The connector installation process takes a few minutes to complete.
  • Authenticate to the runtime with your Snowflake account credentials.
  • The Openflow canvas appears with the connector process group added to it.

 

Step 6: Configure Connector Parameters

Right-click on the imported process group and select Parameters. A list of parameters to be configured appears; start with the source parameters.

SQL Server Source Parameters
  • JDBC Connection URL: jdbc:sqlserver://<host>:<port>;Database=<database_name>;trustServerCertificate=True;encrypt=false (an example is shown after this list)
    • Ensure that the SQL Server is accessible over the specified port.
    • In this scenario, we’ve utilized an on-premises SQL Server as the data source. While we won’t delve into the detailed steps of exposing an on-premises SQL Server for integration with Snowflake Openflow, it’s essential to ensure the following configurations are in place:
      • Enable TCP/IP in SQL Server Configuration Manager.
      • Set the TCP port to 1433 and disable dynamic ports.
      • Configure firewall rules to allow inbound connections on port 1433.
      • Use tools like netstat to verify that the port is open.
      • If behind a router or CGNAT, consider using tools like NGROK for TCP forwarding.
  • JDBC Driver: Download the Microsoft JDBC Driver for SQL Server from the official website, then select the Reference Assets checkbox and upload the downloaded driver as a reference asset in Openflow.
  • SQL Server Username & SQL Server Password: Provide the username and password for the SQL Server user with the necessary permissions (the user granted access in Step 2).
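
For example, with a hypothetical host sqlserver.example.com listening on the default port 1433 and the SALES database from Step 1, the JDBC Connection URL would be:

jdbc:sqlserver://sqlserver.example.com:1433;Database=SALES;trustServerCertificate=True;encrypt=false
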

Snowflake Destination Parameters
  • Account Identifier: Your Snowflake account identifier in the format [organization_name]-[account_name].
  • Authentication Strategy: Choose KEY_PAIR for key-pair authentication.
  • Private Key: Upload the private key file used for authentication; select the Reference asset checkbox to upload it. Ensure it’s in PKCS8 format with proper PEM headers. For more details, see Key-pair authentication and key-pair rotation.
  • Private Key Password: If your private key is encrypted, provide the corresponding password.
  • Role: The Snowflake role assigned to the Openflow user.
  • Username: The Openflow service user’s username.
  • Warehouse: The warehouse configured for Openflow operations.

 

Ingestion Parameters
  • Included Table Names: Comma-separated list of fully qualified table names to replicate (e.g., dbo.ORDER_DETAILS).
  • Included Table Regex: Regular expression to match table names for replication (e.g., dbo\..* to include all tables in the dbo schema).
  • Filter JSON: JSON array specifying columns to include in replication.
  • Merge Task Schedule CRON: CRON expression defining when merge operations occur. Use * * * * * ? for continuous merges.
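
As an illustration, for the ORDER_DETAILS table created earlier, the ingestion parameters could be set as follows (the CRON value is a Quartz-style expression; * * * * * ? fires every second, while the example below runs the merge once per minute):

Included Table Names: dbo.ORDER_DETAILS

Merge Task Schedule CRON: 0 * * * * ?
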

Step 7: Activate the Connector

Once all the parameters are configured, right-click the connector process group and:

  • Enable all the Controller Services. You can verify this by right-clicking the process group and selecting Controller Services.
  • Enable the process group.
  • Start the process group.

Once activated, the connector will begin replicating data from SQL Server to Snowflake.

Double-click the process group to view its contents. Inside, you’ll find two nested process groups and a green play indicator, showing that the components within them are currently running.

Testing Change Data Capture (CDC)

To verify CDC functionality, start by loading sample data into the on-premises SQL Server.
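
For example, a few rows can be inserted into the ORDER_DETAILS table (the customer names and amounts below are purely illustrative):

INSERT INTO SALES.DBO.ORDER_DETAILS (CUSTOMER_NAME, ORDER_AMOUNT)
VALUES ('Alice', 250.00), ('Bob', 125.50);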

Next, verify in Snowflake: confirm that the data appears in the corresponding Snowflake table. The connector creates the destination table automatically and replicates the data from SQL Server.
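
A simple query against the destination database confirms the replicated rows (assuming the connector creates the destination schema and table using the source names, dbo.ORDER_DETAILS; exact identifier casing may vary):

SELECT * FROM <destination_database>.DBO.ORDER_DETAILS;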

A few other objects are also created in Snowflake as part of the Openflow process to handle the incremental CDC data.

To test updates, modify a record in SQL Server (e.g., change the CUSTOMER_NAME for a specific ORDER_ID).
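
For instance (the order ID and name below are illustrative):

UPDATE SALES.DBO.ORDER_DETAILS SET CUSTOMER_NAME = 'Updated Customer' WHERE ORDER_ID = 1;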

In Snowflake, in near real time, observe that the changes are reflected, along with timestamps indicating the update.

Now, to test a delete transaction, delete a record from SQL Server.
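
For example, deleting order ID 13, the record referenced below:

DELETE FROM SALES.DBO.ORDER_DETAILS WHERE ORDER_ID = 13;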

In Snowflake, we observe that for order ID 13, the delete flag and the record’s update date have been modified. This indicates that a hard delete in SQL Server, with change tracking enabled, is represented as a soft delete in Snowflake.

In this blog, we’ve walked through setting up the Openflow Connector to replicate data from SQL Server to Snowflake, including enabling Change Tracking, configuring necessary permissions, and verifying CDC functionality.

In future posts, we’ll delve deeper into the Openflow processors and explore advanced configurations to optimize your data integration workflows.

Please feel free to reach out to us for your Snowflake solution needs. Cittabase is a Premier partner with Snowflake.


