- May 31, 2024
- Posted by: Anisha P
- Categories: GCP, Snowflake
Introduction:
The Snowflake Connector for Google Analytics is a powerful tool that enables seamless integration of your Google Analytics data with Snowflake’s cloud data platform. This connector allows for effortless importation of either raw or aggregated Google Analytics data into Snowflake, streamlining the process for users. This blog post provides a comprehensive guide to implementing the connector, including its limitations, and cost considerations.
Prerequisites:
- Snowflake account (Non trial)
- GCP (with a billed project)
- GA4
Snowflake connector for Google Analytics Raw data
Procedure:
GA4 setup:
1. Sign in to Google Analytics with the email that has GCP enabled.
2. Data stream is set up so that the streamed data is loaded into Snowflake.
3.To set up the GA4 raw data extraction, do the following:
- From the dropdown list in the top navigation bar, select a GA4 property.
- Enter the Admin panel.
- Under the Product links column, select the BigQuery Links option.
- Select Link » Choose a BigQuery project. From the available list, select the GCP project where you want to extract the GA4 raw data.
- Select Daily or Streaming export type.
- Select Save.
GCP setup:
Ensure that the below setup is done at GCP end:
- Configure the OAuth consent screen and client ID for your GCP project with the help of https://other-docs.snowflake.com/connectors/google/gard/gard-connector-create-client-id
- Ensure that the Cloud Resource Manager API is enabled for your GCP project. (https://cloud.google.com/endpoints/docs/openapi/enable-api)
Snowflake setup:
1. Navigate to Snowflake Marketplace, locate the Google Analytics connector, and click “Get” to enable it in your Snowflake account.
2. Establish a connection with your Google Analytics account, choosing between OAuth or Service Account.
3. Select the specific Google Analytics properties you want to replicate, define their destination in Snowflake, and set up the synchronization schedule. Your data will now automatically refresh within your Snowflake account based on your preferred frequency.
Upon data sync, the property details are loaded into snowflake destination table in the variant format.
Limitations:
The Snowflake Connector for Google Analytics Raw Data has the following limitations:
- The Snowflake Connector for Google Analytics Raw Data is supported in AWS and Azure deployments. GCP deployments are currently not supported.
- Accounts in government regions are currently not supported.
- The Snowflake Connector for Google Analytics Raw Data does not work on Snowflake trial accounts due to external access security concerns. This is not expected to change in the future.
- You can install only one instance of the connector in each Snowflake account. To install multiple instances of the connector, you must create a separate account for each instance.
- The connector can retrieve data for Google Analytics 4 (GA4) properties only. Universal Analytics (UA) is not and will not be supported.
- To install and configure the connector, you must use the ACCOUNTADMIN role. Other roles are not supported currently.
- The daily data in Google Analytics can change up to 72 hours after recording. Currently, the connector doesn’t reflect the changes in real time.
- The Snowflake Connector for Google Analytics Raw Data assumes that the application is the owner (has OWNERSHIP privilege) of all tables and views in destination schema. Granting FUTURE OWNERSHIP privilege on tables or views in this SCHEMA/DATABASE, or using a managed schema, will result in connector not working correctly.
- Parameter AUTOCOMMIT must be enabled in the session interacting with the connector.
- The connector will not work correctly if custom date formats are set in the account.
- Emojis are not supported as parts of the application name set during connector installation.
- Streaming is only available for Cloud Projects with Billing enabled.
Snowflake connector for Google Analytics Aggregate data
The Snowflake Connector for Google Analytics Aggregate Data enables you to automatically ingest aggregated data from Google Analytics 4 (GA4) reports into your Snowflake account. The connector extracts aggregated data using the GA4 Reporting API.
Procedure:
In the navigation menu, select Data Products » Apps.
Search for the Snowflake Connector for Snowflake Connector for Google Analytics Aggregate Data, and then select the tile for the connector.
The configuration wizard starts.
Ensure that all prerequisites on the list are met and mark them as done.
Click Start configuration.
The configuration process can take several minutes. When it is finished, the wizard advances to Authentication.
To specify authentication, follow one of these options:
service accounts and OAuth2.
Select Connect.
After the process is completed successfully, ingestion configuration can begin.
In the Data sync section, select Add report.
In the new dialog, complete the following fields:
Report name.
Identifier for the new report
Specify a name that is unique for your destination schema. The name of the report must follow the naming rules for unquoted object identifiers.
Property
A Google Analytics property that holds the data you want to ingest.
Choose one of the available Google Analytics properties.
Dimensions
Google Analytics 4 dimensions to be included in your report.
Dimensions are attributes of your data. For example, the dimension city indicates the city from which an event originates. The connector includes the date dimension in all reports.
Metrics
Google Analytics 4 metrics to be included in your report.
Metrics are quantitative measurements of a report. For example, the metric active1DayUsers is the number of distinct active users on your site or app within a one-day period. You must select at least one metric.
Keep empty rows.
If this is selected, the ingested data should contain records with dimension combinations for which all the metrics are zero (indicating that there were no events correlated with those dimensions).
Sync data from
Start date for the initial load of data.
Sync schedule.
Sync frequency for the ongoing load of data.
Select Start Sync.
It can take a few minutes for the ingestion process to be complete. The table and view with your report data will not be visible in the destination database until the data from GA is fully fetched.
The report can be viewed by querying the respective report table.
Limitations:
The Snowflake Connector for Google Analytics Aggregate Data has the following limitations:
• The connector is currently only supported in AWS and Azure clouds.
• Accounts in government regions are not supported.
• Only one instance of the connector is supported per Snowflake account. To install multiple instances of the connector, you must create a separate account for each connector. A single connector can ingest data from multiple GA properties, if the configured credentials have access to those properties.
• To install and configure the connector, you must be logged in as a user with the ACCOUNTADMIN role. Other roles are not supported currently.
• The connector can only retrieve data for Google Analytics 4 (GA4) properties. Universal Analytics (UA) are not supported.
• The GA4 API may return sampled data. Currently, the connector does not indicate if ingested data has been sampled.
• The data in Google Analytics might change up to 72 hours after it is recorded. Currently, the connector cannot reflect the changes in real time.
• The Snowflake Connector for Google Analytics Aggregate Data is not supported with Snowflake trial accounts due to external access security concerns.
• The Snowflake Connector for Google Analytics Aggregate Data creates tables and views for the ingested data in a database and schema chosen by the user. Currently, the connector must have ownership of those tables and views. There must be no future ownership grants on the database or schema, and the schema must not have managed access enabled.
• Users can configure at most 40 reports. Currently, this limit cannot be increased.
• AUTOCOMMIT must be enabled to configure and use the connector.
• Currently, the connector requires the TIMESTAMP_INPUT_FORMAT to be set to AUTO.
Cost Estimation:
You will incur additional BigQuery costs for using streaming export at the rate of $0.05 per gigabyte of data. 1 gigabyte equates to approximately 600,000 Google Analytics hits, though that number will vary depending on hit size.
In Snowflake, to determine the costs generated by the connector, you can create a separate account solely for the connector. Using a specific account lets you track the exact data transfer generated by the connector.
If you cannot use a separate account for the connector, consider the following options:
- To track storage costs more easily, create a separate database for storing ingested data.
- To determine exact compute costs, allocate a warehouse only for the connector.
- To build custom cost reports, use object tags on databases and the warehouse.
Conclusion:
By adhering to the above procedures, data can be streamed into snowflake using snowflake connectors. In summary, the Snowflake Connector for Google Analytics Raw Data allows for automatic ingestion of granular GA4 data into Snowflake, while the Aggregate Data connector retrieves aggregated data using the GA4 reporting API.
Cittabase is a select partner with Snowflake. Please feel free to contact us regarding your Snowflake solution needs. Our snowflake solutions encompass a suite of services for your data integration and migration needs. We are committed to providing personalized assistance and support customized to your requirements.