Overview
Glean provides our customers the ability to deploy Glean software inside their own Google Cloud Platform (GCP) project. This deployment requires your GCP admin to:
Create a new GCP project.
Associate a valid billing account.
Enable applicable GCP APIs.
Request the required quota increases from GCP.
Create a Service Account with a Project Owner role and associate a JSON account key.
Notify Glean of the GCP zone selected, the Project Name, Project ID, and Project Number.
After you complete the steps above, Glean's systems will automatically build and deploy the required compute, workflows, and software into your GCP project.
At this stage, Glean will advise you that your tenant is ready, allowing your admins to proceed with the setup process, such as configuring Single Sign-On (SSO) and connecting your data sources.
Preferred GCP Regions
To ensure our customers enjoy the highest quality of machine learning performance, reduced latency, and adherence to regional data residency laws, Glean advises deploying our services in one of the following recommended GCP regions:
Iowa, USA (us-central1)
Taiwan, APAC (asia-east1)
The Netherlands, EMEA (europe-west4)
These regions are equipped with Tensor Processing Units (TPUs), which are specifically designed to accelerate machine learning tasks. Additionally, selecting these regions can lead to cost savings and help maintain compliance with various regulations.
While Glean does support additional GCP regions, these alternative locations use Graphics Processing Units (GPUs) rather than TPUs for machine learning workloads. Workloads run on GPUs take longer to complete, which increases the cost of running the workflows. Additionally, depending on the region, only a limited range of GPU models may be available, some of which will result in lower-quality output. As such, we strongly recommend hosting your Glean projects within the preferred regions listed above to achieve the best balance of performance and cost-efficiency.
Should you need more information or assistance in determining the most suitable GCP region for your Glean deployment, please don't hesitate to reach out to your dedicated Glean engineer.
GCP Environment Preparation Process
1. Select a GCP Region
Select a supported GCP region for Glean to build your environment in, then choose a zone within that region. You must notify Glean of the GCP zone selected, e.g. us-central1-a.
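A GCP zone name is the region name followed by a hyphen and a single letter. As a minimal sketch, the Python below derives the region from a zone and checks it against the preferred regions listed earlier. The helper names are hypothetical and are not part of any Glean or Google tooling.

```python
# Preferred regions as listed in the "Preferred GCP Regions" section.
PREFERRED_REGIONS = {"us-central1", "asia-east1", "europe-west4"}


def region_of_zone(zone: str) -> str:
    """Return the region part of a zone name such as 'us-central1-a'."""
    region, sep, suffix = zone.rpartition("-")
    # A zone is "<region>-<single letter>"; reject anything else.
    if not sep or not region or len(suffix) != 1 or not suffix.isalpha():
        raise ValueError(f"not a zone name: {zone!r}")
    return region


def is_preferred_zone(zone: str) -> bool:
    """True if the zone belongs to one of the preferred (TPU) regions."""
    return region_of_zone(zone) in PREFERRED_REGIONS
```

For example, is_preferred_zone("us-central1-a") is true, while a zone in any other region returns false and may be worth discussing with your Glean engineer.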
2. Create a new GCP Project
Go to the Manage Resources page in the GCP console and click Create Project.
In the New Project window that appears, add a project name, organization, and location. For the project name, the preferred format is glean-{customer name} or glean-{customer name}-{prod/sandbox}.
Make sure your project is created under the same organization as your GSuite account, not “No Organization”.
Click Create.
Notify Glean of the following:
Project name, e.g. “glean-company”
Project ID, e.g. “glean-company”
Project number, e.g. “715000000000”
Go to Billing in the GCP console.
Click Link a billing account to set up billing for this project.
Ensure that the billing account has a corporate credit card attached to it; a billing account on the “free trial billing tier” will not work.
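The naming convention in this step can be sketched in Python as follows. The function names are hypothetical. The project ID rule encoded here (6 to 30 characters; lowercase letters, digits, and hyphens; starting with a letter and not ending with a hyphen) is GCP's general requirement rather than anything specific to Glean.

```python
import re
from typing import Optional

# GCP project ID rule: 6-30 chars, lowercase letters/digits/hyphens,
# must start with a letter and must not end with a hyphen.
PROJECT_ID_RE = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")


def suggest_project_name(customer: str, env: Optional[str] = None) -> str:
    """Build glean-{customer name} or glean-{customer name}-{prod/sandbox}."""
    # Lowercase the customer name and collapse other characters to hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", customer.lower()).strip("-")
    if not slug:
        raise ValueError("customer name has no usable characters")
    name = f"glean-{slug}"
    if env is not None:
        if env not in ("prod", "sandbox"):
            raise ValueError("env must be 'prod' or 'sandbox'")
        name = f"{name}-{env}"
    return name


def is_valid_project_id(project_id: str) -> bool:
    """Check a candidate project ID against the GCP format rule."""
    return PROJECT_ID_RE.fullmatch(project_id) is not None
```

A name that passes is_valid_project_id can be used as both the project name and the project ID, which matches the “glean-company” example above.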
3. Enable GCP APIs for the Project
Enable the following APIs by going to the link and clicking Enable API:
Cloud Resource Manager API (cloudresourcemanager.googleapis.com)
https://console.cloud.google.com/apis/api/cloudresourcemanager.googleapis.com/overview?project=[PROJECT_ID]
Service Usage API (serviceusage.googleapis.com)
https://console.developers.google.com/apis/api/serviceusage.googleapis.com/overview?project=[PROJECT_ID]
Compute Engine API (compute.googleapis.com)
https://console.developers.google.com/apis/api/compute.googleapis.com/overview?project=[PROJECT_ID]
Cloud SQL Admin API (sqladmin.googleapis.com)
https://console.developers.google.com/apis/api/sqladmin.googleapis.com/overview?project=[PROJECT_ID]
Vertex AI API (aiplatform.googleapis.com)
https://console.cloud.google.com/apis/api/aiplatform.googleapis.com/metrics?project=[PROJECT_ID]
Cloud Tasks API (cloudtasks.googleapis.com)
https://console.cloud.google.com/apis/api/cloudtasks.googleapis.com/metrics?project=[PROJECT_ID]
Cloud Key Management Service (KMS) API (cloudkms.googleapis.com)
https://console.cloud.google.com/apis/api/cloudkms.googleapis.com/metrics?project=[PROJECT_ID]
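As an alternative to clicking through each console link, the same APIs can be enabled from the command line with gcloud services enable. The Python sketch below only builds that command so you can review it before running it. It assumes the gcloud CLI is installed and authenticated, and the function name is hypothetical.

```python
import shlex
from typing import List

# Service names taken from the list of APIs in this step.
REQUIRED_APIS = [
    "cloudresourcemanager.googleapis.com",
    "serviceusage.googleapis.com",
    "compute.googleapis.com",
    "sqladmin.googleapis.com",
    "aiplatform.googleapis.com",
    "cloudtasks.googleapis.com",
    "cloudkms.googleapis.com",
]


def enable_apis_command(project_id: str) -> List[str]:
    """Return the argv for a single 'gcloud services enable' call."""
    return ["gcloud", "services", "enable", *REQUIRED_APIS,
            f"--project={project_id}"]


# Print the command for review; "glean-company" is the example project ID.
print(shlex.join(enable_apis_command("glean-company")))
```

To execute the command rather than print it, the returned list can be passed to subprocess.run with check=True.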
4. Quota Change Requests
As part of the Glean setup, Glean will check quota requirements and make quota change requests as needed. Customers will be alerted when quota requests need approval. All of the quota needs are listed in this spreadsheet. Please note that some quota requests will require filing a ticket with GCP support, who usually respond within 2 days.
The quotas requested will differ depending on the size of the Glean tenant that will be built:
Use Case | Deployment Size |
Sandbox / UAT environments / < 1M docs | Small Deployment |
<50M documents to be indexed | Medium Deployment |
>50M documents to be indexed | Large Deployment |
If you are not sure how to proceed, please consult with your Glean engineer.
Note: Quota requests for some resources for a Large Deployment may fail depending on the GCP region you have selected due to the compute types available. If this is the case, please work with your Glean engineer, who can advise you further.
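The sizing table above can be expressed as a small Python function. This is only a sketch: the function name is hypothetical, and because the table does not say which tier applies at exactly 50M documents, the code assumes that case is Large.

```python
def deployment_size(doc_count: int, sandbox: bool = False) -> str:
    """Map an expected document count to a deployment size tier."""
    if doc_count < 0:
        raise ValueError("doc_count must be non-negative")
    # Sandbox / UAT environments are Small regardless of document count.
    if sandbox or doc_count < 1_000_000:
        return "Small Deployment"
    if doc_count < 50_000_000:
        return "Medium Deployment"
    # Assumption: exactly 50M documents is treated as Large.
    return "Large Deployment"
```

For instance, deployment_size(10_000_000) returns "Medium Deployment". Confirm the result with your Glean engineer before requesting quotas.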
5. Create a Service Account
Go to the Service Accounts page in the GCP console and click Select a Project.
Select your project and click Open.
Click Create Service Account. Enter the service account name (glean-admin), ID, and description (optional), then click Create.
Click the Select a role dropdown and choose Owner to make your service account an Owner of the project.
Click Continue.
Skip the Grant users access to this service account step.
Click Create Key. In the panel that appears, select JSON as the key type, then click Create.
A private JSON key will be saved to your computer.
Contact your Glean representative to let them know that the project has been created, and provide them with the project name and project ID.
Note: A service account that is granted the Owner role for a specific GCP project has full access to all resources and services in that project only. It does not have permission to access or modify resources in other projects, even if they are within the same GCP tenant.
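The console steps above have command-line equivalents in the gcloud CLI. The Python sketch below builds the three commands (create the account, grant it the Owner role on the project, create a JSON key) without running them. It assumes the service account ID is glean-admin, matching the name used above, and the key file name is a hypothetical placeholder.

```python
import shlex
from typing import List

ACCOUNT_ID = "glean-admin"  # assumed to match the service account name


def service_account_email(project_id: str) -> str:
    """Service account emails follow <id>@<project>.iam.gserviceaccount.com."""
    return f"{ACCOUNT_ID}@{project_id}.iam.gserviceaccount.com"


def service_account_commands(project_id: str,
                             key_path: str = "glean-admin-key.json"
                             ) -> List[List[str]]:
    """Return the gcloud argv lists for creating the account, role, and key."""
    email = service_account_email(project_id)
    return [
        ["gcloud", "iam", "service-accounts", "create", ACCOUNT_ID,
         f"--project={project_id}"],
        ["gcloud", "projects", "add-iam-policy-binding", project_id,
         f"--member=serviceAccount:{email}", "--role=roles/owner"],
        ["gcloud", "iam", "service-accounts", "keys", "create", key_path,
         f"--iam-account={email}"],
    ]


# Print the commands for review; "glean-company" is the example project ID.
for argv in service_account_commands("glean-company"):
    print(shlex.join(argv))
```

The key file written by the last command contains a private key, so store it securely and share it only through the Glean portal.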
Glean Self-Service GCP Environment Validation
After Glean confirms that you are set up in the setup portal, browse to https://app.glean.com/admin and enter your email address to have a magic link sent to your email. The first screen is an admin setup screen. Either add additional admins or skip to the next step.
Follow the on-screen instructions (also provided below as a preview).
Upload your JSON key into the Glean portal. The portal will validate the key and provide corrective instructions, which may cover quota updates (step 9) and organizational constraints, if any are set in the parent organization.
If the validation passes, then Glean is ready to deploy to your GCP environment! 🎉
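Before uploading, it can help to confirm locally that the file is a service account key for the intended project. The Python sketch below is not Glean's validation and does not check quotas, roles, or organization constraints. It only inspects fields that a GCP service account JSON key normally contains, and the function name is hypothetical.

```python
import json
from typing import List

# Fields normally present in a GCP service account JSON key.
REQUIRED_FIELDS = ("type", "project_id", "private_key_id",
                   "private_key", "client_email")


def precheck_key(key_json: str, expected_project_id: str) -> List[str]:
    """Return a list of problems found in the key text (empty if none)."""
    try:
        key = json.loads(key_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]

    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS if not key.get(name)]
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected key type: {key['type']}")
    if key.get("project_id") and key["project_id"] != expected_project_id:
        problems.append(f"key belongs to project {key['project_id']}, "
                        f"expected {expected_project_id}")
    return problems
```

A typical call reads the downloaded file as text and passes it in together with your project ID; an empty list means the basic structure looks right.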
Troubleshooting
If the validation fails, the error message will indicate the issue.
Typically this is due to:
Organization constraints that have been applied and interfere with the Glean build.
Missing or insufficient quotas.
Incorrect permissions or roles assigned to the Service Account.
GCP APIs that have not been enabled.
Please correct the issues indicated before attempting validation again. If you are unsure of anything, please contact your Glean engineer, who will assist you.
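For the last cause in the list, one way to find which required APIs are still disabled is to compare them against the enabled services in the project. The Python sketch below assumes you have saved the output of gcloud services list --enabled --format="value(config.name)" for your project, which is expected to print one service name per line. The function name is hypothetical.

```python
from typing import List

# Service names of the APIs that step 3 asks you to enable.
REQUIRED_APIS = (
    "cloudresourcemanager.googleapis.com",
    "serviceusage.googleapis.com",
    "compute.googleapis.com",
    "sqladmin.googleapis.com",
    "aiplatform.googleapis.com",
    "cloudtasks.googleapis.com",
    "cloudkms.googleapis.com",
)


def missing_apis(enabled_output: str) -> List[str]:
    """Return required APIs absent from the enabled-services listing."""
    # One service name per line; ignore blank lines and stray whitespace.
    enabled = {line.strip() for line in enabled_output.splitlines()
               if line.strip()}
    return [api for api in REQUIRED_APIS if api not in enabled]
```

An empty result means all seven APIs are enabled, so the validation failure is more likely due to quotas, roles, or organization constraints.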