To enable Glean's generative AI features, you need to select an LLM provider. We recommend Glean's Azure OpenAI LLM provider, but we also support using your own LLM provider.
You can set this up in Workspace Settings. To use your own LLM provider, please work with your Glean sales or technical services contact.
Glean’s Azure OpenAI (Recommended)
Glean has a signed agreement with Azure OpenAI that guarantees:
0-day retention: Your data will not be stored by Azure.
Data will not be used to train any custom large language models.
Data encryption: All data is encrypted in transit.
Compliance: Azure is compliant with a variety of industry standards. See details here
The advantages of this approach are:
Guaranteed capacity: Scales to all your users.
Performance: Low latency.
Model access: All the necessary models are available out of the box.
Note that 0-day retention isn’t the default on Azure OpenAI; it is covered by Glean’s agreement.
Your own LLM Provider: OpenAI, Azure OpenAI, Google PaLM
If your company wants to use its own LLM provider, we support OpenAI, Azure OpenAI, and Google PaLM. This may be preferable from a security perspective. You should ensure your provider has sufficient:
Capacity: Assuming 2 calls to GPT-4 per query, ensure your deployment has enough TPM (tokens per minute) and RPM (requests per minute) quota to support your users.
Access: We require access to the GPT-4, GPT-4-32k, and GPT-3.5-turbo models, and to the embedding model (text-embedding-ada-002).
Please reach out to the Glean team to get this set up.
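Before reaching out, it can help to confirm which of the required models your account can already use. The following is a minimal Python sketch, not a Glean tool: the helper name missing_models is hypothetical, the model IDs use OpenAI's published naming (including text-embedding-ada-002 for the embedding model), and the list of available models is assumed to come from your provider's console or API. On Azure OpenAI, deployment names are chosen by you and may differ from these IDs.

```python
# Model IDs named in the access requirement above (OpenAI naming; an assumption
# for Azure OpenAI, where deployment names are user-defined).
REQUIRED_MODELS = {
    "gpt-4",
    "gpt-4-32k",
    "gpt-3.5-turbo",
    "text-embedding-ada-002",
}


def missing_models(available):
    """Return the required model IDs absent from `available`, sorted by name."""
    return sorted(REQUIRED_MODELS - set(available))


if __name__ == "__main__":
    # Example input: an account that only has the two chat models so far.
    print(missing_models(["gpt-4", "gpt-3.5-turbo"]))
    # → ['gpt-4-32k', 'text-embedding-ada-002']
```

An empty list means every model named in the requirement is available.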
Do you support open-source models?
We don’t support open-source models because we’ve found that our features do not meet our quality standards with models like Alpaca (65B) and Dolly (12B).
LLM Providers not supported
We continually evaluate models from other providers, such as Anthropic and Cohere, as well as various open-source models, but we do not support setting up our AI features with them.
Typically, capacity is expressed in RPM (requests per minute) and TPM (tokens per minute). To be safe, we recommend 10 RPM and 25k TPM for every 500 users. For example:
20 RPM and 50k TPM for 1k users
100 RPM and 250k TPM for 5k users
175 RPM and 350k TPM for 10k users
250 RPM and 500k TPM for 20k users
For the embedding model, our capacity requirement is:
150 RPM and 22.5k TPM for 1k users
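As a rough aid for sizing a quota request for the GPT-4 models, the figures above can be sketched in Python. This is a minimal illustration, not a Glean tool: the function names rule_of_thumb and published_tier are hypothetical, and rounding up to the next block of 500 users is an assumption. The published examples for 10k and 20k users are lower than the per-500-user rule would give, so the sketch keeps both views rather than choosing between them.

```python
import math
from typing import Optional, Tuple

# Example tiers listed above: (maximum users, RPM, TPM).
PUBLISHED_TIERS = [
    (1_000, 20, 50_000),
    (5_000, 100, 250_000),
    (10_000, 175, 350_000),
    (20_000, 250, 500_000),
]


def rule_of_thumb(users: int) -> Tuple[int, int]:
    """Apply 10 RPM and 25k TPM per started block of 500 users (rounding is an assumption)."""
    if users <= 0:
        raise ValueError("users must be positive")
    blocks = math.ceil(users / 500)
    return blocks * 10, blocks * 25_000


def published_tier(users: int) -> Optional[Tuple[int, int]]:
    """Return (RPM, TPM) for the smallest listed tier covering `users`, or None above 20k."""
    for max_users, rpm, tpm in PUBLISHED_TIERS:
        if users <= max_users:
            return rpm, tpm
    return None


if __name__ == "__main__":
    for n in (1_000, 5_000, 10_000):
        print(n, rule_of_thumb(n), published_tier(n))
```

For company sizes beyond the listed examples, published_tier returns None; in that case the appropriate quota is something to confirm with the Glean team rather than extrapolate.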
I want to use my own LLM Provider but don’t currently have GPT-4 access. How do I get access?
You have two options: OpenAI or Azure OpenAI.
We recommend Azure OpenAI. To get access, you will first need to fill out the following Azure OpenAI Service form. Once that has been approved, you will need to request access to GPT-4 public preview here.
If you choose to use OpenAI, you need to join the OpenAI GPT-4 waitlist.
Once you have access, refer to the capacity requirements for your company size and submit a request to increase your quotas to the appropriate levels.
We currently have no estimate of how long you will be on the waitlist, either with OpenAI directly or with Azure OpenAI.