Introduction
The Salesforce connector for Glean allows Glean to fetch and index content from Salesforce, ensuring users can search and access documents for which they have authorized permissions.
Authentication: Glean requires the Salesforce admin to authenticate and set up a token for Glean during setup.
Data Storage: All data is stored in the cloud project or instance within the customer's cloud account, ensuring no data leaves the customer's environment.
API Usage
Integration Features
Content Captured: Glean captures Salesforce standard objects such as Account, Campaign, Case, Contact, Lead, Opportunity, and Knowledge Articles, as well as custom objects (per configuration).
Permissions Enforcement: Glean respects all user access permissions, ensuring users only see search results for documents and objects they have access to. When a user clicks on a search result, they are taken to the Salesforce web application, which enforces the permission.
Versions Supported
There are no specific version limitations for the Salesforce connector.
Objects Supported
The Salesforce connector supports the following standard objects:
Accounts
Campaigns
Cases
Case Comments
Contacts
Documents
Discussion Forums
Knowledgebase Articles
Leads
Opportunities
Tasks
Chatter (Beta and must be enabled by a Glean representative)
Authentication Mechanism
The Salesforce administrator sets up a service account that is assigned either the System Administrator profile or a customized non-administrator profile. This account is used to generate a token for Glean within the Salesforce environment. Glean then authenticates with your Salesforce instance through an OAuth flow.
Glean does not recommend using a Salesforce account that is tied to an individual employee. If that employee leaves the company or the account is disabled, Glean loses access to the data source.
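As an illustration of how the OAuth flow begins, the sketch below builds the URL for the /services/oauth2/authorize endpoint listed under API endpoints. Glean performs this step itself when the admin clicks Authorize; the client ID and redirect URI shown here are placeholders, not real Glean values, and the function name is hypothetical.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_authorize_url(login_domain: str, client_id: str, redirect_uri: str) -> str:
    """Build the URL that starts an OAuth authorization-code flow."""
    query = urlencode({
        "response_type": "code",      # request an authorization_code
        "client_id": client_id,       # placeholder, not a real Glean value
        "redirect_uri": redirect_uri, # placeholder callback URL
    })
    return f"https://{login_domain}/services/oauth2/authorize?{query}"


# A custom login domain can be substituted for the default login host.
url = build_authorize_url(
    "login.salesforce.com",
    "HYPOTHETICAL_CLIENT_ID",
    "https://example.com/oauth/callback",
)
print(url)
```

The authorization_code returned to the redirect URI is then exchanged for an OAuth access token, as described under API endpoints.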
Connector credentials requirements
The Salesforce connector for Glean provides two options for configuring the Salesforce Service Account profile:
Full Administrator
The user must be an admin in the Salesforce instance being authenticated, i.e., have the “System Administrator” profile.
We need an admin profile as many objects crucial to understanding the permissioning model are only accessible by an admin profile.
The user will require read-only access to all of the Objects that the customer would like to have indexed.
Non-Administrator Custom Profile
Note: Ensure all of the following permissions are set to avoid integration issues.
Log in to Salesforce. Navigate to Setup on the top right. On the left-hand side, under Administration (Administer for Salesforce Classic), go to Users (Manage Users for Salesforce Classic) and then Profiles.
Select an existing Profile that will be used for the integration and hit Edit, or create a New Profile.
Under Custom App Settings, if you plan on indexing the following objects, ensure that the settings in Table 1 are checked.
Under Administrative Permissions, ensure that the permissions in Table 2 are checked (any unset permission may lead to integration issues).
Under General User Permissions, ensure that:
Access Activities is checked. This is required to crawl tasks within the Salesforce instance.
Allow View Knowledge is checked. This is used to crawl all supported knowledge bases within the Salesforce instance.
View Archived Articles is checked. This is required to crawl archived articles.
Under Standard Object Permissions, ensure that the profile has both Read and View All permissions on the following objects:
Accounts
Campaigns
Cases
Contacts
Leads
Opportunities
Save the Profile. Finally, back on the left-hand side, select Users and create a new user with the profile associated with the previous steps. Ensure that Knowledge User and Service Cloud User are both checked before hitting Save.
You are now ready to authorize access to the main page for the newly created user.
Table 1. Custom App Settings
Content | Permission Setting |
Discussion Forums | Community (Standard__Community): Visible |
Discussion Forums and Chatter | Salesforce Chatter (Standard__Chatter): Visible |
Table 2. Administrative Permissions
Permission Setting | Reasoning |
API Enabled | Allows access to Salesforce API to ingest data |
View Roles and Role Hierarchy | Captures document permissions for any object (users, permission sets, etc.) with an associated Role |
View Setup and Configuration | Captures organization-level document permissioning |
View Data Categories in Setup | Captures organization and access control in Salesforce Knowledge and Discussion Forums (Chatter) |
View All Profiles | Captures document permissions for any object (users, permission sets, etc.) with associated Profiles |
View All Users | Captures users to understand document permissions for each individual |
View Reports in Public Folders | Captures public access reports |
View Dashboards in Public Folders | Captures public access dashboards |
Chatter Internal User | Captures discussion forums, chatter, and other feed-related items |
View All Data | Allows the ability to directly query for all tasks and feed-related items |
Connection instructions
Once the prerequisite Service Account and profile are created, connecting your Salesforce instance as a data source for Glean requires a few steps:
In the Admin console within Glean, select Data sources → Add data source → Salesforce
Enter a name for your data source in the text field Name and then (optionally) select an Icon
Click the checkbox Use optional custom login domain (optional)
Click Authorize
Log in to Salesforce with the service account credentials
Choose Crawl now or Do this later and Save
Additional Object Setup
Glean can crawl additional native objects (those not crawled by default) and custom Salesforce objects. These are configured in the Admin console, on the object tab of the Salesforce data source. Glean requires the object name, the permission model, a mapping of fields for Title and Owner, and optional filters. See an example of the self-service configuration:
User Requirements
If adding a new custom object, the integration account configured to use Salesforce must either be a full administrator or be configured with the following permissions (on either the Profile or the User under Setup):
Custom Object Permissions – View All (for every custom object to be indexed)
If adding a new native object, the integration account configured to use Salesforce must either be a full administrator or be configured with the following permissions (on either the Profile or the User under Setup):
Object Permissions for the new native object – View All, plus Read access to all fields
Setup
To enhance the user experience, Glean will progressively integrate additional features to augment the flexibility for indexing fields across various crawlable objects.
Navigate to Setup → Objects and Fields → Object Manager, and then click on the object to see its details. All associated fields are listed under Fields & Relationships.
Schema
Glean requires the Salesforce API field names, which are used to populate document metadata and present it in the Glean user interface. The field name for the Native or Custom Object can be found in the Field Name column on the Fields & Relationships screen, as illustrated below.
Custom objects typically end with the suffix __c. Below is an example of Salesforce API field names associated with one of our test custom objects.
Document Title
The Field Name will populate the title for the documents seen on Glean under search results.
Document Body
The Description field is designated for populating the body of the document. You must specify both the field that holds the text or HTML and the MIME type of that field, such as text/plain or text/html.
Document Author
The Salesforce API field name that identifies who created the object. This may be an ID or a User object directly: by default, CreatedBy represents a User object, while CreatedById represents an ID. For example, Asset has CreatedById, which represents the author of an Asset object.
Status
You can provide the Salesforce API field name that specifies the object's status (if any). For example, an Asset object has Status, which represents the status of an Asset.
Document Create Time
By default, this is CreatedDate (not displayed under Fields & Relationships by default).
Document Last Modified Time
By default, this is SystemModstamp (not displayed under Fields & Relationships by default).
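The field mapping described above can be pictured as a small record. The sketch below is only an illustration: the class name, attribute names, and the choice of Name as the title field are hypothetical and do not represent Glean's actual configuration format. The defaults for the body, create time, and last modified time fields follow the text above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ObjectFieldMapping:
    """Hypothetical record of Salesforce API field names for one object."""
    object_name: str                                # e.g. "Asset" or "My_Object__c"
    title_field: str                                # populates the document title
    author_field: str                               # e.g. "CreatedById" (ID) or "CreatedBy" (User)
    body_field: str = "Description"                 # populates the document body
    body_mime_type: str = "text/plain"              # or "text/html"
    status_field: Optional[str] = None              # only if the object has a status
    created_time_field: str = "CreatedDate"         # default stated above
    modified_time_field: str = "SystemModstamp"     # default stated above


# Illustrative mapping for the Asset object mentioned in the text.
asset_mapping = ObjectFieldMapping(
    object_name="Asset",
    title_field="Name",          # assumption for illustration
    author_field="CreatedById",
    status_field="Status",
)
print(asset_mapping)
```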
Custom Properties
Optionally, provide all the fields that should be available as a facet (a way to filter a search) and/or as an indexable field.
Indexable field types:
Textarea
String
Email
Picklist
Multipicklist
Notes:
All Salesforce API fields that are specified as custom properties and whose type is one of the indexable field types will be searchable within Glean.
Indexable fields are the ones whose content is searchable on Glean.
Types are inferred from the Salesforce object schema. Each native object has a predefined type, and for each custom object, you specify the type while setting it up.
All fields (both indexable and non-indexable) will be added as string facets within Glean.
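The rules in the notes above can be sketched as a small classification step. This is a minimal illustration, not Glean's implementation: the function name and the sample custom fields are hypothetical, and the lowercase type strings are an assumption about how field types are reported.

```python
from typing import Dict, List, Tuple

# Field types whose content is searchable, per the list above.
INDEXABLE_TYPES = {"textarea", "string", "email", "picklist", "multipicklist"}


def classify_custom_properties(fields: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Return (searchable field names, facet field names) for name -> type pairs."""
    searchable = [
        name for name, field_type in fields.items()
        if field_type.lower() in INDEXABLE_TYPES
    ]
    # Every custom property becomes a string facet, indexable or not.
    facets = list(fields)
    return searchable, facets


# Hypothetical custom fields on a custom object.
sample_fields = {
    "Region__c": "picklist",
    "Notes__c": "textarea",
    "Score__c": "double",
}
searchable, facets = classify_custom_properties(sample_fields)
print(searchable)
print(facets)
```

In this sample, Score__c is available as a facet for filtering, but its content is not searchable because double is not one of the indexable field types.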
Permissions
Advanced permissions, such as organization-wide defaults for Salesforce additional crawlable objects, are not supported.
Permissions inherited from other Salesforce objects are not supported. For example, Case is the parent object of LiveChatTranscript. In Salesforce, users who can access Case automatically have permission to access LiveChatTranscript without explicit permission. Glean requires each object’s permissions to be explicitly defined and cannot rely on inheritance from other objects.
Inferred permissions supported:
The user’s license provides similar access to a particular object type.
PermissionSets
Share records
Items crawled
Content Indexed
Accounts
Campaigns
Cases
Case Comments
Contacts
Documents
Discussion Forums
Knowledgebase Articles
Leads
Opportunities
Tasks
Chatter (Beta and must be enabled by a Glean representative)
Identity
Users: Information about users within Salesforce
Groups: Details about groups within Salesforce
The identity crawl operates with the following configurations:
Incremental Identity Crawls: These are performed to capture changes since the last crawl.
Full Identity Crawls: These are conducted periodically to ensure all identity data is up-to-date.
Estimating the Number of API calls
The number of API calls depends on multiple factors, including:
Objects being crawled
Permissioning models used by different objects
Crawl frequency of different objects
Request payload size, which in turn depends on the number of custom fields, etc.
Actual usage may vary significantly. Please work with your Glean representative to develop an estimate.
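For a first approximation, the Known Limitations section of this document suggests dividing the number of objects by 2,000. The sketch below applies that rule; the function name is hypothetical, and the result should be read as a lower bound for content fetching only, since it ignores permission crawls, crawl frequency, and pages that return fewer than 2,000 objects.

```python
import math

PAGE_SIZE = 2000  # rule-of-thumb divisor from this document


def estimate_api_calls(object_count: int, page_size: int = PAGE_SIZE) -> int:
    """Rough number of API calls needed to fetch object_count objects once."""
    if object_count < 0:
        raise ValueError("object_count must be non-negative")
    # Round up: a partially filled page still costs one call.
    return math.ceil(object_count / page_size)


print(estimate_api_calls(1_500_000))  # 750
```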
Update frequency
Content updates for the Salesforce connector in Glean can happen quite rapidly, depending on the type of update and the configuration settings. Here are the key areas:
All content objects
Full crawl frequency: 28 days
Incremental crawl frequency: 10 mins
Case Objects: 2 mins
For share records
Glean crawls all share records every hour
For identity objects (other than share records)
Glean crawls all permissions every hour
Changes in data must be crawled, processed, and indexed before the data is reflected in the UI. Actual time may vary depending on the number of changes and corpus size. For the most up-to-date crawler refresh information, please refer to [External] Glean crawling strategy.
How the crawl works
The Salesforce crawler follows the traditional crawler strategy. It uses the Salesforce API and the following crawl types to fetch and update data:
Identity Crawl: updating and adding People data, including users, groups, and other information
Content Crawls: Full crawls cover the entire defined scope of the application, whereas incremental crawls capture only the changes since the previous full or incremental crawl
Known Limitations in Crawl
Crawl speed can be affected by the daily rate limits imposed by the Salesforce API, which vary by Salesforce license type. For a rough estimate of the number of API calls to Salesforce, divide the number of objects by 2,000.
For Tasks, Glean currently captures only partial permissions. Only the owner of the task (the user it is assigned to) and users above the owner in the Salesforce role hierarchy have access to the task in Glean search. Additionally, users with View All Data access can view all Tasks in Glean search.
Custom objects must be set up and are not crawled by default.
Excluding (red-listing) fields is supported but must be set up by a Glean representative.
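The partial Task permission rule described above can be sketched as follows. This is an illustration under simplifying assumptions, not Glean's implementation: each user has at most one role, the role hierarchy is given as a child-to-parent mapping, and all names in the sample data are hypothetical.

```python
from typing import Dict, Optional, Set


def can_view_task(
    user: str,
    owner: str,
    role_of: Dict[str, str],
    parent_role: Dict[str, Optional[str]],
    view_all_data_users: Set[str],
) -> bool:
    """Apply the Task rule: owner, users above the owner, or View All Data."""
    if user == owner or user in view_all_data_users:
        return True
    user_role = role_of.get(user)
    if user_role is None:
        return False
    # Walk upward from the owner's role; a match means the user is above the owner.
    seen: Set[str] = set()
    role = parent_role.get(role_of.get(owner, ""))
    while role is not None and role not in seen:
        if role == user_role:
            return True
        seen.add(role)
        role = parent_role.get(role)
    return False


# Hypothetical hierarchy: Rep reports to VP, VP reports to CEO.
parent_role = {"Rep": "VP", "VP": "CEO", "CEO": None}
role_of = {"alice": "Rep", "bob": "VP", "carol": "CEO", "dave": "Rep"}
admins = {"erin"}  # users with View All Data

for person in ["alice", "bob", "carol", "dave", "erin"]:
    print(person, can_view_task(person, "alice", role_of, parent_role, admins))
```

In this sample, dave shares alice's role but is not above her in the hierarchy, so he does not see her task.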
API endpoints
Purpose | Cloud Endpoint | HTTP Method | Authentication | Description |
/queryAll endpoint to get objects in a paginated manner | | GET | Authorization: Bearer token | Glean uses the maximum allowed page size 2k. Note: Number of objects fetched in an API call may be lower than 2k depending on other parameters like request size |
/sobjects to crawl all valid Salesforce objects within the Salesforce instance | | GET | Authorization: Bearer token | |
Fetch object descriptions for a given object | | GET | Authorization: Bearer token | Object description consists of object metadata like fields, field labels, field types, etc. |
Fetch the authorization_code during integration user account authentication | | GET | | The authorization_code is used to fetch the OAuth access token. Note: Only used during the setup of a new Salesforce instance |
Fetch user metadata for the authenticated user | | GET | | Note: Only used during the setup of a new Salesforce instance |
Fetch the OAuth access token in exchange for authorization_code fetched using the /services/oauth2/authorize endpoint | | GET | | Note: Only used during the setup of a new Salesforce instance |
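The paginated fetch in the first row can be sketched as a loop that follows the records, done, and nextRecordsUrl fields of a Salesforce REST query response. This is a minimal sketch, not Glean's crawler: the fetch_page callable stands in for an HTTP GET sent with the Authorization: Bearer header, and the URLs and sample pages below are placeholders for illustration.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iter_records(
    fetch_page: Callable[[str], Dict[str, Any]],
    first_url: str,
) -> Iterator[Dict[str, Any]]:
    """Yield every record of a paginated query by following nextRecordsUrl."""
    url: Optional[str] = first_url
    while url:
        page = fetch_page(url)
        yield from page.get("records", [])
        # Stop when the response is marked done or no next page is offered.
        url = None if page.get("done", True) else page.get("nextRecordsUrl")


# Stand-in for real HTTP calls: placeholder URLs mapped to sample pages.
fake_pages = {
    "/placeholder/queryAll": {
        "records": [{"Id": "1"}, {"Id": "2"}],
        "done": False,
        "nextRecordsUrl": "/placeholder/queryAll/next",
    },
    "/placeholder/queryAll/next": {
        "records": [{"Id": "3"}],
        "done": True,
    },
}

ids = [record["Id"] for record in iter_records(fake_pages.__getitem__, "/placeholder/queryAll")]
print(ids)  # ['1', '2', '3']
```

Each loop iteration corresponds to one API call, which is why the number of pages drives the API call estimate.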
Content Configuration
Search Results as Sales and Service Cloud
The Glean Salesforce data source connector is compatible with objects from both Sales Cloud and Service Cloud. When Glean delivers results in response to a query involving Salesforce data, it will identify the data source as either Sales Cloud or Service Cloud, based on the objects retrieved. Glean facilitates the return of both object types under the designation of Salesforce. Should you require assistance with configuring this setup, please do not hesitate to contact Glean support.
Inclusion and Exclusion Introduction
If Inclusion (Green-Listing) options are enabled, only the included content will be indexed. If Exclusion (Red-Listing) options are enabled, all excluded content will be removed. If both rules apply to the same piece of content, the content will NOT be indexed, because the red-listing rule takes priority.
The rules below should be used MINIMALLY to preserve the enterprise search experience, as most end-users expect to find all content. Most customers do not apply any rules, or apply red-listing rules sparingly for sensitive folders or objects.
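The precedence between the two rule types can be sketched as a single decision function. This is an illustration only: the function name and the set-based representation of rules are hypothetical, and passing None for inclusions stands for "no inclusion rules enabled".

```python
from typing import Optional, Set


def should_index(item: str, inclusions: Optional[Set[str]], exclusions: Set[str]) -> bool:
    """Decide whether an item is indexed; exclusion always takes priority."""
    if item in exclusions:
        return False               # red-listed content is never indexed
    if inclusions is not None:
        return item in inclusions  # green-listing enabled: only listed content
    return True                    # no inclusion rules: index everything else


# Hypothetical rules: Case is both green-listed and red-listed.
green = {"Account", "Case"}
red = {"Case"}
for name in ["Account", "Case", "Lead"]:
    print(name, should_index(name, green, red))
```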
Exclusion (Red-Listing) Options
Glean can exclude fields per object. Please contact your Glean representative to implement the configuration.
Troubleshooting
Why Is My Object or Field Not Showing Up in Glean Search?
If a field is not showing up (but is of an indexable field type), the best way to ensure that it shows up is to specify it in the Salesforce object configuration through the self-serve configuration. If the issue persists, please contact your Glean representative.