Admin Guide: Managing data sources

An overview of the data sources page in the admin console

Written by Cindy Chang
Updated over a month ago

Navigate to the admin console by clicking the wrench icon in the left-hand navigation menu. From there, you should land on the data sources page.

Monitor existing data sources

Here you'll see all the data sources you've connected, along with high-level information about the status of each one. The crawl status, document count, and content crawl status shown for each data source may not always be accurate, since we may not have the most up-to-date information. If some content is shown as "Crawl in progress", people should still be able to see it as long as the data source has a document count and is set as visible. (If this is not the case, we encourage you to file a ticket.)


Add new data sources

Clicking the "Add data source" button shows a list of all the data sources we have native connectors for. (Our website lists more data sources than appear here, for a variety of reasons: some were built using our custom API, and others are supported via the extension and web history.)

If you scroll to the bottom of the list, you can also add your own:

  • Websites

To set up a new data source, complete the following steps:

  1. Setup:

    • Provide the credentials and configurations needed to crawl all data for a data source. You can save your progress and resume the setup later.

  2. Manage Data:

    • You can define additional rules (e.g., inclusion or exclusion rules) to limit the types of content Glean is permitted to crawl. You can only define these rules after you save the setup in the previous step.

  3. Start a crawl:

    • In this step, you can choose to start the crawl immediately or start it later. If you choose to start it later, you can start the crawl in the Get Started tab under Review Data Source Crawl.


Turn on/off data source visibility

When you've just finished setting up a new data source, we recommend first setting it as "Visible to test group only". You can configure the test group by clicking the "Manage test group" button on the main Data sources page. The test group can then verify that the results appearing in Glean for that data source look good before it becomes visible to all teammates.

Once the test group has finished testing, make sure to change the visibility so that the data source is visible to everyone.

Clicking on a data source shows more information specific to that data source, including its crawl status and the content indexed, broken down by document type.

For example, with Dropbox, the document types are Folder, Document, Spreadsheet, Paper, Video, etc. This gives you a high-level overview of the types of content we have crawled and organized for search in Glean.


Manage data

For some data sources, such as Google Drive, you may find a "Manage Data" tab that allows you to define additional rules (e.g., inclusion or exclusion rules) to limit the types of content Glean is permitted to crawl. You can configure "Manage Data" as part of the initial setup steps as well as after the data source has been set up.
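
To reason about how inclusion and exclusion rules might interact before you configure them, a rough sketch like the one below can help. It is only an illustration and not Glean's actual implementation: the function name `is_crawlable`, the glob-style path patterns, and the assumption that exclusion rules take precedence over inclusion rules are all hypothetical. Check the "Manage Data" tab for the rule types your data source really supports.

```python
from fnmatch import fnmatchcase


def is_crawlable(path, include_patterns, exclude_patterns):
    """Hypothetical rule check; not Glean's actual crawl logic."""
    # Assumption: an empty inclusion list means everything is included.
    included = not include_patterns or any(
        fnmatchcase(path, pattern) for pattern in include_patterns
    )
    # Assumption: exclusion rules take precedence over inclusion rules.
    excluded = any(fnmatchcase(path, pattern) for pattern in exclude_patterns)
    return included and not excluded


# Example rules (made up for illustration).
include = ["Shared/Engineering/*"]
exclude = ["*/Private/*"]

for doc in [
    "Shared/Engineering/design.doc",
    "Shared/Engineering/Private/salaries.xlsx",
    "Shared/Sales/deck.ppt",
]:
    print(doc, is_crawlable(doc, include, exclude))
```

Under these assumptions, only the first document would be crawled: the second matches an exclusion rule, and the third matches no inclusion rule.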

If you'd like to customize other data sources but don't see a "Manage Data" tab, please file a support ticket or contact your Glean representative for assistance with configuration. We are continuously working to expand the range of self-serve configuration options available for our data sources.

