GitHub Enterprise Server Connector
Written by Dan Iacono
Updated over a month ago

Note: The instructions below work only for on-prem instances that the Glean crawler running on GCP can reach. Contact Glean Support for any required network configuration.

Overview

  • Glean requires authentication to the GitHub instance in order to fetch relevant information.

  • Authentication is done by creating an application in GitHub.

  • Glean understands all user access permissions and strictly enforces them at query time, so users never see results they do not have access to.

  • All data is stored in the GCP project in the customer's cloud account, and no data leaves the customer's environment.

Integration Features

For GitHub, Glean will capture the following content:

  • PR descriptions

  • PR conversations/comments

  • Issue threads

  • Commit messages for main branch

  • Wikis

Additionally, we will capture the following from the latest commit on the main branch:

  • Directory/file names

  • Full content of documentation files only (.md and .txt)
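The distinction above between files indexed by name only and files indexed with full content can be pictured with a short sketch. This is only an illustration, not Glean's implementation: the function name `index_plan` and the constant `FULL_CONTENT_EXTENSIONS` are hypothetical, and the extension set is taken from the list above.

```python
from pathlib import PurePosixPath

# Assumption: the extensions whose full content is indexed, taken from the
# list above. The real setting is configured in Glean, not in code like this.
FULL_CONTENT_EXTENSIONS = {".md", ".txt"}


def index_plan(path: str) -> str:
    """Return 'name+content' for documentation files, 'name-only' otherwise."""
    suffix = PurePosixPath(path).suffix.lower()
    return "name+content" if suffix in FULL_CONTENT_EXTENSIONS else "name-only"


for p in ["README.md", "docs/notes.TXT", "src/main.py", "Makefile"]:
    print(p, "->", index_plan(p))
```

In this sketch, source files such as `src/main.py` contribute only their path, which matches the statement that code search is not supported.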

Code search and GitHub Pages are not currently supported. Both on-prem and cloud GitHub instances are supported.

API Usage

Glean uses the standard GitHub API to ingest all data.

To capture changes as quickly as possible, Glean uses a webhook that sends push notifications to an endpoint deployed in the GCP project (in your cloud infrastructure).
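For background on how a webhook receiver can trust these notifications: when a webhook secret is configured, GitHub signs each delivery and sends the signature in the `X-Hub-Signature-256` header as `sha256=` followed by the hex HMAC-SHA256 of the raw request body. The sketch below shows that check in general terms. It is not Glean's endpoint code, and the function name `signature_is_valid` and the sample secret are hypothetical.

```python
import hashlib
import hmac


def signature_is_valid(secret: str, body: bytes, header_value: str) -> bool:
    """Check a GitHub X-Hub-Signature-256 header against the raw request body."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected.encode("utf-8"), header_value.encode("utf-8"))


# Hypothetical values, for illustration only.
secret = "example-secret"
body = b'{"action": "opened"}'
header = "sha256=" + hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()

print(signature_is_valid(secret, body, header))          # True
print(signature_is_valid(secret, body + b" ", header))   # False
```

This is why the same webhook secret must be entered on both the GitHub side and the receiving side: a mismatch makes every delivery fail verification.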

The app requests access to the following with a read-only scope:

  • Repository permissions

    • Administration

    • Contents

    • Issues

    • Metadata

    • Pull requests

    • Commit statuses

  • Organization permissions

    • Members

It also subscribes to the following events:

  • Commit comment

  • Issue comment

  • Member

  • Organization

  • Pull request

  • Pull request review

  • Pull request review comment

  • Push

  • Repository

  • Team

  • Team add

Setup

Prerequisites

User requirements

  • The user must be an organization administrator in GitHub.

Installation Process

Step 1. Create a GitHub App

This app will be used by Glean to crawl your GitHub instance.

  1. Go to your GitHub Enterprise Server instance.

  2. Click on your organization.

  3. Click Settings.

  4. Click GitHub Apps.

  5. Click New GitHub App.

  6. Fill the following fields:

    1. Name: Glean

    2. Homepage URL: https://app.glean.com

    3. Identifying and authorizing users

      • User authorization callback URL: Copy the generated URL from the setup page

      • Request user authorization: unchecked

    4. Post installation

      • Leave blank

    5. Webhook

      • Webhook Active: checked

      • Webhook URL: Copy the generated URL from the setup page

      • Webhook secret: use the same secret value in both places:

        • The webhook secret field in Glean

        • The webhook secret field in the GitHub App

    6. Permissions

      1. Set only the following to read-only:

        • Repository permissions

          • Administration

          • Contents

          • Commit statuses

          • Issues

          • Metadata

          • Pull requests

          • Pages

        • Organization permissions

          • Members

        • User permissions (or Account Permissions)

          • Email addresses

    7. Subscribe to events

      • Check only the following:

        • Commit comment

        • Issues

        • Issue comment

        • Member

        • Organization

        • Pull request

        • Pull request review

        • Pull request review comment

        • Push

        • Repository

        • Team

        • Team add

    8. Where can this App be installed: Any account
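Because the instructions say to set only the listed permissions and check only the listed events, it can help to compare the finished app against that checklist. The sketch below assumes you have the app's settings as a dictionary with `permissions` and `events` keys, which is the shape GitHub's REST API uses when describing an app. The identifier names (for example `statuses` for Commit statuses and `emails` for Email addresses) are GitHub's API names as best understood here, so verify them against your GitHub Enterprise Server version. The function name `config_drift` is hypothetical.

```python
# Checklist from Step 1, expressed with GitHub API identifiers (assumed mapping).
EXPECTED_PERMISSIONS = {
    "administration": "read",
    "contents": "read",
    "statuses": "read",        # Commit statuses
    "issues": "read",
    "metadata": "read",
    "pull_requests": "read",
    "pages": "read",
    "members": "read",         # Organization permission
    "emails": "read",          # User permission: Email addresses
}

EXPECTED_EVENTS = {
    "commit_comment", "issues", "issue_comment", "member", "organization",
    "pull_request", "pull_request_review", "pull_request_review_comment",
    "push", "repository", "team", "team_add",
}


def config_drift(app: dict) -> dict:
    """Report where an app's settings differ from the checklist above."""
    perms = app.get("permissions", {})
    events = set(app.get("events", []))
    return {
        "wrong_or_missing_permissions": sorted(
            name for name, level in EXPECTED_PERMISSIONS.items()
            if perms.get(name) != level
        ),
        "extra_permissions": sorted(set(perms) - set(EXPECTED_PERMISSIONS)),
        "missing_events": sorted(EXPECTED_EVENTS - events),
        "extra_events": sorted(events - EXPECTED_EVENTS),
    }


# Hypothetical app description with one write permission and one missing event.
sample = {
    "permissions": {**EXPECTED_PERMISSIONS, "contents": "write"},
    "events": sorted(EXPECTED_EVENTS - {"team_add"}),
}
print(config_drift(sample))
```

An empty list in every field means the app matches the checklist exactly.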

Step 2. Configure the GitHub App

Copy the following values into the corresponding fields in Glean:

  • App ID

  • Client ID

  • Client Secret

At the very bottom of the page, click "Generate a private key". The key file is downloaded to your local machine. Upload this file into the corresponding field in Glean.
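As background on why the App ID and private key are needed: GitHub Apps authenticate by presenting a short-lived JSON Web Token signed with the app's private key using RS256. The sketch below builds only the token claims that GitHub documents (`iat`, `exp`, `iss`). Signing needs an RSA-capable library and is deliberately not shown. How Glean uses these values internally is not described in this article, so treat this as general GitHub App behavior, with the function name `app_jwt_claims` and the App ID value being hypothetical.

```python
import time
from typing import Optional


def app_jwt_claims(app_id: str, now: Optional[int] = None) -> dict:
    """Build the claims for a GitHub App JWT (to be signed with RS256)."""
    if now is None:
        now = int(time.time())
    return {
        "iat": now - 60,       # issued 60 seconds in the past to allow for clock drift
        "exp": now + 9 * 60,   # GitHub accepts at most 10 minutes into the future
        "iss": app_id,         # the App ID copied in this step
    }


# Hypothetical App ID, for illustration only.
print(app_jwt_claims("12345", now=1_700_000_000))
```

Because the token expires within minutes, the private key itself is the long-lived credential, so store the downloaded file securely.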

Step 3. Install the GitHub App

Click on Install App from the menu on the left. Click Install for your organization.

Step 4. Enter additional settings in the Admin Console

Enter the following settings in Glean:

  1. Git Domain

  2. Organization Name
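Before entering these two values, you may want to confirm they point at a real organization. On GitHub Enterprise Server, the REST API is served under `/api/v3` on the instance hostname, so the organization's API URL can be derived as sketched below. The exact format Glean expects in the Git Domain field (bare hostname or full URL) is not stated here, so the helper accepts either. The function name `ghes_org_api_url` and the sample values are hypothetical.

```python
from urllib.parse import quote


def ghes_org_api_url(git_domain: str, org: str) -> str:
    """Build the GitHub Enterprise Server REST API URL for an organization."""
    host = git_domain.strip()
    # Tolerate a pasted URL by removing the scheme and any trailing slash.
    for prefix in ("https://", "http://"):
        if host.lower().startswith(prefix):
            host = host[len(prefix):]
            break
    host = host.rstrip("/")
    return f"https://{host}/api/v3/orgs/{quote(org.strip(), safe='')}"


# Hypothetical values, for illustration only.
print(ghes_org_api_url("https://github.example.com/", "acme"))
# https://github.example.com/api/v3/orgs/acme
```

Requesting the resulting URL with valid credentials is a quick check that the domain and organization name are spelled correctly.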

Post Setup

  • Repositories can be excluded (redlisted) from indexing, and you can control which file extensions have their full content indexed.

  • Users are prompted to authenticate with GitHub via OAuth so that Glean can sync their user aliases. Until a user completes this flow, they cannot see data from private repositories. After authentication, the next entity crawl syncs the aliases. This crawl runs every hour.

For any questions or issues with this setup, please reach out to support@glean.com.
