File Upload for Glean Chat
Written by Greg Bakken
Updated over a week ago

Overview

Glean is launching File Upload for Assistant, Public Knowledge, and Apps, along with an API for uploading files to Glean Assistant. The feature lets users query uploaded local files, including summarizing, analyzing, and generating content from them. File upload limits are determined by the model used in the Glean instance. File content is retained for 24 hours after upload, in the chat session where the user uploaded it. File metadata is retained for 30 days in that chat session and then deleted. Uploaded files are not added to your corpus and are not visible to other users.

Supported File Formats

The feature supports a variety of file types, categorized as follows:

  • Document Files: pdf, doc, docx, pages

  • Spreadsheet Files: xls, xlsx, numbers

  • Presentation Files: ppt, pptx, key

  • Text Files: csv, json, xml, txt, rtf

  • Web Files: html, css

  • Code Files: java, py, js, ts, cpp, c, ipynb, sql, sh, go, yaml, log
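
As an illustration, the list above can be expressed as a lookup table for checking a file on the client side before attempting an upload. This is only a sketch: the category keys and the `file_category` helper are hypothetical names, not part of any Glean API, and matching on the filename extension is an assumption about how a caller might pre-filter files.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the supported formats list above.
# The category keys are illustrative labels, not Glean identifiers.
SUPPORTED_EXTENSIONS = {
    "document": {"pdf", "doc", "docx", "pages"},
    "spreadsheet": {"xls", "xlsx", "numbers"},
    "presentation": {"ppt", "pptx", "key"},
    "text": {"csv", "json", "xml", "txt", "rtf"},
    "web": {"html", "css"},
    "code": {"java", "py", "js", "ts", "cpp", "c", "ipynb",
             "sql", "sh", "go", "yaml", "log"},
}


def file_category(filename: str) -> Optional[str]:
    """Return the category for a filename, or None if the type is unsupported."""
    # Use only the final extension, compared case-insensitively.
    ext = Path(filename).suffix.lower().lstrip(".")
    for category, extensions in SUPPORTED_EXTENSIONS.items():
        if ext in extensions:
            return category
    return None
```

For example, `file_category("report.PDF")` returns `"document"`, while `file_category("photo.png")` returns `None` because image files are not in the supported list.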

Key Features

  1. File Upload: Users can upload multiple files (up to 5 files, each with a maximum size of 10 MB) directly from their local computer.

  2. Real-Time Querying: Users can query the text content of the uploaded files immediately after upload.

  3. Document Metadata: The chat UI will display document metadata, such as title and file type.

  4. File Deletion:

    1. Users can delete uploaded files before submitting their first query. Once a query is submitted, files cannot be deleted directly from the chat session, but they are removed when the chat session history is deleted.

    2. The content of all uploaded files is subject to a default 24-hour retention policy.

    3. Consistent with our chat retention policy, all file content and metadata are deleted 30 days after a chat session is started.

  5. API Support: Customers who utilize our developer platform can also upload files via our API. Please visit the following documentation

  6. Security: Files are parsed and scanned for malware before being stored within the cloud project. Any file in which malware is detected fails to upload with an error.

  7. Privacy: Uploaded files are accessible only to the user who uploaded them.

  8. Limits:

    1. The minimum file size for upload is 1 KB

    2. The maximum file size and number of files for upload in one session are determined by the token limit for your model:

      1. 128K Models: 5 files and 10 MB

      2. 32K Models: 4 files and 5 MB

      3. 8K Models: 2 files and 2 MB
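
The limits above could be checked before upload with something like the following sketch. Several things here are assumptions rather than documented behavior: the size figure is treated as a per-file maximum (as in Key Feature 1), 1 MB is taken to be 1024 × 1024 bytes, and the model keys (`"128K"`, `"32K"`, `"8K"`) and the `check_upload` helper are hypothetical names.

```python
from dataclasses import dataclass
from typing import List

KB = 1024          # assumption: binary units; the article does not specify
MB = 1024 * KB
MIN_FILE_BYTES = 1 * KB  # minimum file size from the Limits list


@dataclass(frozen=True)
class UploadLimit:
    max_files: int
    max_file_bytes: int  # assumption: applies to each file individually


# Values from the Limits list; the keys are illustrative labels.
LIMITS_BY_MODEL = {
    "128K": UploadLimit(max_files=5, max_file_bytes=10 * MB),
    "32K": UploadLimit(max_files=4, max_file_bytes=5 * MB),
    "8K": UploadLimit(max_files=2, max_file_bytes=2 * MB),
}


def check_upload(model: str, file_sizes: List[int]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    if model not in LIMITS_BY_MODEL:
        raise ValueError(f"unknown model size: {model!r}")
    limit = LIMITS_BY_MODEL[model]
    problems = []
    if len(file_sizes) > limit.max_files:
        problems.append(
            f"{len(file_sizes)} files exceeds the limit of {limit.max_files}"
        )
    for index, size in enumerate(file_sizes):
        if size < MIN_FILE_BYTES:
            problems.append(f"file {index} is smaller than 1 KB")
        elif size > limit.max_file_bytes:
            problems.append(
                f"file {index} exceeds {limit.max_file_bytes // MB} MB"
            )
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue with a batch at once.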

Known Limitations

  1. Multimedia support: We do not support images, audio, video files, or any other file types outside of the list above for upload.

  2. Custom data retention policies: We do not support data retention policies other than the default 24-hour policy for file content and 30-day policy for metadata described above. If you would like metadata deleted sooner, you can ask users to disable chat session history or to manually delete chat sessions.

  3. Optical Character Recognition (OCR) must be enabled for your org for scanned PDFs to work: Please contact Glean if you run into issues with uploaded PDFs not working, particularly those that are scanned.
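
To make the default retention windows concrete, the sketch below computes when a file's content and metadata would be deleted. It assumes the 24-hour clock starts at upload and the 30-day clock starts when the chat session is started (as stated under File Deletion). The `retention_deadlines` function is a hypothetical helper for illustration, not a Glean API, and actual deletion timing may differ.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

CONTENT_RETENTION = timedelta(hours=24)   # default policy for file content
METADATA_RETENTION = timedelta(days=30)   # default policy for file metadata


def retention_deadlines(
    session_started_at: datetime, uploaded_at: datetime
) -> Tuple[datetime, datetime]:
    """Return (content_expires_at, metadata_expires_at) for one uploaded file."""
    metadata_expires_at = session_started_at + METADATA_RETENTION
    # Content is removed 24 hours after upload, and never later than the
    # point at which the chat session's content and metadata are deleted.
    content_expires_at = min(uploaded_at + CONTENT_RETENTION, metadata_expires_at)
    return content_expires_at, metadata_expires_at


session_start = datetime(2024, 9, 24, 0, 0, tzinfo=timezone.utc)
upload_time = datetime(2024, 9, 25, 12, 0, tzinfo=timezone.utc)
content_expiry, metadata_expiry = retention_deadlines(session_start, upload_time)
print(content_expiry.isoformat())   # 2024-09-26T12:00:00+00:00
print(metadata_expiry.isoformat())  # 2024-10-24T00:00:00+00:00
```

Deleting the chat session history earlier removes the files sooner than either deadline.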

How do I enable file upload?

Admins can turn on the feature for all of their users via the Settings tab in the Assistant section of the workspace. File Upload is off by default for existing customers (as of GA on 9/24) and on by default for new customers.

Future Work

  • Support for larger file sizes

  • More robust support for PDFs

  • Data Analysis for Spreadsheets and CSV Files

  • Multimedia file types:

    • Video files: mp4, mov, avi

    • Image files: jpg, png, gif

    • Audio files: mp3

  • File creation and export to cloud storage services such as GDrive and OneDrive
