Jupyter fsspec documentation

Jupyter FSSpec inside JupyterLab

Welcome to the documentation for jupyter_fsspec, the Jupyter extension for the fsspec Python library.

jupyter_fsspec provides a file browser for your fsspec filesystems using a config file, and a Python module (jupyter_fsspec.helper) for using your defined filesystems inside your notebook kernels.

Installation

You can install jupyter_fsspec with pip (make sure you’ve installed JupyterLab or another editor first):

pip install jupyter-fsspec

You can verify that the JupyterLab extension and Jupyter Server extension are enabled properly with:

jupyter labextension list
jupyter server extension list

Basic Usage

jupyter_fsspec lives in the JupyterLab right sidebar, and when you open it, you will see a list of the filesystems you have defined in the config file (see below). When you select one of them, you will see a file browser (tree view) of those files below.

Basic navigation works as you’d expect: Click to expand/collapse folders, and right click for context options (such as copying the path of the current file to the clipboard).

Uploading Files

Uploading files to your FSSpec filesystems allows you to:

  • Transfer data between your local environment and remote storage systems

  • Share notebooks with datasets across different computing environments

  • Back up your work to cloud storage or other remote locations

  • Prepare data for distributed processing in cloud environments

You can upload files from multiple sources:

  • From bytes inside your active notebook kernel:

    • Import the jupyter_fsspec.helper module

    • Designate bytes for upload with jupyter_fsspec.helper.set_user_data(some_bytes)

    • Right-click the target folder or location in Jupyter FSSpec

    • Select Upload from Helper Module

    • Useful for: Programmatically generated data, processed results, or content modified in your notebook

  • From your local computer:

    • Right-click the target folder or location in Jupyter FSSpec

    • Select Upload from Computer

    • Choose a file from the browser’s file picker

    • Useful for: New datasets, configuration files, or results from external tools

  • From JupyterLab’s integrated file browser:

    • Right-click a file in JupyterLab’s File Browser

    • Choose "Select as Upload Source for FSSpec" from the context menu

    • Right-click the target folder or location in Jupyter FSSpec

    • Select Upload from Jupyter File Browser

    • Useful for: Moving files between your JupyterLab workspace and remote storage systems
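For the first option above (bytes from your notebook kernel), a minimal sketch of the kernel-side steps might look like the following. The rows_to_csv_bytes function and the sample rows are hypothetical and only illustrate producing bytes; set_user_data is the helper call named in the steps above. The import is guarded so the snippet also runs where jupyter_fsspec is not installed.

```python
import csv
import io


def rows_to_csv_bytes(rows):
    """Serialize an iterable of rows into UTF-8 encoded CSV bytes (hypothetical helper)."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue().encode("utf-8")


# Hypothetical data generated inside the notebook
payload = rows_to_csv_bytes([["sample", "value"], ["a", 1], ["b", 2]])

try:
    from jupyter_fsspec import helper

    # Designate the bytes for upload; then right-click a target folder in
    # Jupyter FSSpec and select "Upload from Helper Module"
    helper.set_user_data(payload)
except ImportError:
    # jupyter_fsspec is not installed here, so only the bytes are prepared
    pass
```

Any bytes object should work in place of payload, for example the contents of a pickled model or an image rendered in the notebook.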

Note

To transfer files between different remote filesystems (e.g., from S3 to GCS), you’ll need to use the helper module in your notebook to download from one source and upload to another. Direct remote-to-remote transfers are not currently supported through the UI.
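As a rough sketch of the workaround described in this note, the function below streams a file from one filesystem object to another. The name copy_between, its parameters, and the paths in the commented usage are hypothetical; the only assumption about the filesystem objects is that they expose the usual fsspec open(path, mode) method.

```python
import shutil


def copy_between(src_fs, src_path, dst_fs, dst_path, chunk_size=1024 * 1024):
    """Stream src_path on src_fs into dst_path on dst_fs in chunks.

    Both arguments are expected to be fsspec-style filesystem objects,
    i.e. anything with an open(path, mode) method returning a file object.
    """
    with src_fs.open(src_path, "rb") as src, dst_fs.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)


# Possible use in a notebook kernel, assuming both names exist in your
# config file and the paths (hypothetical here) are valid:
#
# from jupyter_fsspec import helper
# copy_between(
#     helper.filesystem("Remote MyBucket"), "mybucket/data/input.csv",
#     helper.filesystem("MemFS"), "/sample/input.csv",
# )
```

Streaming in chunks keeps memory use bounded for large files, at the cost of routing all data through the machine running the kernel.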

Config File

To define your fsspec filesystems, list them in a file named jupyter-fsspec.yaml inside the Jupyter config folder (~/.jupyter/jupyter-fsspec.yaml). In the sample file below, file:///Users/finnmertens is the server root path for the local filesystem entries:

sources:
  - name: "Cell filter repo"
    path: "file:///Users/finnmertens/cfilter/jupyterlab"
  - name: "Averager project"
    path: "file:///Users/finnmertens/averager"
  - name: "Remote MyBucket"
    path: "s3://mybucket"
    args: []
    kwargs:
      anon: false
      key: "my-access-key"
      secret: "my-secret-key"
      client_kwargs:
        endpoint_url: "https://s3.provider.com"
  - name: "MemFS"
    path: "memory://sample"

The config file contains a list of sources, where each item needs a name and a path URL. The name is a unique identifier, so it must not be duplicated. The path URL must include the protocol (file://, s3://, etc.), which tells fsspec what type of filesystem to construct. For local filesystems, if the URL contains only the protocol with no path after it (e.g. file://), the filesystem is instantiated at the root of the corresponding filesystem. A separately provided protocol argument is ignored. The path URL accepts directory paths but does not support pointing at individual files. See the fsspec documentation for more information about available protocols and filesystem instantiation.

Lastly, you can pass additional arguments to the fsspec filesystem constructor by using the args and/or kwargs keys. You can check the fsspec docs for the available options that each filesystem implementation offers.
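To make the rules above concrete, here is a small sketch that checks an already-parsed sources list for duplicate names and missing protocol prefixes. It is not part of jupyter_fsspec: the check_sources function and its messages are hypothetical, and the list is written as a Python literal so the snippet needs no YAML parser.

```python
def check_sources(sources):
    """Return a list of problems found in a parsed `sources` list (hypothetical helper)."""
    problems = []
    seen_names = set()
    for index, source in enumerate(sources):
        name = source.get("name")
        path = source.get("path") or ""

        # Each source needs a name, and names must be unique
        if not name:
            problems.append(f"source #{index}: missing 'name'")
        elif name in seen_names:
            problems.append(f"source #{index}: duplicate name {name!r}")
        else:
            seen_names.add(name)

        # The path must carry a protocol such as file://, s3://, or memory://
        protocol, separator, _ = path.partition("://")
        if not (protocol and separator):
            problems.append(f"source #{index}: path {path!r} has no protocol prefix")
    return problems


# Mirrors part of the sample config shown earlier
sources = [
    {"name": "Averager project", "path": "file:///Users/finnmertens/averager"},
    {"name": "Remote MyBucket", "path": "s3://mybucket", "kwargs": {"anon": False}},
    {"name": "MemFS", "path": "memory://sample"},
]
problems = check_sources(sources)
```

An empty problems list means the entries satisfy these two rules; it does not guarantee that fsspec can actually instantiate each filesystem.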

Warning

By default, the file browser in jupyter_fsspec does not enforce Jupyter Server’s root directory restriction and will allow access to paths outside of it. To restrict access:

  • Set the CLI flag --JupyterFsspec.allow_absolute_paths=False when instantiating the server

  • Set the corresponding environment variable in the kernel environment (see helper module section)

This ensures that jupyter_fsspec only instantiates filesystems rooted within the server’s working directory, both in the browser UI and on the kernel side. Because the kernel is usually a fully privileged process, this restriction applies only to the automatic behavior of jupyter_fsspec; it does not prevent code running in the kernel from reaching other paths by other means.

Inactive Filesystems

Filesystems that are not instantiated due to an error will appear grayed out and will display an error message on hover. On click, there will be more information logged to the browser console.

Jupyter FSSpec Inactive Filesystem

The helper module

You can import the jupyter_fsspec.helper module into your notebooks to interact with filesystems defined in your config file:

# Import the helper module directly
from jupyter_fsspec import helper

# Get an fsspec filesystem using the config name
# (and use it as you would any fsspec filesystem)
fs = helper.filesystem('Averager project')

# Read a file through the configured filesystem
with fs.open('file://my/file/path', 'rb') as fhandle:
    filebytes = fhandle.read()
# Preview the first 256 bytes
filebytes[:256]

Note

In distributed environments (e.g. remote kernels), the paths that the helper uses may not be valid unless the kernel and server share a filesystem.

Warning

The environment variable JUPYTER_FSSPEC_ALLOW_ABSOLUTE_PATHS defaults to true, and should be set to false in the kernel environment to ensure that the helper does not instantiate filesystems with absolute paths.
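One possible way to set this variable for a single notebook session is sketched below. It assumes the helper reads the variable after this line has run, which this page does not confirm, so the assignment is placed before any jupyter_fsspec import. Setting the variable in the kernel's own environment (for example through the env entry of its kernelspec) is likely the more robust option.

```python
import os

# Assumption: the helper consults this variable at or after import time,
# so set it first, before importing jupyter_fsspec.helper.
os.environ["JUPYTER_FSSPEC_ALLOW_ABSOLUTE_PATHS"] = "false"
```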

Examples

Practical examples demonstrating how to use Jupyter fsspec with different storage systems: