Survey of Existing Jupyter Security Documentation

Note

This is a high-level survey of existing Jupyter documentation, with a focus on security-related instructions. It was originally compiled in August 2021 by Kay Avila, with input and review from Terry Fleury and Jeannette Dopheide, as part of a Trusted CI engagement.

Jupyter

Jupyter Notebook

JupyterLab

Jupyter Server (backend for JupyterLab)

  • https://jupyter-server.readthedocs.io/
    • Separated into Users, Operators, Developers, Contributors, Other
  • Users - https://jupyter-server.readthedocs.io/en/latest/users/index.html
    • Nothing specifically about security
  • Operators - https://jupyter-server.readthedocs.io/en/latest/operators/index.html
    • Installing a Jupyter extension automatically enables it [not ideal from a security standpoint]
    • Running a public Jupyter Server (intended only for single user)
      • Uses ZeroMQ
      • Can use a simple password, set up either automatically (through the user interface or by running "jupyter server password") or manually (by creating a hashed password and adding it to the configuration file)
      • Recommends using SSL
        • Brief description of self-signed certificates versus LetsEncrypt
        • Links to ArsTechnica article about obtaining paid certificate
        • Also links to LetsEncrypt further down in the page, under Running a public notebook server [this is confusing!]
      • Later on the same page, more information about how to use SSL certs and info on how to use LetsEncrypt
      • Firewall setup - allow public connections and localhost connections
      • Overriding Content-Security-Policy to allow embedding into another web page
      • Can specify an external gateway server to do kernel management
      • Mozilla and others recommend enabling Content Security Policy headers to protect against cross-site scripting
        • Disables inline JavaScript - which causes issues for Jupyter
        • Restricts communication to https, which disables ws/wss, which Jupyter uses for interacting with kernels
        • Need to add the following to the CSP headers -
          • 'unsafe-inline' and connect-src https: wss:
        • Note: not much about how this leaves Jupyter vulnerable, and nothing about how cross-site scripting protections can be enabled in another way
    • Security in the Jupyter Server
      • Token-based auth - on by default
        • Can be provided to the server in an authorization header, URL parameter, or password field of login form
        • If Jupyter server will launch the browser, an additional token is generated and then used to set a cookie
        • Can set a password instead (jupyter server password)
        • Possible to disable authentication, but not recommended
      • Security in notebook documents
        • [Duplicated information from Jupyter Notebook - arbitrary code execution, trust model, etc.]
  • Developers - https://jupyter-server.readthedocs.io/en/latest/developers/index.html
    • Depending on Jupyter Server [does not mention how to watch for security issues]
    • Note: nothing about how to contribute info about security issues here
  • Contributors - https://jupyter-server.readthedocs.io/en/latest/contributors/index.html
  • Other - https://jupyter-server.readthedocs.io/en/latest/other/index.html
    • FAQ is very short - just one ("Can I configure multiple extensions at once?")
    • Config file and command line options - a few mentions of impact on security from various settings
    • Changelog is buried here (?)
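
As a minimal sketch of the first two token options listed under "Security in the Jupyter Server" above, the following standard-library Python builds a request that carries the token in an authorization header and a URL that carries it as a query parameter. The server address, the token value, and the /api/status path are placeholders chosen for illustration, and the "token <value>" header scheme is recalled from the Jupyter Server documentation rather than taken from this survey, so it should be checked against the version in use. Nothing is sent over the network here.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical values for illustration only; substitute a real server and token.
SERVER_URL = "https://jupyter.example.org"
TOKEN = "0123456789abcdef"


def request_with_header(path: str, token: str) -> Request:
    """Build a request that carries the token in the Authorization header."""
    # Assumed header scheme: the literal word "token", a space, then the value.
    return Request(
        f"{SERVER_URL}{path}",
        headers={"Authorization": f"token {token}"},
    )


def url_with_token(path: str, token: str) -> str:
    """Build a URL that carries the token as a query parameter."""
    return f"{SERVER_URL}{path}?{urlencode({'token': token})}"


header_request = request_with_header("/api/status", TOKEN)
token_url = url_with_token("/api/status", TOKEN)

print(header_request.get_header("Authorization"))
print(token_url)
```

The header form keeps the token out of the URL itself, which may matter wherever URLs end up in browser history or proxy logs.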

JupyterHub

  • https://jupyterhub.readthedocs.io/en/latest/getting-started/security-basics.html
    • Note: the list at the top of subjects covered is different from the order they're actually covered in
    • Enable SSL (note at top about not running w/out SSL on public network)
      • Adding SSL key and cert to JupyterHub
      • Using LetsEncrypt
      • Mention of SSL termination happening outside of the Hub, e.g. SSL termination provided by Nginx
    • Proxy authentication token
      • Manual secret token between Hub and Proxy
      • Options: set in the config file, or use an environment variable
      • If not set manually, will be negotiated between Hub and Proxy (and Proxy must be restarted anytime the Hub is restarted)
    • "Cookie secret" encryption key to encrypt browser cookies used for auth
      • Options: set the secret file's location in the config file, use an environment variable, or store the secret itself in the config file
      • List of cookies used
  • https://jupyterhub.readthedocs.io/en/stable/reference/websecurity.html
    • Designed by default for semi-trusted users, takes extra work to secure for untrusted users
      • Note: Confusing/unclear sentence - "If the Hub is serving untrusted users, many of the web's cross-site protections are not applied between single-user servers and the Hub, or between single-user servers and each other, since browsers see the whole thing (proxy, Hub, and single user servers) as a single website (i.e. single domain)."
        • Makes it sound like protections are not applied for untrusted users, as opposed to making it clear admins need to be aware of this
    • Protecting users from each other
      • Admins must ensure users cannot modify their single-user notebook servers or the configuration of their notebook server
    • Mitigation options
      • Run single-user servers on subdomains (requires wildcard ssl cert)
        • Highly encouraged because resolves cross-site issues
      • Disable user-owned config files from being loaded
        • Note: Typo - "After implementing this option, PATHs and package installation and PATHs are the other things that the admin must enforce."
      • Prevent spawner from evaluating shell config files
      • Run single-user servers in virtualenvs with system-site-packages disabled, and do not let users install packages
        • This impacts only the server, not the environment(s) where their kernel(s) run
      • Encryption
        • Communication among proxy, hub, and single-user notebooks is unencrypted by default
        • Use the IPC transport instead of TCP for the ZeroMQ sockets, since the TCP sockets are unencrypted
          • Mentions that "internal_ssl option will eventually extend to securing the tcp sockets as well."
      • Use security audits
    • Information on vulnerability reporting - report to [email protected] and can use PGP public key to encrypt
  • https://jupyterhub.readthedocs.io/en/latest/getting-started/institutional-faq.html#for-it
    • Section - "How would I set up JupyterHub on institutional hardware?"
      • Zero to JupyterHub for Kubernetes
      • Littlest JupyterHub (runs in a VM)
    • Section - "Is JupyterHub secure?"
    • Section - "Can JupyterHub be used with my high-performance computing resources?"
      • Yes - e.g. Dask
    • Section - "How much resources do user sessions take?"
      • Note: says it's configurable, but doesn't link to documentation on how to do this
  • https://jupyterhub.readthedocs.io/en/latest/getting-started/authenticators-users-basics.html - Authentication and User Basics
    • Admin accounts and whether they have access to user notebooks
  • https://jupyterhub.readthedocs.io/en/stable/reference/spawners.html - under the Encryption section
    • Encryption among Proxy, Hub, and Notebook
  • https://jupyterhub.readthedocs.io/en/stable/reference/config-sudo.html
    • Running the Hub process without root privileges
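
The proxy authentication token and cookie secret described in the security basics page above are both random secrets that can be supplied through environment variables. The sketch below generates them with Python's standard library. The 32-byte length and the variable names CONFIGPROXY_AUTH_TOKEN and JPY_COOKIE_SECRET are assumptions recalled from the JupyterHub documentation, not details recorded in this survey, so they should be verified before use.

```python
import os
import secrets


def new_hex_secret(num_bytes: int = 32) -> str:
    """Return num_bytes of cryptographically strong randomness as hex text."""
    return secrets.token_hex(num_bytes)


# One secret per purpose; never reuse the same value for both.
proxy_auth_token = new_hex_secret()
cookie_secret = new_hex_secret()

# Copy of the current environment with both secrets added, suitable for
# passing to a process launcher. The variable names are assumptions.
hub_env = dict(os.environ)
hub_env["CONFIGPROXY_AUTH_TOKEN"] = proxy_auth_token
hub_env["JPY_COOKIE_SECRET"] = cookie_secret

# Report only the lengths so the secrets themselves are not written to logs.
print(len(proxy_auth_token), len(cookie_secret))
```

Setting the proxy token explicitly, rather than letting the Hub and Proxy negotiate one, avoids the restart coupling noted above: the Proxy can keep running across Hub restarts because both sides already share the same value.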

JupyterHub for Kubernetes / Zero-to-JupyterHub

  • https://zero-to-jupyterhub.readthedocs.io/en/latest/administrator/security.html
    • Advice is mostly for cloud-based deployments
    • Information on vulnerability reporting - report to [email protected] and can use PGP public key to encrypt
    • HTTPS
      • Add LetsEncrypt to proxy by editing config.yaml file
        • Recommends using a static IP address as the load balancer IP if a LoadBalancer proxy is being used
      • Or, manual https certificate ("considered an advanced option")
        • By configuring in config.yaml file or
        • Use kubectl to add a secret resource
      • Off-load SSL to a load balancer
    • Secure access to helm - see the relevant Kubernetes docs
    • Delete Kubernetes dashboard
    • Keep RBAC enabled, otherwise all pods are given root equivalent permissions
      • However, the page also gives instructions for disabling RBAC, though it strongly discourages doing so
    • Instructions on how to give users access to the Kubernetes API
    • Block user access to the cloud provider's metadata service
      • With a NetworkPolicy enforced by NetworkPolicy controller
        • Note: Typo - "We recommend relying on this approach if you had a NetworkPolicy controller"
      • Default configuration uses singleuser.cloudMetadata.blockWithIptables
    • Kubernetes Network Policies
      • Note that any unsupported options will be silently ignored
      • Enabled by default in JupyterHub helm charts in version 0.10+
      • Network policies by default do not allow user pods to talk to JupyterHub component pods
        • Gives instructions on how to add additional access
      • Default policy allows all egress traffic
        • Gives information and example on how to override this with more restrictive controls
    • Restricting load balancer access
      • By default, any IP is allowed to access the load balancer
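
Several of the defaults noted above (open load balancer access, network policies, metadata blocking) are controlled through Helm values in config.yaml. The following is a rough sketch, in Python, of assembling such values and emitting them as JSON, which is also valid YAML. Only singleuser.cloudMetadata.blockWithIptables is named in this survey. The keys proxy.service.loadBalancerSourceRanges and singleuser.networkPolicy.enabled are assumptions recalled from the Zero-to-JupyterHub chart and should be checked against the chart version in use. The CIDR range is a reserved documentation range, not a recommendation.

```python
import ipaddress
import json

# Hypothetical CIDR range allowed to reach the proxy's load balancer
# (192.0.2.0/24 is reserved for documentation examples).
ALLOWED_SOURCE_RANGES = ["192.0.2.0/24"]


def build_values(source_ranges):
    """Return Helm values that tighten a few of the defaults noted above."""
    # Fail early on malformed CIDR strings instead of at deploy time.
    for cidr in source_ranges:
        ipaddress.ip_network(cidr)
    return {
        "proxy": {
            "service": {
                # Assumed key name; restricts which client IPs may connect.
                "loadBalancerSourceRanges": list(source_ranges),
            },
        },
        "singleuser": {
            # Named in the survey as the default metadata-blocking mechanism.
            "cloudMetadata": {"blockWithIptables": True},
            # Assumed key name; keeps the chart's network policies enabled.
            "networkPolicy": {"enabled": True},
        },
    }


values = build_values(ALLOWED_SOURCE_RANGES)
# JSON output is also valid YAML, so it can be saved as a values file.
values_text = json.dumps(values, indent=2)
print(values_text)
```

Validating the ranges before writing the file is a design choice of this sketch: a mistyped CIDR raises ValueError locally rather than surfacing later as a rejected or misbehaving Kubernetes Service.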