Survey of Existing Jupyter Security Documentation

Note

This is meant to be a high-level survey of existing Jupyter documentation with a focus on security-related instructions. This survey was originally compiled in August 2021 by Kay Avila with input and review from Terry Fleury and Jeannette Dopheide as part of a Trusted CI engagement.

Jupyter-owned domains with documentation:
- Styled to look like jupyter.org -
- Not re-styled
  - https://jupyter-notebook.readthedocs.io/
  - https://jupyterlab.readthedocs.io/
- GitHub repos with documentation
  - https://github.com/jupyterhub/jupyterhub
  - https://github.com/jupyterhub/jupyterhub-tutorial
  - https://github.com/jupyterhub/jupyterhub-deploy-docker - JupyterHub in a single Docker container
  - https://github.com/jupyter/notebook
    - Encourages people to move to JupyterLab for more support
    - "Our approach moving forward will be:
      - To maintain the security of the Jupyter Notebook. That means security-related issues and pull requests are our highest priority.
      - To address JupyterLab feature parity issues. As part of this effort, we are also working on a better notebook-only experience in JupyterLab for users who prefer the UI of the classic Jupyter Notebook.
      - ... We cannot support or maintain new features at this time, but we welcome security and other sustainability fixes."

Jupyter

https://jupyter.readthedocs.io/en/latest/use/use-cases/content-user.html
- Lists notebook narratives; Jupyter for data science, scientific computing, education, and enterprise - but these are mostly just placeholder stubs
https://jupyter.readthedocs.io/en/latest/install/notebook-classic.html
- Installation instructions (for Jupyter Notebook)
  - Guide recommends Anaconda, or failing that, pip3 (pip)
  - Note: Doesn't mention whether to install with sudo or not
    - Without sudo, places the executables in /home/<user>/.local/bin
  - Running jupyter notebook command on CLI
    - By default, runs on localhost:8888
      - Note: Can change it with --ip= and --port= args
    - Provides a URL with a token
    - Note: Opening a URL with an invalid token brings up a login page that asks for a password or token; if a valid token is supplied there, the page also allows setting a new password
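
The tokenized URL described above can be illustrated with a short standard-library sketch. The host, port, and the way the token is generated here are assumptions for illustration only (the real server prints its own URL and token at startup); the sketch just shows how the `token` query parameter travels in the URL.

```python
import secrets
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit


def build_login_url(token: str, host: str = "localhost", port: int = 8888) -> str:
    """Return a URL of the form http://host:port/?token=<token>."""
    return f"http://{host}:{port}/?{urlencode({'token': token})}"


def extract_token(url: str) -> Optional[str]:
    """Return the value of the token query parameter, or None if it is absent."""
    values = parse_qs(urlsplit(url).query).get("token")
    return values[0] if values else None


# Hypothetical token; the real notebook server generates its own.
example_token = secrets.token_hex(24)
example_url = build_login_url(example_token)
print(example_url)
```

Anyone who obtains this URL can use the server, so it should be treated like a password.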
Weird issue - https://jupyter.readthedocs.io/en/latest/install.html left navbar repeats "Narratives and Use Cases" and "Advanced Use Cases"

Jupyter Notebook

https://jupyter-notebook.readthedocs.io/en/latest/notebook.html#introduction
- Browser compatibility: "Using Safari with HTTPS and an untrusted certificate is known to not work (websockets will fail)."
  - Doesn't explain what the issue is (self-signed cert)
https://jupyter-notebook.readthedocs.io/en/latest/notebook.html#notebooks-and-privacy
- If you followed the standard install (linked above), the notebook runs only on your own computer
- Can also run it remotely: "You can also use Jupyter remotely: your company or university might run the server for you, for instance. If you want to work with sensitive data in those cases, talk to your IT or data protection staff about it."
https://jupyter-notebook.readthedocs.io/en/latest/notebook.html#trusting-notebooks
- Signatures of trusted notebooks (those fully executed by the user) are stored; only trusted notebooks display HTML and Javascript output
- "jupyter trust <notebook>.ipynb" to trust one
- See Security section for more info
https://jupyter-notebook.readthedocs.io/en/latest/examples/Notebook/examples_index.html
- Note: No actual notebooks on this page, despite the text
- Links to nbviewer example notebooks
https://jupyter-notebook.readthedocs.io/en/latest/security.html#security-in-notebook-documents
- Problem of arbitrary code execution
- Security model -
  - Untrusted HTML is always sanitized
  - Untrusted Javascript is never executed
  - HTML and Javascript in Markdown cells are never trusted
  - Outputs generated by the user are trusted
  - Any other HTML or Javascript (in Markdown cells, output generated by others) is never trusted
  - The central question of trust is "Did the current user do this?"
- Checks the signature when a notebook is opened to determine whether its output was produced by the current user
- Trust is updated when the notebook is saved
- Notebooks can be explicitly trusted with a CLI command or in the web interface
- Information on vulnerability reporting - report to security@ipython.org and can use PGP public key to encrypt
- Changes in Jupyter 2.0:
  - Javascript and CSS are sanitized and stripped out
  - Cannot see a collaborator's outputs in a shared notebook because they are untrusted
    - Can rerun notebooks, explicitly trust, or share a notebook signature database
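
The signature-based trust check summarized above can be sketched with the standard library. This is a simplified illustration, not the implementation Jupyter ships: the canonical JSON form, the secret, and the in-memory set of known digests are all assumptions made for the example.

```python
import hashlib
import hmac
import json


def sign_notebook(notebook: dict, secret: bytes) -> str:
    """Compute an HMAC-SHA256 digest over a canonical JSON form of the notebook."""
    payload = json.dumps(notebook, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def is_trusted(notebook: dict, secret: bytes, known_signatures: set) -> bool:
    """Treat a notebook as trusted only if its digest was recorded earlier."""
    return sign_notebook(notebook, secret) in known_signatures


# Hypothetical per-user secret and notebook contents.
secret = b"per-user secret key (hypothetical)"
mine = {"cells": [{"cell_type": "code", "source": "1 + 1", "outputs": ["2"]}]}
known = {sign_notebook(mine, secret)}  # recorded when the user saved the notebook

# Same notebook with an output the user did not produce.
tampered = {"cells": [{"cell_type": "code", "source": "1 + 1", "outputs": ["<script>"]}]}
print(is_trusted(mine, secret, known), is_trusted(tampered, secret, known))
```

Because the digest depends on a secret only the user holds, a notebook edited by someone else no longer matches any recorded digest and falls back to untrusted.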
https://jupyter-notebook.readthedocs.io/en/latest/security.html#server-security (very short section)
- Token-based auth on by default, or can set a password
https://jupyter-notebook.readthedocs.io/en/stable/public_server.html#securing-a-notebook-server (Running a Notebook Server)
- Warns that the notebook server is not meant to be used as a multi-user server
- Setting a password on the notebook server - the user is automatically prompted to set one in notebook 5.3+
- Using SSL for encrypted communication
  - Using Let's Encrypt
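
The password and SSL settings above map onto a few lines in the notebook configuration file. A minimal sketch, assuming the `NotebookApp` option names used on the "Running a notebook server" page; the paths and the hashed-password value are placeholders, and the `SimpleNamespace` fallback is only a stand-in so the fragment also runs outside Jupyter.

```python
# Sketch of a jupyter_notebook_config.py fragment.
try:
    c = get_config()  # provided by Jupyter when it loads a config file
except NameError:
    # Stand-in object so the sketch also runs as a plain script.
    from types import SimpleNamespace
    c = SimpleNamespace(NotebookApp=SimpleNamespace())

# Placeholder paths; substitute the certificate and key actually obtained.
c.NotebookApp.certfile = "/absolute/path/to/mycert.pem"
c.NotebookApp.keyfile = "/absolute/path/to/mykey.key"
# Placeholder; this must be a hashed password, never the plain-text one.
c.NotebookApp.password = "<hashed password here>"
# Do not try to open a browser on a remote server.
c.NotebookApp.open_browser = False
```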

JupyterLab

https://jupyterlab.readthedocs.io/en/latest/getting_started/starting.html
- Says it runs on top of Jupyter Server, so see the Jupyter Server security section

Jupyter Server (backend for JupyterLab)

https://jupyter-server.readthedocs.io/
- Separated into Users, Operators, Developers, Contributors, Other
Users - https://jupyter-server.readthedocs.io/en/latest/users/index.html
- Nothing specifically about security
Operators - https://jupyter-server.readthedocs.io/en/latest/operators/index.html
- Installing a Jupyter extension automatically enables it [not ideal from a security standpoint]
- Running a public Jupyter Server (intended only for single user)
  - Uses ZeroMQ
  - Can use a simple password with an automatic setup in the user interface or running "jupyter server password", or by manually creating a hashed password and adding it to the configuration file
  - Recommends using SSL
    - Brief description of self-signed versus LetsEncrypt
    - Links to ArsTechnica article about obtaining paid certificate
    - Also links to LetsEncrypt further down in the page, under Running a public notebook server [this is confusing!]
  - Later on the same page, more information about how to use SSL certs and info on how to use LetsEncrypt
  - Firewall setup - allow public connections and localhost connections
  - Overriding Content-Security-Policy to allow embedding into another web page
  - Can specify an external gateway server to do kernel management
  - Mozilla and others recommend enabling Content Security Policy headers to protect against cross-site scripting
    - Disables inline JavaScript - which causes issues for Jupyter
    - Restricts communication to https, which disables ws/wss, which Jupyter uses for interacting with kernels
    - Need to add the following to the CSP headers -
      - 'unsafe-inline' and connect-src https: wss:
    - Note: not much about how this leaves Jupyter vulnerable, and nothing about how cross-site scripting protections can be enabled in another way
- Security in the Jupyter Server
  - Token-based auth - on by default
    - Can be provided to the server in an authorization header, URL parameter, or password field of login form
    - If Jupyter server will launch the browser, an additional token is generated and then used to set a cookie
    - Can set a password instead (jupyter server password)
    - Possible to disable authentication, but not recommended
  - Security in notebook documents
    - [Duplicated information from Jupyter Notebook - arbitrary code execution, trust model, etc.]
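
The Content-Security-Policy adjustment noted under "Running a public Jupyter Server" can be made concrete with a small sketch that assembles the header value. Placing `'unsafe-inline'` under `default-src https:` is an assumption about the starting policy; only the `'unsafe-inline'` and `connect-src https: wss:` additions come from the page surveyed. `build_csp` is a hypothetical helper, not part of any Jupyter package.

```python
def build_csp(directives: dict) -> str:
    """Join CSP directives into a single header value."""
    return "; ".join(f"{name} {' '.join(sources)}" for name, sources in directives.items())


csp = build_csp({
    # 'unsafe-inline' re-enables the inline JavaScript that Jupyter relies on.
    "default-src": ["https:", "'unsafe-inline'"],
    # wss: allows the WebSocket connections used to talk to kernels.
    "connect-src": ["https:", "wss:"],
})
headers = {"Content-Security-Policy": csp}
print(headers["Content-Security-Policy"])
```

Allowing inline scripts weakens the cross-site scripting protection that the policy was meant to provide, which is the gap the note above points out.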
Developers - https://jupyter-server.readthedocs.io/en/latest/developers/index.html
- Depending on Jupyter Server [does not mention how to watch for security issues]
- Note: nothing about how to contribute info about security issues here
Contributors - https://jupyter-server.readthedocs.io/en/latest/contributors/index.html
- General Jupyter contributor guidelines -
  - "jupyter_server has adopted automatic code formatting so you shouldn't need to worry too much about your code style"
  - Links to https://jupyter.readthedocs.io/en/latest/contributing/content-contributor.html
Other - https://jupyter-server.readthedocs.io/en/latest/other/index.html
- FAQ is very short - just one question ("Can I configure multiple extensions at once?")
- Config file and command line options - a few mentions of impact on security from various settings
- Changelog is buried here (?)

JupyterHub

https://jupyterhub.readthedocs.io/en/latest/getting-started/security-basics.html
- Note: the list at the top of subjects covered is different from the order they're actually covered in
- Enable SSL (note at top about not running w/out SSL on public network)
  - Adding SSL key and cert to JupyterHub
  - Using LetsEncrypt
  - Mention of SSL termination happening outside of the Hub, e.g. SSL termination provided by Nginx
- Proxy authentication token
  - Manual secret token between Hub and Proxy
  - Options: set in config file, or use environment variable
  - If not set manually, the Hub generates a random token itself (so the Proxy must be restarted anytime the Hub is restarted)
- "Cookie secret" encryption key to encrypt browser cookies used for auth
  - Options: set file location in config file, environment variable, or store in the config file
  - List of cookies used
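
Both the proxy authentication token and the cookie secret are random values that an admin can generate ahead of time. A minimal sketch, assuming 32 random bytes encoded as hex for each and a file readable only by its owner; the file name and the temporary directory are hypothetical and chosen so the example has no lasting side effects.

```python
import os
import secrets
import stat
import tempfile
from pathlib import Path


def new_hex_secret(num_bytes: int = 32) -> str:
    """Return num_bytes of randomness as a hex string."""
    return secrets.token_hex(num_bytes)


def write_secret_file(path: Path, secret_hex: str) -> None:
    """Write the secret so that only the owning user can read or write it."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(secret_hex)
    os.chmod(path, 0o600)  # enforce the mode even if the file already existed


proxy_token = new_hex_secret()
cookie_secret = new_hex_secret()

# Hypothetical location inside a temporary directory.
secret_path = Path(tempfile.mkdtemp()) / "jupyterhub_cookie_secret"
write_secret_file(secret_path, cookie_secret)
print(len(proxy_token), oct(stat.S_IMODE(secret_path.stat().st_mode)))
```

The generated values would then be supplied through whichever of the options listed above (config file, environment variable, or secret file) suits the deployment.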
https://jupyterhub.readthedocs.io/en/stable/reference/websecurity.html
- Designed by default for semi-trusted users, takes extra work to secure for untrusted users
  - Note: Confusing/unclear sentence - "If the Hub is serving untrusted users, many of the web's cross-site protections are not applied between single-user servers and the Hub, or between single-user servers and each other, since browsers see the whole thing (proxy, Hub, and single user servers) as a single website (i.e. single domain)."
    - Makes it sound like protections are not applied for untrusted users, as opposed to making it clear admins need to be aware of this
- Protecting users from each other
  - Admins must ensure users cannot modify their single-user notebook servers or the configuration of their notebook server
- Mitigation options
  - Run single-user servers on subdomains (requires wildcard ssl cert)
    - Highly encouraged because resolves cross-site issues
  - Disable user-owned config files from being loaded
    - Note: Typo - "After implementing this option, PATHs and package installation and PATHs are the other things that the admin must enforce."
  - Prevent spawner from evaluating shell config files
  - Run single-user servers in virtualenvs with disabled system-site-packages, and do not let user install packages
    - This impacts only the server, not the environment(s) where their kernel(s) run
  - Encryption
    - Communication among proxy, hub, and single-user notebooks is unencrypted by default
    - Use the IPC transport instead of TCP for the ZeroMQ sockets between notebook server and kernel, since the TCP sockets are unencrypted
      - Mentions that "internal_ssl option will eventually extend to securing the tcp sockets as well."
  - Use security audits
- Information on vulnerability reporting - report to security@ipython.org and can use PGP public key to encrypt
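
Several of the mitigation options above correspond to single configuration settings. A hedged sketch of a `jupyterhub_config.py` fragment follows; the option names are given as I understand them from the JupyterHub documentation and should be checked against the installed version, the domain is hypothetical, and the `SimpleNamespace` fallback is only a stand-in so the fragment also runs outside JupyterHub.

```python
# Sketch of a jupyterhub_config.py fragment.
try:
    c = get_config()  # provided by JupyterHub when it loads a config file
except NameError:
    # Stand-in object so the sketch also runs as a plain script.
    from types import SimpleNamespace
    c = SimpleNamespace(JupyterHub=SimpleNamespace(), Spawner=SimpleNamespace())

# Run each single-user server on its own subdomain.
# Hypothetical domain; needs wildcard DNS and a wildcard SSL certificate.
c.JupyterHub.subdomain_host = "https://hub.example.org"

# Do not load configuration files owned by the user.
c.Spawner.disable_user_config = True

# Encrypt traffic among the proxy, the Hub, and single-user servers.
c.JupyterHub.internal_ssl = True

# The kernel transport (KernelManager.transport = "ipc") is a setting of the
# single-user server, so it belongs in that server's configuration, not here.
```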
https://jupyterhub.readthedocs.io/en/latest/getting-started/institutional-faq.html#for-it
- Section - "How would I set up JupyterHub on institutional hardware?"
  - Zero to JupyterHub for Kubernetes
  - Littlest JupyterHub (runs in a VM)
- Section - "Is JupyterHub secure?"
  - Links to a Security Overview page that I hadn't found before, and to JupyterHub on Kubernetes Security
  - Mentions reaching out to community in the forum
- Section - "Can JupyterHub be used with my high-performance computing resources?"
  - Yes - e.g. Dask
- Section - "How much resources do user sessions take?"
  - Note: says it's configurable, but doesn't link to documentation on how to do this
https://jupyterhub.readthedocs.io/en/latest/getting-started/authenticators-users-basics.html - Authentication and User Basics
- Admin accounts and whether they have access to user notebooks
https://jupyterhub.readthedocs.io/en/stable/reference/spawners.html - under the Encryption section
- Encryption among Proxy, Hub, and Notebook
https://jupyterhub.readthedocs.io/en/stable/reference/config-sudo.html
- Running the Hub process without root privileges

JupyterHub for Kubernetes / Zero-to-JupyterHub

https://zero-to-jupyterhub.readthedocs.io/en/latest/administrator/security.html
- Advice is mostly for cloud-based deployments
- Information on vulnerability reporting - report to security@ipython.org and can use PGP public key to encrypt
- HTTPS
  - Add LetsEncrypt to proxy by editing config.yaml file
    - Recommends reserving a static IP address and using it as the load balancer IP if a LoadBalancer proxy service is being used
  - Or, manual https certificate ("considered an advanced option")
    - By configuring in config.yaml file or
    - Use kubectl to add a secret resource
  - Off-load SSL to a load balancer
- Secure access to helm - see the relevant Kubernetes docs
- Delete Kubernetes dashboard
- Keep RBAC enabled, otherwise all pods are given root equivalent permissions
  - However, though this is strongly discouraged, it also gives instructions for disabling RBAC
- Instructions on how to give users access to the Kubernetes API
  - Recommends also setting up RBAC (no example given, links to Kubernetes RBAC docs)
- Block user pods from accessing the cloud provider's metadata server
  - With a NetworkPolicy enforced by NetworkPolicy controller
    - Note: Typo - "We recommend relying on this approach if you had a NetworkPolicy controller"
  - Default configuration uses singleuser.cloudMetadata.blockWithIptables
- Kubernetes Network Policies
  - Note that any unsupported options will be silently ignored
  - Enabled by default in JupyterHub helm charts in version 0.10+
  - Network policies by default do not allow user pods to talk to JupyterHub component pods
    - Gives instructions on how to add additional access
  - Default policy allows all egress traffic
    - Gives information and example on how to override this with more restrictive controls
- Restricting load balancer access
  - By default, any IP is allowed to access the load balancer
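
Several of the items above end up as entries in the chart's config.yaml. The sketch below builds such a fragment as a Python dictionary and prints it as JSON, which YAML parsers generally accept. The key names are given as I recall them from the chart documentation and should be verified against the chart version in use; the host name, contact address, and CIDR range are hypothetical.

```python
import json

values = {
    "proxy": {
        "https": {
            "enabled": True,
            "hosts": ["hub.example.org"],  # hypothetical host name
            "letsencrypt": {"contactEmail": "admin@example.org"},  # hypothetical address
        },
        "service": {
            # Only these source ranges may reach the load balancer
            # (hypothetical documentation range).
            "loadBalancerSourceRanges": ["192.0.2.0/24"],
        },
    },
    "singleuser": {
        # Matches the default noted above, stated explicitly.
        "cloudMetadata": {"blockWithIptables": True},
    },
}

rendered = json.dumps(values, indent=2)
print(rendered)
```

Restricting the source ranges addresses the default noted above, where any IP address can reach the load balancer.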
