Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add blog post demonstrating the usage of GT.fmt_image() and vals.fmt_image() #558

Merged
merged 3 commits into from
Dec 13, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
365 changes: 365 additions & 0 deletions docs/blog/rendering-images/index.qmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,365 @@
---
title: "Rendering images anywhere in Great Tables"
html-table-processing: none
author: Jerry Wu
date: 2024-12-13
freeze: true
jupyter: python3
format:
html:
code-summary: "Show the Code"
---

Rendering images in Great Tables is straightforward with `GT.fmt_image` and `vals.fmt_image()`.
In this post, we'll explore three key topics:

* Four examples demonstrating how to render images within the body using `GT.fmt_image()`.
* How to render images anywhere using `vals.fmt_image()` and `html()`.
* How to manually render images anywhere using `html()`.

## Rendering Images in the Body
[GT.fmt_image()](https://posit-dev.github.io/great-tables/reference/GT.fmt_image.html#great_tables.GT.fmt_image)
is the go-to tool for rendering images within the body of a table. Below, we'll present four examples
corresponding to the cases outlined in the documentation:

* **Case 1**: Local file paths.
* **Case 2**: Full HTTP/HTTPS URLs.
* **Case 3**: Image names with the `path=` argument.
* **Case 4**: Image names using both the `path=` and `file_pattern=` arguments.

::: {.callout-tip collapse="false"}

## Finding the Right Case for Your Needs

* **Case 1** and **Case 2** work best for data sourced directly from a database.
* **Case 3** is ideal for users dealing with image names relative to a base directory or URL (e.g., `/path/to/images`).
* **Case 4** is tailored for users working with patterned image names (e.g., `metro_{}.svg`).
:::

### Preparations
For this demonstration, we'll use the first five rows of the built-in [metro](https://posit-dev.github.io/great-tables/reference/data.metro.html#great_tables.data.metro) dataset, specifically the `name` and `lines` columns.

To ensure a smooth walkthrough, we’ll manipulate the `data` (a Python dictionary) directly. However,
in real-world applications, such operations are more likely performed at the DataFrame level to leverage
the benefits of vectorized operations.
```{python}
# | code-fold: true
import pandas as pd
from great_tables import GT, vals, html
from importlib_resources import files

pd.set_option('display.max_colwidth', 150)

data = {
"name": [
"Argentine",
"Bastille",
"Bérault",
"Champs-Élysées—Clemenceau",
"Charles de Gaulle—Étoile",
],
"lines": ["1", "1, 5, 8", "1", "1, 13", "1, 2, 6"],
}

print("""\
data = {
"name": [
"Argentine",
"Bastille",
"Bérault",
"Champs-Élysées—Clemenceau",
"Charles de Gaulle—Étoile",
],
"lines": ["1", "1, 5, 8", "1", "1, 13", "1, 2, 6"],
}\
""")
```

Attentive readers may have noticed that the values for the key `lines` are lists of strings, each
containing one or more numbers separated by commas. `GT.fmt_image()` is specifically designed to
handle such cases, allowing users to render multiple images in a single row.

### Case 1: Local File Paths
**Case 1** demonstrates how to simulate a column containing strings representing local file paths. We'll
use images stored in the `data/metro_images` directory of Great Tables:
```{python}
img_local_paths = files("great_tables") / "data/metro_images" # <1>
```
1. These image files follow a patterned naming convention, such as `metro_1.svg`, `metro_2.svg`, and so on.

Below is a `Pandas` DataFrame called `metro_mini1`, where the `case1` column contains local file
paths that we want to render as images.
```{python}
# | code-fold: true
metro_mini1 = pd.DataFrame(
{
**data,
"case1": [
", ".join(
str((img_local_paths / f"metro_{item}").with_suffix(".svg"))
for item in row.split(", ")
)
for row in data["lines"]
],
}
)
metro_mini1
```

::: {.callout-tip collapse="false"}
## Use the `pathlib` Module to Construct Paths

Local file paths can vary depending on the operating system, which makes it easy to accidentally
construct invalid paths. A good practice to mitigate this is to use Python's built-in
[pathlib](https://docs.python.org/3/library/pathlib.html) module to construct paths first and then
convert them to strings. In this example, `img_local_paths` is actually an instance of `pathlib.Path`.
```{python}
# | eval: false
from pathlib import Path

isinstance(img_local_paths, Path) # True
```

:::

The `case1` column is quite lengthy due to the inclusion of `img_local_paths`. In **Case 3**, we'll
share a useful trick to avoid repeating the directory name each time—stay tuned!

For now, let's use `GT.fmt_image()` to render images by passing `"case1"` as the first argument:
```{python}
GT(metro_mini1).fmt_image("case1").cols_align(align="right", columns="case1")
```

### Case 2: Full HTTP/HTTPS URLs
**Case 2** demonstrates how to simulate a column containing strings representing HTTP/HTTPS URLs. We'll
use the same images as in **Case 1**, but this time, retrieve them from the Great Tables GitHub repository:
```{python}
img_url_paths = "https://raw.githubusercontent.com/posit-dev/great-tables/refs/heads/main/great_tables/data/metro_images"
```

Below is a `Pandas` DataFrame called `metro_mini2`, where the `case2` column contains
full HTTP/HTTPS URLs that we aim to render as images.
```{python}
# | code-fold: true
metro_mini2 = pd.DataFrame(
{
**data,
"case2": [
", ".join(f"{img_url_paths}/metro_{item}.svg" for item in row.split(", "))
for row in data["lines"]
],
}
)
metro_mini2
```

The lengthy `case2` column issue can also be addressed using the trick shared in **Case 3**.

Similarly, we can use `GT.fmt_image()` to render images by passing `"case2"` as the first argument:
```{python}
GT(metro_mini2).fmt_image("case2").cols_align(align="right", columns="case2")
```


### Case 3: Image Names with the `path=` Argument
**Case 3** demonstrates how to use the `path=` argument to specify images relative to a base directory
or URL. This approach eliminates much of the repetition in file names, offering a solution to the
issues in **Case 1** and **Case 2**.

Below is a `Pandas` DataFrame called `metro_mini3`, where the `case3` column contains file names that
we aim to render as images.
```{python}
# | code-fold: true
metro_mini3 = pd.DataFrame(
{
**data,
"case3": [
", ".join(f"metro_{item}.svg" for item in row.split(", ")) for row in data["lines"]
],
}
)
metro_mini3
```

Now we can use `GT.fmt_image()` to render the images by passing `"case3"` as the first argument and
specifying either `img_local_paths` or `img_url_paths` as the `path=` argument:
```{python}
# equivalent to `Case 1`
(
GT(metro_mini3)
.fmt_image("case3", path=img_local_paths)
.cols_align(align="right", columns="case3")
)

# equivalent to `Case 2`
(
GT(metro_mini3)
.fmt_image("case3", path=img_url_paths)
.cols_align(align="right", columns="case3")
)
```

After exploring **Case 1** and **Case 2**, you’ll likely appreciate the functionality of the `path=`
argument. However, manually constructing file names can still be a bit tedious. If your file names
follow a consistent pattern, the `file_pattern=` argument can simplify the process. Let’s see how
this works in **Case 4** below.

### Case 4: Image Names Using Both the `path=` and `file_pattern=` Arguments
**Case 4** demonstrates how to use `path=` and `file_pattern=` to specify images with names following
a common pattern. For example, you could use `file_pattern="metro_{}.svg"` to reference images like
`metro_1.svg`, `metro_2.svg`, and so on.

Below is a `Pandas` DataFrame called `metro_mini4`, where the `case4` column contains a copy of
`data["lines"]`, which we aim to render as images.
```{python}
# | code-fold: true
metro_mini4 = pd.DataFrame({**data, "case4": data["lines"]})
metro_mini4
```

First, define a string pattern to illustrate the file naming convention, using `{}` to indicate the
variable portion:
```{python}
file_pattern = "metro_{}.svg"
```

Next, pass `"case4"` as the first argument, along with `img_local_paths` or `img_url_paths` as the
`path=` argument, and `file_pattern` as the `file_pattern=` argument. This allows `GT.fmt_image()`
to render the images:
```{python}
# equivalent to `Case 1`
(
GT(metro_mini4)
.fmt_image("case4", path=img_local_paths, file_pattern=file_pattern)
.cols_align(align="right", columns="case4")
)

# equivalent to `Case 2`
(
GT(metro_mini4)
.fmt_image("case4", path=img_url_paths, file_pattern=file_pattern)
.cols_align(align="right", columns="case4")
)
```


::: {.callout-warning collapse="true"}
## Using `file_pattern=` Independently

The `file_pattern=` argument is typically used in conjunction with the `path=` argument, but this
is not a strict rule. If your local file paths or HTTP/HTTPS URLs follow a pattern, you can use
`file_pattern=` alone without `path=`. This allows you to include the shared portion of the file
paths or URLs directly in `file_pattern`, as shown below:
```{python}
file_pattern = str(img_local_paths / "metro_{}.svg")
(
GT(metro_mini4)
.fmt_image("case4", file_pattern=file_pattern)
.cols_align(align="right", columns="case4")
)
```

:::

**Case 4** is undoubtedly one of the most powerful features of Great Tables. While mastering it may
take some practice, we hope this example helps you render images effortlessly and effectively.

## Rendering Images Anywhere
While `GT.fmt_image()` is primarily designed for rendering images in the table body, what if you
need to display images in other locations, such as the header? In such cases, you can turn to the versatile
[vals.fmt_image()](https://posit-dev.github.io/great-tables/reference/vals.fmt_image.html#great_tables.vals.fmt_image).

`vals.fmt_image()` is a hidden gem in Great Tables. Its usage is similar to `GT.fmt_image()`, but
instead of working directly with DataFrame columns, it lets you pass a string or a list of strings
as the first argument, returning a list of strings, each representing an image. You can then wrap
these strings with [html()](https://posit-dev.github.io/great-tables/reference/html.html#great_tables.html),
allowing Great Tables to render the images anywhere in the table.

### Preparations
We will create a `Pandas` DataFrame named `metro_mini` using the `data` dictionary. This will be used
for demonstration in the following examples:
```{python}
# | code-fold: true
metro_mini = pd.DataFrame(data)
metro_mini
```

### Single Image
This example shows how to render a valid URL as an image in the title of the table header:
```{python}
logo_url = "https://posit-dev.github.io/great-tables/assets/GT_logo.svg"

_gt_logo, *_ = vals.fmt_image(logo_url, height=100) # <1>
gt_logo = html(_gt_logo)

(
GT(metro_mini)
.fmt_image("lines", path=img_url_paths, file_pattern="metro_{}.svg")
.tab_header(title=gt_logo)
.cols_align(align="right", columns="lines")
.opt_stylize(style=4, color="gray")
)
```
1. `vals.fmt_image()` returns a list of strings. Here, we use tuple unpacking to extract the first
item from the list.

### Multiple Images
This example demonstrates how to render two valid URLs as images in the title and subtitle of the
table header:
```{python}
logo_urls = [
"https://posit-dev.github.io/great-tables/assets/GT_logo.svg",
"https://raw.githubusercontent.com/rstudio/gt/master/images/dataset_metro.svg",
]

_gt_logo, _metro_logo = vals.fmt_image(logo_urls, height=100) # <1>
gt_logo, metro_logo = html(_gt_logo), html(_metro_logo)

(
GT(metro_mini)
.fmt_image("lines", path=img_url_paths, file_pattern="metro_{}.svg")
.tab_header(title=gt_logo, subtitle=metro_logo)
.cols_align(align="right", columns="lines")
.opt_stylize(style=4, color="gray")
)
```
1. Note that if you need to render images with different `height` or `width`, you might need to make
two separate calls to `vals.fmt_image()`.

## Manually Rendering Images Anywhere
Remember, you can always use `html()` to manually construct your desired output. For example, the
previous table can be created without relying on `vals.fmt_image()` like this:
```{python}
# | eval: false
(
GT(metro_mini)
.fmt_image("lines", path=img_url_paths, file_pattern="metro_{}.svg")
.tab_header(
title=html(f'<img src="{logo_urls[0]}" height="100">'),
subtitle=html(f'<img src="{logo_urls[1]}" height="100">'),
)
.cols_align(align="right", columns="lines")
.opt_stylize(style=4, color="gray")
)
```

Alternatively, you can manually encode the image using Python's built-in
[base64](https://docs.python.org/3/library/base64.html) module, specify the appropriate MIME type
and HTML attributes, and then wrap it in `html()` to display the table.

## Final Words
In this post, we focused on the most common use cases for rendering images in Great Tables, deliberately
avoiding excessive DataFrame operations. Including such details could have overwhelmed the post with
examples of string manipulations and the complexities of working with various DataFrame libraries.

We hope you found this guide helpful and enjoyed the structured approach. Until next time, happy
table creation with Great Tables!

::: {.callout-note}
## Appendix: Related PRs

If you're interested in the recent enhancements we've made to image rendering, be sure to check out
[#444](https://github.com/posit-dev/great-tables/pull/444),
[#451](https://github.com/posit-dev/great-tables/pull/451) and
[#520](https://github.com/posit-dev/great-tables/pull/520) for all the details.
:::
3 changes: 2 additions & 1 deletion great_tables/_formats.py
Original file line number Diff line number Diff line change
Expand Up @@ -3688,7 +3688,8 @@ def fmt_image(
In the output of images within a body cell, `sep=` provides the separator between each
image.
path
An optional path to local image files (this is combined with all filenames).
An optional path to local image files or an HTTP/HTTPS URL.
This is combined with the filenames to form the complete image paths.
file_pattern
The pattern to use for mapping input values in the body cells to the names of the graphics
files. The string supplied should use `"{}"` in the pattern to map filename fragments to
Expand Down
Loading