Commit

Fix readme and python3.9 union
maledorak committed Jan 9, 2025
1 parent 00f4b9d commit 0f96e79
Showing 3 changed files with 14 additions and 6 deletions.
11 changes: 8 additions & 3 deletions README.md
@@ -32,14 +32,14 @@ Emmetify converts complex HTML structures into concise Emmet notation. For examp
 ```
 Becomes:
 ```
-div.container>header.header>nav.nav>ul.nav-list>li.nav-item>a[href=#]
+div.container>header.header>nav.nav>ul.nav-list>li.nav-item>a[href=#]{Link}
 ```

 Using the [OpenAI Tokenizer](https://platform.openai.com/tokenizer), we can see this simple transformation reduces token count from:
 - HTML: 59 tokens
-- Emmet: 20 tokens
+- Emmet: 22 tokens

-That's 66% fewer tokens while preserving all structural information! And this is just with default settings.
+That's 63% fewer tokens while preserving all structural information! And this is just with default settings.

 You can achieve even higher compression rates (up to 90%, or even more depending on the HTML structure) by using advanced configuration options:
 - Removing unnecessary tags
@@ -137,6 +137,11 @@ response = llm.chat.completions.create(
 )
 ```

+## Backlog 📝
+
+- [x] Add support for HTML
+- [ ] Add examples
+
 ## Supported Formats 📊

 - ✅ HTML
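
The percentage change in the README hunk above follows from the two token counts. A minimal sketch of that arithmetic, where `token_reduction` is a hypothetical helper written for illustration and not part of Emmetify:

```python
def token_reduction(html_tokens: int, emmet_tokens: int) -> float:
    """Return the percentage of tokens saved by the Emmet form."""
    return (html_tokens - emmet_tokens) / html_tokens * 100


# Figures taken from the README diff: 59 HTML tokens in both versions.
old_claim = round(token_reduction(59, 20))  # Emmet output without {Link}
new_claim = round(token_reduction(59, 22))  # Emmet output with {Link}

print(f"old README: {old_claim}% fewer tokens")
print(f"new README: {new_claim}% fewer tokens")
```

Keeping the `{Link}` text node costs two extra tokens, which is why the stated saving drops from 66% to 63%.
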
5 changes: 3 additions & 2 deletions emmetify/converters/html_converter.py
@@ -1,3 +1,4 @@
+from typing import Union
 from emmetify.config.base_config import EmmetifierConfig
 from emmetify.config.html_config import HtmlAttributesPriority
 from emmetify.converters.base_converter import BaseConverter
@@ -152,7 +153,7 @@ def _node_to_emmet(self, node: HtmlNode) -> str:
         return "".join(parts)

     def _build_emmet(
-        self, node_pool: HtmlNodePool, node_data: str | HtmlNode, level: int = 0
+        self, node_pool: HtmlNodePool, node_data: Union[str, HtmlNode], level: int = 0
     ) -> str:
         """Recursively build Emmet notation with optional indentation."""
         indent = " " * (self.config.indent_size * level) if self.config.indent else ""
@@ -170,7 +171,7 @@ def _build_emmet(

         # Get children nodes
         children_nodes: list[HtmlNode] = []
-        direct_text_child_node: HtmlNode | None = None
+        direct_text_child_node: Union[HtmlNode, None] = None
         for child_index, child_id in enumerate(node.children_ids):
             child_node = node_pool.get_node(child_id)
             is_first_text_child = (
4 changes: 3 additions & 1 deletion emmetify/emmetifier.py
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
@@ -1,3 +1,5 @@
+from typing import Union
+
 from emmetify.config import EmmetifierConfig
 from emmetify.converters import get_converter
 from emmetify.parsers import get_parser
@@ -8,7 +10,7 @@ class Emmetifier:
     def __init__(
         self,
         format: SupportedFormats = DefaultFormat,
-        config: EmmetifierConfig | dict | None = None,
+        config: Union[EmmetifierConfig, dict, None] = None,
     ):
         self.config = EmmetifierConfig.model_validate(config) if config else EmmetifierConfig()

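
For context on the `Union` changes in this commit: the `X | Y` syntax between types (PEP 604) is only evaluated successfully at runtime on Python 3.10 and later, while `typing.Union` also works on 3.9. The sketch below illustrates that difference; `supports_pipe_unions` and `describe` are hypothetical names used for illustration and do not appear in Emmetify.

```python
import sys
from typing import Union, get_type_hints


def supports_pipe_unions() -> bool:
    """Return True if `type | type` can be evaluated on this interpreter."""
    try:
        _ = str | None  # raises TypeError on Python 3.9 and earlier
        return True
    except TypeError:
        return False


def describe(value: Union[str, int, None] = None) -> str:
    """Annotation written with typing.Union, which Python 3.9 accepts."""
    return "nothing" if value is None else f"{type(value).__name__}: {value}"


print(sys.version_info[:2], "supports X | Y:", supports_pipe_unions())
print(describe("a[href=#]"))
print(get_type_hints(describe)["value"])
```

An alternative that keeps the `|` spelling on 3.9 is `from __future__ import annotations`, which defers annotation evaluation. That can interact badly with libraries that resolve annotations at runtime, so spelling the types with `Union` is the more conservative choice.
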
