"Empty" h-cite gets name property? #267

Open
janboddez opened this issue Jan 1, 2025 · 2 comments


janboddez commented Jan 1, 2025

I'm not sure if this is in a spec someplace, but I was surprised to see, for example, ...

<div class="h-cite u-in-reply-to">
<p><em>In reply to <a href="https://toot.re/@ochtendgrijs/113752096935364562" class="u-url">https://toot.re/@ochtendgrijs/113752096935364562</a>.</em></p>
</div>

... get parsed into:

"in-reply-to": [
  {
    "type": [
      "h-cite"
    ],
    "properties": {
      "url": [
        "https://toot.re/@ochtendgrijs/113752096935364562"
      ],
      "name": [
        "In reply to https://toot.re/@ochtendgrijs/113752096935364562."
      ]
    },
    "value": "https://toot.re/@ochtendgrijs/113752096935364562"
  }
]

(Obviously, the post I'm replying to is not titled "In reply to, etc." If it was, well, I'd have added a p-name in there somewhere. Still, could be just me missing something obvious here.)

But then at the same time, something like this ...

<div class="h-cite u-in-reply-to">
<p><em>In reply to <a href="https://toot.re/@ochtendgrijs/113752096935364562" class="u-url">https://toot.re/@ochtendgrijs/113752096935364562</a> by <span class="p-author h-card"><a href="https://toot.re/@ochtendgrijs" rel="mention" class="mention u-url">@ochtendgrijs@toot.re</a></span>.</em></p>
</div>

... does not get a name property at all:

"in-reply-to": [
  {
    "type": [
      "h-cite"
    ],
    "properties": {
      "url": [
        "https://toot.re/@ochtendgrijs/113752096935364562"
      ],
      "author": [
        {
          "type": [
            "h-card"
          ],
          <...>
        }
      ]
    }
  }
]

Looks like a name (equal to the h-cite's text value) gets auto-added when there isn't an explicit p-name or similar. (If there is, then the p-name's value is used, as expected.)

Except, this does not happen when an h-card (or something like e-content) is present. (Yet it does happen when only a u-url is present.)

So, uh, I was wondering if this is by design.

On my own sites, I've always used markup very close to the examples above. When replying to or reposting a note (without an explicit title), I leave out the p-name.

I do occasionally, but not always, add p-author h-card. Looks like when I do, all is well (no name gets set, as intended), but when I don't, php-mf2 suddenly does add that ("nonsensical") name property.

The only way around it would be to use a u-in-reply-to directly on the URL in question (and skip the h-cite altogether). Which would be OK, I guess (there's not much of a reply context without, well, any actual reply context), except I'd have to rewrite quite a few posts (and some logic). 😅

Either way, I'd love to find out. It would also be super convenient if I could somehow keep using the same type of markup. The examples aren't so different, after all.

@gRegorLove
Member

Hi @janboddez! This is part of the implied parsing rules. If there's no explicit name property and no other p-*, e-*, or nested h-*, then it will imply a name. The algorithm used to almost always imply a name, but that led to some unintended consequences, so we added those conditions to limit that.
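
To make those conditions concrete, here is a minimal Python sketch of the check as described in this comment. It is not php-mf2's actual code: the function name `should_imply_name` and its input shape (a list of property class names found on the root, plus a flag for nested microformats) are assumptions made purely for illustration.

```python
def should_imply_name(property_classes, has_nested_microformat):
    """Decide whether a name would be implied for an h-* root.

    property_classes: class names of the root's own properties, e.g. ["u-url"]
                      (hypothetical input shape, not a php-mf2 structure).
    has_nested_microformat: True if any nested h-* was found inside the root.
    """
    for cls in property_classes:
        prefix, _, prop = cls.partition("-")
        if prop == "name":
            # An explicit name property exists, so nothing is implied.
            return False
        if prefix in ("p", "e"):
            # Any other p-* or e-* property also blocks the implied name.
            return False
    # A nested h-* (such as "p-author h-card") blocks it as well.
    return not has_nested_microformat


# First example from the issue: only a u-url inside the h-cite.
print(should_imply_name(["u-url"], False))  # True

# Second example: a nested "p-author h-card" is present.
print(should_imply_name(["u-url", "p-author"], True))  # False
```

Under these assumptions, the two results line up with the behaviour reported above: the h-cite with only a u-url gets a name, and the one with a nested h-card does not.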

We could definitely consider restricting that further, though I'm not sure how offhand (would need to consider other unintended consequences). That would be a discussion for the parsing specification issues, though.

@janboddez
Author

Aah, thanks for confirming!

I think I'll just leave things as they are, for now; there shouldn't be too many mf2 consumers that actually display an h-cite's "name."

I'm all for further restricting, haha. 😅 Seems somewhat counterintuitive, given that "so much" (i.e., how an entry is interpreted/displayed) depends on the absence of a name property, to "so badly want to fill in the blanks" when there isn't one to begin with.

What I've done in some of my projects, when parsing others' markup, is first check for microformats. When there's none, I'll use something like a title element or Open Graph tag to try and set a "name," akin to your "implied name." But when a page clearly supports microformats yet doesn't seem to have a name, I'm going to just assume there isn't one, i.e., that the page represents a "note."
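
Roughly, that fallback could look like the Python sketch below. Everything here is illustrative: `resolve_name` and its parameters are hypothetical, the input is assumed to be a list of already-parsed mf2 items whose `name` is only present when the markup set it explicitly, and preferring the title element over the Open Graph tag is an arbitrary choice.

```python
def resolve_name(mf2_items, html_title=None, og_title=None):
    """Pick a display name for a page, or None if it should be treated as a note.

    mf2_items: list of parsed microformats items (dicts with "type" and
               "properties"); assumed to carry a "name" only when explicit.
    html_title, og_title: fallback strings taken from the page, if any.
    """
    if mf2_items:
        # The page clearly supports microformats: trust what is (not) there.
        names = mf2_items[0].get("properties", {}).get("name", [])
        return names[0] if names else None
    # No microformats at all: fall back to the title element or Open Graph tag.
    return html_title or og_title


note = [{"type": ["h-entry"], "properties": {"url": ["https://example.com/1"]}}]
print(resolve_name(note, html_title="Some site title"))  # None
print(resolve_name([], html_title="A plain page"))       # A plain page
```

The first call returns `None` even though a title is available, which is the "assume it's a note" behaviour; the fallback only kicks in when no microformats were found at all.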
