AttributeError: 'NoneType' object has no attribute 'strip' #19

Open
Amirelkanov opened this issue Oct 8, 2024 · 0 comments

Description: AdvancedHTMLParser.AdvancedHTMLParser().parseStr raises an AttributeError when given an HTML-like string that contains a corrupted <style> tag.

String input:

<!DOCTYPE html><html><head><title>W33ZpsIOCysn9GGU45y0LW9EpuPHBlAuxCRRusKRvowefQLMy2</title><style:p { color: red; }</style></head><body><ul><li>rp52OnfCuzqBsp7</li><li>wrAAhIfvfpvMeyoTdmoF1oxezMhscNlgTqo0fPhfUS7XWZvECi2iVMsldLpqJq6W34KuOeoJ74cx5</li><li>8ymeXTKNEDb3jDnYwKt3lFMc4s7pJxDIVgSXljWIlOjv7JGr8cXf8SJOmpiyD05PyTzj9UATCFo1XqBpCqXR7KcjUYinCI4kZYI</li></ul> 6L1gB6g0z</body></html>

Bytearray input:

[60, 33, 68, 79, 67, 84, 89, 80, 69, 32, 104, 116, 109, 108, 62, 60, 104, 116, 109, 108, 62, 60, 104, 101, 97, 100, 62, 60, 116, 105, 116, 108, 101, 62, 87, 51, 51, 90, 112, 115, 73, 79, 67, 121, 115, 110, 57, 71, 71, 85, 52, 53, 121, 48, 76, 87, 57, 69, 112, 117, 80, 72, 66, 108, 65, 117, 120, 67, 82, 82, 117, 115, 75, 82, 118, 111, 119, 101, 102, 81, 76, 77, 121, 50, 60, 47, 116, 105, 116, 108, 101, 62, 60, 115, 116, 121, 108, 101, 58, 112, 32, 123, 32, 99, 111, 108, 111, 114, 58, 32, 114, 101, 100, 59, 32, 125, 60, 47, 115, 116, 121, 108, 101, 62, 60, 47, 104, 101, 97, 100, 62, 60, 98, 111, 100, 121, 62, 60, 117, 108, 62, 60, 108, 105, 62, 114, 112, 53, 50, 79, 110, 102, 67, 117, 122, 113, 66, 115, 112, 55, 60, 47, 108, 105, 62, 60, 108, 105, 62, 119, 114, 65, 65, 104, 73, 102, 118, 102, 112, 118, 77, 101, 121, 111, 84, 100, 109, 111, 70, 49, 111, 120, 101, 122, 77, 104, 115, 99, 78, 108, 103, 84, 113, 111, 48, 102, 80, 104, 102, 85, 83, 55, 88, 87, 90, 118, 69, 67, 105, 50, 105, 86, 77, 115, 108, 100, 76, 112, 113, 74, 113, 54, 87, 51, 52, 75, 117, 79, 101, 111, 74, 55, 52, 99, 120, 53, 60, 47, 108, 105, 62, 60, 108, 105, 62, 56, 121, 109, 101, 88, 84, 75, 78, 69, 68, 98, 51, 106, 68, 110, 89, 119, 75, 116, 51, 108, 70, 77, 99, 52, 115, 55, 112, 74, 120, 68, 73, 86, 103, 83, 88, 108, 106, 87, 73, 108, 79, 106, 118, 55, 74, 71, 114, 56, 99, 88, 102, 56, 83, 74, 79, 109, 112, 105, 121, 68, 48, 53, 80, 121, 84, 122, 106, 57, 85, 65, 84, 67, 70, 111, 49, 88, 113, 66, 112, 67, 113, 88, 82, 55, 75, 99, 106, 85, 89, 105, 110, 67, 73, 52, 107, 90, 89, 73, 60, 47, 108, 105, 62, 60, 47, 117, 108, 62, 32, 54, 76, 49, 103, 66, 54, 103, 48, 122, 60, 47, 98, 111, 100, 121, 62, 60, 47, 104, 116, 109, 108, 62]

Code that reproduces the error:

import AdvancedHTMLParser

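# Shortest failing input, noted under "Additional information"; the full string above fails the same way.
string_input = "<s</style>"

parser = AdvancedHTMLParser.AdvancedHTMLParser()
parser.parseStr(string_input)  # raises AttributeError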

Expected Result: The parser either ignores the invalid input or raises a dedicated exception (such as MultipleRootNodeException).
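
For the "ignore invalid input" option, one possible shape is a guard that treats a missing style value as an empty set of declarations. The traceback below shows styleToDict being called with None, which is what this guards against. This is only a sketch under assumptions: `style_to_dict_safe` is a hypothetical stand-in, not the library's `StyleAttribute.styleToDict`, and its parsing rules are deliberately simplified.

```python
def style_to_dict_safe(style_str):
    """Hypothetical, simplified stand-in for a style parser that tolerates None."""
    if style_str is None:
        # A style attribute without a value carries no declarations.
        return {}
    result = {}
    for declaration in style_str.strip().split(";"):
        name, sep, value = declaration.partition(":")
        name, value = name.strip(), value.strip()
        # Keep only well-formed "name: value" pairs; skip everything else.
        if sep and name and value:
            result[name] = value
    return result


print(style_to_dict_safe(None))
print(style_to_dict_safe("color: red; font-weight: bold"))
```

Raising a dedicated exception instead of returning an empty dict would also match the expected result; the guard above just illustrates the more lenient of the two choices.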

Actual Result:

Traceback (most recent call last):
  File "C:\Users\AmEl\IdeaProjects\Joker2023\src\main\python\main.py", line 55, in main
    python_method(input_data)
  File "C:\Users\AmEl\IdeaProjects\Joker2023\venv\Lib\site-packages\AdvancedHTMLParser\Parser.py", line 980, in parseStr
    self.feed(html)
  File "C:\Users\AmEl\IdeaProjects\Joker2023\venv\Lib\site-packages\AdvancedHTMLParser\Parser.py", line 948, in feed
    HTMLParser.feed(self, contents)
  File "C:\Users\AmEl\AppData\Local\Programs\Python\Python312\Lib\html\parser.py", line 111, in feed
    self.goahead(0)
  File "C:\Users\AmEl\AppData\Local\Programs\Python\Python312\Lib\html\parser.py", line 171, in goahead
    k = self.parse_starttag(i)
        ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AmEl\AppData\Local\Programs\Python\Python312\Lib\html\parser.py", line 338, in parse_starttag
    self.handle_starttag(tag, attrs)
  File "C:\Users\AmEl\IdeaProjects\Joker2023\venv\Lib\site-packages\AdvancedHTMLParser\Parser.py", line 138, in handle_starttag
    newTag = AdvancedTag(tagName, attributeList, isSelfClosing, ownerDocument=self)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AmEl\IdeaProjects\Joker2023\venv\Lib\site-packages\AdvancedHTMLParser\Tags.py", line 196, in __init__
    myAttributes[key] = value
    ~~~~~~~~~~~~^^^^^
  File "C:\Users\AmEl\IdeaProjects\Joker2023\venv\Lib\site-packages\AdvancedHTMLParser\SpecialAttributes.py", line 96, in __setitem__
    tag.style = StyleAttribute(value, tag)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AmEl\IdeaProjects\Joker2023\venv\Lib\site-packages\AdvancedHTMLParser\SpecialAttributes.py", line 424, in __init__
    self._styleDict = StyleAttribute.styleToDict(styleValue)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AmEl\IdeaProjects\Joker2023\venv\Lib\site-packages\AdvancedHTMLParser\SpecialAttributes.py", line 650, in styleToDict
    styleStr = styleStr.strip()
               ^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'strip'
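
Reading the frames above, handle_starttag receives an attribute whose value is None, and that value ends up in styleToDict. A likely explanation, not verified against the AdvancedHTMLParser source, is that the standard-library tokenizer sees a `style` attribute with no value, which it reports as None. The sketch below uses only `html.parser`; `AttrRecorder` and `attrs_for` are hypothetical helper names introduced for illustration.

```python
from html.parser import HTMLParser


class AttrRecorder(HTMLParser):
    """Hypothetical helper: records (tag, attrs) for every start tag seen."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, attrs))


def attrs_for(markup):
    """Feed markup to the stdlib parser and return the recorded start tags."""
    recorder = AttrRecorder()
    recorder.feed(markup)
    recorder.close()
    return recorder.seen


# A bare attribute has no value, so the stdlib passes None for it.
print(attrs_for("<p style>"))

# The short input from this report. Tokenization of malformed tags may
# differ between Python versions, so no particular output is claimed here.
print(attrs_for("<s</style>"))
```

If the second call also lists a `style` attribute with value None on the Python version in use, that would be consistent with the traceback.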

Additional information:

  • OS: Windows 10, 22H2 (19045.4984)
  • Python version: Python 3.12.6
  • The error can also be reproduced with a much shorter input: <s</style>

P.S. The same information is available in reportAttributeError.txt
