zh: add special pluralized words #1059

Open
wants to merge 3 commits into
base: master

Yiyiyimu
Contributor

Signed-off-by: yiyiyimu [email protected]

Pull Request Checklist

Thank you for taking the time to improve Arrow! Before submitting your pull request, please check all appropriate boxes:

  • 🧪 Added tests for changed code.
  • 🛠️ All tests pass when run locally (run tox or make test to find out!).
  • 🧹 All linting checks pass when run locally (run tox -e lint or make lint to find out!).
  • 📚 Updated documentation for changed code.
  • ⏩ Code is up-to-date with the master branch.

If you have any questions about your code changes or any of the points above, please submit your questions along with the pull request and we will try our best to help!

Description of Changes

Add special pluralized words in Chinese. Ref: #804

I came across one error that I need some help with:

Error Log
======================================================= FAILURES =======================================================
____________________________________________ TestArrowDehumanize.test_year _____________________________________________
self = <tests.test_arrow.TestArrowDehumanize object at 0x7f9a3b9b9b00>
locale_list_no_weeks = ['en', 'en-us', 'en-gb', 'en-au', 'en-be', 'en-jp', ...]

    def test_year(self, locale_list_no_weeks):

        for lang in locale_list_no_weeks:

            arw = arrow.Arrow(2000, 1, 10, 5, 55, 0)
            year_ago = arw.shift(years=-1)
            year_future = arw.shift(years=1)

            year_ago_string = year_ago.humanize(arw, locale=lang, granularity=["year"])
            year_future_string = year_future.humanize(
                arw, locale=lang, granularity=["year"]
            )

>           assert arw.dehumanize(year_ago_string, locale=lang) == year_ago

tests/test_arrow.py:2669:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <Arrow [2000-01-10T05:55:00+00:00]>, input_string = '去年', locale = 'zh'

    def dehumanize(self, input_string: str, locale: str = "en_us") -> "Arrow":
        """Returns a new :class:`Arrow <arrow.arrow.Arrow>` object, that represents
        the time difference relative to the attrbiutes of the
        :class:`Arrow <arrow.arrow.Arrow>` object.

        :param timestring: a ``str`` representing a humanized relative time.
        :param locale: (optional) a ``str`` specifying a locale.  Defaults to 'en-us'.

        Usage::

                >>> arw = arrow.utcnow()
                >>> arw
                <Arrow [2021-04-20T22:27:34.787885+00:00]>
                >>> earlier = arw.dehumanize("2 days ago")
                >>> earlier
                <Arrow [2021-04-18T22:27:34.787885+00:00]>

                >>> arw = arrow.utcnow()
                >>> arw
                <Arrow [2021-04-20T22:27:34.787885+00:00]>
                >>> later = arw.dehumanize("in a month")
                >>> later
                <Arrow [2021-05-18T22:27:34.787885+00:00]>

        """

        # Create a locale object based off given local
        locale_obj = locales.get_locale(locale)

        # Check to see if locale is supported
        normalized_locale_name = locale.lower().replace("_", "-")

        if normalized_locale_name not in DEHUMANIZE_LOCALES:
            raise ValueError(
                f"Dehumanize does not currently support the {locale} locale, please consider making a contribution to add support for this locale."
            )

        current_time = self.fromdatetime(self._datetime)

        # Create an object containing the relative time info
        time_object_info = dict.fromkeys(
            ["seconds", "minutes", "hours", "days", "weeks", "months", "years"], 0
        )

        # Create an object representing if unit has been seen
        unit_visited = dict.fromkeys(
            ["now", "seconds", "minutes", "hours", "days", "weeks", "months", "years"],
            False,
        )

        # Create a regex pattern object for numbers
        num_pattern = re.compile(r"\d+")

        # Search input string for each time unit within locale
        for unit, unit_object in locale_obj.timeframes.items():

            # Need to check the type of unit_object to create the correct dictionary
            if isinstance(unit_object, Mapping):
                strings_to_search = unit_object
            else:
                strings_to_search = {unit: str(unit_object)}

            # Search for any matches that exist for that locale's unit.
            # Needs to cycle all through strings as some locales have strings that
            # could overlap in a regex match, since input validation isn't being performed.
            for time_delta, time_string in strings_to_search.items():

                # Replace {0} with regex \d representing digits
                search_string = str(time_string)
                search_string = search_string.format(r"\d+")

                # Create search pattern and find within string
                pattern = re.compile(fr"{search_string}")
                match = pattern.search(input_string)

                # If there is no match continue to next iteration
                if not match:
                    continue

                match_string = match.group()
                num_match = num_pattern.search(match_string)

                # If no number matches
                # Need for absolute value as some locales have signs included in their objects
                if not num_match:
                    change_value = (
                        1 if not time_delta.isnumeric() else abs(int(time_delta))
                    )
                else:
                    change_value = int(num_match.group())

                # No time to update if now is the unit
                if unit == "now":
                    unit_visited[unit] = True
                    continue

                # Add change value to the correct unit (incorporates the plurality that exists within timeframe i.e second v.s seconds)
                time_unit_to_change = str(unit)
                time_unit_to_change += (
                    "s" if (str(time_unit_to_change)[-1] != "s") else ""
                )
                time_object_info[time_unit_to_change] = change_value
                unit_visited[time_unit_to_change] = True

        # Assert error if string does not modify any units
        if not any([True for k, v in unit_visited.items() if v]):
            raise ValueError(
>               "Input string not valid. Note: Some locales do not support the week granulairty in Arrow. "
                "If you are attempting to use the week granularity on an unsupported locale, this could be the cause of this error."
            )
E           ValueError: Input string not valid. Note: Some locales do not support the week granulairty in Arrow. If you are attempting to use the week granularity on an unsupported locale, this could be the cause of this error.

arrow/arrow.py:1422: ValueError
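
The failure can be reproduced outside Arrow with a minimal sketch of the search loop from the traceback. The `TIMEFRAMES` table and the `matched_units` helper below are hypothetical names, and the two zh strings are assumptions made for illustration rather than a copy of the real locale. The point is only that a special word such as '去年' contains neither "1年" nor a digit followed by "年", so no unit is ever marked as visited.

```python
import re
from collections.abc import Mapping

# Hypothetical subset of a zh-style timeframes table (assumed for illustration).
TIMEFRAMES = {
    "year": "1年",
    "years": "{0}年",
}


def matched_units(input_string, timeframes):
    """Return the timeframe keys whose pattern occurs in input_string."""
    hits = []
    for unit, unit_object in timeframes.items():
        # Same branching as in the traceback: a Mapping is searched as-is,
        # a plain string is wrapped into a one-entry dict.
        if isinstance(unit_object, Mapping):
            strings_to_search = unit_object
        else:
            strings_to_search = {unit: str(unit_object)}

        for time_string in strings_to_search.values():
            # Replace {0} with a digit pattern, then search the input.
            pattern = re.compile(str(time_string).format(r"\d+"))
            if pattern.search(input_string):
                hits.append(unit)
                break
    return hits


print(matched_units("1年前", TIMEFRAMES))  # ['year', 'years']
print(matched_units("去年", TIMEFRAMES))   # []
```

With an empty result, the `unit_visited` check at the end of `dehumanize` has nothing set to `True`, which matches the `ValueError` shown above.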

@anishnya
Member

anishnya commented Jan 2, 2022

Hi @Yiyiyimu, for the pluralized words, I'm assuming this implementation is based on the Korean locale, which does not support dehumanize, hence the error you're seeing. I'd recommend basing your implementation on something like the Arabic or Hebrew locales instead. Let me know if you have any further questions.
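
A rough sketch of the idea, assuming the special word is stored inside a mapping-style timeframe entry so that the `isinstance(unit_object, Mapping)` branch in the traceback can find it. The `TIMEFRAMES` layout, the key names, and the `parse_magnitudes` helper are hypothetical and do not reproduce the actual Arabic or Hebrew locale tables. The sketch only recovers the magnitude; deciding past versus future is out of scope here.

```python
import re
from collections.abc import Mapping

NUM_PATTERN = re.compile(r"\d+")

# Hypothetical mapping-style entry: the numeric key "1" carries the implied
# count for a special word that contains no digits (assumed layout).
TIMEFRAMES = {
    "years": {"1": "去年", "higher": "{0}年"},
}


def parse_magnitudes(input_string, timeframes):
    """Return {unit: magnitude} for every timeframe string found in the input."""
    found = {}
    for unit, unit_object in timeframes.items():
        if isinstance(unit_object, Mapping):
            strings_to_search = unit_object
        else:
            strings_to_search = {unit: str(unit_object)}

        for time_delta, time_string in strings_to_search.items():
            match = re.search(str(time_string).format(r"\d+"), input_string)
            if not match:
                continue

            num_match = NUM_PATTERN.search(match.group())
            if num_match:
                # The matched text carries an explicit number.
                value = int(num_match.group())
            else:
                # No digits: fall back to the numeric mapping key, else 1.
                value = abs(int(time_delta)) if time_delta.isnumeric() else 1
            found[unit] = value
    return found


print(parse_magnitudes("去年", TIMEFRAMES))   # {'years': 1}
print(parse_magnitudes("3年前", TIMEFRAMES))  # {'years': 3}
```

Whether `humanize` can consume the same mapping depends on how the locale class formats its output, so this would need to be checked against the locale's own tests.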
