-
Notifications
You must be signed in to change notification settings - Fork 195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix opt_start_time
usage and other datetime fields
#535
base: main
Are you sure you want to change the base?
Conversation
f82fe98
to
14632f6
Compare
FYI, I ran an additional test which I did not commit because it used pytest: Basically I want to make sure that the timezone information is not lost when using Test case: @pytest.mark.parametrize(
"tz",
[*pytz.all_timezones],
)
@pytest.mark.parametrize(
"microseconds",
[0, 1, 123456]
)
async def test_consumer_with_opt_start_time_all_timezones(self, tz: str, microseconds: int):
nc = NATS()
await nc.connect()
jsm = nc.jsm()
await jsm.add_stream(name="ctests", subjects=["a", "b", "c.>"])
con = await jsm.add_consumer(
"ctests",
opt_start_time=datetime.datetime(1970, 1, 1, microsecond=microseconds, tzinfo=pytz.timezone(tz)),
deliver_policy=api.DeliverPolicy.BY_START_TIME,
)
assert isinstance(con.created, datetime.datetime)
assert con.config.opt_start_time == datetime.datetime(1970, 1, 1, microsecond=microseconds, tzinfo=pytz.timezone(tz))
await nc.close() (click to unfold complete test with setup with pytest)@pytest.mark.asyncio
class SetupTest:
@pytest.fixture(autouse=True)
def setup(self):
self.server_pool = []
server = NATSD(port=4222, with_jetstream=True)
self.server_pool.append(server)
for natsd in self.server_pool:
start_natsd(natsd)
try:
yield
finally:
for natsd in self.server_pool:
natsd.stop()
shutil.rmtree(natsd.store_dir)
class TestOptStartTime(SetupTest):
@pytest.mark.parametrize(
"tz",
[*pytz.all_timezones],
)
@pytest.mark.parametrize(
"microseconds",
[0, 1, 123456]
)
async def test_consumer_with_opt_start_time_all_timezones(self, tz: str, microseconds: int):
nc = NATS()
await nc.connect()
jsm = nc.jsm()
await jsm.add_stream(name="ctests", subjects=["a", "b", "c.>"])
con = await jsm.add_consumer(
"ctests",
opt_start_time=datetime.datetime(1970, 1, 1, microsecond=microseconds, tzinfo=pytz.timezone(tz)),
deliver_policy=api.DeliverPolicy.BY_START_TIME,
)
assert isinstance(con.created, datetime.datetime)
assert con.config.opt_start_time == datetime.datetime(1970, 1, 1, microsecond=microseconds, tzinfo=pytz.timezone(tz))
await nc.close()
|
is assumed to be in UTC timezone. | ||
""" | ||
if date.tzinfo is None: | ||
date = date.replace(tzinfo=datetime.timezone.utc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about this part. It is described in the PR body though, and there are test cases for it.
I wonder if we should use .astimezone(datetime.timezone.utc)
instead of .replace(datetime.timezone.utc)
.
The current implementation considers that date without timezone are UTC, which is the rule that we use at work but may not be universal.
If we change to astimezone
, the date is assumed to be local and is converted to UTC.
So it appears that the response precision is variable at times, which is causing A response for (partial, prettied)
Looking at the two timestamps we can see the precision difference:
The second isn't valid for def _convert_rfc3339(resp: Dict[str, Any], field: str) -> None:
"""Convert a RFC 3339 formatted string into a datetime.
If the string is None, None is returned.
"""
val = resp.get(field, None)
if val is None:
return None
raw_date = val[:26]
if raw_date.endswith("Z"):
raw_date = raw_date[:-1] + "+00:00"
resp[field] = datetime.datetime.fromisoformat(raw_date).replace(
tzinfo=datetime.timezone.utc
) As it's coming in the raw response this looks like a It's also appears to be intermittent (dropping trailing |
So it ( RFC3339Nano = "2006-01-02T15:04:05.999999999Z07:00" The latest Python parses this correctly (code): def _parse_hh_mm_ss_ff(tstr):
# Parses things of the form HH[:?MM[:?SS[{.,}fff[fff]]]] However at least as of Python 3.9 it only accepts microseconds with 3 or 6 digits. def _parse_hh_mm_ss_ff(tstr):
# Parses things of the form HH[:MM[:SS[.fff[fff]]]]
len_str = len(tstr)
...
if pos < len_str:
if tstr[pos] != '.':
raise ValueError('Invalid microsecond component')
else:
pos += 1
len_remainder = len_str - pos
if len_remainder not in (3, 6):
raise ValueError('Invalid microsecond component')
time_comps[3] = int(tstr[pos:])
if len_remainder == 3:
time_comps[3] *= 1000 |
Not sure if an added dependency is wanted but backports-datetime-fromisoformat exists and can be used without the monkey patching. This brings the ISO 8601 parsing up to Python 3.11 in any Python 3. from backports.datetime_fromisoformat import datetime_fromisoformat
datetime_fromisoformat("2024-03-20T13:15:22.89525Z") |
Thank you for looking into this ! If I follow correctly, does it mean that the 4 commented test cases should pass but won't ? (I'm not on my computer so I can't run it) def test_parse_consumer_info_with_created_timestamp(self):
for created in [
# "1970-01-01T01:02:03.4Z", # Nok: should pass
# "1970-01-01T01:02:03.400Z", # Nok: should pass
"1970-01-01T01:02:03.400Z", # OK
# "1970-01-01T01:02:03.4000Z", # Nok: should pass
# "1970-01-01T01:02:03.40000Z", # Nok: should pass
"1970-01-01T01:02:03.400000Z", # OK
"1970-01-01T01:02:03.4000000Z", # OK
"1970-01-01T01:02:03.40000000Z", # OK
"1970-01-01T01:02:03.400000000Z", # OK
]:
info = api.ConsumerInfo.from_response({
"name": "test",
"stream_name": "test",
"config": {},
"created": created
})
created = info.created
assert created == datetime.datetime(
1970, 1, 1, 1, 2, 3, 400000, tzinfo=datetime.timezone.utc
)
|
I just pushed a commit to address those changes, and I added a test with more cases. The function cost is still relatively cheap, on my computer (CPU: Intel Pentium 6500Y @ 1.10GHz) 765 ns ± 114 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each) What do you think @arguile- ? |
Signed-off-by: Guillaume Charbonnier <[email protected]>
Signed-off-by: Guillaume Charbonnier <[email protected]>
Signed-off-by: Guillaume Charbonnier <[email protected]>
Signed-off-by: Guillaume Charbonnier <[email protected]>
6634834
to
2e55a5c
Compare
nats/js/api.py
Outdated
raw_date = val[:-6] | ||
# Padd missing milliseconds | ||
if "." not in raw_date: | ||
raw_date += ".000000" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessary for fromisoformat
, invert if
and leave out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right ! I removed this block and inverted the if
statement.
nats/js/api.py
Outdated
else: | ||
offset = val[-6:] | ||
raw_date = val[:-6] | ||
# Padd missing milliseconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Padd missing milliseconds | |
# Pad missing milliseconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth linking to the issue that caused Python 3.11 to fix this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a bunch of comments. Thanks for reviewing those changes, feel free to comment again.
842e274
to
e81be78
Compare
e81be78
to
469e8a8
Compare
Problem
This PR is related to issue #474.
Today it's possible to use the
opt_start_time
field inConsumerConfig
but a# type: ignore
comment is required:This is required because an
int
is expected by theConsumerConfig
dataclass.Moreover, users must encode the
opt_start_time
value prior to using it.Proposal
Annotate
ConsumerConfig.opt_start_time
as adatetime.datetime
and automatically convert from/to string when needed.By default (e.g., unless user-provided), datetimes are assumed to be UTC. So the example above should be equivalent to:
Additional proposal
It seemed that several fields used the same date format in the JetStream API. I uncommented the commented lines with a
FIXME
note where it seemed appropriate. Those changes can be reverted if not desired.Added tests
I added several test cases to make sure that:
Side effects
I added
pytz
as a development dependency, and ranpipenv lock
to refresh the lockfile. This dependency is not required by thenats-py
package, only during development.