Content-Length: 409473 | pFad | https://github.com/adafruit/circuitpython/issues/9693

CB time.sleep waits less than expected · Issue #9693 · adafruit/circuitpython · GitHub
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

time.sleep waits less than expected #9693

Open
ishahak opened this issue Oct 7, 2024 · 21 comments
Open

time.sleep waits less than expected #9693

ishahak opened this issue Oct 7, 2024 · 21 comments
Assignees
Labels
bug esp32-s3 needs retest rp2 Both RP2 microcontrollers
Milestone

Comments

@ishahak
Copy link

ishahak commented Oct 7, 2024

CircuitPython version

Adafruit CircuitPython 8.1.0-beta.1-48-g98a1083d6-dirty on 2023-08-22; ESP32-S3-DevKitC-1-N32R8 with ESP32S3

Code/REPL

import time

LOOP=200
acc = 0
for i in range(LOOP):
    t = time.monotonic_ns()
    time.sleep(0.01)
    t2 = time.monotonic_ns()
    diff_us = (t2-t) // 1000
    acc += diff_us
avg = acc / LOOP
print(f'expected 10000 us, got {avg} us')

Behavior

The result shows:
expected 10000 us, got 9799.07 us

Description

see:
https://forums.adafruit.com/viewtopic.php?p=1024856#p1024856

Additional information

No response

@ishahak ishahak added the bug label Oct 7, 2024
@todbot
Copy link

todbot commented Oct 7, 2024

You should try that on CircuitPython 9.1 or better.
There have been many updates to the ESP32 internals since 8.1.

@dhalbert dhalbert added this to the 9.x.x milestone Oct 7, 2024
@anecdata
Copy link
Member

anecdata commented Oct 7, 2024

Similar in 9.1.4. A sliver of the loss (<8us) is from the time it takes to call time.monotonic_ns() (time.sleep() takes a little longer). There also seems to be a tinier sliver lost to the float division and the accumulation of truncation errors. Without any division::

expected = 2_000_000_000 ns
actual   = 1_961_395_269 ns (avg 9806.98 us)

But the point remains that time.sleep() doesn't match time.monotonic_ns().

Similar results on Pico [W]:

expected 10000 us, got 9867.55 us
(total: 1_973_510_795 ns; total attributable to time.monotonic_ns() call: 4_211_402 ns)

@ishahak
Copy link
Author

ishahak commented Oct 7, 2024

Re @todbot, if you followed the linked forum discussion you'll see that bug was reproduced on 9.1.1 as well

@todbot
Copy link

todbot commented Oct 7, 2024

Right, thanks I see that now. Also, thanks @anecdata. In my experience trying to do tight-timing on CircuitPython this isn't very surprising. CircuitPython has no concept of atomic or critical sections so who knows what happens during those three function calls: USB handling and garbage collection come to mind as tasks that eat up several milliseconds.

Perhaps a similar test that was disconnected from USB, does a gc.disable() before starting, and then logs the final results to a file would give you a better indication as to true timing. That's not very satisfactory, but I've not found a way for CircuitPython to not be 5-10 milliseconds sloppy with timing. That you're only getting 0.2 milliseconds variation is pretty great, to me.

@dhalbert
Copy link
Collaborator

dhalbert commented Oct 7, 2024

What I think is odd here is that the time is less that the requested time. I'd expect overhead to make it longer. So I'm suspecting a ticks-vs-msec issue: a tick is 1/1024 of a second vs the msec 1/1000 of a second. It could be an error in converting ticks to msecs or assuming one is the other.

@todbot
Copy link

todbot commented Oct 7, 2024

Oh yes, under the requested value. You are correct. I need more coffee.

@johnnohj
Copy link

I think adafruit_support_bill got it right way back in August: here. Plus, any short sleep will be accentuated by the floored division diff_us = (t2-t) // 1000 which rounds decimal remainders down to the nearest integer, e.g., 1.999999 -> 1 (per the docs). See also this for a detailed discussion of what adafruit_support_bill and @dhalbert describe. (Of interest might be the point made about perhaps seeing greater accuracy, in this case, with time.sleep(0.1) instead)

@todbot
Copy link

todbot commented Oct 11, 2024

I think adafruit_support_bill got it right way back in August: here. Plus, any short sleep will be accentuated by the floored division [...]

Ah yes, I have experienced this. In #9237 I was convinced it was a problem with floats in CircuitPython because 0.004 seconds * 1000 millis/sec != 4 milliseconds.

@johnnohj
Copy link

Ah yes, I have experienced this. In #9237 I was convinced it was a problem with floats in CircuitPython because 0.004 seconds / 1000 != 4 milliseconds.

My favorite quote from the discussion:

"Note that this is in the very nature of binary floating point: this is not a bug in Python, and it is not a bug in your code either. You’ll see the same kind of thing in all languages that support your hardware’s floating-point arithmetic" (emphasis mine)

@dhalbert
Copy link
Collaborator

Plus, any short sleep will be accentuated by the floored division diff_us = (t2-t) // 1000 which rounds decimal remainders down to the nearest integer, e.g., 1.999999 -> 1 (per the docs).

I think this is still something to follow up on. We might round instead of truncating. And I am still suspicious of 1024 vs 1000. 9.7 vs 10 is not a few least-significant bits of difference.

@johnnohj
Copy link

I think this is still something to follow up on. We might round instead of truncating. And I am still suspicious of 1024 vs 1000. 9.7 vs 10 is not a few least-significant bits of difference.

I had played around with altering the value given time.sleep; the handful of substitute values tried were enough to suggest that expressions are perhaps being evaluated according to The Price is Right rules, which is to say what it returns is as near as possible the specified value without going over, within the limits of floating point/ (1000/1024) math.

Using values for sleep from 0.08 to 0.011 returned results which were either 986x(- up to 90) or 1076x(- up to 20).

Using 0.1 (and adjusting divisor to 10000) returned 9985(- up to 20).

(These tests were done on a Metro RP2040 running 9.2.0-beta.0)

@eightycc eightycc self-assigned this Mar 27, 2025
@dhalbert dhalbert modified the milestones: 9.x.x, 10.0.0 Mar 27, 2025
@eightycc eightycc added the rp2 Both RP2 microcontrollers label Mar 28, 2025
@eightycc
Copy link
Collaborator

eightycc commented Mar 28, 2025

mp_hal_delay_ms() starts out by calculating end_tick by converting port_get_raw_ticks() to ms, but adjusts remaining ticks using port_get_raw_ticks() without conversion.

@johnnohj
Copy link

but adjusts remaining ticks using port_get_raw_ticks() without conversion.

Does this mean it's literally losing track of time?

(fwiw: Revisiting this thread led me to imagine potential build flags for fuzzy logic. I'm too new to the project to know, but I have to think it's been discussed somewhere)

@Neradoc
Copy link

Neradoc commented Mar 29, 2025

mp_hal_delay_ms() starts out by calculating end_tick by converting port_get_raw_ticks() to ms, but adjusts remaining ticks using port_get_raw_ticks() without conversion.

I only see the delay being converted from ms to ticks. The rest is all done in ticks.

uint64_t delay_ticks = (delay_ms * (uint64_t)1024) / 1000;

The issue is not with floats, it's because the actual sleep time is in milliseconds, and there's 1024 ticks in 1000 milliseconds, so it just can't be accurate.

  • 0.01 gets converted to 10 milliseconds correctly
  • mp_hal_delay_ms() gets called with 10ms as input correctly
  • 10ms gets converted to 10 ticks because in int land: (10 * 1024) / 1000 = 10 which is a loss of accuracy
  • the board correctly sleeps for 10 ticks, which are actually ~9.77ms (10 * 1000 / 1024)

So that explains why it's lower than 10, but note that the function reads port_get_raw_ticks(NULL) which means it has at best a precision of 1 tick anyway (or 0.977 ms). 11 ticks being ~10.74ms.

The only way to have a higher precision would be to use subticks (which are a 1/32th of a tick or ~31us) for sleeps below say... 1s.
10 ms = 10 * 1024 / 1000 = 10.24 ticks = 10 ticks + 8 subticks.

@eightycc
Copy link
Collaborator

eightycc commented Apr 2, 2025

I only see the delay being converted from ms to ticks. The rest is all done in ticks.

@Neradoc Right. I'd mis-read the code. It does boil down to a question of precision as you describe.

@eightycc
Copy link
Collaborator

eightycc commented Apr 3, 2025

Using subticks to increase timer resolution by 5 bits, results running on a Raspberry Pi Pico W are:

10.0.0-alpha.1-19-g380b8bda7a-dirty
expected 10000 us, got 9997.95 us

Repeating the test several times gives similar results:

expected 10000 us, got 9999.29 us
expected 10000 us, got 10000.2 us
expected 10000 us, got 9996.29 us
expected 10000 us, got 9996.44 us
expected 10000 us, got 9998.56 us
expected 10000 us, got 9996.74 us

I'll be making similar updates to other ports and will PR shortly.

@Neradoc
Copy link

Neradoc commented Apr 3, 2025

Maybe we could have a note in the sleep() docs about precision being limited around the millisecond.
I've seen library code with time.sleep(0.00001) or 0.0001 which I don't know if this results in an actual sleep of 1 tick or 0.

Note that C python sleep docs says:

The suspension time may be longer than requested by an arbitrary amount, because of the scheduling of other activity in the system.

For CP, do we guarantee it's at least the time given (by rounding up) ? Do we say it's a "closest" effort plus overhead ? Do we just say it's an approximation ? Do we note that we can't promise precise timings in CP anyway ?

@dhalbert
Copy link
Collaborator

dhalbert commented Apr 3, 2025

For CP, do we guarantee it's at least the time given (by rounding up) ? Do we say it's a "closest" effort plus overhead ? Do we just say it's an approximation ? Do we promise nothing ?

No guarantees as to whether it's a ceiling rather than a floor, because there are things that can make the interval longer, like gc and interrupt handling. Right now time.sleep(0.00001) doesn't really do anything. For more precise timing, one can use microntroller.delay_us(), which does a busy-wait loop. Perhaps @eightycc's changes will subsume that.

@eightycc
Copy link
Collaborator

eightycc commented Apr 3, 2025

I've seen library code with time.sleep(0.00001) or 0.0001 which I don't know if this results in an actual sleep of 1 tick or 0.

The conversion code in time_sleep first converts the floating time in seconds to milliseconds by floating multiplication and rounds up by adding 0.5. An implicit cast from float to int then truncates the fractional part (standard C). So, a value < 0.5 milliseconds will not sleep while a value >= 0.5 milliseconds will sleep at least 1 millisecond.

For more precise timing, one can use microntroller.delay_us(), which does a busy-wait loop. Perhaps @eightycc's changes will subsume that.

My update does not support sleep timer resolution beyond integral milliseconds.

Since the underlying timer has subtick resolution (in most cases), it is possible to resolve sleep to ~30 us. Perhaps change mp_hal_delay_ms to mp_hal_delay_us? This would avoid any busy waiting. Is there utility in doing this?

@dhalbert
Copy link
Collaborator

dhalbert commented Apr 3, 2025

My update does not support sleep timer resolution beyond integral milliseconds.

Since the underlying timer has subtick resolution (in most cases), it is possible to resolve sleep to ~30 us. Perhaps change mp_hal_delay_ms to mp_hal_delay_us? This would avoid any busy waiting. Is there utility in doing this?

Are you saying change that call? mp_hal_delay_us() typically is a busy-wait, not a tick wait (which will save power, since it uses WFI() or the equivalent.

One issue is if a very long sleep would overflow the 32-bit arg to mp_hal_delay_us().
There could be a call in the time.sleep() impl to call mp_hal_delay_ms() for the ms part and then call mp_hal_delay_us() for the rest.

@eightycc
Copy link
Collaborator

eightycc commented Apr 3, 2025

Are you saying change that call? mp_hal_delay_us() typically is a busy-wait, not a tick wait (which will save power, since it uses WFI() or the equivalent.

Because there is not a generic mp_hal_delay_us(), I assumed it didn't exist. My bad. I agree that a busy wait is not a good idea due to power consumption implications. I'll choose another name that clarifies whether the implemented delay is a busy or a timer wait.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug esp32-s3 needs retest rp2 Both RP2 microcontrollers
Projects
None yet
Development

No branches or pull requests

8 participants








ApplySandwichStrip

pFad - (p)hone/(F)rame/(a)nonymizer/(d)eclutterfier!      Saves Data!


--- a PPN by Garber Painting Akron. With Image Size Reduction included!

Fetched URL: https://github.com/adafruit/circuitpython/issues/9693

Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy