.from_bytes() implementation for long MPZ ints #2791

pfalcon · 2017-01-15T08:39:02Z

This is WIP for the discussion in #682 (comment). The usual problem here is to make it work with all configuration options for long ints that we have, and to update the tests for that. That takes me a long time, so I'm posting this WIP for now (it works well for the unix port and passes tests).

pfalcon · 2017-01-15T08:39:14Z

@nickovs: FYI.

dpgeorge

> The usual problem here is to make it work with all configuration options for long ints that we have

What you have here is a good start for the most used case of mpz. Other implementations can follow later, but at least put a dummy implementation of mp_obj_int_from_bytes_impl for non-mpz cases.

dpgeorge · 2017-01-16T00:50:13Z

py/mpz.c

@@ -909,6 +909,36 @@ mp_uint_t mpz_set_from_str(mpz_t *z, const char *str, mp_uint_t len, bool neg, m
    return cur - str;
 }

+void mpz_set_from_bytes(mpz_t *z, bool big_endian, mp_uint_t len, byte *buf) {


The buf parameter should be const byte *.

dpgeorge · 2017-01-16T00:53:00Z

py/mpz.c

+
+    mpz_dig_t d = 0;
+    int num_bits = 0;
+    z->len = 0;


Need to set z->neg=0 as well.
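For readers following the review, the accumulation that mpz_set_from_bytes() appears to perform (a digit accumulator d, a bit counter num_bits, and a growing digit count) can be modelled in Python. This is only a sketch under stated assumptions: the 16-bit digit size, the function names, and the final trimming step are hypothetical and are not taken from the patch.

```python
def digits_from_bytes(buf, big_endian, dig_size=16):
    """Pack bytes into a list of digits, least significant digit first.

    dig_size is an assumed digit width in bits, chosen for illustration.
    """
    # Always consume bytes starting from the least significant one.
    data = reversed(buf) if big_endian else buf
    mask = (1 << dig_size) - 1
    digits = []
    acc = 0        # bits collected so far (plays the role of d)
    num_bits = 0   # number of valid bits currently held in acc
    for b in data:
        acc |= b << num_bits
        num_bits += 8
        # Emit every complete digit that the accumulator now holds.
        while num_bits >= dig_size:
            digits.append(acc & mask)
            acc >>= dig_size
            num_bits -= dig_size
    if num_bits:
        digits.append(acc)  # leftover partial digit
    # Drop zero digits at the most significant end.
    while digits and digits[-1] == 0:
        digits.pop()
    return digits


def digits_to_int(digits, dig_size=16):
    """Rebuild the integer from its digits (used here only for checking)."""
    value = 0
    for i, d in enumerate(digits):
        value |= d << (i * dig_size)
    return value
```

With the assumed 16-bit digits, digits_from_bytes(b"\x01\x02\x03", False) gives [0x0201, 0x03]. The sign is not modelled here; the review comment above about z->neg concerns that separate field.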

dpgeorge · 2017-01-16T00:57:59Z

py/objint_mpz.c

@@ -107,6 +107,12 @@ char *mp_obj_int_formatted_impl(char **buf, size_t *buf_size, size_t *fmt_size,
    return str;
 }

+mp_obj_t mp_obj_int_from_bytes_impl(bool big_endian, size_t len, byte *buf) {


The buf parameter should be const byte * here as well.

dpgeorge · 2017-01-16T01:04:09Z

py/objint.c

    // get the buffer info
    mp_buffer_info_t bufinfo;
    mp_get_buffer_raise(args[1], &bufinfo, MP_BUFFER_READ);

-    // convert the bytes to an integer
+    #if MICROPY_LONGINT_IMPL != MICROPY_LONGINT_IMPL_NONE


It would be desirable to use the original code path (ie convert to small-int) if the value fits in a small integer. To keep it simple one could just do something like if (MP_SMALL_INT_FITS(1 << (bufinfo.len * 8 -1))) ....
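The length-based check suggested here can be modelled roughly in Python. The sketch assumes that a small int holds a signed value one bit narrower than the machine word; word_bits and both helper names are hypothetical and exist only for illustration.

```python
def small_int_fits(value, word_bits=32):
    """Assumed model: small ints span -2**(word_bits-2) .. 2**(word_bits-2) - 1."""
    lo = -(1 << (word_bits - 2))
    hi = (1 << (word_bits - 2)) - 1
    return lo <= value <= hi


def buffer_always_fits(num_bytes, word_bits=32):
    """True if every unsigned value of num_bytes bytes fits in a small int.

    Mirrors the suggested test on the top bit, 1 << (len * 8 - 1).
    """
    if num_bytes == 0:
        return True
    return small_int_fits(1 << (num_bytes * 8 - 1), word_bits)
```

Under this model, 2-byte and 3-byte buffers pass the check on a 32-bit word, while a 4-byte buffer does not, because its top bit alone exceeds the assumed small-int range.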

I of course thought about that, but checking the exact conditions for that is boring, so, as you point out, only heuristic conditions would fit. And then, endian="big" was not implemented for that case. Ok, now I've implemented the "big" case, that's good. But I still don't think it's worth spending bytes on this heuristic optimization - not before it has been shown to be useful at least. We have optimizations found to be useful in practice (e.g. #2719); it's better to save bytes for those.

> But I still don't think it's worth spending bytes on this heuristic optimization - not before it has been shown to be useful at least

A case where it would be useful is as a replacement for struct.unpack('I', buf), which creates a tuple just to return a single value. One could use int.from_bytes instead, and it could work without allocating on the heap. See e.g. the end of esp8266/modules/ds18x20.py, which does ad-hoc bit shifting to unpack a signed 2-byte integer in order to avoid allocating memory.

It won't help with struct.unpack('I', buf), only with struct.unpack('H', buf), which is 2 bytes and can indeed easily be replaced with shift-or. I will add that optimization later, unless you finally agree it doesn't add much that is useful ;-).
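To make the shift-or comparison concrete, here is a small sketch of unpacking a 2-byte little-endian value by hand, for both the unsigned case and the signed case mentioned in connection with ds18x20.py. The function names are hypothetical, and this is not the code from that module.

```python
def unpack_u16_le(buf):
    """Unsigned 2-byte little-endian value via shift-or."""
    return buf[0] | (buf[1] << 8)


def unpack_s16_le(buf):
    """Signed 2-byte little-endian value: shift-or plus a sign correction."""
    value = buf[0] | (buf[1] << 8)
    if value & 0x8000:      # top bit set means the value is negative
        value -= 0x10000
    return value
```

For example, unpack_u16_le(b"\xfe\xff") is 65534, while unpack_s16_le(b"\xfe\xff") is -2. The extra branch in the signed version is the part that a plain shift-or does not cover.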

I'm getting confused here, and just realised that the title of this PR is wrong, should be from_bytes not to_bytes.

You are right that "I" won't use the small-int path with the simple check on the length of the incoming buffer. But the more useful case is 'h', i.e. a signed 2-byte conversion, because that requires more than a shift-or in Python to implement.

My main concern is that this PR is a breaking-change, in the sense that someone who already uses int.from_bytes with the heap locked (eg in an irq) will find that their code stops working after this is merged.
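The heap-locked scenario can be sketched as follows. On MicroPython the micropython module provides heap_lock() and heap_unlock(); the fallback no-op functions are an assumption added only so that the sketch also runs on other Python implementations, and from_bytes_locked is a hypothetical helper name.

```python
try:
    import micropython
    heap_lock = micropython.heap_lock
    heap_unlock = micropython.heap_unlock
except (ImportError, AttributeError):
    # Not running on MicroPython: use no-op stand-ins (assumption for this sketch).
    def heap_lock():
        pass

    def heap_unlock():
        pass


def from_bytes_locked(buf):
    """Convert a short buffer while the heap is locked.

    On MicroPython this only succeeds if the conversion does not allocate.
    """
    heap_lock()
    try:
        return int.from_bytes(buf, "little")
    finally:
        heap_unlock()
```

If the conversion had to allocate a long int while the heap is locked, MicroPython would be expected to raise MemoryError, which is the breakage described above.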

Title is fixed, sorry. Signed conversion for .from_bytes() is again not implemented. But sounds good, I'll implement what you suggest.

dpgeorge · 2017-01-20T02:04:33Z

py/objint.h

@@ -59,6 +59,7 @@ char *mp_obj_int_formatted(char **buf, size_t *buf_size, size_t *fmt_size, mp_co
 char *mp_obj_int_formatted_impl(char **buf, size_t *buf_size, size_t *fmt_size, mp_const_obj_t self_in,
                                int base, const char *prefix, char base_char, char comma);
 mp_int_t mp_obj_int_hash(mp_obj_t self_in);
+mp_obj_t mp_obj_int_from_bytes_impl(bool big_endian, size_t len, byte *buf);


Did you make this const?

pfalcon · 2017-01-21T17:06:42Z

Ok, this is fully implemented, I'm flushing it to master.

pfalcon · 2017-01-21T17:16:47Z

Squashed a bit and merged.

dpgeorge · 2017-01-22T01:01:44Z

Thanks!

nickovs · 2017-01-25T01:10:35Z

For the unix port, when I test this on Linux it works correctly but when I test this on MacOS it does not work. After a make clean; make axtls; make on Linux I see:

nicko@prolapse:~/micropython/unix$ git show --oneline -s
b32a38e esp8266: Factor out common linker code to esp8266_common.ld.
nicko@prolapse:~/micropython/unix$ ./micropython -c 'print(int.from_bytes(b"\xff"*64, "little"))'
13407807929942597099574024998205846127479365820592393377723561443721764030073546976801874298166903427690031858186486050853753882811946569946433649006084095

When I do the same on MacOS I get:

Nickos-MBP-84:unix nicko$ git show --oneline -s
b32a38e esp8266: Factor out common linker code to esp8266_common.ld.
Nickos-MBP-84:unix nicko$ ./micropython -c 'print(int.from_bytes(b"\xff"*64, "little"))'
18446744073709551615

I'm not quite sure what's going on here but something is clearly not right.
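A quick arithmetic check of the two outputs, for reference. The Linux result is 2**512 - 1, the value of 64 bytes of 0xff, and the MacOS result is 2**64 - 1, which is what remains if only the low 64 bits are kept. The thread does not state the cause, so this sketch only compares the numbers.

```python
buf = b"\xff" * 64

# Value of 64 bytes of 0xff, as printed on Linux.
expected = int.from_bytes(buf, "little")

# Keeping only the low 64 bits reproduces the number printed on MacOS.
low_64_bits = expected & ((1 << 64) - 1)
```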

dpgeorge · 2017-01-25T03:40:07Z

> I'm not quite sure what's going on here but something is clearly not right.

Thanks for the report! Should be fixed by eaa7745

nickovs · 2017-01-25T04:11:44Z

Yes, that seems to fix it.

dpgeorge reviewed Jan 16, 2017

pfalcon force-pushed the mpz_set_from_bytes branch from f17dbba to 96e2235 Compare January 19, 2017 21:36

dpgeorge reviewed Jan 20, 2017

pfalcon force-pushed the mpz_set_from_bytes branch from f18b0cb to 0176acc Compare January 20, 2017 08:20

pfalcon changed the title ~~.to_bytes() implementation for long MPZ ints~~ .from_bytes() implementation for long MPZ ints Jan 20, 2017

Paul Sokolovsky added 5 commits January 21, 2017 18:51

py/mpz: Implement mpz_set_from_bytes() as a foundation for int.from_bytes().

a225b23

py/objint: Delegate from_bytes() to mp_obj_int_from_bytes_impl().

abee21f

An implementation of int.from_bytes() is now delegated to function mp_obj_int_from_bytes_impl() of particular objint implementation. A version for objint_mpz is provided.

tests: Add test for int.from_bytes() for arbitrary-precision integer.

ef47b7a

This test works only for MICROPY_LONGINT_IMPL == MICROPY_LONGINT_IMPL_MPZ and needs a way of skipping in other cases.

py/objint: from_bytes: Support byteorder="big" for MICROPY_LONGINT_IMPL_NONE.

115a95a

py/objint_longlong: Add stub for mp_obj_int_from_bytes_impl().

04b03bc

To be implemented later.

pfalcon force-pushed the mpz_set_from_bytes branch from 0176acc to 04b03bc Compare January 21, 2017 16:10

Paul Sokolovsky added 2 commits January 21, 2017 20:03

py/objint: from_bytes(): If result fits in small int, use that.

2c4318a

tests/heapalloc_int_from_bytes: Test that int.from_bytes() can work w/o alloc.

fb9a32d

For a small number of bytes, it's expected to return a small int without allocation.

pfalcon closed this Jan 21, 2017

pfalcon deleted the mpz_set_from_bytes branch January 21, 2017 17:16
