Skip to content

.from_bytes() implementation for long MPZ ints #2791

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 7 commits into from

Conversation

pfalcon
Copy link
Contributor

@pfalcon pfalcon commented Jan 15, 2017

This is WIP for the discussion in #682 (comment) . The usual problem here is to make it work with all configuration options for long ints that we have, and that update tests for that. That takes me extended time, so I post this WIP for now (which works well for unix port and passes tests).

@pfalcon
Copy link
Contributor Author

pfalcon commented Jan 15, 2017

@nickovs: FYI.

Copy link
Member

@dpgeorge dpgeorge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The usual problem here is to make it work with all configuration options for long ints that we have

What you have here is a good start for the most used case of mpz. Other implementations can follow later, but at least put a dummy implementation of mp_obj_int_from_bytes_impl for non-mpz cases.

@@ -909,6 +909,36 @@ mp_uint_t mpz_set_from_str(mpz_t *z, const char *str, mp_uint_t len, bool neg, m
return cur - str;
}

void mpz_set_from_bytes(mpz_t *z, bool big_endian, mp_uint_t len, byte *buf) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const byte *

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


mpz_dig_t d = 0;
int num_bits = 0;
z->len = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to set z->neg=0 as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -107,6 +107,12 @@ char *mp_obj_int_formatted_impl(char **buf, size_t *buf_size, size_t *fmt_size,
return str;
}

mp_obj_t mp_obj_int_from_bytes_impl(bool big_endian, size_t len, byte *buf) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const byte *

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// get the buffer info
mp_buffer_info_t bufinfo;
mp_get_buffer_raise(args[1], &bufinfo, MP_BUFFER_READ);

// convert the bytes to an integer
#if MICROPY_LONGINT_IMPL != MICROPY_LONGINT_IMPL_NONE
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be desirable to use the original code path (ie convert to small-int) if the value fits in a small integer. To keep it simple one could just do something like if (MP_SMALL_INT_FITS(1 << (bufinfo.len * 8 -1))) ....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I of course thought about that, but checking exact conditions for that is boring, so as you point, only heuristic conditions would fit. And then, endian="big" was not implemented for that case. Ok, now I implemented "big" case, that's good. But I still don't think it's worth to spend bytes on this heuristic optimization - not before it was shown to be useful at lease. We have optimizations found to be useful from practice (e.g. #2719), it's better to save bytes for those.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I still don't think it's worth to spend bytes on this heuristic optimization - not before it was shown to be useful at lease

A case where it would be useful is instead of struct.unpack('I', buf) which creates a tuple to return the single value. Instead one could use int.from_bytes and it could work without allocating on the heap. See eg end of esp8266/modules/ds18x20.py which does ad-hoc bit shifting to unpack a signed 2-byte integer, in order to avoid allocating memory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It won't help with struct.unpack('I', buf), only with struct.unpack('H', buf), which is 2 bytes, which is indeed can be easily replaced with shift-or. I will add that optimization later, unless you finally agree it doesn't add much useful ;-).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm getting confused here, and just realised that the title of this PR is wrong, should be from_bytes not to_bytes.

You are right that "I" wont use the small-int path with the simple check for length of incoming buffer. But the more useful case is 'h', ie a signed 2-byte conversion, because that requires more than a shift-or in Python to implement.

My main concern is that this PR is a breaking-change, in the sense that someone who already uses int.from_bytes with the heap locked (eg in an irq) will find that their code stops working after this is merged.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Title is fixed, sorry. Signed conversion for .from_bytes() is again not implemented. But sounds good, I'll implement what you suggest.

@pfalcon pfalcon force-pushed the mpz_set_from_bytes branch from f17dbba to 96e2235 Compare January 19, 2017 21:36
@@ -59,6 +59,7 @@ char *mp_obj_int_formatted(char **buf, size_t *buf_size, size_t *fmt_size, mp_co
char *mp_obj_int_formatted_impl(char **buf, size_t *buf_size, size_t *fmt_size, mp_const_obj_t self_in,
int base, const char *prefix, char base_char, char comma);
mp_int_t mp_obj_int_hash(mp_obj_t self_in);
mp_obj_t mp_obj_int_from_bytes_impl(bool big_endian, size_t len, byte *buf);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you make this const?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now done.

@pfalcon pfalcon force-pushed the mpz_set_from_bytes branch from f18b0cb to 0176acc Compare January 20, 2017 08:20
@pfalcon pfalcon changed the title .to_bytes() implementation for long MPZ ints .from_bytes() implementation for long MPZ ints Jan 20, 2017
Paul Sokolovsky added 5 commits January 21, 2017 18:51
An implementation of int.from_bytes() is now delegated to function
mp_obj_int_from_bytes_impl() of particular objint implementation. A
version for objint_mpz is provided.
This test works only for MICROPY_LONGINT_IMPL == MICROPY_LONGINT_IMPL_MPZ
and needs a way of skipping in other cases.
@pfalcon pfalcon force-pushed the mpz_set_from_bytes branch from 0176acc to 04b03bc Compare January 21, 2017 16:10
Paul Sokolovsky added 2 commits January 21, 2017 20:03
…/o alloc.

For a small number of bytes, it's expected to return a small int without
allocation.
@pfalcon
Copy link
Contributor Author

pfalcon commented Jan 21, 2017

Ok, this is fully implemented, I'm flushing it to master.

@pfalcon
Copy link
Contributor Author

pfalcon commented Jan 21, 2017

Squashed a bit and merged.

@pfalcon pfalcon closed this Jan 21, 2017
@pfalcon pfalcon deleted the mpz_set_from_bytes branch January 21, 2017 17:16
@dpgeorge
Copy link
Member

Thanks!

@nickovs
Copy link
Contributor

nickovs commented Jan 25, 2017

For the unix port, when I test this on Linux it works correctly but when I test this on MacOS it does not work. After a make clean; make axtls; make on Linux I see:

nicko@prolapse:~/micropython/unix$ git show --oneline -s
b32a38e esp8266: Factor out common linker code to esp8266_common.ld.
nicko@prolapse:~/micropython/unix$ ./micropython -c 'print(int.from_bytes(b"\xff"*64, "little"))'
13407807929942597099574024998205846127479365820592393377723561443721764030073546976801874298166903427690031858186486050853753882811946569946433649006084095

When I do the same on MacOS I get:

Nickos-MBP-84:unix nicko$ git show --oneline -s
b32a38e esp8266: Factor out common linker code to esp8266_common.ld.
Nickos-MBP-84:unix nicko$ ./micropython -c 'print(int.from_bytes(b"\xff"*64, "little"))'
18446744073709551615

I'm not quite sure what's going on here but something is clearly not right.

@dpgeorge
Copy link
Member

I'm not quite sure what's going on here but something is clearly not right.

Thanks for the report! Should be fixed by eaa7745

@nickovs
Copy link
Contributor

nickovs commented Jan 25, 2017

Yes, that seems to fix it.

tannewt added a commit to tannewt/circuitpython that referenced this pull request Aug 3, 2022
* Tweak scroll area position so last line is complete and top is
  under the title bar.
* Pick Blinka size based on the font to minimize unused space in
  title bar. Related to micropython#2791
* Update the title bar after terminal is started. Fixes micropython#6078

Fixes micropython#6668
tannewt added a commit to tannewt/circuitpython that referenced this pull request Aug 3, 2022
* Tweak scroll area position so last line is complete and top is
  under the title bar.
* Pick Blinka size based on the font to minimize unused space in
  title bar. Related to micropython#2791
* Update the title bar after terminal is started. Fixes micropython#6078

Fixes micropython#6668
tannewt added a commit to tannewt/circuitpython that referenced this pull request Aug 4, 2022
* Tweak scroll area position so last line is complete and top is
  under the title bar.
* Pick Blinka size based on the font to minimize unused space in
  title bar. Related to micropython#2791
* Update the title bar after terminal is started. Fixes micropython#6078

Fixes micropython#6668
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
pFad - Phonifier reborn

Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.


Alternative Proxies:

Alternative Proxy

pFad Proxy

pFad v3 Proxy

pFad v4 Proxy