runtime: don't allocate when putting a bool into an interface #17725
Relatedly, I've been thinking that converting len-1 []byte to string shouldn't need to allocate either. |
This can already be done within the runtime in convT2I. I don't believe it needs compiler assistance. |
How do we feel about smallish integers? Seems like we could profile in convT2I to see how this would pay off. One possible value for smallish is {2,1,0,-1} |
I feel like last time smallish integers was brought up, the concern was that it would be inconsistent and thus surprising. But with bools at least it would be consistent. Then again, the better the compiler and runtime get, the more that performance things get surprising, so maybe it's okay nowadays. I agree some convT2I stats for all this would be interesting. |
I was thinking we just have a 256-byte slab of "\x00\x01...\xff", and we can index into that for any read-only 1-byte value that needs to be converted to a pointer (e.g., converting a len-1 []byte to string, or storing a bool/byte/int8/uint8 into an interface). If we instead used [...]int64{..., -3, -2, -1, 0, 1, 2, 3, ...} for a certain range of integers, we could avoid heap allocs for a wider variety of values. Agreed collecting stats on recognizable patterns would be good. |
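The 256-byte slab idea above can be sketched in ordinary Go. This is only an illustration of the technique, not the runtime's code: `staticBytes` and `byteAddr` are hypothetical names, and a real implementation would live in the runtime or compiler and keep the table read-only.

```go
package main

import "fmt"

// staticBytes is a hypothetical shared table where staticBytes[i] == byte(i).
// It is filled once at program start and must never be written afterwards.
var staticBytes = func() (t [256]byte) {
	for i := range t {
		t[i] = byte(i)
	}
	return
}()

// byteAddr returns the address of a shared location that already holds b,
// so no new memory is needed to obtain a pointer to a 1-byte value.
func byteAddr(b byte) *byte {
	return &staticBytes[b]
}

func main() {
	p, q := byteAddr(1), byteAddr(1)
	fmt.Println(*p, p == q) // prints: 1 true
}
```

Any bool, byte, int8, or uint8 value maps to one of these 256 slots, which is why the approach has no performance cliff within 1-byte types.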
That was my exact reaction as well. |
I had a CL a long time ago which did [...]int64{0..1024}. We ended up not using it because the performance cliff from 1024->1025 would be weird. The 256-byte one would not have that problem, although it would weirdly encourage people to cram their data into 1 byte chunks. |
I'm also still in favor of this. I think the benefits outweigh the oddity in performance behavior. |
Clearly we just need to smooth out that cliff with some |
Byte-sized values handled in CL 35555. I went the compiler route because it was simple and meant that byte-size values required no runtime call at all, shrinking ns/op from ~5 to ~1. After handling byte-sized values and constant values (#18704), I instrumented the runtime to dump what integer-ish values made it into convT2E. During make.bash, I got this (unsurprising) distribution, top 30 values only:
That says to me we might get the best bang for our buck by inserting a check for integer-ish values of size <= uintptr in the generated code like:
	if val == 0 {
		e = {itab, &zerobase}
	} else {
		e = convT2E(...)
	} |
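As a rough user-level model of the zero check proposed above (an assumption-laden sketch, not the generated code or the runtime): `eface`, `zeroBase`, `heapAllocs`, and `convInt64` are hypothetical stand-ins, and the type word is modeled as a plain string.

```go
package main

import "fmt"

// eface is a hypothetical model of a two-word empty interface:
// a type word and a pointer to the value.
type eface struct {
	typ  string // stand-in for the real type pointer
	data *int64
}

// zeroBase is a shared zero value that is never written.
var zeroBase int64

// heapAllocs counts how many times the model had to allocate.
var heapAllocs int

// convInt64 models the conversion: zero values reuse zeroBase,
// everything else gets a fresh allocation.
func convInt64(v int64) eface {
	if v == 0 {
		return eface{typ: "int64", data: &zeroBase}
	}
	heapAllocs++
	p := new(int64)
	*p = v
	return eface{typ: "int64", data: p}
}

func main() {
	a, b := convInt64(0), convInt64(0)
	c := convInt64(42)
	fmt.Println(a.data == b.data, *c.data, heapAllocs) // prints: true 42 1
}
```

The model only shows where the allocation is skipped; it says nothing about the cost of the extra branch, which is what the benchmarks in this thread measure.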
Is it impossible to reintroduce inline values in interfaces?
|
It could be done if we increased the size of interfaces to 3 words. Otherwise, I'm afraid that there are a host of complications (mostly in the garbage collector) which make sometimes-a-pointer-sometimes-a-scalar fields hard. I can elaborate more if you wish. |
Lots of prior discussion in #8405. |
CL 35563 implements the strategy "use zerobase for small pointer-free zero values". It could probably be optimized a bit, but it's a first look at the cost/benefit of doing it entirely in the runtime. Checking While running the stdlib tests, CL 35563 eliminated roughly 170,000 allocations. Increasing the size of zerobase did not show any improvement; larger pointer-free zero values sent to convT2E/convT2I are rare. |
And with that, I'm going to pause my hacking on interface conversions. I hope that making the code and numbers concrete will help the conversation about the correct direction to go from here. |
CL https://golang.org/cl/35555 mentions this issue. |
CL https://golang.org/cl/35563 mentions this issue. |
I'm thinking that we haven't completely explored the entire design space
yet.
For example, we can make interface values always allocate anew on the heap,
so that every interface value is a single pointer.
Then we can do inline data. In fact, we can even inline non-intptr sized
data. This solution also eliminates the classical data race of racing
with interfaces and enables RCU style interface manipulations.
e.g. storing a string into an eface will allocate this struct on the heap:
struct {
type *type
// inlined string
str *byte
len int
}
This might not be a net win, but my point is we haven't exhausted the
entire design space yet.
|
… allocation
Based in part on khr's CL 2500.
Updates #17725
Updates #18121
Change-Id: I744e1f92fc2104e6c5bd883a898c30b2eea8cc31
Reviewed-on: https://go-review.googlesource.com/35555
Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Keith Randall <khr@golang.org>
How can a static enum from 0x00 to 0xFF help? |
Bools are represented as 0 or 1, so they can use the static values for 0 and 1. |
I know that, and thanks for your reply, but I still do not understand the theory behind this optimisation. |
I plan to write a blog post (commaok.xyz) sometime in the next few days that explains a lot of this. |
I just want to finish CL 35563 first, and that will take some experimentation. |
@ancaemanuel an interface is two words: a pointer to a type and a pointer to a value. To store "12" in an interface, you need to have a position in memory that holds the value "12", to be pointed to by the interface's second word. Up until now, that position in memory would be allocated on the heap, every time anew. With this patch, for special values between 0 and 255, the compiler knows that there is a location in memory that already holds those values and uses it, rather than allocating on the heap. |
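One way to observe the behavior described above is to count allocations with `testing.AllocsPerRun`. This is a minimal sketch: `sink` and `allocsFor` are names made up for the example, and the reported counts depend on the Go toolchain version, so no expected output is shown.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is a package-level interface so the stored value escapes
// and the compiler cannot optimize the conversion away.
var sink interface{}

// allocsFor reports the average number of heap allocations needed
// to store the non-constant value v into an interface.
func allocsFor(v int) float64 {
	return testing.AllocsPerRun(1000, func() {
		sink = v
	})
}

func main() {
	for _, v := range []int{0, 12, 255, 100000} {
		fmt.Printf("storing %d: %.0f allocs\n", v, allocsFor(v))
	}
}
```

Comparing the counts for small and large values on a given toolchain shows which values it can store without a fresh heap allocation.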
@ancaemanuel perhaps http://commaok.xyz/post/interface-allocs/ will also provide some useful background and context, though you're probably better off reading https://research.swtch.com/interfaces :) |
Thank you. |
@ancaemanuel if I understand correctly, that is https://golang.org/issue/18704, which is done. (This is also discussed a bit more readably at http://commaok.xyz/post/interface-allocs/.) |
I read the article. |
I'm not sure I understand. Would you file a new issue with some more details and cc me and we can discuss there? Thanks! |
Fixes golang#17725

name                old time/op    new time/op    delta
ConvT2EInt/const-8  0.90ns ± 4%    0.90ns ± 2%    ~         (p=0.623 n=20+15)
ConvT2EInt/zero-8   21.5ns ± 2%    7.4ns ± 6%     -65.33%   (p=0.000 n=19+18)
ConvT2EInt/one-8    21.6ns ± 2%    22.8ns ± 2%    +5.21%    (p=0.000 n=20+19)

name                old alloc/op   new alloc/op   delta
ConvT2EInt/const-8  0.00B          0.00B          ~         (all equal)
ConvT2EInt/zero-8   8.00B ± 0%     0.00B          -100.00%  (p=0.000 n=20+20)
ConvT2EInt/one-8    8.00B ± 0%     8.00B ± 0%     ~         (all equal)

name                old allocs/op  new allocs/op  delta
ConvT2EInt/const-8  0.00           0.00           ~         (all equal)
ConvT2EInt/zero-8   1.00 ± 0%      0.00           -100.00%  (p=0.000 n=20+20)
ConvT2EInt/one-8    1.00 ± 0%      1.00 ± 0%      ~         (all equal)

Change-Id: I5b71f9e44e3de8b8f2284a3821c5176c13ab2c61
This is fixed with 03583675 |
Shoving a bool into an interface probably doesn't need to allocate. I imagine most binaries already have a static 1 byte and a static 0 byte somewhere whose address we could use in the interface's data pointer.
/cc @josharian @randall77 @mdempsky