Author: Bryan C. Mills (with substantial input from Russ Cox, Jay Conrod, and Michael Matloob)
Last updated: 2020-02-20
Discussion at https://golang.org/issue/36460.
We propose to change `cmd/go` to avoid loading transitive module dependencies that have no observable effect on the packages to be built.
The key insights that lead to this approach are:
If no package in a given dependency module is ever (even transitively) imported by any package loaded by an invocation of the `go` command, then an incompatibility between any package in that dependency and any other package has no observable effect in the resulting program(s). Therefore, we can safely ignore the (transitive) requirements of any module that does not contribute any package to the build.
We can use the explicit requirements of the main module as a coarse filter on the set of modules relevant to the main module and to previous invocations of the `go` command.
Based on those insights, we propose to change the `go` command to retain more transitive dependencies in `go.mod` files and to avoid loading `go.mod` files for “irrelevant” modules, while still maintaining high reproducibility for build and test operations.
In the initial implementation of modules, we attempted to make `go mod tidy` prune out of the `go.mod` file any module that did not provide a transitive import of the main module. However, that did not always preserve the remaining build list: a module that provided no packages might still raise the minimum requirement on some other module that did provide a package.
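The failure mode described above can be reproduced with a very small model. The sketch below is only an illustration under stated assumptions: the module paths are hypothetical, versions are plain integers rather than semantic versions, and `buildList` is a simplified stand-in for minimal version selection, not the `cmd/go` implementation.

```go
package main

import "fmt"

// req is a requirement on a module at a minimum version.
// Versions are plain integers to keep the sketch short.
type req struct {
	path    string
	version int
}

// graph maps "path@vN" to the requirements listed by that module version.
type graph map[string][]req

func key(r req) string { return fmt.Sprintf("%s@v%d", r.path, r.version) }

// buildList walks the requirement graph from the roots and keeps the
// highest version seen for each module path (a simplified MVS).
func buildList(g graph, roots []req) map[string]int {
	selected := map[string]int{}
	seen := map[string]bool{}
	var visit func(r req)
	visit = func(r req) {
		if seen[key(r)] {
			return
		}
		seen[key(r)] = true
		if r.version > selected[r.path] {
			selected[r.path] = r.version
		}
		for _, dep := range g[key(r)] {
			visit(dep)
		}
	}
	for _, r := range roots {
		visit(r)
	}
	return selected
}

func main() {
	g := graph{
		// a provides a package imported by the main module.
		"example.com/a@v1": {{"example.com/c", 1}},
		// b provides no imported package, but raises the requirement on c.
		"example.com/b@v1": {{"example.com/c", 2}},
	}
	withB := buildList(g, []req{{"example.com/a", 1}, {"example.com/b", 1}})
	withoutB := buildList(g, []req{{"example.com/a", 1}}) // b pruned as unused
	fmt.Printf("with b: c@v%d, without b: c@v%d\n",
		withB["example.com/c"], withoutB["example.com/c"])
	// prints "with b: c@v2, without b: c@v1"
}
```

Dropping the requirement on `example.com/b` silently lowers the selected version of `example.com/c`, even though `b` contributes no package to the build.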
We addressed that problem in CL 121304 by explicitly retaining requirements on all modules that provide directly-imported packages, as well as a minimal set of module requirement roots needed to retain the selected versions of transitively-imported packages.
In #29773 and #31248, we realized that, because the `go.mod` file is pruned to remove indirect dependencies already implied by other requirements, we must load the `go.mod` file for all versions of dependencies, even if we know that they will not be selected, and even including the main module itself!
In #30831 and #34016, we learned that following deep history makes problematic dependencies very difficult to completely eliminate. If the repository containing a module is no longer available and the module is not cached in a module mirror, then we will encounter an error when loading any module — even a very old, irrelevant one! — that required it.
In #26904, #32058, #33370, and #34417, we found that the need to consider every version of a module separately, rather than only the selected version, makes the `replace` directive difficult to understand, difficult to use correctly, and generally more complex than we would like it to be.
In addition, users have repeatedly expressed the desire to avoid the cognitive overhead of seeing “irrelevant” transitive dependencies (#26955, #27900, #32380), reasoning about older-than-selected transitive dependencies (#36369), and fetching large numbers of `go.mod` files (#33669, #29935).
In this proposal, we aim to achieve a property that we call lazy loading:
In the steady state, an invocation of the `go` command should not load any `go.mod` file or source code for a module (other than the main module) that provides no packages loaded by that invocation.
In particular, if the selected version of a module is not changed by a `go` command, the `go` command should not load a `go.mod` file or source code for any other version of that module.
We also want to preserve reproducibility of `go` command invocations:
An invocation of the `go` command should either load the same version of each package as every other invocation since the last edit to the `go.mod` file, or should edit the `go.mod` file in a way that causes the next invocation on any subset of the same packages to use the same versions.
We propose that, when the main module's `go.mod` file specifies `go 1.15` or higher, every invocation of the `go` command should update the `go.mod` file to maintain three invariants.
(The import invariant.) The main module's `go.mod` file explicitly requires the selected version of every module that contains one or more packages that were transitively imported by any package in the main module.
(The argument invariant.) The main module's `go.mod` file explicitly requires the selected version of every module that contains one or more packages that matched an explicit package pattern argument.
(The completeness invariant.) The version of every module that contributed any package to the build is recorded in the `go.mod` file of either the main module itself or one of the modules it requires explicitly.
The completeness invariant alone is sufficient to ensure reproducibility and lazy loading. However, it is under-constrained: there are potentially many minimal sets of requirements that satisfy the completeness invariant, and even more valid solutions. The import and argument invariants guide us toward a specific solution that is simple and intuitive to explain in terms of the `go` commands invoked by the user.
If the main module satisfies the import and argument invariants, and all explicit module dependencies also satisfy the import invariant, then the completeness invariant is also trivially satisfied. Given those, the completeness invariant exists only in order to tolerate incomplete dependencies.
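As a rough illustration of the import and argument invariants, the sketch below reports which modules would need to be added to the main module's `go.mod` file. The function name `missingRequirements` and its three map parameters are hypothetical and exist only for this example; the sample data assumes modules `a`, `b`, and `c` where `c` provides a newly imported package but is not yet required.

```go
package main

import (
	"fmt"
	"sort"
)

// missingRequirements lists modules that provide a loaded package but whose
// selected version is not explicitly required by the main module.
//
//	explicit:  module path -> version listed in the main module's go.mod
//	selected:  module path -> version selected for the build
//	providers: package path -> module providing it (main module excluded)
func missingRequirements(explicit, selected, providers map[string]string) []string {
	missing := map[string]bool{}
	for _, mod := range providers {
		if explicit[mod] != selected[mod] {
			missing[mod] = true
		}
	}
	out := make([]string, 0, len(missing))
	for mod := range missing {
		out = append(out, mod+"@"+selected[mod])
	}
	sort.Strings(out)
	return out
}

func main() {
	explicit := map[string]string{
		"example.com/a": "v0.1.0",
		"example.com/b": "v0.1.0",
	}
	selected := map[string]string{
		"example.com/a": "v0.1.0",
		"example.com/b": "v0.1.0",
		"example.com/c": "v0.1.0",
	}
	providers := map[string]string{
		"example.com/a/x": "example.com/a",
		"example.com/a/y": "example.com/a",
		"example.com/b":   "example.com/b",
		"example.com/c":   "example.com/c", // imported by a/y
	}
	fmt.Println(missingRequirements(explicit, selected, providers))
	// prints "[example.com/c@v0.1.0]"
}
```

An empty result means the invariant already holds for the loaded packages, so the `go.mod` file needs no edit.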
If the import invariant or argument invariant holds at the start of a `go` invocation, we can trivially preserve that invariant (without loading any additional packages or modules) at the end of the invocation by updating the `go.mod` file with explicit versions for all module paths that were already present, in addition to any new main-module imports or package arguments found during the invocation.
At the start of each operation, we load all of the explicit requirements from the main module's `go.mod` file.
If we encounter an import from any module that is not already explicitly required by the main module, we perform a deepening scan. To perform a deepening scan, we read the `go.mod` file for each module explicitly required by the main module, and add its requirements to the build list. If any explicitly-required module uses `go 1.14` or earlier, we also read the `go.mod` files for all of that module's (transitive) module dependencies.
(The deepening scan allows us to detect changes to the import graph without loading the whole graph explicitly: if we encounter a new import from within a previously-irrelevant package, the deepening scan will re-read the requirements of the module containing that package, and will ensure that the selected version of that import is compatible with all other relevant packages.)
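The deepening scan can be sketched in a few lines. This is a simplified model, not the `cmd/go` implementation: versions are plain integers, the `files` map is a hypothetical in-memory stand-in for the module cache, and the `lazy` field stands for "this `go.mod` file specifies `go 1.15` or higher".

```go
package main

import (
	"fmt"
	"sort"
)

// req names a module at a version (an integer, for brevity).
type req struct {
	path    string
	version int
}

// modFile is a pared-down go.mod file.
type modFile struct {
	lazy     bool  // true if the file specifies go 1.15 or higher
	requires []req // requirements listed directly in the file
}

// deepeningScan returns the build list (highest version seen per module
// path) and the sorted names of the go.mod files that had to be read.
func deepeningScan(files map[req]modFile, mainReqs []req) (map[string]int, []string) {
	selected := map[string]int{}
	read := map[req]bool{}
	expanded := map[req]bool{}

	add := func(r req) {
		if r.version > selected[r.path] {
			selected[r.path] = r.version
		}
	}

	// readAll follows requirements transitively, as needed for modules
	// that use go 1.14 or earlier.
	var readAll func(r req)
	readAll = func(r req) {
		if expanded[r] {
			return
		}
		expanded[r] = true
		read[r] = true
		for _, dep := range files[r].requires {
			add(dep)
			readAll(dep)
		}
	}

	for _, r := range mainReqs {
		add(r)
		if !files[r].lazy {
			readAll(r)
			continue
		}
		// Lazy module: read only its own go.mod file.
		read[r] = true
		for _, dep := range files[r].requires {
			add(dep)
		}
	}

	var names []string
	for r := range read {
		names = append(names, fmt.Sprintf("%s@v%d", r.path, r.version))
	}
	sort.Strings(names)
	return selected, names
}

func main() {
	a := req{"example.com/a", 1}
	b := req{"example.com/b", 1}
	c := req{"example.com/c", 1}
	for _, lazy := range []bool{true, false} {
		files := map[req]modFile{
			a: {lazy: lazy, requires: []req{b}},
			b: {lazy: lazy, requires: []req{c}},
			c: {lazy: lazy},
		}
		selected, read := deepeningScan(files, []req{a})
		fmt.Println("lazy:", lazy, "build list:", selected, "go.mod files read:", read)
	}
}
```

With a chain `a` requires `b` requires `c`, the lazy case reads only the `go.mod` file of `a` and adds `b` to the build list, while the `go 1.14` case reads all three files and also adds `c`.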
As we load each imported package, we also read the `go.mod` file for the module containing that package and add its requirements to the build list, even if that version of the module was already explicitly required by the main module.
(This step is theoretically redundant: the requirements of the main module will already reflect any relevant dependencies, and the deepening scan will catch any previously-irrelevant dependencies that subsequently become relevant. However, reading the `go.mod` file for each imported package makes the `go` command much more robust to inconsistencies in the `go.mod` file, including manual edits, erroneous version-control merge resolutions, incomplete dependencies, and changes in `replace` directives and replacement directory contents.)
If, after the deepening scan, the package to be imported is still not found in any module in the build list, we resolve the `latest` version of a module containing that package and add it to the build list (following the same search procedure as in Go 1.14), then perform another deepening scan (this time including the newly-added module) to ensure consistency.
`all` pattern and `mod` subcommands
In module mode in Go 1.11–1.14, the `all` package pattern matches each package reachable by following imports and tests of imported packages recursively, starting from the packages in the main module. (It is equivalent to the set of packages obtained by iterating `go list -deps -test ./...` over its own output until it reaches a fixed point.)
`go mod tidy` adjusts the `go.mod` and `go.sum` files so that the main module transitively requires a set of modules that provide every package matching the `all` package pattern, independent of build tags. After `go mod tidy`, every package matching the `all` package pattern is provided by some module matching the `all` module pattern.
`go mod tidy` also updates a set of `// indirect` comments indicating versions added or upgraded beyond what is implied by transitive dependencies.
`go mod download` downloads all modules matching the `all` module pattern, which normally includes a module providing every package in the `all` package pattern.
In contrast, `go mod vendor` copies in only the subset of packages transitively imported by the packages and tests in the main module: it does not scan the imports of tests outside of the main module, even if those tests are for imported packages. (That is: `go mod vendor` only covers the packages directly reported by `go list -deps -test ./...`.)
As a result, when using `-mod=vendor` the `all` and `...` patterns may match substantially fewer packages than when using `-mod=mod` (the default) or `-mod=readonly`.
`all` package pattern and `go mod tidy`
We would like to preserve the property that, after `go mod tidy`, invocations of the `go` command, including `go test`, are reproducible (without changing the `go.mod` file) for every package matching the `all` package pattern. The completeness invariant is what ensures reproducibility, so `go mod tidy` must ensure that it holds.
Unfortunately, even if the import invariant holds for all of the dependencies of the main module, the current definition of the `all` pattern includes dependencies of tests of dependencies, recursively. In order to establish the completeness invariant for distant test-of-test dependencies, `go mod tidy` would sometimes need to record a substantial number of dependencies of tests found outside of the main module in the main module's `go.mod` file.
Fortunately, we can omit those distant dependencies a different way: by changing the definition of the `all` pattern itself, so that test-of-test dependencies are no longer included. Feedback from users (in #29935, #26955, #32380, #32419, #33669, and perhaps others) has consistently favored omitting those dependencies, and narrowing the `all` pattern would also establish a nice new property: after running `go mod vendor`, the `all` package pattern with `-mod=vendor` would now match the `all` pattern with `-mod=mod`.
Taking those considerations into account, we propose that the `all` package pattern in module mode should match only the packages transitively imported by packages and tests in the main module: that is, exactly the set of packages preserved by `go mod vendor`. Since the `all` pattern is based on package imports (more-or-less independent of module dependencies), this change should be independent of the `go` version specified in the `go.mod` file.
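To make the difference between the two definitions concrete, the following sketch computes both over a toy import graph. The `pkg` type, the `allPattern` function, and the package paths are hypothetical, chosen only for this illustration. The `testsOfDeps` flag switches between the Go 1.11–1.14 behavior (follow test imports of every reached package) and the proposed behavior (follow test imports only for packages in the main module).

```go
package main

import (
	"fmt"
	"sort"
)

// pkg describes one package in a toy import graph.
type pkg struct {
	inMain      bool     // package belongs to the main module
	imports     []string // imports of non-test files
	testImports []string // additional imports of test files
}

// allPattern returns the sorted set of packages matched by "all".
// With testsOfDeps set, test imports are followed for every package reached
// (the Go 1.11-1.14 definition); otherwise only for main-module packages
// (the proposed definition).
func allPattern(pkgs map[string]pkg, testsOfDeps bool) []string {
	seen := map[string]bool{}
	var visit func(path string)
	visit = func(path string) {
		if seen[path] {
			return
		}
		seen[path] = true
		p := pkgs[path]
		for _, imp := range p.imports {
			visit(imp)
		}
		if p.inMain || testsOfDeps {
			for _, imp := range p.testImports {
				visit(imp)
			}
		}
	}
	for path, p := range pkgs {
		if p.inMain {
			visit(path)
		}
	}
	out := make([]string, 0, len(seen))
	for path := range seen {
		out = append(out, path)
	}
	sort.Strings(out)
	return out
}

func main() {
	pkgs := map[string]pkg{
		"example.com/lazy": {inMain: true, imports: []string{"example.com/a/x"}},
		"example.com/a/x":  {testImports: []string{"example.com/b"}},
		"example.com/b":    {testImports: []string{"example.com/c"}},
		"example.com/c":    {},
	}
	fmt.Println("Go 1.11-1.14 all:", allPattern(pkgs, true))
	fmt.Println("proposed all:    ", allPattern(pkgs, false))
}
```

In this graph the older definition reaches `example.com/b` and `example.com/c` through tests of dependencies, while the proposed definition stops at `example.com/a/x`.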
The behavior of `go mod tidy` should change depending on the `go` version. In a module that specifies `go 1.15` or later, `go mod tidy` should scan the packages matching the new definition of `all`, ignoring build tags. In a module that specifies `go 1.14` or earlier, it should continue to scan the packages matching the old definition (still ignoring build tags). (Note that both of those sets are supersets of the new `all` pattern.)
`all` and `...` module patterns and `go mod download`
In Go 1.11–1.14, the `all` module pattern matches each module reachable by following module requirements recursively, starting with the main module and visiting every version of every module encountered. The module pattern `...` has the same behavior.
The `all` module pattern is important primarily because it is the default set of modules downloaded by the `go mod download` subcommand, which sets up the local cache for offline use. However, it (along with `...`) is also currently used by a few other tools (such as `go doc`) to locate “modules of interest” for other purposes.
Unfortunately, these patterns as defined in Go 1.11–1.14 are not compatible with lazy loading: they examine transitive `go.mod` files without loading any packages. Therefore, in order to achieve lazy loading we must change their behavior.
Since we want to compute the list of modules without loading any packages or irrelevant `go.mod` files, we propose that when the main module's `go.mod` file specifies `go 1.15` or higher, the `all` and wildcard module patterns should match only those modules found in a deepening scan of the main module's dependencies. That definition includes every module whose version is reproducible due to the completeness invariant, including modules needed by tests of transitive imports.
With this redefinition of the `all` module pattern, and the above redefinition of the `all` package pattern, we again have the property that, after `go mod tidy && go mod download all`, invoking `go test` on any package within `all` does not need to download any new dependencies.
Since the `all` pattern includes every module encountered in the deepening scan, rather than only those that provide imported packages, `go mod download` may continue to download more source code than is strictly necessary to build the packages in `all`. However, as is the case today, users may download only that narrower set as a side effect of invoking `go list all`.
`go.mod` size
Under this approach, the set of modules recorded in the `go.mod` file would in most cases increase beyond the set recorded in Go 1.14. However, the set of modules recorded in the `go.sum` file would decrease: irrelevant modules would no longer be included.
The modules recorded in `go.mod` under this proposal would be a strict subset of the set of modules recorded in `go.sum` in Go 1.14.
(The `go` command would still not require a separate “manifest” file, and unlike a lock file, the `go.mod` file would still be updated automatically to reflect new requirements discovered during package loading.)
For modules with few test-of-test dependencies, the `go.mod` file after running `go mod tidy` will typically be larger than in Go 1.14. For modules with many test-of-test dependencies, it may be substantially smaller.
For modules that are tidy:
The module versions recorded in the `go.mod` file would be exactly those listed in `vendor/modules.txt`, if present.
The module versions recorded in `vendor/modules.txt` would be the same as under Go 1.14, although the `## explicit` annotations could perhaps be removed (because all relevant dependencies would be recorded explicitly).
The module versions recorded in the `go.sum` file would be exactly those listed in the `go.mod` file.
The `go.mod` file syntax and semantics proposed here are backward compatible with previous Go releases: all `go.mod` files for existing `go` versions would retain their current meaning.
Under this proposal, a `go.mod` file that specifies `go 1.15` or higher will cause the `go` command to lazily load the `go.mod` files for its requirements. When reading a `go 1.15` file, previous versions of the `go` command (which do not prune irrelevant dependencies) may select higher versions than those selected under this proposal, by following otherwise-irrelevant dependency edges. However, because the `require` directive continues to specify a minimum version for the required dependency, a previous version of the `go` command will never select a lower version of any dependency.
Moreover, any strategy that prunes out a dependency as interpreted by a previous `go` version will continue to prune out that dependency as interpreted under this proposal: module maintainers will not be forced to break users on new `go` versions in order to support users on older versions (or vice-versa).
Versions of the `go` command before 1.14 do not preserve the proposed invariants for the main module: if a `go` command from before 1.14 is run in a `go 1.15` module, it may automatically remove requirements that are now needed. However, as a result of CL 204878, `go` version 1.14 does preserve those invariants in all subcommands except for `go mod tidy`: Go 1.14 users will be able to work (in a limited fashion) within a Go 1.15 main module without disrupting its invariants.
`bcmills` is working on a prototype of this design for `cmd/go` in Go 1.15.
At this time, we do not believe that any other tooling changes will be needed.
Because `go mod tidy` will now preserve seemingly-redundant requirements, we may find that we want to expand or update the `// indirect` comments that it currently manages. For example, we may want to indicate “indirect dependencies at implied versions” separately from “upgraded or potentially-unused indirect dependencies”, and we may want to indicate “direct or indirect dependencies of tests” separately from “direct or indirect dependencies of non-tests”.
Since these comments do not have a semantic effect, we can fine-tune them after implementation (based on user feedback) without breaking existing modules.
The following examples illustrate the proposed behavior using the `cmd/go` script test format. For local testing and exploration, the test files can be extracted using the `txtar` tool.

    cp go.mod go.mod.old
    go mod tidy
    cmp go.mod go.mod.old

    # Before adding a new import, the go.mod file should
    # enumerate modules for all packages already imported.
    go list all
    cmp go.mod go.mod.old

    # When a new import is found, we should perform a deepening scan of the existing
    # dependencies and add a requirement on the version required by those
    # dependencies, not re-resolve 'latest'.
    cp lazy.go.new lazy.go
    go list all
    cmp go.mod go.mod.new

    -- go.mod --
    module example.com/lazy

    go 1.15

    require (
        example.com/a v0.1.0
        example.com/b v0.1.0 // indirect
    )

    replace (
        example.com/a v0.1.0 => ./a
        example.com/b v0.1.0 => ./b
        example.com/c v0.1.0 => ./c1
        example.com/c v0.2.0 => ./c2
    )
    -- lazy.go --
    package lazy

    import (
        _ "example.com/a/x"
    )
    -- lazy.go.new --
    package lazy

    import (
        _ "example.com/a/x"
        _ "example.com/a/y"
    )
    -- go.mod.new --
    module example.com/lazy

    go 1.15

    require (
        example.com/a v0.1.0
        example.com/b v0.1.0 // indirect
        example.com/c v0.1.0 // indirect
    )

    replace (
        example.com/a v0.1.0 => ./a
        example.com/b v0.1.0 => ./b
        example.com/c v0.1.0 => ./c1
        example.com/c v0.2.0 => ./c2
    )
    -- a/go.mod --
    module example.com/a

    go 1.15

    require (
        example.com/b v0.1.0
        example.com/c v0.1.0
    )
    -- a/x/x.go --
    package x

    import _ "example.com/b"
    -- a/y/y.go --
    package y

    import _ "example.com/c"
    -- b/go.mod --
    module example.com/b

    go 1.15
    -- b/b.go --
    package b
    -- c1/go.mod --
    module example.com/c

    go 1.15
    -- c1/c.go --
    package c
    -- c2/go.mod --
    module example.com/c

    go 1.15
    -- c2/c.go --
    package c


    cp go.mod go.mod.old
    go mod tidy
    cmp go.mod go.mod.old

    # 'go list -m all' should include modules that cover the test dependencies of
    # the packages imported by the main module, found via a deepening scan.
    go list -m all
    stdout 'example.com/b v0.1.0'
    ! stdout example.com/c
    cmp go.mod go.mod.old

    # 'go test' of any package in 'all' should use its existing dependencies without
    # updating the go.mod file.
    go list all
    stdout example.com/a/x
    go test example.com/a/x
    cmp go.mod go.mod.old

    -- go.mod --
    module example.com/lazy

    go 1.15

    require example.com/a v0.1.0

    replace (
        example.com/a v0.1.0 => ./a
        example.com/b v0.1.0 => ./b1
        example.com/b v0.2.0 => ./b2
        example.com/c v0.1.0 => ./c
    )
    -- lazy.go --
    package lazy

    import (
        _ "example.com/a/x"
    )
    -- a/go.mod --
    module example.com/a

    go 1.15

    require example.com/b v0.1.0
    -- a/x/x.go --
    package x
    -- a/x/x_test.go --
    package x

    import (
        "testing"

        _ "example.com/b"
    )

    func TestUsingB(t *testing.T) {
        // …
    }
    -- b1/go.mod --
    module example.com/b

    go 1.15

    require example.com/c v0.1.0
    -- b1/b.go --
    package b
    -- b1/b_test.go --
    package b

    import _ "example.com/c"
    -- b2/go.mod --
    module example.com/b

    go 1.15

    require example.com/c v0.1.0
    -- b2/b.go --
    package b
    -- b2/b_test.go --
    package b

    import _ "example.com/c"
    -- c/go.mod --
    module example.com/c

    go 1.15
    -- c/c.go --
    package c


    cp go.mod go.mod.old
    go mod tidy
    cmp go.mod go.mod.old

    # 'go list -m all' should include modules that cover the test dependencies of
    # the packages imported by the main module, found via a deepening scan.
    go list -m all
    stdout 'example.com/b v0.1.0'
    cmp go.mod go.mod.old

    # 'go test all' should use those existing dependencies without updating the
    # go.mod file.
    go test all
    cmp go.mod go.mod.old

    -- go.mod --
    module example.com/lazy

    go 1.15

    require (
        example.com/a v0.1.0
    )

    replace (
        example.com/a v0.1.0 => ./a
        example.com/b v0.1.0 => ./b1
        example.com/b v0.2.0 => ./b2
        example.com/c v0.1.0 => ./c
    )
    -- lazy.go --
    package lazy

    import (
        _ "example.com/a/x"
    )
    -- a/go.mod --
    module example.com/a

    go 1.15

    require (
        example.com/b v0.1.0
    )
    -- a/x/x.go --
    package x
    -- a/x/x_test.go --
    package x

    import (
        "testing"

        _ "example.com/b"
    )

    func TestUsingB(t *testing.T) {
        // …
    }
    -- b1/go.mod --
    module example.com/b

    go 1.15
    -- b1/b.go --
    package b
    -- b1/b_test.go --
    package b

    import _ "example.com/c"
    -- b2/go.mod --
    module example.com/b

    go 1.15

    require (
        example.com/c v0.1.0
    )
    -- b2/b.go --
    package b
    -- b2/b_test.go --
    package b

    import _ "example.com/c"
    -- c/go.mod --
    module example.com/c

    go 1.15
    -- c/c.go --
    package c

`go 1.14` dependency

    cp go.mod go.mod.old
    go mod tidy
    cmp go.mod go.mod.old

    # 'go list -m all' should include modules that cover the test dependencies of
    # the packages imported by the main module, found via a deepening scan.
    go list -m all
    stdout 'example.com/b v0.1.0'
    stdout 'example.com/c v0.1.0'
    cmp go.mod go.mod.old

    # 'go test' of any package in 'all' should use its existing dependencies without
    # updating the go.mod file.
    #
    # In order to satisfy reproducibility for the loaded packages, the deepening
    # scan must follow the transitive module dependencies of 'go 1.14' modules.
    go list all
    stdout example.com/a/x
    go test example.com/a/x
    cmp go.mod go.mod.old

    -- go.mod --
    module example.com/lazy

    go 1.15

    require example.com/a v0.1.0

    replace (
        example.com/a v0.1.0 => ./a
        example.com/b v0.1.0 => ./b
        example.com/c v0.1.0 => ./c1
        example.com/c v0.2.0 => ./c2
    )
    -- lazy.go --
    package lazy

    import (
        _ "example.com/a/x"
    )
    -- a/go.mod --
    module example.com/a

    go 1.14

    require example.com/b v0.1.0
    -- a/x/x.go --
    package x
    -- a/x/x_test.go --
    package x

    import (
        "testing"

        _ "example.com/b"
    )

    func TestUsingB(t *testing.T) {
        // …
    }
    -- b/go.mod --
    module example.com/b

    go 1.14

    require example.com/c v0.1.0
    -- b/b.go --
    package b

    import _ "example.com/c"
    -- c1/go.mod --
    module example.com/c

    go 1.14
    -- c1/c.go --
    package c
    -- c2/go.mod --
    module example.com/c

    go 1.14
    -- c2/c.go --
    package c
