x/pkgsite: searching exact standard library package names · Issue #64358 · golang/go

x/pkgsite: searching exact standard library package names #64358

Open
Deleplace opened this issue Nov 23, 2023 · 11 comments
Labels: pkgsite/search (Issues related to pkg.go.dev search functionality), pkgsite

Comments

@Deleplace
Contributor

Deleplace commented Nov 23, 2023

What is the URL of the page with the issue?

https://pkg.go.dev/search?q=path&m=

What is your user agent?

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36

What did you do?

Searched for the exact full name of a standard library package: "path"

What did you expect to see?

https://pkg.go.dev/path should be the first result

What did you see instead?

There are many results, but https://pkg.go.dev/path is not in the list at all.

General problem

This is similar to #41369 but more specific: it covers the cases where an exact search term fails to show the relevant standard library package as the 1st result.

The following rule should always hold:

IF a package exists in the standard library with name FOO, AND a user is searching the exact term "FOO" in pkg.go.dev, THEN the standard library package FOO must be the 1st result in the list

I tried 284 such searches, here are my findings.

  • When the full package name contains a "/", e.g. "compress/bzip2", and the user searches for the exact full name "compress/bzip2", no result list is displayed. Instead, the page redirects to the correct stdlib package. This works 100% of the time.
  • When the full package name is a single word, e.g. "time", and the user searches for that exact word, a result list is displayed. The correct stdlib package is most often the 1st result, with 5 annoying exceptions.
  • When the full package name contains a "/", e.g. "compress/bzip2", and the user searches for the exact last word, e.g. "bzip2", a result list is displayed. The correct stdlib package is often the 1st result, with 16 annoying exceptions.
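
The three cases above could be enumerated with a small helper like the following. This is only a sketch: `searchURL` and `queriesFor` are hypothetical names, and the URL shape is copied from the search link at the top of this report rather than from any documented API.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// searchURL builds a pkg.go.dev search URL for the query, mirroring the
// shape of the link in the report (https://pkg.go.dev/search?q=path&m=).
func searchURL(query string) string {
	return "https://pkg.go.dev/search?q=" + url.QueryEscape(query) + "&m="
}

// queriesFor returns the exact queries worth trying for one standard
// library import path: the full path, plus its last element when the
// path contains a slash.
func queriesFor(importPath string) []string {
	qs := []string{importPath}
	if strings.Contains(importPath, "/") {
		qs = append(qs, path.Base(importPath))
	}
	return qs
}

func main() {
	for _, p := range []string{"time", "path", "compress/bzip2", "runtime/metrics"} {
		for _, q := range queriesFor(p) {
			fmt.Printf("%-16s %-16s %s\n", p, q, searchURL(q))
		}
	}
}
```

Each printed URL can then be opened by hand to check where the stdlib package lands in the result list.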

The results of these 5 single-word searches are buggy:

| Package | Search link | Result |
|---|---|---|
| path | search path | NOT FOUND |
| cmp | search cmp | 2nd pos |
| maps | search maps | 8th pos |
| plugin | search plugin | 2nd pos |
| slices | search slices | 4th pos |

The results of these 16 last-word searches are buggy too, according to the rule:

| Package | Search link | Result |
|---|---|---|
| runtime/coverage | search coverage | NOT FOUND |
| go/build/constraint | search constraint | 3rd pos |
| go/constant | search constant | 3rd pos |
| go/doc/comment | search comment | 6th pos |
| go/format | search format | 2nd pos |
| go/importer | search importer | 2nd pos |
| go/types | search types | 4th pos |
| image/color | search color | 2nd pos |
| net/rpc | search rpc | 3rd pos |
| regexp/syntax | search syntax | 4th pos |
| runtime/cgo | search cgo | 2nd pos |
| runtime/metrics | search metrics | 21st pos |
| runtime/race | search race | 5th pos |
| runtime/trace | search trace | 5th pos |
| testing/slogtest | search slogtest | 2nd pos |
| text/template/parse | search parse | 11th pos |

@gopherbot added this to the Unreleased milestone Nov 23, 2023

@Deleplace
Contributor Author

I could clone go.googlesource.com/pkgsite, analyze the code, run the server locally and observe how the stdlib is not always prioritized.

However, I don't have the same datasources as production (postgres?) so a quick fix like "boost the Score of all stdlib results" might not be the best move.

@Deleplace
Contributor Author

cc @matloob @jamalc @tatianab

@komuw
Contributor

komuw commented Nov 23, 2023

@bcmills added the pkgsite/search label Nov 27, 2023
@hyangah
Contributor

hyangah commented Dec 7, 2023

According to https://pkg.go.dev/search-help

"If the package path you specified is complete enough, matching a full package import path, you will be brought directly to the details page for the latest version of that package."

Redirecting to the stdlib package automatically may be too risky, but at the very least I think we need to treat it correctly as an exact match and rank it highest.

cc @jba Do you want to track this separate from #41369?

@seankhliao
Member

Duplicate of #41369

@seankhliao marked this as a duplicate of #41369 Dec 29, 2024
@seankhliao closed this as not planned Dec 29, 2024
@jba
Contributor

jba commented Jan 17, 2025

I agree with @hyangah, this is a separate issue from #41369 and I consider it a bug.

@jba reopened this Jan 17, 2025
@jba
Contributor

jba commented Jan 17, 2025

Thanks @Deleplace for your careful investigation.

Currently pkgsite search does not implement the documented rule: "If the package path you specified is complete enough, matching a full package import path, you will be brought directly to the details page for the latest version of that package."

I'm not sure if that's really ideal, but this is actually a separate UI bug. As you can see below, the path package is mentioned when you search for "path".

[Image]

It gets "demoted" because of a post-processing step we perform on searches. To prevent many packages in the same module from dominating the results, we keep only the highest-scoring package in the module at top level, and mention the rest as "related packages." path/filepath scores higher than path because it's more popular, so it stays at top level.

The fix for this particular problem is to avoid demoting stdlib packages.
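
The demotion step and the proposed fix can be illustrated with a rough sketch. Everything here is an assumption made for illustration: the `result` type, the `groupByModule` function, the `keepStd` flag, the use of "std" as the module path, and the scores are invented and are not taken from the pkgsite code.

```go
package main

import (
	"fmt"
	"sort"
)

// result is a hypothetical, simplified search hit; pkgsite's real types differ.
type result struct {
	PackagePath string
	ModulePath  string // "std" stands for the standard library in this sketch
	Score       float64
	Related     []string // lower-scoring packages folded under this one
}

// groupByModule keeps only the highest-scoring package of each module at
// top level and records the others as related packages. When keepStd is
// true, standard library packages are never folded.
func groupByModule(results []result, keepStd bool) []result {
	sorted := append([]result(nil), results...)
	sort.SliceStable(sorted, func(i, j int) bool { return sorted[i].Score > sorted[j].Score })

	var out []result
	top := map[string]int{} // module path -> index in out of its best package
	for _, r := range sorted {
		if keepStd && r.ModulePath == "std" {
			out = append(out, r)
			continue
		}
		if i, ok := top[r.ModulePath]; ok {
			out[i].Related = append(out[i].Related, r.PackagePath)
			continue
		}
		top[r.ModulePath] = len(out)
		out = append(out, r)
	}
	return out
}

func main() {
	// Invented scores: path/filepath outranks path, as described above.
	hits := []result{
		{PackagePath: "path/filepath", ModulePath: "std", Score: 0.9},
		{PackagePath: "path", ModulePath: "std", Score: 0.8},
		{PackagePath: "example.com/m/path", ModulePath: "example.com/m", Score: 0.5},
	}
	for _, keepStd := range []bool{false, true} {
		fmt.Println("keepStd =", keepStd)
		for _, r := range groupByModule(hits, keepStd) {
			fmt.Printf("  %s related=%v\n", r.PackagePath, r.Related)
		}
	}
}
```

With `keepStd` false, "path" only shows up as a related package of "path/filepath"; with `keepStd` true it stays at top level.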

Now does it make sense for a search of "path" to take you directly to pkg.go.dev/path? I say no. Then you could never search for general terms like "path", "sync" or "math". The rule may make sense for search terms with at least one /. What do people think of that revised rule?
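
As a minimal sketch of the revised rule, assuming a hypothetical `shouldRedirect` helper and a plain map standing in for the package index (neither is pkgsite's actual API):

```go
package main

import (
	"fmt"
	"strings"
)

// shouldRedirect reports whether a query would skip the results page under
// the revised rule: only queries that contain at least one slash and
// exactly match a known import path redirect. knownPaths is a hypothetical
// stand-in for the real package index.
func shouldRedirect(query string, knownPaths map[string]bool) bool {
	q := strings.TrimSpace(query)
	return strings.Contains(q, "/") && knownPaths[q]
}

func main() {
	known := map[string]bool{"path": true, "path/filepath": true, "compress/bzip2": true}
	for _, q := range []string{"path", "path/filepath", "compress/bzip2", "foo/bar"} {
		fmt.Printf("%-16q redirect=%v\n", q, shouldRedirect(q, known))
	}
}
```

Under this rule a search for "path" still shows a result list, so general terms such as "sync" or "math" remain searchable.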

@seankhliao
Member

I think the revised rule makes sense.

Relatedly, can "std" be special-cased to mean the standard library? It would help, since the standard library doesn't have a common module prefix to match against.

@jba
Contributor

jba commented Jan 17, 2025

I mailed a CL to address the problem with "path" being hidden by "path/filepath". Now all stdlib packages will be at top level.

However, there may be more to do here:

  • Why is image/color second to github.com/fatih/color despite being more popular?
  • Why is go/constant third after math and unicode/utf8?

I suspect these two have to do with the way we turn paths into tokens, which may be too delicate to tweak, but I'll look into that. (By "too delicate to tweak" I mean that the effort it would take to verify that changes to that algorithm do more good than harm exceeds our work budget for pkgsite.)

Lastly, there is the rule that search terms with a slash that match exactly should visit the matched package directly, skipping search, with a special case for "std". That one probably needs its own issue.

@gopherbot
Contributor

Change https://go.dev/cl/643317 mentions this issue: internal/postgres: search doesn't group stdlib

gopherbot pushed a commit to golang/pkgsite that referenced this issue Jan 21, 2025
Search will display all standard library packages at top level.

Previously it grouped them based on the top-level directory, meaning
that some could be hidden. For example, searching for "path" appears
to omit the standard library package "path". In fact, it is nested
under "path/filepath", which is more popular.

With this change, "path" would appear second.

Updates golang/go#64358.

Change-Id: I2f35862ac514df0891b6f5b9055b6bf0a7ae37c7
Reviewed-on: https://go-review.googlesource.com/c/pkgsite/+/643317
Reviewed-by: Robert Findley <rfindley@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
kokoro-CI: kokoro <noreply+kokoro@google.com>
@jba
Contributor

jba commented Jan 27, 2025

https://go.dev/cl/644676 will implement the "std/" tweak to search queries, and document that queries without slashes don't redirect, even if they match a package.

I looked into some of the other search concerns that @Deleplace mentioned in the top post. Here is a rundown on some of them:

  • If you search for "constant", you see math and unicode/utf8 ahead of go/constant. That is because they are both far more popular than go/constant, and both mention "constant" in their package doc. Unfortunately, what they say is just "this is a package with some types and constants." Pkgsite's primitive search algorithm can't tell a sentence like that apart from one that says that the package is really about constants. I don't see that improving any time soon.

  • If you search for "cmp", you will get github.com/google/go-cmp/cmp before the stdlib cmp package, because it is slightly more popular and their synopses are similar. I expect that will change in the future.

  • The maps package is now third, not eighth, when you search for "maps". It is beaten by golang.org/x/exp/maps, which is still slightly more popular, but that will change. The number 1 result is the very popular k8s.io/apimachinery/pkg/runtime. But that is more a flaw in our popularity algorithm than search, which is a topic for another day.
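
The "constant" case in the rundown above can be illustrated with a toy ranking. This is not pkgsite's algorithm: the `pkg` type, `naiveScore`, the popularity weights and the paraphrased synopses are all invented for illustration. It only shows why a scorer that checks whether a term is present, then orders by popularity, cannot separate a passing mention from a package that is really about the term.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// pkg is a hypothetical search candidate. Popularity is an invented
// relative weight, not a real pkg.go.dev metric.
type pkg struct {
	Path       string
	Doc        string // paraphrased synopsis, for illustration only
	Popularity float64
}

// naiveScore returns the package's popularity if the term occurs anywhere
// in its path or doc, and 0 otherwise. It cannot tell a passing mention
// from a package that is really about the term.
func naiveScore(p pkg, term string) float64 {
	text := strings.ToLower(p.Path + " " + p.Doc)
	if strings.Contains(text, strings.ToLower(term)) {
		return p.Popularity
	}
	return 0
}

// rank returns the candidates ordered by descending naiveScore.
func rank(pkgs []pkg, term string) []pkg {
	out := append([]pkg(nil), pkgs...)
	sort.SliceStable(out, func(i, j int) bool {
		return naiveScore(out[i], term) > naiveScore(out[j], term)
	})
	return out
}

func main() {
	candidates := []pkg{
		{"go/constant", "values representing untyped Go constants", 10},
		{"math", "basic constants and mathematical functions", 100},
		{"unicode/utf8", "functions and constants for UTF-8 text", 80},
	}
	for i, p := range rank(candidates, "constant") {
		fmt.Printf("%d. %s\n", i+1, p.Path)
	}
}
```

With these made-up weights, go/constant ranks last even though it is the only package whose subject is constants.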

Fetched URL: https://github.com/golang/go/issues/64358
