
T380833 [harbor] some artifacts and projects seems to have gone missing
Open, High, Public

Description

While restarting tools affected by the dns/nfs outages, I found quite a few that are failing to pull their images from tools-harbor. Looking in harbor, in some cases the image is missing. In others, the entire project is missing.

One such case is tool-nodejs-flask-buildpack-sample, which I maintain. In this case, the harbor project is missing entirely.

Some other affected tools:
milhistbot
mbh
teddybot

Event Timeline


I'm told via IRC discussion that 'quite a few' is ~10

A quick scan of the harbor images running on the cluster reveals 3 tools whose harbor project is missing:

dcaro@urcuchillay$ grep '###' out
################ unable to find project tool-nodejs-flask-buildpack-sample
################ unable to find project tool-teddybot
################ unable to find project tool-lebot

Used:

# list the tool images used by pods running on the cluster
root@tools-k8s-control-9:~# kubectl get pods --all-namespaces -o jsonpath="{.items[*].spec['initContainers', 'containers'][*].image}" | tr -s '[:space:]' '\n' | sort | uniq -c | sort -n | grep tool-

# copied the output locally and ran this script:
dcaro@urcuchillay$ cat test_images.sh 

for project in $(grep -o 'tool-[^/]*/' images.txt | cut -d/ -f1); do
    curl -X 'GET' \
    "https://tools-harbor.wmcloud.org/api/v2.0/projects/$project" \
    --fail-with-body \
    -H 'accept: application/json' \
    -H 'X-Is-Resource-Name: false' || echo "################ unable to find project $project"
done

Full list of tools without artifact or with no project in harbor (total of 8, half of them are ours):

dcaro@urcuchillay$ grep '###' out
################ unable to find project tool-mbh
################ unable to find project tool-milhistbot
################ unable to find project tool-nodejs-flask-buildpack-sample
################ unable to find project tool-teddybot
################ unable to find project tool-containers
################ unable to find project tool-lebot
################ unable to find project tool-sample-complex-app
################ unable to find project tool-containers
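
The listing above repeats tool-containers, so the number of distinct projects may be lower than the number of lines. A minimal Python sketch for de-duplicating such output, assuming it was saved as text with the same "unable to find project" prefix. The function name missing_projects is hypothetical, and SAMPLE is just the listing above pasted in:

```python
# Prefix printed by the check script for every project it could not find.
PREFIX = "################ unable to find project"

# The scan output from the listing above, pasted verbatim.
SAMPLE = """\
################ unable to find project tool-mbh
################ unable to find project tool-milhistbot
################ unable to find project tool-nodejs-flask-buildpack-sample
################ unable to find project tool-teddybot
################ unable to find project tool-containers
################ unable to find project tool-lebot
################ unable to find project tool-sample-complex-app
################ unable to find project tool-containers
"""


def missing_projects(scan_output: str) -> list[str]:
    """Return the distinct project names reported as missing, sorted."""
    names = set()
    for line in scan_output.splitlines():
        line = line.strip()
        if line.startswith(PREFIX):
            # Everything after the prefix is the project name.
            names.add(line[len(PREFIX):].strip())
    return sorted(names)


for name in missing_projects(SAMPLE):
    print(name)
```

For the listing above this yields 7 distinct project names, since tool-containers is reported twice.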

Modified the script a bit:

# cut on ':' (not '/') so that project_repo keeps the "<project>/<repo>" form
for project_repo in $(grep -o 'tool-[^:]*:' testimages.tx | cut -d: -f1); do
    project="${project_repo%/*}"
    repo="${project_repo#*/}"
    echo "checking project $project repo $repo"
    curl --silent -X 'GET' \
    "https://tools-harbor.wmcloud.org/api/v2.0/projects/$project/repositories/$repo" \
    --fail-with-body \
    -H 'authorization: Basic <use your user/pass basic auth base64 string>' \
    -H 'accept: application/json' \
    | jq -r '.artifact_count' \
    | grep -v -e '^\(null\|0\)$' \
    || echo "################ unable to find project $project"
done
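
The project/repo split in the script above can also be sketched in Python. This is only an illustration: it assumes image references of the form <registry>/<project>/<repository>[:<tag>], the function name split_image_ref is hypothetical, and the sample reference (including the :latest tag) is made up for the example.

```python
def split_image_ref(image: str) -> tuple[str, str]:
    """Split an image reference into (harbor project, repository).

    Assumes <registry>/<project>/<repository>[:<tag>][@<digest>];
    the repository part may itself contain slashes.
    """
    # Drop a trailing digest such as "@sha256:...".
    name = image.split("@", 1)[0]
    # A colon after the last slash is a tag separator (not a registry port).
    if name.rfind(":") > name.rfind("/"):
        name = name[: name.rfind(":")]
    parts = name.split("/", 2)
    if len(parts) != 3:
        raise ValueError(f"unexpected image reference: {image!r}")
    _registry, project, repo = parts
    return project, repo


# Hypothetical sample reference, for illustration only.
project, repo = split_image_ref("tools-harbor.wmcloud.org/tool-mbh/tool-mbh:latest")
print(project, repo)
```

Splitting on the first two slashes only keeps nested repository names intact, which matters for projects that hold more than one repository.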

I looked into this a bit. As far as I know, there are 5 ways projects can be deleted (technically just 3; the others are abstractions over those 3). Edit this list to add more if I missed anything.

  1. Running the maintain-harbor job that deletes projects with no repositories: this can only succeed if the project has no artifacts and no repositories. I don't believe this can happen without something else first deleting the artifacts AND the repo. maintain-harbor executes three jobs. The relevant one here is mh--delete-empty-tool-projects-cron which, as the name implies, deletes harbor tool projects with no artifacts and no repos.

    In the maintain-harbor logs, we saw that maintain-harbor deleted two of the projects ("tool-nodejs-flask-buildpack-sample" and "tool-sqid"), but this is expected because it is supposed to delete projects with no artifacts and no repos. I couldn't find any evidence in the logs that maintain-harbor deleted the other projects, and we haven't removed maintain-harbor logs since at least April, possibly earlier.

    The question now is: can maintain-harbor delete projects that have repos? The answer is no, because of two fail-safes, one in the code and one in harbor itself. In the code there is the line empty_tool_projects = [ harbor_project["name"] for harbor_project in tool_projects if harbor_project["repo_count"] == 0 ]. It keeps only the projects whose repo_count is 0, so projects with repos are filtered out. The only way this can fail is if a bug in harbor itself sets repo_count to 0 when there is in fact a repo in the project.

    The second fail-safe is in harbor itself: you can't delete a harbor project, through either the UI or the API, if the project still has any repo in it. To delete a project, you have to separately delete the artifacts, then delete the repos, before you can delete the project. The only way this can fail is if a bug in harbor selectively allows deleting a project together with its repos and artifacts. Both fail-safes have to fail before maintain-harbor can delete a project that has repos, and I find that unlikely.

    So although maintain-harbor deleted the two projects in the end, something else deleted both the artifacts and the repositories first.
  2. Logging in directly through the harbor UI, deleting the project's artifacts, deleting the project's repo, then deleting the project (or letting maintain-harbor do it for you): this is possible but improbable, since I can't see why one of us would deliberately log into harbor just to go through these projects deleting the artifacts, then the repo, then the project, and then forget about it.
  3. Direct API requests to delete the artifacts, repo and project: this is also possible but improbable. You need to make 3 requests (possibly more if there is more than 1 artifact) to 3 endpoints per project to achieve this. Unlikely to happen by accident.
  4. Running toolforge build clean: again, I don't think this can explain the deletion of the projects. It only deletes the artifacts; the repository still has to be deleted before you can delete the project or before maintain-harbor comes along to do that for you.
  5. Triggering the harbor retention policies, which retain only the 5 most recent artifacts and delete the rest: how this behaves is a bit unclear. I've tried to simulate it and sometimes it ends up retaining only 1 artifact. I will test this a bit more, but one thing is clear: it never deletes the underlying repo, and if a harbor project still has any repo, it is impossible to delete the project through either the UI or the API.
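
To make the first fail-safe concrete, here is the filter quoted in item 1 run against made-up data. Only the comprehension itself comes from maintain-harbor; the wrapper function and the sample projects (tool-example-empty, tool-example-busy) are hypothetical.

```python
def select_empty_tool_projects(tool_projects: list[dict]) -> list[str]:
    """Return names of projects that report zero repositories."""
    # Same comprehension as the line quoted from maintain-harbor.
    empty_tool_projects = [
        harbor_project["name"]
        for harbor_project in tool_projects
        if harbor_project["repo_count"] == 0
    ]
    return empty_tool_projects


# Hypothetical sample data, shaped like the fields the filter reads.
sample_projects = [
    {"name": "tool-example-empty", "repo_count": 0},
    {"name": "tool-example-busy", "repo_count": 2},
]

# Only the project with repo_count == 0 is a deletion candidate.
print(select_empty_tool_projects(sample_projects))
```

A project only becomes a deletion candidate when harbor itself reports repo_count as 0, which is why a wrong repo_count from harbor is the only way this check could select a project that still has repos.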

I still have no idea why this happened. If anything in the above write-up is wrong, please say so and I will correct it.








