
upload instrumentation #443

Open
ransomw1c opened this issue Apr 16, 2020 · 2 comments

Comments

@ransomw1c
Contributor

at One Concern, in addition to using the sidecar within Argo workflows, we distribute datamon to desktops via brew.

frequently, data-scientists need to "ingest," as we say, data into the Argo workflows comprising, for instance, the flood simulation pipeline(s), without running a pre-packaged ingestor workflow. sometimes there's a 500 error, or bundle upload or bundle mount new fails for one reason or another. this task proposes to begin to address a pain-point already solved in part by the fact that duplicate blobs (2k chunks) aren't uploaded twice.
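to make the dedup point concrete, here's a minimal sketch of content-addressed chunk dedup: hash each chunk and skip uploading hashes already seen. this is not datamon's actual implementation (the real chunk size, hash function, and blob store differ); `Store` and `Put` are hypothetical names for illustration only.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Store tracks which chunk hashes have already been uploaded,
// so an identical chunk is never sent to the blob store twice.
type Store struct {
	seen map[string]bool
}

func NewStore() *Store {
	return &Store{seen: map[string]bool{}}
}

// Put returns true if the chunk is new (and should be uploaded),
// false if an identical chunk was already stored.
func (s *Store) Put(chunk []byte) bool {
	sum := sha256.Sum256(chunk)
	key := hex.EncodeToString(sum[:])
	if s.seen[key] {
		return false
	}
	s.seen[key] = true
	return true
}

func main() {
	s := NewStore()
	fmt.Println(s.Put([]byte("chunk-1"))) // new chunk
	fmt.Println(s.Put([]byte("chunk-1"))) // duplicate, skipped
}
```

because retries after a 500 re-hash to the same keys, an interrupted upload resumes without re-sending chunks that already landed.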

specifically, the idea is to instrument (via golang in the binary, shell-script as in the sidecar, or Python, bindings for which exist in #393, unmerged only because of documentation requirements) the paths from desktop to cloud (bundle upload, bundle mount new, etc.) to provide

  • metrics and usage statistics to improve datamon
  • progress indicators, logging, and a smoother experience for data-science
  • any and all additional tracing, timing, and output formatting to ease backpressure on this known issue
@ransomw1c
Contributor Author

this'd be a great starter issue because it's not cloud-specific (minor changes would allow a fork that syncs your local disk, as the user/programmer, with arbitrary filesystem-like things), and the provided patch would be mostly out-of-band/orthogonal to the rest of the datamon implementation.

@ransomw1c
Contributor Author

ransomw1c commented Apr 16, 2020

i should also mention that there is an alternate approach to the same essential use-case, adding additional data sources from desktop, in #413. the idea in that proposal, again, is to allow arbitrary first-miles into the cluster, then let the web scheduler fully digest the data into datamon, DRY style.

