at One Concern, in addition to using the sidecar within Argo workflows, we distribute datamon to the desktop with brew.
frequently, data-scientists need to "ingest," as we say, data into the Argo workflows comprising, for instance, the flood simulation pipeline(s) without running a pre-packaged ingestor workflow. sometimes a `bundle upload` or `bundle mount new` fails with a 500 error or for some other reason. this task proposes to begin addressing that pain-point, which is already mitigated in part by the fact that duplicate blobs (2k chunks) aren't uploaded twice.
specifically, the idea is to instrument the paths from desktop to cloud (`bundle upload`, `bundle mount new`, etc.), whether via golang in the binary, shell-script as in the sidecar, or Python (bindings exist in #393, unmerged only because of documentation requirements), to provide (a rough sketch follows this list):
- metrics and usage statistics to improve datamon
- progress indicators, logging, and a smoother experience for data-science
- any and all additional tracing, timing, and output formatting to ease backpressure on this known issue
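to make the out-of-band shape concrete, here's a rough golang sketch of the kind of wrapper meant above: it forwards an invocation (e.g. `datamon bundle upload ...`), prints a heartbeat while the command runs, and logs duration and exit status afterwards. it's only a sketch, not the real patch; the wrapper name and whatever datamon arguments get forwarded through it are placeholders, and an actual implementation would plumb per-blob progress through rather than a timer.

```go
// instrumented.go: sketch of an out-of-band wrapper around a desktop-to-cloud
// invocation. it adds wall-clock timing, a periodic heartbeat as a crude
// progress indicator, and a coarse usage log line on exit.
package main

import (
	"log"
	"os"
	"os/exec"
	"time"
)

func main() {
	// forward the wrapped command verbatim, e.g.
	//   instrumented datamon bundle upload <flags...>
	// (the datamon flags themselves are whatever the CLI actually takes)
	if len(os.Args) < 2 {
		log.Fatal("usage: instrumented <cmd> [args...]")
	}
	cmd := exec.Command(os.Args[1], os.Args[2:]...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr

	start := time.Now()

	// heartbeat: a stand-in for real per-blob progress reporting
	done := make(chan struct{})
	go func() {
		ticker := time.NewTicker(10 * time.Second)
		defer ticker.Stop()
		for {
			select {
			case <-done:
				return
			case <-ticker.C:
				log.Printf("still running after %s ...", time.Since(start).Round(time.Second))
			}
		}
	}()

	err := cmd.Run()
	close(done)

	// coarse usage statistics: command, duration, success/failure
	log.Printf("cmd=%q duration=%s err=%v",
		os.Args[1:], time.Since(start).Round(time.Millisecond), err)
	if err != nil {
		os.Exit(1)
	}
}
```

wrapping an invocation this way stays entirely outside the upload/mount code paths, which is why the patch should end up mostly orthogonal to the rest of the datamon implementation.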
this'd be a great starter issue because it's not cloud-specific (minor changes would allow a fork that syncs your local disk, as the user/programmer, with arbitrary filesystem-like things), and the resulting patch would be mostly out-of-band/orthogonal to the rest of the datamon implementation.
i should also mention that #413 describes an alternate approach to the same essential use-case of adding data sources from the desktop. the idea in that proposal, again, is to allow arbitrary first-miles into the cluster, then let the web scheduler fully digest the data into datamon, DRY style.