update PROV docs · crim-ca/weaver@a09d7af · GitHub

fmigneault committed Dec 14, 2024
1 parent aab2a85 commit a09d7af
Showing 15 changed files with 210 additions and 30 deletions.
2 changes: 1 addition & 1 deletion docs/source/appendix.rst
Expand Up @@ -251,7 +251,7 @@ Glossary
the defined script, calculation, or operation.

Provenance
Metadata using the :term:`W3C` |prov|_ standard that is applied to a submitted :term:`Job` execution to allow
Metadata using the :term:`W3C` |PROV|_ standard that is applied to a submitted :term:`Job` execution to allow
retrieving its origin, the related :term:`Application Package`, its :term:`I/O` sources and results, as well as
additional details about the server host and runtime user as applicable to replicate the experiment.

Expand Down
70 changes: 69 additions & 1 deletion docs/source/cli.rst
Expand Up @@ -33,14 +33,29 @@ Python Client Commands
For details about using the Python :py:class:`weaver.cli.WeaverClient`, please refer directly to its class
documentation and its underlying methods.

* :py:meth:`weaver.cli.WeaverClient.info`
* :py:meth:`weaver.cli.WeaverClient.version`
* :py:meth:`weaver.cli.WeaverClient.conformance`
* :py:meth:`weaver.cli.WeaverClient.register`
* :py:meth:`weaver.cli.WeaverClient.unregister`
* :py:meth:`weaver.cli.WeaverClient.deploy`
* :py:meth:`weaver.cli.WeaverClient.undeploy`
* :py:meth:`weaver.cli.WeaverClient.capabilities`
* :py:meth:`weaver.cli.WeaverClient.describe`
* :py:meth:`weaver.cli.WeaverClient.package`
* :py:meth:`weaver.cli.WeaverClient.jobs`
* :py:meth:`weaver.cli.WeaverClient.trigger_job`
* :py:meth:`weaver.cli.WeaverClient.update_job`
* :py:meth:`weaver.cli.WeaverClient.execute`
* :py:meth:`weaver.cli.WeaverClient.monitor`
* :py:meth:`weaver.cli.WeaverClient.dismiss`
* :py:meth:`weaver.cli.WeaverClient.status`
* :py:meth:`weaver.cli.WeaverClient.inputs`
* :py:meth:`weaver.cli.WeaverClient.outputs`
* :py:meth:`weaver.cli.WeaverClient.logs`
* :py:meth:`weaver.cli.WeaverClient.statistics`
* :py:meth:`weaver.cli.WeaverClient.exceptions`
* :py:meth:`weaver.cli.WeaverClient.provenance`
* :py:meth:`weaver.cli.WeaverClient.dismiss`
* :py:meth:`weaver.cli.WeaverClient.results`
* :py:meth:`weaver.cli.WeaverClient.upload`

Expand Down Expand Up @@ -479,6 +494,59 @@ Sample Output:
.. literalinclude:: ../../weaver/wps_restapi/examples/job_results.json
    :language: json

.. _cli_example_job_prov:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Job Provenance Example
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Performs the :term:`Job` |PROV|_ request to retrieve :term:`Provenance` metadata.

The examples below employ the ``Echo`` :term:`Process` available in |weaver-func-test-apps|_
and assume that the referenced :term:`Job` completed successfully.

.. note::
    There are multiple alternative format representations offered by this operation.
    Not all of them are presented below. See the various ``prov_type`` and ``prov_format``
    parameters for the available combinations.

.. seealso::
    - :ref:`proc_op_job_prov` provides more details about the available endpoints, operations and returned metadata.

.. code-block:: shell
    :caption: Command Line

    weaver prov -u ${WEAVER_URL} -j "1c49f085-bbd7-410d-a801-81fd42469e8a" --pT run

.. code-block:: python
    :caption: Python

    from weaver.provenance import ProvenancePathType
    client.prov("1c49f085-bbd7-410d-a801-81fd42469e8a", prov_type=ProvenancePathType.PROV_RUN)

Sample Output:

.. literalinclude:: ../../weaver/wps_restapi/examples/job_prov_run.txt
    :language: text

.. code-block:: shell
    :caption: Command Line

    weaver prov -u ${WEAVER_URL} -j "1c49f085-bbd7-410d-a801-81fd42469e8a" -nL --pF "PROV-N"

.. code-block:: python
    :caption: Python

    from weaver.provenance import ProvenanceFormat
    client.prov("1c49f085-bbd7-410d-a801-81fd42469e8a", prov_format=ProvenanceFormat.PROV_N)

Sample Output:

.. literalinclude:: ../../weaver/wps_restapi/examples/job_prov.txt
    :language: text

.. _cli_example_upload:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Expand Down
17 changes: 17 additions & 0 deletions docs/source/configuration.rst
Expand Up @@ -101,6 +101,23 @@ they are optional and which default value or operation is applied in each situat
.. versionadded:: 1.9

.. _weaver-cwl-prov:

- | ``weaver.cwl_prov = true|false`` [:class:`bool`-like]
  | (default: ``true``)
  |
  | Configure whether :term:`W3C` |PROV|_ functionality using the :ref:`proc_op_job_prov` endpoints should be enabled
    to collect :term:`Provenance` metadata when executing the underlying :term:`CWL` of a given :term:`Process`
    or :term:`Workflow`.

  .. note::

    Any pre-existing :term:`Job` that was created when this option did not yet exist or that was executed while
    it was disabled will not offer :term:`Provenance` metadata. This is intrinsic to the functionality that must obtain
    timely metadata *while* executing to properly represent operational steps and :term:`Job` updates as they occur.

  .. versionadded:: 6.1

.. _weaver-wps:

- | ``weaver.wps = true|false`` [:class:`bool`-like]
Expand Down
109 changes: 92 additions & 17 deletions docs/source/processes.rst
Expand Up @@ -173,7 +173,7 @@ through some parsing (e.g.: :ref:`proc_wps_12`) or with some requirement indicat
special handling. The represented :term:`Process` is aligned with |ogc-api-proc|_ specifications.

When deploying one such :term:`Process` directly, it is expected to have a definition specified
with a :term:`CWL` `Application Package`_, which provides resources about one of the described :ref:`app_pkg_types`.
with a :term:`CWL` :ref:`application-package`, which provides resources about one of the described :ref:`app_pkg_types`.

This is most of the time employed to wrap operations packaged in a reference :term:`Docker` image, but it can also
wrap :ref:`app_pkg_remote` to be executed on another server (i.e.: :term:`ADES`). When the :term:`Process` should be
Expand Down Expand Up @@ -490,6 +490,8 @@ the |getcap-req|_ request.
Modify an Existing Process (Update, Replace, Undeploy)
-----------------------------------------------------------------------------

.. versionadded:: 4.20

Since `Weaver` supports |ogc-api-proc-part2|_, it is able to remove a previously registered :term:`Process` using
the :ref:`Deployment <proc_op_deploy>` request. The undeploy operation consists of a ``DELETE`` request targeting the
specific ``{WEAVER_URL}/processes/{processID}`` to be removed.
Expand All @@ -498,8 +500,6 @@ specific ``{WEAVER_URL}/processes/{processID}`` to be removed.
The :term:`Process` must be accessible by the user considering any visibility configuration to perform this step.
See :ref:`proc_op_deploy` section for details.

.. versionadded:: 4.20

Starting from version `4.20 <https://github.com/crim-ca/weaver/tree/4.20.0>`_, a :term:`Process` can be replaced or
updated using respectively the ``PUT`` and ``PATCH`` requests onto the specific ``{WEAVER_URL}/processes/{processID}``
location of the reference to modify.
Expand Down Expand Up @@ -1989,7 +1989,7 @@ the configured :term:`WPS` output directory.
Header ``X-WPS-Output-Context`` is ignored when using `S3` buckets for output location since they are stored
individually per :term:`Job` UUID, and hold no relevant *context* location. See also :ref:`conf_s3_buckets`.

.. versionadded:: 4.3
.. versionchanged:: 4.3
Addition of the ``X-WPS-Output-Context`` header.

.. _proc_op_execute_subscribers:
Expand Down Expand Up @@ -2419,26 +2419,101 @@ Note again that the more the :term:`Process` is verbose, the more tracking will
Job Provenance
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. fixme: CWL and Job Prov (https://github.com/crim-ca/weaver/issues/673)
.. todo::
implement ``GET /jobs/{jobID}/run`` and/or ``GET /jobs/{jobID}/prov``
(see https://github.com/crim-ca/weaver/issues/673)
.. versionadded:: 6.1

The provenance endpoints make it possible to obtain :term:`W3C` |PROV|_ metadata from a successfully completed :term:`Job`
using various representations. This provenance information can help identify traceability information such as the input
data sources, validate output checksums, and understand all internal :term:`Process` data transformations that were
involved within an executed :term:`Workflow`.

Configure ``PROV`` runtime options.

Provenance is information about entities, activities, and people involved in producing a
The |PROV|_ metadata consists of information about entities, activities, and people involved in producing a
piece of data or thing, which can be used to form assessments about its quality, reliability or trustworthiness.

.. seealso::
- https://www.w3.org/TR/prov-overview/
- https://cwltool.readthedocs.io/en/latest/CWLProv.html
- https://docs.ogc.org/DRAFTS/24-051.html#_requirements_class_provenance
- |PROV-overview|_
- |cwltool-cwlprov|_

.. figure:: https://www.w3.org/TR/2013/REC-prov-o-20130430/diagrams/starting-points.svg
    :alt: PROV-O Resources
    :target: `PROV-O`_
    :align: center
    :width: 500px

    PROV-O Resource Relationships

.. |prov-o-resources| image:: https://www.w3.org/TR/2013/REC-prov-o-20130430/diagrams/starting-points.svg
    :alt: |prov-ontology| Resources
    :target: `prov-ontology`_

The provenance endpoints are provided in alignment with the |ogc-api-proc-part4|_ provenance class requirement.
However, `Weaver` also provides additional functionalities in comparison to the minimal requirements from the
:term:`OGC` specification.

The following table lists the available formats and corresponding endpoints offered by `Weaver`.

.. list-table:: Job Provenance Endpoints
    :name: table-job-prov
    :align: center
    :header-rows: 1
    :widths: 25,10,20,45

    * - Endpoint
      - |PROV|_ Format
      - :term:`Media-Type`
      - Description
    * - ``/jobs/{jobID}/prov``
      - |PROV-JSON|_
      - ``application/json``
      - :term:`Provenance` metadata using :term:`JSON` representation.
    * - ``/jobs/{jobID}/prov``
      - |PROV-JSONLD|_
      - ``application/ld+json``
      - :term:`Provenance` metadata using |JSON-LD|_ representation.
    * - ``/jobs/{jobID}/prov``
      - |PROV-XML|_
      - ``text/xml`` or ``application/xml``
      - :term:`Provenance` metadata using :term:`XML` representation.
    * - ``/jobs/{jobID}/prov``
      - |PROV-N|_
      - ``text/provenance-notation``
      - :term:`Provenance` metadata using the main |PROV|_ notation representation.
    * - ``/jobs/{jobID}/prov``
      - PROV-NT
      - ``application/n-triples``
      - :term:`Provenance` metadata using |rdf-n-triples|_ (NT) representation.
    * - ``/jobs/{jobID}/prov``
      - PROV-TURTLE
      - ``text/turtle``
      - :term:`Provenance` metadata using |rdf-turtle|_ (TTL) representation.
    * - ``/jobs/{jobID}/prov/info``
      - |na|
      - ``text/plain``
      - Metadata about the *Research Object* packaging information.
    * - ``/jobs/{jobID}/prov/who``
      - |na|
      - ``text/plain``
      - Metadata of who ran the :term:`Job`.
    * - ``/jobs/{jobID}/prov/runs``
      - |na|
      - ``text/plain``
      - Obtain the list of ``runID`` steps of the :term:`Workflow` within the :term:`Job`.
    * - ``/jobs/{jobID}/prov/run``
      - |na|
      - ``text/plain``
      - Metadata of the main :term:`Job` and any nested step runs in the case of a :term:`Workflow`.
    * - ``/jobs/{jobID}/prov/inputs``
      - |na|
      - ``text/plain``
      - Metadata about the :term:`Job` input IDs.
    * - ``/jobs/{jobID}/prov/outputs``
      - |na|
      - ``text/plain``
      - Metadata about the :term:`Job` output IDs.
    * - ``/jobs/{jobID}/prov/[run|inputs|outputs]/{runID}``
      - |na|
      - ``text/plain``
      - Same as their respective definitions above, but for a specific step of a :term:`Workflow`.

.. seealso::
    This feature is enabled by default. Its functionality and the corresponding :term:`API` endpoints
    can be controlled using :ref:`Configuration Option <weaver-cwl-prov>` ``weaver.cwl_prov``.
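
The endpoint table lists one ``/jobs/{jobID}/prov`` path that serves several representations, each tied to a distinct media type. The sketch below shows one way a client could map a format name to the request it would send. It assumes the representation is selected through the ``Accept`` header, which the table suggests but does not state explicitly. ``build_prov_request`` and ``PROV_MEDIA_TYPES`` are hypothetical names used for illustration and are not part of `Weaver`.

```python
from urllib.parse import quote

# Media types copied from the "Job Provenance Endpoints" table.
# PROV-XML also accepts "application/xml"; only one value is kept here.
PROV_MEDIA_TYPES = {
    "PROV-JSON": "application/json",
    "PROV-JSONLD": "application/ld+json",
    "PROV-XML": "text/xml",
    "PROV-N": "text/provenance-notation",
    "PROV-NT": "application/n-triples",
    "PROV-TURTLE": "text/turtle",
}


def build_prov_request(weaver_url, job_id, prov_format="PROV-JSON"):
    """Return the (url, headers) pair for a provenance request.

    Hypothetical helper: it assumes the server picks the representation
    from the ``Accept`` header (content negotiation).
    """
    media_type = PROV_MEDIA_TYPES[prov_format]  # KeyError on unknown formats
    base_url = weaver_url.rstrip("/")
    url = f"{base_url}/jobs/{quote(job_id, safe='')}/prov"
    return url, {"Accept": media_type}


if __name__ == "__main__":
    # Placeholder server URL combined with the job ID used in the CLI examples.
    url, headers = build_prov_request(
        "https://example.com/weaver",
        "1c49f085-bbd7-410d-a801-81fd42469e8a",
        "PROV-N",
    )
    print(url)
    print(headers)
```

The resulting pair can be passed to any HTTP client. The ``text/plain`` sub-endpoints such as ``/prov/run`` are left out of the sketch because they do not vary by format.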

.. _proc_op_job_stats:

Expand Down
28 changes: 23 additions & 5 deletions docs/source/references.rst
Expand Up @@ -62,6 +62,8 @@
.. |cwl-metadata-schema-org| replace:: RDF Schema Definitions
.. _cwl-metadata-schema-org: https://schema.org/version/latest/schemaorg-current-https.rdf
.. _docker: https://docs.docker.com/develop/
.. |cwltool-cwlprov| replace:: CWLProv - Provenance Capture with :mod:`cwltool`
.. _cwltool-cwlprov: https://cwltool.readthedocs.io/en/latest/CWLProv.html
.. |docker| replace:: Docker
.. |ems| replace:: Execution Management Service
.. |esgf| replace:: Earth System Grid Federation
Expand Down Expand Up @@ -172,10 +174,26 @@
.. _openeo-api: https://openeo.org/documentation/1.0/developers/api/reference.html
.. |OpenAPI-spec| replace:: OpenAPI Specification
.. _OpenAPI-spec: https://spec.openapis.org/oas/v3.1.0
.. |prov| replace:: PROV
.. _prov: https://www.w3.org/TR/prov-overview/
.. |prov-ontology| replace:: PROV-O: The PROV Ontology
.. _prov-ontology: https://www.w3.org/TR/2013/REC-prov-o-20130430/
.. |JSON-LD| replace:: JSON Linked Data
.. _JSON-LD: https://json-ld.org/
.. |PROV| replace:: PROV
.. _PROV: https://www.w3.org/TR/prov-overview/
.. |PROV-JSON| replace:: PROV-JSON
.. _PROV-JSON: https://www.w3.org/submissions/prov-json/
.. |PROV-JSONLD| replace:: PROV-JSONLD
.. _PROV-JSONLD: https://www.w3.org/submissions/prov-jsonld/
.. |PROV-N| replace:: PROV-N
.. _PROV-N: https://www.w3.org/TR/prov-n/
.. |PROV-overview| replace:: PROV Overview
.. _PROV-overview: https://www.w3.org/TR/prov-overview/
.. |PROV-O| replace:: PROV-O: The PROV Ontology
.. _PROV-O: https://www.w3.org/TR/2013/REC-prov-o-20130430/
.. |PROV-XML| replace:: PROV-XML
.. _PROV-XML: https://www.w3.org/TR/2013/NOTE-prov-xml-20130430/
.. |rdf-n-triples| replace:: RDF N-Triples
.. _rdf-n-triples: https://www.w3.org/TR/n-triples/
.. |rdf-turtle| replace:: RDF Turtle
.. _rdf-turtle: https://www.w3.org/TR/rdf12-turtle/
.. |pywps| replace:: PyWPS
.. _pywps: https://github.com/geopython/pywps/
.. |pywps-status| replace:: Progress and Status Report
Expand Down Expand Up @@ -208,7 +226,7 @@
.. Example references
.. |examples| replace:: Examples
.. _examples: examples.rst
.. |weaver-func-test-apps| replace:: Weaver functional tests
.. |weaver-func-test-apps| replace:: Weaver functional tests Application Packages
.. _weaver-func-test-apps: https://github.com/crim-ca/weaver/tree/master/tests/functional/application-packages
.. |ogc-testbeds-apps| replace:: OGC-Testbeds Applications
.. _ogc-testbeds-apps: https://github.com/crim-ca/application-packages
Expand Down
6 changes: 3 additions & 3 deletions tests/test_provenance.py
Expand Up @@ -18,9 +18,9 @@
("run", None, ProvenancePathType.PROV_RUN, "run"),
("/run", None, ProvenancePathType.PROV_RUN, "run"),
("/prov/run", None, ProvenancePathType.PROV_RUN, "run"),
("run", "run-id", ProvenancePathType.PROV_RUN + "/run-id", "run"),
("/run", "run-id", ProvenancePathType.PROV_RUN + "/run-id", "run"),
("/prov/run", "run-id", ProvenancePathType.PROV_RUN + "/run-id", "run"),
("run", "run-id", f"{ProvenancePathType.PROV_RUN}/run-id", "run"),
("/run", "run-id", f"{ProvenancePathType.PROV_RUN}/run-id", "run"),
("/prov/run", "run-id", f"{ProvenancePathType.PROV_RUN}/run-id", "run"),
]
)
def test_provenance_path_type_resolution(provenance, prov_run_id, expect_path, expect_type):
Expand Down
5 changes: 3 additions & 2 deletions weaver/datatype.py
Expand Up @@ -1475,8 +1475,9 @@ def result_path(self, job_id=None, output_id=None, file_name=None):

def prov_url(self, container=None, extra_path=None):
# type: (Optional[AnySettingsContainer], Optional[ProvenancePathType]) -> str
extra_path = "/prov" + str(extra_path or "")
return self.job_url(container=container, extra_path=extra_path)
extra_path = str(extra_path or "")
prov_path = f"/prov{extra_path}"
return self.job_url(container=container, extra_path=prov_path)

def prov_path(self, container=None, extra_path=None, prov_format=None):
# type: (Optional[AnySettingsContainer], Optional[ProvenancePathType], Optional[AnyProvenanceFormat]) -> str
Expand Down
Empty file.
Empty file added weaver/wps_restapi/jobs/prov.py
Empty file.
3 changes: 2 additions & 1 deletion weaver/wps_restapi/jobs/utils.py
Expand Up @@ -1433,7 +1433,8 @@ def get_job_prov_response(request):
raise_job_bad_status_success(job, request)

prov_type = guess_target_format(request, override_user_agent=True, default=ContentType.APP_JSON)
prov_path = "/prov" + request.path.rsplit("/prov", 1)[-1]
prov_path = request.path.rsplit("/prov", 1)[-1]
prov_path = f"/prov{prov_path}"
prov_data, prov_type = job.prov_data(request, prov_path, prov_type)
if not prov_data:
prov_dir = job.prov_path(request)
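
The ``rsplit`` step in this hunk keeps only the part of the request path that follows the last ``/prov`` segment, then prefixes it with ``/prov`` again. A minimal standalone sketch of that string handling is given below. ``extract_prov_path`` is a hypothetical name, and the sample paths are illustrative rather than taken from the repository.

```python
def extract_prov_path(request_path):
    """Keep only the trailing '/prov...' portion of a request path."""
    # Split once from the right, so only the last '/prov' occurrence is used.
    suffix = request_path.rsplit("/prov", 1)[-1]
    return f"/prov{suffix}"


if __name__ == "__main__":
    samples = [
        "/jobs/1c49/prov",
        "/jobs/1c49/prov/run",
        "/jobs/1c49/prov/inputs/step-1",
    ]
    for path in samples:
        print(path, "->", extract_prov_path(path))
```

Splitting from the right means that an earlier ``/prov`` in the path prefix would not affect the extracted suffix.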
Expand Down
