.. _tutorial-checking-job-creation:

Validating jobs
===============

Validating job creation
-----------------------

Before submitting a task, the server transforms the :term:`ecf script` into a :term:`job file`. This process, known as :term:`job creation`, is performed by the :term:`ecflow_server` when the task is ready for submission, and includes the following steps:

* Locate and load the :term:`ecf script` (see more about the :term:`location algorithm`).
* Perform :term:`pre-processing` of :code:`%include` :term:`directives`.
* Perform :term:`variable substitution`.
* Store the resulting script in the :term:`job file`, with a :code:`.job` extension.

The resulting :term:`job file` is the script that the :term:`ecflow_server` will actually submit for execution.

For the :code:`$HOME/course/test/t1.ecf` file defined in the previous section, generating the :term:`job file` involves the following:

* :code:`%include "../head.h"` is replaced by the content of the selected file.
* :code:`%include "../tail.h"` is replaced by the content of the selected file.
* All variable occurrences (i.e. any text of the form :code:`%VARIABLE%`) are replaced by the value of the named variable. For example, :code:`%ECF_NAME%` will be replaced by :code:`t1`.

In practice, it is often useful to check the :term:`job creation` process even before loading the :term:`suite definition`. This allows early detection of potential problems, such as a missing ecf script or include file, references to undefined variables, and other errors during :term:`pre-processing`.

Using the ecFlow Python API, it is possible to execute the :term:`job creation` process locally.

.. tabs::

   .. tab:: Python

      Consider the following regarding the :term:`job creation` process performed by the Python API:

      * Job creation is *independent* of the :term:`ecflow_server`, so default values will be used for server-specific variables such as :code:`ECF_PORT` and :code:`ECF_HOST`.
      * The resulting job files will use the extension :code:`.job0`, whereas the server will always generate jobs with the extension :code:`.job<N>` (where :code:`<N>` corresponds to :term:`ECF_TRYNO`, which is never zero).
      * The :term:`job file` is created in the same directory as the :term:`ecf script`.

      .. literalinclude:: src/checking-job-creation.py
         :language: python
         :caption: $HOME/course/validate.py

      The script above loads the suite definition from the :file:`$HOME/course/test/t1.ecf` file and performs the check via the call to :py:class:`ecflow.Defs.check_job_creation`. An all-in-one script could also create the suite definition programmatically, followed by the job creation check.

**What to do:**

#. Create the :code:`$HOME/course/validate.py` script as shown above, and execute it as follows:

   .. code-block:: shell

      cd $HOME/course

      # Either run by explicitly invoking python
      python3 ./validate.py

      # Or make the script executable, and run it directly
      chmod +x validate.py
      ./validate.py

#. Examine the job file :file:`$HOME/course/test/t1.job0`. In particular, note the variable substitutions, including the default values used for server-specific variables (e.g. :code:`ECF_PORT`, :code:`ECF_HOST`).

Validating job execution
------------------------

The previous section demonstrated how a task script is transformed into a job script. Unfortunately, trying to run this job script locally will fail, because the :code:`ecflow_client` commands embedded in the job will not be able to communicate with the server. In particular, server-specific variables such as :code:`ECF_PORT` and :code:`ECF_HOST` were generated by the Python API and will not typically correspond to an existing ecFlow server.

Even if a server were running on the specified host and port, the job would be rejected: the server uses the :code:`ECF_PASSWD` variable to identify the specific task, and a locally generated job does not carry the expected value. A job that uses an incorrect :code:`ECF_PASSWD` is treated as a zombie and essentially ignored by the server.

To disable the calls to :code:`ecflow_client` and allow the job to be executed locally, export the environment variable :code:`NO_ECF=1`. When :code:`NO_ECF` is set, the :code:`ecflow_client` executable returns immediately with a success value, allowing the job to proceed uninterrupted.

.. code-block:: shell

   export NO_ECF=1
   $HOME/course/test/t1.job0

.. warning::

   :code:`NO_ECF` can be used with any job script, regardless of whether it was generated by the Python API or by the ecFlow server itself. This makes it useful for testing and debugging, but it should **never** be used in a production environment.

**What to do:**

#. Run the job :code:`$HOME/course/test/t1.job0`, disabling the calls to :code:`ecflow_client`.
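
The local run described above can also be scripted, which may be convenient when several job files need to be checked. The following is a minimal sketch rather than part of ecFlow: the helper name :code:`run_job_locally` is hypothetical, and it assumes the job file is executable, as it must be when launched directly from the shell. It sets :code:`NO_ECF=1` only for the child process, leaving the calling environment untouched.

.. code-block:: python

   import os
   import subprocess
   from pathlib import Path


   def run_job_locally(job_path):
       """Run a job file with the calls to ecflow_client disabled.

       Hypothetical helper, not part of the ecFlow API.
       """
       env = dict(os.environ)
       env["NO_ECF"] = "1"  # same effect as `export NO_ECF=1`, but only for this run
       # Assumption: the job file has its execute bit set.
       return subprocess.run(
           [str(job_path)], env=env, capture_output=True, text=True
       )


   if __name__ == "__main__":
       # Job file generated by the job creation check in the previous section
       job = Path.home() / "course" / "test" / "t1.job0"
       result = run_job_locally(job)
       print(result.stdout)
       print("exit code:", result.returncode)

A non-zero exit code, or unexpected content in the captured output, points to a problem in the script itself rather than in the communication with the server.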