Using Zocalo

1: Simple Recipe

This tutorial will explain the structure and content of a simple recipe.

The aim is to demonstrate how to write a recipe to provide instructions to a service. The service we will be using is the Simple Service, included as part of python-zocalo-examples. Find the source code for that service here.

If you want to skip ahead, you can find the entire recipe here or at the bottom of the page.

JSON

To start with, all Zocalo recipes are currently JSON files. This can make them a little difficult to write by hand, but it makes them very simple for a machine to read. As we are trying to automate data analysis, this is quite useful!

The entire recipe is wrapped in one big JSON dictionary.

So, start your recipe by opening an editor, creating simple_service_recipe.json, and starting a blank dictionary:

{

}

This will give us plenty of space to write the rest of the recipe.

Note: JSON files can be written on one line, but it is good practice to use indentation and line breaks to make them more readable.

Steps

Each recipe is made up of one or more processing steps. The simple recipe has only one step; we will add more later.

Steps are recorded as entries in the dictionary and are dictionaries themselves. The keys are always numbers with speech marks around them, because JSON keys must be strings.

So let’s add our first step with an empty dictionary:

{
    "1": {
    }
}
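
If you would like to check this structure from Python, the standard json module can load the text and list the step keys. This is only a sketch for illustration: step_keys is a made-up helper and is not part of Zocalo.

```python
import json

# The recipe so far: one empty step with the key "1".
recipe_text = """
{
    "1": {
    }
}
"""


def step_keys(recipe):
    """Return the keys that look like step numbers (digits stored as strings)."""
    return [key for key in recipe if key.isdigit()]


recipe = json.loads(recipe_text)
print(step_keys(recipe))  # ['1']
```

Other top-level keys that are not numbers are ignored by this helper.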

Queues and parameters

Now, we can start detailing this processing step.

First, we must specify what queue we wish to send the recipe to.

Queue names should be uniquely attached to a particular service. Otherwise there is a strong chance that the wrong service will read the message!

In this case, we are using the simpleservice.submission queue. To find out which queue a service listens on, it is often necessary to check the source code of the service itself.

Add this to the step dictionary (we will recap how the whole file should be looking shortly):

"queue": "simpleservice.submission",

This brings us on to defining parameters for the service. These go in a parameters dictionary inside the step. The Simple Service takes the following parameters:

  • commands - list of commands to execute

  • workingdir - working directory

  • output_file - file in workingdir to record the command outputs in

To start with, we will just provide a basic echo command. Note that commands is a list, so even a single value must be surrounded by square brackets:

"commands": [
    "echo This is a command"
]

Specify your own working directory and filename, with the output expected in workingdir/output_file. The recipe should look like this:

{
    "1": {
        "queue": "simpleservice.submission",
        "parameters": {
            "commands": [
                "echo This is a command"
            ],
            "workingdir": "/output/folder",
            "output_file": "out.txt"
        }
    }
}
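
If hand-writing JSON feels error-prone, one option is to build the same dictionary in Python and let the json module handle the brackets and commas. The sketch below uses the placeholder directory and file name from this tutorial. The path join simply follows the description above (workingdir/output_file) and is not taken from the service's source.

```python
import json
from pathlib import PurePosixPath

# Placeholder values from the tutorial; use your own directory and file name.
recipe = {
    "1": {
        "queue": "simpleservice.submission",
        "parameters": {
            "commands": ["echo This is a command"],
            "workingdir": "/output/folder",
            "output_file": "out.txt",
        },
    }
}

# indent=4 gives the readable, one-key-per-line layout used in this tutorial.
recipe_text = json.dumps(recipe, indent=4)
print(recipe_text)

# Where the output is expected to appear, following workingdir/output_file.
parameters = recipe["1"]["parameters"]
output_path = PurePosixPath(parameters["workingdir"]) / parameters["output_file"]
print(output_path)  # /output/folder/out.txt
```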

Where to start?

Having specified the processing steps, we need to tell Zocalo what the first step is. This may seem obvious for our recipe but a more complex recipe can have many steps.

The start value of the recipe can trigger multiple steps to begin processing. This means you can choose whether to start many processing steps at once, if they all use the same raw data, or to adopt a linear approach, which is important when processing steps depend on each other.

It is also possible to provide some initial information during this step, which will be covered later. For the time being, send an empty list.

At the same level as the processing step, add:

"start": [
    [
        1,
        []
    ]
]

The total recipe should now look like:

{
    "1": {
        "queue": "simpleservice.submission",
        "parameters": {
            "commands": [
                "echo This is a command"
            ],
            "workingdir": "/output/folder",
            "output_file": "out.txt"
        }
    },
    "start": [
        [
            1,
            []
        ]
    ]
}
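
Notice that start refers to the step with the plain number 1, while the step itself is stored under the string key "1". The sketch below checks that every start entry has a matching step. missing_start_steps is a made-up helper for illustration and is not a Zocalo function.

```python
# A cut-down copy of the recipe; parameters are left empty for brevity.
recipe = {
    "1": {"queue": "simpleservice.submission", "parameters": {}},
    "start": [[1, []]],
}


def missing_start_steps(recipe):
    """List step numbers named in "start" that have no matching step entry."""
    missing = []
    for step, _initial_information in recipe.get("start", []):
        # Step keys are strings, so convert the number before looking it up.
        if str(step) not in recipe:
            missing.append(step)
    return missing


print(missing_start_steps(recipe))  # []
```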

It is very easy to make a mistake when writing JSON by hand. Use the workflows.validate_recipe tool to check that the recipe will work for Zocalo.

workflows.validate_recipe /path/to/my/recipe.json

If there is an error, amend it. Pay close attention to commas!
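
For pure syntax slips such as a stray comma, Python's json module can also point at the offending line. This sketch only checks that the text is valid JSON. It knows nothing about Zocalo, so it complements the validation tool rather than replacing it. json_syntax_error is a made-up helper name.

```python
import json


def json_syntax_error(text):
    """Return a short description of the first JSON syntax error, or None."""
    try:
        json.loads(text)
    except json.JSONDecodeError as error:
        return f"line {error.lineno}, column {error.colno}: {error.msg}"
    return None


good = '{"start": [[1, []]]}'
bad = '{"commands": ["echo one", "echo two",]}'  # trailing comma before ]

print(json_syntax_error(good))  # None
print(json_syntax_error(bad))   # a message locating the problem
```

The exact wording of the message depends on your Python version.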

Actually running the recipe!

To run the recipe, we need to start some services!

Open three terminals and make sure that they are all in an environment which has the zocalo commands available.

In the first terminal, start a Dispatcher service in the test space. This reads recipes, adds information if necessary and then puts the message on the correct queue.

$ zocalo.service --test -s Dispatcher -v
Started service: Dispatcher
Service successfully connected to transport layer
Dispatcher starting
Logbook disabled: Not running in live mode
Starting queue listener thread
Queue listener thread started

In the second terminal, start a SimpleService in the test space. This is the service which will actually execute our commands.

$ zocalo.service --test -s SimpleService -v
Started service: Simple Service
Service successfully connected to transport layer
Simple Service starting
Starting queue listener thread
Queue listener thread started

And finally, in the third terminal, send the recipe in the test space. The “-f” option lets you point to the location where you saved the recipe.

$ zocalo.go --test -f zocalo_examples/recipes/simple_service_recipe.json 1234
Running recipe from file zocalo_examples/recipes/simple_service_recipe.json
for data collection 1234

Submitted.

Now if you read your output file, you should see:

This is a command

Congratulations!! You have just written and executed your first recipe!

2: More Commands and Parameters

This tutorial will be relatively brief as it builds directly on top of the previous tutorial.

To recap, we asked the Simple Service to execute an echo command and write the output to a file.

But many processing steps require more than one command.

So, let’s add some more!

More echo

Copy and paste the echo line a few times, with some changes so you can be sure you are having an effect!

The commands section of your recipe might now look something like this:

"commands": [
    "echo This is a command",
    "echo This is another command",
    "echo This is a third command"
],

It is far too easy to make a mistake when adding or removing lines and sections within a recipe, so make sure to validate it:

workflows.validate_recipe /path/to/my/recipe.json

Using the instructions from the previous tutorial, run up a Dispatcher and Simple Service and send the recipe off for processing!

Now the output file will have all your text:

This is a command
This is another command
This is a third command

Being useful

We can use this to provide some useful information in the output file. For example, let’s add the date and time at which the recipe was executed.

Most Unix machines have a useful command for that:

"commands": [
    "date",
    "echo This is a command",
    "echo This is another command",
    "echo This is a third command"
]

Note: Because the Simple Service executes commands from the command line, you can try them in your own terminal and check that they output what you want before adding them to the recipe.

Now our output is a bit more useful when we get around to looking at it in a few weeks' time:

Mon 29 Jul 16:57:27 BST 2019
This is a command
This is another command
This is a third command

Specifying variables

Sometimes a value used in the recipe is only known at runtime, so there is a way to pass it into the recipe.

Variables can be specified with curly braces - {} - and allow for substitution from the command line.

For example, let's add a command to repeat our input:

"echo input: {input}"

If you send this recipe off normally you will be quite disappointed, because no value has been given for {input}. However, the “-s” command line option allows us to set this value:

zocalo.go --test -f /path/to/my/recipe.json 1234 -s input="From the command line"

The {input} is substituted at runtime and the value will be written to the output file:

This is a command
This is another command
This is a third command
input: From the command line

In fact, the DCID is just a special form of this command line substitution which can be accessed with {ispyb_dcid}:

"echo DCID: {ispyb_dcid}"

Gives:

input: From the command line
DCID: 1234
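
To picture what the substitution does, here is a small Python sketch that replaces {name} placeholders from a dictionary of values. It is an illustration only and not Zocalo's own implementation. substitute is a made-up helper, and leaving unknown placeholders unchanged is a choice made for this sketch rather than documented Zocalo behaviour.

```python
import re


def substitute(text, values):
    """Replace each {name} in text with values[name]; leave unknown names as-is."""

    def lookup(match):
        name = match.group(1)
        # Fall back to the original "{name}" text when no value is available.
        return str(values.get(name, match.group(0)))

    return re.sub(r"\{(\w+)\}", lookup, text)


commands = ["echo input: {input}", "echo DCID: {ispyb_dcid}"]
values = {"input": "From the command line", "ispyb_dcid": 1234}

for command in commands:
    print(substitute(command, values))
# echo input: From the command line
# echo DCID: 1234
```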

Putting it all together

The final recipe as specified here looks like this:

{
    "1": {
        "queue": "simpleservice.submission",
        "parameters": {
            "commands": [
                "date",
                "echo This is a command",
                "echo This is another command",
                "echo This is a third command",
                "echo input: {input}",
                "echo DCID: {ispyb_dcid}"
            ],
            "workingdir": "/output/folder",
            "output_file": "out.txt"
        }
    },
    "start": [
        [
            1,
            []
        ]
    ]
}

and is found here.

However, you should experiment a bit to see what you can do.

Substitutions can occur anywhere in the recipe, even in the parameters!

Read on to find out how to go from one processing step, which we have covered here, to many steps which occur in a specified order!

3: Adding Steps

Now that we’ve got the hang of using Zocalo and running multiple commands, you might be wondering what happens if you want to use more than one service.

Or how to send the results of one processing step to the next one?

We’ll get there, but first let’s focus on doing more than one processing step.

For the moment, we’ll just use the Simple Service but it should be straightforward to change this in the future to use more than one service.

See the final recipe here.

Add another step

Take the recipe which we developed above, copy the step and paste it on the same level. Change the number of the step from 1 to 2.

There should now be two similar steps (the start entry is unchanged and is not shown here):

{
    "1": {
        "queue": "simpleservice.submission",
        "parameters": {
            "commands": [
                "echo This is a command"
            ],
            "workingdir": "/output/folder",
            "output_file": "out.txt"
        }
    },
    "2": {
        "queue": "simpleservice.submission",
        "parameters": {
            "commands": [
                "echo This is a command"
            ],
            "workingdir": "/output/folder",
            "output_file": "out.txt"
        }
    }
}

Change the message in the second step to something different so that when it runs you can tell the difference from the previous recipe.

Now, at the end of the first step, on the same level as the queue and parameters information, add another key called output. Remember to add a comma after the closing brace of parameters:

        "output_file": "out.txt"
    },
    "output":

and point it to the second step:

"output": 2

This is how to define the next step in Zocalo.

The output key can take a single value, like this, or a list or dictionary.

Note that the 2 points to the “2” which defines the second step. If the second step in your recipe has a different number, or you want to skip some steps, just make sure the number specified in the output matches the step you want to run.

When running this recipe, watch the Simple Service terminal: it will execute twice, once for the first step and once for the second!
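
The way output links one step to the next can be pictured with a short Python sketch that starts from the start entries and follows each output pointer. It only traces the routing and does not send any messages. next_steps and run_order are made-up helpers, and the dictionary form of output is deliberately left out.

```python
def next_steps(step):
    """Return the step numbers named by a step's "output" key as a list."""
    output = step.get("output", [])
    if isinstance(output, int):
        return [output]
    if isinstance(output, list):
        return list(output)
    raise TypeError("this sketch only handles a single number or a list")


def run_order(recipe):
    """Walk the recipe from its start entries and list steps in the order reached."""
    pending = [step for step, _initial_information in recipe["start"]]
    order = []
    while pending:
        number = pending.pop(0)
        if number in order:
            continue  # already visited; avoids looping for ever
        order.append(number)
        # Step keys are strings, so convert the number before looking it up.
        pending.extend(next_steps(recipe[str(number)]))
    return order


recipe = {
    "1": {"queue": "simpleservice.submission", "output": 2},
    "2": {"queue": "simpleservice.submission"},
    "start": [[1, []]],
}
print(run_order(recipe))  # [1, 2]
```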