Pipelines

Marcel provides commands, called operators, which do the basic work of a shell. An operator takes a stream of data as input, and generates another stream as output. Operators can be combined by pipes, causing one operator's output to be the next operator's input. For example, this command uses the ls and map operators to list the names and sizes of files in the /home/jao directory:

ls /home/jao | map (f: (f, f.size))

  • The ls operator produces a stream of File objects, representing the contents of the /home/jao directory.

  • | is the symbol denoting a pipe, as in any Linux shell.

  • The pipe connects the output stream from ls to the input stream of the next operator, map.

  • The map operator applies a given function to each element of the input stream, and writes the output from the function to the output stream. The function is enclosed in parentheses. It is an ordinary Python function, except that the keyword lambda is optional. In this case, an incoming File is mapped to a tuple containing the file and the file's size.
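For readers who think in Python, the data flow of this command can be approximated with generators. This is only a rough sketch, not marcel's implementation: the helper names ls, map_op and names_and_sizes are hypothetical, pathlib.Path objects stand in for marcel's File objects, and the size is read via stat() rather than a size property.

```python
from pathlib import Path


def ls(directory):
    """Yield one Path per directory entry (a stand-in for marcel's File objects)."""
    yield from sorted(Path(directory).iterdir())


def map_op(function, stream):
    """Apply function to each item of the input stream and yield each result."""
    for item in stream:
        yield function(item)


def names_and_sizes(directory):
    """Rough analogue of: ls DIRECTORY | map (f: (f, f.size))"""
    return map_op(lambda f: (f, f.stat().st_size), ls(directory))
```

Each stage consumes the items of the previous stage one at a time, which is the role the pipe plays in the marcel command.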

A pipeline is a sequence of operators connected by pipes. Pipelines can be used directly on the command line, as above, and they have other uses in marcel. For instance, a pipeline can be assigned to a variable, essentially defining a new operator. Here is a pipeline, assigned to the variable recent, which selects Files modified within the past day:

recent = [select (file: now() - file.mtime < hours(24))]

  • The pipeline being defined is bracketed by [...]. (Without the brackets, marcel would attempt to evaluate the pipeline immediately, and then complain because the parameter file is not bound.)

  • The pipeline contains a single operator, select, which uses a function to define the items of interest. In this case, select operates on a File, bound to the parameter file.

  • now() is a function defined by marcel which gives the current time in seconds since the epoch (i.e., it is just time.time()).

  • File objects have an mtime property, providing the time of the last content modification.

  • hours() is another function defined by marcel, which simply maps hours to seconds, i.e., it multiplies by 3600.
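The arithmetic inside the select function can be spelled out in plain Python. The sketch below follows the descriptions above (now() as time.time(), hours() as a multiplication by 3600), but the definitions are written here for illustration and are not marcel's source. The select helper and the use of pathlib.Path with stat().st_mtime in place of File.mtime are assumptions.

```python
import time
from pathlib import Path


def now():
    """Current time in seconds since the epoch."""
    return time.time()


def hours(n):
    """Convert hours to seconds."""
    return n * 3600


def select(predicate, stream):
    """Yield only the items of the stream for which predicate is true."""
    for item in stream:
        if predicate(item):
            yield item


def recent(files):
    """Rough analogue of: select (file: now() - file.mtime < hours(24))"""
    return select(lambda file: now() - file.stat().st_mtime < hours(24), files)


# Possible usage (not executed here):
#     for path in recent(Path("some/directory").iterdir()):
#         print(path)
```

Both sides of the comparison are in seconds, so a file passes the test when its last modification is less than 86400 seconds old.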

This pipeline can be used in conjunction with any pipeline yielding files. For example, to locate the recently changed files in ~/git/myproject:

ls ~/git/myproject | recent
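The idea of storing a pipeline in a variable and piping into it later can be modeled in Python by deferring evaluation until the pipeline is run. The following is a toy model under stated assumptions, not how marcel is built: the Pipeline class, its run method, and the source, select and map_op helpers are all hypothetical, and integers are used instead of files to keep the example self-contained.

```python
class Pipeline:
    """A sequence of stream-transforming stages (hypothetical, not marcel's class)."""

    def __init__(self, *stages):
        self.stages = stages

    def __or__(self, other):
        # Piping two pipelines concatenates their stages; nothing runs yet.
        return Pipeline(*self.stages, *other.stages)

    def run(self, items=()):
        # Thread the stream through each stage in order, then collect the output.
        stream = iter(items)
        for stage in self.stages:
            stream = stage(stream)
        return list(stream)


def source(items):
    """A stage that ignores its input and produces the given items."""
    return Pipeline(lambda _: iter(items))


def select(predicate):
    """A stage that keeps only the items satisfying predicate."""
    return Pipeline(lambda stream: (x for x in stream if predicate(x)))


def map_op(function):
    """A stage that applies function to every item."""
    return Pipeline(lambda stream: (function(x) for x in stream))


# A stored pipeline, comparable in spirit to the variable recent above.
small = select(lambda n: n < 10)

# Used later as if it were an operator in a longer pipeline.
result = (source([3, 42, 7]) | small | map_op(lambda n: n * 2)).run()
print(result)  # [6, 14]
```

Because small holds stages rather than results, it can be reused with any upstream source, much as recent can follow any pipeline that yields files.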
