Database access

Marcel pipes streams of tuples between operators. Relational databases operate on tables, which are sets of tuples, and query results are streams of tuples. Because these models are so similar, database access from marcel is simple. Currently, marcel includes only the pyscopg2 driver for accessing Postgres databases.

For example, suppose you have a table recording process information over time:

create table process(time int, -- seconds since the epoch

pid int,

command varchar,

primary key(time, pid))

You can populate this table by piping the required data into the sql command:

timer 1: Generate a timestamp every 1 second.
args (| t: ... |): Bind each incoming timestamp to the pipeline parameter t.
ps: Generate a stream of Process objects.
map (p: ...): Map a Process p to a tuple comprising the timestamp t, p's pid, and p's command line.
sql ...: Insert incoming tuples into the process table.

This command produces no output, and will run as long as you let it. You can use Ctrl-C to kill the command. (Or use Ctrl-Z and the bg command to let it continue to run in the background.)

The sql command can also generate a stream of tuples. This command counts the number of python processes running at each timestamp:

M 0.18.3 jao@loon ~$ sql "select time, count(*) from process \
M +$ where command like '%%python%%' group by time"
(1698503820, 9)
(1698503721, 9)
(1698503830, 9)
(1698503723, 9)
(1698503831, 9)
(1698503829, 9)
(1698503832, 9)
(1698503821, 9)
(1698503722, 9)

Note that % has special significance in Python, for string formatting. To avoid that interpretation, Python allows escaping % with another %. I.e., the database system receives "... like '%python%' ...".