Pesto language specification

Pesto is a text-based, human-editable, and machine-transformable cooking recipe interchange format.

module Codec.Pesto where

1 About this document

This section contains various information about this document. The second section motivates why inventing another file format is necessary, followed by the goals of Pesto. After a short Pesto primer for the casual user, the language’s syntax and semantics are presented. The linting section limits the language to useful cooking recipes. Examples for user presentation of recipes and serialization follow.

Being a literate program, this document is specification and reference implementation simultaneously. The code is written in Haskell and uses the parsec parser combinator library, as well as HUnit for unit tests. Even without knowing Haskell’s syntax, you should be able to understand this specification. There’s a description above every code snippet explaining what is going on.

The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119.

Version: 1
License: CC0
Website: https://6xq.net/pesto/
Source code: https://codeberg.org/ldb/pesto

2 Motivation

The landscape of recipe interchange formats is quite fragmented. First, there’s HTML microdata: Google rich snippets, equivalent to the schema.org microdata vocabulary, are widely used by commercial recipe sites. Although the main objective of microdata is to make content machine-readable most sites will probably use it because it is considered a search-engine optimization (SEO) and not a method for sharing recipes. h-recipe provides a second vocabulary that has not been adopted widely yet.

Most cooking-related software comes with custom recipe file formats. Some of them can be imported by other programs. Meal-Master is one of these widely supported formats. A vast trove of recipe files is available in this format. There does not seem to be any official documentation for the format, but inofficial ABNF grammar and format description exist. A Meal-Master recipe template might look like this:

---------- Recipe via Meal-Master (tm)

      Title: <Title>
 Categories: <Categories>
      Yield: <N servings>

    <N> <unit> <ingredient>
    …

-------------------------------<Section name>-----------------------------
  <More ingredients>

  <Instructions>

-----

Rezkonv aims to improve the Mealmaster format by lifting some of its character limits, adding new syntax, and translating it to german. However, the specification is available on request only.

A second format some programs can import is MasterCook’s MXP file format and its XML-based successor, MX2. Beyond that there exist numerous application-specific, proprietary formats:

Living Cookbook: Uses a XML-based format called fdx version 1.1. There’s no specification to be found, but a few examples are available, and those are dated 2006.
My CookBook: Uses the file extension .mcb. A specification is available.
KRecipes: Uses its own export format. However, there is no documentation whatsoever.
Gourmet: The program’s export format suffers from the same problem. The only document available is the DTD.
CookML: Last updated in 2006 (version 1.0.4) for the german-language shareware program Kalorio has a custom and restrictive license that requires attribution and forbids derivate works.
Paprika: Cross-platform application which supports its own “emailed recipe format” and a simple YAML-based format.

Between 2002 and 2005, a bunch of XML-based exchange formats were created. They are not tied to specific software, so none of them seems to be actively used nowadays:

RecipeML: Formerly known as DESSERT and released in 2002 (version 0.5). The license requires attribution and – at the same time – forbids using the name RecipeML for promotion without written permission.
eatdrinkfeelgood: Version 1.1 was released in 2002 as well, but the site is not online anymore. The DTD is licensed under the CC by-sa license.
REML: Released in 2005 (version 0.5), aims to improve support for commercial uses (restaurant menus and cookbooks). The XSD’s license allows free use and redistribution, but the reference implementation has no licensing information.
RecipeBook XML: Released 2005 as well and shared under the terms of CC by-sa is not available on the web anymore.

Finally, a few non-XML or obscure exchange formats have been created in the past: YumML is an approach similar to those listed above but based on YAML instead of XML. The specification has been removed from the web and is available through the Web Archive only.

Cordon Bleu (1999) encodes recipes as programs for a cooking machine and defines a Pascal-like language. Being so close to real programming languages, Cordon Bleu is barely usable by anyone except programmers. Additionally, the language is poorly-designed since its syntax is inconsistent, and the user is limited to a set of predefined functions.

Finally, there is RxOL, created in 1985. It constructs a graph from recipes written down with a few operators and postfix notation, and does not separate ingredients and cooking instructions like every other syntax introduced before. Although Pesto is not a direct descendant of RxOL both share many ideas.

microformats.org has a similar list of recipe interchange formats.

3 Goals

First of all, recipes are written by humans for humans. Thus a human-readable recipe interchange format is not enough. The recipes need to be human-editable without guidance like a GUI or assistant. That’s why, for instance, XML is unsuitable, and the interchange formats listed above have largely failed to gain traction. Even though simple, XML is still too complicated for the ordinary user. Instead, a format needs to be as simple as possible, with as little markup as possible. A human editor must be able to remember the entire syntax. This works best if the file contents “make sense.” An excellent example of this is Markdown.

We also must acknowledge that machines play an important role in our daily lives. They can help us, the users, accomplish our goals if they can also understand the recipes. Thus they, too, need to be able to read and write recipes. Again, designing a machine-readable format is not enough. Recipes must be machine-transformable. A computer program should be able to create a new recipe from two existing ones, look up the ingredients and tell us how many joules one piece of that cake will have. And so on.

That being said, Pesto does not aim to carry additional information about ingredients or recipes themselves. Nutrition data for each ingredient should be maintained in a separate database. Due to its minimal syntax, Pesto is also not suitable for extensive guides on cooking or the usual chitchat found in cooking books.

4 Introduction by example

So let’s start by introducing Pesto by example. This text does not belong
to the recipe and is ignored by any software. The following line starts the
recipe:

%pesto

&pot
+1 l water
+salt
[boil]

+100 g penne
&10 min
[cook]

>1 serving pasta
(language: en)

And that’s how you make pasta: Boil one liter of water in a pot with a little bit of salt. Then add 100 g penne, cook them for ten minutes, and you get one serving pasta. That’s all.

There’s more syntax available to express alternatives (either penne or tagliatelle), ranges (1–2 l water or approximately 1 liter water), and metadata. But now you can have a first peek at my recipe collection.

5 Language syntax

module Codec.Pesto.Parse (
	  parse
	, test
	, Instruction(..)
	, Quantity(..)
	, Unit
	, Object
	, Approximately(..)
	, Amount(..)
	, isResult
	, isReference
	, isAlternative
	, isAnnotation
	, isAction
	, isDirective
	, isUnknown
	, spaces1
	, notspace
	) where
import Data.Char (isSpace)
import Data.Ratio ((%))
import Text.Parsec hiding (parse)
import Text.ParserCombinators.Parsec.Pos (newPos)
import Text.ParserCombinators.Parsec.Error (newErrorUnknown)
import Test.HUnit hiding (test)

import Codec.Pesto.Serialize (serialize)

Pesto parses UTF-8 encoded input data consisting of space-delimited token. Every character within the Unicode whitespace class is considered a space.

stream = ((,) <$> getPosition <*> instruction) `sepEndBy` spaces1
	<?> "stream"
spaces1 = many1 space

The following instructions are supported:

data Instruction =
	  Annotation String
	| Ingredient Quantity
	| Tool Quantity
	| Action String
	| Reference Quantity
	| Result Quantity
	| Alternative Quantity
	| Directive String
	| Unknown String
	deriving (Show, Eq)

instruction =
	    try annotation
	<|> try ingredient
	<|> try tool
	<|> try action
	<|> try result
	<|> try alternative
	<|> try reference
	<|> try directive
	<|> try unknown
	<?> "instruction"

The pesto grammar has two instruction types: The first one begins with a start symbol (start) and consumes any character up to and including a terminating symbol (end), which can be escaped with a backslash (\).

betweenEscaped :: Char -> Char -> Parsec String () String
betweenEscaped start end =
	   char start
	*> many (try (char '\\' *> char end) <|> satisfy (/= end))
	<* char end

Annotations and actions both are of this kind:

annotation = Annotation <$> betweenEscaped '(' ')'
action = Action <$> betweenEscaped '[' ']'

Here are examples for both:

testOpterm = [cmpInstruction "(skinless\nboneless)" (Right (Annotation "skinless\nboneless"))
	, cmpInstruction "[stir together]" (Right (Action "stir together"))
	, cmpInstruction "[stir\\]together]" (Right (Action "stir]together"))
	, cmpInstruction "[stir [together]" (Right (Action "stir [together"))]

The second one starts with one identifying character, ignores the following whitespace characters, and then consumes a Quantity.

oparg :: Char -> Parsec String () Instruction -> Parsec String () Instruction
oparg ident cont = char ident *> spaces *> cont

ingredient = oparg '+' (Ingredient <$> quantity)
tool = oparg '&' (Tool <$> quantity)
result = oparg '>' (Result <$> quantity)
alternative = oparg '|' (Alternative <$> quantity)
reference = oparg '*' (Reference <$> quantity)

Additionally, there are two special instructions. Directives are similar to the previous instructions but consume a quoted string (qstr).

directive = oparg '%' (Directive <$> qstr)

Unknown instructions are the fallthrough-case and accept anything. They must not be discarded at this point. The point of accepting anything is to fail as late as possible while processing input. This gives the parser a chance to print helpful messages that provide additional aid to the user, who can then fix the problem.

unknown = Unknown <$> many1 notspace

Below are examples for these instructions:

testOparg = [
	  cmpInstruction "+100 g flour"
	      (Right (Ingredient (Quantity (Exact (AmountRatio (100%1))) "g" "flour")))
	, cmpInstruction "&oven"
	      (Right (Tool (strQuantity "oven")))
	, cmpInstruction ">dough" (Right (Result (strQuantity "dough")))
	, cmpInstruction "|trimmings" (Right (Alternative (strQuantity "trimmings")))
	, cmpInstruction "*fish"
	      (Right (Reference (strQuantity "fish")))
	, cmpInstruction3 "% invalid" (Right (Directive "invalid")) "%invalid"
	, cmpInstruction3 "* \t\n 1 _ cheese"
	      (Right (Reference (Quantity (Exact (AmountRatio (1%1))) "" "cheese")))
	      "*1 _ cheese"
	, cmpInstruction3 "!invalid" (Right (Unknown "!invalid")) "!invalid"
	]

5.1 Qstr

Before introducing quantities we need to have a look at qstr, which is used by them. A qstr, short for quoted string, can be a string enclosed in double quotes, a single word or the underscore character that represents the empty string.

qstr = try (betweenEscaped '"' '"')
	<|> word
	<|> char '_' *> return ""

A word always starts with a letter, followed by any number of non-space characters.

word = (:) <$> letter <*> many notspace
notspace = satisfy (not . isSpace)

The empty string can be represented by two double quotes or the underscore, but not the empty string itself.

testQstr = [
	  cmpQstr3 "\"\"" (Right "") "_"
	, cmpQstr "_" (Right "")
	, cmpQstr "" parseError

Any Unicode character with a General_Category major class L (i.e., a letter, see Unicode standard section 4.5 for example) is accepted as first character of a word. That includes german umlauts as well as greek or arabic script. Numbers, separators, punctuation, and others are not permitted.

	, cmpQstr "water" (Right "water")
	, cmpQstr "Äpfel" (Right "Äpfel")
	, cmpQstr "τυρί" (Right "τυρί")
	, cmpQstr "جبن" (Right "جبن")
	, cmpQstr "1sugar" parseError
	, cmpQstr "+milk" parseError
	, cmpQstr "∀onion" parseError

The remaining letters of a word can be any character, including symbols, numbers, …

	, cmpQstr "rump-roast" (Right "rump-roast")
	, cmpQstr "v1negar" (Right "v1negar")
	, cmpQstr "mush\"rooms" (Right "mush\"rooms")

…but not spaces.

	, cmpQstr " tomatoes" parseError
	, cmpQstr "tomatoes " parseError
	, cmpQstr "lemon juice" parseError
	, cmpQstr "sour\tcream" parseError
	, cmpQstr "white\nwine" parseError

If a string contains spaces or starts with a special character, it must be enclosed in double quotes.

	, cmpQstr3 "\"salt\"" (Right "salt") "salt"
	, cmpQstr "\"+milk\"" (Right "+milk")
	, cmpQstr "\"soy sauce\"" (Right "soy sauce")
	, cmpQstr "\"1sugar\"" (Right "1sugar")
	, cmpQstr "\"chicken\tbreast\nmeat\"" (Right "chicken\tbreast\nmeat")

Doublequotes within a string can be quoted by prepending a backslash. However, the usual escape codes like \n, \t, … will not be expanded.

	, cmpQstr "\"vine\"gar\"" parseError
	, cmpQstr3 "\"vine\\\"gar\"" (Right "vine\"gar") "vine\"gar"
	, cmpQstr "\"oli\\ve oil\"" (Right "oli\\ve oil")
	, cmpQstr "\"oli\\\\\"ve oil\"" (Right "oli\\\"ve oil")
	, cmpQstr3 "\"sal\\tmon\"" (Right "sal\\tmon") "sal\\tmon"
	]

5.2 Quantity

A Quantity is a triple of Approximately, Unit and Object as parameter.

data Quantity = Quantity Approximately Unit Object deriving (Show, Eq)

The syntactic construct is overloaded and accepts one to three arguments. If just one is given, it is assumed to be the Object and Approximately and Unit are empty. Two arguments set Approximately and Unit, which is convenient when the unit implies the object (minutes usually refer to the object time, for example).

quantity = try quantityA <|> quantityB

quantityA = Quantity
	<$> approximately
	<*  spaces1
	<*> unit
	<*> (try (spaces1 *> object) <|> return "")

quantityB = Quantity
	<$> return (Exact (AmountStr ""))
	<*> return ""
	<*> object

testQuantityOverloaded = [
	  cmpQuantity "oven" (exactQuantity (AmountStr "") "" "oven")
	, cmpQuantity "10 min" (exactQuantity (AmountRatio (10%1)) "min" "")
	, cmpQuantity "100 g flour" (exactQuantity (AmountRatio (100%1)) "g" "flour")

The first two are equivalent to

	, cmpQuantity3 "_ _ oven" (exactQuantity (AmountStr "") "" "oven") "oven"
	, cmpQuantity3 "10 min _" (exactQuantity (AmountRatio (10%1)) "min" "") "10 min"

Missing units must not be omitted. The version with underscore should be preferred.

	, cmpQuantity3 "1 \"\" meal" (exactQuantity (AmountRatio (1%1)) "" "meal") "1 _ meal"
	, cmpQuantity "1 _ meal" (exactQuantity (AmountRatio (1%1)) "" "meal")
	]

Units and objects are just strings. However, units should be limited to well-known metric units.

type Unit = String
unit = qstr

type Object = String
object = qstr

Approximately is a wrapper for ranges, that is, two amounts separated by a dash, approximate amounts, prepended with a tilde, and exact amounts without a modifier.

data Approximately =
	  Range Amount Amount
	| Approx Amount
	| Exact Amount
	deriving (Show, Eq)

approximately = try range <|> try approx <|> exact
range = Range <$> amount <*> (char '-' *> amount)
approx = Approx <$> (char '~' *> amount)
exact = Exact <$> amount

testQuantityApprox = [
	  cmpQuantity "1-2 _ bananas" (Right (Quantity (Range (AmountRatio (1%1)) (AmountRatio (2%1))) "" "bananas"))
	, cmpQuantity "1 - 2 _ bananas" parseError
	, cmpQuantity "1- 2 _ bananas" parseError
	, cmpQuantity "1 -2 _ bananas" parseError
	, cmpQuantity "~2 _ bananas" (Right (Quantity (Approx (AmountRatio (2%1))) "" "bananas"))
	, cmpQuantity "~ 2 _ bananas" parseError
	]

Amounts are limited to rational numbers and strings. There are no real numbers by design, and implementations should avoid representing rational numbers as floating point numbers. They are not required and introduce ugly corner cases when rounding while converting units, for example.

data Amount =
	  AmountRatio Rational
	| AmountStr String
	deriving (Show, Eq)

amount = try ratio <|> AmountStr <$> qstr

testQuantityAmount = [
	  cmpQuantity "some _ pepper" (exactQuantity (AmountStr "some") "" "pepper")
	, cmpQuantity3 "\"some\"-\"a few\" _ bananas" (Right (Quantity (Range (AmountStr "some") (AmountStr "a few")) "" "bananas")) "some-\"a few\" _ bananas"
	, cmpQuantity "~\"the stars in your eyes\" _ bananas" (Right (Quantity (Approx (AmountStr "the stars in your eyes")) "" "bananas"))
	]

Rational numbers can be an integral, numerator, and denominator, each separated by a forward slash, just the numerator and denominator, again separated by a forward slash, or just a numerator with the default denominator 1 (i.e., ordinary integral number).

ratio = let toRatio i num denom = AmountRatio ((i*denom+num)%denom) in
	    try (toRatio <$> int <*> (char '/' *> int) <*> (char '/' *> int))
	<|> try (toRatio <$> return 0 <*> int <*> (char '/' *> int))
	<|> try (toRatio <$> return 0 <*> int <*> return 1)

The following representations are all equal with the first one being the preferred one:

testQuantityRatio = [
	  cmpQuantity "3 _ bananas" (exactQuantity (AmountRatio (3%1)) "" "bananas")
	, cmpQuantity3 "3/1 _ bananas" (exactQuantity (AmountRatio (3%1)) "" "bananas")
	               "3 _ bananas"
	, cmpQuantity3 "3/0/1 _ bananas" (exactQuantity (AmountRatio (3%1)) "" "bananas")
	               "3 _ bananas"

Two numbers are numerator and denominator:

	, cmpQuantity "3/5 _ bananas" (exactQuantity (AmountRatio (3%5)) "" "bananas")

Three numbers add an integral part:

	, cmpQuantity "3/5/7 _ bananas" (exactQuantity (AmountRatio ((3*7+5)%7)) "" "bananas")
	, cmpQuantity3 "10/3 _ bananas" (exactQuantity (AmountRatio (10%3)) "" "bananas")
	               "3/1/3 _ bananas"

Rational numbers can be used in ranges and mixed with strings too.

	, cmpQuantity "1-\"a few\" _ bananas" (Right (Quantity
			(Range (AmountRatio (1%1)) (AmountStr "a few")) "" "bananas"))
	, cmpQuantity "1/1/2-2 _ bananas" (Right (Quantity
			(Range (AmountRatio (3%2)) (AmountRatio (4%2))) "" "bananas"))
	, cmpQuantity "~1/1/2 _ bananas" (Right (Quantity
			(Approx (AmountRatio (3%2))) "" "bananas"))
	]

5.3 Appendix

Parser main entry point.

parse = runParser stream () ""
int = read <$> many1 digit

Test helpers:

isLeft (Left _) = True
isLeft _ = False

A generic parser error:

parseError = Left (newErrorUnknown (newPos "" 0 0))

Compare output of parser f for string str with expected. The expected result can be a parser error, which matches any actual parse error (first case).

cmpParser f str (Left _) = TestCase $ assertBool str $ isLeft $ runParser (f <* eof) () "" str
cmpParser f str expected = str ~: runParser (f <* eof) () "" str ~?= expected

cmpParseSerialize f str expectp@(Left _) _ = [cmpParser f str expectp]
cmpParseSerialize f str expectp@(Right expectpval) expects = [
	  cmpParser f str expectp
	, serialize expectpval ~?= expects]

Wrap qstr test in AmountStr to aid serialization test

cmpQstr input expectp = cmpQstr3 input expectp input
cmpQstr3 input (Left expect) _ = [cmpParser (AmountStr <$> qstr) input (Left expect)]
cmpQstr3 input (Right expect) expects = cmpParseSerialize (AmountStr <$> qstr) input (Right (AmountStr expect)) expects

cmpQuantity a b = cmpQuantity3 a b a
cmpQuantity3 = cmpParseSerialize quantity

cmpInstruction a b = cmpInstruction3 a b a
cmpInstruction3 = cmpParseSerialize instruction

exactQuantity a b c = Right (Quantity (Exact a) b c)
strQuantity = Quantity (Exact (AmountStr "")) ""

test = [
	  "quantity" ~: testQuantityOverloaded
			++ testQuantityApprox
			++ testQuantityAmount
			++ testQuantityRatio
	, "qstr" ~: testQstr
	, "oparg" ~: testOparg
	, "opterm" ~: testOpterm
	]

isResult (Result _) = True
isResult _ = False
isReference (Reference _) = True
isReference _ = False
isAlternative (Alternative _) = True
isAlternative _ = False
isAnnotation (Annotation _) = True
isAnnotation _ = False
isAction (Action _) = True
isAction _ = False
isDirective (Directive _) = True
isDirective _ = False
isUnknown (Unknown _) = True
isUnknown _ = False

6 Language semantics

module Codec.Pesto.Graph (
	  toGraph
	, walkRoot
	, outgoingEdges
	, outgoingNodes
	, incomingEdges
	, incomingNodes
	, firstNodeId
	, resolveReferences
	, test
	, extract
	, NodeId
	, Node
	, Nodes
	, Edge
	, Edges
	) where
import Data.Char (toLower)
import Data.List (sort, nub)
import Test.HUnit hiding (test, Node)

import Codec.Pesto.Parse hiding (test)

The parser’s output, a stream of instructions, may contain multiple recipes. A recipe must start with the directive “pesto” and may end with “buonappetito”, which allows embedding recipes into other (plain-text) documents. This function extracts all recipes from the stream and removes both directives.

startDirective = Directive "pesto"
endDirective = Directive "buonappetito"

extract [] = []
extract (s:stream) | s == startDirective = between:extract next
	where
		isEnd x | x `elem` [startDirective, endDirective] = True
		isEnd _ = False
		(between, next) = break isEnd stream
extract (_:xs) = extract xs

Start and end directives are removed from the extracted instructions. The directive “buonappetito” is optional at the end of a stream.

testExtract = [
	  extract [startDirective, endDirective] ~?= [[]]
	, extract [startDirective, Action "foobar", endDirective] ~?= [[Action "foobar"]]
	, extract [startDirective] ~?= [[]]
	, extract [startDirective, Directive "foobar"] ~?= [[Directive "foobar"]]

Instructions surrounding the start and end directive are removed.

	, extract [Unknown "Something", startDirective] ~?= [[]]
	, extract [Unknown "Something", Action "pour", startDirective] ~?= [[]]
	, extract [startDirective, endDirective, Annotation "something"] ~?= [[]]

The stream may contain multiple recipes. The start directive also ends the previous recipe and starts a new one.

	, extract [
		  startDirective
		, Action "pour"
		, endDirective
		, Action "foobar"
		, startDirective
		, Annotation "something"]
		~?= [[Action "pour"], [Annotation "something"]]
	, extract [
		  startDirective
		, Action "heat"
		, startDirective
		, Annotation "something"]
		~?= [[Action "heat"], [Annotation "something"]]
	, extract [
		  startDirective
		, Annotation "foobar"
		, startDirective
		, endDirective]
		~?= [[Annotation "foobar"], []]
	]

Each recipe’s stream of instructions drives a stack-based machine that transforms it into a directed graph. Think of the stack as your kitchen’s workspace that is used to prepare the food’s components. You can add new ingredients, perform actions on them, put them aside, and add them again.

This function processes a list of nodes, that is, instructions uniquely identified by an integer, and returns the edges of the directed graph as a list of tuples.

toGraph :: Nodes Instruction -> Edges
toGraph nodes = third $ foldl f (Nothing, [[]], []) nodes
	where

Ingredients are simply added to the current workspace. They should, for example, appear on the shopping list.

		f ctx (i, Ingredient _) = addToStack ctx i

The same happens for tools. However, they are not part of the final product but are used in the process of making it. Thus, they do not appear on the shopping list. Time is considered a tool.

		f ctx (i, Tool _) = addToStack ctx i

Actions take all ingredients and tools currently on the workspace, perform some action with them, and put the product back onto the workspace.

		f (_, stack:sx, edges) (i, Action _) = (Just i, [i]:stack:sx, edgesTo i stack ++ edges)
		f (_, [], _) (_, Action _) = undefined -- never reached

Results add a label to the current workspace’s contents and move them out of the way. It should be a meaningful name, not just A and B. Consecutive results add different labels to the same workspace. That’s useful when an action yields multiple results at once that are processed in different ways.

		f ctx (i, Result _) = consumeStack ctx i

Alternatives, too, add a label to the current workspace’s content, but they pick one of the things on the workspace and throw everything else away. This allows adding optional or equivalent ingredients to a recipe (i.e., margarine or butter).

		f ctx (i, Alternative _) = consumeStack ctx i

References are similar to ingredients. They are used to add items from a workspace labeled with Result or Alternative. More on that in the next section.

		f ctx (i, Reference _) = addToStack ctx i

Annotations add a description to any of the previous instructions. They can be used to provide more information about ingredients (so “hot water” becomes +water (hot), tools (&oven (200 °C)), or actions ([cook] (until brown)).

		f ctx@(Nothing, _, _) (_, Annotation _) = ctx
		f (Just prev, s, edges) (i, Annotation _) = (Just prev, s, (i, prev):edges)

Unused directives or unknown instructions are danging nodes with no connection to other nodes.

		f ctx (_, Directive _) = ctx
		f ctx (_, Unknown _) = ctx

These are helper functions:

		addToStack (_, stack:sx, edges) i = (Just i, (i:stack):sx, edges)
		addToStack (_, [], _) _ = undefined -- never reached
		consumeStack (_, s, edges) i =
			let
				stack = dropWhile null s
				(top:sx) = if null stack then [[]] else stack
			in (Just i, []:top:sx, edgesTo i top ++ edges)
		edgesTo i = map (\x -> (x, i))

Here are a few examples of how this stack-machine works. Each edge is a tuple of two integer numbers. These are the nodes it connects, starting with zero. Ingredient, Tool, and Reference itself do not create any edges:

testGraph = [
	  cmpGraph "+ketchup &spoon *foobar" []

But Action, Alternative and Result do in combination with them:

	, cmpGraph "+foobar [barbaz]" [(0, 1)]
	, cmpGraph "+foobar |barbaz" [(0, 1)]
	, cmpGraph "+foobar >barbaz" [(0, 1)]
	, cmpGraph "+foobar +B >barbaz" [(0, 2), (1, 2)]
	, cmpGraph "+foobar >barbaz +foobar >barbaz" [(0, 1), (2, 3)]
	, cmpGraph "+foobar [barbaz] +foobar >barbaz" [(0, 1), (1, 3), (2, 3)]
	, cmpGraph "&foobar [barbaz] [C] >D" [(0, 1), (1, 2), (2, 3)]

If the stack is empty, i.e. it was cleared by a Result or Alternative instruction, consecutive results or alternatives operate on the previous, non-empty stack.

	, cmpGraph "+foobar >barbaz >C" [(0, 1), (0, 2)]
	, cmpGraph "+foobar >barbaz >C >D" [(0, 1), (0, 2), (0, 3)]
	, cmpGraph "+foobar |barbaz |C" [(0, 1), (0, 2)]
	, cmpGraph "+foobar >barbaz |C" [(0, 1), (0, 2)]

Unless that stack too is empty. Then they do nothing:

	, cmpGraph ">foobar >foobar" []
	, cmpGraph "|foobar |foobar" []
	, cmpGraph "(foobar) (foobar)" []
	, cmpGraph "[foobar]" []

The Annotation instruction always creates an edge to the most-recently processed node that was not an annotation. Thus two consecutive annotations create edges to the same node.

	, cmpGraph "+foobar (barbaz)" [(1, 0)]
	, cmpGraph "+foobar (barbaz) (C)" [(1, 0), (2, 0)]
	, cmpGraph "+foobar (barbaz) >barbaz" [(1, 0), (0, 2)]
	, cmpGraph "+foobar >barbaz (C)" [(0, 1), (2, 1)]
	, cmpGraph "+foobar |barbaz (C)" [(0, 1), (2, 1)]
	, cmpGraph "*foobar (C)" [(1, 0)]

Unknown directives or instructions are never connected to other nodes.

	, cmpGraph "%invalid" []
	, cmpGraph "invalid" []
	]

6.1 References

Results and alternatives can be referenced with the Reference instruction. Resolving these references does not happen while building the graph but afterward. This allows referencing a result or alternative before its definition with regard to their processing order.

Resolving references is fairly simple: For every reference’s object name, a case-insensitive lookup is performed in a table containing all results and alternatives. If it succeeds, an edge from every result and alternative returned to the reference in question is created.

resolveReferences :: Nodes Instruction -> Edges
resolveReferences nodes = foldl f [] nodes
	where
		f edges (i, ref@(Reference _)) = map (\x -> (x, i)) (findTargets nodes ref) ++ edges
		f edges _ = edges

findTargets :: Nodes Instruction -> Instruction -> [NodeId]
findTargets nodes (Reference (Quantity _ _ a)) = map fst $ filter (isTarget a) nodes
	where
		lc = map toLower
		isTarget dest (_, Result (Quantity _ _ x)) = lc x == lc dest
		isTarget dest (_, Alternative (Quantity _ _ x)) = lc x == lc dest
		isTarget _ _ = False
findTargets _ _ = []

References are position-independent and can be used before or after the result instruction they are referencing.

testRef = [
	  cmpGraphRef ">foobar *foobar" [(0, 1)]
	, cmpGraphRef ">foobar |foobar *foobar" [(0, 2), (1, 2)]
	, cmpGraphRef "+A >foobar +B >barbaz *foobar *barbaz" [(1, 4), (3, 5)]
	, cmpGraphRef "*foobar >foobar" [(1, 0)]

Nonexistent references do not create an edge.

	, cmpGraphRef ">foobar *barbaz" []

References can use amounts and units.

	, cmpGraphRef ">foobar *1 _ foobar *2 _ foobar" [(0, 1), (0, 2)]

There are a few cases that do not make sense here (like loops or multiple results with the same name). They are permitted at this stage, but rejected later.

	, cmpGraphRef "*foobar |foobar >foobar" [(1, 0), (2, 0)]
	, cmpGraphRef "|foobar *foobar >foobar *foobar" [(0, 1), (0, 3), (2, 1), (2, 3)]
	]

6.2 Appendix

runGraphWith f doc expect = sort edges ~?= sort expect
	where
		(Right op) = (head . extract . snd . unzip) <$> parse ("%pesto " ++ doc)
		nodes = zip [firstNodeId..] op
		edges = f nodes
cmpGraph = runGraphWith toGraph
cmpGraphRef = runGraphWith resolveReferences

type NodeId = Int
type Node a = (NodeId, a)
type Nodes a = [Node a]
type Edge = (NodeId, NodeId)
type Edges = [Edge]
firstNodeId = 0 :: NodeId

Find graph’s root node(s), that is a node without outgoing edges:

walkRoot nodes edges = let out = nub $ map fst edges
	in filter (\(x, _) -> notElem x out) nodes

Get all nodes with edges pointing towards nodeid

incomingEdges edges nodeid = filter ((==) nodeid . snd) edges
incomingNodes nodes edges nodeid = map ((!!) nodes . fst) $ incomingEdges edges nodeid

outgoingEdges edges nodeid = filter ((==) nodeid . fst) edges
outgoingNodes nodes edges nodeid = map ((!!) nodes . snd) $ outgoingEdges edges nodeid

test = ["graph" ~: testGraph, "ref" ~: testRef, "extract" ~: testExtract]

third (_, _, x) = x

7 Linting

module Codec.Pesto.Lint (lint
	, test
	, parseMetadata
	, extractMetadata
	, Metadata(..)
	, LintResult(..)) where
import Test.HUnit hiding (test, Node)
import Data.List (sort, isPrefixOf, insert, intersect)
import Text.Parsec hiding (parse)
import Data.Char (isSpace, toLower)
import Data.Ratio ((%))
import Data.Maybe (fromMaybe)
import qualified Data.Map.Strict as M

import Codec.Pesto.Graph hiding (test)
import Codec.Pesto.Parse hiding (test)

Not every graph generated in the previous section is a useful recipe. Some instruction sequences just do not make sense. The tests in this section can detect those. Failing any of them does not render a stream of instructions or graph invalid. They just do not describe a useful recipe. Thus implementations must not generate or export such documents. However, they should accept input that fails any of the tests and warn the user about the failure.

Additionally, this section provides guidance on how to use the instructions provided by the Pesto language properly.

7.1 Graph properties

The graph must have exactly one root node (i.e., a node with incoming edges only). This also requires all results and alternatives to be referenced somewhere. Directives are either consumed when parsing, generating a graph, and linting. Otherwise they are dangling as well. Unknown instructions are always dangling.

rootIsResult nodes edges = case walkRoot nodes edges of
	[] -> [LintResult NoRootNode []]
	(_, Result _):[] -> []
	(i, _):[] -> [LintResult NonResultRootNode [i]]
	xs -> [LintResult MoreThanOneRootNode (map fst xs)]

Empty recipes or circular references have no root node:

testConnectivity = [
	  cmpLint "" [LintResult NoRootNode [], LintResult NoMetadata []]
	, cmpLint "*foobar >foobar"
		[LintResult NoRootNode [], LintResult CircularLoop [0, 1], LintResult NoMetadata []]
	, cmpLint "+foobar"
		[LintResult NonResultRootNode [0], LintResult NoMetadata []]

Directives and unknown instructions are dangling and thus root nodes.

	, cmpLint "invalid %invalid +foo >bar"
		[LintResult MoreThanOneRootNode [0,1,3], LintResult NoMetadata []]
	]

7.2 Metadata

The graph’s root node must be a result. It contains yield (amount and unit) and title (object) of the recipe.

extractMetadata nodes edges = case walkRoot nodes edges of
		[(i, Result q@(Quantity _ _ title))] ->
			Just $ (i, ("title", MetaStr title))
				:(i, ("yield", MetaQty q))
				:foldl f [] (incomingNodes nodes edges i)
		_ -> Nothing
	where

Additional key-value metadata for the whole recipe can be added as annotations to the root node. If multiple annotations with the same key exist, the key maps to a list of those values. Annotations that are unparseable key-value pairs are added as recipe description instead.

		f xs (i, Annotation s) = case parseMetadata s of
				Left _ -> (i, ("description", MetaStr s)):xs
				Right (k, v) -> (i, (k, MetaStr v)):xs
		f xs _ = xs

Key and value are separated by a colon. Keys must not contain whitespace or the colon char. A value may be empty.

parseMetadata = runParser metadata () ""
metadata = let keychars = satisfy (\x -> not (isSpace x) && x /= ':') in (,)
	<$> many1 keychars
	<*> (char ':' *> spaces *> many anyChar)

lintMetadata nodes edges = case extractMetadata nodes edges of
		Just result -> foldl checkKey [] result
		Nothing -> [LintResult NoMetadata []]
	where
		checkKey xs (_, (k, _)) | isKeyKnown k = xs
		checkKey xs (i, _) = LintResult UnknownMetadataKey [i]:xs

Valid metadata keys are listed below. Additionally, applications may add keys by prefixing them with “x-myapp-”. Thus an application called “basil” adding “some-key” would use the full key “x-basil-some-key”.

		isKeyKnown k = k `elem` knownKeys || "x-" `isPrefixOf` k

The following metadata keys are permitted:

		knownKeys = [

The title, description and yield are implicit.

			  "title"
			, "description"
			, "yield"

The recipe’s language, as 2 character code (ISO 639-1).

			, "language"

Time both must be a time-unit quantity.

			, "time"

An image can be a relative file reference or URI

			, "image"
			, "author"
			]

For instance a german language recipe for one person would look like this:

testMetadata = [
	  cmpLintMeta "+foo >1 _ foobar (language: de) (x-app-key: value)"
		[]
		(Just [(1, ("title", MetaStr "foobar"))
			, (1, ("yield", MetaQty (Quantity (Exact (AmountRatio (1%1))) "" "foobar")))
			, (2, ("language", MetaStr "de"))
			, (3, ("x-app-key", MetaStr "value"))])

Unparseable annotations or unknown keys are linting errors:

	, cmpLintMeta "+foo >foobar (unknown-key: value)"
		[LintResult UnknownMetadataKey [2]]
		(Just [(1, ("title", MetaStr "foobar"))
			, (1, ("yield", MetaQty (strQuantity "foobar")))
			, (2, ("unknown-key", MetaStr "value"))])

Root node annotations not containing a parseable key-value pair are assigned the key “description”.

	, cmpLintMeta "+foo >foobar ( some description ) (another one: with colon) (another:     valid key-value)"
		[LintResult UnknownMetadataKey [4]]
		(Just [(1, ("title", MetaStr "foobar"))
			, (1, ("yield", MetaQty (strQuantity "foobar")))
			, (2, ("description", MetaStr " some description "))
			, (3, ("description", MetaStr "another one: with colon"))
			, (4, ("another", MetaStr "valid key-value"))])
	]

7.3 Time is a tool

By definition, time is a tool and not an ingredient.

timeUnits = ["s", "min", "h", "d"]

isTime (Quantity _ unit "") | unit `elem` timeUnits = True
isTime _ = False

timeIsATool nodes _ = foldl f [] nodes
	where
		f xs (nodeid, Ingredient q) | isTime q = LintResult TimeIsATool [nodeid]:xs
		f xs _ = xs

testLintQuantity = [
	  cmpLint "+10 min >foo" [LintResult TimeIsATool [0]]
	, cmpLint "+10-12 h >foo" [LintResult TimeIsATool [0]]
	, cmpLint "+90/120 s >foo" [LintResult TimeIsATool [0]]
	, cmpLint "+~12 s >foo" [LintResult TimeIsATool [0]]
	, cmpLint "&10 min [bar] >foo" []
	]

Only actions can be annotated like this. It can be used to indicate how long a particular action is expected to take (i.e., peeling potatoes takes two minutes) or how long the action is supposed to be executed (i.e. cook for five minutes). More time annotations improve the software’s scheduling capabilities.

timeAnnotatesAction nodes edges = foldl f [] nodes
	where
		f xs (nodeid, Tool q) | isTime q && (not . allActions) (outgoingEdges edges nodeid) = LintResult TimeAnnotatesAction [nodeid]:xs
		f xs _ = xs
		toNodelist = (!!) nodes . snd
		allActions = all (isAction . snd . toNodelist)

For example, “cook 10 minutes” can be expressed with:

testTimeAnnotatesAction = [
	  cmpLint "&10 min [cook] >soup" []
	, cmpLint "&10 min [cook] &5-6 h [cook again] >soup" []
	, cmpLint "&10 min >soup" [LintResult TimeAnnotatesAction [0]]
	, cmpLint "&10 min &15 min |time *time [cook] >soup"
		[LintResult TimeAnnotatesAction [0], LintResult TimeAnnotatesAction [1]]
	]

7.4 Well-known units

Units can be arbitrary strings, but implementations should recognize the standard metric units g (gram), l (liter), and m (meter). One of these prefixes may be used with each of them: m (milli-), c (centi-), d (dezi-), and k (kilo-). Additionally, time in s (second), min (minute), h (hour), and d (day) should be accepted.

wellKnownUnit nodes _ = foldl f [] nodes
	where
		extractQty (Ingredient q) = Just q
		extractQty (Tool q) = Just q
		extractQty (Result q) = Just q
		extractQty (Alternative q) = Just q
		extractQty (Reference q) = Just q
		extractQty _ = Nothing
		f xs (nodeid, instr) | fromMaybe False (extractQty instr >>= (return . not . known)) =
				LintResult UnitNotWellKnown [nodeid]:xs
		f xs _ = xs
		known (Quantity _ unit _) = unit `elem` knownUnits
		knownUnits = [
				  ""
				, "mg", "g", "kg"
				, "ml", "cl", "dl", "l"
				, "cm", "dm", "m"
				] ++ timeUnits

Usage of imperial units (inch, pound, …), non-standard units like “teaspoon,” “cup,” or similar is discouraged because the former is used by just three countries in the world right now, and the latter is language- and country-dependent. The implementation may provide the user with a conversion utility.

testLintWellKnownUnit = [
	  cmpLint "+1 in foobar >bar" [LintResult UnitNotWellKnown [0]]
	, cmpLint "+2 teaspoons foobar >bar" [LintResult UnitNotWellKnown [0]]
	, cmpLint "+3 cups foobar >bar" [LintResult UnitNotWellKnown [0]]
	, cmpLint "+1 ml foobar >bar" []
	, cmpLint "+1 cl foobar >bar" []
	, cmpLint "+1 dl foobar >bar" []
	, cmpLint "+1 l foobar >bar" []
	, cmpLint "+2 _ something >bar" []
	, cmpLint "&1 min [foo] >bar" []

The unit is case-sensitive, thus

	, cmpLint "+1 Mg foobar >bar" [LintResult UnitNotWellKnown [0]]
	, cmpLint "+1 kG foobar >bar" [LintResult UnitNotWellKnown [0]]
	, cmpLint "&1 MIN [foo] >bar" [LintResult UnitNotWellKnown [0]]
	]

7.5 References

All references must be resolved. An earlier check ensures all results and alternatives are referenced at some point.

referencesResolved nodes edges = foldl f [] nodes
	where
		f xs (nodeid, Reference _) | null (incomingEdges edges nodeid) =
				LintResult UndefinedReference [nodeid]:xs
		f xs _ = xs

testLintRefs = [
	  cmpLint "*foobar >foobar >barbaz" [LintResult CircularLoop [0, 1]]
	, cmpLint "*foobar >foo" [LintResult UndefinedReference [0]]
	]

Results and alternatives must not have duplicate names, so collect their lower-case object names into a Map and flag those which reference multiple nodes.

uniqueNames nodes _ = M.foldl f [] nameMap
	where
		f xs fnodes | length fnodes > 1 = LintResult DuplicateReferenceName fnodes:xs
		f xs _ = xs
		nameMap = foldl buildMap M.empty nodes
		buildMap m (nodeid, Result qty) = M.insertWith append (getObject qty) [nodeid] m
		buildMap m (nodeid, Alternative qty) = M.insertWith append (getObject qty) [nodeid] m
		buildMap m _ = m
		getObject (Quantity _ _ object) = map toLower object
		append a b = insert (head a) b

testUniqueNames = [
	  cmpLint "+a >x +b >y *x *y >foo" []
	, cmpLint "+a >x +b >x *x >y" [LintResult DuplicateReferenceName [1, 3]]
	, cmpLint "+a >x +b +c |x *x >y" [LintResult DuplicateReferenceName [1, 4]]
	, cmpLint "+a >1 _ foo +a >2 _ FOO +a >3 _ foO *Foo >y"
		[LintResult DuplicateReferenceName [1, 3, 5]]
	]

A result must have at least one incoming edge. This is a special case and can only occur at the beginning of a recipe.

resultNonempty nodes edges = foldl f [] nodes
	where
		f xs (nodeid, Result _) | null (incomingEdges edges nodeid) =
				LintResult TooFewChildren [nodeid]:xs
		f xs _ = xs

testLintResultNonempty = [
	  cmpLint ">bar *bar >baz" [LintResult TooFewChildren [0]]
	, cmpLint "+A >bar *bar >baz" []
	, cmpLint "+A >bar >foo *bar *foo >baz" []
	]

Alternatives must have at least two incoming edges since a smaller amount would make the alternative pointless.

twoAlternatives nodes edges = foldl f [] nodes
	where
		f xs (nodeid, Alternative _) | length (incomingEdges edges nodeid) < 2 =
				LintResult TooFewChildren [nodeid]:xs
		f xs _ = xs

testLintTwoAlternatives = [
	  cmpLint "+A |foo *foo >bar" [LintResult TooFewChildren [1]]
	, cmpLint "+A +B |foo *foo >bar" []
	, cmpLint "+A &B |foo *foo >bar" []
	]

References cannot loop because, well, you cannot cook something and use an ingredient you have not made yet. It is possible to branch out and merge again if an ingredient is split into multiple parts and added to different outputs.

circularLoops nodes edges = map (LintResult CircularLoop) circles
	where
		allReferences = foldl referenceNodes [] nodes
		referenceNodes xs (nodeid, Reference _) = nodeid:xs
		referenceNodes xs _ = xs
		circles = filter (not . null) $ map (visitIncoming [] . singleton) allReferences
		singleton x = [x]
		visitIncoming _ [] = []
		visitIncoming visited next = case length (intersect visited nextNext) of
			0 -> visitIncoming nextVisited nextNext
			_ -> nextVisited
			where
				nextVisited = visited ++ next
				nextNext = map fst $ concat $ map (incomingNodes nodes edges) next

testLintCircularLoops = [
	  cmpLint "*y >x *x >y >foobar"
		[LintResult CircularLoop [0, 3, 2, 1] , LintResult CircularLoop [2, 1, 0, 3]]
	, cmpLint "*z >x *x >y *y >z *z >foobar" [
		  LintResult CircularLoop [0, 5, 4, 3, 2, 1]
		, LintResult CircularLoop [2, 1, 0, 5, 4, 3]
		, LintResult CircularLoop [4, 3, 2, 1, 0, 5]
		, LintResult CircularLoop [6, 5, 4, 3, 2, 1, 0]
		]
	, cmpLint "+a >foobar *1/2 _ foobar >x *1/2 _ foobar >y *x *y >final" []
	, cmpLint "+a >foobar *1/2 _ foobar >x *x *1/2 _ foobar >final" []
	]

7.6 Ranges

The first amount of a range ratio must be strictly smaller than the second. This limitation is not enforced for ranges containing strings.

rangeFromLargerThanTo nodes _ = foldl f [] nodes
	where
		f xs (nodeid, Ingredient q) | not $ rangeOk q =
				LintResult RangeFromLargerThanTo [nodeid]:xs
		f xs (nodeid, Reference q) | not $ rangeOk q =
				LintResult RangeFromLargerThanTo [nodeid]:xs
		f xs _ = xs
		rangeOk (Quantity (Range (AmountRatio a) (AmountRatio b)) _ _) = a < b
		rangeOk _ = True

testRangeFromLargerThanTo = [
	  cmpLint "+2-3 l water >bar" []
	, cmpLint "+3-2 l water >bar" [LintResult RangeFromLargerThanTo [0]]
	, cmpLint "+2/3-1/3 l water >bar" [LintResult RangeFromLargerThanTo [0]]
	, cmpLint "+some-many _ eggs >bar" []
	, cmpLint "+1-\"a few\" _ eggs >bar" []
	]

7.7 Appendix

data LintResult = LintResult LintStatus [NodeId] deriving (Show, Eq, Ord)
data LintStatus =
	  NoRootNode
	| NonResultRootNode
	| MoreThanOneRootNode
	| UndefinedReference
	| DuplicateReferenceName
	| CircularLoop
	| TooFewChildren
	| TimeIsATool
	| TimeAnnotatesAction
	| UnitNotWellKnown
	| InvalidNode
	| RangeFromLargerThanTo
	| NoMetadata
	| UnknownMetadataKey
	deriving (Show, Eq, Ord)

Every lint test checks a single aspect of the graph.

lint nodes edges = concatMap (\f -> f nodes edges) lintTests

lintTests = [
	  rootIsResult
	, referencesResolved
	, uniqueNames
	, circularLoops
	, resultNonempty
	, twoAlternatives
	, timeIsATool
	, timeAnnotatesAction
	, wellKnownUnit
	, lintMetadata
	, rangeFromLargerThanTo
	]

withGraph doc f = f nodes edges
	where
		(Right op) = (head . extract . snd . unzip) <$> parse ("%pesto " ++ doc)
		nodes = zip [firstNodeId..] op
		edges = toGraph nodes ++ resolveReferences nodes

cmpLint doc expect = withGraph doc (\nodes edges -> doc ~: sort (lint nodes edges) ~?= sort expect)

data Metadata = MetaQty Quantity | MetaStr String deriving (Show, Eq)

cmpLintMeta doc expectLint expectMeta = withGraph doc (\nodes edges -> doc ~: [
		  sort (lint nodes edges) ~?= sort expectLint
		, extractMetadata nodes edges ~?= expectMeta
		])
strQuantity = Quantity (Exact (AmountStr "")) ""

test = [
	  testConnectivity
	, testMetadata
	, testLintRefs
	, testUniqueNames
	, testLintCircularLoops
	, testLintQuantity
	, testLintWellKnownUnit
	, testTimeAnnotatesAction
	, testLintTwoAlternatives
	, testLintResultNonempty
	, testRangeFromLargerThanTo
	]

8 Serializing

module Codec.Pesto.Serialize (serialize) where
import Data.Char (isSpace, isLetter)
import Data.Ratio (numerator, denominator)

import {-# SOURCE #-} Codec.Pesto.Parse

Serialization turns a linear list of instructions back into a human representation.

class Serializeable a where
	serialize :: a -> String

instance Serializeable a => Serializeable [a] where
	serialize ops = unlines $ map serialize ops

instance Serializeable Instruction where
	serialize (Annotation s) = quote '(' ')' s
	serialize (Ingredient q) = '+':serialize q
	serialize (Tool q) = '&':serialize q
	serialize (Action s) = quote '[' ']' s
	serialize (Reference q) = '*':serialize q
	serialize (Result q) = '>':serialize q
	serialize (Alternative q) = '|':serialize q
	serialize (Directive s) = '%':serializeQstr s
	serialize (Unknown s) = s

instance Serializeable Quantity where
	serialize (Quantity a b "") = serialize a ++ " " ++ serializeQstr b
	serialize (Quantity (Exact (AmountStr "")) "" c) = serializeQstr c
	serialize (Quantity a "" c) = serialize a ++ " _ " ++ serializeQstr c
	serialize (Quantity a b c) = serialize a ++ " " ++ serializeQstr b ++ " " ++ serializeQstr c

instance Serializeable Approximately where
	serialize (Range a b) = serialize a ++ "-" ++ serialize b
	serialize (Approx a) = '~':serialize a
	serialize (Exact a) = serialize a

There are two special cases here, both for aesthetic reasons:

If the denominator is one we can just skip printing it, because \(\frac{2}{1} = 2\) and
if the numerator is larger than the denominator use mixed fraction notation, because \(\frac{7}{2} = 3+\frac{1}{2}\)

instance Serializeable Amount where
	serialize (AmountRatio a) | denominator a == 1 = show (numerator a)
	serialize (AmountRatio a) | numerator a > denominator a =
		show full ++ "/" ++ show num ++ "/" ++ show denom
		where
			full = numerator a `div` denom
			num = numerator a - full * denom
			denom = denominator a
	serialize (AmountRatio a) = show (numerator a) ++ "/" ++ show (denominator a)
	serialize (AmountStr s) = serializeQstr s

serializeQstr "" = "_"
serializeQstr s | (not . isLetter . head) s || hasSpaces s = quote '"' '"' s
serializeQstr s = s
hasSpaces = any isSpace
quote start end s = [start] ++ concatMap (\c -> if c == end then ['\\', end] else [c]) s ++ [end]

9 Using this project

This project uses cabal. It provides the Codec.Pesto library that implements the Pesto language as described in the previous sections. It also comes with three binaries.

9.1 User interface

module Main (main) where
import System.Environment (getArgs)
import Data.List (intercalate)

import Codec.Pesto.Parse (parse, Instruction (Ingredient), Quantity (..))
import Codec.Pesto.Graph (extract, toGraph, firstNodeId, resolveReferences)
import Codec.Pesto.Lint (lint, extractMetadata, Metadata(..), LintResult (LintResult))
import Codec.Pesto.Serialize (serialize)

The user-interface reads a single recipe from the standard input.

main = do
	(op:_) <- getArgs
	s <- getContents
	either malformedRecipe (run op) (parse s)

It has three modes of operation, described in the next sections.

run "dot" = runDot
run "metadata" = runMeta
run "ingredients" = runIngredients
run _ = const (putStrLn "unknown operation")

9.1.1 dot

Since each recipe is just a directed graph (digraph), GraphViz’ dot language can represent recipes as well. Example:

cabal run --verbose=0 pesto dot < spaghetti.pesto | dot -Tpng > spaghetti.png

runDot stream = putStrLn $ toDot dotNodes dotEdges
	where
		(nodes, edges) = streamToGraph stream
		maxId = (maximum $ map fst nodes) + 1
		(lintNodes, lintEdges) = unzip $ map (uncurry lintToNodesEdges)
				$ zip [maxId..] (lint nodes edges)
		dotNodes = concat [
				  [("node", [("fontname", "Roboto Semi-Light")])]
				, map (\(a, label) -> (show a, [("label", serialize label)])) nodes
				, lintNodes
				]
		dotEdges = concat [
				  map (both show) edges
				, concat lintEdges
				]

lintToNodesEdges nodeid (LintResult t nodes) = let
	n = (show nodeid, [("label", show t), ("color", "red")])
	e = map (\i -> both show (nodeid, i)) nodes
	in (n, e)

both f (a, b) = (f a, f b)

toDot nodes edges = "digraph a {"
		<> mconcat (map nodeToDot nodes)
		<> mconcat (map edgeToDot edges)
		<> "}"
	where
		edgeToDot (a, b) = a <> " -> " <> b <> ";"
		nodeToDot (a, b) = a <> " [" <> mconcat (mapToDot b) <> "];"

mapToDot = map kvToDot
kvToDot (k, v) = k <> "=\"" <> quoteString v <> "\""
quoteString s = mconcat $ map quoteChar s
quoteChar '\n' = "\\n"
quoteChar '"' = "\\\""
quoteChar x = [x]

9.1.2 metadata

Print metadata as key-value pairs, separated by =.

runMeta stream = maybe (return ()) (mapM_ printMeta) $ uncurry extractMetadata $ streamToGraph stream

9.1.3 ingredients

Extract ingredients and print them in CSV format. This does not take alternatives into account yet.

runIngredients stream = mapM_ (putStrLn . csvQty) $ reverse $ foldl getIngredient [] stream
	where
		getIngredient xs (_, Ingredient q) = q:xs
		getIngredient xs _ = xs

printMeta (_, (key, MetaStr value)) = putStrLn $ key ++ "=" ++ value
printMeta (_, (key, MetaQty q)) = putStrLn $ key ++ "=" ++ csvQty q

csvQty (Quantity a b c) = intercalate "," [serialize a, b, c]

9.1.4 Appendix

malformedRecipe = print

streamToGraph stream = (nodes, edges)
	where
		doc = (head . extract . snd . unzip) stream
		nodes = zip [firstNodeId..] doc
		edges = toGraph nodes ++ resolveReferences nodes

9.2 Running tests

import Test.HUnit
import System.Exit (exitFailure, exitSuccess)
import Codec.Pesto.Parse (test)
import Codec.Pesto.Lint (test)
import Codec.Pesto.Graph (test)

The testcases can be run with cabal test. This runs all testcases from all modules and prints a summary.

main = runTestTT tests >>= \c -> if errors c + failures c > 0 then exitFailure else exitSuccess

tests = TestList [
	  "parse" ~: Codec.Pesto.Parse.test
	, "graph" ~: Codec.Pesto.Graph.test
	, "lint" ~: Codec.Pesto.Lint.test
	]

9.3 Building documentation

{-# LANGUAGE OverloadedStrings #-}
import Text.Pandoc
import Text.Pandoc.Highlighting (tango)
import qualified Data.Text.IO as TIO
import System.Directory (setCurrentDirectory)
import Data.Map as M
import Text.DocTemplates (ToContext(toVal), Context(..))
import Data.Text (pack)
import Data.Either.Combinators (rightToMaybe)

The documentation can be generated by running cabal run pesto-doc. It is exclusively based on the restructuredText inside this package’s literal Haskell source code.

readDoc = readRST def {
	  readerExtensions = extensionsFromList [
		  Ext_literate_haskell
		, Ext_implicit_header_references
		]
	, readerStandalone = True }

Pandoc outputs a single HTML5 page with syntax highlighting and MathJax for formulas.

 writeDoc tpl = writeHtml5String def {
	  writerTemplate = tpl
 	, writerHighlightStyle = Just tango
	, writerNumberSections = True
 	, writerSectionDivs = True
 	, writerTabStop = 4
 	, writerHTMLMathMethod = MathJax "https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"
 	, writerVariables = Context $ M.fromList [
 		  ("css", toVal $ pack "pesto.css")
 		, ("lang", toVal $ pack "en")
 		, ("include-before", toVal $ pack "<div class=\"wrapper\">")
 		, ("include-after", toVal $ pack "</div>")
 		]
 	}

 main = do

The module Codec.Pesto serves as starting point and it includes every other module in a sensible order. For the relative includes to work, we need to change our current working directory.

	tpl <- runIO $ compileDefaultTemplate "html5"
	setCurrentDirectory "src/lib/Codec"
 	doc <- TIO.readFile "Pesto.lhs"
 	result <- runIO $ readDoc doc >>= writeDoc (rightToMaybe tpl)
	setCurrentDirectory "../../../"
 	html <- handleError result

Output is written to the directory _build, which contains the corresponding stylesheet.

	TIO.writeFile "_build/index.html" html