==================================================
How to use the Logic Object space features of PyPy
==================================================

Outline
=======

This document describes the content and usage of a PyPy extension
known as the Logic Objectspace (LO), and of the constraint programming
library that ships with PyPy. The LO, when finished, will provide
additional builtins that make it possible to write:

* concurrent programs based on coroutines scheduled by dataflow logic
  variables,

* concurrent logic programs,

* concurrent constraint programs,

* new search "engines" to help solve logic/constraint programs.

Currently, the "integrated concurrent logic and constraint
programming" part is, unfortunately, quite unfinished. Finishing it
will take effort, time and knowledge of PyPy internals. We do provide,
however, a full-blown constraint-solving infrastructure that can be
used (and extended) out of the box.

To fire up a working standard PyPy with the constraint library,
please type::

  root-of-pypy-dist/pypy/bin/py.py --withmod-_cslib

To fire up a working PyPy with the LO (including the constraint
solving library), please type::

  root-of-pypy-dist/pypy/bin/py.py -o logic

More information is available in the `EU Interim Report`_, especially
with respect to the (unfinished) integrated framework for constraint
and logic programming.

Logic Variables and Dataflow Synchronisation of Coroutines
==========================================================

This section uses the LO, so you should try the examples with a
logic build or the ``-o logic`` argument to ``py.py``.

Logic Variables
+++++++++++++++

Description and examples
------------------------

Logic variables are (should be, in an ideal LO) similar to Python
variables in the following sense: they map names to values in a
defined scope. But unlike normal Python variables, they have two
states: free and bound. A bound logic variable is indistinguishable
from a normal Python value, which it wraps. A free variable can only
be bound once (it is also said to be a single-assignment variable). It
is good practice to give these variables names that start with a
capital letter, so as to avoid confusion with normal variables.

The following snippet creates a new logic variable and asserts its
state::

  X = newvar()
  assert is_free(X)
  assert not is_bound(X)

Logic variables can be bound like this::

  bind(X, 42)
  assert X / 2 == 21

The single-assignment property is easily checked::

  bind(X, 'hello') # would raise a RebindingError
  bind(X, 42)      # is admitted (it is a no-op)

This shows that logic variables are really objects acting as boxes for
Python values.
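
To make the "box" picture concrete, here is a minimal sketch in plain
CPython. It is not the Logic Objectspace implementation: the names
``BoxSketch``, ``RebindingErrorSketch`` and ``bind_sketch`` are
hypothetical, and unlike a real logic variable the box is not
transparent, so its content has to be read through an attribute::

  class RebindingErrorSketch(Exception):
      """Hypothetical stand-in for the error raised on rebinding."""

  _FREE = object()  # sentinel marking a free (unbound) box

  class BoxSketch(object):
      def __init__(self):
          self.value = _FREE

      def is_free(self):
          return self.value is _FREE

  def bind_sketch(box, value):
      """Bind a free box; rebinding to an equal value is a no-op."""
      if box.is_free():
          box.value = value
      elif box.value != value:
          raise RebindingErrorSketch("already bound to %r" % (box.value,))

  X = BoxSketch()
  assert X.is_free()
  bind_sketch(X, 42)
  bind_sketch(X, 42)           # same value: accepted
  try:
      bind_sketch(X, 'hello')  # different value: rejected
      rebound = True
  except RebindingErrorSketch:
      rebound = False
  assert not rebound and X.value == 42

The sketch only captures the single-assignment rule; blocking on free
variables and transparent access are what the object space adds.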

The bind operator is low-level. The more general operation that binds
a logic variable is known as "unification". Unify is an operator that
takes two arbitrary data structures and tries to assert their
equality, much in the sense of the == operator, but with one
important twist: unify mutates the state of the involved logic
variables.

Unify is thus defined as follows (it is symmetric):

+-----------------+-----------------+-----------------+
| ``Unify``       | **value**       | **unbound var** |
+-----------------+-----------------+-----------------+
| **value**       | equal?          | bind            |
+-----------------+-----------------+-----------------+
| **unbound var** | bind            | alias           |
+-----------------+-----------------+-----------------+

Unifying structures devoid of logic variables amounts to an equality check::

  unify([1, 2], [1, 2])
  unify(42, 43) # raises UnificationError

A basic example involving logic variables embedded into dictionaries::

  Z, W = newvar(), newvar()
  unify({'a': 42, 'b': Z},
        {'a':  Z, 'b': W})
  assert Z == W == 42

An example involving custom data types::

  class Foo(object):
      def __init__(self, a):
          self.a = a
          self.b = newvar()

  f1 = Foo(newvar())
  f2 = Foo(42)
  unify(f1, f2)
  assert f1.a == f2.a == 42    # assert (a)
  assert alias_of(f1.b, f2.b)  # assert (b)
  unify(f2.b, 'foo')
  assert f1.b == f2.b == 'foo' # (b) is entailed indeed
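
The unification table given earlier can be emulated in plain CPython.
The following is a sketch under stated assumptions, not the algorithm
used by the LO: ``VarSketch``, ``deref`` and ``unify_sketch`` are
hypothetical names, only lists, tuples and dictionaries are traversed,
and a failed unification is not rolled back::

  _FREE = object()  # sentinel: the variable has no link yet

  class VarSketch(object):
      """Hypothetical logic variable; 'ref' holds a value or an alias."""
      def __init__(self):
          self.ref = _FREE

  def deref(x):
      """Follow bindings and aliases to a value or a free variable."""
      while isinstance(x, VarSketch) and x.ref is not _FREE:
          x = x.ref
      return x

  def unify_sketch(a, b):
      a, b = deref(a), deref(b)
      if a is b:
          return
      if isinstance(a, VarSketch):      # free var vs anything: bind or alias
          a.ref = b
      elif isinstance(b, VarSketch):    # value vs free var: bind
          b.ref = a
      elif (isinstance(a, (list, tuple)) and type(a) is type(b)
            and len(a) == len(b)):
          for x, y in zip(a, b):        # structures: unify element-wise
              unify_sketch(x, y)
      elif isinstance(a, dict) and isinstance(b, dict) and set(a) == set(b):
          for key in a:
              unify_sketch(a[key], b[key])
      elif a != b:                      # value vs value: equal?
          raise ValueError("cannot unify %r with %r" % (a, b))

  Z, W = VarSketch(), VarSketch()
  unify_sketch({'a': 42, 'b': Z}, {'a': Z, 'b': W})
  assert deref(Z) == deref(W) == 42

  A, B = VarSketch(), VarSketch()
  unify_sketch(A, B)                    # two free variables: alias
  unify_sketch(B, 'foo')
  assert deref(A) == 'foo'

Aliasing is modelled here by letting one free variable point at the
other, so that a later binding of either one is visible through both.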


The operators table
-------------------

Logic variables support the following operators (with their arity):

Predicates::

 is_free/1
   any -> bool

 is_bound/1
   any -> bool

 alias_of/2
   logic vars. -> bool

Variable Creation::

 newvar/0
   nothing -> logic variable
 
Mutators::

 bind/2
   logic var., any -> None

 unify/2
   any, any -> None


Coroutines and dataflow synchronisation
++++++++++++++++++++++++++++++++++++++++++

Description and examples
------------------------

When a piece of code tries to access a free logic variable, the coroutine
in which it runs is blocked (suspended) until the variable becomes
bound. This behaviour is known as "dataflow synchronisation" and
mimics exactly the dataflow variables from the `Oz programming
language`_. With respect to behaviour under concurrency conditions,
logic variables come with two operators:

* wait: this suspends the current coroutine until the variable is
  bound, and returns its value immediately if it is already bound
  (implementation note: in the logic object space, all operators
  perform an implicit wait on their arguments)

* wait_needed: this suspends the current coroutine until the variable
  has received a wait message. It has to be used explicitly, typically
  by a producer coroutine that wants to produce data only when needed.

In this context, binding a variable to a value will make runnable all
coroutines blocked on this variable.
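
The wake-up behaviour can be imitated in plain CPython with operating
system threads standing in for coroutines. This is only an analogy
under that assumption; ``DataflowVarSketch`` is a hypothetical name and
says nothing about how the LO scheduler works::

  import threading

  class DataflowVarSketch(object):
      def __init__(self):
          self._bound = threading.Event()
          self._value = None

      def bind(self, value):
          self._value = value
          self._bound.set()       # wakes every thread blocked in wait()

      def wait(self):
          self._bound.wait()      # blocks while the variable is free
          return self._value

  X = DataflowVarSketch()
  results = []

  def consumer():
      # each consumer suspends here until X is bound
      results.append(X.wait() * 2)

  workers = [threading.Thread(target=consumer) for _ in range(3)]
  for w in workers:
      w.start()
  X.bind(21)                      # all three consumers become runnable
  for w in workers:
      w.join()
  assert results == [42, 42, 42]

A single ``bind`` releases all waiting consumers at once, which is the
property described above.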

Together, wait and wait_needed make it possible to write efficient,
lazily evaluated code.

Using the "stacklet" builtin (which spawns a coroutine that calls its
first argument with the remaining arguments), here is how to implement
a producer/consumer scheme::

  from cclp import stacklet

  def generate(n, limit, R):
      if n < limit:
          Tail = newvar()
          unify(R, (n, Tail))
          return generate(n + 1, limit, Tail)
      bind(R, None)
      return

  def sum(L, a, R):
      Head, Tail = newvar(), newvar()
      unify(L, (Head, Tail))
      if Tail != None:
          return sum(Tail, Head + a, R)
      bind(R, a + Head)
      return

  X = newvar()
  S = newvar()
  stacklet(sum, X, 0, S)
  stacklet(generate, 0, 10, X)
  assert S == 45

Note that this eagerly generates all elements before the first of them
is consumed. Wait_needed helps us write a lazy version of the
generator. The consumer then becomes responsible for termination,
and thus must be adapted too::

  def lgenerate(n, L):
      """lazy version of generate"""
      wait_needed(L)
      Tail = newvar()
      bind(L, (n, Tail))
      lgenerate(n+1, Tail)

  def lsum(L, a, limit, R):
      """this summer controls the generator"""
      if limit > 0:
          Head, Tail = newvar(), newvar()
          wait(L) # awakes those waiting by need on L
          unify(L, (Head, Tail))
          return lsum(Tail, a+Head, limit-1, R)
      else:
          bind(R, a)

  Y = newvar()
  T = newvar()

  stacklet(lgenerate, 0, Y)
  stacklet(lsum, Y, 0, 10, T)
  assert T == 45
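
For readers without a logic build, a roughly comparable demand-driven
behaviour can be observed with ordinary Python generators. This is an
analogy, not how wait_needed is implemented; ``lgenerate_sketch``,
``lsum_sketch`` and ``produced`` are hypothetical names::

  produced = []   # records what the producer actually computed

  def lgenerate_sketch(n):
      """Unbounded producer; it runs only when the consumer asks."""
      while True:
          produced.append(n)
          yield n
          n += 1

  def lsum_sketch(source, limit):
      """Consumer that decides how many elements are needed."""
      total = 0
      for _ in range(limit):
          total += next(source)   # plays the role of the wait on L
      return total

  T = lsum_sketch(lgenerate_sketch(0), 10)
  assert T == 45
  assert produced == list(range(10))   # nothing beyond the demand

As in the lazy version above, the producer has no limit of its own and
the consumer alone controls termination.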

The operators table
-------------------

Blocking ops::

 wait/1 # blocks if first argument is a free logic var., till it becomes bound
   value -> value

 wait_needed/1 # blocks until its arg. receives a wait
   logic var. -> logic var.

Coroutine spawning::

 stacklet/n
   callable, (n-1) optional args -> None

Constraint Programming
======================

PyPy comes with a flexible, extensible constraint solver engine based
on the CPython `Logilab constraint package`_ (and we paid attention to
API compatibility). Here we describe how to use the solver to
specify a constraint satisfaction problem and get its solutions.

.. _`Logilab constraint package`: http://www.logilab.org/view?rql=Any%20X%20WHERE%20X%20eid%20852

Specification of a problem 
++++++++++++++++++++++++++

A constraint satisfaction problem is defined by a triple (X, D, C)
where X is a set of finite domain variables, D the set of domains
associated with the variables in X, and C the set of constraints, or
relations, that bind together the variables of X.

So we basically need a way to declare variables, their domains and
relations; and something to hold these together. 

Let's have a look at a reasonably simple example of a constraint
program::

  from cslib import *

  variables = ('c01','c02','c03','c04','c05',
               'c06','c07','c08','c09','c10')
  values = [(room,slot) 
            for room in ('room A','room B','room C') 
            for slot in ('day 1 AM','day 1 PM',
                         'day 2 AM','day 2 PM')]
  domains = {}

  # let us associate the variables to their domains
  for v in variables:
      domains[v]=fd.FiniteDomain(values)

  # let us define relations/constraints on the variables
  constraints = []

  # Internet access is in room C only
  for conf in ('c03','c04','c05','c06'):
      constraints.append(fd.make_expression((conf,),
                                            "%s[0] == 'room C'"%conf))

  # Speakers only available on day 1
  for conf in ('c01','c05','c10'):
      constraints.append(fd.make_expression((conf,),
                                            "%s[1].startswith('day 1')"%conf))
  # Speakers only available on day 2
  for conf in ('c02','c03','c04','c09'):
      constraints.append(fd.make_expression((conf,),
                                            "%s[1].startswith('day 2')"%conf))

  # try to satisfy people willing to attend several conferences
  groups = (('c01','c02','c03','c10'),
            ('c02','c06','c08','c09'),
            ('c03','c05','c06','c07'),
            ('c01','c03','c07','c08'))
  for g in groups:
      for conf1 in g:
          for conf2 in g:
              if conf2 > conf1:
                  constraints.append(fd.make_expression((conf1,conf2),
                                                        '%s[1] != %s[1]'%\
                                                        (conf1,conf2)))

  constraints.append(fd.AllDistinct(variables))

  # now, give the triple (X, D, C) to a repository object
  r = Repository(variables,domains,constraints)

  # that we can give to a solver of our choice
  # (here, it is the default depth-first search,
  #  find-all solutions solver)

  solutions = Solver().solve(r, 0)
  assert len(solutions) == 64
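
The structure of the (X, D, C) triple can also be illustrated
independently of the library with a brute-force enumeration in plain
Python. This is a sketch for a toy problem, not the search strategy of
the solver; ``solve_sketch`` and the tiny problem below are
hypothetical, and the approach is exponential in the number of
variables::

  from itertools import product

  variables = ('a', 'b', 'c')                     # X
  domains = {'a': (1, 2, 3),                      # D
             'b': (1, 2, 3),
             'c': (1, 2, 3)}
  constraints = [                                 # C
      (('a', 'b'), lambda a, b: a < b),
      (('b', 'c'), lambda b, c: b < c),
  ]

  def solve_sketch(variables, domains, constraints):
      """Try every assignment and keep those satisfying all constraints."""
      solutions = []
      for values in product(*(domains[v] for v in variables)):
          candidate = dict(zip(variables, values))
          if all(check(*(candidate[v] for v in names))
                 for names, check in constraints):
              solutions.append(candidate)
      return solutions

  solutions = solve_sketch(variables, domains, constraints)
  assert solutions == [{'a': 1, 'b': 2, 'c': 3}]

Each constraint names the variables it relates, much as the
expressions in the conference example do.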

Extending the solver machinery
++++++++++++++++++++++++++++++

The core of the solving system is written in pure RPython and resides
in the rlib/cslib library. It should be quite easy to subclass the
provided elements to get specialized, special-case optimized
variants. On top of this library, a PyPy module exposes the low-level
engine functionality at application level.

.. _`EU Interim Report`: http://codespeak.net/pypy/extradoc/eu-report/D09.1_Constraint_Solving_and_Semantic_Web-interim-2007-02-28.pdf

.. _`Oz programming language`: http://www.mozart-oz.org
