Friday, December 21, 2012

Visitor Pattern in Python

Visitor Pattern, what is it?
In object-oriented programming and software engineering, the visitor design pattern is a way of separating an algorithm from an object structure on which it operates. A practical result of this separation is the ability to add new operations to existing object structures without modifying those structures.[1] 

When to use?
The Visitor pattern can be used when the method that must be called depends on the types of two objects.
To implement this, we need a double dispatch mechanism that determines which function must be called.

Example 
https://github.com/yakov-g/py_visitor/blob/master/py_visitor.py
Suppose there are two types of objects, Text and Picture, and three file formats: txt, jpg and blob. Saving an object in a given format should either succeed or print an error message if that combination is not supported.

To stay consistent with the terminology used to describe the Visitor pattern in other sources, I will use names like accept(), visit() and Visitor(); but remember that in our example accept == save, visit == save and Visitor == Saver.

Double dispatch
We want to provide a convenient way of saving objects in any format, so that it could look like this: o.save(v)
Both of our object classes must implement an accept() (save) function, which can receive Visitors (Savers) of different types.

I will write it in pseudo-Python code, to show that the function's parameter can be of any of these types:

class Text():
   data = "Text"
   def accept(self, v <txtVisitor> || v <jpgVisitor> || v <blobVisitor>):
      v.visit(self)

class Picture():
   data = "Picture"
   def accept(self, v <txtVisitor> || v <jpgVisitor> || v <blobVisitor>):
      v.visit(self)

Now accept() can be called on any object to save it in the provided format.

On the other side, each Visitor class must implement visit() functions, which actually save the object in the desired format.
So it must be a set of functions, each of which receives an object of a certain type.

class txtVisitor():
   def visit(self, o <Text Object>):
      print("Saving Text object in txt format")
   def visit(self, o <Picture Object>):
      print("Cannot save Picture in txt format")

class jpgVisitor():
   def visit(self, o <Text Object>):
      print("Cannot save Text in jpg format")
   def visit(self, o <Picture Object>):
      print("Saving Picture in jpg format")

class blobVisitor():
   def visit(self, o <Text Object>):
      print("Saving Text in blob format")
   def visit(self, o <Picture Object>):
      print("Saving Picture in blob format")


Now suppose we have a list of different objects and a list of visitors:

lst = [Text(), Picture()]
vst = [txtVisitor(), jpgVisitor(), blobVisitor()]
for o in lst:
  for v in vst:
    o.accept(v)

The first dispatch happens in o.accept(v):
  • the accept() function is called according to the type of the object, i.e. Text or Picture, and the Visitor is passed to it;
  • inside accept() we know nothing about the type of the Visitor; we know the type of the object, because the dispatch has already happened.
The second dispatch happens in v.visit(o):
  • the visit() function is called according to the type of the Visitor;
  • inside visit(), the types of both objects are known, so we actually save the desired object in the desired format.
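
The two steps can be traced with a small runnable sketch. This is not the code from the linked example: plain Python cannot declare several visit() methods that differ only in parameter type, so the sketch assumes per-type method names (visit_Text, visit_Picture) chosen by each accept(). It also returns strings and records a hypothetical trace list instead of printing.

```python
# Illustrative sketch: the trace list and per-type method names are assumptions.
trace = []  # records which method each dispatch step selected


class Text:
    def accept(self, v):
        # First dispatch is done: Python chose Text.accept from the object's type.
        trace.append("Text.accept")
        # Second dispatch: the type of v decides whose visit_Text runs.
        return v.visit_Text(self)


class Picture:
    def accept(self, v):
        trace.append("Picture.accept")
        return v.visit_Picture(self)


class txtVisitor:
    def visit_Text(self, o):
        trace.append("txtVisitor.visit_Text")
        return "Saving Text in txt format"

    def visit_Picture(self, o):
        trace.append("txtVisitor.visit_Picture")
        return "Cannot save Picture in txt format"


class jpgVisitor:
    def visit_Text(self, o):
        trace.append("jpgVisitor.visit_Text")
        return "Cannot save Text in jpg format"

    def visit_Picture(self, o):
        trace.append("jpgVisitor.visit_Picture")
        return "Saving Picture in jpg format"


results = [o.accept(v)
           for o in (Text(), Picture())
           for v in (txtVisitor(), jpgVisitor())]
for step in trace:
    print(step)
```

Each pair of trace entries shows one accept() chosen by the type of the object, followed by one visit method chosen by the type of the Visitor.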

Implementation in Python
Since there is no function overloading in Python, we can't declare several visit() functions that differ only in their parameter types.

We need to provide another way to call the right visit() function. It can be done by defining several functions with different names, one for each type of object.

class txtVisitor():
   def visit_Text(self, o):
      print("Saving Text object in txt format")
   def visit_Picture(self, o):
      print("Cannot save Picture in txt format")

And so on for each Visitor class.

We will also implement a visit() function, which determines the type of the object and calls the proper method.
We'll do it in a Visitor class, which will be the parent of the other Visitors:

class Visitor(object): # object is the base Python class
   def visit(self, o):
      # class name is Text or Picture
      method_name = "visit_" + o.__class__.__name__

      # look up attribute 'visit_ClassName'; None is returned if it is not found
      method = getattr(self, method_name, None)

      # if this Visitor has no attribute 'visit_ClassName',
      # an error message is printed
      if method is None:
         self.error_message(o)
         return
      if callable(method): # check that the found attribute is a function
         method(o) # call it
      else:
         print("%s is not a callable attribute" % method_name)

   def error_message(self, o):
      print("Error: '%s' can't save '%s'" % (self.__class__.__name__, o.data))
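
To see the three branches of visit() in action, here is a small Python 3 sketch of the same lookup. It returns strings instead of printing so the results are easy to check; the Audio class and the non-callable visit_Picture attribute are made up purely to trigger the fallback branches.

```python
class Visitor:
    def visit(self, o):
        # Build the method name from the runtime class of the object.
        method_name = "visit_" + type(o).__name__
        method = getattr(self, method_name, None)
        if method is None:
            # No 'visit_ClassName' attribute at all.
            return self.error_message(o)
        if not callable(method):
            # The attribute exists but is not a function.
            return "%s is not a callable attribute" % method_name
        return method(o)

    def error_message(self, o):
        return "Error: '%s' can't save '%s'" % (type(self).__name__, o.data)


class Text:
    data = "Text"


class Picture:
    data = "Picture"


class Audio:  # hypothetical type that no visitor knows about
    data = "Audio"


class txtVisitor(Visitor):
    visit_Picture = 42  # deliberately not a function (illustration only)

    def visit_Text(self, o):
        return "Saving Text object in txt format"


v = txtVisitor()
outcomes = [v.visit(Text()), v.visit(Picture()), v.visit(Audio())]
for line in outcomes:
    print(line)
```

The first call finds visit_Text and runs it, the second finds an attribute that is not callable, and the third falls back to error_message().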


We can also move accept() out of the object classes and define it in a base class like this:

class File(object):
   def accept(self, v):
      v.visit(self)

class Text(File):
   data = "Text"

class Picture(File):
   data = "Picture"


So dispatch actually happens here:
        method_name = "visit_" + o.__class__.__name__
and we need only one dispatch in Python.

Why do we need only one dispatch in Python?
Since every value in Python is an object that knows its own class at run time, we don't need two dispatches.
We need only the "first" one, where we determine the type of the object from its class name.
The call o.accept(v) can be replaced with v.visit(o), and accept() can be deleted.
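
A quick way to check this claim is to run both call styles side by side. The sketch below is a condensed Python 3 variant of the classes above that returns strings instead of printing (an assumption made so the results can be compared); it is not the code from the linked example.

```python
class Visitor:
    def visit(self, o):
        # Single dispatch step: look up 'visit_<ClassName>' on this visitor.
        method = getattr(self, "visit_" + type(o).__name__, None)
        if callable(method):
            return method(o)
        return "Error: '%s' can't save '%s'" % (type(self).__name__, o.data)


class File:
    def accept(self, v):
        return v.visit(self)


class Text(File):
    data = "Text"


class Picture(File):
    data = "Picture"


class txtVisitor(Visitor):
    def visit_Text(self, o):
        return "Saving Text object in txt format"


class jpgVisitor(Visitor):
    def visit_Picture(self, o):
        return "Saving Picture in jpg format"


class blobVisitor(Visitor):
    def visit_Text(self, o):
        return "Saving Text in blob format"

    def visit_Picture(self, o):
        return "Saving Picture in blob format"


objects = [Text(), Picture()]
visitors = [txtVisitor(), jpgVisitor(), blobVisitor()]

via_accept = [o.accept(v) for o in objects for v in visitors]
direct = [v.visit(o) for o in objects for v in visitors]

for line in direct:
    print(line)
print("same results:", via_accept == direct)
```

Both lists come out identical, because accept() only forwards to visit() and the lookup by class name does all the work.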

But what if 'o' is not an object, but a pointer to the base class 'File'?
(In C++, for example, it's impossible to define an array of elements of different types; we must define an array of pointers to the base class: File *arr[];)
Without the first dispatch, a parameter of type 'File' would be passed into visit(o).
There is no visit() that accepts 'File' as an argument. That's why the first dispatch is needed.

You can find full example here:
   https://github.com/yakov-g/py_visitor/blob/master/py_visitor.py

References:
   [1] http://en.wikipedia.org/wiki/Visitor_pattern

External links:
   http://en.wikipedia.org/wiki/Double_dispatch
   http://pythonwise.blogspot.co.il/2006/06/visitor-design-pattern.html

I'll be happy to hear comments from you. :)