NAME
Tree::Simple - A simple tree object
SYNOPSIS
use Tree::Simple;
# make a tree root
my $tree = Tree::Simple.new("0", $Tree::Simple::ROOT);
# explicitly add a child to it
$tree.addChild(Tree::Simple.new("1"));
# specify the parent when creating
# an instance and it adds the child implicitly
my $sub_tree = Tree::Simple.new("2", $tree);
# chain method calls
$tree.getChild(0).addChild(Tree::Simple.new("1.1"));
# add more than one child at a time
$sub_tree.addChildren(
Tree::Simple.new("2.1"),
Tree::Simple.new("2.2")
);
# add siblings
$sub_tree.addSibling(Tree::Simple.new("3"));
# insert a child at a specified index
$sub_tree.insertChild(1, Tree::Simple.new("2.1a"));
# clean up circular references
$tree.DESTROY();
DESCRIPTION
This module is a fully object-oriented implementation of a simple n-ary
tree. It is built upon the concept of parent-child relationships, so
every B<Tree::Simple> object has both a parent and a set of
children (which may themselves have children, and so on). Every B<Tree::Simple>
object also has siblings, as they are just the children of their immediate
parent.
It can be used to model hierarchical information such as a file-system,
the organizational structure of a company, an object inheritance hierarchy,
versioned files from a version control system or even an abstract syntax
tree for use in a parser. It makes no assumptions as to your intended usage,
but instead simply provides the structure and means of accessing and
traversing said structure.
This module uses exceptions and a minimal Design By Contract style. All method
arguments are required unless the documentation says otherwise; if a required
argument is not defined, an exception will usually be thrown. Many arguments
are also required to be of a specific type. For instance, the $parent
argument to the constructor B<must> be a B<Tree::Simple> object or an object
derived from B<Tree::Simple>; otherwise an exception is thrown. This may seem
harsh to some, but it allows me to have the confidence that my code works as
I intend, and for you to enjoy the same level of confidence when using this
module. Note, however, that this module does not use any Exception or Error module;
the exceptions are just strings thrown with die.
CONSTANTS
=over 4
=item B<ROOT>
This class constant serves as a placeholder for the root of our tree. If a tree
does not have a parent, then it is considered a root.
=back
METHODS
=head2 Constructor
=over 4
=item B<new ($node, $parent)>
The constructor accepts two arguments: a $node value and an optional $parent.
The $node value can be any scalar value (which includes references and objects).
The optional $parent value must be a B<Tree::Simple> object, or an object
derived from B<Tree::Simple>. Setting this value implies that your new tree is a
child of the parent tree, and therefore adds it to the parent's children. If the
$parent is not specified then its value defaults to ROOT.
=back
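The constructor's behaviour can be illustrated with a small, self-contained
Python model. This is only a sketch of the semantics described above, not this
module's code: the Node class, its add_child method, and the use of None to
stand in for ROOT are all assumptions made for illustration.

```python
class Node:
    """Hypothetical stand-in for a tree node; None plays the role of ROOT."""

    def __init__(self, value, parent=None):
        # The parent, when given, must itself be a Node (or a subclass).
        if parent is not None and not isinstance(parent, Node):
            raise TypeError("parent must be a Node")
        self.value = value
        self.parent = None
        self.children = []
        if parent is not None:
            # Passing a parent implicitly registers the new node as its child.
            parent.add_child(self)

    def add_child(self, child):
        if not isinstance(child, Node):
            raise TypeError("child must be a Node")
        child.parent = self
        self.children.append(child)
        return self  # returning the invocant allows call chaining


root = Node("0")                                 # no parent, so this is a root
child = Node("1", root)                          # implicitly added to root
root.add_child(Node("2")).add_child(Node("3"))   # chained calls
print([c.value for c in root.children])          # ['1', '2', '3']
```

Passing a parent to the constructor and calling add_child on that parent end up
in the same state, which is why the sketch routes the first through the second.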
=head2 Mutator Methods
=over 4
=item B<setNodeValue ($node_value)>
This sets the node value to the scalar $node_value. An exception is thrown if
$node_value is not defined.
=item B<setUID ($uid)>
This allows you to set your own unique ID for this specific Tree::Simple object.
A default value derived from the object's hex address is provided for you, so use
of this method is entirely optional. It is the responsibility of the user to
ensure the value's uniqueness; all that is tested by this method is that $uid
is a true value (evaluates to true in a boolean context). For more information
about the Tree::Simple UID see the getUID method.
=item B<addChild ($tree)>
This method accepts only B<Tree::Simple> objects or objects derived from
B<Tree::Simple>; an exception is thrown otherwise. This method will append
the given $tree to the end of its children list, and set up the correct
parent-child relationships. This method returns its invocant so
that method calls can be chained, such as:
my $tree = Tree::Simple.new("root").addChild(Tree::Simple.new("child one"));
Or the more complex:
my $tree = Tree::Simple.new("root").addChild(
Tree::Simple.new("1.0").addChild(
Tree::Simple.new("1.0.1")
)
);
=item B<addChildren (@trees)>
This method accepts an array of B<Tree::Simple> objects, and adds them to
its children list. Like addChild, this method will return its invocant
to allow for method call chaining.
=item B<insertChild ($index, $tree)>
This method accepts a numeric $index and a B<Tree::Simple> object ($tree),
and inserts the $tree into the children list at the specified $index.
This results in the shifting down of all children after the $index. The
$index is checked to be sure it is within the bounds of the child list; if it
is out of bounds an exception is thrown. The $tree argument's type is
verified to be a B<Tree::Simple> or B<Tree::Simple> derived object; if
this condition fails, an exception is thrown.
=item B<insertChildren ($index, @trees)>
This method functions much as insertChild does, but instead of inserting a
single B<Tree::Simple>, it inserts an array of B<Tree::Simple> objects. It
too bounds checks the value of $index and type checks the objects in
@trees just as insertChild does.
=item B<removeChild ($child | $index)>
Accepts two different kinds of argument. If given a B<Tree::Simple> object ($child),
this method finds that specific $child by comparing it with all the other
children until it finds a match, at which point the $child is removed. If
no match is found, an exception is thrown. If a non-B<Tree::Simple> object
is given as the $child argument, an exception is thrown.
This method also accepts a numeric $index and removes the child found at
that index from its list of children. The $index is bounds checked; if
this condition fails, an exception is thrown.
When a child is removed, it results in the shifting up of all children after
it, and the removed child is returned. The removed child is properly
disconnected from the tree and all its references to its old parent are
removed. However, in order to properly clean up any circular references
the removed child might have, it is advised to call its DESTROY method.
See the L<CIRCULAR REFERENCES> section for more information.
=item B<addSibling ($tree)>
=item B<addSiblings (@trees)>
=item B<insertSibling ($index, $tree)>
=item B<insertSiblings ($index, @trees)>
The addSibling, addSiblings, insertSibling and insertSiblings
methods pass along their arguments to the addChild, addChildren,
insertChild and insertChildren methods of their parent object
respectively. This eliminates the need to overload these methods in subclasses
which may have specialized versions of the *Child(ren) methods. The one
exception is that if an attempt is made to add or insert siblings to the
B<ROOT> of the tree then an exception is thrown.
=back
B<NOTE:>
There is no removeSibling method as I felt it was probably a bad idea.
The same effect can be achieved by manual upwards traversal.
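The bounds checking and index shifting described for insertChild and
removeChild can be sketched in Python as follows. This is an illustrative
model only: the Node class and its method names are hypothetical, and the
choice to allow an insert at an index equal to the current child count is an
assumption, since the text above does not spell out the exact bounds.

```python
class Node:
    """Hypothetical node used only to illustrate insert/remove semantics."""

    def __init__(self, value):
        self.value = value
        self.parent = None
        self.children = []

    def insert_child(self, index, child):
        # Assumption: valid indices run from 0 up to and including the
        # current child count, so inserting at the very end is allowed.
        if not 0 <= index <= len(self.children):
            raise IndexError("index out of bounds")
        if not isinstance(child, Node):
            raise TypeError("child must be a Node")
        child.parent = self
        self.children.insert(index, child)  # later children shift down
        return self

    def remove_child(self, child_or_index):
        if isinstance(child_or_index, Node):
            # Compare by identity against each child until a match is found.
            for i, candidate in enumerate(self.children):
                if candidate is child_or_index:
                    index = i
                    break
            else:
                raise ValueError("child not found")
        else:
            index = child_or_index
            if not 0 <= index < len(self.children):
                raise IndexError("index out of bounds")
        removed = self.children.pop(index)  # later children shift up
        removed.parent = None               # disconnect from the old parent
        return removed


tree = Node("2")
tree.insert_child(0, Node("2.1")).insert_child(1, Node("2.2"))
tree.insert_child(1, Node("2.1a"))
print([c.value for c in tree.children])   # ['2.1', '2.1a', '2.2']
removed = tree.remove_child(0)
print(removed.value, [c.value for c in tree.children])
```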
=head2 Accessor Methods
=over 4
=item B<getNodeValue>
This returns the value stored in the object's node field.
=item B<getUID>
This returns the unique ID associated with this particular tree. This can
be custom set using the setUID method, or you can just use the default.
The default is the hex-address extracted from the stringified Tree::Simple
object. This may not be a I<universally> unique identifier, but it should
be adequate for at least the current instance of your perl interpreter. If
you need a UUID, one can be generated with an outside module (there are
many to choose from on CPAN) and the C<setUID> method (see above).
=item B<getChild ($index)>
This returns the child (a B<Tree::Simple> object) found at the specified
$index. Note that we use standard zero-based array indexing.
=item B<getAllChildren>
This returns an array of all the children (all B<Tree::Simple> objects).
It will return an array reference in scalar context.
=item B<getSibling ($index)>
=item B<getAllSiblings>
Much like addSibling and addSiblings, these two methods simply call
getChild and getAllChildren on the invocant's parent.
=item B<getDepth>
Returns a number representing the invocant's depth within the hierarchy of
B<Tree::Simple> objects.
B<NOTE:> A ROOT tree has the depth of -1. This is because Tree::Simple
assumes that a tree's root will usually not contain data, but just be an
anchor for the data-containing branches. This may not be intuitive in all
cases, so I mention it here.
=item B<getParent>
Returns the invocant's parent, which could be either B<ROOT> or a
B<Tree::Simple> object.
=item B<getHeight>
Returns a number representing the length of the longest path from the current
tree to the furthest leaf node.
=item B<getWidth>
Returns a number representing the breadth of the current tree; basically
it is a count of all the leaf nodes.
=item B<getChildCount>
Returns the number of children the invocant contains.
=item B<getIndex>
Returns the index of this tree within its parent's child list. Returns -1 if
the tree is the root.
=back
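To make the depth, height, width and index definitions above concrete, here is
a small Python sketch. It is not this module's implementation: the Node class
is hypothetical, the values are recomputed on each call for clarity, and the
convention that a leaf has a height of 1 (counting nodes rather than edges) is
an assumption.

```python
class Node:
    """Hypothetical node; metrics are recomputed on demand for clarity."""

    def __init__(self, value, parent=None):
        self.value = value
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def depth(self):
        # A root sits at -1, so its direct children are at depth 0.
        return -1 if self.parent is None else self.parent.depth() + 1

    def height(self):
        # Assumption: nodes are counted, so a leaf has height 1.
        return 1 + max((c.height() for c in self.children), default=0)

    def width(self):
        # Number of leaf nodes at or below this node.
        if not self.children:
            return 1
        return sum(c.width() for c in self.children)

    def index(self):
        # Position within the parent's child list, or -1 for a root.
        if self.parent is None:
            return -1
        return next(i for i, c in enumerate(self.parent.children) if c is self)


root = Node("0")
a = Node("1", root)
b = Node("2", root)
a1 = Node("1.1", a)
print(root.depth(), a.depth(), a1.depth())   # -1 0 1
print(root.height(), root.width())           # 3 2
print(root.index(), b.index())               # -1 1
```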
=head2 Predicate Methods
=over 4
=item B<isLeaf>
Returns true (1) if the invocant does not have any children, false (0) otherwise.
=item B<isRoot>
Returns true (1) if the invocant's "parent" field is B<ROOT>, returns false
(0) otherwise.
=back
=head2 Recursive Methods
=over 4
=item B<traverse ($func, ?$postfunc)>
This method accepts two arguments: a mandatory $func and an optional
$postfunc. If the argument $func is not defined then an exception
is thrown. If $func or $postfunc are not in fact CODE references
then an exception is thrown. The function $func is then applied
recursively to all the children of the invocant. If given, the function
$postfunc will be applied to each child after the child's children
have been traversed.
Here is an example of a traversal function that will print out the
hierarchy as a tab-indented list.
$tree.traverse(sub {
my ($_tree) = @_;
print (("\t" x $_tree.getDepth()), $_tree.getNodeValue(), "\n");
});
Here is an example of a traversal function that will print out the
hierarchy in an XML-style format.
$tree.traverse(sub {
my ($_tree) = @_;
print ((' ' x $_tree.getDepth()),
'<', $_tree.getNodeValue(),'>',"\n");
},
sub {
my ($_tree) = @_;
print ((' ' x $_tree.getDepth()),
'</', $_tree.getNodeValue(),'>',"\n");
});
=item B<size>
Returns the total number of nodes in the current tree and all its sub-trees.
=item B<height>
This method has been B<deprecated> in favor of the getHeight method above;
it remains as an alias to getHeight for backwards compatibility.
B<NOTE:> This is also no longer a recursive method which gets its value on demand,
but a value stored in the Tree::Simple object itself, hopefully making it much
more efficient and usable.
=back
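The order in which traverse calls $func and $postfunc can be sketched in
Python. The Node class below is a hypothetical model written for this
illustration. As described above, the functions are applied to the invocant's
descendants and not to the invocant itself.

```python
class Node:
    """Hypothetical node with a pre/post-order traversal."""

    def __init__(self, value, parent=None):
        self.value = value
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def depth(self):
        return -1 if self.parent is None else self.parent.depth() + 1

    def traverse(self, func, postfunc=None):
        if not callable(func):
            raise TypeError("func must be callable")
        if postfunc is not None and not callable(postfunc):
            raise TypeError("postfunc must be callable")
        for child in self.children:
            func(child)                     # before the child's subtree
            child.traverse(func, postfunc)
            if postfunc is not None:
                postfunc(child)             # after the child's subtree


root = Node("root")
a = Node("a", root)
Node("b", a)
Node("c", root)

lines = []
root.traverse(
    lambda n: lines.append("  " * n.depth() + "<" + n.value + ">"),
    lambda n: lines.append("  " * n.depth() + "</" + n.value + ">"),
)
print("\n".join(lines))
```

Each opening line is emitted on the way down and each closing line on the way
back up, which is what produces the nested XML-style output.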
=head2 Visitor Methods
=over 4
=item B<accept ($visitor)>
It accepts either a B<Tree::Simple::Visitor> object (which includes classes derived
from B<Tree::Simple::Visitor>), or an object that has the C<visit> method available
(tested with C<$visitor-E<gt>can('visit')>). If these qualifications are not met,
an exception will be thrown. We then run the Visitor's C<visit> method, giving the
current tree as its argument.
I have also created a number of Visitor objects and packaged them into the
B<Tree::Simple::VisitorFactory>.
=back
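The capability check that accept performs can be sketched in Python with duck
typing. Everything named here (Node, CollectValues, and the way the visitor
walks the children) is hypothetical and only meant to show the shape of the
protocol, not how any of the packaged Visitor classes behave.

```python
class Node:
    """Hypothetical node that accepts any object exposing a visit method."""

    def __init__(self, value):
        self.value = value
        self.children = []

    def accept(self, visitor):
        # Duck typing: anything with a callable `visit` attribute qualifies.
        if not callable(getattr(visitor, "visit", None)):
            raise TypeError("visitor must provide a visit method")
        return visitor.visit(self)


class CollectValues:
    """Hypothetical visitor that gathers node values depth-first."""

    def __init__(self):
        self.values = []

    def visit(self, tree):
        self.values.append(tree.value)
        for child in tree.children:
            child.accept(self)


root = Node("root")
root.children.append(Node("a"))
root.children.append(Node("b"))
visitor = CollectValues()
root.accept(visitor)
print(visitor.values)   # ['root', 'a', 'b']
```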
=head2 Cloning Methods
Cloning a tree can be an extremely expensive operation for large trees, so we provide
two options for cloning, a deep clone and a shallow clone.
When a Tree::Simple object is cloned, the node is deep-copied in the following manner.
If we find a normal scalar value (non-reference), we simply copy it. If we find an
object, we attempt to call clone on it, otherwise we just copy the reference (since
we assume the object does not want to be cloned). If we find a SCALAR or REF reference, we
copy the value contained within it. If we find a HASH or ARRAY reference, we copy the
reference and recursively copy all the elements within it (following these exact
guidelines). We also do our best to ensure that circular references are cloned
only once and connections restored correctly. This cloning will not be able to copy
CODE, RegExp and GLOB references, as they are pretty much impossible to clone. We
also do not handle tied objects; they will simply be copied as plain
references, and not re-tied.
=over 4
=item B<clone>
The clone method does a full deep-copy clone of the object, calling clone recursively
on all its children. This does not call clone on the parent tree, however. Doing
this would result in a slowly degenerating spiral of recursive death, so it is not
recommended and therefore not implemented. What happens is that the tree instance
that clone is actually called upon is detached from the tree and becomes a root
node; all of the cloned children are then attached as children of that tree. I personally
think this is more intuitive than having the cloning crawl back I<up> the tree, which is not
what I think most people would expect.
=item B<cloneShallow>
This method is an alternate option to the plain clone method. This method allows the
cloning of a single B<Tree::Simple> object while retaining connections to the rest of the
tree/hierarchy.
=back
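The difference between the two cloning options can be sketched in Python. This
is a simplified model, not this module's code: the Node class is hypothetical,
copy.deepcopy stands in for the node-copying rules described above, and the
exact way the shallow clone shares its child list is an assumption.

```python
import copy


class Node:
    """Hypothetical node illustrating deep versus shallow cloning."""

    def __init__(self, value, parent=None):
        self.value = value
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def clone(self):
        # Deep copy of the value and of every descendant. The copy is not
        # attached to the original parent, so it becomes a root.
        twin = Node(copy.deepcopy(self.value))
        for child in self.children:
            child_twin = child.clone()
            child_twin.parent = twin
            twin.children.append(child_twin)
        return twin

    def clone_shallow(self):
        # Copies only this node. The value is copied, but the parent link
        # and the entries of the child list still point at the originals.
        twin = Node(copy.deepcopy(self.value))
        twin.parent = self.parent
        twin.children = list(self.children)
        return twin


root = Node("root")
branch = Node({"name": "branch"}, root)
leaf = Node("leaf", branch)

deep = branch.clone()
shallow = branch.clone_shallow()
print(deep.parent is None, deep.children[0] is leaf)         # True False
print(shallow.parent is root, shallow.children[0] is leaf)   # True True
```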
=head2 Misc. Methods
=over 4
=item B<DESTROY>
To avoid memory leaks through uncleaned-up circular references, we implement the
DESTROY method. This method will attempt to call DESTROY on each of its
children (if it has any). This will result in a cascade of calls to DESTROY on
down the tree. It also cleans up its parental relations.
Because of perl's reference counting scheme and how that interacts with circular
references, if you want an object to be properly reaped you should manually call
DESTROY. This is especially necessary if your object has any children. See the
section on L<CIRCULAR REFERENCES> for more information.
=item B<fixDepth>
Tree::Simple will manage your tree's depth field for you using this method. You
should never need to call it on your own; however, if you ever did need to, here
it is. Running this method will traverse all the invocant's sub-trees,
correcting the depth as it goes.
=item B<fixHeight>
Tree::Simple will manage your tree's height field for you using this method.
You should never need to call it on your own; however, if you ever did need to,
here it is. Running this method will correct the heights of the current tree
and all its ancestors.
=item B<fixWidth>
Tree::Simple will manage your tree's width field for you using this method. You
should never need to call it on your own; however, if you ever did need to,
here it is. Running this method will correct the widths of the current tree
and all its ancestors.
=back
=head2 Private Methods
I would not normally document private methods, but in case you need to subclass
Tree::Simple, here they are.
=over 4
=item B<_init ($node, $parent, $children)>
This method is here largely to facilitate subclassing. This method is called by
new to initialize the object, where new's primary responsibility is creating
the instance.
=item B<!setParent ($parent)>
This method sets up the parental relationship. It is for internal use only.
=item B<_setHeight ($child)>
This method will set the height field based upon the height of the given $child .
=back
CIRCULAR REFERENCES
I have revised the model by which Tree::Simple deals with circular references.
In the past all circular references had to be manually destroyed by calling
DESTROY. The call to DESTROY would then call DESTROY on all the children, and
therefore cascade down the tree. This however was not always what was needed,
nor what made sense, so I have now revised the model to handle things in what
I feel is a more consistent and sane way.
Circular references are now managed with the simple idea that the parent makes
the decisions for the child. This means that child-to-parent references are
weak, while parent-to-child references are strong. So if a parent is destroyed
it will force all its children to detach from it; however, if a child is
destroyed it will not be detached from its parent.
=head2 Optional Weak References
By default, you are still required to call DESTROY in order for things to
happen. However I have now added the option to use weak references, which
alleviates the need for the manual call to DESTROY and allows Tree::Simple
to manage this automatically. This is accomplished with a compile time
setting like this:
use Tree::Simple 'use_weak_refs';
And from that point on Tree::Simple will use weak references to allow for
perl's reference counting to clean things up properly.
For those who are unfamiliar with weak references, and how they affect the
reference counts, here is a simple illustration. First is the normal model
that Tree::Simple uses:
 +---------------+
 | Tree::Simple1 |<---------------------+
 +---------------+                      |
 | parent        |                      |
 | children      |-+                    |
 +---------------+ |                    |
                   |                    |
                   |  +---------------+ |
                   +->| Tree::Simple2 | |
                      +---------------+ |
                      | parent        |-+
                      | children      |
                      +---------------+
Here, Tree::Simple1 has a reference count of 2 (one for the original
variable it is assigned to, and one for the parent reference in
Tree::Simple2), and Tree::Simple2 has a reference count of 1 (for the
child reference in Tree::Simple1).
Now, with weak references:
 +---------------+
 | Tree::Simple1 |.......................
 +---------------+                      :
 | parent        |                      :
 | children      |-+                    : <--[ weak reference ]
 +---------------+ |                    :
                   |                    :
                   |  +---------------+ :
                   +->| Tree::Simple2 | :
                      +---------------+ :
                      | parent        |..
                      | children      |
                      +---------------+
Now Tree::Simple1 has a reference count of 1 (for the variable it is
assigned to) and 1 weakened reference (for the parent reference in
Tree::Simple2). And Tree::Simple2 has a reference count of 1, just
as before.
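The effect of a weak child-to-parent reference can be demonstrated in Python,
which also uses reference counting in its standard implementation. This is an
analogy rather than this module's code: the Node class is hypothetical and
Python's weakref module stands in for Perl's weakened references.

```python
import gc
import weakref


class Node:
    """Hypothetical node whose child-to-parent link is a weak reference."""

    def __init__(self, value):
        self.value = value
        self._parent = None    # weak reference to the parent, if any
        self.children = []     # strong references to the children

    @property
    def parent(self):
        return self._parent() if self._parent is not None else None

    def add_child(self, child):
        child._parent = weakref.ref(self)  # does not keep the parent alive
        self.children.append(child)
        return self


parent = Node("Tree::Simple1")
child = Node("Tree::Simple2")
parent.add_child(child)
print(child.parent is parent)   # True

del parent      # drop the only strong reference to the parent
gc.collect()    # only needed on runtimes without reference counting
print(child.parent)             # None
```

Because the child holds no strong reference back to its parent, dropping the
last variable that refers to the parent is enough for it to be reclaimed, with
no manual cleanup call.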
BUGS
None that I am aware of. The code is pretty thoroughly tested (see
L<CODE COVERAGE> below) and is based on a (non-publicly released)
module which I had used in production systems for about 3 years without
incident. Of course, if you find a bug, let me know, and I will be sure
to fix it.
SEE ALSO
I have written a number of other modules which use or augment this
module; they are described below and available on CPAN.
=over 4
=item L<Tree::Parser> - A module for parsing formatted files into Tree::Simple hierarchies.
=item L<Tree::Simple::View> - A set of classes for viewing Tree::Simple hierarchies in various output formats.
=item L<Tree::Simple::VisitorFactory> - A set of several useful Visitor objects for Tree::Simple objects.
=item L<Tree::Binary> - If you are looking for a binary tree, you might want to check this one out.
=back
Also, the author of L<Data::TreeDumper> and I have worked together
to make sure that B<Tree::Simple> and his module work well together.
If you need a quick and handy way to dump out a Tree::Simple hierarchy,
this module does an excellent job (and plenty more as well).
I have also recently stumbled upon some packaged distributions of
Tree::Simple for the various Unix flavors. Here are some links:
=over 4
=back
ACKNOWLEDGEMENTS
=over 4
=item Thanks to Nadim Ibn Hamouda El Khemir for making L<Data::TreeDumper> work
with B<Tree::Simple>.
=item Thanks to Brett Nuske for his idea for the getUID and setUID methods.
=item Thanks to whoever submitted the memory leak bug to RT (#7512).
=item Thanks to Mark Thomas for his insight into how to best handle the I<height>
and I<width> properties without unnecessary recursion.
=item Thanks to Mark Lawrence for the &traverse post-func patch, tests and docs.
=back
AUTHOR
Original Authors of Perl 5 version on CPAN
Stevan Little, E<lt>stevan@iinteractive.comE<gt>
Rob Kinyon, E<lt>rob@iinteractive.comE<gt>
Current Author
Philip mabon, E<lt>philipmabon@gmail.comE<gt>
COPYRIGHT AND LICENSE
Copyright 2010 by Philip mabon
This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.
=cut