Manage hierarchical data with MySQL stored procedures

Below you will find all code to create the table and the stored procedures to manage hierarchical trees in MySQL.
The following stored procedures are provided:

  • tree_add_root()
  • tree_add_node(id,name,label,description)
  • tree_update_node(id,label,description)
  • tree_get_all(depth,indentstring)
  • tree_get_branch(id,depth,indentstring)
  • tree_get_branch_by_name(parentname,name,depth,indentstring)
  • tree_get_parents(id,indentstring)
  • tree_swap_leafs(id1,id2)
  • tree_del(id)

All code is documented and can be downloaded in a zip file.
Note that I do not describe the parameters for each procedure in this article. The parameters are described in the code.
Also, the procedures try to verify the input of the stored procedures and appropriate error messages will be returned if needed.

The method used to store a tree in the database:
Almost all applications need to store and manage hierarchical data in a database.
There are several ways of doing this. The ‘adjacency list model’ or the ‘recursion method’ is the most commonly used model.
A table created for this method will look something like this:

CREATE TABLE categories (
id       INT NOT NULL  PRIMARY KEY
, name     VARCHAR(50) NOT NULL
, parent_id INT NULL
, FOREIGN KEY parent_id_fk (parent_id)
REFERENCES categories (id)
);

This is the most used method because it is very simple to understand the relations by looking at the data, very simple to insert the data and relative simple to retrieve it.
But retrieving the data can be slow and inefficient. You will need multiple joins or recursive functions in your application to retrieve the results you want.

Another way to store trees in your database is the Modified Preorder Tree Traversal method.
A table created for this method will look something like this:

CREATE TABLE categories(
id     INT NOT NULL PRIMARY KEY
,name   VARCHAR(50) NOT NULL
,lft    INT NOT NULL
,rht    INT NOT NULL
);

In this model there is no parent_id but the left and right columns define the relations. See this article by Joe Celko for an explanation.

With this model you can retrieve the whole tree as simple as this:
SELECT * FROM categories ORDER BY lft;

To retrieve a branch in the tree:
SELECT * FROM categories WHERE lft BETWEEN 1 AND 20 ORDER BY lft;

So, it is very simple and much more efficient to retrieve the data with this method.

How to set it up?
Install the table:

mysql>source create_tables.sql

This will create the table ‘trees’ if it does not already exists.

Install the stored procedures:

mysql>source create_procedures.sql

This will create the procedures as described above. Note that they will overwrite existing stored procedures if they have the same name. Be careful!

Set up test data:
mysql>source populate_trees.sql

How to use the stored procedures
To test if everything works run the procedure to show the whole tree and have the branches indented by ‘–’:

mysql>call tree_get_all(NULL,'--');
+---------+-----------+---------------------+------------------+-----+-----------+
| tree_id | name      | label               | description      | lvl | is_branch |
+---------+-----------+---------------------+------------------+-----+-----------+
|       1 | Root      | Root                | NULL             |   0 |         1 |
|       2 | pages     | --Pages             | Pages in website |   1 |         1 |
|       4 | Home      | ----Home            | NULL             |   2 |         1 |
|       5 | About     | ------About us      | NULL             |   3 |         1 |
|       6 | Contact   | --------Contact us  | NULL             |   4 |         0 |
|       7 | Events    | --------Our events  | NULL             |   4 |         0 |
|       8 | Service   | ------Service       | NULL             |   3 |         1 |
|       9 | Products  | --------Products    | NULL             |   4 |         1 |
|      10 | Search    | ----------Search    | NULL             |   5 |         0 |
|      11 | Reviews   | ----------Reviews   | NULL             |   5 |         0 |
|      25 | Furniture | ----------Furniture | NULL             |   5 |         0 |
|      12 | FAQ       | --------FAQ         | NULL             |   4 |         0 |
|      13 | Order     | ------Shoppingcart  | NULL             |   3 |         0 |
|       3 | products  | --Products          | Product list     |   1 |         1 |
|      14 | Furniture | ----Furniture       | NULL             |   2 |         1 |
|      16 | Living    | ------Living        | NULL             |   3 |         1 |
|      19 | Chairs    | --------Chairs      | NULL             |   4 |         0 |
|      20 | Tables    | --------Tables      | NULL             |   4 |         0 |
|      17 | Kitchen   | ------Kitchen       | NULL             |   3 |         0 |
|      18 | Bedroom   | ------Bedroom       | NULL             |   3 |         1 |
|      21 | Chairs    | --------Chairs      | NULL             |   4 |         0 |
|      22 | Tables    | --------Tables      | NULL             |   4 |         0 |
|      23 | Beds      | --------Beds        | NULL             |   4 |         0 |
|      24 | Closets   | --------Closets     | NULL             |   4 |         0 |
|      15 | Office    | ----Office suplies  | NULL             |   2 |         0 |
+---------+-----------+---------------------+------------------+-----+-----------+
25 rows in set (0.00 sec)

The example data has two trees, one for the navigation in our website and another tree for products that we sell.
tree_id : The id of the node, this is the primary key
name : name of the node, must be unique (case sensitive) in a branch
label : the pretty name of the node
description : an optional long description of the node
lvl : the level of the node in the tree
is_branch : if it is ‘1′ there is a sub-branch available. If it is ‘0′ then this node is a leaf in the tree.

Now lets get the first two levels in the tree indented by 2 spaces:

mysql>call tree_get_all(2,'  ');
+---------+-----------+--------------------+------------------+-----+-----------+
| tree_id | name      | label              | description      | lvl | is_branch |
+---------+-----------+--------------------+------------------+-----+-----------+
|       1 | Root      | Root               | NULL             |   0 |         1 |
|       2 | pages     |   Pages            | Pages in website |   1 |         1 |
|       4 | Home      |     Home           | NULL             |   2 |         1 |
|       3 | products  |   Products         | Product list     |   1 |         1 |
|      14 | Furniture |     Furniture      | NULL             |   2 |         1 |
|      15 | Office    |     Office suplies | NULL             |   2 |         0 |
+---------+-----------+--------------------+------------------+-----+-----------+
6 rows in set (0.00 sec)

Get only a branch of the tree:
Get the ‘Products’ branch and only it’s direct children and do not indent it:

mysql>call tree_get_branch(3,1,NULL);
+---------+-----------+----------------+--------------+-----+-----------+
| tree_id | name      | label          | description  | lvl | is_branch |
+---------+-----------+----------------+--------------+-----+-----------+
|       3 | products  | Products       | Product list |   1 |         1 |
|      14 | Furniture | Furniture      | NULL         |   2 |         1 |
|      15 | Office    | Office suplies | NULL         |   2 |         0 |
+---------+-----------+----------------+--------------+-----+-----------+
3 rows in set (0.00 sec)

Or get a branch by it’s name.
Because it is very likely that a tree has duplicate names in it, you need to provide the name of the parent and the name of the node that you want to retrieve. If this combination is used more then once in the tree, only one result is retrieved.
This procedure is not as efficient as the one above and in case of duplicate parent-child combinations the result can be unpredictable.
But it is much more verbose to ask for ‘Root’,'products’ then only for ‘id:3′.
Note that the names are case sensitive.
E.g.;

mysql>call tree_get_branch_by_name('Root','products',1,NULL);
+---------+-----------+----------------+--------------+-----+-----------+
| tree_id | name      | label          | description  | lvl | is_branch |
+---------+-----------+----------------+--------------+-----+-----------+
|       3 | products  | Products       | Product list |   1 |         1 |
|      14 | Furniture | Furniture      | NULL         |   2 |         1 |
|      15 | Office    | Office suplies | NULL         |   2 |         0 |
+---------+-----------+----------------+--------------+-----+-----------+
3 rows in set (0.00 sec)

To retrieve the parents of the node ‘Beds’, for a breadcrumb for example, use;

mysql>call tree_get_parents(23,NULL);
+---------+-----------+-----------+--------------+-----+
| tree_id | name      | label     | description  | lvl |
+---------+-----------+-----------+--------------+-----+
|       1 | Root      | Root      | NULL         |   0 |
|       3 | products  | Products  | Product list |   1 |
|      14 | Furniture | Furniture | NULL         |   2 |
|      18 | Bedroom   | Bedroom   | NULL         |   3 |
+---------+-----------+-----------+--------------+-----+
4 rows in set (0.00 sec)

Adding, removing and changing the tree:

Add the root to the tree:
When you first set up the table you first need to add a root to the tree. You only need to do this once.
This will give you an error message if the table trees is not empty.
mysql>call tree_add_root();

Add a new node to the tree:

mysql>call tree_add_node(18,'beds',NULL,NULL);
+---------+
| tree_id |
+---------+
|      26 |
+---------+
1 row in set (0.01 sec)

This has added the node ‘beds’ to the parent ‘Bedroom’. The new tree_id for this node is returned.
We left the ‘label’ parameter empty so the name will be used as the label too.

Note that in the parent ‘Bedroom’ there is already another node called ‘Beds’.
We are allowed to add this node name because the names of the nodes are case sensitive. Another ‘Beds’ would generate an error message.
Let’s see the result:

mysql:call tree_get_branch(18,1,'  ');
+---------+---------+-----------+-------------+-----+-----------+
| tree_id | name    | label     | description | lvl | is_branch |
+---------+---------+-----------+-------------+-----+-----------+
|      18 | Bedroom | Bedroom   | NULL        |   3 |         1 |
|      21 | Chairs  |   Chairs  | NULL        |   4 |         0 |
|      22 | Tables  |   Tables  | NULL        |   4 |         0 |
|      23 | Beds    |   Beds    | NULL        |   4 |         0 |
|      24 | Closets |   Closets | NULL        |   4 |         0 |
|      26 | beds    |   beds    | NULL        |   4 |         0 |
+---------+---------+-----------+-------------+-----+-----------+
6 rows in set (0.00 sec)

Change a node:
We noticed that the label and description of the newly inserted node is not correct and we can change it like this:

mysql>call tree_update_node(26,'Beds','other beds');
Query OK, 1 row affected (0.04 sec)

mysql:call tree_get_branch(18,1,'  ');
+---------+---------+-----------+-------------+-----+-----------+
| tree_id | name    | label     | description | lvl | is_branch |
+---------+---------+-----------+-------------+-----+-----------+
|      18 | Bedroom | Bedroom   | NULL        |   3 |         1 |
|      21 | Chairs  |   Chairs  | NULL        |   4 |         0 |
|      22 | Tables  |   Tables  | NULL        |   4 |         0 |
|      23 | Beds    |   Beds    | NULL        |   4 |         0 |
|      24 | Closets |   Closets | NULL        |   4 |         0 |
|      26 | beds    |   Beds    | other beds  |   4 |         0 |
+---------+---------+-----------+-------------+-----+-----------+
6 rows in set (0.00 sec)

Change the order of leafs:
We can change the order of the leafs like this:

mysql>call tree_swap_leafs(24,26);
Query OK, 0 rows affected (0.04 sec)

mysql> call tree_get_branch(18,1,'  ');
+---------+---------+-----------+-------------+-----+-----------+
| tree_id | name    | label     | description | lvl | is_branch |
+---------+---------+-----------+-------------+-----+-----------+
|      18 | Bedroom | Bedroom   | NULL        |   3 |         1 |
|      21 | Chairs  |   Chairs  | NULL        |   4 |         0 |
|      22 | Tables  |   Tables  | NULL        |   4 |         0 |
|      23 | Beds    |   Beds    | NULL        |   4 |         0 |
|      26 | beds    |   Beds    | other beds  |   4 |         0 |
|      24 | Closets |   Closets | NULL        |   4 |         0 |
+---------+---------+-----------+-------------+-----+-----------+
6 rows in set (0.00 sec)

Remove a node:
And finally we can remove a node from the tree.
If the node has children, these children will be removed too.

mysql>call tree_del(18);
Query OK, 6 rows affected (0.01 sec)

mysql> call tree_get_branch(18,1,'  ');
ERROR 1644 (45000): The tree_id does not exists.

Let’s try it one level higher to see the whole furniture branch:

mysql>call tree_get_branch(14,NULL,'  ');
+---------+-----------+------------+-------------+-----+-----------+
| tree_id | name      | label      | description | lvl | is_branch |
+---------+-----------+------------+-------------+-----+-----------+
|      14 | Furniture | Furniture  | NULL        |   2 |         1 |
|      16 | Living    |   Living   | NULL        |   3 |         1 |
|      19 | Chairs    |     Chairs | NULL        |   4 |         0 |
|      20 | Tables    |     Tables | NULL        |   4 |         0 |
|      17 | Kitchen   |   Kitchen  | NULL        |   3 |         0 |
+---------+-----------+------------+-------------+-----+-----------+
5 rows in set (0.00 sec)

UPDATE
In response to the question from Max about counting the children of each node I added a new column to the output of the tree result.
The new column is called “cnt_children”.

mysql>call tree_get_all(NULL,'--');
+---------+-----------+---------------------+------------------+-----+--------------+-----------+
| tree_id | name      | label               | description      | lvl | cnt_children | is_branch |
+---------+-----------+---------------------+------------------+-----+--------------+-----------+
|       1 | Root      | Root                | NULL             |   0 | 24           |         1 |
|       2 | pages     | --Pages             | Pages in website |   1 | 11           |         1 |
|       4 | Home      | ----Home            | NULL             |   2 | 10           |         1 |
|       5 | About     | ------About us      | NULL             |   3 | 2            |         1 |
|       6 | Contact   | --------Contact us  | NULL             |   4 | 0            |         0 |
|       7 | Events    | --------Our events  | NULL             |   4 | 0            |         0 |
|       8 | Service   | ------Service       | NULL             |   3 | 5            |         1 |
|       9 | Products  | --------Products    | NULL             |   4 | 3            |         1 |
|      10 | Search    | ----------Search    | NULL             |   5 | 0            |         0 |
|      11 | Reviews   | ----------Reviews   | NULL             |   5 | 0            |         0 |
|      25 | Furniture | ----------Furniture | NULL             |   5 | 0            |         0 |
|      12 | FAQ       | --------FAQ         | NULL             |   4 | 0            |         0 |
|      13 | Order     | ------Shoppingcart  | NULL             |   3 | 0            |         0 |
|       3 | products  | --Products          | Product list     |   1 | 11           |         1 |
|      14 | Furniture | ----Furniture       | NULL             |   2 | 9            |         1 |
|      16 | Living    | ------Living        | NULL             |   3 | 2            |         1 |
|      19 | Chairs    | --------Chairs      | NULL             |   4 | 0            |         0 |
|      20 | Tables    | --------Tables      | NULL             |   4 | 0            |         0 |
|      17 | Kitchen   | ------Kitchen       | NULL             |   3 | 0            |         0 |
|      18 | Bedroom   | ------Bedroom       | NULL             |   3 | 4            |         1 |
|      21 | Chairs    | --------Chairs      | NULL             |   4 | 0            |         0 |
|      22 | Tables    | --------Tables      | NULL             |   4 | 0            |         0 |
|      23 | Beds      | --------Beds        | NULL             |   4 | 0            |         0 |
|      24 | Closets   | --------Closets     | NULL             |   4 | 0            |         0 |
|      15 | Office    | ----Office suplies  | NULL             |   2 | 0            |         0 |
+---------+-----------+---------------------+------------------+-----+--------------+-----------+
25 rows in set (0.00 sec)

Download the code
You can download the stored procedures in this zip file: trees_2012-11-26.zip

Related posts

Tags: , , , ,

19 Responses to “Manage hierarchical data with MySQL stored procedures”

  1. sachin says:

    Hi,

    You write that source command overwrite procedure if it is already exist with same name, but MySQL doesn’t support atomic updates of stored procedures so it overwrite procedure. It is bug from MySQL end and yet not fixed. This is a long-standing bug, first filed in 2005 and still not fixed (http://bugs.mysql.com/bug.php?id=9588).

    mysql>source create_procedures.sql

  2. Banzai says:

    Hups.. I renamed the table from ‘trees’ to ‘NESTED_SET’!!! Please be careful about this!!!
    Sorry for this.

    Lutz aka Banzai

  3. Banzai says:

    Hello there,

    greate article, helps me a lot to implement nested set to database.

    I’ve made a new procedure for your table structure to moving nodes, base on this message in stackoverflow: http://stackoverflow.com/questions/889527/move-node-in-nested-set

    It’s not exactly tested in all cases, but seems to work. Hope this helps.

    /*****************************************************************
    * tree_move_node
    *****************************************************************/
    DELIMITER //
    DROP PROCEDURE IF EXISTS tree_move_node;
    //
    CREATE DEFINER = CURRENT_USER PROCEDURE tree_move_node (
    IN p_id INT,
    IN n_id INT
    )
    /**
    * Part of the hierarchical tree functionality package. This package moves a node
    * to another parent node.
    * %see http://stackoverflow.com/questions/889527/move-node-in-nested-set for more information
    *
    * %author Lutz Mahle aka Banzai
    * %version 1.0
    * Move a node in the tree with all children
    * %param p_id Number: The tree_id of the branch to move
    * %param n_id Number: The tree_id of the new parent
    * %return nothing
    */
    proc: BEGIN

    — Declare variables
    DECLARE e_no_id CONDITION FOR SQLSTATE ‘45000′;
    DECLARE e_id_not_found CONDITION FOR SQLSTATE ‘45000′;
    DECLARE v_node_found INT DEFAULT NULL;
    DECLARE v_node_lft INT DEFAULT NULL;
    DECLARE v_node_rht INT DEFAULT NULL;
    DECLARE v_node_lvl MEDIUMINT DEFAULT NULL;
    DECLARE v_node_tmp INT DEFAULT NULL;

    DECLARE w_node_found INT DEFAULT NULL;
    DECLARE w_node_lft INT DEFAULT NULL;
    DECLARE w_node_rht INT DEFAULT NULL;
    DECLARE w_node_lvl MEDIUMINT DEFAULT NULL;
    DECLARE w_node_tmp INT DEFAULT NULL;

    DECLARE node_size INT DEFAULT NULL;

    — Validate the tree_id
    IF ( p_id IS NULL) THEN
    SIGNAL e_no_id
    SET MESSAGE_TEXT = ‘The id cannot be empty.’;
    LEAVE proc;
    END IF;
    – Validate the tree_id
    IF ( n_id IS NULL) THEN
    SIGNAL e_no_id
    SET MESSAGE_TEXT = ‘The new id cannot be empty.’;
    LEAVE proc;
    END IF;

    — Check if nodes really exists
    SELECT tree_id, lft, rht, lvl INTO v_node_found, v_node_lft, v_node_rht, v_node_lvl
    FROM NESTED_SET
    WHERE tree_id = p_id;

    IF ( v_node_found IS NULL) THEN
    SIGNAL e_id_not_found
    SET MESSAGE_TEXT = ‘The tree_id to move does not exists.’;
    LEAVE proc;
    END IF;

    SELECT tree_id, lft, rht, lvl INTO w_node_found, w_node_lft, w_node_rht, w_node_lvl
    FROM NESTED_SET
    WHERE tree_id = n_id;

    IF ( w_node_found IS NULL) THEN
    SIGNAL e_id_not_found
    SET MESSAGE_TEXT = ‘The tree_id for new parent does not exists.’;
    LEAVE proc;
    END IF;

    — Start deleting
    START TRANSACTION;
    — ’size’ of moving node (including all it’s sub nodes)
    SET node_size = v_node_rht – v_node_lft + 1;

    – step 1: temporary “remove” moving node by update lft and right to negativ
    UPDATE NESTED_SET
    SET lft = 0-(lft), rht = 0-(rht)
    WHERE lft >= v_node_lft AND rht v_node_rht;
    UPDATE NESTED_SET
    SET rht = rht – node_size
    WHERE rht > v_node_rht;

    — step 3: increase left and/or right position values of future ‘lower’ items (and parents)
    UPDATE NESTED_SET
    SET lft = lft + node_size
    WHERE lft >= IF(w_node_rht > v_node_rht, w_node_rht – node_size, w_node_rht);
    UPDATE NESTED_SET
    SET rht = rht + node_size
    WHERE rht >= IF(w_node_rht > v_node_rht, w_node_rht – node_size, w_node_rht);

    — step 4: move node (ant it’s subnodes) and update it’s parent item id
    UPDATE NESTED_SET
    SET
    lft = 0-(lft)+IF(w_node_rht > v_node_rht, w_node_rht – v_node_rht – 1, w_node_rht – v_node_rht – 1 + node_size),
    rht = 0-(rht)+IF(w_node_rht > v_node_rht, w_node_rht – v_node_rht – 1, w_node_rht – v_node_rht – 1 + node_size)
    WHERE lft = 0-v_node_rht;
    COMMIT;

    END //
    DELIMITER ;

    Greetings & thanks for sharing your code

    Lut aka Banzai

  4. StackWalker says:

    Genius concept. Insightful article

  5. Rajasingh says:

    Any way to change parent of the node. I use http://stackoverflow.com/a/1274175/2031560. But i doesn’t update lvl. Any idea to fix and release the function

  6. Jay says:

    Ronald –

    Your work is outstanding. I am using the trees procedures you published here and I’d like to ask you to post your code to Github. I think this would increase your standing and earning power, as well as ensure the code remains available.

    If you have no time or interest in doing this – would you let me post it on your behalf?

    J

  7. nest says:

    It is very useful. Is there any way to change node change current branch to another branch?

  8. grails says:

    It is good that MySQL stored proc is very powerful now, like in Oracle PL/SQL and DB2 Natural. But I just find using the delimiter awkward and counter intuitive.

  9. steve says:

    How can one count children without counting parent nodes?

  10. willowdan says:

    Hi,

    Thanks for the awesome tutorial. I noticed that the stored procedure “tree_change_position” has not been present in the codes, but was mentioned in the “testdata.sql” file.

    Many thanks and cheers!

  11. Lou Reed says:

    Ronald –

    [1] Love your work. Incredible.

    [2] I found this code about a year ago and have been using it since. Other than adding a column to point each row to a reference in another table and a column to keep a “status” on each node, I haven’t made any changes to the source.

    [3] One day, when I get around to it…I want to add a feature to change the way you count result nodes. Currently, the script counts not only childless nodes but also parent nodes. So if you are displaying inventory for a store, for example…you end up returning a number of total results for all nodes instead of only terminating child nodes.

    I don’t want to use a scripting language to do this, for performance reasons. Yet I’m not strong enough in mysql to know how to approach this. Any ideas?

  12. julio says:

    Great code.

    Thank you.

  13. Using says:

    I’m using this. It’s awesome. Thank you.

  14. Hi Max,

    You can count the children with this formula: (((rht – lft) -1) / 2)
    This will include also all children in deeper levels.

    I have updated the code and added a column ‘cnt_children’.

    Cheers, Ronald.

  15. Max says:

    Hi. I like this example and it is very helped me and my group in our study. We are creating web application where we need to use hierarchy like this. We have Categories and related items table. But how can I count all the items in each category and subcategory? Just want to show the list like this:
    Category Cars
    Volvo(10)
    Volvo Parts(5)
    Volvo 325(5)
    Opel(20)
    VW(30)

  16. Hi Mike,

    Thanks for spotting the error!
    I have updated the code and provided a new zip file (trees_2012-11-25) with your change.
    If you have any additions to the code or want to share some new functions to the code, feel free to send them to me.

    Thanks again!

    Cheers, Ronald.

  17. Mike Tommasi says:

    By the way there is a small mistake on tree_del (delete node procedure) that ends up leaving gaps in the series of lft and rht values. In the “update the parents” section (which actually also updates all neighbours to the right) the line should be :

    SET v_node_tmp = ((v_node_rht – v_node_lft) + 1);

    Right now it ends in “- 1″. Take a simple case of deleting a leaf: then v_node_tmp in the current formula would be 0, thus doing nothing… with the corrected line, the value is +2, which is what you want to subtract from all neighbors by (and from rht on parents).

  18. Mike Tommasi says:

    Very good article, helped me learn procedure in MySQL. Works great. I added a few variants of the calls, like one for adding a node giving the NAME of the parent instead of the Id. Had to fix a problem arising from my having access only to version 5.1, all the SIGNAL statements failed. So I substituted code to emulate SIGNAL (from O’Reilly book MySQL Stored Procedure, Error Handling):

    CREATE PROCEDURE my_signal(in_errortext VARCHAR(255))
    BEGIN
    SET @sql=CONCAT(‘UPDATE `’,in_errortext,’` SET x=1′);
    PREPARE my_signal_stmt FROM @sql;
    EXECUTE my_signal_stmt;
    DEALLOCATE PREPARE my_signal_stmt;
    END //

    thanks again

    Mike

  19. Alex says:

    awesome !! man thanks a lot

Leave a Reply