BSP Trees - an introduction

What are BSP trees?

This first tutorial is going to take a look at the concept of BSP trees. We are going to start by 'doing a BSP tree' operation on the text "BSP trees", that is split it in to "BSP" and "trees"! We'll take a look at trees first, we'll get to the BSP bit later.

Actually, BSP trees remind me a lot of my old school chemistry teacher Mr. Horkan. Who the hell is he I hear you cry!? He was a pretty scruffy individual famous in our school for carrying out mad experiments in his study. We would often hear an explosion, followed by a cry of either Shit! or Eureka! and he would come running out of his study followed by a cloud of smoke and the odd flame. I remember him because he smacked the hell out of me once. Well Iwas a viscious little git in my day. I was heating up 50 pence pieces with a bunsen burner and leaving them lying around so that people would pick them up and (hopefully) get burned. WTF? Anyway, I digress, BSP trees remind me of Mr. Horkan because of something he said. He was wittering on about some Greek guy who wondered about what would happen if he cut a piece of paper in two, then cut them again and so on. This Greek guy wondered if there would come a point when he couldn't cut the paper anymore? The thought of this Greek guy surrounded by lots of bits of paper piqued my interest and the image stayed with me.

What the hell has any of this got to do with BSP trees? - ah, well read on and all will be revealed.

Ok, so what are trees?

Trees are simply a programmer's data structure for storing data. There are some interesting aspects of trees such as the ability to move through them (traverse in the jargon) quickly, their recursive nature and their myriad applications.

Depending on how you add items to the tree or traverse the tree, the tree can be used for sorting data into a particular order. It probably best to play around with some simple tree code to see how they work. Take a look at the following code:


//
// Tree node
//
struct tnode {
	int item;
	int count;
	struct tnode *left;
	struct tnode *right;
};
This shows typical tree node. Note that it has pointers to other nodes - a left child and a right child. Each child, itself a node, would have left and right child pointers and so on given the tree its recursive nature. The very first node in the data structure is the root. If the root had no children then its left and right pointers would be NULL, this applies all the way down the tree. In the above structure we have the data itself, which in this case is just a simple integer. We also keep a count of the number of data items stored in this node, this saves space in the tree and avoids duplicating information unecessarily. Having got the tree node structure, what sort of functions will operate on this structure. Here's some examples:

//
// Function prototypes
//
struct tnode *addNode(struct tnode *tn, int data);
void treePrint(struct tnode *tn);
struct tnode *createNode(void);
We at least need to be able to add a node into the tree. Before we add an item we need to create it though. Also, we might like to view what's in the tree so we have a function for that too. Other functions you might like to try and add include a deleteNode. If you don't you could end up with memory leaks! Looking at these three functions in turn you can see that createNode is trivial. It just allocates memory for the tree node data structure. Next there's addNode() which is a bit more complicated. Here's the code for it:

struct tnode *addNode(struct tnode *p, int data)
{
	if (p==NULL) 				   // create new node 
	{
		p=createNode();
		p->item = data;
		p->count = 1;
		p->left = p->right = NULL;
	}
	else
	{
		if (data==p->item)	 // item in tree so..
			p->count++;	 // inc item count
		else
		{
			if (dataitem)	 // goes on left so search left
				p->left = addNode(p->left,data);
			else 
				p->right = addNode(p->right,data); // goes on right so search right
		}
	}
	return p;
}
First we check to see if the node needs to be created. Note how the items in tree node are initialised. The other important thing to notice here is that addNode calls itself! Wooah - we have recursion! Recursion can be a dangerous thing if not handled correctly, if we don't provide a way to stop the recusion we will very soon get a stack overflow situation. This happens when your stack grows down onto the program code area. If you look carefully at the code you should also see that we are deciding whether to put the data on the left or right of the current node based on its size. addNode gives us back a pointer to the node added.

If you are having problems getting to grips with this, then just imagine the list is empty and follow through the code. Once you've gone through a couple of node additions you should have a firm grasp of what's going on.

Here's the code for treePrint:

void treePrint(struct tnode *p)
{
	int i;

	if (p != NULL) // ends recursion
	{
		treePrint(p->left);
		for (i=0; i< p->count;i++)
			printf("%d\n",p->item);  // when p is NULL we return to here.
		treePrint(p->right);
	}
	// else just return
}
The first thing you notice is that treePrint is also recursive! What will happen is that we will end up printing nodes from the last node added and traverse the tree printing the data items.

Let's get our hands dirty now. Compile up the code (you can get the complete source from the downloads page) and run it adding integer data items as prompted. When you enter -1 the tree is printed. Lo and behold, automagically everything comes out in a nice order. Clever things these trees. Actually, as you probably figured, the clever bit was done by addNode, which put the data items in a particular order based on size.

Hopefully, that's given you a basic understanding of some of the basic properties of trees.

Now the BSP bit!

I struggled for a little while trying to understand Binary Space Partitioning (BSP), but actually the concept itself is fairly straightforward, although the implementation can be tricky (which is why the likes of Carmack are driving around in Ferraris :p). In this tutorial I'm going to concentrate on getting a better understanding of what BSP trees are and deal with the actual coding of it all in another tutorial.We are going to concentrate on looking at 2D BSPs, as used in the Doom engine. Remember, the Heretic engine is really an enhanced version of the Doom engine.

First off a few terms. For more info on these you need to check out Matt Fell's Unofficial Doom Specs, which all self-respecting Heretic hackers should know inside out ;) Also fire up Heretic (any excuse), hit tab and look at the 2D map mode. This will give you a better idea of what Heretic levels are all about - 2D space.

Vertex

A 2D x,y position in the map.

Linedef

A line basically. It is described by two points (vertexes), a start and a finish. A linedef has two sides, when going from start to finish the left hand side is the 'left' and the right hand side is the 'right'. Seems logical to me :) These linedefs define where the walls are in Heretic.

Sidedef

A linedef has one or two sidedefs. If the linedef is at the edge of a map it might have only one sidedef - as nonone ever goes around to the other side of it. The sidedefs typically describe what a wall will look like i.e. what textures it will use. However, there are other 'special' sidedefs too, that do things like trigger monsters.

Sector

A sector is described by a number of linedefs. It represents a 2D region in the heretic map and the most important piece of information it contains is the height of the sector. This is needed because the vertexes do not have any height information in them, just positional. Other sector information could include the textures to use for the floor and ceiling.

Subsectors

Sectors can be fairly complex areas, so to make it easier to render the walls correctly, sectors can be divided down into smaller regions called subsectors. The most notable thing about a subsector is that from any where in a subsector you can see any other part of the same subsector without a wall getting in the way.

Segs

Subsectors are described by a collection of segs. Segs are parts of a wall i.e. all or part of a sidedef.

Nodes

Here's the code for a Node taken straight from the Heretic source code. You can see straight away some similarity between this data structure and the one we wrote earlier for the tree example. Nodes are essentially tree nodes.


// 
// BSP node. 
// 
typedef struct { 
	// Partition line. 
	fixed_t x; 
	fixed_t y; 
  	fixed_t dx; 
	fixed_t dy; 


	// Bounding box for each child. 
	fixed_t bbox[2][4]; 
  	
	// If NF_SUBSECTOR its a subsector. 
	unsigned short children[2]; 
} node_t; 

You'll notice that a node consists of three main bits.

The secret to understanding nodes is to realise that the child "node" is either another node or a subsector. You can think of each "node" as being almost arbitrary concpetual thing, but a subsector relates to a specific part of the level map.

Putting it all together!

When a Heretic map is designed it has to be preprocessed before it can be displayed correctly by the engine. What is the nature of this pre-processing? Because, of the way the engine works it needs to know, from any arbitrary point in the level, what can be seen and what can't be. This helps give the Doom engine its great speed as it doesn't have to draw walls unecessarily. The preprocessing is something akin to getting a piece of paper, cutting it in half, which gives you two halves. You then cut each of these halves in two again and so on. The whole process is complete when all you have left are subsectors. As this process is carried out, the information generated is stored in the Nodes lump of the WAD file. This is essentially a big tree structure where each tree node is either another node or a subsector. As we have seen trees can be used to store data in a particular order and the rendering engine uses this fact to allow it to render the walls correctly.

In Conclusion

Well, we have only scratched the surface so far. In future tutorials I want to expand on these basic concepts and look at some diagrams and real working code. Stay tuned.