Recursion -- See RecursionFiled: Mon, Jan 29 2007 under Programming|| Tags: methodology quicksort sort recurssion xml javascript Recursion is one of those programming concepts that separate "REAL" programmers from script kiddies. "Real" programmers can do their taxes with regular expressions, write a full blown operating system in native assembly (the REAL REAL programmers do it in binary), and real programmers grok recursion. Recursion isn't exclusive to the high gurus of programming however -- it's actually a very simple concept to use and understand. IntroductionRecursion is the process where a function will call itself. It's that simple. Unfortunately most textbooks and even the infamous Wikipedia tend to make this a lot more complicated that it really is. Recursion, in mathematics and computer science, is a method of defining functions in which the function being defined is applied within its own definition. The term is also used more generally to describe a process of repeating objects in a self-similar way. For instance, when the surfaces of two mirrors are almost parallel with each other the nested images that occur are a form of recursion. Wikipedia I've been using recursion ever since I wrote my first Mandelbrot fractal program in my teens and I have a tough time pulling useful information from that Wikipedia definition. It really, really is not that difficult. Recursion to travel a Linked ListLets say we have a data structure with some data in it and a pointer to another element with the same exact structure, just different data in it. +----------+ +----------+ +----------+ | data: 15 |-->| data: 10 |-->| data: 20 | +----------+ +----------+ +----------+ Specifically, in javascript, this data structure (known as a linked list)
var nodeList= {}; // Declare an empty object
nodeList.value=15;
nodeList.next=null;
newNode= {}; // Create a new node
newNode.value=10;
newNode.next=null;
nodeList.next=newNode; // Insert newNode into the nodeList
finalNode= {}; // Create a new node
finalNode.value=20;
finalNode.next=null;
nodeList.next.next=finalNode; // Insert finalNode into the nodeList
Now lets say we create a function...
function processNode(node) {
if (node.next != null) {
processNode(node.next)
}
outputStr+= node.value+'<br>';
}
A node is a single cell of a data set, a single record if you will. Given the above data structure we pass processNode the top record of our data set in this case the one with a value of 15. processNode(nodeList); Now this node (value 15) has a pointer to another node (value 10) and since processNode already knows how to process nodes of this data-set, we just go ahead and call processNode again only this time passing it the reference to the node with the data "10" in it, and then subsequently we pass the final node with a value of 20 to processNode. Even though processNode was FIRST called with the node with a value of 15, our outputStr which we are building will have as its first value the value 20. This is because we called processNode again and then again before we added 20 to the build string. You can see this as the process is broken down step by step. processNode(node with data=15) processNode(node with data=10) processNode(node with data=20) outputStr+=20 return to calling process (processNode with data=10) outputStr+=10 return to calling process (processNode with data=15) outputStr+=15 End processNode function outputStr's value = 20, 10, 15 If we'd like to reverse the output of the display all we need to do is to change the placement of where we build our outputStr. This simple change -- building the outputStr before we check for other "nodes" will reverse the order of the output.
function processNode(node) {
outputStr+= node.value+'<br>';
if (node.next != null) {
processNode(node.next)
}
}
The new step-by-step process looks like this... processNode(node with data=15) outputStr+=15 processNode(node with data=10) outputStr+=10 processNode(node with data=20) outputStr+=20 return to calling process (processNode with data=10) return to calling process (processNode with data=15) End processNode function OutputStr's value = 15, 10, 20 The key to understanding what is going on in recurssion is to understand that processNode knows how to deal with the individual records in our data structure. Since it knows how to deal with one record it knows how to deal with all of them and so instead of writing many different instances of processNode we just have it call itself, passing along another record. The key to using recursion is to understand you'd use it just like you use any other function. If it helps, think of the function as having an exact external copy of itself. You're coding away, doing your own thing and then you get to a point where you'd call a function in this case the function just happens to be itself. Procedurally, there's no difference in calling the same function and calling an external function which just happens to do the same thing. Using Recursion to Solve FactorialsFactorials are a classical means to illustrate recursion because the factorial of a number is the value of multiplying everything before the number. So the factorial of 5 is 1*2*3*4*5 and the factorial of 10 is 1*2*3*4*5*6*7*8*9*10. Now because of the overhead of calling functions, recursion isn't the most efficient way to generate a factorial, it is, however, the most efficient way to generate a tutorial ;) This code is in javascript since it will be used to generate an interactive experience after the break.
function factorial(n) {
if(n <= 1) {
return 1;
} else {
return n * factorial(n - 1);
}
}
As you can see this is very simple stuff! Just pass factorial a number, say 5 and it will compute the value of 5*4*3*2*1. if (n <= 1) is our break point (officially this is known as a base case -- the point at which the recursive calls stop and the function starts to "unwind" so to speak), if we've reached 1 then the function simply returns 1 without calling itself again thus starting the "unwinding" process. To see what's going on with this function, click here to start the interactive demo and enter a number between 1 and 20. Using Recursion to Crawl XMLAny data type which is formatted into tree-like structures are a candidate for recursion, either from collapsible message boards (email and Usenet threads are a great example of this) or processing XML data. XML especially is a great candidate for recursion. Here's a spot of code which will traverse any xml document you give it from xml data to an rss feed -- it will even traverse most HTML if you pass document -- crawlXML(document).
function crawlXML(doc) { // Crawls an XML document
if(doc.hasChildNodes()) { // If present element has children
_xmlStr+='<ul><li>'+doc.tagName+'> '; // Display current tag name
for(var i=0; i<doc.childNodes.length; i++) { // for each child node on current level
crawlXML(doc.childNodes[i]); // Call this function recursively
} // end for loop
_xmlStr+='<\/li><\/ul>'; // Close the list item.
} else { // current element has no children
_xmlStr+=doc.nodeValue; // So display the value of the data
} // End childNode check
} // End crawlXML
As you can see, if the current "node" has child elements then crawlXML calls itself to deal with the child elements, otherwise it outputs the value of the current node. Using Recursion to Sort (quicksort)Recursion is also the cornerstone of the "quicksort" algorithm.
sortArray = Array(5,8,9,10,38,81,24,83,22,21,1,4,63,73,42,88);
function qSort(sortArray, left, right)
{
var scratch_left = left; // remember current left
var scratch_right = right; // remember current right
var pivot = sortArray[left]; // remember current value
while (left < right) {
while ((sortArray[right] >= pivot) && (left < right)) {
right--;
}
if (left != right) {
sortArray[left] = sortArray[right];
left++;
}
while ((sortArray[left] <= pivot) && (left < right)) {
left++;
}
if (left != right) {
sortArray[right] = sortArray[left];
right--;
}
}
sortArray[left] = pivot;
pivot = left;
left = scratch_left;
right = scratch_right;
if (left < pivot) { qSort(sortArray, left, pivot-1); }
if (right > pivot) { qSort(sortArray, pivot+1, right); }
}
qSort(sortArray, 0, sortArray.length-1);
To get a visual idea of how this sort is working, here is a wikipedia image of the sorting process. If you'd like a very in-depth view of quicksort and how it works the Wikipedia entry is a great place to start. Javascript Recursion in the BrowserIn the browser, javascript will limit recursion to 1,000 iterations -- that is a function may only call itself 1,000 times before it needs to start "unwinding". This is to prevent infinite loops (IE a process where a node points to itself or two nodes point to each other). 1,000 is actually a fairly low number and as a result will limit the size of data files you can work with inside the browser. Fortunately most other languages will allow you to create as much recurssion as you need to get the job done. Recursive Imagry
The second example is the more benign, and certainly safer image to the right. This illustrates what's known as a "Droste" effect. The Droste effect is a Dutch term for a specific kind of recursive picture. A picture exhibiting the Droste effect depicts a smaller version of itself in a place where a similar picture would realistically be expected to appear. This smaller version then depicts an even smaller version of itself in the same place, and so on. Technically this can go on forever, but practically it continues as long as the resolution of the picture allows, which is relatively short since each iteration exponentially reduces the picture's size. And the Droste effect is of course another example of recursion. Tail RecursionTail recursion is a condition where you should strongly question the need to use recursion. Specifically it's when the recursive call happens at the very END of the function and nothing occurs afterwards. If this is the case, in languages which do not do this for you automatically -- which are most all of them -- you should consider changing the recursive function into a loop. The reason for this is that function calls are more inefficient and use more resources than loops so the ability to recognize tail recursion and code around it is an important consideration in designing your programs. The example factorial function above is an example of tail recursion because the recursive call is the last command inside the function. So the most efficient way to express this function (when you are not demonstrating recursion of course) is as follows...
// recursive
function factorial(n) {
if(n <= 1) {
return 1;
} else {
return n * factorial(n - 1);
}
}
// iterative
function factorial(n) {
var result=1;
for (var i=1; i <= n; i++) {
result = result * i;
}
return result;
}
On the otherhand, even though quicksort does its recursive calls at the end, there are two seperate calls as such:
if (left < pivot) { qSort(sortArray, left, pivot-1); }
if (right > pivot) { qSort(sortArray, pivot+1, right); }
Since "(right > pivot)" comes after (left < pivot) and there is a possibility both might be executed this is not tail recursion because of the possibility that "(right > pivot)" might need to execute after "(left < pivot)". Likewise the XML crawler function is not an example of tail recursion because its recursive call happens in the middle of the function. ConclusionRecursion actually isn't very common in day to day programming. I often think
that XML became so buzz-worthy because all these programmers with textbook college
educations were so shocked and astonished to discover something they sweated bullets
to master could actually be used in the real world. However in situations where recursion
can be used it is definitely one of the more valuable tools in the programmer's arsenal,
and now, hopefully one of your tools as well.
|