XmlObject - Adaptive Object Model
Better Designs Faster
Sieve of Eratosthenes
The Sieve of Eratosthenes program was created early on when I was still developing the framework. It was created in an attempt to answer some questions I was getting about how much was it costing to use the XmlObject framework.
I am not going to delve into the code too much, it is simple enough and if you have questions you can always email me. Note, to run the code you will need to install the Visual Studio Tools for Office package.
From the dictionary, here is a definition for the word utility.
the quality or state of being useful for something.
2. something useful
something that serves a useful purpose
As developers we are both entitled and expected to question the utility of new technology. The experienced among us understand that, to not do so invites trouble. A common initial critique of any technology is one of utility. In the case of the XmlObject Framework experienced and novice developers alike see XML and immediately think about cost, both in terms of storage costs and CPU cycles. The concern is that the overhead associated with the XML parser might make the framework unsuitable for everyday use. I feel compelled to make an attempt to respond to these questions, recognizing that I do not have the luxury of answering every last one.
To answer the question of utility I asked myself the following question. What would make the framework not useful? As open ended a question as that might seem, the answer is simple. As engineers ask and answer this question every day. We asked it when computer science was considering the move from procedural programming to object oriented programming (OOP). Before we understood and had realized benefits from OOP, the cost of v-table binding seemed to many to be unnecessary and wasteful.
XmlObjects, like any new technology, forces us to ask many of the same questions. If, for example, using this framework raised the known cost of an algorithm from O(n) to O(n log(n)) then there would be reason for concern. If, on the other hand, I can show that the overhead is minimal, then perhaps the benefits of XmlObject will win the day much as OOP has mostly won out over procedural programming.
After thinking about this problem for awhile, I decided to design and implement an example application as an effort to demonstrate utility. I considered many options and came up with an example based on the algorithm for finding prime numbers first discovered by the Greek mathematician, Eratosthenes. This algorithm is known as the Sieve of Eratosthenes and it is attractive for several reasons. The algorithm is simple, well documented and predictable in terms of its performance. I can use it to create arbitrarily large object models and easily be able to verify that the implementation is producing the correct results. The object model will be interesting enough to allow me to demonstrate utility, using real world operations, and at the same time not be so complex that it would be difficult to explain and understand.
It is a bit of a stretch to call this an example of an adaptive object model. There is really nothing adaptive about it. However, it does use the XmlObject Framework, it does create an object model, and the object model it creates can be arbitrarily large. However, I concede that it has little need (or ability) to adapt to a changing environment. The object model it creates is also homogonous meaning that the objects are all pretty much identical. I am pretty sure that none of this has any bearing on the intended purpose.
Here is a simplified description of Eratosthenes’ algorithm, i.e. to test each of the numbers in a list to see if it is a prime number.
· Write a list of numbers from 1 to the largest number you want to test.
· Mark the number 1 as special and the rest of the numbers as undefined.
· The first number in the list marked undefined, is a prime number.
· Mark this number as prime and mark all multiples of this number as composite.
· Repeat steps 3 through 5 until no more numbers are left in the list.
My implementation is faithful to the above description and I resisted my normal instincts to try and improve it. Here is how one can expect the algorithm to perform.
Steps 1 & 2 first create a list of numbers and then mark them with default values. Together these steps have a cost of O(n).
Steps 3 – 5 require us to iterate through the entire list once for each number in the list. This will result in a cost of O(n2) for these steps.
To turn Eratosthenes’ algorithm into a useful demonstration requires a little more work. Many of you are no doubt familiar with the acronym CRUD. The letters stand for create, read, update, and delete. Cast aside all the high level descriptions of what developers do and this is what you are left with. Steps 1 & 2 from above create data, steps 3 – 5 update that data. More steps were added to read and finally delete the data. If we can measure the cost of these operations and show they are what we expect them to be as we create ever larger object models, then perhaps we can get some answers to this question of utility.
This example will be designed to take the largest number to be evaluated ( the limit) as an input parameter and to record the length of time it takes to compute each of these steps. This time will be divided by the limit to get an average time for each operation. This is what we will use as the cost of an operation. Remember that we expect the cost of create to be linear, i.e. O(n). By comparing the cost to create 1000, 2000, …, 10,000 numbers (remember each number is an object in the framework) we can learn if the XmlObject Framework has an unacceptable overhead.
Here is a chart that shows the results of the run.
Here is what the graph is plotting
where f is an operation of Create, Update, Read, or Delete. Dividing by n makes O(n2) operations linear and O(n) operations constant. As predicted the Update operation has a cost of O(n2) and the other operations are linear. I don’t think I can make any claims beyond that though. However, I am satisfied that the XmlObject Framework didn’t make the operations more expensive than the analysis predicted.