November 21, 2008

Task Distribution in Legacy Application

I have been tasked with optimizing a legacy application that runs as a data calculation process within a J2EE web server. I was told it was designed so that the processing could be distributed across multiple servers. However, after getting into the source code, I found that the data to be processed is not actually divided among the servers. In fact, they all try to process the same data simultaneously.

I need to create a true distribution of data across the servers, but the servers do not know about each other, and the current architecture is not equipped to add this feature. The only solution I have is to randomize the order of the data on each server so that it is unlikely that multiple servers will process the same piece of data at the same time. I think it is a bit of a crude solution, but I feel like my hands are tied.
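As a rough sketch of the randomization idea, assuming each server loads the full list of record IDs before it starts working: every server shuffles its own copy with its own random source, so two servers rarely reach the same record at the same moment. The class and method names here are hypothetical and are not taken from the legacy code. Note that this only lowers the chance of a collision. It does not stop two servers from eventually processing the same record unless something else marks records as done.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of the legacy application.
public class ShuffledWorkOrder {

    // Returns a shuffled copy so the caller's list is left untouched.
    public static <T> List<T> shuffledCopy(List<T> items, Random rng) {
        List<T> copy = new ArrayList<>(items);
        Collections.shuffle(copy, rng);
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> recordIds = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            recordIds.add(i);
        }
        // Each server creates its own Random, so the orders differ between servers.
        List<Integer> workOrder = shuffledCopy(recordIds, new Random());
        System.out.println(workOrder);
    }
}
```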

In an ideal world, I would modify the architecture to mirror the massively parallel processing (MPP) model, in which each server works only on its own segment of the data. This would allow the data to be easily segmented for each server. Has anybody done anything like this before? I would love to hear about it and how you implemented your solution.
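One way such segmentation could look, as a minimal sketch rather than a worked-out design: assume each server is given a fixed index and the total server count through configuration, and that every record has a numeric ID. A server then claims only the records whose ID modulo the server count equals its index, so the servers never need to talk to each other. The names `StaticPartitioner`, `ownedBy`, and `partitionFor` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of static, modulo-based data segmentation.
public class StaticPartitioner {

    // True when the record belongs to the server with the given index.
    public static boolean ownedBy(long recordId, int serverIndex, int serverCount) {
        if (serverCount <= 0 || serverIndex < 0 || serverIndex >= serverCount) {
            throw new IllegalArgumentException("invalid server index or count");
        }
        // floorMod keeps the result non-negative even for negative IDs.
        return Math.floorMod(recordId, (long) serverCount) == serverIndex;
    }

    // Filters the full ID list down to the segment owned by one server.
    public static List<Long> partitionFor(List<Long> recordIds, int serverIndex, int serverCount) {
        List<Long> mine = new ArrayList<>();
        for (long id : recordIds) {
            if (ownedBy(id, serverIndex, serverCount)) {
                mine.add(id);
            }
        }
        return mine;
    }

    public static void main(String[] args) {
        List<Long> recordIds = new ArrayList<>();
        for (long id = 0; id < 10; id++) {
            recordIds.add(id);
        }
        // Assumed configuration: this is server 0 of 3.
        System.out.println(partitionFor(recordIds, 0, 3));
    }
}
```

The trade-off is that the server count has to be known up front, and adding or removing a server changes which records each one owns.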