Merge Sort algorithm Assignment

Given: Merge Sort algorithm (Reference: Brassard and Bratley, "Fundamentals of Algorithmics", pg. 228, among many other references). Design a parallel Merge Sort.

Your program will run on a distributed computer system with the following properties:

Number of processors P

Memory per CPU M = 4GB

Cycle time of CPU T = .5 nanosecond

Register size R = 64 bits

Local disk, bandwidth 320 MBytes/sec (Ultra-320 SCSI)

size is (NFS drive size)/N

Switched Ethernet network

Latency L = 20 microseconds

B = 1Gb/sec == 100 Mbytes/sec for messages

larger than 32Kbytes

Shared NFS drive, size D, bandwith same as network.

You are free to chose the number of processors P and the disk size D. You may not need all the above information. If you feel you need some other system property, feel free to assume some reasonable value.

Assume that the initial unsorted file F is on the shared NFS drive, and the final sorted file should also be on that drive. Each item to be sorted is 1024 bytes in length, and the file F has N such items. Comparing two items involves 32 1-byte comparisons.

Merge Sort requires work space; this work space may be on the NFS drive, memory, or any combination.


  1. Parallel Merge Sort algorithm should be in the message passing, SPMD model with a fixed number of processes; assume the implementation will be in PC or MPI. You do not have to provide code that may be compiled; pseudo code, outline or diagrams will be sufficient; however you should provide enough detail that a good programmer would be able to code your design.(See algorithm descriptions in Foster, Scott or Brassard and Bratley).
  2. Provide an argument to justify that your algorithm will not deadlock.
  3. Estimate how your algorithm would perform on the computer system described above. Consider:

-a. What minium file size (in bytes, number of elements, or both) is required for your algorithm to provide speedup on 2 processors?

-b. How does the number of processors you would use depend on the filesize?

-c. With P processors and file size calculated as above, what speedup would you expect?