Building the Beowulf Cluster

This build started with six HP xw6400 workstations.

So the first thing to do was take off the side panels and have a look around. As these are pre-built workstations, there was likely to be some sort of proprietary hardware somewhere inside.

At first glance there doesn't seem to be anything different from a regular system. However, on closer inspection problems start to appear.

The first problem is the screws used to hold down the motherboard.

I did not have access to the bit needed to undo these screws, although they could still be removed. This is when I discovered the next problem: the CPU cooler is mounted to the case rather than to a back plate on the motherboard, which meant the rear panel of the case had to remain attached to the motherboard.

So the process of removing all the extra internal pieces began.

From here the case was dismantled.

And then cut to the desired shape.

After repeating this process on each of the other machines, it was time to build an enclosure to house them.

I acquired an old server rack and stripped it for materials. This served as the main basis for the frame of the enclosure.

This was the original server rack, and it was used along with some shelves to create the enclosure frame.

The next step was to start putting the machines on their shelves.

With all the machines on their shelves, the power and Ethernet cables were attached and some initial cable management was done.

The next step was to finish tidying up the wires and add the side panels.

Finally, the top panel and wheels were added.

MPI with 1-Dimensional Leapfrog
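
The code below splits the wavefunction into its real and imaginary parts, ψ = R + iI, and advances them alternately on a grid of N points. Reading the update rules straight out of real_psi and imag_psi, the scheme is

\[
R_j^{\,n+1} = R_j^{\,n} - s\left(I_{j+1}^{\,n} - 2I_j^{\,n} + I_{j-1}^{\,n}\right) + \Delta t\, V_j\, I_j^{\,n},
\]
\[
I_j^{\,n+1} = I_j^{\,n} + s\left(R_{j+1}^{\,n+1} - 2R_j^{\,n+1} + R_{j-1}^{\,n+1}\right) - \Delta t\, V_j\, R_j^{\,n+1},
\qquad s = \frac{\Delta t}{2\,\Delta x^2},
\]

with copy boundary conditions at both ends of the grid (each end point is set equal to its neighbour). Each MPI rank runs the same simulation against a barrier of height rank × 8000, so the ranks sweep the barrier heights in parallel.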

using Plots
using MPI
using Statistics
using DelimitedFiles

function real_psi(N, R_current, I_current, delta_t, delta_x, V)
   # Advance the real part of ψ by one time step using the current imaginary part
   R_next = fill(0.0, N)
   s = delta_t/(2*delta_x^2)
   for x in 2:N-1
      R_next[x] = R_current[x] - s*(I_current[x+1] - 2*I_current[x] + I_current[x-1]) + delta_t*V[x]*I_current[x]
   end
   # Copy boundary conditions at both ends of the grid
   R_next[1] = R_next[2]
   R_next[N] = R_next[N-1]
   return R_next
end

function imag_psi(N, I_current, R_current, delta_t, delta_x, V)
   # Advance the imaginary part of ψ by one time step using the updated real part
   I_next = fill(0.0, N)
   s = delta_t/(2*delta_x^2)
   for x in 2:N-1
      I_next[x] = I_current[x] + s*(R_current[x+1] - 2*R_current[x] + R_current[x-1]) - delta_t*V[x]*R_current[x]
   end
   # Copy boundary conditions at both ends of the grid
   I_next[1] = I_next[2]
   I_next[N] = I_next[N-1]
   return I_next
end

function leapfrog(comm, win, width)
   rank = MPI.Comm_rank(comm)
   ENV["PLOTS_TEST"] = "true"
   ENV["GKSwstype"] = "100"
   N = 1000
   x = collect(0:(1/(N-1)):1)
   x_0 = fill(0.4, N)
   C = fill(10.0, N)
   σ_sqrd = fill(1e-3, N)
   k_0 = 500.0
   Δ_x = 1e-3
   Δ_t = 5e-8
   ψ = C.*exp.((-(x-x_0).^2)./σ_sqrd).*exp.((k_0*x)*1im) # initial Gaussian wave packet with wavenumber k_0
   R_cur = real(ψ)
   I_cur = imag(ψ)
   V = fill(0.0, N) # potential is zero everywhere except the barrier
   for i = 600:convert(Int64, 600+width)
      V[i] = rank*8000 # barrier height is set by the rank number, so each rank simulates a different barrier
   end
   I_next = imag_psi(N, I_cur, R_cur, Δ_t, Δ_x, V)
   before = fill(0.0, 386)
   after = before
   # Do the leapfrog
   anim = @animate for time_step = 1:20000
      R_next = real_psi(N, R_cur, I_cur, Δ_t, Δ_x, V)
      R_cur = R_next
      I_next = imag_psi(N, I_cur, R_cur, Δ_t, Δ_x, V)
      prob_density = R_cur.^2+I_next.*I_cur
      I_cur = I_next
      if time_step == 1
         before = filter(!isnan, prob_density[200:585])
      end
      if time_step == 19999
         after = filter(!isnan, prob_density[200:585])
      end
      plot(x, prob_density,
         title = "Wave packet against $(convert(Int64, round(V[600]))) high $width width barrier",
         xlabel = "x",
         ylabel = "Probability density",
         ylims = (0,200),
         legend = false,
         show = false
         )
      plot!(x,abs.(V))
   end every 20
   percentage = round(100*(((mean(before)-mean(after))/mean(before))); digits=2)
   gif(anim, "./Figures/ParallelTest/MPILeapFrog_height_$(convert(Int64, round(V[600])))_width_$(width)_barrier_$(percentage).gif", fps=30)
   MPI.Win_lock(MPI.LOCK_EXCLUSIVE, 0, 0, win) # lock rank 0's window before writing
   MPI.Put([Float64(convert(Int64, round(V[600])))], 1, 0, convert(Int64, ((2*MPI.Comm_rank(comm)))), win) # store this rank's barrier height
   MPI.Put([Float64(percentage)], 1, 0, convert(Int64, (1+(2*MPI.Comm_rank(comm)))), win) # store the percentage drop in probability density
   MPI.Win_unlock(0, win) # release the lock
   MPI.Barrier(comm)
end

MPI.Init()
comm = MPI.COMM_WORLD
shared = zeros(MPI.Comm_size(comm)*2) # one (barrier height, percentage) pair per rank
win = MPI.Win()
MPI.Win_create(shared, MPI.INFO_NULL, comm, win) # expose the array to one-sided Put operations
MPI.Barrier(comm)
open("./Files/MPIresults.csv", "w") do fo
   for width = 0:25:200
      if MPI.Comm_rank(comm) == 0
         @time leapfrog(comm, win, width)
      else
         leapfrog(comm, win, width)
      end
      if MPI.Comm_rank(comm) == 0
         for i = 0:MPI.Comm_size(comm)-1
            height = shared[convert(Int64, (1+(2*i)))]
            percentage = shared[convert(Int64, (2+(2*i)))]
            writedlm(fo, [i height width percentage], ",")
         end
      end
   end
end
MPI.Barrier(comm)

MPI.Finalize()
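
Since every rank sets its barrier height from its rank number (rank × 8000), the number of ranks chosen at launch decides which heights get simulated. One typical way to launch a script like this, assuming it is saved as MPILeapFrog.jl (the filename and rank count here are only illustrative), is with the mpiexec that matches the MPI library MPI.jl was built against:

mpiexec -n 8 julia MPILeapFrog.jl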

This code was used to produce a large set of data, which resulted in the following graphs.

This graph shows the effect of barrier height and width on the wave packet. Every point on this graph has an associated animation, all of which are available on the GitHub repository, but here are a few samples.
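
For reference, the percentage recorded for each barrier is the one computed at the end of leapfrog: the relative drop in the mean probability density over grid points 200–585 (the region to the left of the barrier) between the first time step and the end of the run,

\[
\text{percentage} = 100 \times \frac{\overline{\rho}_{\text{start}} - \overline{\rho}_{\text{end}}}{\overline{\rho}_{\text{start}}},
\qquad \rho_j = R_j^2 + I_j^{\,n+1} I_j^{\,n}.
\]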

MPI Examples in Julia

using MPI # Adding the MPI Library

MPI.Init() # Initialising the MPI library
const comm = MPI.COMM_WORLD # setting configuration constants
const rank = MPI.Comm_rank(comm)
const size = MPI.Comm_size(comm)

N = convert(Int64, size*2) # setting N to twice the number of ranks (two slots per rank)
shared = zeros(N) # creating an array that will become shared
win = MPI.Win() # creating the window that will be used to share an array
MPI.Win_create(shared, MPI.INFO_NULL, comm, win) # adding the array to the window
MPI.Barrier(comm) # waiting for all ranks to catch up
offset = 2*rank # setting the offset for each rank (each rank owns two consecutive slots)
dest = 0 # setting the destination node
nb_elms = 1 # setting the number of elements that will be added at a time
no_assert = 0 # setting assert to false
for i = 0:1
    MPI.Win_lock(MPI.LOCK_EXCLUSIVE, dest, no_assert, win) # locking the shared resource
    MPI.Put([Float64(rank)], nb_elms, dest, convert(Int64, offset+i), win) # writing this rank's number into the shared array
    MPI.Win_unlock(dest, win) # unlocking the resource
end
MPI.Barrier(comm) # waiting for all ranks to catch up

if rank == 0 # only rank 0 will perform this
    MPI.Win_lock(MPI.LOCK_SHARED, dest, no_assert, win) # locking resource
    println("I was sent this: ", shared') # displaying the shared array
    MPI.Win_unlock(dest, win) # unlocking the resource
end
MPI.Finalize() # Finalising the MPI section of code

This MPI example writes each rank's number to two locations in a shared array. Doing so exercises all of the one-sided communication features that the leapfrog implementation will rely on.
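
As a quick sanity check, running the example with four ranks (for instance mpiexec -n 4 julia mpi_example.jl, where the filename is just a placeholder) should leave each rank's number in two consecutive slots, so rank 0 prints something like:

I was sent this: [0.0 0.0 1.0 1.0 2.0 2.0 3.0 3.0]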

Speed Comparison between Julia and Matlab

I tested the same algorithm in both Julia and Matlab to see which one could compute it faster. The same machine was used for both computations.

I did four tests.
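
On the Julia side the timings can be taken with the @time macro; a minimal sketch, assuming the serial version of the program is saved as leapfrog.jl (the name is illustrative, and the Matlab runs can be timed with Matlab's own tic/toc):

# first call in a fresh session: includes compilation, nothing cached
@time include("leapfrog.jl")
# second call in the same session: compiled code and session variables are reused
@time include("leapfrog.jl")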

The first test was running the program as is, with nothing stored in the workspace. This shows how long it would take to run the program for the first time.

Results:
Matlab: 36.7 seconds
Julia: 33.2 seconds

The second test I did was running the same program again, but this time with information already stored in its workspace.

Results:
Matlab: 32.6 seconds
Julia: 12.4 seconds

The third test I did was running the programs without displaying the graph as it was produced, while also keeping the workspace from previous runs.

Results:
Matlab: 18.6 seconds
Julia: 10.7 seconds

The final test I did was running the program again with the results displayed, but doubling the number of frames in the resulting graph.

Results:
Matlab: 73.3 seconds
Julia: 25.5 seconds

As can be seen from the above results, Julia is faster than Matlab in every test for this problem, and in the last comparison it is almost three times faster.