Parallel use of Py-SPHViewer 2.0

Parallel SPH Rendering Tutorial with ipyparallel

This tutorial extends the previous one by introducing parallel computing with the ipyparallel library to render large cosmological datasets more efficiently. We split the computational load across multiple processes (or cores) and use MPI engines to distribute the work across processors or nodes, so the rendering runs in parallel and the computation is substantially faster.

Note: For issues setting up the cluster environment, consider checking the official ipyparallel documentation.


Prerequisites

Before starting, make sure you have the following installed: ipyparallel, h5py, mpi4py, numpy, matplotlib, and sphviewer2 (Py-SPHViewer 2.0).

Install them via pip if needed:

pip install ipyparallel h5py mpi4py sphviewer2

Step-by-Step Code Walkthrough for Parallel Rendering

1. Setup the Cluster

import ipyparallel as ipp

# Start a cluster with 8 engines using MPI for communication
cluster = ipp.Cluster(n=8, engines='mpi')

# Await cluster startup
await cluster.start_cluster(n=8)

# Connect the client to the cluster
rc = cluster.connect_client_sync()

Here, we initialize a cluster of 8 MPI engines, which lets tasks run in parallel across 8 processes (increase n for more). ipyparallel takes care of distributing the tasks to the individual workers (engines).
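
If you want to confirm that the MPI backend is actually active before dispatching work, one optional check is to ask each engine for its MPI rank. This is a minimal sketch and assumes mpi4py is installed on every engine:

# Optional sanity check: list the engine IDs and ask each engine for its MPI rank.
def report_rank():
    from mpi4py import MPI
    return MPI.COMM_WORLD.Get_rank()

print(rc.ids)                           # e.g. [0, 1, 2, 3, 4, 5, 6, 7]
print(rc[:].apply_sync(report_rank))    # one MPI rank per engine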


2. Defining the make_image Function

def make_image(xx, yy, mm, hh, xmin, xmax, BoxSize, LevelMin, LevelMax):
    from sphviewer2 import sphviewer2
    sph = sphviewer2()
    image = sph.render(xx, yy, mm, hh, xmin, xmax, BoxSize, LevelMin, LevelMax)
    return image

Since the rendering is performed in parallel, each worker handles a subset of the full dataset, processes it, and returns an image corresponding to that subset. Note that sphviewer2 is imported inside the function body: the function itself is shipped to the remote engines, so any module it needs must be imported there rather than on the client.
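
Before dispatching anything to the engines, you can run the same function locally on a small subset to check that it behaves as expected. The sketch below assumes the particle arrays (pgas, mgas, hgas) and BoxSize from the previous tutorial are already loaded:

# Local test on every 64th particle (a small, fast subset).
test_image = make_image(
    pgas[::64, 0], pgas[::64, 1],
    mgas[::64], hgas[::64],
    0.0, 0.0, BoxSize, 8, 10)
print(test_image.shape)   # expected (1<<10, 1<<10), matching how images are combined below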


3. Splitting the Data and Dispatching Parallel Tasks

nproc = 8  # Number of processes for parallel rendering

async_results = [] 
LevelMin = 8
LevelMax = 11

for i in range(nproc):
    # Dispatch the parallel task to one of the engines
    async_result = rc[i].apply_async(
        make_image, 
        pgas[i::nproc,0], pgas[i::nproc,1], 
        mgas[i::nproc], hgas[i::nproc], 0.0, 0.0, 
        BoxSize, LevelMin, LevelMax)
    
    # Store the result
    async_results.append(async_result)

In this part of the code, we:

  1. Split the particle arrays across the engines with strided slicing (pgas[i::nproc], etc.), so engine i receives every nproc-th particle (see the toy illustration below).
  2. Dispatch make_image to engine i with apply_async, which returns immediately with an AsyncResult instead of blocking.
  3. Store the AsyncResult objects so the partial images can be collected later.

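To make the strided split concrete, here is a small self-contained toy example (plain numpy, nothing beyond the standard slicing used above):

import numpy as np

a = np.arange(10)
nproc = 4
for i in range(nproc):
    print(i, a[i::nproc])
# 0 [0 4 8]
# 1 [1 5 9]
# 2 [2 6]
# 3 [3 7]
# Every element lands in exactly one slice, so the per-engine partial images
# sum to the same result as a single render of the full dataset.
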

4. Combining the Results

final_image = np.zeros([1<<LevelMax, 1<<LevelMax])  # Empty (2^LevelMax x 2^LevelMax) image: 2048 x 2048 for LevelMax = 11

# Collect the results from the engines and sum the images
for i in range(nproc):
    final_image += async_results[i].get()

# Display the final combined image
plt.imshow(np.log10(final_image))
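
One caveat: pixels that receive no particles stay at zero, so np.log10 returns -inf (and a runtime warning) for them. A simple workaround, sketched below, is to floor the image at its smallest non-zero value before taking the log:

# Avoid log10(0) for empty pixels by flooring at the smallest non-zero value.
floor = final_image[final_image > 0].min()
plt.imshow(np.log10(np.maximum(final_image, floor)))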

Full Parallel Code Example

Here is the complete code for rendering the image in parallel. As before, you can download a simulation file from the CAMELS project:

https://users.flatironinstitute.org/~camels/Sims/IllustrisTNG/CV/CV_0/snapshot_090.hdf5

import ipyparallel as ipp
import h5py
import numpy as np
import matplotlib.pyplot as plt

# Initialize and start the cluster with MPI engines
cluster = ipp.Cluster(n=8, engines='mpi')
await cluster.start_cluster(n=8)
rc = cluster.connect_client_sync()

# Ensure that all 8 engines are ready
rc.wait_for_engines(n=8)

# Load the simulation snapshot (HDF5 file)
snap = h5py.File("path/to/snapshot_090.hdf5", 'r')
pgas = snap['PartType0/Coordinates'][()]
rho  = snap['PartType0/Density'][()] * 1e10
mgas = snap['PartType0/Masses'][()] * 1e10
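
# The next line estimates each particle's smoothing length from its density,
# assuming roughly 32 neighbours inside a sphere of radius hgas:
# (4/3)*pi*rho*hgas^3 ~ 32*mgas.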
hgas = (32 * mgas / (4./3.*np.pi * rho))**(1./3.)
BoxSize = snap['Header'].attrs['BoxSize']

# Define the SPH rendering function
def make_image(xx, yy, mm, hh, xmin, xmax, BoxSize, LevelMin, LevelMax):
    from sphviewer2 import sphviewer2
    sph = sphviewer2()
    image = sph.render(xx, yy, mm, hh, xmin, xmax,
                       BoxSize, LevelMin, LevelMax)
    return image

nproc = 8  # Number of processes for parallel rendering

async_results = [] 
LevelMin = 8
LevelMax = 11

for i in range(nproc):
    # Dispatch the parallel task to one of the engines
    async_result = rc[i].apply_async(
        make_image, 
        pgas[i::nproc,0], pgas[i::nproc,1], 
        mgas[i::nproc], hgas[i::nproc], 0.0, 0.0, 
        BoxSize, LevelMin, LevelMax)
    
    # Store the result
    async_results.append(async_result)

# Combine the results
final_image = np.zeros([1<<LevelMax, 1<<LevelMax])  # Empty (2^LevelMax x 2^LevelMax) image: 2048 x 2048 for LevelMax = 11

# Collect the results from the engines and sum the images
for i in range(nproc):
    final_image += async_results[i].get()

# Display the final combined image
plt.imshow(np.log10(final_image))
plt.show()

Conclusion

In this tutorial, we extended the single-process SPH rendering example to run in parallel using ipyparallel and MPI engines. The key steps were:

  1. Splitting the data across multiple processes.
  2. Distributing the tasks to parallel workers (engines).
  3. Collecting and combining the results to produce the final image.

Parallel processing is a powerful tool when dealing with large datasets like cosmological simulations, allowing for significant speed improvements when rendering complex fields like SPH gas densities.
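
If you want to quantify the speed-up on your own machine, one simple approach is to time the dispatch-and-collect loop with time.perf_counter and compare it against a single-process render. A minimal sketch, reusing the variables from the full example above:

import time

t0 = time.perf_counter()
async_results = [
    rc[i].apply_async(
        make_image,
        pgas[i::nproc, 0], pgas[i::nproc, 1],
        mgas[i::nproc], hgas[i::nproc], 0.0, 0.0,
        BoxSize, LevelMin, LevelMax)
    for i in range(nproc)
]
partial_images = [r.get() for r in async_results]
print(f"Rendered {nproc} partial images in {time.perf_counter() - t0:.2f} s")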

By adapting this method, you can scale the computation to larger datasets or more complex simulations by using a higher number of processes.