Exercises Module 3

Module-3: Examples of Efficient Data Structures for Analysis of Big Data

Exercise 1 - Efficient Data Structures - Linked List

1) Index the particles in the simulation using a Linked List data structure, dividing the space into 250 cells, so that each cell has a length of 100 kpc/h.

2) Use this structure to query the particles in the cell containing a FOF group (FOF 5, say). Also include the particles in the neighbouring cells, within ±1 cell of it.

3) Compare the speed gain relative to a brute-force algorithm.

Note: The speedup grows with the size of the simulation, because a brute-force search scales with the total number of particles, whereas a cell query only visits the particles stored in the selected cells. I cannot emphasize enough the importance of using these data structures.
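As a starting point, the idea can be sketched on synthetic data. This is only a minimal sketch, not the reference solution: the function names (build_linked_list, query_cells), the unit box, the 10 cells per dimension and the random positions are all assumptions made for illustration, and the query assumes a periodic box with at least 3 cells per dimension.

```python
import numpy as np

def build_linked_list(pos, box_size, n_cells):
    """Build the head/next arrays of a cell-based linked list.

    head[ix, iy, iz] is the index of the last particle inserted in that cell
    (-1 if the cell is empty); next_[i] is the next particle in the same cell
    as particle i (-1 at the end of the chain).
    """
    cell_size = box_size / n_cells
    # Integer cell coordinates; clip guards against round-off at the box edge.
    idx = np.clip(np.floor(pos / cell_size).astype(np.int64), 0, n_cells - 1)
    head = np.full((n_cells, n_cells, n_cells), -1, dtype=np.int64)
    next_ = np.full(len(pos), -1, dtype=np.int64)
    for i, (ix, iy, iz) in enumerate(idx):
        next_[i] = head[ix, iy, iz]   # chain to the previous head of the cell
        head[ix, iy, iz] = i          # particle i becomes the new head
    return head, next_

def query_cells(head, next_, centre, box_size, n_cells, pad=1):
    """Return indices of particles in the cell of `centre` and +-pad cells around it."""
    cell_size = box_size / n_cells
    c = np.clip(np.floor(np.asarray(centre) / cell_size).astype(np.int64),
                0, n_cells - 1)
    found = []
    for dx in range(-pad, pad + 1):
        for dy in range(-pad, pad + 1):
            for dz in range(-pad, pad + 1):
                # Periodic wrap-around (an assumption about the simulation box).
                ix, iy, iz = (c + (dx, dy, dz)) % n_cells
                i = int(head[ix, iy, iz])
                while i != -1:        # walk the chain of this cell
                    found.append(i)
                    i = int(next_[i])
    return np.array(found, dtype=np.int64)

# Synthetic stand-in for the simulation: random positions in a unit box.
rng = np.random.default_rng(0)
box_size, n_cells = 1.0, 10
pos = rng.random((2000, 3)) * box_size

head, next_ = build_linked_list(pos, box_size, n_cells)
centre = pos[0]                       # stand-in for the FOF group centre
near = query_cells(head, next_, centre, box_size, n_cells)
```

The brute-force alternative tests every particle against the selected region, so timing both approaches (for example with time.perf_counter) on the real snapshot gives the comparison asked for in step 3.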

# Exercise 1 - Solution to exercise 1 should go here
#======================================================

#======================================================

Exercise 2 - Hash Tables to Cross-Match Particles Across Snapshots

1) Index the particles using Python’s built-in hashing mechanism, implemented in the dictionary data structure.

2) Use this hash table to cross-match particles belonging to a dark matter halo across two different snapshots.

3) Check the speed gain compared to, e.g., linear search.

4) Optional: Construct your own hash table.
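The first two steps can be sketched with synthetic particle IDs. This is a minimal sketch under stated assumptions, not the reference solution: the helper names (build_id_index, cross_match), the random ID arrays and the choice of the first 50 IDs as the "halo" are hypothetical stand-ins for the real snapshot data.

```python
import numpy as np

def build_id_index(ids):
    """Map each particle ID to its position in the snapshot arrays."""
    return {int(pid): i for i, pid in enumerate(ids)}

def cross_match(halo_ids, id_index):
    """For each halo particle ID, return its index in the other snapshot (-1 if absent)."""
    return np.array([id_index.get(int(pid), -1) for pid in halo_ids],
                    dtype=np.int64)

# Synthetic stand-ins: the same 10,000 IDs stored in a different order
# in two snapshots (an assumption made for illustration).
rng = np.random.default_rng(1)
ids_a = rng.permutation(10_000)
ids_b = rng.permutation(10_000)

halo_ids = ids_a[:50]                 # hypothetical halo members in snapshot A
index_b = build_id_index(ids_b)       # built once, reused for every lookup
match = cross_match(halo_ids, index_b)
```

Each dictionary lookup takes constant time on average, whereas a linear search such as np.flatnonzero(ids_b == pid) scans the whole array for every particle, which is the difference step 3 asks you to measure.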

# Exercise 2 - Solution to exercise 2 should go here
#======================================================

#======================================================

Alejandro Benitez-Llambay
