# Particle Merging¶

Be careful when using this module and read carefully the documentation.

Note that the nomenclature might change

The ability to merge macro-particles can speed-up the code efficiency and reduce the memory footprint in some specific simulation scenarii:

• When macro-particles accumulate in a fraction of the simulation domain hence strongly worsening the load imbalance (ex: Weibel collision shocks, laser wakefield electron acceleration).
• When macro-particles are generated in a large quantity due to some additional physical mechanisms (ionization, macro-photon emission, QED pair production…)
• When macro-particles travel in large quantities outside interesting physical regions.

Smilei’s merging method is inspired from that of M. Vranic [Vranic2005] with some additions.

Please refer to that doc for the definitions in the namelist file.

## Understand the method¶

The method of M. Vranic, illustrated (in 2D) in Fig. Fig. 41, consists in 3 main steps:

1. Decompose macro-particles into groups according to their location so that they have nearby positions. In Smilei, macro-particles are sorted by field cells, even though, in Vranic et al., the decomposition can be larger than just one cell.
2. Subdivide groups into sub-groups in momentum space so that macro-particles share close kinetic properties.
3. Merge macro-particles located in the same sub-group into 2 new macro-particles, while conserving charge, energy and momentum.

• it is relatively easy to understand and implement,
• it has a relatively low computational cost,
• it is efficient without impacting significantly the physical results.

Warning

This assumes that the parameters are adequately tuned. Otherwise, the macro-particle merging can affect the final simulation results.

### 1. Momentum sub-groups¶

The momentum ($$\mathbf p$$) space may be decomposed either in cartesian ($$p_x$$, $$p_y$$, $$p_z$$) or spherical ($$p$$, $$\theta$$, $$\phi$$) coordinates.

In each cell, for each coordinate $$\alpha$$:

• we compute the overall limits in momentum space.
• This space is divided in $$N_{\alpha}$$ sub-groups (as prescribed by the user) of length $$\Delta_{\alpha}$$.

The spherical components are related to the Cartesian momentum components by:

(41)$\begin{split}p = \sqrt{ p_x^2 + p_y^2 + p_z^2 }\\ \theta = \arctan{ \left( p_y / p_x \right)}\\ \phi = \arcsin{\left( p_z / p \right)}\end{split}$

Since macro-particle momentum components are defined in Cartesian geometry by default, the spherical discretization induces small additional computation. However, it makes the merging process more accurate. Indeed, in the Cartesian discretization, the maximum angle between the momentum directions of two macro-particles located in the same sub-group depends on the sub-group. For instance, in the cell adjacent to the origin $$p_x = p_y = p_z = 0$$, this angle equal up to $$\pi / 2$$ whatever the discretization. The spherical geometry ensures that the merging accuracy depends on the discretization and is similar for all momentum cells. The overhead induced by the change of geometry is a small fraction of the entire process.

### 2. Merging algorithm for massive macro-particles¶

In step 3, for each sub-group containing more than 4 macro-particles, the algorithm merges them into 2 macro-particles. Let us denote by $$k$$ the macro-particles in a given sub-group $$\mathrm{M}$$, their weights by $$w_k$$, their energies by $$\varepsilon_k$$ and their momenta by $$\mathbf{p}_k$$. We start by computing total quantities:

(42)$\begin{split}w_t = \sum_{k \in \mathrm{M}}{w_k}\\ \varepsilon_t = \sum_{k \in \mathrm{M}}{w_k \varepsilon_k}\\ \mathbf{p}_t = \sum_{k \in \mathrm{M}}{w_k \mathbf{p}_k}\\\end{split}$

In spherical geometry, the total angles are also defined:

(43)$\begin{split}\theta_t = \sum_{k \in \mathrm{M}}{w_k \theta_k}\\ \phi_t = \sum_{k \in \mathrm{M}}{w_k \phi_k}\end{split}$

Merging all the macro-particles into one cannot always conserve weight, energy and momentum. Vranic et al. propose to merge into 2 macro-particles:

(44)$\begin{split}w_t = w_a + w_b \\ \mathbf{p}_t = w_a \mathbf{p}_a + w_b \mathbf{p}_b \\ \varepsilon_t = w_a \varepsilon_a + w_b \varepsilon_b\end{split}$

The following energy-momentum relation has to be satisfied for both macro-particles a and b:

(45)$\varepsilon^2 = p^2 + 1$

To simplify the problem, Vranic et al assume that merged macro-particles have the same weight $$w_a = w_b = w_t / 2$$ and same energy $$\varepsilon_a = \varepsilon_b = \varepsilon_t / w_t$$.

As illustrated in Fig. 44, it follows that:

(46)$\begin{split}\mathbf{p}_a + \mathbf{p}_b = \frac{2 \mathbf{p}_t}{w_t} \\ \mathbf{p}_{a,\perp} = - \mathbf{p}_{b,\perp} \\ \mathbf{p}_{a,\parallel} = \mathbf{p}_{b,\parallel} = \mathbf{p_t} / w_t\end{split}$

We denote by $$\omega$$ the angle between $$\mathbf{p_a}$$ and $$\mathbf{p_t}$$ so that:

(47)$\cos{\omega} = \frac{\mathbf{p_t}}{w_t \mathbf{p_a}}$

We denote by $$\mathbf{d}$$ the coordinate vector of the sub-group where the macro-particles are located.

The plane $$(\mathbf{e_1},\mathbf{e_2})$$ is the plane made by the vectors $$\mathbf{p_t}$$ and $$\mathbf{d}$$. We choose that it contains $$\mathbf{p_a}$$ and $$\mathbf{p_b}$$ so that we have only one possible solution.

The vectors $$\mathbf{e_1}$$ and $$\mathbf{e_2}$$ are given by the following formula in the PIC code’s momentum frame:

(48)$\mathbf{e_1} = \mathbf{p_t} / p_t$
(49)$\mathbf{e_3} = \frac{ \mathbf{d} \times \mathbf{e_1} }{d}$
(50)$\mathbf{e_2} = \mathbf{e_1} \times \mathbf{e_3}$

Finally, the new macro-particle momentums are:

(51)$\begin{split}\mathbf{p_a} = p_a \left( \cos{\left( \omega \right)} \mathbf{e_1} + \sin{\left(\omega\right)} \mathbf{e_2} \right) \\ \mathbf{p_b} = p_b \left( \cos{\left( \omega \right)} \mathbf{e_1} - \sin{\left(\omega\right)} \mathbf{e_2} \right)\end{split}$

The new macro-particle positions are assigned the position of one of the merged macro-particles. We have tested to assign them randomly or to the first macro-particles of the merged list and we did not observe any difference.

This algorithm does not work when the total momentum $$\mathbf{p}_t$$ of the macro-particles to be merged is in the direction of $$\mathbf{d}$$. In this case $$|| \mathbf{e_3} || = 0$$ and the system cannot be solved. In this specific case, the merging is not performed.

### 3. Merging algorithm for macro-photons¶

Macro-photons can be merged with the same algorithm, the only difference being that the momentum norm is equal to the energy $$\varepsilon = p$$.

When the total momentum $$\mathbf{p}_t$$ is in the direction of $$\mathbf{d}$$, macro-photons can be merged into a single one, contrary to the massive macro-particles case, since $$\varepsilon_t = || \mathbf{p}_t ||$$. This specific situation is implemented in the code.

## Implementation details¶

The Vranic merging method is implemented with both Cartesian and the spherical discretizations in the directory src/Merging. It is considered as a particle operator and the merging algorithm is managed with a factory (MergingFactory.h) as any operator with multiple implementations. The Cartesian implementation is done in the class MergingVranicCartesian and the spherical one in MergingVranicSpherical.

For both methods, the implemented algorithm is very similar.

For each cell:

1. Initialization of the sub-group discretization
2. Computation of the direction vectors ($$\mathbf{d}$$): this step depends on the discretization and can be efficiently vectorized.
3. Computation of the sub-group indexes for each macro-particle. Efficiently Vectorizable.
4. Computation of the number of particles per sub-group. Not vectorizable because of random memory accesses.
5. Computation of the index of each sub-group in the sorted array of particles (only the particle indexes are sorted). Not vectorizable.
6. Sorting of the macro-particles per sub-groups, the index previously computed determines where each sub-group starts. Not vectorizable.

Then, for each sub-group:

1. Division of the macro-particles of the sub-groupinto packets (size depends on the user parameters merge_min_packet_size and merge_max_packet_size)
2. Merge of the packs using the previously described Vranic algorithm. Partly vectorized.
3. Creation of the merged macro-particles at the position of the previous ones
4. Tag of the macro-particles to be removed

Then, once the merging finished for a given patch:

1. Compression of the macro-particle list (remove hole in arrays let by removed and tagged particles). By cleaning the particle vector at the end, we limit the computational impact of this step.

### 1. Cartesian sub-group discretization¶

How to discretize momentum space is in fact one of the most important points. The user provides $$N_x$$, $$N_y$$ and $$N_z$$ via the namelist, but it may be slightly adjusted for algorithmic reasons:

• If the momentum space is very narrow in one direction, only one sub-group may be used.
• We force the origin ($$p = 0$$) to delimit two sub-groups so that a sub-group cannot contain two opposite momenta. This may require an extra sub-group to fit the whole momentum space.

### 2. Spherical sub-group discretization¶

The user provides $$N_p$$, $$N_\theta$$ and $$N_\phi$$ via the namelist, but adjustments may occur:

• If the momentum space is very narrow in one direction, only one sub-group may be used.
• The $$\Delta_{\alpha}$$ are modified by a factor 1.01 to include the maximum boundary.

### 3. Solid-angle correction in 3D¶

A rudimentary spherical discretization does not ensure that all sub-groups span similar solid-angles, as they becomes arbitrarily small at the poles.

To obtain a solid angle approximately constant, the discretization in $$\theta$$ is adjusted depending on the value of $$\phi$$. Denoting by $$\Omega_{0}$$ the solid angle at the smallest $$|\phi|$$, the sub-groups length $$\Delta_\theta$$ along $$\theta$$ varies to satisfy $$\Omega = \sin{(\phi)}\Delta \theta \Delta \phi = \Omega_{0}$$.

### 4. Accumulation effect¶

In one merging event, the strongest contribution comes from weightiest macro-particles, which mostly come from a previous merging event instead of the smaller macro-particles. Some macro-particles may thus become uncontrollably heavy and dominate others with little changes in their kinetic properties. This effect may be particularly strong with large sub-groups and broad momentum distributions, which are not well conserved.

To illustrate this phenomenon, let us consider a 3D magnetic shower benchmark: the domain is filled with an electron-positron plasma with all macro-particles initialized using the same Lorentz factor $$\gamma = 8125$$ in the same direction. They propagate orthogonally to a constant and uniform magnetic field of amplitude $$B = 1000 e/(m\omega)$$, corresponding to a quantum parameter of $$\chi = 20$$ for both electrons and positrons. The input script of this simulation is available here.

This accumulation effect creates peaks in the photon energy distribution as shown in Fig. 48 a).

Need to explain the correction.

Warning: the accumulation correction is not working with the logarithmic discretization.

### 5. Logarithmic scale¶

For the spherical discretization only, the momentum norm $$p$$ discretization may be chosen logarithmically scaled. Due to the logarithm computation, this option is slightly slower than the linear version. Nonetheless, it can handle more appropriately broad momentum distributions.

On the magnetic shower case presented in the previous section, the logarithmic discretization reproduces nicely the distribution obtained without merging.

Warning: the logarithmic discretization is subject to accumulation oscillations but is not compatible with the accumulation correction.

## Simulation results¶

In this section, the particle merging is tested with a 3D scenario of electron-positron pair cascading. Like the magnetic shower, a seed of electrons or positrons in a strong electromagnetic field generates high-energy photons that, in turn, decay into electron-positron pairs. The difference is that the seed particles and the newly created ones may gain some energy from the fields and participate to the generation of pairs. The production of electron-positron pairs can therefore be maintained (the cascade) as long as there is a source of energy. In such a scenario, we can expect an exponential growth of the number of particles leading to the creation of an electron-positron plasma.

A plasma of electrons and positrons (the seed) is initially irradiated by two counter-propagating lasers (strong field and source of energy), chosen as circularly-polarized plane waves.

When they collide, the two waves form a steady plane wave of very strong amplitude able to trigger strong QED effects. Detailed simulation parameters are available in this namelist.

An aggressive merging process is performed at every timestep with a relatively sparse momentum-space discretization. Note that, in 1D and 2D, it could use smaller sub-groups and be performed more often as there would be more particles per cell. The merging is applied on all species. All cases are run during a maximum of 1000 seconds.

As presented in Fig. 52, the merging process starts when the number of macro-particles is high enough: $$\sim 10^5$$ macro-photons. Between 10% and 20% more iterations are achieved, compared to the no merging case.

Each merging method does not exactly gives the same kinetic energy and weight evolution. As we will see, the merging processes modify the momentum distribution and influence the physical processes.

We now compare energy spectra at time $$t = 39 \omega^{-1}$$ (nearly when the no merging case saturates) in Fig. 53. All merging methods significantly affect the energy distributions, and oscillations are most visible in the photon distribution due to the accumulation effect.

Fig. 54 shows the $$k_x-k_y$$ momentum distribution of the photons. It clearly shows that, with their level of discretization, none of the merging processes can adequately reproduce the reference distribution.

Fig. 55 shows the $$p_x-p_y$$ momentum distribution of the electrons.

To compare the merging methods in term of performance, Fig. 56 shows the CPU time necessary to compute a numerical timestep as a function of time. The linear spherical discretization is the fastest method because the solid angle correction reduces the number of sub-groups. The logarithmic spherical discretization has the same advantage but it is slowed down by the computation of logarithms, and, in the end, similar to the original cartesian method described in [Vranic2005].

The following video illustrates the simulation benchmark in the case of a logarithmic spherical discretization. Specifically, it shows the time evolution of the electron, the positron and the photon densities in the plane x-y, integrating over z. It shows and exponential growth of photons and massive particles happening first in the y-z plane near the center of the domain then expanding longitudinally.

## References¶

 [Vranic2005] (1, 2) M. Vranic et al., CPC, 191 65-73 (2015)