Some additional development pathways and thoughts

While developing this paper there were some avenues we explored, which didn’t make it to the paper. Either due to time constraints (Connor needed to write up his thesis before a specific date), or due to computational complexity.

We have collected these additional developmental pathways, and notes, here in the hope that they will be useful in the future if developing this technique further.

PLEASE NOTE: The developmental code repositories are not up-to-date with the branches used for the final analyis. Some rebasing would be needed to use these.

Current: Repo

Currently the template bank is generated by choosing a random sky location/orientation for each proposal and testing the match using 5 harmonics. If the match is below the threshold then the proposal is kept and the sky location/orientation is discarded.

After generating the template bank a script is used to calculate the effective fitting factor and select the number of harmonics to use.

CIT: /home/ian.harry/aLIGO/O3/bank_example/bank_gen/phenompv2/num_comps_identifier

The bank is available at:

CIT: /home/ian.harry/aLIGO/O3/bank_example/bank_gen/phenompv2/num_comps_identifier/prec_bank.h5

Development: Repo

In order to generate a template bank we must choose the number of harmonics that we will use for each template. This was originally done by generating the bank using all 5 harmonics then choosing the number of components for each template later. We aim to do this within the template bank generation code in order to account for template close to each bank template.

The distribution of SNRs in Gaussian noise for a template will follow a chi-squared distribution with 2N degrees of freedom (where N is the number of harmonics). We therefore want to reduce the number of harmonics wherever possible.

In order to achieve this for each proposal we follow these steps (until continue)

Test if it is covered by an existing template in the bank, if it is then continue.
Test if it is covered by itself with 1 harmonic, if it is then add it to the bank with 1 harmonic and continue.
For N = 2, .., 5 test if it is covered by any templates with N harmonics, for templates where N is greater than it’s current number of harmonics. After each N check if it is covered by any templates with that number of harmonics, if it is then pick the template with the greatest match and update the number of harmonics in the bank to N.

There are two methods available for testing coverage, detailed below.

Random Sky Locations/Orientations: Commit

In this method for each proposal we

pick a random sky location/orientation and generate the proposal template
Calculate the match between the bank template (using N harmonics) and the proposal template
Check if the match is greater than the threshold

After adding a new template to the bank the sky location/orientation is discarded.

For each match this requires N FFTs.

Using this method may choose larger values for N than required. For example, if a template requires 2 harmonics for 99.9% of sky positions/orientations we may want to select 2 harmonics, however, if one of the 0.1% of sky positions/orientations is randomly drawn in the bank generation we would select > 2 harmonics. We may then end up increasing the number of harmonics to cover very rare events, reducing our sensitivity to common events by increasing the noise rate.

Averaged Sky Locations/Orientations: Commit

In this method we first generate M random sky positions/orientations, we then calculate the amplitude factors for each harmonic using get_comp_sigmasq (ignoring the template specific b factor).

For each proposal we then

Calculate the harmonic’s amplitudes for each sky position/orientation using the pre-calculated factors and the templates b value (code)
Calculate the match time-series between the N harmonics of the bank template and each harmonic of the proposal template (5N matches) (code)
For each sky position/orientation we scale the match time-series appropriately and find the match (code)
We use the match and the amplitude factors for each sky position/orientation to calculate a sky average fitting factor (weighted by the volume that each sky position/orientation is observable) (code).
Check if this value is greater than the threshold.

For each fitting factor calculation this require 5N FFTs, there are also 5N*M multiplications and additions here which will be the dominant cost if M > log(N).

As this averages over the sky position/orientations the number of harmonics will not be increased due to very unlikely events. However it is slow!

This method is currently slow and could possibly be optimised further

This also currently calculates the factor of b between each harmonic by calculating the angle between J and L at a reference frequency (code). However this angle will change as the binary evolves, we should therefore calculate the ratio of the sigmas of each harmonic and use this to calculate the relative amplitudes.

Search: Repo

The search will take the generated bank and calculate the SNRs using the selected number of harmonics for each template. There are four main areas of the search pipeline that will need to be altered to work:

Inspiral code will need to be changed to calculate the SNR and chisq using the harmonics, as well as storing the extra information.
The noise rate calculation will need to be updated for this method.
The signal rate histograms will need to be generated covering each harmonic.
The coincident statistic will need to updated to include points 2 and 3.

Inspiral: Code

This code is a copy of pycbc_inspiral editted to work for the harmonic search.

In the main executable the code has been updated to store all 5 possible SNRs and phases, and separate SNR threshold for template with different numbers of harmonics can now be specified.

Bank

The template bank now generates the specific number of harmonics for each template and then whitens and orthogonalises them with respect to each other.

The harmonic generation is controlled by the waveform generator here, this has been tidied slightly for readability and optimisation.

Matched Filter

The matched filter is controlled by this class which performs a matched filter separately for each harmonic and adds them in quadrature (this can be done as they have been orthogonalised).

Chisq Calculation

The chisq calculation controlled by this class.

To perform the chisq we reconstruct the maximum SNR template using the complex SNR for each harmonic (code). This reconstructed template will then be used to calculate the chisq bins which will be different for each trigger here.

As the bin boundaries are different for each trigger we would need to call the chisq code separately for each trigger, which would be slow. We therefore introduce a new function shift_multibin, which can take a 2D array of bin edges as well as the complex scaling factor for each harmonic used to reconstruct the maximum SNR template. This is then passed to the function point_chisq_multibin_code, which performs the same optimised chisq routine as point_chisq_code using the different bins and scaling factors for each trigger.

Noise Rate

The noise rate in the aligned-spin search is calculated by fitting the trigger rate for each template with an exponential above a threshold. The slope of the exponential and the overall rate is then averaged over nearby templates with some weighting.

The exponential fit for each template will still work when using more harmonics, the overall rate will be increased and the slope of the exponential shallower.

A modified version of the script which smooths the exponential fits over parameters is given here. This script follows the usual method except it splits the templates into those with 1, 2, 3, 4 or 5 harmonics and smooths over each set separately, this is necessary as each will have very different expected noise rates.

Signal Rate

For the aligned spin search the signal rate is calculated using histograms of the time and phase differences and SNR ratios between different detectors. This is done using the executable pycbc_dtphase. This code works by

Drawing random sky positions/orientations isotropically
Calculating the amplitude, phase and time of arrival for each detector
Finding the SNR ratio, phase difference and time difference for each detector combination
Discarding any point with SNR ratio outside a given range
Binning the remaining values into a histogram with 3n(n-1)/2 dimensions, where n is the number of IFOs (each sample has a weight proportional to the amplitude ** 3)
Smoothing the histogram using a Gaussian kernel to account for uncertainty assuming that the SNRs are roughly equal to the chosen reference SNR

In our case, for each detector we now have a set of N (number of harmonics) SNRs and phases, and 1 time. To copy the method above for each possible combination would have too many dimensions for the problem to be feasible. We will therefore split the problem into more manageable chunks.

Current

In the current search we perform a simple first step and compare the loudest harmonic across all detectors. This only requires a small modification to the aligned-spin method. As we are comparing the same harmonic across all detectors the factor of b will be the same across the network (ignoring differences in PSD) and it can therefore be ignored. The same is true for the initial precession phase. The script to generate the histograms for each harmonic are available here. In this script the effect of the inclination on the amplitude and phase is calculated here. We can see that in the case that we are observing the first harmonic this matches the aligned-spin case as we expect.

Development - Single detector: Code

It would be useful to use more harmonics when comparing the phase and amplitude differences. We could try and split the histograms in different chunks. Here we generate histograms comparing the amplitudes and phases first across a single detector, then across multiple.

Similar to the aligned-spin code we will randomly draw a set of sky positions/orientations and use these to calculate the amplitudes and phases of each harmonic. In the aligned-spin case all templates will have the same expected SNR ratio and phase difference due to a change in sky location. However, in our case the expected amplitudes for the different harmonics will change by a factor b ** k where b = tan(beta) and k is the number of the harmonic (0 to 4). We must therefore generate histograms for different values of b.

** Much like the template bank generation this currently assumes a constant value of b. It also assumes a factor of b which is the same between each harmonic, however, because each harmonic is at a different frequency the factor of b measured between each subsequent harmonic may be different! This will be difficult to handle if the effect is significant. **

The generation of the histograms will now be in a loop over s set of bins in b, for each of these bins we

Draw random sky locations/orientations isotropically
Use the function get_factors to generate the complex amplitude for each harmonic.
Scale the total SNR of the samples to the reference SNR chosen
Add Gaussian noise to the real and imaginary parts of the amplitudes to estimate the uncertainty in the SNR
For each harmonic take all the samples where that harmonic is the loudest and find the SNR ratios and phase differences with respect to that harmonic
For each harmonic bin all the samples where that harmonic is the loudest into a histogram (giving us N histograms)
Record the fraction of signals that are sorted into each histogram

In this method we chose to add Gaussian noise to the complex amplitude instead of smoothing using a Gaussian kernel. In the aligned-spin case the SNR ratio and phase difference uncertainty is taken as constant across the histogram, this is reasonable as there will be no SNRs below the SNR threshold used in the search, so if the reference SNR is set just above this threshold most SNRs will be close to the reference value. However, in our case we store any trigger where the total SNR across the harmonics is greater than the SNR threshold, this can lead to a very large range in the SNRs stored. By adding the noise directly to the complex amplitude we no longer need to pick a constant uncertainty in the SNR ratio and phase difference.

Similarly, if we chose to only store samples where SNR ratios fell within a certain range we would discard a huge number of trigger (particularly for the cases of b close to 0 and b » 1). We therefore split the samples over N histograms, one for each harmonic, and store each sample in the jth histogram where j is the number of the harmonic which is the loudest for that sample. Doing it this way we can store the SNR ratios from 0 to 1 in each case and store all samples. We also record the total weight of the samples stored in each histogram. We can think of this as each histogram representing P(theta

j loudest) and the recorded weight for each histogram as P(j loudest) so by multiplying we can get P(theta).

In order to keep the memory requirement lower for creating and reading the histograms we create one histogram for phase difference and a second for the SNR ratios.

Example plots of the amplitude and phase histograms for the case of 3 harmonics (where 0 is the loudest harmonic) with beta close to 1.

Development - Multi detector: Code

In the multi detector case if we compared each harmonic in one detector to each harmonic in another then there will be (N ** 2) + 1 dimension, even when splitting into separate SNR ratio and phase difference histograms (the +1 comes from the time difference). This takes huge amount of memory, even in the case of 3 harmonics. We therefore choose to only compare each harmonic to the same harmonic in the second detector (still splitting into separate phase and SNR histograms), this gives us dimensions of N + 1 for each histogram, which is manageable.

We still split into different b bins as this will greatly effect the uncertainty in the SNR ratios and phase differences recorded for different harmonics.

Draw random sky positions/orientations isotropically
Use the function get_factors_multi_ifo to generate the complex amplitude for each harmonic in each detector.
Scale the total SNR of the samples to the reference SNR chosen
Add Gaussian noise to the real and imaginary parts of the amplitudes to estimate the uncertainty in the SNR
Find the SNR ratio, phase difference and time difference for each detector combination
Discarding any point with SNR ratio outside a given range (as we are comparing harmonics with the same factor of b ** k this shouldn’t discard too many samples)
Binning the remaining values into histograms (each sample has a weight proportional to the amplitude ** 3)
Smooth the histogram over the time difference using a Gaussian kernel to account for uncertainty in the timing which isn’t estimated when we add Gaussian noise to the complex amplitude

Example plots of the amplitude and phase histograms for 2 detectors, 3 harmonics and b close to 0.

Coincidence

The coincident statistic is then calculated using the statistic ExpFitFgBgNormTHAStatistic. The noise rate is calculated using the same method as ExpFitFgBgNormStatistic, however the signal rate must be updated to read in the histograms for each harmonic.

We currently only compare the loudest harmonic in the loudest detector to the same harmonic across the network. The signal rate is calculates using PhaseTDTHAStatistic which is a modified version of PhaseTDStatistic. This reads in the histograms for each harmonic using each detector as the reference. When calculating the signal rate for each trigger it then split them into groups for each detector/harmonic and finds the appropriate histogram value.

** In the current search the network sensitive volume is calculated using the median sigmasq across each template in each detector. Currently in the inspiral code here the sigmasq value is set to 1. So the sensitive volume is constant for all detectors and templates. This will need to be updated before using three detectors! This will require an averaged sigmasq value over sky location/orientation, howver, this may be slow to compute **