NuRadioReco.modules.likelihood_reconstruction.likelihood_calculator module

class NuRadioReco.modules.likelihood_reconstruction.likelihood_calculator.LikelihoodCalculator(n_antennas, n_samples, sampling_rate, matrix_inversion_method='pseudo_inv', threshold_amplitude=0.01, increase_cov_diagonal=0, ignore_llh_normalization=True)[source]

Bases: object

Probabilistic description of band-limited noise in radio detectors. The noise is assumed to be a multivariate gaussian of dimension n_samples. The covariance matrix of the distribution can be calculated from the spectrum of the noise or directly from datasets containing only noise.

The main purpose of this class is to calculate the likelihood of a signal given a measured trace, which can be used for likelihood reconstruction as described in https://arxiv.org/abs/2510.21925. Additionally, the class can be used to estimate the Fisher information matrix for a parameterized signal model.

Parameters:
n_antennasint

Number of antennas

n_samplesint

Number of samples in each trace

sampling_ratefloat

The sampling rate of the antennas

matrix_inversion_methodstr, optional

If the covariance matrix is not full rank, the inverse, which is used to calculate the multivariate normal, does not exist. This is the case if any frequency amplitudes in the spectra are zero, which corresponds to a distribution in a sub-space which has lower number of dimensions than n_samples, i.e. some dimensions are degenerate. The correct mathematical way to describe the degenerate case is by using the pseudo-inverse and pseudo-determinant to calculate the multivariate normal PDF. This method is used if this parameters is set to “pseudo_inv”, and is numerically stable but slow. We thus also provide the option to use the regular inverse by setting this parameter to “regular_inv”, which works for full rank matrices and sometimes for lower rank matrices (see increase_cov_diagonal).

threshold_amplitudefloat, optional

Fraction of the maximum amplitude in the spectra below which the frequency amplitudes are considered zero. Also used as the threshold for the eigenvalues when calculating the pseudo-inverse and pseudo-determinant.

increase_cov_diagonalfloat, optional

Only used if matrix_inversion_method is “regular_inv”. Calculating the inverse of a (covariance) matrix can be numerically unstable, and it does not exist if the matrix is not full rank. By adding a small component to the diagonal of the covariance matrix it will be of full rank which can improve stability of the calculation. The parameter increase_cov_diagonal sets what fraction of the variance should be added to the diagonal of the covariance matrix only when calculating the inverse.

ignore_llh_normalizationbool, optional

We are generally only interested in likelihood ratios or delta log likelihood. In this case the normalization of the distribution cancels out and can just be set to 1. This speeds up initializing the class/covariance matrices.

Methods

initialize_with_spectra(spectra, Vrms=None)

Initialize the covariance matrices using the spectra of the noise and (optionally) the Vrms per channel.

initialize_with_data(data, method=”using_spectra”)

Initialize the covariance using traces containing noise.

calculate_minus_two_delta_llh(data, signal=None, frequency_domain=True)

Calculates the minus two delta log likelihood for signal (or no signal) given a set of measured traces, e.g, for forward-folding reconstruction.

calculate_minus_two_delta_llh_station(station, sim_station, time_grid=None, use_channels=None, frequency_domain=False, plot=True, return_traces=False)

Calculates the minus two delta log likelihood for signal in a sim_station for a set of measured traces station objects. Finds the best matching time offset of the signal.

calculate_fisher_information_matrix(signal_function, parameters_x0, dx, frequency_domain=False, ignore_parameters=[], plot=False)

Calculates the Fisher information matrix for a parametrized signal model and a given set of parameter values using the intialized probalistic noise model.

initialize_with_spectra(spectra, Vrms=None)[source]

Initialize the likelihood calculator using the spectra/filters of the noise

Parameters:
spectranumpy.ndarray

Array containing spectra with dimensions [n_antennas,n_frequencies] or [n_frequencies]. For the latter case, all antennas are assumed to have the same spectrum. The spectra are defined as the square root of the mean of the Fourier transforms of the noise traces absolute squared, np.sqrt(np.mean(abs(fft(traces))**2)), using the NuRadioReco fft module. Ignoring the normalization, this is equavalent to the filter that a trace with white noise has been filtered with, i.e, this method can be called with spectra=noise_filter and Vrms=noise_amplitude.

Vrmsnumpy.ndarray, optional

List of Vrms values for each antenna. If provided, the spectra will be normalized to these values. Otherwise the normalization of the spectra is used. Default value is None. If only one value is given, all antennas are assumed to have the same Vrms.

initialize_with_data(data, method='using_spectra')[source]

Initialize the likelihood calculator using traces containing noise

Parameters:
datanumpy.ndarray

Array containing traces with noise with dimensions [n_datasets,n_antennas,n_samples] or [n_datasets,n_samples] for one antenna

methodstr, optional

Method to calculate the covariance matrices. If set to “using_spectra”, the spectra are calculated from the data and the covariance matrices are calculated from the spectra. If set to “autocorrelation”, the covariance matrices are calculated directly from the data using numpy.cov()

calculate_delta_llh(data, signal=None, frequency_domain=False)[source]

Calculates delta log likelihood for the datasets relative to the most probable trace

Parameters:
datanumpy.ndarray

Array containing data with dimensions [n_datasets,n_antennas,n_samples] or [n_antennas,n_samples]. For one antenna, the shapes [n_datasets,n_samples] or [n_samples] are also allowed.

signalnumpy.ndarray, optional

Array containing neutrino signal of dimensions [n_antennas,n_samples]. If no signal is provided, it will be set to zeros.

frequency_domainbool, optional

If True, calculate the delta log likelihood in the frequency domain, which is faster.

Returns:
numpy.ndarray

The delta log likelihood for the data relative to the most probable noise

calculate_minus_two_delta_llh(data, signal=None, frequency_domain=True)[source]

Calculates the minus two delta log likelihood for the datasets relative to the most probable noise

Parameters:
datanumpy.ndarray

Array containing data with dimensions [n_datasets,n_antennas,n_samples] or [n_antennas,n_samples]

signalnumpy.ndarray, optional

Array containing neutrino signal signal of dimensions [n_antennas,n_samples]. If no signal is provided, it will be set to zeros.

frequency_domainbool, optional

If True, calculate the delta log likelihood in the frequency domain, which is faster.

Returns:
np.array

Minus two delta log likelihood for the data given the noise spectra

calculate_minus_two_delta_llh_station(station, sim_station, time_grid=None, use_channels=None, frequency_domain=False, plot=True, return_traces=False)[source]

Calculates the minus two delta log likelihood with a NuRadioReco station containing the data channels, i.e., with noise, and a sim station that contains the noiseless traces. This function should correctly loop over the antennas and add together different ray-tracing solutions in the readout window of the data with the desired time offset between the sim noiseless traces and data. By providing a time_grid, the likelihood can be calculated for different time offsets between the data and the signal.

Parameters:
stationNuRadioReco.framework.base_station.Station

Station containing the data channels

sim_stationNuRadioReco.framework.base_station.Station

sim station containing channels for the signal for all antennas and all ray-tracing solutions.

use_channelslist

List of channel ids of station to use in the calculation

time_gridnumpy.ndarray

Array (or single number) containing time offsets between the data and the signal to calculate the likelihood for. The time offset is the time between the start of the data channel of the first antenna and the start of the signal/sim channel (first solution) of the first antenna. If the time_grid is not provided, the function will calculate the likelihood for a coarse time grid and then refine the time grid around the best time offset.

frequency_domainbool, optional

If True, calculate the delta log likelihood in the frequency domain, which is faster.

plotbool, optional

If True, plot the data and signal for each time offset in the time_grid.

return_tracesbool, optional

If True, return the data traces and the signal traces for the best time offset.

Returns:
float

Best minus two delta log likelihood

float

Best time offset

numpy.ndarray

Array containing the minus two delta log likelihood for each time offset in the time_grid

numpy.ndarray, optional

Array containing the data traces. Only returned if return_traces is True.

numpy.ndarray, optional

Array containing the signal traces for the best time offset. Only returned if return_traces is True.

get_minus_two_delta_llh_function(data_to_fit, signal_function, frequency_domain=False)[source]

Convenience function. Returns a function that calculates the minus two delta log likelihood for signal given a dataset with noise. The returned function can be used directly in optimizers like scipy.optimize.minimize.

Parameters:
data_to_fitnumpy.ndarray

Array containing one dataset with dimensions [n_antennas,n_samples]

signal_functionfunction

Function which takes a vector containing parameters as input and returns a signal array with dimensions [n_antennas,n_samples].

frequency_domainbool, optional

If True, calculate the delta log likelihood in the frequency domain, which is faster.

Returns:
minus_two_delta_llh_funcfunction

Function that returns a minus two delta log likelihood for a set of parameters given a data realization and the noise spectra

resample_covariance_matrices(new_sampling_rate)[source]

Resample the covariance matrices to new sampling rate and the same trace duration.

Parameters:
new_sample_ratefloat

New sampling rate in GHz

calculate_fisher_information_matrix(signal_function, parameters_x0, dx, frequency_domain=False, ignore_parameters=[], plot=False)[source]

Calculate Fisher information matrix for a set of parameter values (parameters_x0) which generates a signal using the covariance matrices of the noise.

Parameters:
signal_functionfunction

Function that takes a list of parameter values as input and returns a neutrino signal with dimensions [n_antennas,n_samples]

parameters_x0list

Parameter values for which to calculate the Fisher information matrix

dxlist

Finite differences to use when calculating derivatives of signal_function with respect to the parameters

frequency_domainbool, optional

If True, calculate the Fisher information matrix in the frequency domain.

ignore_parameterslist, optional

List of parameters indicies of signal_function to ignore when calculating the Fisher information matrix, e.g. [2, 5]. The method then returns a lower dimensional (len(parameters_x0) - len(ignore_parameters)) Fisher information matrix.

Returns:
fisher_information_matrixnumpy.array

Fisher information matrix for the parameters given the noise spectra

get_dof()[source]

Get number of degrees of freedom based on the noise spectra and threshold amplitude.

plot_covariance_matrix(cov, plot_range=None, make_new_figure=True)[source]

Plots a covariance matrix

Parameters:
covnumpy.ndarray

Covariance matrix of dimensions [n_samples,n_samples]

plot_rangefloat

Range along x- and y-axes (self.t_array) to plot in units of nanoseconds

make_new_figurebool

If True create a new figure

plot_covariance_matrix_first_row(cov, plot_range=None, linestyle_and_color='auto', make_new_figure=True)[source]

Plots the first row of a covariance matrix

Parameters:
covnumpy.ndarray

Covariance matrix of dimensions [n_samples,n_samples]

plot_rangefloat

Range along x-axis (self.t_array) to plot in units of nanoseconds

linestyle_and_colorstr, optional

String specifying linestyle and color, e.f. “k-” for black solid line, or “b–” for blue dashed line. If set to “auto”, matplotlib will set the color and style.

make_new_figurebool

If True create a new figure

plot_llh_distribution(data, n_dof=None, signal=None, frequency_domain=False, make_new_figure=True)[source]

Calculate the llh values for many datasets and plot the distribution alongside a chi2 distribution with dof equal to the degrees of freedom of the covariance matrices

Parameters:
datanumpy.ndarray

Array containing data with dimensions [n_samples]

n_dofint, optional

Number of degrees of freedom for the chi2 distribution. If not provided, it is set equal to the number of samples

frequency_domainbool, optional

If True, calculate the delta log likelihood in the frequency domain, which is faster.

make_new_figurebool

If True create a new figure