Data Structure

Note

This page explains the structure of NuRadioReco .nur files, which contain event-level data from simulation or reconstruction. For more information on the simpler .hdf5 files produced in a NuRadioMC simulation, see the page on HDF5 structure.

.nur Files and How to Use Them

Philosophy and Basic Structure

NuRadioReco comes with its own input and output format, called .nur. With the obvious exception of reading in data from other file formats, like CoREAS or the SnowShovel format of the ARIANNA experiment, every processing step in an event reconstruction is done in this format. The big advantage of this is that the process can be interrupted at any point and the current state of the event data can be saved. This makes it easy to split a reconstruction into several steps and to check the state of the data structure after every step.

[Figure: event structure (event_structure.png)]

A NuRadioReco event is organized hierarchically, with an Event object at the top. Elements further down the hierarchy can be accessed via get functions or iterators from their parent object. For example, accessing the traces of a station’s channels would work like this:

# get station with ID 42
station = event.get_station(42)
# iterate over all channels in station
for channel in station.iter_channels():
    trace = channel.get_trace()

Reading and Writing .nur Files

Reading and writing .nur files is done by dedicated IO modules. Events are written by the eventWriter module. To save disk space, it offers the option of not storing channel and electric field traces, in case only the higher-level parameters are needed. It is also possible to write the detector description into a .nur file.

import NuRadioReco.modules.io.eventWriter

event_writer = NuRadioReco.modules.io.eventWriter.eventWriter()
event_writer.begin('output_filename.nur')
event_writer.run(event, mode='full')

To read .nur files, two different modules can be used: NuRadioRecoio is a general-purpose reader that provides different ways to access events, e.g. by ID or by event number. The eventReader is a more streamlined wrapper around NuRadioRecoio that provides an iterator over all events. Both modules provide a way to read the detector description from a .nur file.

import NuRadioReco.modules.io.NuRadioRecoio
nuradioreco_io = NuRadioReco.modules.io.NuRadioRecoio.NuRadioRecoio(['path/to/file', 'path/to/other/file'])
# get event with run number 0 and event ID 5
event_1 = nuradioreco_io.get_event([0, 5])
# get second event in files (counting starts at 0)
event_2 = nuradioreco_io.get_event_i(1)
# iterate over all events
for event in nuradioreco_io.get_events():
    station = event.get_station(42)

import NuRadioReco.modules.io.eventReader
event_reader = NuRadioReco.modules.io.eventReader.eventReader()
event_reader.begin(['path/to/file', 'path/to/other/file'])
# iterate over events
for event in event_reader.run():
  station = event.get_station(42)

Additionally, .nur files store higher-level parameters in their headers, which makes them easily accessible for all events in a file. For example, if one wanted to make a histogram of the zenith angles in a given file, it would work like this:

import matplotlib.pyplot as plt
from NuRadioReco.framework.parameters import stationParameters as stnp
from NuRadioReco.utilities import units
import NuRadioReco.modules.io.NuRadioRecoio
nuradioreco_io = NuRadioReco.modules.io.NuRadioRecoio.NuRadioRecoio(['path/to/file'])
header = nuradioreco_io.get_header()
station_id = 42
zeniths = header[station_id][stnp.zenith]
plt.hist(zeniths/units.deg)
plt.show()

Internally, writing and reading .nur files works as follows: every class in the framework has a serialize function that writes all information stored in the object into a pickle, and a deserialize function that restores the object's state from such a pickle. To write an event to disk, each object calls the serialize function on its child objects, stores the pickles they return and then serializes itself. The resulting pickle can then be written to disk. To read a .nur file, the same is done in reverse, with each object calling the deserialize function on its children. Thanks to this implementation, the framework is easy to extend: all that has to be done for a new class is to define its serialize and deserialize functions and to adjust those of the parent object.
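The nesting of pickles described above can be illustrated with a minimal sketch. SketchEvent and SketchStation are hypothetical stand-ins written only for this illustration; they are not the NuRadioReco classes, and the actual content and layout of the pickles in the framework may differ.

```python
import pickle


class SketchStation:
    """Hypothetical stand-in for a child object (not the real Station class)."""

    def __init__(self, station_id=None):
        self.station_id = station_id
        self.parameters = {}

    def serialize(self):
        # Pack everything this object owns into a single pickle (bytes).
        return pickle.dumps({'station_id': self.station_id,
                             'parameters': self.parameters})

    def deserialize(self, data_pkl):
        # Restore the state from a pickle produced by serialize().
        data = pickle.loads(data_pkl)
        self.station_id = data['station_id']
        self.parameters = data['parameters']


class SketchEvent:
    """Hypothetical stand-in for the top-level object (not the real Event class)."""

    def __init__(self, run_number=None, event_id=None):
        self.run_number = run_number
        self.event_id = event_id
        self.stations = []

    def serialize(self):
        # First collect the pickles of all children, then pickle this object.
        stations_pkl = [station.serialize() for station in self.stations]
        return pickle.dumps({'run_number': self.run_number,
                             'event_id': self.event_id,
                             'stations': stations_pkl})

    def deserialize(self, data_pkl):
        # Reverse order: unpack this object, then let each child unpack itself.
        data = pickle.loads(data_pkl)
        self.run_number = data['run_number']
        self.event_id = data['event_id']
        self.stations = []
        for station_pkl in data['stations']:
            station = SketchStation()
            station.deserialize(station_pkl)
            self.stations.append(station)


event = SketchEvent(run_number=0, event_id=5)
station = SketchStation(station_id=42)
station.parameters['zenith'] = 0.785
event.stations.append(station)

event_pkl = event.serialize()  # bytes that could now be written to disk
restored = SketchEvent()
restored.deserialize(event_pkl)
print(restored.event_id, restored.stations[0].station_id)
```

In this sketch, supporting a new kind of child object would mean giving it its own serialize and deserialize functions and adding one entry to the dictionary handled by SketchEvent.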

Parameter Storage

NuRadioReco offers a flexible way to store properties in the data structure via parameter storage. Certain classes (Particle, Station, SimStation, Channel, ElectricField, RadioShower and HybridShower) provide get_parameter and set_parameter functions that allow parameters to be stored in those objects along with their uncertainties and correlations to other parameters. The available parameters are defined in enumerated types (enums), so a new parameter only needs to be added to the corresponding list of parameters.

For Developers

New parameters should always be added to the bottom of the list. Do not re-use old Enums! A description should be added to each new parameter with a comment docstring starting with #:.

Additionally, parameters can be written and accessed via indexing, like one would do to a dictionary:

from NuRadioReco.framework.parameters import stationParameters as stnp
from NuRadioReco.utilities import units

# both ways to set the parameter are equivalent
station.set_parameter(stnp.cr_zenith, 45 * units.deg)
station[stnp.cr_zenith] = 45 * units.deg
# set parameter uncertainty
station.set_parameter_error(stnp.cr_zenith, 2 * units.deg)
# 2 ways of accessing parameters:
zenith = station.get_parameter(stnp.cr_zenith)
zenith = station[stnp.cr_zenith]
# get parameter uncertainty
zenith_uncertainty = station.get_parameter_error(stnp.cr_zenith)
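To make the mechanism more concrete, the following is a minimal sketch of how an enum-keyed parameter storage with dictionary-style access could be built. The names sketchParameters and SketchParameterStorage, as well as the enum members and their values, are assumptions made for this illustration; they are not taken from the NuRadioReco source, whose implementation may differ.

```python
from enum import Enum


class sketchParameters(Enum):
    """Hypothetical parameter list; names and values are illustrative only."""
    zenith = 1      #: zenith angle of the incoming signal direction
    azimuth = 2     #: azimuth angle of the incoming signal direction
    # New parameters go at the bottom, with a value that was never used before.
    cr_zenith = 3   #: zenith angle of a cosmic-ray shower


class SketchParameterStorage:
    """Stand-in for an object with parameter storage (not a NuRadioReco class)."""

    def __init__(self):
        self._parameters = {}
        self._parameter_errors = {}

    @staticmethod
    def _check_key(key):
        # Only members of the enum are accepted as keys.
        if not isinstance(key, sketchParameters):
            raise ValueError("parameter key needs to be of type sketchParameters")

    def set_parameter(self, key, value):
        self._check_key(key)
        self._parameters[key] = value

    def get_parameter(self, key):
        self._check_key(key)
        return self._parameters[key]

    def set_parameter_error(self, key, value):
        self._check_key(key)
        self._parameter_errors[key] = value

    def get_parameter_error(self, key):
        self._check_key(key)
        return self._parameter_errors[key]

    # Dictionary-style access simply forwards to the getter and setter.
    def __setitem__(self, key, value):
        self.set_parameter(key, value)

    def __getitem__(self, key):
        return self.get_parameter(key)


station = SketchParameterStorage()
station[sketchParameters.cr_zenith] = 0.5
station.set_parameter_error(sketchParameters.cr_zenith, 0.05)
print(station.get_parameter(sketchParameters.cr_zenith))
```

Restricting keys to enum members means that a misspelled parameter name fails immediately instead of silently creating a new entry.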

List of Data Classes

Event

The Event is the upper-most element of the event structure and holds all simulated and reconstructed showers and stations as well as the event ID and run number.

Radio Shower

A Radio Shower is used to hold reconstructed shower parameters via the parameter storage. It should only be used for properties reconstructed from the radio signal. For properties from a simulated shower or reconstructed from another detector, the SimShower or HybridShower should be used, respectively.

It can be accessed by the get_showers and get_first_shower methods of the Event class.

SimShower

A SimShower is used to hold parameters of simulated showers via the parameter storage. It is the same class as RadioShower, but SimShowers are stored separately to distinguish between simulated and reconstructed properties.

It can be accessed by the get_sim_showers method of the Event class.

SimEmitter

The SimEmitter class is used to hold parameters of simulated emitters via the parameter storage. The concept is similar to the SimShower, but it is used when NuRadioMC simulates emitters rather than particle showers.

It can be accessed by the get_sim_emitters method of the Event class. Multiple emitters per event are allowed, analogous to the multiple showers per event.

Particle

The Particle class stores information related to the particle that initiated the radio emission, such as flavour, energy and direction. A single Event may contain multiple particles, e.g. in the case of tau regeneration.

Station

A Station is used to hold event properties reconstructed at the station level, i.e. reconstructed from the data of a single station.

It can be accessed by the get_station and get_stations methods of the Event class.

Trigger

The Trigger contains information about the station trigger: trigger type, threshold, trigger time and whether the trigger condition was satisfied.

SimStation

A SimStation can hold the same properties as the Station (and inherits from it), but is used for the MC truth of the simulation. This also implies that events from measured data typically do not have a SimStation.

It can be accessed by the get_sim_station method of the Station class.

BaseTrace

The BaseTrace class is used to store waveforms, both for voltages in the channels and for electric fields. Internally, traces are stored in the time domain, where they can be accessed via the get_trace and set_trace methods. It is also possible to access the waveform in the frequency domain via the get_frequency_spectrum and set_frequency_spectrum methods. In that case, the Fourier transformation is done automatically by the BaseTrace. The times and frequencies corresponding to the waveforms returned by the get_trace and get_frequency_spectrum methods can be accessed via the get_times and get_frequencies methods. The times are defined relative to the time of the parent Station and can be changed using the set_trace_start_time method, which sets the starting time of the trace.
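The relation between the stored samples, the time axis and the frequency axis can be sketched as follows. SketchTrace is a hypothetical stand-in and not the BaseTrace implementation; in particular, it uses a plain, unnormalised discrete Fourier transform and unitless numbers, whereas the normalisation and unit conventions used by NuRadioReco are not reproduced here.

```python
import cmath


class SketchTrace:
    """Hypothetical stand-in for a trace; not the BaseTrace implementation."""

    def __init__(self, samples, sampling_rate, trace_start_time=0.0):
        self._samples = list(samples)  # stored in the time domain
        self._sampling_rate = sampling_rate
        self._trace_start_time = trace_start_time

    def get_trace(self):
        return list(self._samples)

    def set_trace_start_time(self, start_time):
        self._trace_start_time = start_time

    def get_times(self):
        # Sample times, shifted by the trace start time.
        return [self._trace_start_time + i / self._sampling_rate
                for i in range(len(self._samples))]

    def get_frequencies(self):
        # Non-negative frequencies of a real-valued trace with n samples.
        n = len(self._samples)
        return [k * self._sampling_rate / n for k in range(n // 2 + 1)]

    def get_frequency_spectrum(self):
        # Unnormalised DFT, computed on demand from the stored samples.
        n = len(self._samples)
        return [sum(x * cmath.exp(-2j * cmath.pi * k * i / n)
                    for i, x in enumerate(self._samples))
                for k in range(n // 2 + 1)]


trace = SketchTrace([0.0, 1.0, 0.0, -1.0], sampling_rate=2.0)
trace.set_trace_start_time(10.0)
times = trace.get_times()
frequencies = trace.get_frequencies()
spectrum = trace.get_frequency_spectrum()
print(times)
print(frequencies)
```

Changing the start time only shifts the time axis; the samples and the frequency axis stay the same.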

The add operator (+) is defined for 2 BaseTrace objects. It will return a new BaseTrace object containing the sum of both traces. The length of the new trace is chosen so that it is long enough to contain both traces. If the traces have different sampling rates, the one with the lower sampling rate will be upsampled to match the other one. Since this property is inherited, + is defined for both channels and electric fields.
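How the length of the summed trace follows from the two inputs can be shown with a simplified sketch. SketchAddableTrace is a hypothetical stand-in, not the BaseTrace class. The sketch assumes that both traces share the same sampling rate and that their start times fall on a common sample grid, so the upsampling step mentioned above is deliberately left out.

```python
class SketchAddableTrace:
    """Hypothetical stand-in for a trace; not the BaseTrace implementation."""

    def __init__(self, samples, sampling_rate, trace_start_time=0.0):
        self.samples = list(samples)
        self.sampling_rate = sampling_rate
        self.trace_start_time = trace_start_time

    def get_end_time(self):
        return self.trace_start_time + len(self.samples) / self.sampling_rate

    def __add__(self, other):
        # Assumption: equal sampling rates; resampling is not part of this sketch.
        if self.sampling_rate != other.sampling_rate:
            raise NotImplementedError("this sketch does not resample traces")
        rate = self.sampling_rate
        # The sum starts at the earlier start time and ends at the later end time.
        start = min(self.trace_start_time, other.trace_start_time)
        end = max(self.get_end_time(), other.get_end_time())
        summed = [0.0] * round((end - start) * rate)
        for trace in (self, other):
            # Assumption: start times fall on a common sample grid.
            offset = round((trace.trace_start_time - start) * rate)
            for i, value in enumerate(trace.samples):
                summed[offset + i] += value
        return SketchAddableTrace(summed, rate, start)


trace_a = SketchAddableTrace([1.0, 1.0, 1.0], sampling_rate=1.0, trace_start_time=0.0)
trace_b = SketchAddableTrace([2.0, 2.0], sampling_rate=1.0, trace_start_time=2.0)
trace_sum = trace_a + trace_b
print(trace_sum.samples, trace_sum.trace_start_time)
```

Here the first trace covers the times 0 to 3 and the second the times 2 to 4, so the sum has four samples and the two traces overlap in one of them.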

The BaseTrace class is not used by itself, but serves as the parent class for both the Channel and ElectricField classes.

Electric Field

The ElectricField is used to store information about electric fields, which can be accessed via the parameter storage and methods inherited from the BaseTrace class.

Since radio stations for neutrino detection are often so spread out that the electric field is not the same at all channels, each electric field is associated with one or more channels, whose IDs have to be passed to the constructor and can be accessed by the get_channel_ids method. An ElectricField can be associated with several channels because typically multiple channels are used to reconstruct it. Conversely, multiple ElectricField objects may be associated with the same channel, since pulses may reach a channel via different paths through the ice. To avoid ambiguity, the ElectricField also has a position (accessed via get_position) relative to the station.

A Station's or SimStation's ElectricField objects can be accessed via the get_electric_fields method or the get_electric_fields_for_channels method, which allows filtering by channel IDs and ray path types.
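The kind of filtering described here can be sketched with stand-in objects. SketchElectricField and electric_fields_for_channels are hypothetical names written for this illustration and are not the NuRadioReco API. The sketch assumes that an electric field matches when it is associated with all of the requested channel IDs, and the ray path labels are illustrative strings; the exact matching rule in the framework may differ.

```python
from dataclasses import dataclass, field


@dataclass
class SketchElectricField:
    """Hypothetical stand-in for an electric field object."""
    channel_ids: list
    ray_path_type: str
    position: tuple = field(default=(0.0, 0.0, 0.0))  # relative to the station


def electric_fields_for_channels(electric_fields, channel_ids=None, ray_path_type=None):
    """Yield the fields that match the requested channels and ray path type.

    A filter that is left at None is not applied.
    """
    for efield in electric_fields:
        # Assumption: all requested channel IDs must belong to the field.
        if channel_ids is not None and not set(channel_ids) <= set(efield.channel_ids):
            continue
        if ray_path_type is not None and efield.ray_path_type != ray_path_type:
            continue
        yield efield


fields = [
    SketchElectricField(channel_ids=[0, 1], ray_path_type='direct'),
    SketchElectricField(channel_ids=[0, 1], ray_path_type='reflected'),
    SketchElectricField(channel_ids=[2], ray_path_type='direct'),
]

# Two fields reach channel 0 via different paths through the ice.
fields_channel_0 = list(electric_fields_for_channels(fields, channel_ids=[0]))
direct_channel_0 = list(electric_fields_for_channels(fields, channel_ids=[0],
                                                     ray_path_type='direct'))
print(len(fields_channel_0), len(direct_channel_0))
```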

Channel

The Channel is used to store information about the voltage traces recorded in a channel, which can be accessed via the parameter storage and methods inherited from the BaseTrace class.

Hybrid Information

As many radio detectors are built as part of a hybrid detector whose data may be used in the radio event reconstruction, a way to make this data accessible in NuRadioReco is needed. The HybridInformation class provides this functionality and keeps the information from the other detectors separate from the radio part to avoid confusion. Despite its name, it does not hold any data from the other detectors itself, but offers access to HybridShower objects in which this data is stored. For each additional detector (or set of detector data), a HybridShower object can be added via the add_hybrid_shower method or accessed via the get_hybrid_shower or get_hybrid_showers methods.

It can be accessed via the get_hybrid_information method of the Event class.

Hybrid Shower

The HybridShower is used to store information about a shower that was reconstructed with a complementary detector, mainly via the parameter storage.

It can be accessed via the get_hybrid_shower and get_hybrid_showers methods of the HybridInformation class.

Hybrid Detector

A HybridDetector can be used to store more detailed and experiment-specific information about a complementary detector. The diversity of hybrid radio detectors makes it impractical to provide this functionality inside NuRadioReco itself, but a custom HybridDetector class can be implemented inside an independent repository. This class can be slotted into the data structure via the set_hybrid_detector method of the HybridShower class and accessed via its get_hybrid_detector method.

A HybridDetector class is required to have a constructor that does not accept any parameters, as well as serialize and deserialize functions equivalent to those of the other framework elements.

An example for the implementation of a custom HybridDetector can be found in the NuRadioReco/example folder.
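Independently of that example, the requirements listed above can be sketched as follows. SketchHybridDetector and all of its attribute and method names other than serialize and deserialize are made up for this illustration; the sketch only shows a parameter-free constructor and a pickle round trip, and it is not taken from the NuRadioReco repository.

```python
import pickle


class SketchHybridDetector:
    """Hypothetical custom detector class; attribute names are made up."""

    def __init__(self):
        # The constructor takes no parameters, so an empty object can be
        # created first and filled afterwards by deserialize().
        self._detector_name = None
        self._signals = {}

    def set_detector_name(self, name):
        self._detector_name = name

    def get_detector_name(self):
        return self._detector_name

    def set_signal(self, detector_id, signal):
        self._signals[detector_id] = signal

    def get_signal(self, detector_id):
        return self._signals[detector_id]

    def serialize(self):
        # Write all stored information into a single pickle.
        return pickle.dumps({'detector_name': self._detector_name,
                             'signals': self._signals})

    def deserialize(self, data_pkl):
        # Restore the stored information from such a pickle.
        data = pickle.loads(data_pkl)
        self._detector_name = data['detector_name']
        self._signals = data['signals']


detector = SketchHybridDetector()
detector.set_detector_name('scintillator array')
detector.set_signal(7, 12.5)

detector_pkl = detector.serialize()
restored_detector = SketchHybridDetector()
restored_detector.deserialize(detector_pkl)
print(restored_detector.get_detector_name(), restored_detector.get_signal(7))
```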