[1]:
# %load_ext autoreload
# %autoreload 2
# %matplotlib notebook
import numpy as np
from stingray.pulse.search import epoch_folding_search, z_n_search
import matplotlib.pyplot as plt
import seaborn as sb
import matplotlib as mpl
mpl.rcParams['figure.figsize'] = (10, 6)
Simulate a dataset¶
Let us simulate a pulsar: we create a sinusoidal light curve and use Stingray’s event simulator (in Eventlist.simulate_times
) to simulate an event list with that light curve.
[2]:
def sinusoid(times, frequency, baseline, amplitude, phase):
return baseline + amplitude * np.sin(2 * np.pi * (frequency * times + phase))
[3]:
from stingray import Lightcurve
period = 1.203501
mean_countrate = 50
pulsed_fraction = 0.2
bin_time = 0.01
obs_length = 3000
t = np.arange(0, obs_length, bin_time)
# The continuous light curve
counts = sinusoid(t, 1 / period, mean_countrate,
0.5 * mean_countrate * pulsed_fraction, 0) * bin_time
lc = Lightcurve(t, counts, gti=[[-bin_time / 2, obs_length + bin_time / 2]],
dt=bin_time)
[4]:
from stingray.events import EventList
# use the light curve above to simulate an event list for this pulsar.
events = EventList()
events.simulate_times(lc)
Pulsation search with epoch folding.¶
Let us assume we have already an estimate of the pulse period, for example because we found a candidate in the power density spectrum with a period of ~1.2. We search around that period with the epoch folding.
Epoch folding consists of cutting the light curve at every pulse period and summing up all the intervals obtained in this way. We get an average pulse profile. In this example, where the pulse was plotted twice for visual clarity. If the candidate pulse frequency was even slightly incorrect, we would have obtained a much shallower pulse profile, or no pulse profile at all.
[5]:
from stingray.pulse.pulsar import fold_events
from stingray.pulse.search import plot_profile
nbin = 32
ph, profile, profile_err = fold_events(events.time, 1/period, nbin=nbin)
_ = plot_profile(ph, profile)
ph, profile, profile_err = fold_events(events.time, 1/1.1, nbin=nbin)
_ = plot_profile(ph, profile)
Therefore, typically we try a number of frequencies around the candidate we found with the power spectrum or other means, and search for the frequency that gives the “best” pulsed profile. How do we evaluate this best frequency? We use the chi squared statistics.
We use a flat pulsed profile (no pulsation) as model, and we calculate the chi square of the actual pulsed profile with respect to this flat model:
If there is no pulsation, the chi squared will assume a random value distributed around the number of degrees of freedom \(n - 1\) (where \(n\) is the number of bins in the profile) with a well defined statistical distribution (\(\chi^2_{n - 1}\)). If there is pulsation, the value will be much larger. Stingray has a function that does this: stingray.pulse.search.epoch_folding_search
.
For the frequency resolution of the periodogram, one usually chooses at least the same frequency resolution of the FFT, i. e., \(df_{\rm min}=1/(t_1 - t_0)\). In most cases, a certain degree of oversampling is used.
[6]:
# We will search for pulsations over a range of frequencies around the known pulsation period.
df_min = 1/obs_length
oversampling=15
df = df_min / oversampling
frequencies = np.arange(1/period - 200 * df, 1/period + 200 * df, df)
freq, efstat = epoch_folding_search(events.time, frequencies, nbin=nbin)
# ---- PLOTTING --------
plt.figure()
plt.plot(freq, efstat, label='EF statistics')
plt.axhline(nbin - 1, ls='--', lw=3, color='k', label='n - 1')
plt.axvline(1/period, lw=3, alpha=0.5, color='r', label='Correct frequency')
plt.xlabel('Frequency (Hz)')
plt.ylabel('EF Statistics')
_ = plt.legend()
A peak is definitely there. Far from the peak, the periodogram follows approximately a :math:`chi^2` distribution with :math:`n - 1` degrees of freedom, where \(n\) is the number of bins in the pulse profile used to calculate the statistics. In fact, its mean is \(n-1\) as shown in the figure.
But close to the correct frequency, as described in Leahy et al. 1983, 1987 the peak in the epoch folding periodogram has the shape of a sinc squared function (whose secondary lobes are in this case barely visible above noise).
Z-squared search¶
The epoch folding statistics has no information on the actual shape of the profile.
A better method is the :math:`Z^2` statistics (Buccheri et al. 1983), which is conceptually similar to the Epoch folding but has high values when the signal is well described by a small number of sinusoidal harmonics.
\(Z^2_n = \dfrac{2}{N} \sum_{k=1}^n \left[{\left(\sum_{j=1}^N \cos k \phi_j\right)}^2 + {\left(\sum_{j=1}^N \sin k \phi_j\right)}^2\right]\)
Where \(N\) is the number of photons, \(n\) is the number of harmonics, \(\phi_j\) are the phases corresponding to the event arrival times \(t_j\) (\(\phi_j = \nu t_j\), where \(\nu\) is the pulse frequency).
The \(Z_n^2\) statistics defined in this way, far from the pulsed profile, follows a \(\chi^2_n\) distribution, where \(n\) is the number of harmonics this time.
Stingray implements the \(Z\) search in stingray.pulse.search.z_n_search
. The standard \(Z^2\) search calculates the phase of each photon and calculates the sinusoidal functions above for each photon. This is very computationally expensive if the number of photons is high. Therefore, in Stingray, the search is performed by binning the pulse profile first and using the phases of the folded profile in the formula above, multiplying the squared sinusoids of the phases of the pulse
profile by a weight corresponding to the number of photons at each phase.
\(Z^2_n = \dfrac{2}{\sum_j{w_j}} \sum_{k=1}^n \left[{\left(\sum_{j=1}^m w_j \cos k \phi_j\right)}^2 + {\left(\sum_{j=1}^m w_j \sin k \phi_j\right)}^2\right]\)
Since the sinusoids are only executed on a small number of bins, while the epoch folding procedure just consists of a very fast histogram-like operation, the speedup of this new formula is obvious. Care must be put into the choice of the number of bins, in order to maintain a good approximation even when the number of harmonics is high. As a rule of thumb, use a number of bins at least 10 times larger than the number of harmonics.
[7]:
# We will search for pulsations over a range of frequencies around the known pulsation period.
nharm = 1
freq, zstat = z_n_search(events.time, frequencies, nbin=nbin, nharm=nharm)
# ---- PLOTTING --------
plt.figure()
plt.plot(freq, (zstat - nharm), label='$Z_2$ statistics')
plt.plot(freq, efstat - nbin + 1, color='gray', label='EF statistics', alpha=0.5)
plt.axvline(1/period, color='r', lw=3, alpha=0.5, label='Correct frequency')
plt.xlim([frequencies[0], frequencies[-1]])
plt.xlabel('Frequency (Hz)')
plt.ylabel('Statistics - d.o.f.')
plt.legend()
plt.figure(figsize=(15, 5))
plt.plot(freq, (zstat - nharm), label='$Z_2$ statistics')
plt.plot(freq, efstat - nbin + 1, color='gray', label='EF statistics', alpha=0.5)
plt.axvline(1/period, color='r', lw=3, alpha=0.5, label='Correct frequency')
plt.xlabel('Frequency (Hz)')
plt.ylabel('Statistics - d.o.f. (Zoom)')
plt.ylim([-15, 15])
_ = plt.xlim([frequencies[0], frequencies[-1]])
Thresholding¶
When can a peak in the EF or \(Z_n^2\) periodogram be considered a pulsation?
Since both the EF and \(Z_n^2\) of noise follow precise statistical distributions (\(\chi^2_{\rm nbin}\) in one case, \(\chi^2_n\) in the other), we can use the inverse survival functions of these statistical distributions to find the peaks that are not expected by noise.
In Stingray, the thresholds are defined in stingray.stats.fold_detection_level
and stingray.stats.z2_n_detection_level
respectively.
The ntrial
parameter should be set to an estimate of the statistically independent frequencies in the periodogram. A good estimate can be
, where \(f_{\rm min}\) and \(f_{\rm max}\) are the maximum and minimum frequencies of the periodogram, \(df_{\rm min}\) was defined above and \(t_0\) ans \(t_1\) the start and end of the observation.
Moreover, the stingray.pulse.search.search_best_peaks
helps finding the best value for nearby candidates.
[8]:
from stingray.pulse.search import search_best_peaks
from stingray.stats import fold_detection_level, z2_n_detection_level
ntrial = (frequencies[-1] - frequencies[0]) / df_min
z_detlev = z2_n_detection_level(n=1, epsilon=0.001, ntrial=len(freq))
ef_detlev = fold_detection_level(nbin, epsilon=0.001, ntrial=len(freq))
cand_freqs_ef, cand_stat_ef = search_best_peaks(freq, efstat, ef_detlev)
cand_freqs_z, cand_stat_z = search_best_peaks(freq, zstat, z_detlev)
# ---- PLOTTING --------
plt.figure()
plt.axhline(z_detlev - nharm, label='$Z^2_1$ det. lev.')
plt.axhline(ef_detlev - nbin + 1, label='EF det. lev.', color='gray')
plt.plot(freq, (zstat - nharm), label='$Z^2_1$ statistics')
plt.plot(freq, efstat - nbin + 1, color='gray', label='EF statistics', alpha=0.5)
for c in cand_freqs_ef:
plt.axvline(c, ls='-.', label='EF Candidate', zorder=10)
for c in cand_freqs_z:
plt.axvline(c, ls='--', label='$Z^2_1$ Candidate', zorder=10)
plt.axvline(1/period, color='r', lw=3, alpha=0.5, label='Correct frequency')
plt.xlim([frequencies[0], frequencies[-1]])
plt.xlabel('Frequency (Hz)')
plt.ylabel('Statistics - d.o.f.')
plt.legend()
plt.figure(figsize=(15, 5))
plt.plot(freq, (zstat - nharm), label='$Z_2$ statistics')
plt.plot(freq, efstat - nbin + 1, color='gray', label='EF statistics', alpha=0.5)
plt.axvline(1/period, color='r', lw=3, alpha=0.5, label='Correct frequency')
plt.axhline(z_detlev - nharm, label='$Z^2_1$ det. lev.', zorder=10)
plt.axhline(ef_detlev - nbin + 1, label='EF det. lev.', color='gray', zorder=10)
for c in cand_freqs_ef:
plt.axvline(c, ls='-.', label='EF Candidate', color='gray', zorder=10)
for c in cand_freqs_z:
plt.axvline(c, ls='--', label='$Z^2_1$ Candidate', zorder=10)
plt.xlabel('Frequency (Hz)')
plt.ylabel('Statistics - d.o.f. (Zoom)')
plt.ylim([-15, ef_detlev - nbin + 3])
_ = plt.xlim([frequencies[0], frequencies[-1]])
Note that the side lobes of the sinc squared-like shape are producing spurious candidates here. For now, we do not have a method to eliminate these fairly obvious patterns, but it will be implemented in future releases
Fit peak with Sinc-squared and Gaussian functions¶
stingray.pulse.modeling.fit_sinc
.if
obs_length
is the length of the observation. If it is defined, it fixes width to \(1/(\pi*obs length)\), as expected from epoch folding periodograms. The other two free parameters areamplitude
andmean
.if it is not defined, the
width
parameter can be used.
On the other hand, if the pulse frequency varies slightly, the peak oscillate and the integrated profile is a bell-shaped function. We can fit it with a Gaussian function (stingray.pulse.modeling.fit_gaussian
) with the standard parameters: amplitude
, mean
, stddev
.
We also provide the user with the constrains fixed
, tied
, bounds
, in order to fix, link and/or constrain parameters.
[19]:
from stingray.pulse.modeling import fit_sinc
fs=fit_sinc(freq, efstat-(nbin-1),amp=max(efstat-(nbin-1)), mean=cand_freqs_ef[0],
obs_length=obs_length)
[10]:
# ---- PLOTTING --------
plt.figure()
plt.plot(freq, efstat-(nbin-1), label='EF statistics')
plt.plot(freq, fs(freq), label='Best fit')
plt.axvline(1/period, lw=3, alpha=0.5, color='r', label='Correct frequency')
plt.axvline(fs.mean[0], label='Fit frequency')
plt.xlabel('Frequency (Hz)')
plt.ylabel('EF Statistics')
plt.legend()
plt.figure(figsize=(15, 5))
plt.plot(freq, efstat-(nbin-1)-fs(freq))
plt.xlabel('Frequency (Hz)')
_ = plt.ylabel('Residuals')
On the other hand, if we want to fit with a Gaussian:
[11]:
from stingray.pulse.modeling import fit_gaussian
fg=fit_gaussian(freq, efstat-(nbin-1),amplitude=max(efstat-(nbin-1)),
mean=cand_freqs_ef[0], stddev=1/(np.pi*obs_length))
[12]:
# ---- PLOTTING --------
plt.figure()
plt.plot(freq, efstat-(nbin-1), label='EF statistics')
plt.plot(freq, fg(freq), label='Best fit')
plt.axvline(1/period, alpha=0.5, color='r', label='Correct frequency')
plt.axvline(fg.mean[0], alpha=0.5, label='Fit frequency')
plt.xlabel('Frequency (Hz)')
plt.ylabel('EF Statistics')
plt.legend()
plt.figure(figsize=(15, 5))
plt.plot(freq, efstat-(nbin-1)-fg(freq))
plt.xlabel('Frequency (Hz)')
_ = plt.ylabel('Residuals')
Phaseogram¶
Let us now calculate the phaseogram and plot it with the pulse profile. We do that with the functions phaseogram
, plot_profile
and plot_phaseogram
from stingray.pulse.search
[13]:
from stingray.pulse.search import phaseogram, plot_phaseogram, plot_profile
from matplotlib.gridspec import GridSpec
# Calculate the phaseogram
phaseogr, phases, times, additional_info = \
phaseogram(events.time, cand_freqs_ef[0], return_plot=True, nph=nbin, nt=32)
# ---- PLOTTING --------
# Plot on a grid
plt.figure(figsize=(15, 15))
gs = GridSpec(2, 1, height_ratios=(1, 3))
ax0 = plt.subplot(gs[0])
ax1 = plt.subplot(gs[1], sharex=ax0)
mean_phases = (phases[:-1] + phases[1:]) / 2
plot_profile(mean_phases, np.sum(phaseogr, axis=1), ax=ax0)
# Note that we can pass arguments to plt.pcolormesh, in this case vmin
_ = plot_phaseogram(phaseogr, phases, times, ax=ax1, vmin=np.median(phaseogr))
Examples of interactive phaseograms¶
First: shift the rows of the phaseogram interactively¶
[14]:
def shift_phaseogram(phaseogr, tseg, delay_fun):
"""Shift the phaseogram rows according to an input delay function.
Parameters
----------
phaseogr : 2-d array
The phaseogram, as returned by ``phaseogram``
freq : float
The pulse frequency
tseg : float
The integration time for each row of the phaseogram
delay_fun : function
Function that gives the delay (in seconds) for each row of the
phaseogram
Returns
-------
phaseogram_new : 2-d array
The shifted phaseogram
"""
# Assume that the phaseogram is repeated twice in phase
nbin = phaseogr.shape[0] / 2
ntimes = phaseogr.shape[1]
times = np.arange(0, tseg * ntimes, tseg)
phase_delays = delay_fun(times) # This gives the delay in units of time!
delayed_bins = np.array(np.rint(phase_delays * nbin), dtype=int)
phaseogram_new = np.copy(phaseogr)
for i in range(ntimes):
phaseogram_new[:, i] = np.roll(phaseogram_new[:, i],
delayed_bins[i])
return phaseogram_new
def interactive_phaseogram(phas, binx, biny, df=0, dfdot=0):
import matplotlib.pyplot as plt
from matplotlib.widgets import Slider, Button, RadioButtons
fig, ax = plt.subplots()
plt.subplots_adjust(left=0.25, bottom=0.30)
tseg = np.median(np.diff(biny))
tobs = tseg * phas.shape[0]
delta_df_start = 2 / tobs
df_order_of_mag = int(np.log10(delta_df_start))
delta_df = delta_df_start / 10 ** df_order_of_mag
delta_dfdot_start = 8 / tobs ** 2
dfdot_order_of_mag = int(np.log10(delta_dfdot_start))
delta_dfdot = delta_dfdot_start / 10 ** dfdot_order_of_mag
pcolor = plt.pcolormesh(binx, biny, phas.T, cmap='magma')
l, = plt.plot(np.ones_like(biny), biny, zorder=10, lw=2, color='w')
plt.xlabel('Phase')
plt.ylabel('Times')
plt.colorbar()
axcolor = 'lightgoldenrodyellow'
axfreq = plt.axes([0.25, 0.1, 0.5, 0.03], facecolor=axcolor)
axfdot = plt.axes([0.25, 0.15, 0.5, 0.03], facecolor=axcolor)
axpepoch = plt.axes([0.25, 0.2, 0.5, 0.03], facecolor=axcolor)
sfreq = Slider(axfreq, 'Delta freq x$10^{}$'.format(df_order_of_mag),
-delta_df, delta_df, valinit=df)
sfdot = Slider(axfdot, 'Delta fdot x$10^{}$'.format(dfdot_order_of_mag),
-delta_dfdot, delta_dfdot, valinit=dfdot)
spepoch = Slider(axpepoch, 'Delta pepoch',
0, biny[-1] - biny[0], valinit=0)
def update(val):
fdot = sfdot.val * 10 ** dfdot_order_of_mag
freq = sfreq.val * 10 ** df_order_of_mag
pepoch = spepoch.val
delay_fun = lambda times: (times - pepoch) * freq + \
0.5 * (times - pepoch) ** 2 * fdot
new_phaseogram = shift_phaseogram(phas, tseg, delay_fun)
pcolor.set_array(new_phaseogram.T.ravel())
l.set_xdata(1 + delay_fun(biny - biny[0]))
fig.canvas.draw_idle()
resetax = plt.axes([0.8, 0.020, 0.1, 0.04])
button = Button(resetax, 'Reset', color=axcolor, hovercolor='0.975')
def reset(event):
sfreq.reset()
sfdot.reset()
spepoch.reset()
pcolor.set_array(phas.T.ravel())
l.set_xdata(1)
button.on_clicked(reset)
sfreq.on_changed(update)
sfdot.on_changed(update)
spepoch.on_changed(update)
spepoch._dummy_reset_button_ref = button
plt.show()
return
[15]:
# f0 = 0.0001
# fdot = 0
# delay_fun = lambda times: times * f0 + 0.5 * times ** 2 * fdot
# new_phaseogr = shift_phaseogram(phaseogr, times[1] - times[0], delay_fun)
# _ = plot_phaseogram(new_phaseogr, phases, times, vmin=np.median(phaseogr))
[16]:
interactive_phaseogram(phaseogr, phases, times, df=0, dfdot=0)
Second: overplot a line with a pulse frequency solution, then update the full phaseogram¶
This interactive phaseogram is implemented in HENDRICS
, in the script HENphaseogram
[17]:
class InteractivePhaseogram(object):
def __init__(self, ev_times, freq, nph=128, nt=128, fdot=0, fddot=0):
import matplotlib.pyplot as plt
from matplotlib.widgets import Slider, Button, RadioButtons
self.df=0
self.dfdot=0
self.freq = freq
self.fdot = fdot
self.nt = nt
self.nph = nph
self.ev_times = ev_times
self.phaseogr, phases, times, additional_info = \
phaseogram(ev_times, freq, return_plot=True, nph=nph, nt=nt,
fdot=fdot, fddot=fddot, plot=False)
self.phases, self.times = phases, times
self.fig, ax = plt.subplots()
plt.subplots_adjust(left=0.25, bottom=0.30)
tseg = np.median(np.diff(times))
tobs = tseg * nt
delta_df_start = 2 / tobs
self.df_order_of_mag = int(np.log10(delta_df_start))
delta_df = delta_df_start / 10 ** self.df_order_of_mag
delta_dfdot_start = 2 / tobs ** 2
self.dfdot_order_of_mag = int(np.log10(delta_dfdot_start))
delta_dfdot = delta_dfdot_start / 10 ** self.dfdot_order_of_mag
self.pcolor = plt.pcolormesh(phases, times, self.phaseogr.T, cmap='magma')
self.l1, = plt.plot(np.zeros_like(times) + 0.5, times, zorder=10, lw=2, color='w')
self.l2, = plt.plot(np.ones_like(times), times, zorder=10, lw=2, color='w')
self.l3, = plt.plot(np.ones_like(times) + 0.5, times, zorder=10, lw=2, color='w')
plt.xlabel('Phase')
plt.ylabel('Time')
plt.colorbar()
axcolor = 'lightgoldenrodyellow'
self.axfreq = plt.axes([0.25, 0.1, 0.5, 0.03], facecolor=axcolor)
self.axfdot = plt.axes([0.25, 0.15, 0.5, 0.03], facecolor=axcolor)
self.axpepoch = plt.axes([0.25, 0.2, 0.5, 0.03], facecolor=axcolor)
self.sfreq = Slider(self.axfreq, 'Delta freq x$10^{}$'.format(self.df_order_of_mag),
-delta_df, delta_df, valinit=self.df)
self.sfdot = Slider(self.axfdot, 'Delta fdot x$10^{}$'.format(self.dfdot_order_of_mag),
-delta_dfdot, delta_dfdot, valinit=self.dfdot)
self.spepoch = Slider(self.axpepoch, 'Delta pepoch',
0, times[-1] - times[0], valinit=0)
self.sfreq.on_changed(self.update)
self.sfdot.on_changed(self.update)
self.spepoch.on_changed(self.update)
self.resetax = plt.axes([0.8, 0.020, 0.1, 0.04])
self.button = Button(self.resetax, 'Reset', color=axcolor, hovercolor='0.975')
self.recalcax = plt.axes([0.6, 0.020, 0.1, 0.04])
self.button_recalc = Button(self.recalcax, 'Recalculate', color=axcolor, hovercolor='0.975')
self.button.on_clicked(self.reset)
self.button_recalc.on_clicked(self.recalculate)
plt.show()
def update(self, val):
fdot = self.sfdot.val * 10 ** self.dfdot_order_of_mag
freq = self.sfreq.val * 10 ** self.df_order_of_mag
pepoch = self.spepoch.val + self.times[0]
delay_fun = lambda times: (times - pepoch) * freq + \
0.5 * (times - pepoch) ** 2 * fdot
self.l1.set_xdata(0.5 + delay_fun(self.times - self.times[0]))
self.l2.set_xdata(1 + delay_fun(self.times - self.times[0]))
self.l3.set_xdata(1.5 + delay_fun(self.times - self.times[0]))
self.fig.canvas.draw_idle()
def recalculate(self, event):
dfdot = self.sfdot.val * 10 ** self.dfdot_order_of_mag
dfreq = self.sfreq.val * 10 ** self.df_order_of_mag
pepoch = self.spepoch.val + self.times[0]
self.fdot = self.fdot - dfdot
self.freq = self.freq - dfreq
self.phaseogr, _, _, _ = \
phaseogram(self.ev_times, self.freq, fdot=self.fdot, plot=False,
nph=self.nph, nt=self.nt, pepoch=pepoch)
self.l1.set_xdata(0.5)
self.l2.set_xdata(1)
self.l3.set_xdata(1.5)
self.sfreq.reset()
self.sfdot.reset()
self.spepoch.reset()
self.pcolor.set_array(self.phaseogr.T.ravel())
self.fig.canvas.draw()
def reset(self, event):
self.sfreq.reset()
self.sfdot.reset()
self.spepoch.reset()
self.pcolor.set_array(self.phaseogr.T.ravel())
self.l1.set_xdata(0.5)
self.l2.set_xdata(1)
self.l3.set_xdata(1.5)
def get_values(self):
return self.freq, self.fdot
[18]:
times_delayed = events.time + 0.5 * (events.time - events.time[0]) ** 2 * 3e-8 / cand_freqs_ef[0]
ip = InteractivePhaseogram(times_delayed, cand_freqs_ef[0], nt=32)
An evolved implementation of this interactive phaseogram is implemented in HENDRICS (command line tool HENphaseogram
)