
rm-tools's People

Contributors

afinemax, alecthomson, cameron-van-eck, ciradajp, erikosinga, jackieykma, lh-astro, matthewja, physicist-boris, pre-commit-ci[bot], sebokolodi, shannonv


rm-tools's Issues

Docstrings

Currently there are many functions (particularly within util_RM) which lack complete docstrings. This makes a lot of them very difficult to use without diving into the source. I think some standard format should be used. For example, in #8 I used Google-style docstrings in the do_ scripts.
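For reference, here is what a Google-style docstring looks like on a made-up helper (the function itself is just an example, not part of RM-Tools):

import numpy as np

def rotate_angle(psi0_deg, rm_radm2, lsq_m2):
    """Rotate a polarization angle to a given wavelength squared.

    Args:
        psi0_deg (float): De-rotated polarization angle [deg].
        rm_radm2 (float): Rotation measure [rad/m^2].
        lsq_m2 (float): Wavelength squared [m^2].

    Returns:
        float: Polarization angle at lsq_m2 [deg].
    """
    return psi0_deg + np.degrees(rm_radm2 * lsq_m2)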

Issues with 3D (and 1D) when just doing the example in the readme

Just copying the lines from the example in the readme file produces several errors (using Python 3.7.2).

In the 1D example, for it to run properly:
./do_RMsynth_1D.py data/Source1.dat -p --> ./do_RMsynth_1D.py data/Source1.dat -p -S

Otherwise the next line of cleaning won't work.

For the 3D case:

./mk_test_cube_data.py catalogue.csv : runs ok

./do_fitIcube.py data/StokesI.fits data/freqs_Hz.dat -b 50 : gives error
Traceback (most recent call last):
File "./do_fitIcube.py", line 311, in
main()
File "./do_fitIcube.py", line 104, in main
buffCols = args.buffCols)
File "./do_fitIcube.py", line 220, in make_model_I
f.write("\0")
TypeError: a bytes-like object is required, not 'str'
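A likely fix, assuming the file was opened in binary mode (which the TypeError suggests), would be to write bytes rather than a str:

f.write(b"\0")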

After running that and having it die, the data folder it created contains:
freqs_Hz.dat Imodel.fits Inoise.dat IskyMask.fits IsrcMask.fits StokesI.fits StokesQ.fits StokesU.fits

./do_RMsynth_3D.py data/StokesQ.fits data/StokesU.fits data/freqs_Hz.dat : runs ok

./do_RMsynth_3D.py data/StokesQ.fits data/StokesU.fits data/freqs_Hz.dat -n data/Inoise.dat -o "WithNoise_" : gives errors
Traceback (most recent call last):
File "./do_RMsynth_3D.py", line 632, in
main()
File "./do_RMsynth_3D.py", line 597, in main
rmsArr = readFitsCube(args.noiseFile, verbose)[1]
File "./do_RMsynth_3D.py", line 471, in readFitsCube
data = pf.getdata(file)
File "/apps/python/3.7.2/lib/python3.7/site-packages/astropy/io/fits/convenience.py", line 189, in getdata
hdulist, extidx = _getext(filename, mode, args, kwargs)
File "/apps/python/3.7.2/lib/python3.7/site-packages/astropy/io/fits/convenience.py", line 1024, in _getext
hdulist = fitsopen(filename, mode=mode, kwargs)
File "/apps/python/3.7.2/lib/python3.7/site-packages/astropy/io/fits/hdu/hdulist.py", line 151, in fitsopen
lazy_load_hdus, kwargs)
File "/apps/python/3.7.2/lib/python3.7/site-packages/astropy/io/fits/hdu/hdulist.py", line 390, in fromfile
lazy_load_hdus=lazy_load_hdus, kwargs)
File "/apps/python/3.7.2/lib/python3.7/site-packages/astropy/io/fits/hdu/hdulist.py", line 1064, in _readfrom
read_one = hdulist._read_next_hdu()
File "/apps/python/3.7.2/lib/python3.7/site-packages/astropy/io/fits/hdu/hdulist.py", line 1139, in _read_next_hdu
hdu = _BaseHDU.readfrom(fileobj, kwargs)
File "/apps/python/3.7.2/lib/python3.7/site-packages/astropy/io/fits/hdu/base.py", line 328, in readfrom
kwargs)
File "/apps/python/3.7.2/lib/python3.7/site-packages/astropy/io/fits/hdu/base.py", line 393, in _readfrom_internal
header = Header.fromfile(data, endcard=not ignore_missing_end)
File "/apps/python/3.7.2/lib/python3.7/site-packages/astropy/io/fits/header.py", line 447, in fromfile
padding)[1]
File "/apps/python/3.7.2/lib/python3.7/site-packages/astropy/io/fits/header.py", line 516, in _from_blocks
raise OSError('Header missing END card.')
OSError: Header missing END card.

./do_RMsynth_3D.py data/StokesQ.fits data/StokesU.fits data/freqs_Hz.dat -i data/Imodel.fits -o "WithI_" : gives errors

Traceback (most recent call last):
File "./do_RMsynth_3D.py", line 632, in
main()
File "./do_RMsynth_3D.py", line 593, in main
dataI = readFitsCube(args.fitsI, verbose)[1]
File "./do_RMsynth_3D.py", line 471, in readFitsCube
data = pf.getdata(file)
File "/apps/python/3.7.2/lib/python3.7/site-packages/astropy/io/fits/convenience.py", line 192, in getdata
data = hdu.data
File "/apps/python/3.7.2/lib/python3.7/site-packages/astropy/utils/decorators.py", line 726, in get
val = self.fget(obj)
File "/apps/python/3.7.2/lib/python3.7/site-packages/astropy/io/fits/hdu/image.py", line 243, in data
data = self._get_scaled_image_data(self._data_offset, self.shape)
File "/apps/python/3.7.2/lib/python3.7/site-packages/astropy/io/fits/hdu/image.py", line 709, in _get_scaled_image_data
raw_data = self._get_raw_data(shape, code, offset)
File "/apps/python/3.7.2/lib/python3.7/site-packages/astropy/io/fits/hdu/base.py", line 475, in _get_raw_data
return self._file.readarray(offset=offset, dtype=code, shape=shape)
File "/apps/python/3.7.2/lib/python3.7/site-packages/astropy/io/fits/file.py", line 334, in readarray
buffer=self._mmap)
TypeError: buffer is too small for requested array

./do_RMclean_3D.py -c 0.12 data/FDF_dirty.fits data/RMSF.fits : gives errors
File does not exist: 'data/FDF_dirty.fits'

ls data/ yields
FDF_im_dirty.fits
FDF_maxPI.fits
FDF_peakRM.fits
FDF_real_dirty.fits
FDF_tot_dirty.fits
freqs_Hz.dat
Imodel.fits
Inoise.dat
IskyMask.fits
IsrcMask.fits
RMSF_FWHM.fits
RMSF_im.fits
RMSF_real.fits
RMSF_tot.fits
StokesI.fits
StokesQ.fits
StokesU.fits

Speed up loops with JIT

Currently many of the core algorithms rely on pure Python loops - which are slow. Due to the complexity of these algorithms, straight vectorisation with numpy is hard. I'd suggest we look at numba for acceleration with minimal code changes.

Here's a demo for speeding up RM synthesis by 16x

import numpy as np
import numba as nb
from numba_progress import ProgressBar
from RMutils.util_RM import do_rmsynth_planes

# Create some fake data
def screen(lsq, RM, psi, phi=None):
    P = 0.7*np.exp(2j*(np.deg2rad(psi)+RM*lsq))
    return P

rm = np.pi

start = 744
stop = 1032
step = 1
f_racs = np.arange(start, stop, step)*1e6
lsq_racs = (2.998e8/f_racs)**2

p_racs = screen(lsq_racs, rm, 10)
data = [
    f_racs,
    p_racs.real,
    p_racs.imag,
    np.ones_like(p_racs.real)*0.001,
    np.ones_like(p_racs.imag)*0.001,
]
phis = np.linspace(-1000, 1000, 10000)

# Run normal rmsynth
%%timeit
fdf, _ = do_rmsynth_planes(
    dataQ=p_racs.real,
    dataU=p_racs.imag,
    lambdaSqArr_m2=lsq_racs,
    phiArr_radm2=phis,
    weightArr=None,
    verbose=False
)
# 212 ms ± 9.46 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

# Rework rmsynth to use numba

@nb.njit(parallel=True, fastmath=True,)
def _rmsynth(FDFcube, phiArr_radm2, nPhi, KArr, pCube, a, progress):
    for i in nb.prange(nPhi):
        arg = np.expand_dims(np.expand_dims(np.exp(-2.0j * phiArr_radm2[i] * a), axis=-1), axis=-1)
        FDFcube[i,:,:] =  KArr * np.sum(pCube * arg, axis=0)
        progress.update(1)
    return FDFcube

def do_rmsynth_planes_fast(dataQ, dataU, lambdaSqArr_m2, phiArr_radm2, 
                      weightArr=None, lam0Sq_m2=None, nBits=32, verbose=False,
                      log=print):
    """Perform RM-synthesis on Stokes Q and U cubes (1,2 or 3D). This version
    of the routine loops through spectral planes and is faster than the pixel-
    by-pixel code. This version also correctly deals with isolated clumps of
    NaN-flagged voxels within the data-cube (unlikely in interferometric cubes,
    but possible in single-dish cubes). Input data must be in standard python
    [z,y,x] order, where z is the frequency axis in ascending order.

    dataQ           ... 1, 2 or 3D Stokes Q data array
    dataU           ... 1, 2 or 3D Stokes U data array
    lambdaSqArr_m2  ... vector of wavelength^2 values (ascending freq order)
    phiArr_radm2    ... vector of trial Faraday depth values
    weightArr       ... vector of weights, default [None] is Uniform (all 1s)
    nBits           ... precision of data arrays [32]
    verbose         ... print feedback during calculation [False]
    log             ... function to be used to output messages [print]
    
    """
    
    # Default data types
    dtFloat = "float" + str(nBits)
    dtComplex = "complex" + str(2*nBits)

    # Set the weight array
    if weightArr is None:
        weightArr = np.ones(lambdaSqArr_m2.shape, dtype=dtFloat)
    weightArr = np.where(np.isnan(weightArr), 0.0, weightArr)
    
    # Sanity check on array sizes
    if not weightArr.shape  == lambdaSqArr_m2.shape:
        log("Err: Lambda^2 and weight arrays must be the same shape.")
        return None, None
    if not dataQ.shape == dataU.shape:
        log("Err: Stokes Q and U data arrays must be the same shape.")
        return None, None
    nDims = len(dataQ.shape)
    if not nDims <= 3:
        log("Err: data dimensions must be <= 3.")
        return None, None
    if not dataQ.shape[0] == lambdaSqArr_m2.shape[0]:
        log("Err: Data depth does not match lambda^2 vector ({} vs {}).".format(dataQ.shape[0], lambdaSqArr_m2.shape[0]), end=' ')
        log("     Check that data is in [z, y, x] order.")
        return None, None
    
    # Reshape the data arrays to 3 dimensions
    if nDims==1:
        dataQ = np.reshape(dataQ, (dataQ.shape[0], 1, 1))
        dataU = np.reshape(dataU, (dataU.shape[0], 1, 1))
    elif nDims==2:
        dataQ = np.reshape(dataQ, (dataQ.shape[0], dataQ.shape[1], 1))
        dataU = np.reshape(dataU, (dataU.shape[0], dataU.shape[1], 1))
    
    # Create a complex polarised cube, B&dB Eqns. (8) and (14)
    # Array has dimensions [nFreq, nY, nX]
    pCube = (dataQ + 1j * dataU) * weightArr[:, np.newaxis, np.newaxis]
    
    # Check for NaNs (flagged data) in the cube & set to zero
    mskCube = np.isnan(pCube)
    pCube = np.nan_to_num(pCube)
    
    # If full planes are flagged then set corresponding weights to zero
    mskPlanes =  np.sum(np.sum(~mskCube, axis=1), axis=1)
    mskPlanes = np.where(mskPlanes==0, 0, 1)
    weightArr *= mskPlanes
    
    # Initialise the complex Faraday Dispersion Function cube
    nX = dataQ.shape[-1]
    nY = dataQ.shape[-2]
    nPhi = phiArr_radm2.shape[0]
    FDFcube = np.zeros((nPhi, nY, nX), dtype=dtComplex)

    # lam0Sq_m2 is the weighted mean of lambda^2 distribution (B&dB Eqn. 32)
    # Calculate a global lam0Sq_m2 value, ignoring isolated flagged voxels
    K = 1.0 / np.sum(weightArr)
    if lam0Sq_m2 is None:
        lam0Sq_m2 = K * np.sum(weightArr * lambdaSqArr_m2)
    if not np.isfinite(lam0Sq_m2): #Can happen if all channels are NaNs/zeros
        lam0Sq_m2=0.
    
    # The K value used to scale each FDF spectrum must take into account
    # flagged voxels data in the datacube and can be position dependent
    weightCube =  np.invert(mskCube) * weightArr[:, np.newaxis, np.newaxis]
    with np.errstate(divide='ignore', invalid='ignore'):
        KArr = np.true_divide(1.0, np.sum(weightCube, axis=0))
        KArr[KArr == np.inf] = 0
        KArr = np.nan_to_num(KArr)
        
    # Do the RM-synthesis on each plane
    a = lambdaSqArr_m2 - lam0Sq_m2
    with ProgressBar(total=nPhi, desc="Running RM-synthesis by channel", disable=not verbose) as progress:
        FDFcube = _rmsynth(FDFcube, phiArr_radm2, nPhi, KArr, pCube, a, progress)
    # Remove redundant dimensions in the FDF array
    FDFcube = np.squeeze(FDFcube)

    return FDFcube, lam0Sq_m2

%%timeit
fdf_fast, _ = do_rmsynth_planes_fast(
    dataQ=p_racs.real,
    dataU=p_racs.imag,
    lambdaSqArr_m2=lsq_racs,
    phiArr_radm2=phis,
    weightArr=None,
    verbose=False
)
# 11.9 ms ± 384 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

np.array_equal(fdf, fdf_fast)
# True

Error message "TypeError: Object of type float32 is not JSON serializable" on RMtools_1D/do_RMsynth_1D.py

Hi,

Running do_RMsynth_1D.py pops out the error message "TypeError: Object of type float32 is not JSON serializable".
json.dump() does not seem to support NumPy floats/ints. It can be fixed by converting the NumPy float/int to the Python float/int as below.

Ruolan

In the saveOutput function:

for k, v in outdict.items():
    if isinstance(v, np.float32) or isinstance(v, np.float64):
        outdict[k] = float(v)
    if isinstance(v, np.int32) or isinstance(v, np.int64):
        outdict[k] = int(v)
json.dump(dict(outdict), open(outFile, "w"))

Frequency input for RMsynth_3D

The default way that RMsynth_3D should read frequency information is from the FITS header. An externally supplied frequency list should be provided as a non-default option, since it is always preferable to get key metadata from the data object. In particular RMsynth_3D should provide support for the -TAB convention, in which an irregularly gridded axis has an associated FITS extension table giving the frequencies (for instance) corresponding to each channel. See Greisen et al (2006) DOI: 10.1051/0004-6361:20053818.
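For the simple case of a regularly gridded frequency axis, the header read is straightforward (a sketch assuming frequency sits on axis 3 with standard linear WCS keywords; the -TAB case would instead read the frequencies from the associated extension table):

import numpy as np
from astropy.io import fits

header = fits.getheader("StokesQ.fits")  # placeholder filename
nchan = header["NAXIS3"]
# Reconstruct channel frequencies from the linear WCS keywords
freqs_Hz = header["CRVAL3"] + (np.arange(nchan) + 1 - header["CRPIX3"]) * header["CDELT3"]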

Query on RMsynth_3D output with -i option

I noticed with both the test data and some cubes I was working on that with the -i option (to include a Stokes I cube which normalizes Q and U), the final FDF output ends up filled with a lot of random-looking noise in the background. Additionally, where the sources are, it looks like the values in the FDF_maxPI file, for example, are not given as fractions (but have similar values to the file output if I don't include the Stokes I cube). What is this output exactly, as it doesn't look like fractional polarisation?
Thanks

Fix imports for running RM package scripts

Individual scripts run fine (e.g. python3 RMtools_1D/do_RMsynth_1D.py ... will work) but some modules in the package itself can't be run due to the way imports are handled (e.g. python3 -m RM.RMtools_1D.do_RMsynth_1D won't work). The imports in these scripts need to be package imports instead, such as from . import cl_RMsynth_1d rather than import cl_RMsynth_1d.

(I'd submit a pull request but I'm unsure whether such a change would break intended usage somehow)
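One common pattern that keeps both invocations working is a relative import with a fallback (a sketch only, given the uncertainty above about intended usage):

try:
    from . import cl_RMsynth_1d  # package invocation: python3 -m ...
except ImportError:
    import cl_RMsynth_1d  # direct script invocation: python3 do_RMsynth_1D.py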

Location-dependent RMCLEAN thresholding in 3D RM synthesis/RMCLEAN

RM synthesis is often applied to fractional Stokes (Q/I and U/I) data. For heavily resolved objects, the fractional noise threshold will vary considerably across the field, and it is therefore desirable to be able to run 3D RM synth + clean with a location-dependent (preferably spaxel-dependent) clean threshold. Suggestion is that this could be implemented via a 2D array, whose pixels contain the real-valued clean threshold for each location in the q/u datacubes.
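A sketch of what the proposed interface could look like (cutoffArr is the suggested 2D threshold array; all names here are illustrative, not the current RM-Tools API):

import numpy as np

def as_cutoff_array(cutoff, shape_yx):
    # Accept either a scalar (current behaviour) or a 2D per-spaxel threshold map
    return np.broadcast_to(np.asarray(cutoff), shape_yx)

# Inside the per-spaxel CLEAN loop the stopping test would then become:
#     while np.max(np.abs(residFDF)) >= cutoffArr[yi, xi] and iterCount < maxIter: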

RMsynth1D: does not specify Stokes I function in outputs

The output file with all the results (_RMsynth.dat) does not list the fitting function used for Stokes I, which makes the reported polynomial coefficients ambiguous. While you (I?) are at it, maybe check that there isn't anything else useful being accidentally left out of the file.

Incorrect fit to real component of RMSF

util_RM.py -> Line 318: Fit should be made to RMSFArr not RMSFcube. See diff for correction below:

diff --git a/RMutils/util_RM.py b/RMutils/util_RM.py
index 7a3cbcf..435cdc8 100644
--- a/RMutils/util_RM.py
+++ b/RMutils/util_RM.py
@@ -315,7 +315,7 @@ def get_rmsf_planes(lambdaSqArr_m2, phiArr_radm2, weightArr=None, mskArr=None,
             if verbose:
                 log("Fitting Gaussian to the main lobe.")
             if fitRMSFreal:
-                mp = fit_rmsf(phi2Arr, RMSFcube.real)
+                mp = fit_rmsf(phi2Arr, RMSFArr.real)
             else:
                 mp = fit_rmsf(phi2Arr, np.abs(RMSFArr))
             if mp is None or mp.status<1:

interp_images in util_misc.py returning NaNs

Hi Cameron,

Been playing around with the code recently and think I've noticed an issue. It might be just because of the cubes I'm using (parts of POSSUM cubes cut out with CASA), but thought I would flag it just in case.

The part of do_RM_synth.py which starts with 'if dataI is not None:' (line 193 or somewhere around there) calls the interp_images function from util_misc. For my particular I cube, freqArr_Hz[idx]>freq0_Hz and so Ifreq0Arr = interp_images(dataI[idx-1, :, :], dataI[idx, :, :], f=0.5) is called.

However this just seems to return NaNs, meaning the overall output cubes become NaNs. When I instead force the code to do Ifreq0Arr = dataI[idx, :, :] by removing the ifs, it seems to be fine and I get sensible final outputs. It seems like the interp_images function for me just returns an array of NaNs, which then messes up everything downstream.

From my understanding, this function just seems to want to grab the I image from the central frequency channel, or if there are two central ones to interpolate between them? Would there be a way to use the continuum-averaged I image instead? Is there a potential consistency issue with the packages used in interp_images?

Many thanks,

Emma

QA test b2 1D fails

The check on the error in peak PI fails in test b2. I'm guessing this might be due to an update in the error calculation at some point.

Error message is:

======================================================================
FAIL: test_b2_1D_synth_values (__main__.test_RMtools)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "QA_tests.py", line 223, in test_b2_1D_synth_values
    self.assertEqual(mDict['dPhiPeakPIchan_rm2'], 0.15784329777032025, 'dPhiPeakPIchan_rm2 differs from expectation.')
AssertionError: 0.2303510495843434 != 0.15784329777032025 : dPhiPeakPIchan_rm2 differs from expectation.

----------------------------------------------------------------------

FDF cubes should have properly set (degenerate) Stokes axis

Per conversations on the POSSUM Slack, the FITS header for FDF cubes should accurately encode the Stokes parameter and Faraday depth axis.

The Faraday depth axis is probably fine -- it currently uses CTYPE = 'FDEP', so we're going to convince CADC that that is preferable to their currently supported 'FARADAY'. So no action is probably needed there. Paddy Leahy argues that it's not length-constrained and we should write it out fully as 'Faraday depth'.

But we should properly encode the real vs imaginary component of the FDF inside the FITS header; right now this is only stored in the filename (bad practice!). The obvious place to store it is in the (degenerate) Stokes axis. Since real and imaginary map directly to Stokes Q and U, and there is a standard convention for Stokes axis values, we should follow that.

But the interesting question is what to do with the polarized intensity 'tot' FDF cubes -- what Stokes value is assigned for polarized intensity? It doesn't appear to be part of the FITS standard, but maybe there is a convention others have used/defined?

In terms of actual code changes to RM-Tools: we should make sure that all 3D RM-synthesis/CLEAN products have proper, informative FITS headers, including all units, coordinate types, pixel units, etc. This also applies to the 2D map outputs from the peak-mapping routine.
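A sketch of the header change for the real-part cube (filename and axis number are assumptions; the imaginary-part cube would get CRVAL = 3 for Stokes U):

from astropy.io import fits

with fits.open("FDF_real_dirty.fits", mode="update") as hdul:
    hdr = hdul[0].header
    hdr["CTYPE3"] = "STOKES"
    hdr["CRVAL3"] = 2.0  # FITS Stokes convention: 1=I, 2=Q, 3=U, 4=V
    hdr["CDELT3"] = 1.0
    hdr["CRPIX3"] = 1.0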

Weird RMclean3D output fits files, errors in writing the results?

Hello,
I used these commands on a fits cube with RA-Dec pixels equal to 1801x1801 and frequency on the fourth axis:
rmsynth3d -i i_cube.fits -l 2000 -d 10 -o RMTOOLS_ q_cube.fits u_cube.fits freqs_Hz.dat
rmclean3d -c 1e-4 -o CLEAN_ -v RMTOOLS_FDF_tot_dirty.fits RMTOOLS_RMSF_tot.fits

The RMTOOLS_* FITS files are OK, but the CLEAN_* FITS files are weird; there is some problem in writing these output cubes.
I'll copy the header of CLEAN_FDF_clean_tot.fits below, where you can see that the fourth axis is missing while the corresponding keywords have been written. From the axis dimensions (NAXIS2 and NAXIS3) it seems that RA-Dec correspond to axes 2 and 3, but the keywords below suggest that these are Dec and Stokes respectively.
SIMPLE = T / conforms to FITS standard
BITPIX = -32 / array data type
NAXIS = 3 / number of array dimensions
NAXIS1 = 201
NAXIS2 = 1801
NAXIS3 = 1801

BUNIT = 'Jy/beam ' / Brightness (pixel) unit
EQUINOX = 2.000000000000E+03
RADESYS = 'FK5 '
LONPOLE = 1.800000000000E+02
LATPOLE = -5.612308055556E+01
PC01_01 = 1.000000000000E+00
PC02_01 = 0.000000000000E+00
PC03_01 = 0.000000000000E+00
PC04_01 = 0.000000000000E+00
PC01_02 = 0.000000000000E+00
PC02_02 = 1.000000000000E+00
PC03_02 = 0.000000000000E+00
PC04_02 = 0.000000000000E+00
PC01_03 = 0.000000000000E+00
PC02_03 = 0.000000000000E+00
PC03_03 = 1.000000000000E+00
PC04_03 = 0.000000000000E+00
PC01_04 = 0.000000000000E+00
PC02_04 = 0.000000000000E+00
PC03_04 = 0.000000000000E+00
PC04_04 = 1.000000000000E+00
CTYPE1 = 'RA---SIN'
CRVAL1 = 3.105707416667E+02
CDELT1 = -5.555555555556E-04
CRPIX1 = 4345.
CUNIT1 = 'deg '
CTYPE2 = 'DEC--SIN'
CRVAL2 = -5.612308055556E+01
CDELT2 = 5.555555555556E-04
CRPIX2 = -1217.
CUNIT2 = 'deg '
CTYPE3 = 'STOKES '
CRVAL3 = 2.000000000000E+00
CDELT3 = 1.000000000000E+00
CRPIX3 = 1.
CUNIT3 = ' '
CTYPE4 = 'FDEP ' / Faraday depth (linear)
CRVAL4 = -1000.0 / [rad/m^2] Coordinate value at reference point
CDELT4 = 10.0 / [rad/m^2] Coordinate increment at reference poi
CRPIX4 = 1.
CUNIT4 = 'rad/m^2 '
The input files for rmsynth3d have 4 axes, which is not an issue for rmsynth3d, but maybe it is for rmclean3d? My apologies if this was already documented.

Exclusion of user specified phi-range from RM synthesis band-averaged images

I would like to have the option to exclude a specified range in Faraday depth during peak finding in the RM cube for the creation of the band-averaged maps. This is useful for observations where the wavelength-squared coverage is large enough (or the target RM high enough) such that the instrumental polarization peak is separated from the real polarized emission by more than one FWHM of the RMSF.
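A sketch of how the exclusion could work during peak finding (phi_lo/phi_hi are the hypothetical user-specified bounds; FDFcube and phiArr_radm2 follow the names used elsewhere in RM-Tools):

import numpy as np

exclude = (np.abs(phiArr_radm2) >= phi_lo) & (np.abs(phiArr_radm2) <= phi_hi)
absFDF = np.where(exclude[:, np.newaxis, np.newaxis], np.nan, np.abs(FDFcube))
peakIdx = np.nanargmax(absFDF, axis=0)  # per-pixel peak outside the excluded range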

Read IQU image planes from directories, rather than in a Stokes cube.

I want to be able to use 3D RM synthesis on image planes where the frequency information is read from the FITS header.
In this case the Stokes Q and U (and/or Stokes I) input images are stored in separate FITS files, e.g. in separate directories labelled stokes_q, stokes_u, stokes_i.

rmclean3d fails with IndexError

rmclean3d fails with the following error. I generated FDF_dirty.fits and RMSF.fits with the -f option in rmsynth3d.

Traceback (most recent call last):         ] 0%
  File "/fpra/polmag/01/sarrvesh/NGC_2403/selfcal/RMSynth/python/bin/rmclean3d", line 8, in <module>
    sys.exit(main())
  File "/fpra/polmag/01/sarrvesh/NGC_2403/selfcal/RMSynth/python/lib/python3.6/site-packages/RMtools_3D/do_RMclean_3D.py", line 427, in main
    verbose = verbose)
  File "/fpra/polmag/01/sarrvesh/NGC_2403/selfcal/RMSynth/python/lib/python3.6/site-packages/RMtools_3D/do_RMclean_3D.py", line 123, in run_rmclean
    chunksize        = chunksize)
  File "/fpra/polmag/01/sarrvesh/NGC_2403/selfcal/RMSynth/python/lib/python3.6/site-packages/RMutils/util_RM.py", line 499, in do_rmclean_hogbom
    output.append(rmc.cleanloop(pix))
  File "/fpra/polmag/01/sarrvesh/NGC_2403/selfcal/RMSynth/python/lib/python3.6/site-packages/RMutils/util_RM.py", line 556, in cleanloop
    return self._cleanloop(*args)
  File "/fpra/polmag/01/sarrvesh/NGC_2403/selfcal/RMSynth/python/lib/python3.6/site-packages/RMutils/util_RM.py", line 565, in _cleanloop
    fwhmRMSFArr = self.fwhmRMSFArr[yi, xi]
IndexError: index 289 is out of bounds for axis 1 with size 227

NAXIS (1,2,3) in FDF_dirty.fits and RMSF.fits are (290,227,2001) and (290,227,4003). Could it be that the RA and DEC indices are swapped in util_RM.py?

QU fitting running for hours on modelled data

Hello!

I am trying to run QU fitting on a model data set I generated for 3C286, and it is taking hours to run; I had to force-stop the code. I still don't get results from the QU fitting with the model data. I was able to fit the different available models to another data file I have and obtain fits in under a couple of minutes. I am trying to fit the m1 model to the simulated data, and the fitting has now been running in the background for 2 hours with the verbose messages on screen unchanged. I don't know if the fit is failing or why it is taking such a long time. I was able to use the same simulated data with another QU-fitting code and obtain good fits. Your insight in this matter would be really helpful.

I am attaching the on-screen output and the data file I am using with the RM-Tools QU fitting code.

Thank you for your time and assistance!

![RM_QU_error](https://user-images.githubusercontent.com/90620397/149141031-89107883-85fc-4325-92f3-e15ad15c3fc2.png)
3C286_Model_RMtoolsQU_input_RM0.zip

QUfitting requests X display

I got this error when running qufit within slurm jobs:

qt.qpa.screen: QXcbConnection: Could not connect to display pearcey-login:48.0
Could not connect to any X display.

The options I used were qufit -i -m $model $file

Subprocess calls in tests

Hey @Cameron-Van-Eck,

I just noticed that QA_tests.py uses subprocess calls, e.g.

returncode=subprocess.call('rmsynth1dFITS simdata/3D/Q_cube.fits simdata/3D/U_cube.fits 25 25 -l 600 -d 3 -S',shell=True)
self.assertEqual(returncode, 0, 'RMsynth1D_fromFITS failed to run.')

I think it is best practice to avoid shell=True. See Python docs or this stack post.

Annoyingly, there are some changes in the behaviour of subprocess depending on the python version. In any case, I'd recommend something like the following to replace the above snippet:

import shlex
import subprocess
# Splits str into list of str for subprocess
command = shlex.split('rmsynth1dFITS simdata/3D/Q_cube.fits simdata/3D/U_cube.fits 25 25 -l 600 -d 3 -S')
try:
    # Raises CalledProcessError if returncode isn't 0
    proc = subprocess.check_output(command)
except subprocess.CalledProcessError:
    print('RMsynth1D_fromFITS failed to run')

QU-fitting: Handling of multimodal distributions

Currently QU-fitting reports the median of the posterior distributions as the 'best fit' value. This is fine for single component cases, but is problematic for multi-component fits which are often multi-modal.

I'm currently investigating a solution using a KDE to find the modal value, but I thought I should raise the issue for book-keeping purposes.
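For the record, a minimal KDE mode-finder looks something like this (a sketch, not the final implementation):

import numpy as np
from scipy.stats import gaussian_kde

def posterior_mode(samples, ngrid=1000):
    # Evaluate a Gaussian KDE on a grid and take its maximum as the modal value
    kde = gaussian_kde(samples)
    grid = np.linspace(samples.min(), samples.max(), ngrid)
    return grid[np.argmax(kde(grid))]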

Noise in Q and U, and MAD vs theoretical

Hey @Cameron-Van-Eck,

I've been looking at how best to compute errors in Q and U after RM-synthesis. I've been digging into the util_RM.measure_FDF_parms function. For this purpose, I've pulled out the guts from the code to directly play with the values, and used the following as an input (as if it were running in do_RMsynth_1D.py):

import numpy as np

QArr = np.ones(288)
UArr = np.zeros(288)
IArr = np.ones(288)
dQArr = np.ones(288) * 0.1
dUArr = np.ones(288) * 0.1
dIArr = np.ones(288) * 0.1
QArr += np.random.normal(loc=0, scale=0.1, size=288)
UArr += np.random.normal(loc=0, scale=0.1, size=288)
IArr += np.random.normal(loc=0, scale=0.1, size=288)
fit_function='log'
verbose = True
fitRMSF = True
debug = False
polyOrd = 2
log = print
C = 2.997924538e8 # Speed of light [m/s]
freqArr_Hz = np.linspace(744, 1032, 288) * 1e6
modStokesI = np.ones_like(freqArr_Hz)
nBits = 32
dtFloat = "float" + str(nBits)
dtComplex = "complex" + str(2*nBits)
phiMax_radm2 = 1000
dPhi_radm2 = None
nSamples = 10
weightArr = np.ones_like(freqArr_Hz)

Here's the result of RMsynth:

import matplotlib.pyplot as plt

realFDF = FDF.real
imagFDF = FDF.imag
plt.plot(phiArr, realFDF, label="Stokes Q")
plt.plot(phiArr, imagFDF, label="Stokes U")
plt.plot(phiArr, absFDF, label="PI")
plt.legend()
plt.xlabel("$\phi$ [rad/m/m]")
plt.ylabel("Flux density [fake units]")

[attached image: FDF spectrum plot]

Here is what the MAD, rms, and theoretical values for the FDF noise came out to be:

print(f"{dFDFcorMAD=}", f"{dFDFrms=}", f"{dFDF=}")
# dFDFcorMAD=0.014828366 dFDFrms=0.034472786 dFDF=0.005892556509887897

Now I'll subtract the RMSF from the centre - i.e. a gain-1, single-iteration RM-CLEAN, but dumb. Now we can compare this residual to the noise estimators:

rmsf_crop = RMSFArr[len(RMSFArr)//4+1:3*len(RMSFArr)//4]
for noise, name in zip((dFDFcorMAD, dFDFrms, dFDF), ("MAD", "RMS", "theory")):
    plt.figure()
    plt.plot(phiArr, realFDF - rmsf_crop.real, label="Stokes Q")
    plt.plot(phiArr, imagFDF - rmsf_crop.imag, label="Stokes U")
    plt.plot(phiArr, absFDF - np.abs(rmsf_crop), label="PI")
    plt.fill_between(phiArr, -noise, noise, alpha=0.2, label=name, color='k')
    plt.hlines(-0.01, phiArr[0], phiArr[-1], linestyles='dashed', color='k')
    plt.hlines(0.01, phiArr[0], phiArr[-1], linestyles='dashed', color='k', label="Input noise (frequency)")
    plt.ylim(-0.05, 0.05)
    plt.legend()
    plt.xlabel("$\phi$ [rad/m/m]")
    plt.ylabel("Residual flux density [fake units]")

[attached images: residual FDFs against the MAD, RMS, and theoretical noise levels]
Noting here that I've plotted the ±0.1 levels (the 1sigma noise in frequency), but I don't think we should see that exactly in the FDF, right?

Now as a sanity check, I input a pure noise signal:

QArr = np.random.normal(loc=0, scale=0.1, size=288)
UArr = np.random.normal(loc=0, scale=0.1, size=288)
IArr = np.random.normal(loc=0, scale=0.1, size=288)
dQArr = np.ones(288) * 0.1
dUArr = np.ones(288) * 0.1
dIArr = np.ones(288) * 0.1

Running the same as above, but not doing the dumb CLEAN:

print(f"{dFDFcorMAD=}", f"{dFDFrms=}", f"{dFDF=}")
# dFDFcorMAD=0.003147664 dFDFrms=0.0074483543 dFDF=0.005892556509887897
for noise, name in zip((dFDFcorMAD, dFDFrms, dFDF), ("MAD", "RMS", "theory")):
    plt.figure()
    plt.plot(phiArr, realFDF , label="Stokes Q")
    plt.plot(phiArr, imagFDF , label="Stokes U")
    plt.plot(phiArr, absFDF, label="PI")
    plt.fill_between(phiArr, -noise, noise, alpha=0.2, label=name, color='k')
    plt.hlines(-0.01, phiArr[0], phiArr[-1], linestyles='dashed', color='k')
    plt.hlines(0.01, phiArr[0], phiArr[-1], linestyles='dashed', color='k', label="Input noise (frequency)")
    plt.ylim(-0.05, 0.05)
    plt.legend()
    plt.xlabel("$\phi$ [rad/m/m]")
    plt.ylabel("Noise flux density [fake units]")

[attached images: pure-noise FDFs against the MAD, RMS, and theoretical noise levels]

My takeaways:

  • In the presence of a signal, the MAD and RMS estimators seem to overestimate the noise. In my test case here, there was a factor of 3 between the MAD and the theoretical value.
  • Confusingly, for a pure noise spectrum the three estimators disagreed.
  • The theoretical noise value seems like a good 1sigma value for Stokes Q and U. Is this expected? If so, I would suggest we use it to output uncertainty in Q and U. i.e. we have peakFDFimagChan, peakFDFimagFit etc., we should also provide dPeakFDFimagChan, dPeakFDFimagFit etc.

P.S. We could also adapt the RMS/MAD estimators to run on the real/imaginary spectra quite easily too, if we trust their outputs.

Check RMSF FWHM equation

There are different versions of the prediction for RMSF FWHM. B&dB05 uses 3.47 in the numerator, Schnitzeler 2009 uses 3.8, Dickey 2019 uses 3.8.

Check the fitting results to see which is most accurate, then adjust the theoretical prediction to be more useful!
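For reference, the B&dB05 analytic prediction is FWHM = 2*sqrt(3)/Delta_lambda^2 ≈ 3.46/Delta_lambda^2, where Delta_lambda^2 is the total width of the lambda^2 coverage; the 3.8 values are, as far as I know, empirical refinements from fitting the main lobe of realistic RMSFs.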

Polarization angles use wrong convention

Currently, the polarization angles are in the range (-90,90], but the RMTable convention asks for angles in the range [0,180). It would be better to be consistent.
This is only for the band-center polarization angles; the derotated angles seem to be OK already? This should be confirmed.
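Remapping from (-90, 90] to [0, 180) is a one-line change (a sketch; polAngle_deg is a placeholder name):

import numpy as np

polAngle_deg = np.mod(polAngle_deg, 180.0)  # e.g. -45 deg becomes 135 deg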

Adjust RMsynth3D workflow to save memory

Right now, RMsynth3D computes the FDF and RMSF, then saves both of them. This ends up occupying a lot more memory than is strictly needed, because there's no real need to hold both of them in memory at the same time (for clean, yes, for synth, no).

It could be useful someday to refactor RMsynth3D to do them sequentially: computing and saving the first, freeing up the memory, then doing the second (if needed).

Stretch goal: add an option to compute ONLY the RMSF (right now there are options to compute FDF + RMSF, or just FDF)? Possibly minimal use case for it, although Shannon just asked me about it....

QUfitting outputs

The current QUfitting outputs are not very good. There should be json/dat key-value files with all the results saved in a clear form.

Migrate QU fitting away from pymultinest

Adding this with the knowledge that this will require a good deal of development + testing. But I think it'd be worth it in the long run for both speed and maintainability.

Pymultinest is a python wrapper around the compiled multinest code. Unfortunately, the documentation on both pymultinest and multinest is rather sparse. This makes bug-fixing and development around it very difficult.

Further, running multinest can be very slow. Confusingly, it sometimes slows down when used with MPI, as well. This restricts its use for large area surveys e.g. POSSUM. It also prohibits any use on diffuse images of any kind - i.e. fitting for every pixel will take far too long.

There now exist several libraries which offer the same outcomes as multinest, but with much better interfaces, documentation, and speed. A couple that come to mind are:

  • UltraNest -- Note this is from the same developer as pymultinest, but seems to be receiving much more recent development!
  • dynesty

These libraries also have built-in mode decomposition which should be a lot more robust than what we currently have implemented.

QU fitting incompatible with Py3.5

The QU fitting code uses the new Python format strings, which were added in Python 3.6. Currently the official version requirement is 3.5, so I either need to update the version requirement or remove the format strings. I'm inclined towards the latter, but open to other opinions.

RM-Clean does too many iterations

Small but fairly critical bug in the main CLEAN loops

while (np.max(np.abs(residFDF)) >= self.cutoff and iterCount <= self.maxIter):

while (np.ma.max(np.ma.abs(residFDF_mask)) >= self.window and iterCount <= self.maxIter):

The iteration count should use iterCount < self.maxIter rather than iterCount <= self.maxIter, leading to one too many iterations. Simple PR incoming.
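With the fix applied, the conditions would read:

while (np.max(np.abs(residFDF)) >= self.cutoff and iterCount < self.maxIter):

while (np.ma.max(np.ma.abs(residFDF_mask)) >= self.window and iterCount < self.maxIter):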

Outputs of 1D RMsynth/RMclean

It would be great to have a reference for the meaning of all the output values from the 1D RM-synth and RM-CLEAN.

Currently, POSSUM report 70 lists some of these, but it appears this list is now out of date, with many more entries now appearing. Additionally, this list is internal to POSSUM. I think it would be best to have this in a table on the RM-tools wiki for all users.

Plotting for QUfitting and RMclean1d

The way plots are handled between RMsynth and RMclean (and QUfitting) isn't consistent. I think the RMsynth method handles things better, so why not make it consistent across all bits of code?

Power-law fitting to Stokes I

Emailed to me:

Dear Cameron,
I am Kamlesh Rajpurohit working as a postdoc at University of Bologna. I am writing you
regarding the RM-Tool code for QU-fitting.

For Stokes I spectra, it would be helpful to have the option of a power-law fit.
Currently, the code provides a polynomial fit.

Thank you.

best regards,
Kamlesh
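For reference, a power-law model fit is only a few lines with scipy (a sketch; freqArr_Hz, IArr, dIArr and the 1.4 GHz reference frequency are placeholders):

import numpy as np
from scipy.optimize import curve_fit

def power_law(freq_Hz, S0, alpha):
    # S(nu) = S0 * (nu / nu_ref)^alpha, with nu_ref fixed at 1.4 GHz for the example
    return S0 * (freq_Hz / 1.4e9) ** alpha

popt, pcov = curve_fit(power_law, freqArr_Hz, IArr, p0=[1.0, -0.7], sigma=dIArr)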

Possible typo

In the following line of the RMsynth1D code, it looks like the intended range for the np.arange is from -phi_max to +phi_max. If not, I don't quite understand why there is an addition of 1e-6 to phi_max. Please look into this. :)

phi_array=np.arange(-1*phi_max,phi_max+1e-6,dphi)
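For context, np.arange excludes the stop value, so without the small epsilon the +phi_max endpoint can be silently dropped (quick check, values illustrative):

import numpy as np

phi_max, dphi = 100.0, 10.0
np.arange(-phi_max, phi_max, dphi)[-1]         # 90.0: endpoint excluded
np.arange(-phi_max, phi_max + 1e-6, dphi)[-1]  # 100.0: endpoint included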

New Stokes I fitting routine

There are concerns that the current Stokes I fitting code may not be the best currently available (since it's based on the FORTRAN MINPACK-1 package from 1980). Some experimenting is needed to determine if there's a better option.

Ideas to consider:

  • Same fitter, but add constraints (which are apparently supported?)
  • scipy.optimize, with or without constraints
  • Other packages?

What constraints would be valid? For the power law model, amplitude > 0 certainly. Constraints on spectral index? Not clear.

Side note: the scipy.optimize.least_squares method='lm' uses the same algorithm as MINPACK-1.
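As an example of the constrained option, scipy.optimize.least_squares takes box bounds directly (a sketch with placeholder data names; the default 'trf' method supports bounds, unlike 'lm'):

import numpy as np
from scipy.optimize import least_squares

def residuals(params, freq_Hz, I, dI):
    S0, alpha = params
    return (I - S0 * (freq_Hz / 1.4e9) ** alpha) / dI

# amplitude > 0; spectral index left unconstrained
result = least_squares(residuals, x0=[1.0, -0.7], args=(freqArr_Hz, IArr, dIArr),
                       bounds=([0.0, -np.inf], [np.inf, np.inf]))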

A minor problem in RMtools_3D/mk_test_cube_data.py

Hi Cameron,

It is just a minor issue of RMtools_3D/mk_test_cube_data.py.
The "clobber" of hdul.writeto might have been replaced by 'overwrite' in the newer version of Astropy (mine is v 5.1).
From

hduI.writeto(fitsFileOut, clobber=True)

to

hduI.writeto(fitsFileOut, overwrite=True)

Ruolan

QU-fitting Stokes I model failing for many double-component RM models (m11 and m4)

Quick Summary:

  • I’m using qufit to run a bunch of models on ~60 polarized spectra.
  • For the majority of my sources, the Stokes I model generated by qufit seems to be failing, with the model coefficients being defined as the following in the output files:
    • Imodel=0.0,0.0,0.0,0.0,0.0,1.0
  • For the exact same sources, this doesn’t occur when running any of the single-component models.
  • The result of this is that we have a fractional polarization fit that is extremely low and doesn't represent the source well, but the shape seems reasonable, and the fitted polarization angle seems to describe the data pretty well.

More details:

  • I double-checked that I was inputting the exact same text file to RM-Tools for the single-component models and the double-component models, and I still get this weird issue with the Stokes I model.
  • For these models where the Stokes I model seems to fail, we get something reasonable for a reduced chi^2, indicating a good fit.
  • I re-ran the double component models, and the same error occurred the second time around.
  • I’m attaching here plots of a source where we modelled m1 and m11, and their output model files, where we are seeing this issue (but note that we are seeing this for most sources). I can provide other examples if necessary.
  • I see this issue with m11 and m4, whereas m1 and m2 both look okay, but I'm attaching the simpler m11 and m1 models here since it might be easier to figure out what's going on with the simpler case.

0345-5036 m1 model lobe 1
0345-5036 m1 model w data lobe 1
0345-5036 m11 model lobe 1
0345-5036 m11 model w data lobe 1
0345-5036.m11.nestle.txt
0345-5036.m1.nestle.txt

Fractional polarization error bugs and other things from Shinsuke

Feedback from Shinsuke Ideguchi:

While I was working on pymultinest, I just happened to find a possible bug in RMutils/util_misc.py at lines 364-367, where there are calculations of the errors of fractional polarization spectra (It looks like this is left over from Cormac’s version, but I am contacting you because RM-tools is now maintained by you). The point is that “qArr” and “uArr” can be negative and thus, their errors (“dqArr” and “duArr") can also be negative. I think “qArr” and “uArr” should be taken as the absolute values when their errors are calculated.

After finding this, I started worrying about the weight (or the window function) of RM synthesis (I guess this bug doesn’t affect the QU-fitting because the error usually appears as squared value). Then, I realized the other point (may be a bug?) in a code. In the 1D code (do_RMsynth_1D.py) at line 243, the weight is calculated by using “dQUArr” which is from the error of Stokes Q and U, not that from the fractional polarization. The point is that this value is used to calculate the weight even when the fractional polarization is the input of RM synthesis. I guess this should be something like “dquArr = (dqArr+duArr)/2.0”, but not “dQUArr = (dQArr+dUArr)/2.0”?

Also, although this is not a bug, the 3D code (do_RMsynth_3D.py) uses the uniform weight unless the “noiseFile” is given. This means 3D code basically provides different results from 1D code. Do you think it is useful to add an option to calculate the weight from Stokes Q and U (and I) automatically in 3D code as well? Maybe is it too computationally heavy?
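For reference, the standard first-order propagation for q = Q/I keeps the error positive regardless of the sign of q (a sketch matching the absolute-value suggestion above):

import numpy as np

qArr = QArr / IArr
dqArr = np.abs(qArr) * np.sqrt((dQArr / QArr) ** 2 + (dIArr / IArr) ** 2)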

Fracpol not bias-corrected

Putting a to-do item for later:

If I'm remembering correctly, the fractional polarization values are not corrected for bandwidth depolarization. That's probably bad and should be fixed soon...

Scaling 3D routines with Dask

Given the large data requirements for upcoming surveys, such as POSSUM, I thought I'd look into scaling the 3D routines with Dask. In principle this could also help scale a lot of the 1D routines too!

I've done a test with the 3D RM synthesis stage. Note that this was run on my laptop. Dask can scale very easily, and integrates with schedulers like Slurm. Basically, we could throw enormous data sets at a supercomputing environment, and the performance will just scale.

Here's what is currently implemented with numpy:

import numpy as np
from RMutils.util_misc import progress 
from astropy import constants as const

# Create some fake data input
freq = np.arange(744,1032,1)*1e6
lsq = (const.c.value / freq)**2
phis = np.linspace(-100,100,10)
dataQ = np.random.random(size=(len(freq),1000,1000))
dataU = np.random.random(size=(len(freq),1000,1000))
lambdaSqArr_m2 = lsq
phiArr_radm2 = phis
weightArr=None
lam0Sq_m2=np.mean(lsq)
nBits=32
verbose=True
log=print

%%time

# Default data types
dtFloat = "float" + str(nBits)
dtComplex = "complex" + str(2*nBits)

# Set the weight array
if weightArr is None:
    weightArr = np.ones(lambdaSqArr_m2.shape, dtype=dtFloat)
weightArr = np.where(np.isnan(weightArr), 0.0, weightArr)

# Sanity check on array sizes
if not weightArr.shape  == lambdaSqArr_m2.shape:
    log("Err: Lambda^2 and weight arrays must be the same shape.")
    #return None, None
if not dataQ.shape == dataU.shape:
    log("Err: Stokes Q and U data arrays must be the same shape.")
    #return None, None
nDims = len(dataQ.shape)
if not nDims <= 3:
    log("Err: data dimensions must be <= 3.")
    #return None, None
if not dataQ.shape[0] == lambdaSqArr_m2.shape[0]:
    log("Err: Data depth does not match lambda^2 vector ({} vs {}).".format(dataQ.shape[0], lambdaSqArr_m2.shape[0]), end=' ')
    log("     Check that data is in [z, y, x] order.")
    #return None, None

# Reshape the data arrays to 3 dimensions
if nDims==1:
    dataQ = np.reshape(dataQ, (dataQ.shape[0], 1, 1))
    dataU = np.reshape(dataU, (dataU.shape[0], 1, 1))
elif nDims==2:
    dataQ = np.reshape(dataQ, (dataQ.shape[0], dataQ.shape[1], 1))
    dataU = np.reshape(dataU, (dataU.shape[0], dataU.shape[1], 1))

# Create a complex polarised cube, B&dB Eqns. (8) and (14)
# Array has dimensions [nFreq, nY, nX]
pCube = (dataQ + 1j * dataU) * weightArr[:, np.newaxis, np.newaxis]

# Check for NaNs (flagged data) in the cube & set to zero
mskCube = np.isnan(pCube)
pCube = np.nan_to_num(pCube)

# If full planes are flagged then set corresponding weights to zero
mskPlanes =  np.sum(np.sum(~mskCube, axis=1), axis=1)
mskPlanes = np.where(mskPlanes==0, 0, 1)
weightArr *= mskPlanes

# Initialise the complex Faraday Dispersion Function cube
nX = dataQ.shape[-1]
nY = dataQ.shape[-2]
nPhi = phiArr_radm2.shape[0]
FDFcube = np.zeros((nPhi, nY, nX), dtype=dtComplex)

# lam0Sq_m2 is the weighted mean of lambda^2 distribution (B&dB Eqn. 32)
# Calculate a global lam0Sq_m2 value, ignoring isolated flagged voxels
K = 1.0 / np.sum(weightArr)
if lam0Sq_m2 is None:
    lam0Sq_m2 = K * np.sum(weightArr * lambdaSqArr_m2)
if not np.isfinite(lam0Sq_m2): #Can happen if all channels are NaNs/zeros
    lam0Sq_m2=0.

# The K value used to scale each FDF spectrum must take into account
# flagged voxels data in the datacube and can be position dependent
weightCube =  np.invert(mskCube) * weightArr[:, np.newaxis, np.newaxis]
with np.errstate(divide='ignore', invalid='ignore'):
    KArr = np.true_divide(1.0, np.sum(weightCube, axis=0))
    KArr[KArr == np.inf] = 0
    KArr = np.nan_to_num(KArr)

# Do the RM-synthesis on each plane
if verbose:
    log("Running RM-synthesis by channel.")
    progress(40, 0)
a = lambdaSqArr_m2 - lam0Sq_m2
for i in range(nPhi):
    if verbose:
        progress(40, ((i+1)*100.0/nPhi))
    arg = np.exp(-2.0j * phiArr_radm2[i] * a)[:, np.newaxis,np.newaxis]
    FDFcube[i,:,:] =  KArr * np.sum(pCube * arg, axis=0)

# Remove redundant dimensions in the FDF array
FDFcube = np.squeeze(FDFcube)
Running RM-synthesis by channel.
  [========================================] 100%
CPU times: user 34.8 s, sys: 24.1 s, total: 58.9 s
Wall time: 1min 2s

Now with Dask! Note that da is basically a drop-in for np in most cases.

import dask.array as da
import numpy as np
from RMutils.util_misc import progress 
from astropy import constants as const

# Create some fake data input
freq = np.arange(744,1032,1)*1e6
lsq = (const.c.value / freq)**2
phis = np.linspace(-100,100,10)
dataQ = da.random.random(size=(len(freq),1000,1000))
dataU = da.random.random(size=(len(freq),1000,1000))
lambdaSqArr_m2 = lsq
phiArr_radm2 = phis
weightArr=None
lam0Sq_m2=da.mean(lsq)
nBits=32
verbose=True
log=print



%%time

dtFloat = "float" + str(nBits)
dtComplex = "complex" + str(2*nBits)

# Set the weight array
if weightArr is None:
    weightArr = da.ones(lambdaSqArr_m2.shape, dtype=dtFloat)
weightArr = da.where(da.isnan(weightArr), 0.0, weightArr)
# Sanity check on array sizes
if not weightArr.shape  == lambdaSqArr_m2.shape:
    log("Err: Lambda^2 and weight arrays must be the same shape.")
if not dataQ.shape == dataU.shape:
    log("Err: Stokes Q and U data arrays must be the same shape.")
nDims = len(dataQ.shape)
if not nDims <= 3:
    log("Err: data dimensions must be <= 3.")
if not dataQ.shape[0] == lambdaSqArr_m2.shape[0]:
    log("Err: Data depth does not match lambda^2 vector ({} vs {}).".format(dataQ.shape[0], lambdaSqArr_m2.shape[0]), end=' ')
    log("     Check that data is in [z, y, x] order.")


#dataQ = da.from_array(dataQ)
#dataU = da.from_array(dataU)
# Reshape the data arrays to 3 dimensions
if nDims==1:
    dataQ = da.reshape(dataQ, (dataQ.shape[0], 1, 1))
    dataU = da.reshape(dataU, (dataU.shape[0], 1, 1))
elif nDims==2:
    dataQ = da.reshape(dataQ, (dataQ.shape[0], dataQ.shape[1], 1))
    dataU = da.reshape(dataU, (dataU.shape[0], dataU.shape[1], 1))
    
# Create a complex polarised cube, B&dB Eqns. (8) and (14)
# Array has dimensions [nFreq, nY, nX]
pCube = (dataQ + 1j * dataU) * weightArr[:, np.newaxis, np.newaxis]

# Check for NaNs (flagged data) in the cube & set to zero
mskCube = da.isnan(pCube)
pCube = da.nan_to_num(pCube)

# If full planes are flagged then set corresponding weights to zero
mskPlanes =  da.sum(da.sum(~mskCube, axis=1), axis=1)
mskPlanes = da.where(mskPlanes==0, 0, 1)
weightArr *= mskPlanes

# Initialise the complex Faraday Dispersion Function cube
nX = dataQ.shape[-1]
nY = dataQ.shape[-2]
nPhi = phiArr_radm2.shape[0]
FDFcube = da.zeros((nPhi, nY, nX), dtype=dtComplex)

# lam0Sq_m2 is the weighted mean of lambda^2 distribution (B&dB Eqn. 32)
# Calculate a global lam0Sq_m2 value, ignoring isolated flagged voxels
K = 1.0 / da.sum(weightArr)
if lam0Sq_m2 is None:
    lam0Sq_m2 = K * da.sum(weightArr * lambdaSqArr_m2)
if not np.isfinite(lam0Sq_m2): #Can happen if all channels are NaNs/zeros
    lam0Sq_m2=0.

# The K value used to scale each FDF spectrum must take into account
# flagged voxels data in the datacube and can be position dependent
weightCube =  da.invert(mskCube) * weightArr[:, np.newaxis, np.newaxis]
with np.errstate(divide='ignore', invalid='ignore'):
    KArr = da.true_divide(1.0, da.sum(weightCube, axis=0))
    KArr[KArr == np.inf] = 0
    KArr = da.nan_to_num(KArr)

# Do the RM-synthesis on each plane
if verbose:
    log("Running RM-synthesis by channel.")
    progress(40, 0)
a = lambdaSqArr_m2 - lam0Sq_m2
FDFcube = []
for i in range(nPhi):
    if verbose:
        progress(40, ((i+1)*100.0/nPhi))
    arg = da.exp(-2.0j * phiArr_radm2[i] * a)[:, np.newaxis,np.newaxis]
    FDFcube += [KArr * da.sum(pCube * arg, axis=0)]
FDFcube = da.squeeze(da.stack(FDFcube, axis=0))
output = FDFcube.compute()
Running RM-synthesis by channel.
  [========================================] 100%
CPU times: user 757 ms, sys: 632 ms, total: 1.39 s
Wall time: 21.4 s

[attached: Dask performance graph image; network.pdf]

Progressbar slows progress

RM-synth without progress:

%%timeit
mDict, aDict = do_RMsynth_1D.run_rmsynth(
    data=data,
    polyOrd=2,
    phiMax_radm2=None,
    dPhi_radm2=None,
    nSamples=10,
    weightType="variance",
    fitRMSF=True,
    noStokesI=False,
    modStokesI=None,
    nBits=32,
    saveFigures=False,
    showPlots=False,
    verbose=False,
    debug=False,
    fit_function="log",
    prefixOut=None,
)
496 ms ± 3.78 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

and with

%%timeit
mDict, aDict = do_RMsynth_1D.run_rmsynth(
    data=data,
    polyOrd=2,
    phiMax_radm2=None,
    dPhi_radm2=None,
    nSamples=10,
    weightType="variance",
    fitRMSF=True,
    noStokesI=False,
    modStokesI=None,
    nBits=32,
    saveFigures=False,
    showPlots=False,
    verbose=True,
    debug=False,
    fit_function="log",
    prefixOut=None,
)
1.45 s ± 40.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Turning on the progress here adds a 3x slowdown!

Not sure exactly where the overhead is here. As a possible solution, I would suggest tqdm as a light-weight drop-in progressbar.
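A tqdm version of the per-channel loop would look something like this (sketch):

from tqdm import trange

for i in trange(nPhi, desc="Running RM-synthesis by channel", disable=not verbose):
    ...  # per-channel synthesis as before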

RMsynth 3D returning zero output

Hello,

This is not exactly an issue with RMsynth but some unexpected output I don't know how to interpret.

I am using RMsynth3D on a set of FITS files for a target (T200) in my dataset. I am using Stokes I, Q and U cubes as input. The image is 100x100 pixels in size. When I run rmsynth3D with the necessary freq file on this target it runs with no major errors and seems to finish. The output FDF_tot_dirty image is just a blank image with 0 value. When I try to open the image I get a message saying it has bad data, i.e., max is not greater than min as max=min=0.

I am attaching a screenshot showing the rmsynth log and the consecutive attempt to open the image.
I do not have this problem with my other target (C215). I am trying to understand why this is happening. For T200 there is almost no signal in the Q channel cubes; the U images, however, have a strong signal. For C215 there is a strong signal in both Q and U. Could this be a reason why RMsynth seems to be unable to process the T200 cubes?

Your insight would be really helpful!
[attached screenshot: rmsynth log and the attempt to open the image (rmsyntherror)]

Unit testing and test coverage

Are there any plans for unit tests in RM tools to check sections of code working as expected?

This would help find bugs when someone changes sections of code or introduces anything new.
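As a sketch of what one such test could look like (using do_rmsynth_planes with the same signature as elsewhere in this thread; tolerances are illustrative):

import numpy as np
from RMutils.util_RM import do_rmsynth_planes

def test_recovers_simple_rm():
    # A Faraday-thin screen at RM = 50 rad/m^2 should produce an FDF peak at phi = 50
    rm_true = 50.0
    lsq = (2.998e8 / np.linspace(744e6, 1032e6, 288)) ** 2
    p = 0.7 * np.exp(2j * rm_true * lsq)
    phis = np.linspace(-200, 200, 401)
    fdf, _ = do_rmsynth_planes(dataQ=p.real, dataU=p.imag, lambdaSqArr_m2=lsq,
                               phiArr_radm2=phis, weightArr=None, verbose=False)
    assert np.isclose(phis[np.argmax(np.abs(fdf))], rm_true, atol=1.0)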

3Dsynth Issue: 'ValueError: Maximum allowed size exceeded'

Sorry to bother you, but I ran into this issue when running do_RMsynth_3D.py ...

When running the line:
' ./do_RMsynth_3D.py "myStokesQfitsfile" "myStokesUfitsfile" "MyFreqdatfile" '
I receive the error:
' ValueError: Maximum allowed size exceeded '

Terminal in full:

' (base) Jacobs-MacBook-Pro:RMtools_3D jacobaskew$ ./do_RMsynth_3D.py /Users/jacobaskew/Desktop/CSIRO/Data/P2G24/J093120+582050/Images/J093120+582050_Q_cm_pb_cube_sub.fits /Users/jacobaskew/Desktop/CSIRO/Data/P2G24/J093120+582050/Images/J093120+582050_U_cm_pb_cube_sub.fits /Users/jacobaskew/Desktop/CSIRO/Data/P2G24/J093120+582050/data/P2G24_J093120+582050_freqs.dat
Traceback (most recent call last):
File "./do_RMsynth_3D.py", line 632, in
main()
File "./do_RMsynth_3D.py", line 616, in main
not_rmsf = args.not_RMSF)
File "./do_RMsynth_3D.py", line 129, in run_rmsynth
phiArr_radm2 = np.linspace(startPhi_radm2, stopPhi_radm2, nChanRM)
File "<array_function internals>", line 6, in linspace
File "/anaconda3/lib/python3.7/site-packages/numpy/core/function_base.py", line 143, in linspace
y = _nx.arange(0, num, dtype=dt).reshape((-1,) + (1,) * ndim(delta))
ValueError: Maximum allowed size exceeded
(base) Jacobs-MacBook-Pro:RMtools_3D jacobaskew$ '

The data file sizes are as follows: (I ran ls -l *)

' -rw-r--r-- 1 jacobaskew staff 225288000 6 Dec 15:40 J093120+582050_Q_cm_pb_cube_sub.fits '

' -rw-r--r-- 1 jacobaskew staff 225288000 6 Dec 15:40 J093120+582050_U_cm_pb_cube_sub.fits '

If you need more information or there is anything else I can do to speed up the process of solving this issue then let me know!

Thanks so much! I hope this is very simple and I am just doing something silly!

Auto-flagging input Stokes cube channels

I would like the option of auto-flagging bad I, Q, U channels after loading in the Stokes cube or the Stokes image planes. For example, a user-specified threshold based on the median rms noise in all channels.
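A sketch of the kind of auto-flagging this could be (the 3-sigma factor and the [chan, y, x] cube layout are assumptions):

import numpy as np

def flag_noisy_channels(cube, threshold_factor=3.0):
    # Per-channel rms compared against the median rms over all channels
    rms = np.nanstd(cube, axis=(1, 2))
    bad = rms > threshold_factor * np.median(rms)
    cube[bad] = np.nan  # NaN-flagged channels are already handled downstream
    return cube, bad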
