
scottransom commented on August 16, 2024

Yes, this is expected behavior. The short reason is that because you are dedispersing, we are missing some of the data needed to output the full duration of the observation.

Since you are dedispersing to a positive DM, you are assuming that the low-frequency data arrive later than the high-frequency data. That's fine at the beginning of the observation, where we can simply grab later samples at lower frequencies to add to the initial sample at the highest frequency. But it doesn't work at the end of the observation, where we have no low-frequency data recorded! Since we don't know what to add in order to properly dedisperse, we simply ignore the parts of the original file where we don't have all of the frequencies we need.

This problem gets worse and worse (meaning you lose a bigger and bigger fraction of your input data) when 1) the observation duration is relatively short, and 2) the total amount of dispersion delay (in time) between the top and bottom of the observing band is long.

In your case, for a DM of 300, the differential delay across the band (1200-1800 MHz) is 0.48 seconds, so you will always lose at least that much data.
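For reference, that 0.48 s figure follows from the usual cold-plasma dispersion law; here is a minimal sketch (the constant 4.148808e3 is the standard dispersion constant in s MHz^2 pc^-1 cm^3):

```python
K_DM = 4.148808e3  # dispersion constant, s * MHz^2 * cm^3 / pc

def diff_delay(dm, f_lo_mhz, f_hi_mhz):
    """Delay (s) of the lowest channel relative to the highest."""
    return K_DM * dm * (1.0 / f_lo_mhz**2 - 1.0 / f_hi_mhz**2)

print(round(diff_delay(300.0, 1200.0, 1800.0), 3))  # -> 0.48
```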

However, there is another issue: prepsubband (and prepdata) work on blocks of (by default) 2400 samples for filterbank data (see the Spectra/Subint number in the output). So you will often get an integer number of blocks of that duration out.
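Combining those two effects, the usable output length is roughly the input length minus the dispersion delay, rounded down to whole blocks. This is only an illustrative sketch (the 1000-sample delay below is a made-up figure, not one computed from this observation's sample time), but it shows the rounding idea:

```python
def usable_samples(n_input, n_delay_samples, block=2400):
    """Samples surviving dedispersion, rounded down to whole blocks.

    Illustrative only: the real bookkeeping inside prepsubband is
    more involved, but the block-rounding effect is the same idea.
    """
    return ((n_input - n_delay_samples) // block) * block

# e.g. 29868 input samples minus a hypothetical ~1000 samples of delay:
print(usable_samples(29868, 1000))  # -> 28800 (12 full blocks of 2400)
```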

And finally, both of those commands automatically choose a highly factorable number of output data points so that FFTs on the resulting data will be efficient.
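A quick way to see what "highly factorable" means: FFTs are fastest on lengths whose prime factors are all small. The sketch below just tests that property (the choice of primes here is an assumption; PRESTO's actual length-selection logic may use a different rule):

```python
def is_smooth(n, primes=(2, 3, 5, 7)):
    """True if n has no prime factors beyond `primes` (FFT-friendly)."""
    for p in primes:
        while n % p == 0:
            n //= p
    return n == 1

# The "good number of samples" from the thread factors nicely:
print(is_smooth(30240))  # True: 30240 = 2**5 * 3**3 * 5 * 7
print(is_smooth(22))     # False: contains the factor 11
```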

In your initial case you are seeing all of these effects. You can mitigate some of this (but you can't get back data you don't have because of dispersion) by specifying the number of output data points with the -numout option. If you set -numout equal to the number of input data points, try looking at the end of the resulting .dat file with the exploredat command: you will see how the noise in the data changes as less and less data is available to do the full and proper dedispersion.

Hope this helps!

from presto.

Newtonlml commented on August 16, 2024

Thank you very much for your response.

I understood your explanation and it makes sense. I tried your suggestion of looking into the .dat file, but I don't see anything weird.

Is there a way to predict how much data I will lose? Like you said, from the DM and frequency range I can obtain the minimum amount of data I will lose; in the example you mentioned it was ~0.5 seconds, but I am losing around a minute of data. I assume this is due to the integer number of blocks that you mentioned. When executing prepsubband it says that for my original number of samples the "good number of samples" to work with is 30240, but the data written only gets to 24000. Is there a way to know how many blocks I will lose?


scottransom commented on August 16, 2024

Have you tried specifying -numout? And where does it say "30240"?


Newtonlml commented on August 16, 2024

Yes, I have tried setting -numout equal to the original number of samples.
I changed to a more recent PRESTO version, so now the outputs look like this:
prepsubband -nobary -lodm 300 -dmstep 1 -numdms 1 -downsamp 1 -nsub 2048 -runavg -numout 29868 -o prep_output 2022-09-10_11_15_25.404305.fil
[screenshot: prepsubband output with -numout]
And this is without -numout (where the 30240 comes from):
[screenshot: prepsubband output without -numout]
As you can see, the number of data points written is 24000 in both cases, with the rest padded. Here is a screenshot of exploredat when -numout is specified, but it looks the same when it is not:
[screenshot: exploredat view of the .dat file]


scottransom commented on August 16, 2024

One other thing for you to try: can you use prepdata to make a single time series rather than prepsubband? The latter is quite a bit more complicated (since it allows you to output multiple time series at once), so there might be a bit of extra overhead there.


Newtonlml commented on August 16, 2024

I tried prepdata and it doesn't remove as many samples as prepsubband. Whether or not I specify -numout, it writes 28800 samples instead of the 24000 from prepsubband.
[screenshot: prepdata output]

This amount of data lost makes more sense given the number of samples per data block and the minimum amount of data I should lose in the process. So it seems that prepsubband may be removing more samples in order to optimize the output of many time series? In that case prepdata seems to be the best command for my use case.
Thank you very much for your help :)

