lsstdesc / descqa
A validation framework and tests for mock galaxy catalogs and beyond
Home Page: https://portal.nersc.gov/cfs/lsst/descqa/v2/
License: BSD 3-Clause "New" or "Revised" License
We propose to create a new subdirectory, named descqagen, which will host code that generates validation datasets, for example, code to query the HSC database. descqagen would hence be part of, but mostly independent from, other components of the DESCQA framework.
@duncandc agrees to be the first volunteer to push his code in.
See LSSTDESC/DC2-production#21 for detail and validation data from HSC.
Tests should refrain from customizing plotting styles (colors, figure size, font size, etc.). This will help plot styles look more consistent across different tests. If we later want to adjust plotting styles, we can do it for all tests in descqa/plotting.py.
Each test can still make minor adjustments (e.g., marker styles and sizes, line styles) to fit its needs.
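As a rough illustration of what such a shared-defaults module could look like, here is a minimal sketch. The dictionary contents, the helper name `set_default_style`, and the specific rcParams chosen are all assumptions for illustration, not the actual contents of descqa/plotting.py:

```python
# Hypothetical sketch of centralized plot defaults for descqa/plotting.py.
# Tests would call set_default_style() instead of setting styles themselves;
# the specific values below are illustrative assumptions.
import matplotlib
matplotlib.use("Agg")  # non-interactive backend for batch runs
import matplotlib.pyplot as plt

SHARED_STYLE = {
    "figure.figsize": (6.4, 4.8),
    "font.size": 11,
    "axes.prop_cycle": plt.cycler(color=plt.cm.tab10.colors),
}

def set_default_style():
    """Apply shared DESCQA plotting defaults; tests may still tweak markers/lines."""
    plt.rcParams.update(SHARED_STYLE)
```

Individual tests would then only touch per-plot options (marker style, line style) on their own axes.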
Moving the conversation started in #10 here. The idea is that we would like to make sure that the CL WG goal of investigating miscentering in DC2 is possible by validating the radial profiles of cluster members.
One way to do this is just to directly measure the radial profiles in cluster-mass halos. This should be easy to implement, but is going to be difficult to compare to data, since any profiles measured there will have significant projection effects.
The other thing to do is to measure color dependent clustering, which will be easier to find validation data for, but less directly tests what we care about.
The issue that @evevkovacs encountered was due to the fact that the color cut implementation in the CLF test assumes the catalog does not iterate over redshift blocks. When a catalog iterates over redshift blocks, some blocks may not cover the redshift ranges needed to determine the color cut, which results in an error.
This can be fixed if the test first determines the color cut using all blocks.
Since these are related, we should check that the shears, convergence, and magnification are self-consistent (within machine precision). The test is that the magnification should satisfy
`1/magnification = (1 - kappa)**2 - shear1**2 - shear2**2`
(kappa = convergence).
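A minimal sketch of this self-consistency check, assuming the quantities have already been read from the catalog as arrays (the function name and tolerance are illustrative):

```python
# Sketch of the proposed lensing self-consistency check:
# 1/mu should equal (1 - kappa)**2 - shear1**2 - shear2**2 to machine precision.
import numpy as np

def check_lensing_consistency(kappa, shear1, shear2, magnification, rtol=1e-10):
    """Return True if 1/mu matches (1-kappa)^2 - g1^2 - g2^2 within rtol."""
    inv_mu = (1.0 - kappa) ** 2 - shear1 ** 2 - shear2 ** 2
    return bool(np.allclose(1.0 / magnification, inv_mu, rtol=rtol))

# Example with synthetic, self-consistent values:
kappa = np.array([0.01, 0.05])
g1 = np.array([0.02, -0.01])
g2 = np.array([0.005, 0.03])
mu = 1.0 / ((1.0 - kappa) ** 2 - g1 ** 2 - g2 ** 2)
```

A catalog that stores an independently computed magnification would fail this check if any of the sign conventions or definitions were inconsistent.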
@yymao Please take a look at https://github.com/evevkovacs/descqa/blob/ready_SEDS/SED_test.ipynb
The distributions are not as unreasonable looking as I expected, if we convert the SEDs to magnitudes. I am thinking that we can add a keyword to the yaml file such as function: -2.5*np.log10, which tells the readiness test to first evaluate the function on the catalog data before plotting it or computing any other statistics. What do you think?
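One safe way to support such a keyword is to map a small whitelist of allowed function strings to callables, rather than evaluating arbitrary expressions. This is an illustrative sketch, not the actual readiness-test implementation; the keyword name and whitelist contents are assumptions:

```python
# Hedged sketch of the proposed 'function' keyword: map a whitelisted string
# from the YAML config to a transformation applied before computing statistics.
import numpy as np

TRANSFORMS = {
    "-2.5*np.log10": lambda x: -2.5 * np.log10(x),
    "np.log10": np.log10,
}

def load_quantity(data, config):
    """Apply the configured transformation (if any) before plotting/statistics."""
    func_name = config.get("function")
    if func_name is not None:
        data = TRANSFORMS[func_name](data)  # whitelist avoids arbitrary eval
    return data

sed = np.array([1e-9, 1e-8])  # toy SED flux values
mags = load_quantity(sed, {"function": "-2.5*np.log10"})
```

The whitelist keeps the YAML config expressive without letting it execute arbitrary code.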
I'm going to create a tagged version of DESCQA for Run 1.1. (While DESCQA is not directly used in the run, it documents the validation tests that the catalog going in has passed, so I think it is good to have a tagged version too.)
Are there any issues/PRs that should be resolved before we make this tag? Maybe #60 and #64? Do we think it is reasonable to close them before Friday? Or should we leave them for the next tag (i.e., not include them in Run 1.1)?
@erykoff has made some plots of red sequence colors (mean, scatter) as a function of redshift for protoDC2 and compared them with DES data.
@erykoff, can you share your plots here, and then we can continue to make a validation test from what you have done?
As discussed in #40, we open this issue to discuss implementing a color-color diagram test in DESCQA. According to @janewman-pitt-edu this would not be a required test but it would be nice to have for visual inspection. (cc'ing @morriscb @sschmidt23)
@nsevilla, it seems to me that you have implemented this test to some extent. Will you be able to port it into DESCQA? Let me know if you need any help.
The purpose of this issue is to understand the shape of the distribution of galaxy stellar masses in the protoDC2 catalogues, as we have been discussing with @reneehlozek. In the previous version, there was a small bump at about 10^10 solar masses, and it was not obvious to me why it was there. In the most recent version, that bump is still there, and there is additional skewness at lower masses around 10^8 solar masses. This second bump may exist because the peak of the distribution was at about 10^8 solar masses in version 2, while the peak is currently at much lower masses, around 10^(6-7) solar masses. The histograms are shown below:
I checked the stellar mass distribution of SDSS galaxies from Maraston et al. 2013, which plotted the distributions for SDSS BOSS and CMASS galaxies fit with different templates. These distributions do not have any obvious bumps. The figures from the paper are shown below for the BOSS and CMASS catalogues:
I also know that the CMASS galaxies are biased toward higher-mass galaxies, but it would be good to know why protoDC2 galaxies peak at such small stellar masses.
@EiffL, @patricialarsen, and @jablazek have been working on getting 2pt correlation (e.g. ellipticity-direction) for IA model testing.
Can one of you list the items that we should test? Then we can see if we need a separate issue for each of them.
Some techniques used here are related to #10 #35 but the purpose and validation datasets would be different.
P.S. @patricialarsen I cannot assign you. Please register your GitHub account to the DESC roster.
@cwwalter requested in #127 a test to check the amplitude of tree rings in 1.2i. We have code from @karpov-sv here: https://github.com/karpov-sv/lsst-misc/blob/master/Tree_Rings_Analysis.ipynb
We'd need to adapt this code to use the e-image reader in GCR. @andrewkbradshaw, would you like to try to adapt this code to DESCQA?
Currently the redshift distribution test, N(z), only goes to z = 1. We should extend the redshift range to z = 3 to make sure things look reasonable beyond z = 1. This is particularly important for cosmoDC2.
The validation data is valid up to around z = 1.5, so we can plot the redshift range out to z = 3, but should not use the data beyond z = 1.5 to calculate the chi^2.
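Restricting the chi^2 to the trusted range while still plotting the full range can be sketched as below. Variable names are illustrative, and the real test would use a full covariance rather than independent errorbars:

```python
# Sketch: compute chi^2 only over bins where the validation data is trusted
# (z <= z_max_valid), even though plots extend to higher redshift.
import numpy as np

def chi2_valid_range(z_centers, nz_catalog, nz_data, nz_err, z_max_valid=1.5):
    """Chi^2 between catalog and validation N(z), using only z <= z_max_valid."""
    mask = z_centers <= z_max_valid
    resid = (nz_catalog[mask] - nz_data[mask]) / nz_err[mask]
    return float(np.sum(resid ** 2))
```

Bins beyond z = 1.5 then contribute nothing to the pass/fail score, however discrepant they look in the plot.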
Users should have a way to query the meta-data to find out definitions such as the ellipticity definition etc., or just to see what meta-data is available.
I know @evevkovacs is working on the SMF test so I opened this issue to track progress.
This is to test whether the catalog follows the shear sign convention.
As proposed in LSSTDESC/cosmodc2#19, we'd like to add functionality to the readiness test to check uniqueness for columns that contain unique identifiers, such as galaxy_id and halo_id.
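The check itself is simple; a minimal sketch (the helper name is illustrative, and the real readiness test would iterate over the catalog's configured ID columns):

```python
# Sketch of a uniqueness check for identifier columns such as galaxy_id.
import numpy as np

def check_unique(values):
    """Return (is_unique, n_duplicates) for an array of identifiers."""
    n_unique = np.unique(values).size
    return n_unique == values.size, values.size - n_unique
```

Reporting the duplicate count (not just pass/fail) makes the failure mode easier to diagnose in the catalog.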
@rmandelb is the idea to do a conditional luminosity function using true halos?
This is a required validation test from the WL WG @msimet. The distribution of position angle should be a uniform distribution.
This is a very simple test and is certainly a new-comer-welcome task.
This is a required validation test from the WL WG @msimet. This test is to check the distribution (mean, scatter, etc) of galaxy sizes. This test is somewhat related to the size-luminosity test #13.
Proposed validation data set is COSMOS.
I believe @evevkovacs is already working on this. This issue is to track progress.
The purpose of this test will be to check the consistency of the instance catalog creation and image generation with the input quantities in the protoDC2 catalog. This can be done by generating a small instance catalog and one exposure with ImSim and checking the measured sizes and ellipticities of the galaxies in the field. The statistics will not be very high but it will catch obvious failures.
This test will also be able to check for obvious failures in the model for complex morphologies slated to be included in the ImSim version of DC2.
See LSSTDESC/DC2-production#31 for detail.
DC2 validation test brainstorming by @rmjarvis and @fjaviersanchez:
Image level (@rmjarvis):
Check that the images contain some pixels above the 10-sigma level.
Calculate gain and read noise and compare with prediction.
Check masked (saturated) bits of the images.
Check masked (bad/dead) pixels -> PhoSim.
Catalog (visit level):
Use stars and PSF magnitudes to compute the CheckAstroPhoto test (using the standalone test check in DC2-production#259). Update 09/09/18: done in standalone code.
Star size vs. magnitude at different epochs should be flat (use HSM size/sdssShape). Use a scatter plot for every single star. Update 09/09/18: done in standalone code.
Given a calexp, select a clean stellar sample, check the PSF at each location (position of the star), and check the stacked difference (low priority).
Select a set of calexps and check that the input seeing is correlated with the size of the stars appearing in them. Update 09/10/18: done in standalone code.
DCR test: translate the shape of the star to get the shape on the zenith direction for a bunch of good stars, separate per band, and check this as a function of airmass.
DCR test: repeat that splitting the sample into redder and bluer stars.
Catalog (coadd level):
Separate stars and galaxies and use them in CheckAstroPhoto.
In CheckAstroPhoto, add the input N(m) and the output N(m), check the ratio, and see when they start to separate from each other (in progress, see here).
Check that galaxy density decreases with MW extinction (#140).
Check color-color diagram for input and output for several colors (inspect to validate) -> (#141)
Add input-true size as a function of true size.
Count the number of objects around a central galaxy in a given aperture (1 arcmin) and represent that as a function of the cluster richness (something in the input???).
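Several of the items above reduce to checking that some trend is flat. As one concrete example, the star-size-vs-magnitude item could be sketched as a simple slope test (the tolerance is an illustrative placeholder, not a WG-approved criterion):

```python
# Sketch of the 'star size vs. magnitude should be flat' check:
# fit a line to size(mag) for stars and flag a significant slope.
import numpy as np

def check_flat(mag, size, slope_tol=1e-3):
    """Return (passed, slope) from a least-squares fit of size vs. magnitude."""
    slope, _intercept = np.polyfit(mag, size, 1)
    return abs(slope) < slope_tol, float(slope)
```

A real test would also propagate per-star measurement errors into the slope uncertainty before flagging.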
To help future maintainers, we will create flowcharts to illustrate the code structure of descqarun and descqaweb. The flowcharts will provide a high-level picture of how different components of the DESCQA framework are connected.
(cc'ing people who might be interested in this issue: @tomuram @evevkovacs @ehneilsen)
During the Sprint Week, @DouglasLeeTucker and @saharallam have made progress on creating color magnitude diagrams for GCR Catalogs (protoDC2 and Buzzard).
@DouglasLeeTucker and @saharallam, can you share some plots you made with us here, and then we can continue to make a validation test from what you have done?
Now that we have a readiness test in place (you can see a demo here), we need to finalize the acceptable ranges for each quantity in protoDC2, especially those used by the image generation.
All the acceptable ranges are set in this YAML config file. @evevkovacs and I have put in many quantities, but I don't think we have exhausted them, and some of them do not have acceptable ranges specified.
Once this is done, we can sign off protoDC2 2.1.2. I assign this to @evevkovacs and @dkorytov, but I think we might also need help from @danielsf and @abensonca. Also cc'ing @rmandelb @katrinheitmann @jchiang87 for their information.
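For orientation, the range check itself could look like the sketch below. The config schema here (quantity name mapped to a [min, max] pair) is an assumption for illustration, not the actual YAML structure:

```python
# Illustrative sketch of checking catalog quantities against configured
# acceptable [min, max] ranges; the config schema is assumed, not actual.
import numpy as np

def check_ranges(catalog_data, ranges):
    """Return a list of (quantity, reason) failures against configured bounds."""
    failures = []
    for name, (lo, hi) in ranges.items():
        values = catalog_data[name]
        if np.nanmin(values) < lo:
            failures.append((name, "below minimum"))
        if np.nanmax(values) > hi:
            failures.append((name, "above maximum"))
    return failures
```

An empty failure list for all image-generation quantities would be the sign-off condition.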
As discussed in #10 and #63, we'd like to make sure the catalog has reasonable color-dependent galaxy-galaxy clustering signals.
This should be relatively easy to implement: we can use the current galaxy-galaxy clustering and just implement color-selection. We can use SDSS data as validation data. Validation criteria TBD.
This issue is to track progress on the implementation of the test that checks the galaxy density as a function of the MW extinction. The test will be most relevant for the DC2 object catalogs, although in principle it can be applied to any catalog.
This issue is for some bugs (at least two) in the dN/dz test. @yymao and I have been discussing this on slack, but it needs a bit more digging, so we agreed to open an issue (currently assigned to the two of us).
At the DC2 telecon today it was clear that the errorbars on the dN/dz test were wrong (an order of magnitude larger than the scatter in the data points). Yao and I identified a few things going on:
This is not a bug, but simply misuse of the test: the latest runs were trying to set the number of z bins using the wrong parameter, so it wasn't getting set (it was setting Nbins when it should have been N_zbins). The configuration being used was also such that the errorbar determination could not be particularly stable: the number of jackknife regions should be greater than the number of data points, which was not the case with the settings being used. He and I have since been exploring the use of 15 regions and 8 redshift bins, which should be more stable.
One bug is that when normed=True, the errorbars have some problems (see run here with the above configuration issues fixed). I suspect the histogram normalization scheme may be inconsistent between the jackknife calculation and the overall histogram, which would cause the covariance to be misestimated.
Another bug is that when normed=False, the data and sims have inconsistent histogram normalization schemes, so they cannot be compared. You can see this in another run here. However, note that in this case the errorbars on the simulation histogram actually seem a lot more reasonable! This supports the conclusion above that the normed=True plots have a bug in the errorbar calculation.
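One way to rule out the suspected inconsistency is to normalize every jackknife realization exactly the same way as the full histogram before taking the scatter. This is a sketch of that idea, not the actual dN/dz test code; the helper names are illustrative:

```python
# Sketch of a fix for the suspected normed=True inconsistency: apply the same
# density normalization to each leave-one-region-out realization, then take
# the jackknife scatter of the normalized histograms.
import numpy as np

def normalized_hist(z, bins):
    """Histogram normalized to unit integral (density convention)."""
    counts, edges = np.histogram(z, bins=bins)
    return counts / counts.sum() / np.diff(edges)

def jackknife_errors(z, region_labels, bins):
    """Leave-one-region-out jackknife errors on the normalized histogram."""
    regions = np.unique(region_labels)
    n = regions.size
    reals = np.array([normalized_hist(z[region_labels != r], bins) for r in regions])
    return np.sqrt((n - 1) / n * np.sum((reals - reals.mean(axis=0)) ** 2, axis=0))
```

Because the realizations and the full histogram share one normalization routine, the covariance and the central values are guaranteed to be on the same scale.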
For testing IA models, it would be useful to test the 2-pt functions. We plan to use treecorr for the calculation. We plan to start with using analytical calculation as the validation dataset (@jablazek @patricialarsen). (Edited by Rachel to reflect the focus on cosmological shear-shear, not intrinsic alignments.)
Create a new test result category, "for reference" (needs a fancier name), for tests that may generate plots and outputs but do not intend to implement specific validation criteria, and for which a pass/fail label does not make sense.
See more details in LSSTDESC/DC2-production#20.
Note that the wp(rp) code in v1 does not work on light cones. When we have proto-dc2 snapshots we can use the old code. In the meantime we should find new correlation code for light cones.
There have been some conversations about potentially using the CosmoDC2 input catalogs to also seed WFIRST image simulations, in order to enable tests of a joint analysis of images from both surveys. Along the way, it has become evident that the currently selected population is overall around 1/4 to 1/2 magnitude too bright in the J and H bands.
In the future, we should include NIR and maybe other colors in the selection criteria (cf. LSSTDESC/cosmodc2#36) to make sure we aren't producing a subtly unrealistic galaxy population. This should be coupled with corresponding tests in descqa that things like N(m) and colors are reasonable for non-optical bandpasses.
The DC2 sprinkler will need a strongly lensed AGN catalog and a strongly lensed SNe catalog. We need to write a reader and some tests for verification of these catalogs.
See this slide for context. We want to make sure the truth catalog faithfully reproduces the galaxies in the extragalactic catalog. This verification test will simply compare the galaxy properties in the truth catalog against those in the extragalactic catalog.
Since both the extragalactic and truth catalogs are now available in GCRCatalogs, this should be pretty straightforward.
(cc @fjaviersanchez @danielsf @jbkalmbach @katrinheitmann @evevkovacs)
As discussed in #10, the PZ WG wants to check if the galaxy bias has a reasonable redshift dependence. @morriscb @sschmidt23 can provide validation data and criteria. The test itself still needs to be implemented.
This issue is for developing tests to understand how well the gg lensing signal changes as a function of lens properties (stellar mass, luminosity, colors etc.).
A few references that are worth exploring:
This would be similar to this earlier test: https://github.com/LSSTDESC/descqa/blob/master/descqa/DeltaSigmaTest.py#L51, but we'd like it to be more general.
Now that we are preparing for Run 1.1, we should make sure there is no obvious bug (like incorrect units, labels, etc) in the catalog before the catalog goes into the image pipeline. @evevkovacs @katrinheitmann and I think it would be a good idea to have a "readiness test" that checks if the quantities that are used by CatSim have reasonable distributions (for example, max, min, mean, std, and histogram for visual inspection).
This test needs to be done by 1/19 (hopefully well before that).
Incorporate into DESCQA the work done in PZ WG to plot BPT diagram for emission lines.
Make DESCQA an importable Python package to make developing validation tests easier and more convenient. Here's the proposed module structure:
descqa/
__init__.py
master.py
archiver.py
tests/
__init__.py
base.py
utils.py
...
configs/
@patricialarsen (cc @EiffL) following the update to the shear sign convention in the GCRCatalogs (see
LSSTDESC/gcr-catalogs#111) we should make sure the shear-related tests also follow the same convention (that is, the treecorr/GalSim convention).
We need to, for example, remove the minus sign in
https://github.com/LSSTDESC/descqa/blob/master/descqa/shear_test.py#L172
https://github.com/LSSTDESC/descqa/blob/master/descqa/shear_test.py#L273
(and maybe other places; I haven't done an extensive check).
I was looking at one of the DESCQA runs here and was wondering if it is possible to have the actual COSMOS catalog to test the scaling of the ellipticity distribution with type and luminosity (using more absolute magnitude bins and different magnitude cuts).
Some AGNs are excessively bright and cause issues in image simulation; hence, we should implement a validation test to filter out the instance catalogs that contain offending AGNs (magNorm being too small).
We can apply the test to /global/projecta/projectdirs/lsst/production/DC2/Run1.2p/phosimInput/.
One example of an instance catalog that contains offending AGNs is DC2-R1-2p-WFD-u/000054/.
(cc @katrinheitmann)
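A minimal sketch of the scan is below. It assumes instance-catalog "object" lines carry magNorm as the fifth whitespace-separated field; the threshold and the function name are placeholders, and the real test would also restrict the scan to AGN entries:

```python
# Hedged sketch: flag instance-catalog object lines whose magNorm is too
# small (i.e., the object is excessively bright). Field layout is assumed.
def find_bright_objects(lines, mag_norm_min=10.0):
    """Return (line_number, magNorm) for object lines brighter than the cut."""
    offenders = []
    for i, line in enumerate(lines, start=1):
        fields = line.split()
        if fields and fields[0] == "object":
            mag_norm = float(fields[4])
            if mag_norm < mag_norm_min:
                offenders.append((i, mag_norm))
    return offenders
```

Running this over each catalog under the phosimInput directory would produce a list of files to exclude or regenerate.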
As it is now, the test checks for mag_lsst. We can do a quick update to CheckColors so it is compatible with the DM outputs.
This epic issue serves as the general discussion thread for all validation tests on the extragalactic catalogs in the DC2 era.
Note: Please feel free to edit the tables in this particular comment of mine, since we will use them to keep track of the progress of the validation tests.
➡️ Required tests that we have identified (for DC2):
Test | WGs | Implemented | Validation Data | Criteria | "Eyeball" Check by WG | Issue |
---|---|---|---|---|---|---|
p(e) | WL | ✔️ @evevkovacs | ✔️ COSMOS | ✔️ | ✔️ (WL @rmjarvis ) | #14 #81 |
p(position angle) | WL | ✔️ @msimet | ✔️ uniform | ✔️ | ✔️ (WL @msimet) | #76 #82 |
size distribution | WL | ✔️ @msimet | ✔️ COSMOS | ✔️ | ✔️ (WL @msimet) | #77 #80 |
size-luminosity | WL | ✅ @vvinuv | ✔️ vdW+14, COSMOS | ✔️ | ✔️ (WL @msimet) | #13 #56 |
shear 2pt corr. | WL | ✔️ @patricialarsen | ✔️ camb | ✔️ | ✔️ (WL @patricialarsen) | #35 #54 |
N(z) | PZ, LSS | ✔️ @evevkovacs | ✔️ DEEP2 | ✔️ | ✔️ (PZ @sschmidt23 ) | #11 #107 |
dN/dmag | WL, LSS | ✔️ @duncandc | ✔️ HSC | ✔️ | ✔️ (WL @rmandelb ) | #7 #47 |
red sequence colors | CL | ✅ @j-dr | ✔️ DES Y1 | ✔️ | ✔️ (CL @erykoff ) | #41 #101 |
CLF | CL | ✅ @chto | ✔️ SDSS | ❔ | ❔ | #9 #102 |
galaxy-galaxy corr | WL, LSS, TJP | ✔️ @vvinuv @morriscb | ✔️ SDSS, DEEP2 | ✖️ | ✔️ (LSS @slosar ) | #10 #38 |
color-dependent clustering | CL | ✔️ @yymao | ✔️ SDSS | ✖️ | ✔️ (LSS @slosar ) | #73 #100 |
galaxy bias(z) | PZ | ✅ @fjaviersanchez | ✔️ CCL | ❔ | ❔ | #75 #87 |
color distribution | PZ, CL, LSS | ✔️ @rongpu | ✔️ SDSS, DEEP2 | ❔ | ✔️ (PZ @sschmidt23 ) | #15 #89 |
shear-galaxy corr. | TJP, WL | ✔️ @EiffL | ✔️ SDSS | ❔ | ❔ | #118 |
stellar mass function | - | ✔️ @evevkovacs | ✔️ PRIMUS | ❔ | ❔ | #49 |
cluster stellar mass distribution | CL, SL | ✅ @Andromedanita | ✔️ BOSS, CMASS | ❔ | ❔ | #109 |
color-color diagram | PZ | ✔️ @nsevilla | ✔️ SDSS | ❔ | ❔ | #74 #88 |
➡️ Tests that are not currently required but good to have:
Test | WGs | Implemented | Validation Data | Validation Criteria | Issue |
---|---|---|---|---|---|
color-mag diagram | PZ, CL | ❓ @DouglasLeeTucker @saharallam | ✔️ SDSS / not required | not required | #40 |
Cluster radial profiles | CL | ❔ | ❔ | ❔ | #63 |
IA 2-pt corr. | TJP | ✖️ @EiffL | ❓ @jablazek | ❔ | #42 |
emission line galaxies | PZ, LSS | ✖️ @adam-broussard | ❓ DEEP2 | ❔ | #12 |
Analysis WGs are encouraged to join this discussion and to provide feedback on these validation tests. This epic issue is assigned to the Analysis Coordinator @rmandelb, and will be closed when the Coordinator deems that we have implemented a reasonable set of validation tests and corresponding criteria for DC2.
@yymao, @evevkovacs, and @katrinheitmann can provide support to the implementation of these validation tests in the DESCQA framework. In addition to GitHub issues, discussions can also take place on the #desc-qa channel on LSSTC Slack.
P.S. The corresponding issue in DC2_Repo is LSSTDESC/DC2-production#30