
gzcandels_datapaper's Issues

Discussion of resolution issues

There isn't one, and there should be. At what redshifts are we resolving what features? What constraints does a "smooth" classification put on feature sizes? For extended sources, what does a plot of size versus p_artifact look like?
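For the last question, a minimal sketch of the plot I have in mind (the catalog file and the size/p_artifact column names are placeholders, not our actual schema):

```python
# Sketch of the size vs. p_artifact plot; the catalog file and the
# 're_kpc' / 'p_artifact' column names are placeholders, not the real schema.
import matplotlib.pyplot as plt
from astropy.table import Table

cat = Table.read("gz_candels_catalog.fits")
extended = cat[cat["re_kpc"] > 0.0]        # keep sources with a measured size

fig, ax = plt.subplots()
ax.scatter(extended["re_kpc"], extended["p_artifact"], s=4, alpha=0.3)
ax.set_xscale("log")
ax.set_xlabel("half-light radius [kpc]")
ax.set_ylabel("p_artifact")
fig.savefig("size_vs_p_artifact.png", dpi=150)
```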

Add pie charts to paper

I'm thinking we should start the "Use of classifications in practice" section with this, as in "this is overall, but it contains no appropriate selections on redshift, luminosity, mass, etc.; here's what you do if you want to do a specific study".

What do you think? @willettk

triple-check for compliance with definition of terms in S3.1

Check that we haven't used any of the following terms wrongly (with what we should use in parentheses):

  • volunteer (classifier)
  • user (classifier)
    The only uses of these should be exceptions based on the context.

Also check to make sure all these easily-confused terms are being used correctly:

  • task vs question (a task is a unit in a workflow and can contain a question, but not vice-versa)
  • response vs answer (out of N possible responses to a question in a task, a classifier may only select 1 answer)

Fill in acknowledgments

Send emails asking people for their acknowledgments
I have an Einstein acknowledgment now...

Section 4: use Spearman's rho, not Pearson r

Spearman's is more robust when the data are not well-behaved (e.g. not Gaussian distributed). The two values are nearly identical here, so we should report the one that's marginally more appropriate.
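As a sanity check before switching, a minimal sketch comparing the two statistics with scipy; the vote-fraction arrays below are random stand-ins, not our data:

```python
# Stand-in comparison of Pearson r and Spearman rho with scipy;
# f_gz / f_k15 are random placeholders for the real vote fractions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
f_gz = rng.uniform(0, 1, 500)                          # placeholder GZ fractions
f_k15 = np.clip(f_gz + rng.normal(0, 0.1, 500), 0, 1)  # placeholder K15 fractions

r, p_pearson = stats.pearsonr(f_gz, f_k15)
rho, p_spearman = stats.spearmanr(f_gz, f_k15)
print(f"Pearson  r   = {r:.3f} (p = {p_pearson:.2g})")
print(f"Spearman rho = {rho:.3f} (p = {p_spearman:.2g})")
```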

Need an example figure with colour?

Early on, Fig. 4 is referenced as showing the colour images. It doesn't. Also, you can't point to Fig. 4 before Figs. 1, 2, and 3; MNRAS won't allow it.

double-check consistencies

I have spot-checked my way through several consistency calculations, but am I really sure of this? The consistency distribution has 5 people with consistency < 0.2 and otherwise cuts off sharply at just above 0.3. GZ2 didn't seem to do that, which is troubling.

Electronic format

What's the ultimate release format of the data? The paper says it'll be on http://data.galaxyzoo.org, which we should do, but I want to include it in more formats for posterity's sake.

Possible options:

  • http://data.galaxyzoo.org (CSV/FITS/HDF5/other; see the export sketch after this list)
  • supplementary material to astro-ph
  • supplementary material to MNRAS
  • VizieR/CDS
  • MAST
  • CANDELS data access page
  • other?
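Whatever venue we settle on, writing the catalog out in the candidate file formats is easy with astropy; a minimal sketch (file names are placeholders, and the HDF5 write needs h5py installed):

```python
# Sketch of exporting the catalog in the formats above with astropy;
# file names are placeholders, and the HDF5 write requires h5py.
from astropy.table import Table

cat = Table.read("gz_candels_catalog.fits")

cat.write("gz_candels.csv", format="ascii.csv", overwrite=True)
cat.write("gz_candels.fits", overwrite=True)
cat.write("gz_candels.hdf5", format="hdf5", path="catalog", overwrite=True)
```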

Figure 6: stack 2D plots? interactive 3D plots?

The current Fig. 6 is not successful at convincing people of the quality of the fits to \Delta f_value as a function of f_value and surface brightness. A simple fix is to just plot the planes as two line plots for each response (and probably include fewer responses), but we could explore having interactive 3D plots (which apparently MNRAS supports). Does anyone know how to do this? @willettk @rjsmethurst @chrislintott @CKrawczyk etc?

If not, I'm tempted to just do the two 2D plots, because I don't want to get bogged down in this.
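For the 2D option, roughly what I mean is this sketch: slice each fitted plane at a few fixed surface brightnesses and a few fixed vote fractions (the plane coefficients below are made up for one response, not our fits):

```python
# Sketch of the two-panel 2D version: slice the fitted plane
# delta_f = a*f + b*mu + c at fixed surface brightnesses and fixed vote
# fractions. The coefficients are made-up placeholders for one response.
import numpy as np
import matplotlib.pyplot as plt

a, b, c = 0.1, -0.02, 0.05              # placeholder plane coefficients
f = np.linspace(0, 1, 100)              # vote fraction
mu = np.linspace(22, 26, 100)           # surface brightness [mag/arcsec^2]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3.5))
for mu0 in (22, 24, 26):                # delta_f vs f at fixed surface brightness
    ax1.plot(f, a * f + b * mu0 + c, label=rf"$\mu = {mu0}$")
for f0 in (0.2, 0.5, 0.8):              # delta_f vs mu at fixed vote fraction
    ax2.plot(mu, a * f0 + b * mu + c, label=rf"$f = {f0}$")

ax1.set_xlabel("$f$")
ax1.set_ylabel(r"$\Delta f$")
ax1.legend()
ax2.set_xlabel(r"$\mu$ [mag arcsec$^{-2}$]")
ax2.set_ylabel(r"$\Delta f$")
ax2.legend()
fig.tight_layout()
fig.savefig("fig6_2d_slices.png", dpi=150)
```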

smooth_disks as flag in catalog, not separate table

I suspect the referee will dislike Table 5 as much as I do now that I'm in the post-submission clarity phase. Since we aren't publishing B/Tot ratios in this paper (those are for @BorisHaeussler to publish as he sees fit), those galaxies should just be identified via a flag in the main catalog.
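Something like this sketch would do it; the file names and the OBJID/flag column names are placeholders for the real catalog schema:

```python
# Sketch of folding Table 5 into the main catalog as a boolean flag;
# file names and the 'OBJID' / 'smooth_disk_flag' columns are placeholders.
import numpy as np
from astropy.table import Table

cat = Table.read("gz_candels_catalog.fits")
smooth_disk_ids = set(np.loadtxt("smooth_disks.txt", dtype=str))   # IDs from the old Table 5

cat["smooth_disk_flag"] = np.array(
    [str(objid) in smooth_disk_ids for objid in cat["OBJID"]], dtype=bool
)
cat.write("gz_candels_catalog_flagged.fits", overwrite=True)
```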

Further checking for errant bots

Did we look at the CANDELS classifications for signs of errant/non-human behavior aside from the star/artifact question? If so, would it be quick to run? I think it'd be good if we had the ability to write a sentence in 3.4 akin to "We have also analyzed the percentages of the remaining top-level categories (smooth and features/disk) for all users and find no/some/lots of evidence for bot-based classifications".
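If we do run it, I imagine the check would look roughly like this sketch: per-user response fractions on the top-level question, flagging heavy classifiers who give one answer nearly every time (column names are placeholders for however the raw classification dump is structured):

```python
# Sketch of the check: per-user response fractions on the top-level question,
# flagging heavy classifiers who give a single answer almost every time.
# Column names are placeholders for the real classification dump.
import pandas as pd

clas = pd.read_csv("candels_classifications.csv")   # hypothetical dump, one row per classification
top = clas[clas["task"] == "T00"]                   # top-level question only

frac = (top.groupby("user_id")["response"]
           .value_counts(normalize=True)
           .unstack(fill_value=0.0))
n_class = top.groupby("user_id").size()

suspects = frac[(n_class > 100) & (frac.max(axis=1) > 0.98)]
print(f"{len(suspects)} classifiers with >100 classifications give one response >98% of the time")
```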

verify image-creation description & stats with Jeyhan

Need to check that I've written up the correct details of how the subject images were created (linearity, stretch, etc.). Those are in Section 2.1.
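For reference while checking, a sketch of a typical Lupton-style arcsinh-stretch colour composite with astropy; this is not necessarily how the subjects were actually built (that's exactly what needs confirming), and the cutout files and stretch/Q values are placeholders:

```python
# Sketch of a typical Lupton arcsinh-stretch colour composite, purely for
# comparison against the Section 2.1 text; NOT necessarily how the subjects
# were actually made. Cutout files and stretch/Q values are placeholders.
from astropy.io import fits
from astropy.visualization import make_lupton_rgb

r = fits.getdata("f160w_cutout.fits")
g = fits.getdata("f125w_cutout.fits")
b = fits.getdata("f814w_cutout.fits")

make_lupton_rgb(r, g, b, stretch=0.5, Q=8, filename="subject.png")
```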

Also check with Jeyhan the status of the UDS classification paper (listed as in-prep in S4).

Figure 8: compare with z=0 results

Use Willett et al. (2013) and Lackner & Gunn (2012) to add the z=0 comparison to the figure.

I don't suppose @willettk might want to take this on?
(I have the Lackner & Gunn tables if they're not easily accessible online.)

Section 4: new paragraph on translating from one set of classifications to the other

For ease of usability of the classifications, some CANDELS team members have pointed out it would be useful to add some additional discussion to S4 discussing how to translate between classification systems, and when this is and is not a good idea.

This might be a good opportunity to go into further detail on e.g. how to use both together to select merging systems, and how the differences in clumpy classifications might be used to do interesting science.

double-check all numbers

Now that we have new weighted classifications for task T00, re-check the numbers, particularly in:

  • Section 3.5
  • Section 4 (all)
  • Section 5

double-check figures

Now that we have a new set of weighted classifications for task T00, double-check (or just re-make) these figures:

  • 3 (iterative distributions of user consistencies)
  • 4 (example images; see #5)
  • 5 (depth corrections; this needs bigger axis labels too)
  • 6 (comparison with K15 classifications)
  • 7 (B/Tot for smooth and featured)

Upper limits on p-values in Section 4

We quote a lot of p-values in Section 4 for the CANDELS team-GZ comparisons. I think an upper limit (p < 2e-16) is more appropriate than putting p~0, since posterity won't necessarily know what our machine precision was.
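A small sketch of what I mean, using double-precision machine epsilon (~2.2e-16) as the quotable floor:

```python
# Sketch of quoting an upper limit once p drops below a stated floor;
# here the floor is double-precision machine epsilon (~2.2e-16).
import numpy as np

def format_pvalue(rho, p, floor=np.finfo(float).eps):
    """Quote p as an upper limit when it falls below the floor."""
    if p < floor:
        return f"rho = {rho:.3f}, p < {floor:.1e}"
    return f"rho = {rho:.3f}, p = {p:.2g}"

print(format_pvalue(0.82, 0.0))        # rho = 0.820, p < 2.2e-16
print(format_pvalue(0.15, 3.4e-3))     # rho = 0.150, p = 0.0034
```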

Labels to histograms in Fig 4

I think we should label the rows in Fig. 4 - perhaps in the white space in each histogram - with a shorthand for the question being answered.
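Roughly what I have in mind (the shorthand strings and panel layout below are placeholders, not the real Fig. 4 code):

```python
# Sketch of the proposed labels: question shorthand written into the white
# space of each histogram panel. Label strings and layout are placeholders.
import matplotlib.pyplot as plt

labels = ["smooth / features", "edge-on", "bar", "spiral", "bulge", "merger"]   # hypothetical shorthand
fig, axes = plt.subplots(len(labels), 1, figsize=(4, 10), sharex=True)

for ax, label in zip(axes, labels):
    # ... plot the histogram for this question on ax ...
    ax.text(0.95, 0.85, label, transform=ax.transAxes,
            ha="right", va="top", fontsize=9)

fig.savefig("fig4_labelled.png", dpi=150)
```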

author list

  • collaborators who've commented on past drafts
  • GZ-CANDELS builders (both collabs)
  • overall CANDELS builders
  • GOODS-S photometric catalog builders
  • COSMOS photometric catalog builders
  • UDS photometric catalog builders
  • GOODS-S specz builders
  • COSMOS specz builders
  • UDS specz builders
  • GOODS-S photometric redshift builders
  • COSMOS photometric redshift builders
  • UDS photometric redshift builders
  • CANDELS team visual classification builders
  • Bulge/Disk decomposition builders

Figure 3: zoom in on a relevant part to show details

(In addition to the full-size plot - zoom in somewhere to show convergence.)
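One way to do this is matplotlib's inset axes; a sketch with random placeholder consistency tracks standing in for the real per-classifier data:

```python
# Sketch of an inset zoom on the convergence region using matplotlib's inset
# axes; the consistency tracks are random placeholders for the data behind Fig. 3.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
iterations = np.arange(1, 6)
consistencies = rng.uniform(0.3, 0.9, size=(200, 5))    # placeholder tracks

fig, ax = plt.subplots()
for track in consistencies:
    ax.plot(iterations, track, color="k", alpha=0.05)
ax.set_xlabel("iteration")
ax.set_ylabel("consistency")

axins = ax.inset_axes([0.55, 0.08, 0.4, 0.35])           # zoomed panel inside the main axes
for track in consistencies:
    axins.plot(iterations, track, color="k", alpha=0.05)
axins.set_xlim(2.8, 5.1)                                 # iterations where convergence happens
axins.set_ylim(0.3, 0.9)
ax.indicate_inset_zoom(axins, edgecolor="gray")
fig.savefig("fig3_with_inset.png", dpi=150)
```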

Side note: I've always been sort of uncomfortable with the way the convergence here is quite sudden - the consistencies change a lot between the second and third iterations, and then there is mostly no change between iterations 3-4 and 4-5. I've double-checked it all and if something went wrong, I can't find it... so I think it looks real.

Just noting this, though, in case someone can a) find the trouble, or b) reassure me...

Section 4: new CANDELS visual parameters to compare to

Dan McIntosh pointed out that there are now some parameters that add value to the K15 visual classification raw fractions, e.g. a p_merger that combines the various merger votes into a single value from 0 to 1, a p_diskiness, and an artifact metric. These would be really interesting comparisons and could resolve some of the issues we had with combining parameters. It's worth exploring adding them as an additional plot. (Dan has sent me the info, as I think some of these are currently unpublished.)
