mgcv-esa-workshop's Introduction

mgcv-esa-workshop

mgcv-esa-workshop's People

Contributors

eric-pedersen, gavinsimpson

mgcv-esa-workshop's Issues

Need feedback on extended example

Hi folks, @dill @gavinsimpson

I just pushed a commit with a new extended example for the second part of the day, focusing on non-linear time series data. The compiled script is "example-nonlinear-timeseries.html". It's based partly on the nonlinear trend analysis from Gavin's blog, and partly on the nonlinear analysis of the lynx data set from Cosma Shalizi's blog. I still have to add a reference section to acknowledge both of those; that's on my to-do list.

I was looking for feedback on it. Does this seem like a reasonable format for the extended examples? Does it need more explanatory text? Do the exercises make sense?
Also, putting this together took a fair bit of time, so we may need to trim the number of extended examples a bit.

Template for slides

Can one of you, perhaps @dill as you have used the HTML slide classes in RStudio more than me, push a file with a basic YAML header with everything needed for using the HTML slides class we want to use for the materials?

If we're not using the same slide classes then we can ignore this and I'll just use my metropolis-themed beamer slides template.

Example

@eric-pedersen here are some things that we talked about... on the examples:

Could we use the BBS data for both the spatial and time series examples, so that we do:

  • time series at one site
  • spatial snapshot at one time

then we don't have to introduce a bunch of different data sets? (Need to check with Dave Harris about modelling this -- ignore detectability.)

Do you have strong feelings about this?

Things to talk about, 3 August

Just wanted to put together an approximate agenda for this afternoon...

  • where are we at? (how far are we down this list?)
    • Dave
    • Gavin
    • Eric
  • What examples are left? Which do we pick?
  • Divide up remaining shared slides
  • What stuff needs to go into the preamble slides?
  • What packages do people need to have installed prior to attending?
  • What time are we all arriving?
    • Dave 19:20 (JetBlue 69)
    • Gavin 18:32 (Delta DL 2064)
    • Eric 21:41 (American Airlines 610)
  • What time do we need to be at the room on Saturday? 0730!
  • How many people are coming? 20!
  • Do we have participant e-mail addresses? 100%!
  • Questionnaire? Ask in the break
  • What additional equipment do we need? (Gavin has USB sticks, Eric has a phone)

Please add any additional stuff you think we need to think about!!

what do we actually have to do?

I guess it's important to also know what ESA want from us... From their call for proposals, they would like to know:

  • Title of the session
  • Description of the session (appears online only; 250 words max.)
  • Summary sentence (appears in print only; 50 word max.)
  • Name and contact information (affiliation, email) for the lead organizer and any co-organizers
  • Minimum and maximum number of participants (to assist in room assignment).
  • Requested scheduling
  • Any additional A/V equipment (standard A/V setup is a screen, LCD projector, and laptop)
  • Room set-up desired: theater, conference, hollow square, rounds, other
  • Food and beverage requests
  • Underwriting of workshop costs by a group or agency
  • Is the session intended to be linked to a scientific session?
  • Is the session intended to be linked to a business meeting or mixer?
  • Describe any known (workshop/event) scheduling conflicts (what should it follow/precede/not conflict with?)

Should we do a quick mailout

To, say, ECOLOG to advertise the course. Just saw one pop up and realised we've not done much advertising so far...

Happy to write some copy for it, suggestions welcome.

Ask for Simon's blessing

It would be polite to ask Simon Wood if he minds this happening and see if he would like to be involved.

Equipment

If there isn't wifi, should we plan for this problem and bring some USB sticks/drives with CRAN mirrors on or somesuch? Maybe packrat can help? I also have a spare wireless router I can bring.

Time schedule

According to https://github.com/eric-pedersen/mgcv-esa-workshop/blob/master/course_outline.md we think everything will happen in the morning... Obviously this is wrong...

Currently we have 2x 4 hour blocks: 8am-12pm and 1pm-5pm set up. I think @gavinsimpson and I implicitly thought of the following structure (tell me if I'm wrong Gavin!):

Morning

  • Intro (what is a GAM etc)
  • Model checking
  • Model selection
  • Beyond the exponential family

Afternoon

  • Extended examples/demos (all)
  • Smoother zoo

Does this make sense?

If so, we had previously allotted "1 session" to the intro and "1/2 session" to each of the other three sections in the morning. If we then split the 4 hours into 2.5 "sessions", we could give 1.5 hours to the intro and 30-45 mins to each of the other three, and we'd still have at least a 15 min coffee break. That being said, I think that leaves things a little tight, especially if we want to have some time for practical exercises. I'm going to try to put my intro slides together this weekend/early next week, so I'll soon have a better idea of how long that first part will be, and we can re-jig a bit from there.
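
For anyone who wants to check the time budget, here is a quick sketch of the arithmetic in R. It assumes the 8am-12pm block, a 1.5 hour intro, and three further sections at either 30 or 45 mins each; the variable names are just for illustration.

```r
# Morning block, 8am-12pm, in minutes.
block <- 4 * 60

# Intro section: 1.5 hours.
intro <- 90

# The other three sections, at the short (30 min) and long (45 min) ends.
other <- 3 * c(short = 30, long = 45)

# Minutes left over for coffee breaks and practical exercises.
slack <- block - intro - other
slack
```

At the long end this leaves only the 15 min coffee break, which is why it feels tight.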

Afternoon I think is a bit simpler, as I think the smoother zoo stuff is probably an hour tops, the rest takes as long as it takes but we need to ensure there is plenty of coffee!

Feedback on proposal

I've committed my changes. Sorry if it seems like I changed a lot -- I liked the structure, I just tried to tighten things up a little, given our limited word count. I hope this doesn't cause any offense! We now come in at 217 words, so you can add more details if you think I cut too much! I'll add other thoughts to #2 now...

Software requirements

Deadline lunch EDT 4 August

  • Up to date R
  • RStudio beneficial
  • latest mgcv
  • ggplot2
  • dplyr
  • tidyr
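
A possible pre-workshop check participants could run, as a minimal sketch: it only covers the four packages listed above (R itself and RStudio have to be updated separately), and `missing_packages()` is a hypothetical helper name, not something from the workshop materials.

```r
# Packages from the list above.
workshop_pkgs <- c("mgcv", "ggplot2", "dplyr", "tidyr")

# Hypothetical helper: return the packages whose namespace cannot be loaded.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1),
                      quietly = TRUE, USE.NAMES = FALSE)
  pkgs[!installed]
}

# Report the R version so participants can compare against the current release.
message("Running ", R.version.string)

to_install <- missing_packages(workshop_pkgs)
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # Uncomment to install from CRAN:
  # install.packages(to_install)
} else {
  message("All workshop packages are installed.")
}
```

Running `update.packages()` afterwards would bring already-installed packages (including mgcv) up to date.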

Workshop plan; topics, ordering, etc.

A rough outline and ordering of topics. This is just my basic off-the-top-of-my-head idea, so I won't be offended if you don't agree.

I was thinking that the theory, with some small examples to illustrate and practice, would take the morning (given those are often shorter slots), say up to and including Model Checking. The rest is for the PM part, with more practical/hands-on stuff.

  • Introduction
    • what are (generalised) additive models?
    • what are splines?
    • what is penalised regression?
    • Is this where we discuss basis-penalty setup?
  • Basic model fitting
    • s() and its arguments
    • (What the heck is k if you don't specify it?)
    • Intro to some of the main mgcv functions and methods people will use later (gam(), anova(), summary(), plot(), ...)
  • Smooth toolbox
    • Introduce the basic types and variations of smoothers
      • Thin plate splines
      • Cubic splines (& cyclic versions)
      • P-splines
    • 2-d isotropic smoothing via s()
    • tensor product smooths
    • smooth-factor interaction (by terms)
    • random effects splines
  • Model checking
    • gam.check()
    • concurvity
    • randomised quantile residuals
    • using GAMs for testing nonlinearity in other models
  • Model selection
    • shrinkage via shrinkage smooths
    • shrinkage via select = TRUE
    • AIC corrected for estimated smoothness params
    • approximate p values here?
  • Extended GLM models
    • Beta regression (betar)
    • Tweedie (esp for continuous data with non-constant variance and zeroes)
    • Negative-binomial & ZIP
    • cox.ph? I think a fair number of ecologists are dealing with this kind of data
  • Extended examples/demos
    • Spatial modelling
    • Time series modelling
    • Spatio-temporal models
    • Location-scale models
  • Other stuff
    • type = "lpmatrix"
    • paraPen
    • Markov random fields
    • Functional data
    • Mixed effect models via gamm4
    • ???
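
To make a few of these items concrete (the "what is k?" question, select = TRUE, and gam.check()), here is a minimal sketch of the kind of small example the morning could use. It is an assumption about what such an example might look like, not agreed workshop material: the data come from mgcv's gamSim() simulator and the model formula is only illustrative. For a one-dimensional thin plate smooth, s() falls back to a basis dimension of 10 when k is not given.

```r
library(mgcv)

set.seed(1)
# Simulated data from mgcv's gamSim(): covariates x0..x3 and response y.
dat <- gamSim(1, n = 200, dist = "normal", scale = 2, verbose = FALSE)

# k is left at its default for three smooths and set explicitly for s(x2).
# select = TRUE adds an extra penalty so whole terms can be shrunk towards zero.
m <- gam(y ~ s(x0) + s(x1) + s(x2, k = 20) + s(x3),
         data = dat, method = "REML", select = TRUE)

# Basis dimension used for each smooth: the default is 10, s(x2) uses 20.
basis_dims <- vapply(m$smooth, function(sm) sm$bs.dim, numeric(1))
basis_dims

# Effective degrees of freedom and approximate p-values per term.
summary(m)

# Residual diagnostics and the basis dimension (k) check.
gam.check(m)
```

Comparing the edf column from summary() with the basis dimensions is a quick way to show participants that k is only an upper limit on wiggliness.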

Comment, but also, if you want to suggest things for removal, use strikethrough (e.g. ~~text~~) so we can discuss whether someone has a strong desire to include a topic. (If you can't edit, @eric-pedersen, can you give David and me admin/commit rights to this repo only?)
