datacarpentry / semester-biology
Forkable teaching materials for course on working with data in R
Home Page: http://datacarpentry.org/semester-biology
License: Other
Rather than have students use a database with some problems for several weeks and potentially internalize poor structure, we've switched over to using the Portal Project Teaching Database for the main SQL exercises. This means we need messy data for database structure/tidy data problems.
Data Carpentry has messy data designed for looking at this problem. See, e.g.,
http://datacarpentry.github.io/spreadsheet-ecology-lesson/01-format-data.html
We should convert the current database structure problems to something based on this data, or add new "Tidy Data" problems based on this data and tweak the existing database structure problems to have the students download the original database file that has all of the structural issues in it.
On both ethanwhite.github.io and my local host, the Output links return a 404: Page not found error.
I made sure that the links match the repo directory, so I'm not sure what's going on:
http://localhost:4000/solutions/Combining-the-basics-2-Python.txt
http://ethanwhite.github.io/solutions/Expressions-variables-1-R.txt
Determine a standard use of bullet and numbered list. Edit to standard indentation.
We could include commentary on code style in markdown, also.
Reduce length of code blocks to match web translation.
Code chunks that take up a whole line should be placed in a code block.
(Subissue #1 )
There are nine Lists exercises, but only Lists-1 is used in an assignment. Are we going to use them?
Lists-2 and Lists-3 follow-up on Lists-1.
Lists-5 can introduce matrices.
Also, we should come up with an exercise to introduce lists. I use them when I have tables or lists of multiple data types that are entered from the script. Data with multiple data types in .csv files get read in as data frames.
I've been meaning to have this 'what to do with unused exercises' chat more broadly.
Add these links to:
As per #203.
(Subtask of #1 )
OLD: [Functions 5]({{ site.baseurl }}/exercises/Functions-5/)
NEW: [Functions 5]({{ site.baseurl }}/exercises/Functions-5-R/)
I will complete this after all of the PRs with assignment changes are closed to avoid conflict.
A nice example of looping that applies to a lot of folks' research is to loop over a bunch of different files, doing the same thing to each file. We should develop a problem around this. If there's something in the Genomics lesson space (see: http://datacarpentry.github.io/lessons/) that works, it would be nice to add a genomics exercise or two.
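As a starting point for such a problem, here is a minimal Python sketch of the "same operation on every file" pattern. The function name `count_records`, the `*.csv` pattern, and the choice of counting data rows are all illustrative assumptions, not part of any existing exercise.

```python
from pathlib import Path


def count_records(data_dir, pattern="*.csv"):
    """Return {file name: number of data rows} for every matching file.

    Hypothetical helper: the directory layout and the "count rows"
    operation are placeholders for whatever the exercise ends up using.
    """
    results = {}
    # sorted() makes the processing order predictable across platforms.
    for path in sorted(Path(data_dir).glob(pattern)):
        with path.open() as handle:
            next(handle, None)  # skip the header line, if there is one
            # Count the remaining non-empty lines as data rows.
            results[path.name] = sum(1 for line in handle if line.strip())
    return results
```

Calling `count_records("data")` would return one entry per CSV file in a `data` folder; students could swap the row count for any per-file calculation.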
(Subissue #1)
(For down the road.)
I saw the newest Software Carpentry lesson (http://swcarpentry.github.io/web-data-python/) and it made me think about the way we format our exercises and how that will look when rendered by Jekyll.
I'd like to look through a couple of examples to gather some thoughts and chat with you sometime. We can also look through some of the Data Carpentry lessons, though it looks like they are still mostly 'generic' GitHub wiki pages.
As in #2 it is useful to show the students what they should be getting as output.
I called a last minute audible and switched to using:
http://figshare.com/articles/Portal_Project_Teaching_Database/1314459
for the database. All of the solutions are still based on the full dataset on Ecological Archives, so we'll need to update the solutions.
Outside of the repository for the moment
Something got mixed up a little in the formatting of assignments. Compare:
I think this is happening because the assignment name is capitalized. See the commit message for my solution to this:
ethanwhite/progbio@79c5bbc
This means that the assignments need to start with lower case letters and the exercises start with capital letters. Yes, it is awful.
(Subissue #85, Related PR #91)
assignments/index.md directs Jekyll to find a list of exercises and arrange an assignments page using the exercise titles. Python and R assignments share titles, which means that the R assignments list is currently populated by Python exercises. We will have to code in the language from the yaml here or change the titles throughout.
(Subissue of #12)
Ran into subtle differences for output from Functions-5 and Functions-6
Now that I have an explanation, I wonder if/where it fits in the curriculum.
https://twitter.com/ZackBrym/status/595651701945794561
It might fit along with our introduction of dplyr
[Documentation Link].
(sub issue #1; related to #46)
I have gone through the advanced course exercises and chosen a small(ish) set of topics and exercises I think would be worth considering for inclusion in the project. My idea is that these would provide an opportunity for classroom students to continue on from the course and have a direction for what is next if they are to continue pursuit of scientific programming and for at-home students to learn a handful of important, but a bit more complicated, skills.
After completing this list, #1 will be complete. We can also decide any or all are not worth it, and can be done with #1 now.
The list of exercises breaks into two categories.
While looking through the schedule.md, it struck me that we could organize the order of videos/readings to follow the order of the exercises. In my mind, the structure would look like:
Outside of the repository for the moment
For both R and Python exercises we need a way to help both self-directed learners and university students check their work, but without giving them answers in code that they could just cut and paste for assignments. By showing them what the outcome of successfully running the code should look like, we both clarify the intent of the question and help students check their work. This also begins to introduce the benefits of testing.
The result here would be a new folder containing the "solutions" (i.e., what the output should look like) for each exercise, using the same naming structure as the associated exercise. Separate solutions will be necessary for R and Python since the details of the output won't be the same.
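One way a learner could compare their own output against such a solution file is sketched below in Python. The helper names `capture_output` and `matches_solution` are hypothetical, and ignoring trailing whitespace is an assumption about how strict the check should be.

```python
import io
from contextlib import redirect_stdout


def capture_output(func):
    """Run func() and return everything it printed as a single string."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        func()
    return buffer.getvalue()


def matches_solution(actual, expected):
    """Compare two outputs line by line, ignoring trailing whitespace."""
    def normalize(text):
        return [line.rstrip() for line in text.strip().splitlines()]
    return normalize(actual) == normalize(expected)
```

The expected text would be read from the matching file in the solutions folder, so students see what correct output looks like without seeing the code that produced it.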
(Subissue #1)
Loops-4 seems like a useful extension of Loops-2 (old name: 'Loops-3')
Not sure what Loops-5 is about.
@ethanwhite I'll need some direction before I can make progress on this. Is there a package to work from? What kind of exercises do you have in mind?
Re: Basic-Python2.md exercise comments.
I include a mention of built-in functions in this exercise. I'm not sure that custom functions are required here, so maybe it would be best to introduce the idea in a later lesson.
We should introduce the various data classes (character, factor, numeric) and organizational structures (list, matrix, array, data frame) in an early lesson. Not sure where the best place is.
(subissue #1)
'Graphing 3' used to be called 'Graphing adult size vs newborn size'. Simplifying the descriptive title to a number made me wonder if all of the problems should have a descriptive title. The descriptive title would identify the new problem/solution presented in the exercise. One of the strings exercises might get a descriptive title of 'Basic stringr functions'. A making choices exercise might get a descriptive title of 'Using mathematical operators' or 'if else statements'.
I think it makes sense to organize the directory using the current Name-X 'titles' and add a 'descriptive title' or 'subtitle' to the exercise yaml.
ProgBio: http://www.programmingforbiologists.org/
Stat545: https://stat545-ubc.github.io/
For database exercises that don't involve Reports and Forms, convert these exercises to SQLite.
All necessary modules are now available in Python 3, and it will make teaching easier by removing common points of confusion like integer division. The most common thing we'll need to fix is changing print statements from:
print x
to
print(x)
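A short Python 3 sketch of the two changes mentioned above (the variable names are illustrative only):

```python
x = 7

# Python 3: print is a function, so the parentheses are required.
print(x)

# Python 3: / always performs true division, even on two integers.
half = x / 2    # 3.5 (Python 2 would have given 3)

# Use // when floor division is actually what you want.
whole = x // 2  # 3
```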
Once datacarpentry/sql-ecology-lesson#34 or datacarpentry/sql-ecology-lesson#31 goes in we should update this material to use the Data Carpentry material for pre-class reading.
(subissue #1)
I will skip this exercise for now because it is not in the assignments list, but I'd like to revisit it as it looks like a strong exercise.
Outside of the repository for the moment
Strings-4-R.md vs Functions-3-R.md
The link to Deryll Dewald is redirected to an unrelated page.
There is some language about committing that isn't appropriate when using this as a standalone SQL problem.