image-processing's Introduction

Image Processing with Python

A lesson teaching foundational image processing skills with Python and scikit-image.

Lesson Content

This lesson introduces fundamental concepts in image handling and processing. Learners will gain the skills needed to load images into Python, to select, summarise, and modify specific regions in these images, and to identify and extract objects within an image for further analysis.

The lesson assumes a working knowledge of Python and some previous exposure to the Bash shell. A detailed list of prerequisites can be found in learners/prereqs.md.

Contribution

Code of Conduct

All participants should agree to abide by The Carpentries Code of Conduct.

Lesson Maintainers

The Image Processing with Python lesson is currently being maintained by:

The lesson is built on content originally developed by Mark Meysenburg, Tessa Durham Brooks, Dominik Kutra, Constantin Pape, and Erin Becker.

image-processing's People

Contributors

bobturneruk, captainsifff, chbrandt, chennesy, constantinpape, deppen8, drcandacemakedamoore, elliewix, erickmartins, erinbecker, fmichonneau, froggleston, gepcel, govekk, gparolini, hobytodges, iimog, jeffd27, jeremypike, k-dominik, k-meech, minor, mkcor, mmeysenburg, myedibleenso, quist00, shaw2thefloor, tobyhodges, uschille, zkamvar


image-processing's Issues

Timing of lesson elements

These timings are based on a workshop held at Doane University, May 22-24. I can put in a PR to add these timings to the lesson materials, but I am recording them here for now.

Introduction
Teaching: 6 min
Exercises: 0 min

Images
Total: 49 min
Teaching: 34 min
Exercises: 15 min

Thinking about RGB colors (3 min)
RGB color table (4 min)
BMP image size (8 min)

OpenCV Images
Total: 100 min
Teaching: 31 min
Exercises: 69 min

Experimenting with windows (3 min)
Resizing an image (20 min)
Keeping only low intensity pixels (13 min)
Practicing with slices and Metadata, continued (16 min)
Slicing and the colorometric challenge (17 min)

Drawing and bitwise operations
Total: 77 min
Teaching: 17 min
Exercises: 60 min (2 exercises skipped)

Other drawing operations (10 min)
Masking an image of your own (skipped)
Masking a 96-well plate image (50 min)
Masking a 96 well image, take 2 (skipped)

Creating Histograms
Total: 83 min
Teaching: 22 min
Exercises: 61 min

Using a mask for a histogram (24 min)
Color histogram with a mask (25 min)
Histograms for the morphometrics challenge (12 min)

Blurring Images
Total: 50 min
Teaching: 11 min
Exercises: 39 min

What happens if the int() parameter doesn’t look like a number? (10 min)
Experimenting with kernel size (5 min)
Experimenting with kernel shape (7 min)
Blurring the bacteria colony images (17 min)

Thresholding
Total: 94 min
Teaching: 30 min
Exercises: 64 min

More practice with simple thresholding (12 min)
Ignoring more of the images, brainstorming (8 min)
Ignoring more of the images - implementation (32 min)
Thresholding a bacteria colony image (12 min)

Edge Detection
Total: 66 min
Teaching: 19 min
Exercises: 47 min

Applying Canny edge detection to another image (5 min)
Using trackbars for thresholding (42 min)

Contours
Total:
Teaching:
Exercises:

Counting dice pips ()
Extracting subimages ()

Explain why 128 was chosen as cutoff

In the OpenCV episode, the section on manipulating pixels says:

Let us develop a program that keeps only the pixel color values in an image that have value greater than or equal to 128. We will start by reading the image and displaying it.

It would be good to explicitly state that 128 is half intensity.
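
For reference, a minimal sketch of what the quoted program might look like, using numpy indexing (the filename is hypothetical):

    import skimage.io

    image = skimage.io.imread("image.tif")   # hypothetical filename

    # 128 is half of the 0-255 intensity range of an 8-bit image,
    # so this keeps only pixels at or above half intensity
    image[image < 128] = 0

    skimage.io.imshow(image)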

Reduce number of jpeg images used

JPEG images are used a lot in this material. I might be living in my own small bubble, but I personally would never suggest or encourage anyone to convert their images to JPEG, due to its lossy compression. Still, a lot of the images used here are JPEGs. I think that conveys the wrong message. But again, if you see strong reasons to use JPEGs, without any downsides on your side, then close this :)

Update setup instructions for skimage-based version

The setup instructions are really nicely detailed, but could probably do with an update after the switch to skimage.

I think the only thing we need to ask participants is to install the latest Anaconda distribution.

I would be curious whether you (@mmeysenburg) plan to keep maintaining the VirtualBox images?

Add line showing how to call script

At least for the first Python script that is run. This may depend on whether this ends up being promoted as a "beginner" or "intermediate" lesson.
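
For illustration, the added line could look like the following (whether the script actually takes a filename argument depends on the script itself):

    $ python Open.py chair.jpg   # filename argument is hypothetical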

OpenCV dimensionality inconsistencies need caution callout

Add a big caution callout box at the start of the OpenCV episode noting the difference in color order and dimension order (y, then x, then colors in Blue, Green, Red order).

I think this is in the text somewhere, but I'm thinking a callout box titled "CAUTION" or something like that would be appropriate here.
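
A short snippet such as this could accompany the callout (the filename is hypothetical):

    import cv2

    image = cv2.imread("image.jpg")   # hypothetical filename

    # shape is (rows, columns, channels): y first, then x
    print(image.shape)

    # channels are ordered Blue, Green, Red rather than RGB
    b, g, r = image[0, 0]             # pixel at row 0, column 0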

Add table comparing JPEG, TIFF, BMP

02-image-basics includes a good textual description of JPEG, TIFF, and BMP formats that explains some similarities and differences among these file types. It would be good to add a table summarizing this information at this point in the lesson.

Blurring bacterial colony images

Minor issue - this exercise doesn't specify that the kernel should be square. @mmeysenburg said this verbally at the workshop, but it would be good to add it to the exercise text if it's important.

Extension doesn't always determine format

For OpenCV's imwrite function, the extension on the filename parameter determines the format in which the file is saved. It's probably worth adding a callout box to clarify that this behavior is specific to OpenCV and can't be relied on in other contexts (e.g. changing the extension on a Word document from .docx to .pdf does NOT make it into a PDF file).
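
A possible snippet for such a callout (filenames hypothetical):

    import cv2

    image = cv2.imread("image.bmp")    # hypothetical filename

    # OpenCV infers the output format from the extension alone:
    cv2.imwrite("image.png", image)    # written as PNG
    cv2.imwrite("image.jpg", image)    # written (and lossily compressed) as JPEG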

titration.tif missing

The file titration.tif, which is referred to in the last exercise of episode 03, is missing from this repo (there is a titration.jpeg, however :/)

Use r, c instead of y, x

@k-dominik @constantinpape,

I am proposing to use r, c instead of y, x (or x, y) throughout the whole material.

Advantages of using r, c

  1. It is immediately obvious that the upper left corner of an image is at 0, 0.
  2. It is immediately obvious that "an image is just a matrix" (which is what we need here as a concept).
  3. It is immediately obvious that the coordinates are integer values.
  4. It greatly reduces mental load. Personally, I was constantly struggling with the dimension order when using x and y while going through the material.
  5. The documentation of skimage uses r and c as well.
  6. We do not really need the concept of physical (scaled) coordinates in this material; this could be introduced properly at some other stage.

What do you think?
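
For illustration, a minimal sketch of the proposed convention (the filename is hypothetical):

    import skimage.io

    image = skimage.io.imread("whiteboard.jpg")   # hypothetical filename

    # rows first, then columns: "an image is just a matrix"
    r, c = 10, 20
    pixel = image[r, c]

    # the upper left corner is at r = 0, c = 0
    corner = image[0, 0]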

Accessibility issues with Gaussian blur animation

In https://datacarpentry.org/image-processing/06-blurring/ there is an animation showing a Gaussian blur in progress. This animation is really cool, but I'm concerned that it makes part of the episode inaccessible for people without stable internet access (who may be using these lessons in the PDF version).

This is what appears in the print version of the page:
[screenshot of the print version omitted]

Is it possible to have the print version show the final state of the animation rather than the start point? Or, alternatively, to have the animation start by showing the final result, then go back and show the blur building up?

A related issue is that the red square showing the highlighted pixel won't show up for red/green color blind users against the grey/brown/green background. From my understanding, a purple square would show up relatively well for all three major types of colorblindness. It will appear blue or bluish purple and be distinguishable from the green/brown background. (https://usabilla.com/blog/how-to-design-for-color-blindness/)

That having been said, the image itself might be hard for people with red/green colorblindness to parse - I'm not an expert on this. Happy to hear other thoughts about how best to address this!

Clarity on metadata

@mmeysenburg noted that some of the metadata that’s being shown for the tree image in this lesson has been manually entered (e.g. title, author, subject). It would be good to note this explicitly in the lesson text.

Switch from OpenCV to skimage

We have discussed with @ErinBecker, @k-dominik, @mmeysenburg, @tobyhodges, and Tessa (sorry, I don't know your GitHub handle) translating the lessons here from OpenCV to skimage.
The motivation is that skimage is easier to install, and better supported and documented, than OpenCV.
@k-dominik and I would translate the lessons.

I made a first small PR for the introductory lessons, see #42.

We should collect potential issues in translation here.

ToDos:

  • introduce uint8 vs float images in episode 2 (see the sketch after this list)
  • introduce as_gray in episode 3
  • make sure no cv2 functions are used anywhere in the episodes or code examples
  • make sure drawing is applied by indexing
  • all code examples have been "blacked"
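
A minimal sketch covering the first two items (the filename is hypothetical; img_as_float and img_as_ubyte are skimage's standard conversion helpers):

    import skimage.io
    from skimage.util import img_as_float, img_as_ubyte

    # a typical 8-bit image loads as uint8, with values 0-255
    image = skimage.io.imread("image.tif")               # hypothetical filename

    # as_gray=True returns a float image with values in [0.0, 1.0]
    gray = skimage.io.imread("image.tif", as_gray=True)

    # explicit conversions between the two representations
    as_float = img_as_float(image)    # uint8 -> float in [0.0, 1.0]
    as_uint8 = img_as_ubyte(gray)     # float -> uint8 in [0, 255]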

Is shell scripting necessary here?

In the https://datacarpentry.org/image-processing/07-thresholding/ episode, in the "Adaptive Thresholding" section, learners are introduced to bash shell scripting. This introduces a lot of cognitive load. Could the same thing be accomplished with a Python for loop? Or is it not possible to execute a Python script within a Python for loop? (Or is there some other reason this wouldn't work?)

This section also introduces redirection (>) in the context of bash shell scripting. Could this be accomplished in Python instead?
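
It probably could. A rough sketch of a pure-Python version, using Otsu's method as a stand-in for whatever the episode actually computes (the file pattern and output format are made up), with a file write replacing the shell redirection:

    import glob
    import skimage.io
    from skimage.filters import threshold_otsu

    # a Python for loop over the image files replaces the bash loop;
    # writing to a file replaces the shell redirection (>)
    with open("results.txt", "w") as out:
        for filename in glob.glob("*.jpg"):
            image = skimage.io.imread(filename, as_gray=True)
            t = threshold_otsu(image)      # stand-in for the episode's computation
            binary = image > t
            out.write(f"{filename}: {binary.sum()} pixels above threshold\n")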

Add "png" to the mix

Episode 02 uses BMP and JPG to compare lossy compression and no compression. It would be nice to add PNG to show that for certain images (e.g. a single color, as used in the example), lossless compression works very well.
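
A rough sketch of what the extended comparison could look like (filenames hypothetical):

    import os
    import skimage.io

    image = skimage.io.imread("image.bmp")    # hypothetical one-color test image

    # write the same image with lossless (PNG) and lossy (JPEG) compression
    skimage.io.imsave("image.png", image)
    skimage.io.imsave("image.jpg", image)

    for name in ["image.bmp", "image.png", "image.jpg"]:
        print(name, os.path.getsize(name), "bytes")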

Obtaining pixel coordinates using skimage viewer

Currently, in lesson 03-skimage-images, it says:

We can use a tool such as ImageJ to determine the coordinates of the corners of the area we wish to extract. If we do that, we might settle on a rectangular area with an upper-left coordinate of (135, 60) and a lower-right coordinate of (480, 150), as shown in this version of the whiteboard picture:

However, at least on my Mac, the skimage viewer also shows the coordinates of the mouse cursor at the bottom of the image, (129, 71) in the screenshot below. Maybe it would be better to refer to this instead of ImageJ, because the students have already used this viewer?
[screenshot omitted]

Lossy compression doesn't clearly show pixelation

This might be my display or might be my eyes, but I can't see the difference between these two images.

[screenshot of the two images omitted]

Would it be possible to make the compressed image worse so that learners can easily see the effects of lossy compression?

Don't use ImageMagick anymore

As far as I can see, ImageMagick is only used to look at metadata. We are already using Fiji in this material, and metadata can be viewed there with Image -> Show Info. One less dependency to worry about?!

Shorten resizing an image exercise

Currently, the exercise reads:

Using your mobile phone, tablet, web cam, or digital camera, take an image. Copy the image to the Desktop/workshops/image-processing/03-opencv-images directory. Write a Python program to read your image into a variable named image. Then, resize the image by a factor of 50 percent, using this line of code:

At the workshop, we didn't use our own images, but instead used the existing chair.jpg image. I recommend changing the exercise language to have learners always do this, as it speeds up the exercise and reduces cognitive load.

Also to reduce cognitive load, I recommend having learners copy and modify the Open.py script rather than writing one from scratch. I also recommend changing the multiplier to something like 10% of the original image size (50% is still too big). This exercise is a really nice early win, and I felt very empowered after completing this challenge!
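
With those changes, the exercise solution might look roughly like this (using OpenCV, as the episode did at the time; the output step is made up):

    import cv2

    image = cv2.imread("chair.jpg")

    # resize to 10 percent of the original size
    small = cv2.resize(image, None, fx=0.1, fy=0.1)

    cv2.imwrite("chair-small.jpg", small)   # hypothetical output filename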

Possibly shorten "what happens if the int parameter doesn't look like a number" exercise

In https://datacarpentry.org/image-processing/06-blurring/, the exercise "What happens if the int() parameter does not look like a number?" asks learners to

Write a simple Python program to read one command-line argument, convert the argument to an integer, and then print out the result. Then, run your program with an integer argument, and then again with some non-integer arguments.

Note that this exercise could be done without writing a new script: learners could just enter the suggested input for the GaussBlur.py script. That would reduce the exercise time to a couple of minutes, but would give less practice at writing scripts.
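
For reference, the whole program the exercise asks for is only a few lines:

    import sys

    # convert the first command-line argument to an integer and print it;
    # a non-integer argument raises ValueError
    print(int(sys.argv[1]))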

Add line-by-line breakdown for each script

This lesson contains many Python scripts that are presented as a single code block and then explained in detail beneath that block. To reduce cognitive load, I recommend breaking the script (as displayed in the lesson) into multiple code blocks with detailed explanations for each chunk of code interleaved with the code blocks. The script files themselves would stay as is - only the way they are displayed in the lesson would change.

See "Analyzing Quality with FastQC" section in https://datacarpentry.org/wrangling-genomics/05-automation/index.html for example of what this might look like.

emphasise importance of experimental design

In https://datacarpentry.org/image-processing/07-thresholding/ "Ignoring more of the image implementation" exercise -

Add to the solution that this problem is easily avoided by considering it during the experimental design phase: using the same size label for all samples, putting the label in the same location for all samples, or using a differently colored label (not white) and writing (not black). Or placing samples in the same order from left to right or top to bottom (using contouring, which we haven't learned yet).

Update to latest version of template

This lesson seems to be based on an older version of the template. One visible outcome of this is that the exercise boxes aren't automatically expanded on each episode page. It would be great to use remote themes on this repo so that it automatically stays in sync with changes to the lesson template. Pinging @fmichonneau to see if he can put in a PR for adding remote themes to this repo. This isn't a priority, as this lesson won't be included in the June 2019 release; sometime in the next month would be great.

16bit images and skimage.viewer

The image "04-drawing-bitwise/wellplateImg.tif" is a 16 bit integer RGB image, which skimage.viewer.ImageViewer cannot show out-of-the-box. To see something in the viewer, one for example can divide the image by 65535.0 to convert it to float values between 0.0 and 1.0

The issue is that, I think, this is not explained before students run into it while trying to solve the exercise "Masking a 96-well plate image (50 min)".
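
A sketch of the suggested workaround (skimage's img_as_float helper performs the equivalent conversion):

    import skimage.io
    from skimage.util import img_as_float

    image = skimage.io.imread("04-drawing-bitwise/wellplateImg.tif")

    # 65535 is the maximum 16-bit value, so this yields floats in [0.0, 1.0]
    scaled = image / 65535.0

    # equivalently, using the standard helper
    scaled = img_as_float(image)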

Wrong dimension order?

In lesson 03-skimage-images, it reads:

Note that the coordinates in the preceding image are specified in (x, y) order. Now if our entire whiteboard image is stored as an skimage image named image, we can create a new image of the selected region with a statement like this: clip = image[60:151, 135:481, :]

However, comparing the order of the numbers in the array to the numbers in the screenshot, it looks to me like the image is in fact in (y, x) order (and not (x, y)), while the numbers on the screen are in (x, y) order. What do you think?

[screenshot omitted]
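
A quick way to check (the filename is hypothetical):

    import skimage.io

    image = skimage.io.imread("whiteboard.jpg")   # hypothetical filename

    # shape is reported as (rows, columns, channels), i.e. y before x
    print(image.shape)

    # consistent with that, the row (y) range comes first in the slice
    clip = image[60:151, 135:481, :]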

Reduce number of unique images used?

Thought question - would it be possible to change the examples so that they reuse the same images over and over, with learners building up a single, more complex script over the course of the workshop? That would reduce the need to explain each piece of each script and give more hands-on time in the lecture portion of the course.

Add glossary of key terms

Glossary should go here: https://datacarpentry.org/image-processing/reference/

Some of these terms may not be included in the glossary, depending on the expertise level of the lesson and whether or not previous Python programming experience is a prerequisite for the workshop.

All of the terms in the following list are used in the lesson, but some may be unnecessary jargon. I recommend as a first step that we divide this list into three categories:

  1. Vocab specific to this lesson (excluding standard programming terms and Python data type terms)
  2. Standard programming terms and Python-specific terms not specific to this lesson
  3. Terms that might be unnecessary jargon

If this lesson ends up being an intermediate level (and having Python as a prereq), then we can limit the glossary to (1). If it's a beginner lesson, then (1) and (2). We might be able to replace some of (3) with less jargony words.

Section 1: Terms specific to this lesson

  • morphometrics
  • colorimetrics
  • lossy compression
  • lossless compression
  • titration
  • colony (as in bacterial)
  • raster graphics
  • left-hand coordinate system
  • additive color model
  • RGB color model
  • BMP
  • JPEG
  • TIFF
  • histogram
  • thresholding
  • intensity
  • mask
  • grayscale
  • edge detection
  • channel
  • contour
  • child (in contours)
  • parent (in contours)
  • root (in contours)
  • bounding box
  • crop
  • kernel
  • noise
  • blur
  • maize (?)
  • adaptive thresholding
  • fixed-level thresholding
  • image segmentation
  • binary image

Section 2: Standard programming terms and Python vocab

  • bit
  • byte
  • pixel
  • alias
  • import
  • command
  • command-line
  • array
  • slicing
  • parameter
  • argument
  • named arguments
  • positional arguments
  • tuple
  • list (data type)
  • loop
  • function
  • library
  • index
  • bash shell
  • shell
  • script
  • string (data type)
  • metadata
  • call (as in function call)
  • trackbar
  • global variable
  • pass (as in "pass in" to a function call)
  • accumulator variable
  • control structure
  • bitwise operations
  • truth table
  • standard deviation
  • binary
  • output stream

Section 3: Possibly unnecessary jargon

  • Gaussian distribution
  • low-pass filter?
  • Gaussian blurring
  • median blurring
  • bilateral blurring
  • Otsu's method
  • Canny edge detection
  • Sobel edge detection
  • non-maximum suppression
  • hysteresis
  • moment
  • Euclidean distance
  • centroid
  • reference object
  • polygon approximation

What makes a good threshold?

For https://datacarpentry.org/image-processing/07-thresholding/

The first example in this episode may prompt the question "Can't I just convert all of the white pixels to black?". The second example or exercise helps resolve this question by showing the utility of this approach with a textured background. It is also partially addressed by the demonstration of choosing a too-large threshold value. We should include examples of using both too-large and too-small thresholds.
Good threshold = 210, too large 250, too small 150?
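
A sketch of what the added comparison could look like (the filename is hypothetical; as_gray images are floats in [0.0, 1.0], hence the division by 255):

    import skimage.io

    image = skimage.io.imread("image.jpg", as_gray=True)   # hypothetical filename

    # compare a plausible threshold with ones that are too small and too large
    for t in [150, 210, 250]:
        binary = image > t / 255.0
        print(t, binary.sum(), "pixels above threshold")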

Ability to view the OpenCV lessons for image processing

First, thank you for these wonderful lessons. I know the lessons recently migrated from OpenCV to skimage; would it be possible to provide the ability to follow the lessons based on either library?

I had already done a few lessons a couple of weeks back, and today when I wanted to continue I found that the library had changed. It would be great to have the option to follow the lessons with whichever library one started with.

Include program/script names

It would be good to go through the whole lesson and make sure that anywhere a code chunk displays a script, it is clearly labelled with the name of the script as found on the virtual machine.
