
Comments (22)

hadley commented on June 18, 2024

I only skimmed most of this thread, but if you really want a fully reproducible example, using a clean R session is the only way to do it. Otherwise you need to think about every way that the user might have changed their session, and there's a lot of them! (options(), par(), S4, library(), ...)

from reprex.

jennybc commented on June 18, 2024

Yes please do that. I had already thought about this.

dgrtwo commented on June 18, 2024

This was a lot harder than I thought because I seem to have run into a knitr bug, or at least my own misunderstanding. Stack Overflow query here. Can you see any mistake I made?

jennybc commented on June 18, 2024

Can't look at it now but will later. This will also potentially interact with the dput-y stuff from #7, right?

dgrtwo commented on June 18, 2024

Yes, that's a concern.

jennybc commented on June 18, 2024

More horrifying is that the reprex code currently changes the environment of reprex_(), so if the reprex happens to use important variable names, bad things happen. Will use the envir argument of render() to seal this off.

jennybc commented on June 18, 2024

Would you think about this with me?

I have fixed the main problem, which was that objects in the user's workspace would be available when executing the reprex. Another problem with not using envir = was that the reprex could also change the environment of its caller, reprex_(), and, with an unlucky choice of variable name, break things.
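The second problem can be reproduced in miniature. The sketch below uses base R's eval() as a stand-in for rmarkdown::render() (an assumption, made so the example needs no extra packages); the helper names run_in_caller() and run_sealed() are hypothetical.

```r
# eval() stands in for rmarkdown::render(); both take an `envir` argument
# that defaults to the calling frame.
run_in_caller <- function(code) {
  important <- "keep me"
  # No `envir` given: the code is evaluated in this function's own frame,
  # so it can overwrite local variables.
  eval(parse(text = code))
  important
}

run_sealed <- function(code) {
  important <- "keep me"
  # A fresh environment absorbs any assignments made by the code.
  eval(parse(text = code), envir = new.env(parent = globalenv()))
  important
}

user_code <- 'important <- "clobbered"'
unsealed_result <- run_in_caller(user_code)  # the local variable was overwritten
sealed_result <- run_sealed(user_code)       # the local variable survived
```

Sealing with any fresh environment protects the caller; which parent that environment should have is a separate question.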

OK yes we must use envir = new.env(...) inside rmarkdown::render(). That much is clear. But I'm having trouble figuring out what to set it to. Here's the most recent state: a7459fc.

As you observed on stackoverflow, new.env(parent = baseenv()) is too restrictive. For example, if I put this code on the clipboard:

(y <- 1:4)
mean(y)
search()
ls()
plot(1:100)

and use reprex() with render(..., envir = new.env(parent = baseenv())), I get this result:

(y <- 1:4)
#> [1] 1 2 3 4
mean(y)
#> [1] 2.5
search()
#>  [1] ".GlobalEnv"        "package:reprex"    "package:ggplot2"  
#>  [4] "package:testthat"  "devtools_shims"    "tools:rstudio"    
#>  [7] "package:stats"     "package:graphics"  "package:grDevices"
#> [10] "package:utils"     "package:datasets"  "package:devtools" 
#> [13] "package:methods"   "Autoloads"         "package:base"
ls()
#> [1] "y"
plot(1:100)
#> Error in eval(expr, envir, enclos): could not find function "plot"

I'm puzzled why graphics shows up on the search path but plot cannot be found. But I admit to some confusion about the enclosing environment vs parent frame of a function.

Following the other suggestion from Stack Overflow, I tried render(..., envir = new.env(parent = as.environment(2))), which indeed addresses our main concerns. This is where we are now. If I repeat the example above, all is well:

(y <- 1:4)
#> [1] 1 2 3 4
mean(y)
#> [1] 2.5
search()
#>  [1] ".GlobalEnv"        "devtools_shims"    "package:reprex"   
#>  [4] "tools:rstudio"     "package:stats"     "package:graphics" 
#>  [7] "package:grDevices" "package:utils"     "package:datasets" 
#> [10] "package:devtools"  "package:methods"   "Autoloads"        
#> [13] "package:base"
ls()
#> [1] "y"
plot(1:100)

But now there's another issue: packages loaded by the user are available when the reprex executes.

Final example.

Interactively, I load ggplot2 via library(ggplot2). Then put this code on the clipboard, which makes use of qplot(), despite the reprex itself containing no mention of ggplot2:

(y <- 1:4)
mean(y)
search()
ls()
qplot(rnorm(100))

I get this result:

(y <- 1:4)
#> [1] 1 2 3 4
mean(y)
#> [1] 2.5
search()
#>  [1] ".GlobalEnv"        "package:ggplot2"   "devtools_shims"   
#>  [4] "package:reprex"    "tools:rstudio"     "package:stats"    
#>  [7] "package:graphics"  "package:grDevices" "package:utils"    
#> [10] "package:datasets"  "package:devtools"  "package:methods"  
#> [13] "Autoloads"         "package:base"
ls()
#> [1] "y"
qplot(rnorm(100))
#> stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to adjust this.

which means we still have a reproducibility problem. How do we get a fresh environment as if we'd just started up R?

jennybc commented on June 18, 2024

I just re-read parts of the Environments chapter of Hadley's book.

I think the current solution is "pretty good" and the best solution is to do a devtools::clean_source() type of thing. This is why I redesigned reprex_() to take reprex source in a file rather than as a character vector anyway. Silver lining: it will solve the remaining problem here with inheriting packages from the interactive workspace AND should allow us to reprex() examples that call reprex() itself (or rmarkdown::render(), knitr functions, etc.) (#10). So I think I have a plan.

dgrtwo commented on June 18, 2024

I'm working on this now as well (and re-reading the same chapter)!

So you're thinking of performing knitr through the command line rather than the knit function, as clean_source does? I understand but I'm surprised it's necessary!

jennybc commented on June 18, 2024

It may not be necessary, though I haven't figured out how to avoid it to be honest.

I was just thinking that writing the temporary wrapper, as I do in the README, and rendering the reprex R file that way would kill two birds with one stone, so why not do it, if devtools is installed? The current solution could be a fall back.

But agree it does seem like we should be able to keep the reprex code from using packages that aren't loaded in the reprex. 😕

jennybc commented on June 18, 2024

We could just ask hadley.

klmr commented on June 18, 2024

I think my amended answer on Stack Overflow should solve this problem.

But more philosophically, I would actually have expected the reprex function to work slightly differently:

Given the following input

x = 2
y = 3
reprex(x + 1)

I intuitively would have expected reprex to pull all dependencies (but only those actually used — here: x but not y) into the resulting example. In other words, the output for the above would be:

``` r
x <- 2
x + 1
#> [1] 3
```

Implementing this should be possible but it’s not exactly trivial.

jennybc commented on June 18, 2024

Thanks @klmr for your wisdom here and over on stackoverflow!

I don't have the same expectation of how clever reprex() should be, but others like @dgrtwo share some of your instincts. Head on over to #7 to see that discussion.

Luckily, I'm totally comfortable with forcing the user to make a self-contained example 😬.

klmr commented on June 18, 2024

It’s a valid design, of course.

The only issue I can see with this design and my suggested solution is that reprex needs to distinguish between user-loaded packages and default packages: When using an environment anchored in emptyenv(), no attached package (except package:base) will be available. Conversely, when using as.environment(2), all loaded packages will.

Hence, a robust solution would create an empty environment (anchored in baseenv()) and then use the method from the Stack Overflow answer to attach the default packages to it — these are ostensibly given by getOption('defaultPackages'), but this option could of course have been overridden by the user (in fact, I do this to load additional packages in .Rprofile in the right order). Hence, I would just hard-code this list, or maybe use the packages marked with Priority == 'base' in installed.packages() (but not all of these packages are attached by default).
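The Stack Overflow method itself is not reproduced in this thread, so the following is only a rough sketch of the idea under stated assumptions: it hard-codes an abbreviated list of default packages and copies each package's exports into a chain of environments sitting on top of baseenv(). The name make_clean_env() is hypothetical, and copying exports is not the same as truly attaching a package (for example, lazy-loaded datasets are not covered).

```r
# Hard-coded subset of the packages R attaches by default (assumption:
# abbreviated for this sketch; "datasets" and "methods" are left out).
default_pkgs <- c("stats", "graphics", "grDevices", "utils")

make_clean_env <- function(pkgs = default_pkgs) {
  parent <- baseenv()
  # One layer per package; the last package ends up closest to baseenv(),
  # mimicking the order of the search path.
  for (pkg in rev(pkgs)) {
    exports <- getNamespaceExports(pkg)
    values <- lapply(exports, function(nm) getExportedValue(pkg, nm))
    names(values) <- exports
    parent <- list2env(values, parent = parent)
  }
  # The empty environment in which the code would be evaluated.
  new.env(parent = parent)
}

env <- make_clean_env()
clean_median <- eval(quote(median(c(1, 2, 10))), env)  # stats is reachable
sees_qplot <- exists("qplot", envir = env)             # ggplot2 is not
```

Because the chain never passes through globalenv() or the real search path, packages the user attached interactively stay invisible, whatever getOption('defaultPackages') has been set to.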

jennybc commented on June 18, 2024

I think if I execute reprex_() (the inner workhorse function that operates on the reprex code as an augmented R script) via a wrapper that uses devtools::clean_source(), we get a decent result w.r.t. attached packages.

The reprex code will then execute in an R session launched with these options:

--no-site-file --no-environ --no-save --no-restore

Hmmm.... maybe one would also want --no-init-file?
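For illustration, here is a minimal sketch of running a script in a fresh R process with those flags. It uses only base R (system2() and Rscript) and is not how devtools::clean_source() is implemented; run_fresh() and workspace_only are hypothetical names.

```r
# Run an R script in a new R process, using the startup flags listed above.
run_fresh <- function(path) {
  rscript <- file.path(R.home("bin"), "Rscript")
  flags <- c("--no-site-file", "--no-environ", "--no-save", "--no-restore")
  system2(rscript, c(flags, shQuote(path)), stdout = TRUE, stderr = TRUE)
}

# An object that exists only in the current (parent) session.
workspace_only <- "visible only in the parent session"

# The child script reports whether it can see that object.
script <- tempfile(fileext = ".R")
writeLines('cat(exists("workspace_only"), fill = TRUE)', script)
fresh_output <- run_fresh(script)
```

The child process starts with its own workspace and search path, so nothing attached or defined interactively leaks in. A user's .Rprofile would still be read unless --no-init-file is added as well.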

I do understand @dgrtwo's distaste re: a command line solution. But I'm also interested in solving the "nested render" problem (see #10 here and also rstudio/rmarkdown#248). Until that's fixed, reprexes cannot contain any code that (indirectly) calls rmarkdown::render(), knitr::knit2html(), etc., which includes reprex() itself.

I could implement what you suggest, @klmr, as an alternative method of execution, so that devtools is only an absolute requirement for nested render.

Would appreciate any thoughts.

dgrtwo commented on June 18, 2024

@jennybc I'm very much starting to come around on a command line! It does get at the Platonic ideal of "Open up a fresh R session; can you still do this?"

Implementation- you or me?

jennybc commented on June 18, 2024

I'd like to do it. I am still using these packages as stretch goals for myself in terms of upping my R skilz. But I am so happy to have you on board :)

dgrtwo commented on June 18, 2024

Cool!

One note from early looking at it: you're going to run into a problem of passing upload.fun through. If it's an arbitrary, user-defined function, the best you could do is turn it into R source with deparse. But then you can't get that code to fit on the command line (it'll contain newlines). Basically there's no convenient way to "serialize" the given function.

I'd suggest always using imgur_upload, for now- I think this goal is more important than allowing image upload customization. An alternative is allowing it to take a character string like "knitr::imgur_upload" that can change: that way, if knitr or another package adds additional uploading functions, they can be provided as a string.
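A small sketch of both points, assuming the string has the form "pkg::fun". The helper resolve_fun() and the example function custom_upload() are hypothetical, and stats::median is used in place of knitr::imgur_upload so the example runs without knitr installed.

```r
# A user-defined function deparses to several lines of source, which is
# awkward to pass as a single command-line argument.
custom_upload <- function(file) {
  message("uploading ", file)
  file
}
src <- deparse(custom_upload)
n_lines <- length(src)  # more than one line

# A "pkg::fun" string, by contrast, is a single token that the child
# process can turn back into a function.
resolve_fun <- function(spec) {
  parts <- strsplit(spec, "::", fixed = TRUE)[[1]]
  if (length(parts) != 2) stop("expected a string of the form 'pkg::fun'")
  getExportedValue(parts[1], parts[2])
}

f <- resolve_fun("stats::median")
resolved_value <- f(c(1, 2, 10))
```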

dgrtwo commented on June 18, 2024

This will also make #20 a bit harder to solve, since you'll need to pass rendering errors back through the command line. Let me know if you get stuck, I have a few thoughts!

jennybc commented on June 18, 2024

I hope to return to this ... over the weekend .. if I survive the week.

For my edification: Could one of you help me understand why, in my example above where I used render(..., envir = new.env(parent = baseenv())), we see the graphics package on the search path and yet the function plot cannot be found at runtime? I would really like to understand that.

klmr commented on June 18, 2024

@jennybc Do you know How R Searches and Finds Stuff? It’s an excellent introduction into how R looks up objects.

The short answer here is that R doesn’t actually use the search() packages. What R actually does to find an object is the following:

  1. Set the env to the current environment.
  2. Is env equal to emptyenv()?
    1. If yes, raise an error.
    2. If no, continue.
  3. Does the object exist in env?
    1. If yes, stop.
    2. If no, set env to parent.env(env).
    3. Go to 2.
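The steps above can be written out as a short R function. This is a simplified sketch, not R's internal implementation: find_env() is a hypothetical name, and real lookup of a function call additionally skips bindings that are not functions. hist() is used as the example from graphics, since in R 4.0.0 and later the plot() generic itself lives in base.

```r
# Walk up the chain of parent environments until `name` is found.
find_env <- function(name, env = parent.frame()) {
  while (!identical(env, emptyenv())) {          # step 2
    if (exists(name, envir = env, inherits = FALSE)) {
      return(env)                                # step 3.1
    }
    env <- parent.env(env)                       # step 3.2, then back to step 2
  }
  stop("object '", name, "' not found")          # step 2.1
}

# From the global environment the walk passes through the attached packages.
hist_home <- find_env("hist", globalenv())

# From an environment whose parent is baseenv() it never reaches them.
sealed_lookup <- tryCatch(
  find_env("hist", new.env(parent = baseenv())),
  error = function(e) conditionMessage(e)
)
```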

In other words, R searches through the chain of parent environments, and nothing else. search() is just a handy command to display all the parent environments of globalenv(). Most environments that we encounter in R are child environments of globalenv() [1], which is why we can just use plot inside a function, say, and R will find its definition by walking up the chain of parent environments.

But for render(…, envir = new.env(parent = baseenv())), we are explicitly detaching ourselves from this chain of parent environments. Here’s what the environments in our case look like:

+---------+      +-----------+      +------------+
|  envir  | ---> | baseenv() | ---> | emptyenv() |
+---------+      +-----------+      +------------+

The environments in search() are simply never reached here.
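The two chains discussed in this thread can be listed directly. In this sketch env_chain() is a hypothetical helper, the second case assumes the default packages are attached, and hist() again stands in for plot().

```r
# List the names of an environment and all of its ancestors, stopping
# before emptyenv(). Unnamed environments show up as "".
env_chain <- function(env) {
  chain <- character()
  while (!identical(env, emptyenv())) {
    chain <- c(chain, environmentName(env))
    env <- parent.env(env)
  }
  chain
}

# Parent is baseenv(): the chain is just the new environment and base.
sealed <- new.env(parent = baseenv())
sealed_chain <- env_chain(sealed)
sealed_sees_hist <- exists("hist", envir = sealed)

# Parent is the second entry of search(): the chain skips globalenv()
# but still runs through every attached package.
attached <- new.env(parent = as.environment(2))
attached_chain <- env_chain(attached)
attached_sees_hist <- exists("hist", envir = attached)
```

Comparing sealed_chain with attached_chain shows why the first setup cannot find graphics functions while the second also picks up anything the user has attached.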


[1] Package namespaces are an important exception to this rule. They are explained in the article linked above.

jennybc commented on June 18, 2024

Thanks @klmr for your explanation. The environment chain is becoming more clear to me.

@dgrtwo I'll take you up on your Twitter offer! Here's my current progress: bd4e134. It works fine if the reprex doesn't contain a call to reprex(). Good news: there is no longer any "leakage" of attached packages from the current session, etc. Bad news: the nested render problem still isn't solved and in general I am not handling/catching the rendering errors very well.

You can see some of this success and some failure by running this and (de-)commenting the 3rd and/or 4th lines:

(y <- 1:4)
mean(y)
#qplot(rnorm(100))
#reprex::reprex({1 + 1})
