
Comments (6)

damienmg commented on May 23, 2024

/cc @laurentlb @adonovan @stepancheg

I would tend towards Stepan's understanding of the specification: you should not get access to the global scope of another module, only to the currently defined global scope, as the specification clearly states that you need to import symbols from another scope.

Now, I don't see how your usage is incompatible. Could you show a simple use case for us to discuss?

from starlark-rust.

alandonovan commented on May 23, 2024

Python, Starlark-in-Go, and Bazel's Starlark all have some concept of thread-local state that can be set and accessed by the application but is otherwise simply passed through the interpreter. So, you can attach values to the thread before calling Exec, and access those same values from within a built-in function. The structure of the thread-local store can be a map from string to object, or a typed version of the same thing (such as T.class -> T).

Implementing this feature does not require thread-local store in the host language (e.g. ThreadLocal in Java; Go does not have goroutine-local store, by design).
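A minimal sketch of that idea, in Python for brevity. The `Thread` class, its `set_local`/`local` methods, `exec_call`, and the `cwd_join` built-in are all hypothetical names used for illustration; they are not the API of any of the interpreters mentioned above. The point is only that the store is an ordinary map carried on an interpreter-level object, so no host-language thread-local storage is involved.

```python
from typing import Any, Callable, Dict


class Thread:
    """Hypothetical interpreter thread carrying state the interpreter never inspects."""

    def __init__(self) -> None:
        self._locals: Dict[str, Any] = {}

    def set_local(self, key: str, value: Any) -> None:
        # The application attaches values before execution starts.
        self._locals[key] = value

    def local(self, key: str) -> Any:
        # Returns None when the key was never set.
        return self._locals.get(key)


def exec_call(thread: Thread, builtin: Callable[..., Any], *args: Any) -> Any:
    """Stand-in for the interpreter invoking a built-in: it only forwards the thread."""
    return builtin(thread, *args)


def cwd_join(thread: Thread, name: str) -> str:
    """A built-in that reads application state from the thread, not from script globals."""
    return thread.local("cwd") + "/" + name


thread = Thread()
thread.set_local("cwd", "/work")
result = exec_call(thread, cwd_join, "config.bzl")
print(result)
```

In this sketch the script never sees the `cwd` entry; only built-ins that receive the thread can read it.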


indygreg commented on May 23, 2024

Here is my use case, with pointers to the code using version 0.2 of this crate.

There exists an EnvironmentContext Rust struct (https://github.com/indygreg/PyOxidizer/blob/3a3812c5e7e8ed4fd0710083646efb8a6c50deed/pyoxidizer/src/starlark/env.rs#L44) holding onto a bunch of global state needed by various Rust-implemented Starlark types/functions. This type implements TypedValue (https://github.com/indygreg/PyOxidizer/blob/3a3812c5e7e8ed4fd0710083646efb8a6c50deed/pyoxidizer/src/starlark/env.rs#L307). But the type is not registered as a type value on Environment. Instead, during construction of the global Environment, we env.set("CONTEXT", ...) an instance of this struct (https://github.com/indygreg/PyOxidizer/blob/3a3812c5e7e8ed4fd0710083646efb8a6c50deed/pyoxidizer/src/starlark/env.rs#L545), effectively making it a global variable inside of the Starlark evaluation environment. A few lines below, we also assign other global variables like CWD and CONFIG_PATH, but from more primitive types (namely str).

These .set() calls update the variables map, not typed values.

For various Rust-implemented Starlark functions/methods, we call env.get("CONTEXT") to get a handle on CONTEXT and downcast it to EnvironmentContext so the Rust code can access its non-exposed-to-Starlark fields. e.g. https://github.com/indygreg/PyOxidizer/blob/3a3812c5e7e8ed4fd0710083646efb8a6c50deed/pyoxidizer/src/starlark/python_distribution.rs#L186.

We also have some of these UPPERCASE variables documented as part of the API contract for the execution environment so user-provided Starlark code can reference their values. And Rust's reading of global variables isn't limited to CONTEXT: it may also read other variables from the outer/global scope. So the global variables are not just opaque global state.

As for mapping PyOxidizer onto the Starlark specification, we should only have a single module at play here. According to the Starlark spec: "Each module is defined by a single UTF-8-encoded text file." There is only a single PyOxidizer configuration file evaluating at a time and no load() is provided. So we will have exactly 1 module in any evaluation context. So there is no "other module" that symbols may be coming from, just the lexical blocks in a single module.

From the Starlark specification:

Nested beneath the universe block is the module block, which contains the bindings of the current file. Bindings in the module block (such as a, b, c, and h in the example) are called global. The module block is typically empty at the start of the file and is populated by top-level binding statements, but an application may pre-bind one or more global names, to provide domain-specific functions to that file, for example.

So conceptually there is a universe block and a module block. Now, the Rust Starlark implementation doesn't seem to make this distinction: it just has an Environment which you build up by adding symbols. You even need to add the standard library symbols into this Environment! This is where PyOxidizer is injecting its symbols. PyOxidizer simply populates a fresh Environment with the standard library symbols plus its custom symbols: https://github.com/indygreg/PyOxidizer/blob/3a3812c5e7e8ed4fd0710083646efb8a6c50deed/pyoxidizer/src/starlark/env.rs#L536. It then evaluates a file with this Environment: https://github.com/indygreg/PyOxidizer/blob/3a3812c5e7e8ed4fd0710083646efb8a6c50deed/pyoxidizer/src/starlark/eval.rs#L63.
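The universe/module layering described in the specification can be sketched as two chained lookup tables. This is an illustrative Python model only, not how the Rust crate's Environment is implemented; the `build_path` entry and the contents of both dictionaries are made-up placeholders.

```python
from collections import ChainMap

# Universe block: names every file can see (a tiny stand-in for the standard library).
universe = {"None": None, "len": len}

# Module block: names the application pre-binds before the file runs.
module = {"CONTEXT": {"build_path": "build"}, "CWD": "/work"}

# Name resolution checks the module block first, then falls back to the universe.
scope = ChainMap(module, universe)

# A top-level binding statement in the file lands in the module block only.
scope["MESSAGE"] = "hello, world"

print(scope["CWD"])           # resolved from the module block
print(scope["len"]("abc"))    # resolved from the universe block
print("MESSAGE" in universe)  # the universe is left untouched
```

Under this model, pre-binding CONTEXT into the module block is just an insertion into the inner table before evaluation starts, which is what the quoted passage of the specification permits.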

A Starlark file has access to symbols in outer/higher blocks:

Welcome to Starlark REPL, press Ctrl+D to exit.
>>> MESSAGE = "hello, world"
>>> def do_it():
...     print(MESSAGE)
...
<function do_it from repl>()
>>> do_it()
hello, world
>>>

This has to be the case, otherwise global symbols like None would not work!

Functions defined and implemented in Starlark obviously have access to these outer symbols.

But Starlark functions implemented in Rust do not have access to these outer variables because they no longer have access to the Environment they are running from. I don't think this is correct.

What could be causing confusion here is that in the case of PyOxidizer, the Rust-implemented Starlark functions are being injected into module/file scope: they are not defined in a separate module. Conceptually, I'm taking a user-provided Starlark configuration file and prepending a prelude containing PyOxidizer's set of symbols. This enables me to keep user-facing Starlark configuration files simpler, as config files don't need to worry about loading/importing additional files/modules/symbols. I have also chosen to implement various Starlark functions/methods in Rust instead of Starlark itself because I like the guarantees the Rust compiler ensures at compile time. If push came to shove, I could certainly move functionality from Rust to Starlark so the Starlark-implemented functions could get access to global variables then pass this value to a Rust-implemented function. e.g.

# In Starlark prelude provided by PyOxidizer - implemented in Starlark
CONTEXT = ...

def my_func():
    _real_my_func(CONTEXT) # calls a Rust-implemented function with a handle on the global

# User-provided Starlark content follows
my_func() # Calls Starlark native function, which calls a Rust-implemented function.

This would work. But it feels extremely hacky and I have a strong preference to avoid doing this.

I can see why it is useful to have multiple modules and separation of symbols between them. But for PyOxidizer I have purposefully rejected this approach in the name of simplicity. The Starlark specification doesn't say "you can't inject additional global symbols into module scope" or "you must use a separate module for a file from the ones providing the universe/module block." In contrast, it seems to encourage PyOxidizer's use case via the "an application may pre-bind one or more global names, to provide domain-specific functions to that file" language. This is why I think the change to this crate removing access to Environment and the global variables it exposes is too constraining and doesn't allow full use of Starlark as defined by its specification.


damienmg commented on May 23, 2024

IIUC, your use case is totally legit and should work. I would need to build a repro to understand what is going wrong.


damienmg commented on May 23, 2024

OK, I got around to doing a test, and could not repro with this simple test case:

$ cargo run 
   Compiling starlark-repl v0.3.1-pre (/Users/dmarting/git/skylark-rust/starlark-repl)
    Finished dev [unoptimized + debuginfo] target(s) in 6.42s
     Running `/Users/dmarting/git/skylark-rust/target/debug/starlark-rust`
Welcome to Starlark REPL, press Ctrl+D to exit.
>>> def f():
...   return CONTEXT
... 
<function f from repl>()
>>> f()
"test"
>>> 

Goodbye!
$ git diff
diff --git a/starlark-repl/src/lib.rs b/starlark-repl/src/lib.rs
index bb66a6e..fcf4441 100644
--- a/starlark-repl/src/lib.rs
+++ b/starlark-repl/src/lib.rs
@@ -121,6 +121,7 @@ pub fn repl(global_environment: &mut Environment, type_values: &TypeValues, dial
     let mut env = global_environment.child("repl");
     let mut n = 0;
 
+    env.set("CONTEXT", Value::from("test")).unwrap();
     // Linefeed default history size is unlimited,
     // but since we write history to disk, we better limit it.
     reader.set_history_size(100_000);
$ 

That's probably because I misunderstood your use case. Care to put together a simple repro?


indygreg commented on May 23, 2024

I worked around this in PyOxidizer by registering attributes on a dummy type value that are effectively aliases to Value in the variable scope: indygreg/PyOxidizer@6b0f92b. This works for read-only Value. While I haven't tried, I suspect that if a variable is replaced, the type value reference will point to the old Value.

Anyway, this allows me to access global symbols from Rust-defined functions while only having access to TypeValues. This unblocked me from upgrading to version 0.3 of this crate.

I want to emphasize that this solution of using TypeValues to stash a reference to a variable or other global Value feels extremely hacky. But it did unblock me from upgrading to version 0.3 of the crate.

From your last comment, I think the context being lost here is that Rust-implemented Starlark functions cannot access symbols outside of TypeValues. A Starlark-defined function can access variables from outer scopes. But due to the Rust API change of passing TypeValues instead of Environment to the Rust functions, there is no longer any way to access variables, only TypeValues.
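The suspected staleness of the alias workaround can be modeled with two plain dictionaries. This Python sketch is an assumption-laden illustration, not a test of the crate: `variables` stands in for the Environment's variable map, `type_values` for the side table visible to Rust-implemented functions, and both function names are hypothetical.

```python
# Variables map: what Starlark code reads and rebinds.
variables = {"CONTEXT": "old context"}

# Hypothetical side table: a snapshot of the value taken once at setup time.
type_values = {"CONTEXT": variables["CONTEXT"]}


def rust_builtin():
    """Stand-in for a Rust-implemented function that can only see the side table."""
    return type_values["CONTEXT"]


def starlark_function():
    """Stand-in for a Starlark-defined function, which resolves names in the variables map."""
    return variables["CONTEXT"]


before = (rust_builtin(), starlark_function())
variables["CONTEXT"] = "new context"  # the script rebinds the global
after = (rust_builtin(), starlark_function())

print(before)
print(after)
```

In this model both views agree until the global is rebound, after which only the variables map reflects the new value. Whether the crate behaves the same way is untested here, as noted above.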

