
kuroko's People

Contributors

harjitmoe, klange

kuroko's Issues

keyword arguments can't default to themselves

this bug is a little awkward to describe in words, so instead, consider the code:

def greet(print=print):
    print("Hello, world.")

greet()
Traceback (most recent call last):
  File "issue01b.krk", line 4, in <module>
    greet()
  File "issue01b.krk", line 2, in greet
    print("Hello, world.")
TypeError: 'object' object is not callable

at first, I thought this behavior was specific to built-ins, but I just now thought to try it with other types:

let message = "Hello, world."

def greet(message=message):
    print(repr(message))

greet()

ArgumentError: repr() takes exactly 1 argument (0 given)

this bug only occurs when the left hand side and right hand side are the same. therefore, these programs will work:

let message = "Hello, world."

def greet(message_=message, print_=print):
    print_(message_)

greet()

let message_ = "Hello, world."
let print_ = print

def greet(message=message_, print=print_):
    print(message)

greet()

tested on commit 8fb1689.
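
For comparison, here is a minimal sketch of the behaviour the report appears to expect, written in Python rather than Kuroko. It assumes Kuroko intends to match Python here: a default expression is evaluated in the enclosing scope when the def statement runs, so a parameter may share its name with the value it defaults to. This is Python reference behaviour, not Kuroko output.

```python
# Python reference behaviour (assumed target semantics, not Kuroko output).
message = "Hello, world."

def greet(message=message, print=print):
    # Both defaults were evaluated in the enclosing scope at definition time,
    # so the parameters hold the global string and the built-in print.
    print(message)
    return message

result = greet()
```

Inspecting greet.__defaults__ in Python shows the captured values, which is a quick way to confirm what a default was bound to.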

Found a possible security concern

Hey there!

I belong to an open source security research community, and a member (@geeknik) has found an issue, but doesn’t know the best way to disclose it.

If not a hassle, might you kindly add a SECURITY.md file with an email, or another contact method? GitHub recommends this best practice to ensure security issues are responsibly disclosed, and it would serve as a simple instruction for security researchers in the future.

Thank you for your consideration, and I look forward to hearing from you!

(cc @huntr-helper)

[Termux] TypeError: __init__() expects list, not 'list'

~/downloads/kuroko $ ./kuroko
Kuroko 1.3.1 (Oct 18 2022 at 19:47:41) with clang 15.0.2
Type `help` for guidance, `paste()` to toggle automatic indentation, `license` for copyright information.
>>> list()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __init__() expects list, not 'list'
>>> [1,2,3]
 => [1, 2, 3]
>>> range(5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __init__() expects range, not 'range'
>>> list(range(5))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __init__() expects range, not 'range'
>>>
~/.../kuroko/test $ ../kuroko testThreading.krk
Starting 10 threads.
Traceback (most recent call last):
  File "testThreading.krk", line 33, in <module>
    let numbers    = list(range(totalCount))
TypeError: __init__() expects range, not 'range'

“Missing” output

On every new release of kuroko I tell myself that now is the time to learn kuroko, and every time I give up after a line or two in the REPL because it doesn't seem to show any output. This time is different, though, because I'm gonna file an issue where I ask if this is the intended behaviour:

[Screenshot: kuroko-silent]

Can I disable syntax highlighting, if that is what is interfering with the dark shell theme that I'm using?


  • Kuroko v1.4.0 (compiled from source with gcc v13.2.1)
  • Terminator v2.1.3 (terminal emulator)
  • Fish v3.6.1 (shell)
  • Linux x86_64

`asyncio` module

Create a module implementing the interface from Python's asyncio:

  • Provide at least one event loop, probably with poll.
  • Support at least basic Futures, timers.
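
As a reference point for the interface being requested, the following sketch uses Python's own asyncio to exercise the three pieces named above: an event loop, a Future, and a timer. It only shows what the Python side looks like; how a Kuroko module would implement or expose these is not specified in this ticket.

```python
import asyncio

async def main():
    loop = asyncio.get_running_loop()
    future = loop.create_future()
    # Timer: ask the loop to resolve the future roughly 10 ms from now.
    loop.call_later(0.01, future.set_result, "done")
    # Awaiting suspends this coroutine until the timer callback fires.
    return await future

result = asyncio.run(main())
```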

Kuroko 1.3

This ticket is a draft of the release notes and TODO list for Kuroko 1.3, the next major release. 1.3 will follow 1.2.5.

Features:

  • Optimized method invocation instructions
  • Cached methods for operators.
  • Loop/iterator optimizations.
  • else branches for for and while loops.
  • Support for = in f-string expressions.
  • Relative import statements.
  • Improved support for package modules.
  • Overloads for in-place binary operators (e.g., __iadd__)
  • Arbitrary precision integers with the long type, with seamless switchover back and forth from int.
  • Added comparator methods for tuples and lists.
  • Support for exception contexts when an exception is raised in an except block.
  • raise ... from ... to set causes for exceptions.
  • Private identifier mangling rules within classes, following Python's rules.
  • Format specs in f-strings, __format__ implementations for strings and ints/longs.
  • __setattr__, __set_name__
  • Additional operators: unary +, infix @
  • Compiler concatenation of unalike string tokens (f-strings, r-strings, and regular strings)
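
Several of the features above are described as following Python. The sketch below shows the Python reference behaviour for four of them (private identifier mangling, else on a for loop, = in f-string expressions, and raise ... from ...). It is Python code, and it is only an assumption that Kuroko 1.3 matches it in every detail.

```python
class Counter:
    def __init__(self):
        self.__count = 0  # stored as _Counter__count after mangling

    def bump(self):
        self.__count += 1
        return self.__count

c = Counter()
c.bump()
mangled = c._Counter__count

# The else branch of a for loop runs only if the loop did not break.
for n in [1, 3, 5]:
    if n % 2 == 0:
        found_even = True
        break
else:
    found_even = False

value = 42
debug = f"{value=}"  # '=' in an f-string expression echoes the expression text

try:
    try:
        {}["missing"]
    except KeyError as inner:
        # 'from' records the original exception as the cause.
        raise ValueError("lookup failed") from inner
except ValueError as outer:
    cause = outer.__cause__
```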

Notable bug fixes:

  • Resolved several issues with memory allocation tracking for objects of flexible size.
  • Resolved an issue where value stacks referenced as argument lists in native functions could be reallocated, resulting in stale references.
  • Resolved various issues in which exceptions raised in nested managed execution could result in incorrect stack pointer offsets.

Can colourisation be disabled?

In short: can the REPL colourisation ‘theme’ be disabled or changed?

While I like pretty colours, a lot of the text in the kuroko REPL is black on black — even when compiled with KRK_DISABLE_RLINE=1 or invoked as kuroko -r — and thus illegible:

$ kuroko -r
Kuroko 1.1.2 (Nov 14 2021 at 12:59:00) with GCC 11.1.0
Type `help` for guidance, `paste()` to toggle automatic indentation, `license` for copyright information.
>>> help

>>> 2

>>> dir('')


>>>

In order to see the output I have to paste it into an editor, which is counter-productive.

Compile to native code?

I apologize if this is a dumb question, and I don't suppose it is on your agenda for kuroko's future, but is compiling kuroko to native code planned? That would be hella freakin cool. Also, a separate question: do you plan on registering kuroko on GitHub Linguist as a separate programming language? Usually registering a language requires it to have 200 or so separate repos written in that language by separate users. I know it hasn't garnered much attention, but I just wanted to know if you had any such plans.

Remove/Rename un-namespaced macros

We expose a lot of macros which are not prefixed with KRK_ or krk_. To provide a clean API, these should be renamed.

  • *_VAL macros for creating value cells. These should be renamed to look like function calls and use the correct type names, eg. krk_new_int or something similar.
  • AS_* for extracting from value cells and objects. These should at least be prefixed, and the value ones should be updated to use the right names. krk_as_?
  • IS_* type checking convenience macros. Same as above.
  • Various old-style argument checking macros like METHOD_TAKES_*/FUNCTION_TAKES_*.
  • Binding macros, like BIND_METHOD, and the related MAKE_CLASS macro (which doesn't seem to be used?)

While we're at it, removing the old-style KRK_FUNC/KRK_METHOD macros would be worthwhile. All of the code I have on hand has been converted to the new-style KRK_Function/KRK_Method macros.

Kuroko 1.6 Roadmap

  • Nested destructuring in for loop assignment lists
  • Arbitrary assignment targets in for loops
    • Eg. object members
  • Some sort of syntax for not implicitly declaring names in various block statements?
    • Includes for, with ... as ..., def.
  • Ordered dicts by default?
  • Bare raise

v1.1 Roadmap / Ideas

This ticket is for tracking ideas for the next full release of Kuroko.

1.0 was essentially a demo, the first version of Kuroko as a viable language built from the core of Crafting Interpreters. 1.1 is intended to be a "real" release - one with a production-ready release process, "complete" standard library, and stable ABI/API for extension developers and embedders.

  • Standard library improvements:
    • subprocess module, etc. (can we do this in managed code around the os module?)
    • Complete os coverage (what's still missing that we need? We have the exec family, fork, ...)
    • socket module
      • useful extensions (http.server; requests implementation?)
    • Merge harjit's codecs implementation
    • Pull object methods out of vm.c and use macros from bim development branch.
    • str methods: replace, startswith, endswith, find/index
    • isalpha (etc.); support other sequences in join
    • Mutable sequence methods: index, count, clear, copy, remove, reverse
    • del on slices
    • Mapping methods: clear,copy,get,setdefault,update
    • Pull set into C core
  • Add class decorators (bim syntax files would really like to have this)
  • Threading? Or maybe that's best left way on the backlog for "2.0". I actually sat down and did this, though it's still missing locks in a lot of places and is only known to work on Linux...
  • Persist module bytecode with a marshaling format. Bytecode itself can be directly saved; function constants tables are a bit trickier. Format stability is not important, but we should be able to write once and then read from future runs of the same interpreter version like Python does with 'pyc' files. This is experimental, but a functional demonstration is provided in 1.1
  • Finish the documentation rewrite (https://kuroko-lang.github.io/docs.html)
    • Add docstring bindings for everything in the standard library. (__builtins__ still needs full coverage of types)
    • Build tools for turning docstring docs into usable HTML documentation, like Sphinx does.
    • Complete C API documentation, follow a standard like Doxygen, add build tools for it.
    • Prose docs for embedding and extending, building from source, something like the Python Tutorial.
  • Rewrite/reorganize headers:
    • C library users should only need to include <kuroko/kuroko.h> and not separate headers.
    • Library sources should also be including things like <kuroko/kuroko.h> instead of "kuroko.h"
    • Need a "config.h" to hold build-time settings like whether threads are enabled, as they affect struct definitions. This is still a problem, but not as much as it was...
  • Add an autotools-like configure script so we can stop worrying about weirdness when building for win32? Maybe also integrate some of the WASM build scripts into that...
  • Make class fields work like in Python / eliminate the "fields"/"methods" split?
  • Implement multiple inheritance, runtime attribute lookup ordering? Not doing multiple inheritance now or ever, but maybe mixins will show up at some point.
  • Type hinting annotations
    • On module-level let declarations
    • In function signatures
    • On class attributes
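
The bytecode persistence item above points at Python's 'pyc' files as the model. A rough sketch of that idea in Python, using the standard marshal module, is shown here. It illustrates the "write once, read back with the same interpreter version" round trip only; Kuroko's actual format and API are not described in this ticket.

```python
import marshal

source = "def add(a, b):\n    return a + b\n"
code = compile(source, "<example>", "exec")

blob = marshal.dumps(code)      # bytes that could be written to a cache file
restored = marshal.loads(blob)  # only valid for the same interpreter version

namespace = {}
exec(restored, namespace)       # run the restored module-level code
result = namespace["add"](2, 3)
```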

Deferred to 1.2

  • Rewrite the compiler to build an AST so we can reasonably implement things like multiple assignment, real tuple syntax, etc.
  • Clean up cruft from Crafting Interpreters that no longer makes sense in modern Kuroko:
    • Merge "chunks" and "functions" as there is a 1:1 relationship between the two.
    • Call the result "code objects" like Python does? Done!
    • "Functions" vs "Closures" makes little sense in the C API. Partially done
    • Fix all those IS_TYPE / AS_TYPE macros to use the format from util.h with actual type names.
    • Eliminate object.c and move things into their obj_ sources (and all the common stuff into memory.c? which should probably be renamed gc.c?)

Cancelled

  • Write an ARCHITECTURE.md with a prose description of how everything meshes together. We have a website with extensive documentation; it describes the architecture of the compiler.
  • Make the compiler fully re-entrant. There were speed concerns with making the VM re-entrant as means of supporting threading, but this is not the case with the compiler, so we can definitely have a top-level state we chain down to everything... A complete compiler rewrite with an AST is planned for 1.2, so I'm not going to waste my time with this right now.
  • Consider whether basic value types should be nixed and moved to objects. Even Python has its integers and such as objects, and it would allow us to reduce stack references to just pointers, halving (or more) their size. Having a single path for type checking might also improve performance. Followup: This was tried and it improved some cases and destroyed some others. Not really sure if it's worthwhile unless we want everything to have fields, like in Python.

Python Compatibility TODO List

Kuroko is essentially an incomplete implementation of Python with some quirks. This issue is intended to track what Kuroko is missing or unintentionally does differently.

  • Tuples: Are these really any different from lists without mutation? Look into also implementing CPython's freelists for small tuples, which probably requires some GC integration. Syntax support should be fairly straightforward: Detect TOKEN_COMMA in a grouping after an expression and switch to tuple processing.
  • if statements in comprehensions: Implement as an if branch that pops and decrements the iterator counter if expression is False?
  • print as a function: OP_PRINT is just a loxism and there's really no reason this couldn't be moved to a native function, including support for the various keyword arguments (end is particularly useful). This can happen right now? (Even inclusive of the string coercion...)
  • is: We only have == and there are semantic differences between is and ==. Also is not.
  • Various special methods: __get__/__set__ should be renamed to their Python counterparts of __getitem__/__setitem__, we're missing all of the operator ones (__add__, __eq__, etc.)... Marked done: Some of these aren't the same as in Python, but we have good overall coverage for operators.
  • Class fields: These are just generally useful for defining fixed members or defaults without an init, and also make nicely namespaced enums and whatnot.
  • Standard Library: Throwing this very large task in a single checkbox, but there's a lot of ground to cover even just for the basics.
  • Line feeds in statements: Inside of parens or brackets, line feeds should be accepted.
  • pass statement: We accept empty blocks, maybe we shouldn't? Still should have pass either way.
  • with ... as ...: I think this is fairly straightforward? It's a scoping block that introduces a local from an expression. Slap a let on the line before and the typical Python use case makes sense of opening and reading a file into a variable and then closing the file (still need to support files going out of scope correctly closing/freeing, but that's a different issue). Marked done: VM still needs better support for early exceptions.
  • finally statement: Semantics for this are complicated.
  • Multiple arbitrary assignments: We can assign to multiple members with special syntax (foo.(a,b,c) = ...) and we can assign multiple variables in a let expression, but assignment to multiple arbitrary assignment targets is tricky without an AST.
  • async/await
    Coroutines in CPython are implemented in essentially the same way as generators, combined with an event loop that handles scheduling them. We have generators, we have real threads to use with to_thread, we should be able to build a compatible implementation of async/await and start pumping out coroutines.
  • asyncio module
  • yield from, yield expressions
  • Understand global/nonlocal in functions; global is probably a no-op, non-local may be useful as a static check that enforces resolution as an upvalue?
  • Recover the Python-compatible mode in the compiler.
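
The async/await item above rests on the idea that coroutines are generators plus a scheduler. A toy illustration of that idea in Python follows. The names worker and run_all are hypothetical, and this round-robin loop is a deliberately simplified stand-in for a real event loop (no I/O polling, no timers).

```python
from collections import deque

def worker(name, steps, log):
    # A generator used as a cooperative task.
    for i in range(steps):
        log.append((name, i))
        yield  # hand control back to the scheduler

def run_all(tasks):
    # Round-robin scheduler: resume each task in turn until all finish.
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished; drop it
        queue.append(task)

log = []
run_all([worker("a", 2, log), worker("b", 3, log)])
```

The log ends up interleaved, which shows that each yield acts as a suspension point in the same way an await would.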

`dict` argument compatibility with Python

compared to Python 3,

  • the dict built-in does not take arguments (and raises an error; fair enough).
let keys = ("key", "this")
let values = ("value", "that")
print(dict(zip(keys, values)))
# ArgumentError: __init__() takes no arguments (1 given)
  • the dict built-in silently ignores keyword arguments. this is my motivation for writing this issue. fixed!
let colors = dict(red=31, green=32, blue=34)
for name, code in colors.items():
    print(f"\033[{code}m{name}")
# no output!
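
For reference, this is what the two snippets above do in Python 3, which is the behaviour the issue is comparing against. The variable names are taken from the snippets; the lines list is added here only so the result can be inspected.

```python
keys = ("key", "this")
values = ("value", "that")
from_pairs = dict(zip(keys, values))  # dict accepts an iterable of pairs

colors = dict(red=31, green=32, blue=34)  # keyword arguments become keys
lines = [f"\033[{code}m{name}" for name, code in colors.items()]
```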

[scope] `for` iteration scope

>>> let ls = []
>>> for i in range(10):
  >     ls.append(lambda: i)
  >
>>> ls[0]()
 => 9
>>> ls[1]()
 => 9
>>> ls[2]()
 => 9
>>> print(i)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: Undefined variable 'i'.
>>>

In Python, a for loop binds its variable in the enclosing scope (the global scope when the loop runs at module level), and the variable is not cleaned up after iterating. After the loop finishes, every i inside the lambda bodies refers to the same value.

In Kuroko, i seems to be a local variable instead of a global one, but every i still gets the same value. It works just like Python, which is not a bad thing, but I was expecting every for loop to get its own scope / namespace / whatever. Is this a feature in Kuroko too?
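
The Python side of this comparison can be sketched as follows. The first loop shows the shared-variable behaviour described above; the second shows the usual Python idiom for capturing the value per iteration through a default argument. Whether the same idiom behaves identically in Kuroko is not confirmed here.

```python
late = []
for i in range(3):
    late.append(lambda: i)       # every lambda looks up the same variable i

bound = []
for i in range(3):
    bound.append(lambda i=i: i)  # the default captures the current value

late_values = [f() for f in late]
bound_values = [f() for f in bound]
```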

Clean up the compiler

  • Make everything reentrant.
    • Will help with compiling across multiple threads.
    • Compiler should suffer no performance impact from carrying around a context object.
  • Expose a better API than just the top-level krk_compile()
    • Can we expose subexpression compilation?
  • Can we fix error handling to get the best of both clear early SyntaxErrors and actually detecting multiple parse errors?
    • There's a lot of places where we may not be correctly breaking out of loops on errors.
    • We need a better system for collecting errors than raising a single exception or printing.
  • Merge the scanner into the compiler.
    • No solid reason for these to be separate sources, only done that way because of CI.
    • Might be able to eliminate the separation between parse rules and the big token enum?
    • Expose a better scanner API so we can build better REPL tools.

Evaluate Profile-Guided Optimization (PGO)

Hi!

Recently I did a lot of PGO benchmarks on different kinds of software - the results are here. I think the same optimization option could be useful to Kuroko too since PGO is already used for similar projects like CPython.

We need to evaluate the PGO benefits on Kuroko and, if it helps, document building Kuroko with PGO. It would be helpful for users and maintainers to see information about PGO effects (and other performance tuning techniques, if any) to improve Kuroko's performance.

As an additional step, after PGO I can suggest evaluating the LLVM BOLT optimizer, but only as a post-PGO optimization step. E.g. the rustc compiler is already optimized with PGO + BOLT on the Linux platform.

print won't print without an argument

print()  # prints newline in python, nothing in kuroko
print("")  # prints in both

# also...
print("hello", end=" ")
print(end="world")  # this only works in python
print()
print("hello", end=" ")
print("", end="world")  # this works in both

the case with print(end="blah") not printing anything had me especially confused when I was porting a script, as I have a (bad?) habit of using that form.

this is with Kuroko 1.2.3 (commit 650f324) and a tiny patch to build with MSYS2 (MSYSTEM=MSYS).
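
To pin down the Python side of the comparison, the sketch below captures what the argument-less forms write to standard output in Python 3. It is Python reference behaviour only; the buffer and redirect are there so the output can be checked rather than eyeballed.

```python
import io
from contextlib import redirect_stdout

buf = io.StringIO()
with redirect_stdout(buf):
    print()                  # writes just the newline
    print("hello", end=" ")
    print(end="world")       # no positional arguments, only end
    print()
output = buf.getvalue()
```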

dangling pointers to VM stack

this one was a toughie. at several points in src/vm.c, pointers to the VM's stack are temporarily stored, but krk_growStack can be called, invalidating those pointers. this probably went by unnoticed because either realloc would not move the pointer, or the old values persisted in memory (use-after-free).

let's look at krk_processComplexArguments. startOfExtras is a pointer to VM stack, then krk_tableSet is called in a loop. a problematic callstack might look like krk_processComplexArguments -> krk_tableSet -> krk_findEntry -> _krk_method_equivalence -> krk_push -> krk_growStack (copied from gdb, might be missing tailcalls). in my case, I kept running into this seemingly illogical error: (this example is not meant to be reproducible)

let subsets
subsets = dict(
    zero=0,
    all=1,
    scalar=2,
    antiscalar=3,
    point=4,
    line=5,
    plane=6,
    moment=7,
    direction=8,
    motor=9,
    rotation=10,
    translation=11,
    flector=12,
)
print(subsets)
Traceback (most recent call last):
  File "/home/notwa/play/issue.krk", line 16, in <module>
    )
TypeError: __init__() got multiple values for argument 'zero'

this is only one example, I haven't checked that the other instances of /&krk_currentThread.stack/ are safe.

Local time?

Kuroko's time module seems to have two methods only — sleep() and time(). Is there a way to get the local time from within kuroko?

Does Kuroko have an equivalent to Python's `round()` method?

[ Re: commit f1d7bda ]

I cannot seem to find a way to round a float to an int (or to a float with a given number of decimals), the way Python's round() does (and I ran into #47 when I tried to write a simple method that would round to an int).

I can see "round" referenced in the syn_py_types array in rline.c (line 910), but that seems to be related to Python, and not to Kuroko.

Am I missing something obvious?

If there is currently no way to round(number, ndigits=None) in Kuroko, please let this issue serve as an enhancement request.
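
For anyone implementing the request, these are the Python semantics of round() that the signature above refers to, shown as a short Python sketch. Note the tie-breaking rule, which is easy to get wrong in a hand-written replacement.

```python
whole = round(2.7)               # int result when ndigits is omitted
half_even = round(2.5)           # ties go to the nearest even integer
two_places = round(3.14159, 2)   # float result with two decimals
negative_digits = round(1234, -2)  # negative ndigits rounds to hundreds
```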

🙏

Kuroko does not read files with Windows \r\n newlines

I was trying out Kuroko 1.1.1 on Windows 10, and I noticed that with source files I wrote with multiple lines I would get the error:
kuroko: could not read file 'hello.krk': No error

Upon changing my newlines to UNIX (\n), Kuroko could read the file perfectly.

Thus it is safe to assume that kuroko.exe was failing to parse files that use Windows (\r\n) newlines.

Externalize core modules

Several modules in the core interpreter would be better implemented as loadable modules:

  • os has lots of bindings that bloat the size of the core interpreter and are not necessary or useful in a number of environments.
  • stat similarly is just part of the os module...
  • time is pretty simple
  • dis, while somewhat intertwined with the debugger, could still be separated to keep instruction bindings out of the core.
  • fileio could probably be split off
  • threading has some parts that the core depends on, but might still be feasible to separate

Could time.*time() silently truncate floats into ints?

Re: #43

Now that we have time.gmtime() and time.localtime(), could these methods silently truncate (well, preferably round) the floaty output from time.time() into an int?

time.localtime(time.time())  # looks much cleaner than:
time.localtime(int(time.time()))

i.e., like Python.
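
The Python behaviour being asked for can be checked with a short sketch. In CPython the fractional part of the timestamp is dropped rather than rounded, so for a current (positive) timestamp the two calls give the same result.

```python
import time

now = time.time()                     # float seconds since the epoch
from_float = time.localtime(now)      # accepted directly in Python
from_int = time.localtime(int(now))   # the explicit form from the issue
```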

Globals should be bound to `function`s, not `codeobject`s

Currently, we are attaching globals tables to codeobjects (KrkCodeObject), as a pointer to an instance object (KrkInstance*). Thus far, this has been fine, as functions (KrkClosure) are only ever built from codeobjects in places where the difference is not visible, but if we want to be able to instantiate functions from codeobjects as one can do in Python, then the ownership of the global context needs to move to the function.

Further, the use of a pointer to an instance is not the best approach here, as it does not allow us to meaningfully use a dict as the globals of a function - all of the current global reference code assumes we are using an instance's fields, but the value table in a dict is not its fields - it is a separate table. We could use a pointer to the relevant table, but this would pose challenges for garbage collection. More investigation should be done here.
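
The Python capability referred to above, instantiating functions from a code object with a chosen globals dict, looks like this. The sketch uses only Python's types.FunctionType; the names template, english, and japanese are illustrative, and nothing here describes Kuroko's C structures.

```python
import types

def template():
    return greeting  # resolved through the function's globals at call time

code = template.__code__

# One code object, two functions, each owning a different globals dict.
english = types.FunctionType(code, {"greeting": "hello"})
japanese = types.FunctionType(code, {"greeting": "konnichiwa"})
```

Because the globals belong to the function rather than the code object, the same code can run against different global contexts, which is the ownership model this issue proposes.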

math functions log1p and expm1

  1. In the kuroko documentation, log1p(x) is defined as log(x) + 1. At the same time, it is said to be the libc math function. This is inconsistent. I would like to point out that in libc, log1p(x) is ln(1+x), where ln is the natural logarithm, and log1p is used when x < 1. When x << 1, there are truncation/roundoff errors in adding a large number 1 to a small floating point number x, and to avoid these a separate function log1p has been provided in libc for small x. There is also a corresponding assembly instruction in x87 for this function. So if you are computing log1p as log(1+x) it is inaccurate, and if you are computing log1p as log(x) + 1 it is wrong.
  2. There is a corresponding function expm1(x) in libc, defined as e^x - 1, which is as necessary as log1p for the same reason for small x < 1. For x << 1, e^x is close to 1, so the difference e^x - 1 suffers truncation/roundoff errors as before. The expm1 function has been provided in libc to avoid these errors, and there is also an instruction to compute it in x87 assembly. I request that the libc function expm1(x) be added to kuroko.
  3. Having great fun with kuroko! Congrats!
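
The roundoff argument in points 1 and 2 can be demonstrated with Python's math module, which wraps the same libc functions. This is a sketch of the numerical effect only and says nothing about how kuroko currently computes these values.

```python
import math

x = 1e-16

# 1 + x rounds to exactly 1.0 in double precision, so the naive form
# loses all information about x.
naive_log = math.log(1 + x)
accurate_log = math.log1p(x)   # ln(1 + x) computed without forming 1 + x

# exp(x) is within rounding distance of 1.0, so subtracting 1 cancels
# most or all of the significant digits.
naive_exp = math.exp(x) - 1
accurate_exp = math.expm1(x)   # e^x - 1 computed without the cancellation
```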

missing gamma function when built against musl

more fun times with detecting the existence of this function…

this issue prevents the math module from being imported — but let it be known that, after omitting the gamma function, all of kuroko's tests pass with musl! I threw together a Dockerfile running Alpine Linux for testing purposes.

note that musl does implement lgamma (and tgamma), but not gamma. infamously, musl does not define any macros for detection. however, we can exploit a leaked #define from math.h that's existed since 2012.

#ifndef __NEED_double_t
MATH_ONE(gamma)
#endif

/* snip */

#ifndef __NEED_double_t
	KRK_DOC(bind(gamma),
		"@brief Calculates the gamma of the input.\n"
		"@arguments x");
#endif

I did a quick google search to see if any non-musl code is defining __NEED_double_t. it seems that all the results are either forks of musl, or LLVM defining __need_double_t (all lowercase), so I don't think there'll be conflicts with other libc's.

at least, that's my suggestion. I'll leave the final implementation up to you.
