
subleq-emu's People

Contributors

mintsuki, xtcshd


subleq-emu's Issues

KVM doesn't work

How do you expect me to convert 2D anime girls into 3D when the framerate is under 5 FPS!!!!!11!one! See, this is why Geri wants to burn down IBM and Intel HQ, because they made it so nothing is compatible. How are we going to live in a world where Waifu marriage is legal if we don't have a superfast DawnOS emulator and to do this we need KysVM!!!!!oen!1

but srsly, it gets to the point right before the screen clears, "Dawn Initializing" shows up, and then it crashes :^(

ISO

The README contains the following line:
https://github.com/mintsuki/subleq-emu/blob/38abb9063ab0e4c0fef87591e9e91e4dc9715315/README#L1

When I looked inside this archive:
https://github.com/mintsuki/subleq-emu/releases/download/bin-release10/subleq.img.xz
I found an .img file instead of an .iso file.

Renaming it to .iso did not allow me to boot it in VirtualBox.
Converting it to .vdi with the VBoxManage command succeeded, but I see an empty window when I launch the Disks application, so I'm not sure whether that conversion was the right way to solve this problem.

Please either mention in the README that it is an HDD image and not an ISO 9660 one, or convert your image to .iso format.
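One way to tell the two formats apart without relying on the file extension is to look for their signatures. The sketch below is only an illustration: `image_probe` and the enum names are made up for this example, and it has not been checked against the actual subleq.img, which may or may not carry an MBR-style boot signature. It relies on two well-known facts: ISO 9660 volume descriptors start at sector 16 (byte offset 32768) and hold "CD001" in bytes 1 to 5, and an MBR-style boot sector ends with 0x55 0xAA at offsets 510 and 511.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical names, used only for this illustration. */
enum image_kind { IMAGE_UNKNOWN, IMAGE_ISO9660, IMAGE_MBR_DISK };

/* Classify an image that has been read into memory. */
static enum image_kind image_probe(const unsigned char *data, size_t len)
{
    /* ISO 9660: first volume descriptor at offset 32768, "CD001" at +1. */
    if (len >= 32768 + 6 && memcmp(data + 32769, "CD001", 5) == 0)
        return IMAGE_ISO9660;

    /* Raw disk image with an MBR-style boot sector signature. */
    if (len >= 512 && data[510] == 0x55 && data[511] == 0xAA)
        return IMAGE_MBR_DISK;

    return IMAGE_UNKNOWN;
}
```

The ISO check comes first because hybrid ISO images can carry both signatures.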

Compilation instead of emulation

Given the abysmal performance of Dawn on any hardware, I propose a change in design:

Instead of emulating every single instruction, we could instead convert/compile the SUBLEQ bytecode into direct CPU machine code. The idea is to let primary code run directly on the hardware, on paged RAM, and handle I/O by means of page faults.
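For reference, "emulating every single instruction" amounts to something like the following step function. This is a minimal sketch, not the emulator's actual code: it assumes byte-addressed memory, 24-byte instructions made of three 8-byte operands (as the offsets in the examples further down suggest), and big-endian qwords (implied by the endianness remark at the end of this issue). The names `load_be64`, `store_be64` and `subleq_step` are invented here, and there is no bounds checking or I/O handling.

```c
#include <stdint.h>

/* Read a big-endian 64-bit value (assumed guest byte order). */
static uint64_t load_be64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Write a big-endian 64-bit value. */
static void store_be64(uint8_t *p, uint64_t v)
{
    for (int i = 7; i >= 0; i--) {
        p[i] = (uint8_t)(v & 0xff);
        v >>= 8;
    }
}

/* Execute the instruction at byte offset ip and return the next ip.
 * subleq A, B, C:  mem[B] -= mem[A]; if (mem[B] <= 0) goto C;        */
static uint64_t subleq_step(uint8_t *mem, uint64_t ip)
{
    uint64_t a = load_be64(mem + ip);
    uint64_t b = load_be64(mem + ip + 8);
    uint64_t c = load_be64(mem + ip + 16);

    /* Subtract as unsigned to avoid signed overflow, then reinterpret. */
    int64_t r = (int64_t)(load_be64(mem + b) - load_be64(mem + a));
    store_be64(mem + b, (uint64_t)r);

    return r <= 0 ? c : ip + 24;
}
```

Every guest instruction costs three operand fetches, two data loads, a store and a branch on the host, which is the overhead the proposal aims to remove.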

The conversion/compilation speed will depend on whether or not Dawn separates its code from its data, because if it doesn't we'd have to perform an entire analysis of the binary before compiling. Needless to say, this suggestion wouldn't work at all if Dawn engages in self-modifying-code shenanigans.

You may be asking though why go through all this trouble? Compilation by itself won't make it run much faster, since the frequent page faults would offset the gains of direct execution... Ahh, but now, the bytecode can be optimized! We can convert sequences of subleq into faster instructions, for example:

0 : subleq A, B, 24
24:

0 : subleq A, A, 24
24:

0 : subleq A, A, off

to

mov rsi, A
mov rdi, B
mov rax, [rsi]
sub [rdi], rax

mov rdi, A
xor rax, rax
mov [rdi], rax

mov rdi, A
xor rax, rax
mov [rdi], rax
mov rax, off
jmp rax

you know, instead of the generic:

mov rsi, A
mov rdi, B
mov rax, [rsi]
sub [rdi], rax
jg .skip
mov rax, C
jmp rax
.skip:

All addresses of course would have to be converted to the proper endianness during compilation. Or rather, not just addresses but every single qword.
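The three special cases above can be recognised from the operands alone. Below is a rough sketch of such a classifier, assuming 24-byte instructions and operands already converted to host byte order. The enum and the function name `subleq_classify` are hypothetical. It also assumes the instruction does not alias its own operands, which is exactly the self-modifying-code caveat mentioned earlier.

```c
#include <stdint.h>

/* Hypothetical categories matching the patterns in this issue. */
enum subleq_kind {
    SUBLEQ_GENERIC,     /* subtract, then conditional jump to C      */
    SUBLEQ_SUB,         /* C is the next instruction: plain subtract */
    SUBLEQ_CLEAR,       /* A == B and C is next: store zero          */
    SUBLEQ_CLEAR_JUMP   /* A == B: store zero, always jump to C      */
};

static enum subleq_kind subleq_classify(uint64_t ip, uint64_t a,
                                        uint64_t b, uint64_t c)
{
    /* Branch target equals fall-through, so the branch is a no-op. */
    int falls_through = (c == ip + 24);

    /* mem[A] - mem[A] is always 0, so the <= 0 branch is always taken. */
    if (a == b)
        return falls_through ? SUBLEQ_CLEAR : SUBLEQ_CLEAR_JUMP;

    return falls_through ? SUBLEQ_SUB : SUBLEQ_GENERIC;
}
```

A code generator could switch on the result and emit the shorter sequences shown above, falling back to the generic form otherwise.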

2GiB of RAM

Fam,
SubleqEmu crashes with 2GiB of RAM, it needs moar than 2GiB of RAM

Fundamental issues with inline assembly usage

Graphics.c is only an example, but there is a problematic design pattern in your extended inline assembly usage. Part of it is your belief that an input constraint is automatically also a clobber. That is false. If a constraint is input/output (+) or output-only (=), the compiler assumes the operand can be clobbered. That is not the case for input-only constraints unless they are paired with a corresponding output constraint.

In May, when you informed me of how clobbers allegedly worked (I knew better), I pulled up this project and within a minute found a piece of inline assembly that breaks many of GCC's inline assembly usage rules. The issues were very obvious, so I decided back then to post my concerns about your understanding of clobbers along with an analysis of one function in particular, swap_vbufs, with all its flaws. A third party who is also well versed in inline assembly confirmed all these observations in an answer on Stackoverflow.

As well, when passing addresses of data structures and/or arrays through registers, you have to inform the compiler that the asm accesses data indirectly. For example, when an address is passed via a register, the compiler may not be aware that what the register points at is being read and/or written. The easiest brute force mechanism in the GCC documentation is the "memory" clobber; alternatively, you can create dummy memory constraints.
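As a small illustration of a dummy memory constraint on the read side, here is a sketch of a load through a pointer held in a register. The function name `load_u32` is made up for this example. The template uses GCC's default AT&T dialect (so it will not assemble under -masm=intel), and a plain C fallback is included so the sketch still builds on non-x86 targets.

```c
#include <stdint.h>

static inline uint32_t load_u32(const uint32_t *p)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    uint32_t v;
    /* "r"(p) hands over only the address. The extra "m"(*p) input is
     * never referenced in the template; it exists to tell the compiler
     * that the asm reads the object p points to, so pending stores to
     * *p are completed first. No "memory" clobber is needed. */
    __asm__ ("movl (%1), %0"
             : "=r"(v)
             : "r"(p), "m"(*p));
    return v;
#else
    return *p;  /* portable fallback for other targets */
#endif
}
```

Without the dummy operand (and without a "memory" clobber), the compiler would be free to assume the asm never looks at `*p`.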

If you would like to see all the issues with the inline assembly in swap_vbufs, then this Stackoverflow question and answer may be of some benefit.


Regarding memset (using inline assembly), where you don't inform the compiler (assume 64-bit code) that the RDI and RCX registers are clobbered: I have godbolt output of this code:

#include<stddef.h>
#include<stdio.h>
#include<string.h>

/* Compiler isn't aware that RDI or RCX are clobbered */
void *badmemset1(void *dest, int value, size_t count)
{
    asm volatile  ("cld; rep stosb" :: "D"(dest), "c"(count), "a"(value) : "cc", "memory");
    return dest;
}

/* Compiler isn't aware that RCX is clobbered, but RDI is clobbered */
void *badmemset2(void *dest, int value, size_t count)
{
    void *temp = dest;
    asm volatile  ("cld; rep stosb" : "+D"(dest) : "c"(count), "a"(value) : "cc", "memory");
    return temp;
}

/* Compiler is aware that RDI and RCX are clobbered */
void *goodmemset(void *dest, int value, size_t count)
{
    void *temp = dest;
    asm volatile  ("cld; rep stosb" : "+c"(count), "+D"(dest) : "a"(value) : "cc", "memory");
    return temp;
}

int main()
{
    char bufa[]="0123456789";
    char bufb[]="0123456789";
    char bufc[]="0123456789";

    badmemset1(bufa, 'a', 10);
    badmemset1(bufa, 'b', 10);
    /* This should print out 10 'b' */
    printf("bufa: %10s\n", bufa);

    badmemset2(bufb, 'a', 10);
    badmemset2(bufb, 'b', 10);
    /* This should print out 10 'b' */
    printf("bufb: %10s\n", bufb);

    goodmemset(bufc, 'a', 10);
    goodmemset(bufc, 'b', 10);
    /* This should print out 10 'b' */
    printf("bufc: %10s\n", bufc);
}

#if 0
/* This version is less bruteforce and doesn't require volatile, and drops the unneeded CLD
 * the OP had, as the calling convention requires that DF=0 upon entry to a function */
void *goodmemset(void *dest, int value, size_t count)
{
    void *temp = dest;
    asm ("rep stosb" 
         : "+c"(count), "+D"(dest), "=m"(*(char (*)[count])dest) 
         : "a"(value));
    return temp;
}
#endif

Very simple code: memset a buffer with a and then immediately memset the same buffer with b. There are three variants of the code, and each should print bbbbbbbbbb. With -O0 it works as expected, as the output is:

bufa: bbbbbbbbbb          
bufb: bbbbbbbbbb 
bufc: bbbbbbbbbb

At -O3 it doesn't work. Versions of GCC on godbolt produce output that may often look like:

bufa:           
bufb: aaaaaaaaaa
bufc: bbbbbbbbbb

The behaviour when GCC's documented inline assembly rules are violated may vary from compiler to compiler, so the output of any one compiler may not match the results above.

badmemset1 is a version where the compiler is unaware RDI and RCX are potentially clobbered; badmemset2 is a version where the compiler is told RDI is clobbered and RCX isn't; and goodmemset is a version where the compiler is properly told that RCX and RDI are both potentially clobbered. Only the last version produces the expected results.

This is all contrary to your assertion on Discord in May that:

"c"(count) already tells the compiler c is clobbered


Contrary to your claim that I said

claims rep ** shit in x86 is good

I have never made such claims. Someone posted code using stosb in memset and I tried to explain the clobber issues, which you refuse to understand. My discussion wasn't about the merits/pitfalls of using string instructions, but about why the inline assembly was incorrect. I spent a great deal of time in your absence explaining to iiSaLMaN why his code was wrong (despite your incorrect claims), and got him an implementation that works with higher optimization levels and with code being inlined. In fact I'd be more inclined to code memset in C, as it doesn't necessarily have to be done in assembly.

You also published the following bios_print function so that iiSaLMaN could learn from your programming prowess. I'll give you an A for effort but an F on implementation. It suffers from the same issues as the other code you believe is correct. If you compile with -O0 there is a reasonable chance the code will work, because a compiler like GCC/CLANG will usually generate loads/stores around every C statement. The problem with clobbers may be hidden, but it becomes a potential source of bugs at higher optimization levels, especially when the compiler decides to inline functions. This was the bios_print you touted as an example:
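For comparison, a memset written in plain C needs no constraints or clobbers at all. This is only a minimal byte-at-a-time sketch; the name `c_memset` is chosen here to avoid clashing with the library function.

```c
#include <stddef.h>

/* Fill count bytes at dest with (unsigned char)value; returns dest. */
void *c_memset(void *dest, int value, size_t count)
{
    unsigned char *p = dest;

    while (count--)
        *p++ = (unsigned char)value;

    return dest;
}
```

One caveat for freestanding builds: at higher optimization levels GCC may recognise such a loop and replace it with a call to memset, and -fno-tree-loop-distribute-patterns is the usual way to prevent that.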

void bios_print(const char *str) {
    asm (
        "1:\n\t"
        "lodsb\n\t"
        "test al, al\n\t"
        "jz 2f\n\t"
        "int 0x10\n\t"
        "jmp 1b\n\t"
        "2:\n\t"
        :
        : "a"(0x0e00), "S"(str)
        : "cc", "memory"
    );
}

A simple way to break this code with -O3 is to call the function bios_print twice in a row with the same string:

void bios_print(const char *str) {
    asm (
        "1:\n\t"
        "lodsb\n\t"
        "test al, al\n\t"
        "jz 2f\n\t"
        "int 0x10\n\t"
        "jmp 1b\n\t"
        "2:\n\t"
        :
        : "a"(0x0e00), "S"(str)
        : "cc", "memory"
    );
}

void test_print(void)
{
    const char *mystring = "Hello, world!";

    bios_print (mystring);
    bios_print (mystring);
}

You can observe the incorrect behaviour in the generated assembly in the second godbolt pane:

.LC0:
        .string "Hello, world!"
test_print:
        push    esi
        mov     eax, 3584
        mov     esi, OFFSET FLAT:.LC0
        1:
        lodsb
        test al, al
        jz 2f
        int 0x10
        jmp 1b
        2:
        ; At this point ESI is pointing one past the end NUL byte of the string.
        ; The following loop will start where ESI left off printing something unexpected
        ; or nothing rather than printing the same string again.
        1:
        lodsb
        test al, al
        jz 2f
        int 0x10
        jmp 1b
        2:

        pop     esi
        ret

This occurred because "S"(str) doesn't mark ESI as a clobber. This is an input only constraint where the compiler expects the value not to change. The compiler assumed that the beginning of the string that was originally in ESI was still there and it reused it. Unfortunately you changed it without telling the compiler. This is documented in the GCC manual with a warning:

Warning: Do not modify the contents of input-only operands (except for inputs tied to outputs). The compiler assumes that on exit from the asm statement these operands contain the same values as they had before executing the statement.

What happens if the code is changed to mark ESI ("S") as being clobbered by specifying it as an input/output constraint using "+S"? The code could look like:

void bios_print(const char *str) {
    asm volatile (
        "1:\n\t"
        "lodsb\n\t"
        "test al, al\n\t"
        "jz 2f\n\t"
        "int 0x10\n\t"
        "jmp 1b\n\t"
        "2:\n\t"
        : "+S"(str)
        : "a"(0x0e00)
        : "cc", "memory"
    );
}

The third pane of the godbolt example shows how the output differs: ESI is now reloaded with the address of the string before the second loop:

.LC0:
        .string "Hello, world!"
test_print:
        mov     edx, OFFSET FLAT:.LC0      ; EDX is used to store a copy of the string address
        push    esi
        mov     eax, 3584
        mov     esi, edx
        1:
        lodsb
        test al, al
        jz 2f
        int 0x10
        jmp 1b
        2:

        mov     esi, edx                   ; ESI is reloaded with copy of address in EDX 
        1:
        lodsb
        test al, al
        jz 2f
        int 0x10
        jmp 1b
        2:

        pop     esi
        ret

I would have avoided having to specify the "memory" clobber by using a dummy memory constraint ("m") to inform the compiler that we are reading an unspecified number of elements from the character string. I would have also set BH to 0 (assuming output to page 0, of course) so that the page is well defined. I would have also coded the loop so that there is only one unconditional jump, taken once on entry, with the result of test al, al driving the conditional branch back to the top. This avoids an unnecessary unconditional branch on every iteration. The code could have looked like:

void bios_print(const char *str) {
    asm volatile (
        "jmp 2f\n\t"
        "1:\n\t"
        "int 0x10\n\t"
        "2:\n\t"
        "lodsb\n\t"
        "test al, al\n\t"
        "jnz 1b\n\t"
        : "+S"(str)
        : "a"(0x0e00), "b"(0x0000), "m"(*(const char (*)[])str) 
        : "cc"
    );
}

Of course the lazy way would be to have the compiler do the looping. This will likely produce a longer encoding, which is probably why you did the loop inside inline assembly. That code would have simply looked like:

#include <stdint.h>

void bios_print(const char *str) {
    while (*str) {
        asm (
            "int 0x10\n\t"
            :
            /* Cast through unsigned char so a negative char cannot sign-extend into AH */
            : "a"((uint16_t)(0x0e00 | (unsigned char)*str++)), "b"(0x0000)
        );
    }
}

Note: It probably goes without saying that it is better to reimplement the BIOS tty output, scrolling, etc. in C and dispense with calling the BIOS functions if you are writing a large quantity of real mode code in GCC.
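To give an idea of what such a C reimplementation might involve, here is a rough sketch of a text console with newline handling and scrolling. Everything in it is hypothetical: the `struct tty` layout and the function names are invented for this example, and the cell format (attribute in the high byte, character in the low byte) is modelled on VGA text mode. In a real-mode build the `cells` pointer would have to reach the VGA text buffer at physical address 0xB8000, which needs segment handling that is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical console state; all names are illustrative. */
struct tty {
    uint16_t *cells;      /* rows * cols cells: attribute << 8 | char */
    unsigned cols, rows;
    unsigned x, y;        /* cursor position */
    uint8_t attr;         /* e.g. 0x07 = light grey on black */
};

/* Move every row up by one and blank the last row. */
static void tty_scroll(struct tty *t)
{
    size_t n = (size_t)t->cols * (t->rows - 1);

    for (size_t i = 0; i < n; i++)
        t->cells[i] = t->cells[i + t->cols];
    for (size_t i = n; i < n + t->cols; i++)
        t->cells[i] = (uint16_t)(t->attr << 8 | ' ');
    t->y = t->rows - 1;
}

static void tty_putc(struct tty *t, char ch)
{
    if (ch == '\n') {
        t->x = 0;
        t->y++;
    } else {
        t->cells[t->y * t->cols + t->x] =
            (uint16_t)(t->attr << 8 | (unsigned char)ch);
        if (++t->x == t->cols) {   /* wrap at the right edge */
            t->x = 0;
            t->y++;
        }
    }
    if (t->y == t->rows)
        tty_scroll(t);
}

static void tty_print(struct tty *t, const char *s)
{
    while (*s)
        tty_putc(t, *s++);
}
```

Because the console only touches `cells`, it can be exercised against an ordinary array on a host machine before being pointed at video memory.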

geri geri geri geri geri geri geri geri

It should display Geri's anime waifu when starting up. Attached is an image of what is believed to be the waifu that is supposed to be the inspiration for dawnos.

[attached image]
