pantheon's People

Contributors

bschnepp

pantheon's Issues

[BUG] - Multi-core scheduling encounters data races

Issue Checklist

  • A related or similar issue is not already marked as open
  • The steps to reproduce have been tested, and do produce the issue described
  • If relevant, graphical issues have a screenshot presented as well. Text-only issues have the text and its correct version listed within a Markdown code block section
  • The most recent commit on the master branch the bug is present in, with its commit hash, is listed in this report

=====================================================
Bug Description
A side-effect of switching processes seems to encounter some problems with memory accesses: either the TLB isn't completely purged as expected, or it is possible to enter a race condition where a page table switches but hasn't been updated for the current context.

To Reproduce
Please list the steps to produce the bug below:

  1. Re-enable multi-core scheduling
  2. Attempt to run the OS
  3. A crash involving page table entries quickly occurs

Screenshots
If relevant, please provide screenshots here.

Expected behavior
Memory must always be in a consistent state

Additional information
Any additional information should be placed here.

Initialize other cores with PSCI

One of the primary tasks that needs to be done is to get the other cores to boot. Naturally, it's rather unlikely to find an application-class aarch64 SoC/SoM out there that's still a uniprocessor in 2021, so at least SMP support is nice, and we can worry about asymmetric multiprocessing later, with all the combined power-saving/performance core designs that exist out there.

Something to do right this time, unlike in Feral, is to bulldoze straight to running a process, and worry about architectural correctness (memory mapping and process isolation and all that stuff) later. We can even go so far as to share kernel and userspace page tables for now. Feral's mistake was focusing way too much on correctness the first time, which led to some incoherent architectural nonsense that still hasn't been entirely resolved 3 years later. Let's do that right this time.
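
For the PSCI step itself, a minimal sketch might look like the following. It assumes the SMC64 calling convention, where CPU_ON has function ID 0xC4000003 and a return value of 0 means success. The names `PsciConduit` and `BootSecondaryCore` are hypothetical, and the conduit is passed in as a function pointer so the logic can be exercised on the host; on real hardware it would wrap an `smc` or `hvc` instruction.

```cpp
#include <cassert>
#include <cstdint>

// PSCI CPU_ON function ID for the SMC64 calling convention.
constexpr std::uint32_t kPsciCpuOn64 = 0xC4000003u;
// PSCI return code for success.
constexpr std::int64_t kPsciSuccess = 0;

// Hypothetical conduit: on hardware this would issue smc/hvc with the
// function ID in x0 and the arguments in x1..x3, returning x0.
using PsciConduit = std::int64_t (*)(std::uint64_t FunctionId,
                                     std::uint64_t Arg1,
                                     std::uint64_t Arg2,
                                     std::uint64_t Arg3);

// Ask firmware to power on the core identified by TargetMpidr and start it
// at EntryPoint, with ContextId handed to the new core in x0.
inline bool BootSecondaryCore(PsciConduit Call,
                              std::uint64_t TargetMpidr,
                              std::uint64_t EntryPoint,
                              std::uint64_t ContextId)
{
    return Call(kPsciCpuOn64, TargetMpidr, EntryPoint, ContextId) == kPsciSuccess;
}
```

Keeping the conduit behind a function pointer also leaves room to pick `smc` or `hvc` at boot, depending on what the platform reports.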

[FEATURE] Feature request - Port to Tinkerboard S

Issue Checklist

  • A related or similar issue is not already marked as open
  • Another issue describing a similar feature has not already been marked as wontfix or closed
  • This feature is not already present in the software

=====================================================
Feature Description
Support should be added to this board by modifying the relevant bootloader code and adapting necessary components to allow pantheon to boot.

Feature Benefits
List the reasons why this feature would be beneficial

  • A number of similar boards exist
  • The RK3328 is widely available to develop on: many boards are in stock and available
  • Support under Linux appears stable
  • Similar overall to the RK3399, of which debugging over SWD is well documented

Use case examples
List examples where this feature could be useful for end users.

  • Availability on real hardware
  • Well-known and supported GPU
  • Something that's physical and can be actually connected to something

Additional information
This is something to revisit in a few months: just an idea to look through later.

[FEATURE] Feature request - Change all handles to be global

Issue Checklist

  • A related or similar issue is not already marked as open
  • Another issue describing a similar feature has not already been marked as wontfix or closed
  • This feature is not already present in the software

=====================================================
Feature Description
The kernel should use global handles instead of the current per-process handle scheme. This decouples the process block from handles, and greatly eases the ability of passing ownership of a kernel object to another process.
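
As a rough illustration of why a global scheme eases ownership transfer, here is a host-side sketch. The names `GlobalHandleTable`, `HandleEntry`, `Create`, `Transfer`, and `Resolve` are hypothetical, not taken from pantheon, and `std::unordered_map` stands in for whatever container the kernel would actually use.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

using Handle = std::uint64_t;
using ProcessID = std::uint32_t;

// Hypothetical entry: the kernel object plus the process that currently owns it.
struct HandleEntry
{
    void *Object;
    ProcessID Owner;
};

class GlobalHandleTable
{
public:
    // Register an object and return a handle that is valid system-wide.
    Handle Create(void *Object, ProcessID Owner)
    {
        const Handle H = NextHandle++;
        Entries[H] = HandleEntry{Object, Owner};
        return H;
    }

    // Passing ownership only rewrites the owner field; the handle value
    // stays the same, so no per-process table has to be updated.
    bool Transfer(Handle H, ProcessID From, ProcessID To)
    {
        auto It = Entries.find(H);
        if (It == Entries.end() || It->second.Owner != From)
        {
            return false;
        }
        It->second.Owner = To;
        return true;
    }

    // Only the current owner may resolve the handle to its object.
    void *Resolve(Handle H, ProcessID Caller) const
    {
        auto It = Entries.find(H);
        if (It == Entries.end() || It->second.Owner != Caller)
        {
            return nullptr;
        }
        return It->second.Object;
    }

private:
    std::unordered_map<Handle, HandleEntry> Entries;
    Handle NextHandle = 1;
};
```

In this sketch, access is regulated purely by ownership of the handle, which matches the port use case described below.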

Feature Benefits
List the reasons why this feature would be beneficial

  • Eases implementation of ports
  • Eases process management and handle lifecycle

Use case examples
List examples where this feature could be useful for end users.

  • A service designates itself as such by communicating with a server for advertisement of it, and with a permissions server for access permissions of the service.
  • It then delegates a handle to the advertisement server, and sends an RPC request with permissions to the permissions server
  • Then, sessions can be used from the port by clients accessing them. All permission checks remain in userspace, and port access is only regulated by ownership of the handle

Additional information
Any additional information should be placed here.

[FEATURE] Feature request - GICv3 support

Issue Checklist

  • A related or similar issue is not already marked as open
  • Another issue describing a similar feature has not already been marked as wontfix or closed
  • This feature is not already present in the software

=====================================================
Feature Description
A driver should be implemented which supports the GICv3 interrupt controller, for its added benefits to
supported hardware, and to lift the very low restrictions on core counts. It is also important to note
that a GICv3 implementation may not be backwards compatible with the GICv2, requiring the creation
of this driver for hardware support.

Feature Benefits
List the reasons why this feature would be beneficial

  • GICv3 is more common in newer hardware platforms, ie, i.MX8, RK33xx.
  • GICv3 lifts restrictions from 8 CPUs to 512 CPUs
  • GICv3 exposes its CPU interface through system registers rather than MMIO (the distributor and redistributors remain memory-mapped)

Use case examples
List examples where this feature could be useful for end users.

  • Allows for initial support for a real hardware port to RK33xx series SoC
  • Allows for initial support for a real hardware port to i.MX8 series SoC
  • Version under QEMU simulation can be changed to use GICv3 for testing

Additional information
Any additional information should be placed here.

Create driver init graph

A graph of lambdas should be used to handle startup initialization, so "board-specific" drivers that are currently in use, but which don't actually have to be tied to the device or board directly, can be separated, and possibly even have unit test cases attached to them. The primary and immediate use case for this is to properly detect the timer on the board, instead of simply assuming it's the generic system timer.

This should be achieved with a graph of lambdas, and done in 3 passes: one for driver initialization to set the device up at the MMIO address needed, another pass for the actual driver startup routine, and a final pass for any cleanup the driver needs to do. Each of these routines should be reentrant, and may be called more than once.

These functions should be assembled based on the device tree when startup happens, and then executed in a depth-first order.
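
The three passes and the depth-first order described above could be sketched as follows. This is a host-side illustration only: `DriverNode`, `Pass`, `RunPass`, and `RunAllPasses` are hypothetical names, `std::function` and `std::vector` stand in for whatever the kernel provides, and the graph is assumed to be a tree (no cycles), as a device tree would be.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical node: one driver with one lambda per pass.
struct DriverNode
{
    std::string Name;
    std::function<void()> Init;        // pass 1: set the device up at its MMIO address
    std::function<void()> Start;       // pass 2: the actual driver startup routine
    std::function<void()> Cleanup;     // pass 3: any cleanup the driver needs
    std::vector<std::size_t> Children; // indices of child nodes in the graph
};

enum class Pass { Init, Start, Cleanup };

// Run one pass over the subtree at Root: the node first, then each child,
// depth-first. Empty lambdas are skipped.
inline void RunPass(const std::vector<DriverNode> &Graph, std::size_t Root, Pass Which)
{
    const DriverNode &Node = Graph[Root];
    const std::function<void()> &Fn =
        (Which == Pass::Init) ? Node.Init :
        (Which == Pass::Start) ? Node.Start : Node.Cleanup;
    if (Fn)
    {
        Fn();
    }
    for (std::size_t Child : Node.Children)
    {
        RunPass(Graph, Child, Which);
    }
}

// Each pass finishes for the whole graph before the next one begins.
inline void RunAllPasses(const std::vector<DriverNode> &Graph, std::size_t Root)
{
    RunPass(Graph, Root, Pass::Init);
    RunPass(Graph, Root, Pass::Start);
    RunPass(Graph, Root, Pass::Cleanup);
}
```

Because each node is just data plus lambdas, a unit test can build a small graph with logging lambdas and check the call order without touching hardware.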

[FEATURE] Feature request - Higher half mapping for MMIO

Issue Checklist

  • A related or similar issue is not already marked as open
  • Another issue describing a similar feature has not already been marked as wontfix or closed
  • This feature is not already present in the software

=====================================================
Feature Description
The kernel should isolate device MMIO from ordinary memory, and have it virtually mapped in the higher half area.
This will (eventually) allow remapping of the whole kernel there, and begin properly using the MMU.

Feature Benefits
List the reasons why this feature would be beneficial

  • Process isolation
  • Performance

Use case examples
List examples where this feature could be useful for end users.

  • Programs are more strongly separated

Additional information
Any additional information should be placed here.

[BUG] - Resources are not properly refcounted

Issue Checklist

  • A related or similar issue is not already marked as open
  • The steps to reproduce have been tested, and do produce the issue described
  • If relevant, graphical issues have a screenshot presented as well. Text-only issues have the text and its correct version listed within a Markdown code block section
  • The most recent commit on the master branch the bug is present in, with its commit hash, is listed in this report

=====================================================
Bug Description
A process can quickly exhaust all available resources on a handle, effectively doing a denial of service on any other user of that resource backed by that handle.

To Reproduce
Please list the steps to produce the bug below:

  1. Write an initial process which opens a named port
  2. Constantly try to re-open it
  3. Eventually, resources on it will be exhausted

Screenshots
If relevant, please provide screenshots here.

Expected behavior
A handle can't be overloaded

Additional information
Any additional information should be placed here.

On-device testing support

Support should be added to check for architecture- or board-specific code correctness with unit tests. This can be most easily achieved by running the majority of tests on the host, then connecting a development board to a Kubernetes cluster which runs CI build jobs. Said cluster would share the development board, and when it is free, upload the new firmware, then use the serial port to log the test status for all board-specific tests.

This would aid in covering the remaining, currently untestable, parts of the kernel.

[FEATURE] Feature Request - Rewrite scheduler for O(1) lookup

Issue Checklist

  • A related or similar issue is not already marked as open
  • Another issue describing a similar feature has not already been marked as wontfix or closed
  • This feature is not already present in the software

=====================================================
Feature Description
The scheduler should be rewritten to allow for O(1) lookup of process structures.
This can be achieved in two ways: by using a HashMap and giving each per-core scheduler a PID to execute,
or by using the GlobalScheduler to pass along pointers to processes (or nullptr) as needed.
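
A minimal sketch of the first option, assuming a hash map keyed by PID. `ProcessTable` and the fields of `Process` are hypothetical, and `std::unordered_map` (average-case O(1) lookup) stands in for the kernel's own HashMap.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical process structure, reduced to what the sketch needs.
struct Process
{
    std::uint32_t PID;
    int State;
};

class ProcessTable
{
public:
    // Returns false if a process with the same PID already exists.
    bool Insert(const Process &Proc)
    {
        return Table.emplace(Proc.PID, Proc).second;
    }

    // Average-case O(1): hash the PID instead of walking a list of processes.
    Process *Lookup(std::uint32_t PID)
    {
        auto It = Table.find(PID);
        return (It == Table.end()) ? nullptr : &It->second;
    }

    bool Remove(std::uint32_t PID)
    {
        return Table.erase(PID) == 1;
    }

private:
    std::unordered_map<std::uint32_t, Process> Table;
};
```

A per-core scheduler would then hold only a PID and resolve it through `Lookup` when it needs the process structure. Any locking around the table is left out of this sketch.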

Feature Benefits
List the reasons why this feature would be beneficial

  • Less overhead in scheduling from IRQ interrupts
  • Avoid data race conditions in existing O(n) implementation
  • Enforce consistent state across processes

Use case examples
List examples where this feature could be useful for end users.

  • Many processes are scheduled, ie, a User Accounts daemon, a networking daemon, a filesystem daemon, etc. until lookup takes longer than the interval of the IRQ interrupt
  • This causes the system to feel less responsive, and as such causes issues with timing with RPC requests or other IPC calls
  • This may also introduce system instability if, for example, a process implements a socket timeout feature and the network daemon never gets a chance to run

Additional information
This feature should be considered as high priority.

[BUG] - Processes may issue SVCLogText to access kernel memory

Issue Checklist

  • A related or similar issue is not already marked as open
  • The steps to reproduce have been tested, and do produce the issue described
  • If relevant, graphical issues have a screenshot presented as well. Text-only issues have the text and its correct version listed within a Markdown code block section
  • The most recent commit on the master branch the bug is present in, with its commit hash, is listed in this report

=====================================================
Bug Description
A running process may access kernel memory by issuing a pointer to higher-half memory as an argument to SVCLogText.
As the system call does not sanitize any pointers at this time, this means that an attacker is capable of using this to dump the contents of the kernel (and possibly all of system memory) through a serial port, if listening to it.

To Reproduce
Please list the steps to produce the bug below:

  1. Create a user process
  2. Prepare a pointer to higher half memory
  3. Issue SVCLogText
  4. Memory will be accessed by the kernel and data sent through the serial port.

Screenshots
If relevant, please provide screenshots here.

Expected behavior
The program should either crash, or the kernel should refuse to issue the text.

Additional information
This should be considered as a serious security bug. While not actually exploitable in practice (no physical system can run pantheon, nor can any arbitrary user programs be started: the only processes running are those put there initially.), this should be fixed quickly as when proper program loading occurs, it may be possible to exploit.
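
One possible first line of defence is a range check on the pointer before the kernel dereferences it. The sketch below assumes a 48-bit lower-half user address space; `kUserAddressLimit` and `IsUserRange` are hypothetical names, and the real split depends on how the kernel configures its translation tables.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: user addresses occupy [0, 2^48); everything above is kernel space.
constexpr std::uint64_t kUserAddressLimit = 0x0001000000000000ULL;

// True only if the whole range [Addr, Addr + Length) lies below the limit.
// The comparison is arranged so that Addr + Length is never computed,
// which avoids wrap-around on very large lengths.
constexpr bool IsUserRange(std::uint64_t Addr, std::uint64_t Length)
{
    return Addr < kUserAddressLimit && Length <= kUserAddressLimit - Addr;
}
```

A check like this only rules out higher-half pointers. The range would still need to be mapped and readable in the calling process before the text is logged.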

[FEATURE] Feature request - Implement style guidelines

Issue Checklist

  • A related or similar issue is not already marked as open
  • Another issue describing a similar feature has not already been marked as wontfix or closed
  • This feature is not already present in the software

=====================================================
Feature Description
A formal style guidelines document should be written, which will ensure consistency throughout the codebase.

Feature Benefits
List the reasons why this feature would be beneficial

  • Code remains consistent throughout the project
  • Naming conventions are always unified
  • Consistent agreement on variable, class, and structure names

Use case examples
List examples where this feature could be useful for end users.

  • None directly

Additional information
Any additional information should be placed here.

[FEATURE] Feature request - Implement usermode switch

Issue Checklist

  • A related or similar issue is not already marked as open
  • Another issue describing a similar feature has not already been marked as wontfix or closed
  • This feature is not already present in the software

=====================================================
Feature Description
The ability to either create or drop permissions of the current thread to userland should be implemented. This needs to be done to ensure system level daemons are kept in userspace, and eventually isolated from the kernel's runtime memory.

Feature Benefits
List the reasons why this feature would be beneficial

  • Allows implementation of real userspace programs
  • Necessary to implement device drivers
  • Necessary to test system calls

Use case examples
List examples where this feature could be useful for end users.

  • This is necessary to run any programs at all!
  • Developers writing code for end users need this to be able to write programs

Additional information
Any additional information should be placed here.

Check pantheon::String correctness for right-to-left languages

Correct parsing of length and other properties for these languages is currently not done.

Specifically, some test cases need to be added to check that:
- The length of the right-to-left (or left-to-right) marker character is ignored for CharLength(), but counted for DataLength()
- That a single character is always counted as a single character, even if that character is modified by a subsequent letter.
- That the returned value of a given character is the byte at the location, per the specification for operator[].
- Mixing these with Latin script should work as intended.
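
The first and last checks above could be prototyped on the host along these lines. The free functions `CharLength` and `DataLength` here are hypothetical stand-ins operating on `std::string`, not the real `pantheon::String` members. The sketch assumes valid UTF-8 and counts code points, so it does not cover the combining-character case in the second bullet.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// True if the three bytes at index i encode U+200E (left-to-right mark,
// E2 80 8E) or U+200F (right-to-left mark, E2 80 8F).
inline bool IsDirectionMark(const std::string &S, std::size_t i)
{
    if (i + 2 >= S.size())
    {
        return false;
    }
    const unsigned char B0 = static_cast<unsigned char>(S[i]);
    const unsigned char B1 = static_cast<unsigned char>(S[i + 1]);
    const unsigned char B2 = static_cast<unsigned char>(S[i + 2]);
    return B0 == 0xE2 && B1 == 0x80 && (B2 == 0x8E || B2 == 0x8F);
}

// Number of bytes, markers included.
inline std::size_t DataLength(const std::string &S)
{
    return S.size();
}

// Number of code points, with direction markers ignored.
inline std::size_t CharLength(const std::string &S)
{
    std::size_t Count = 0;
    for (std::size_t i = 0; i < S.size(); ++i)
    {
        const unsigned char B = static_cast<unsigned char>(S[i]);
        if ((B & 0xC0) == 0x80)
        {
            continue; // continuation byte of a code point already handled
        }
        if (IsDirectionMark(S, i))
        {
            continue; // marker: not counted as a character
        }
        ++Count;
    }
    return Count;
}
```

For a right-to-left mark followed by four two-byte Hebrew letters, this gives a character length of 4 and a data length of 11.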

[FEATURE] Feature request - Page Coloring

Issue Checklist

  • A related or similar issue is not already marked as open
  • Another issue describing a similar feature has not already been marked as wontfix or closed
  • This feature is not already present in the software

=====================================================
Feature Description
The virtual memory manager should never allocate runs of contiguous pages in virtual memory (longer than the number of cache sets) which contend for the same cache line.
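
As a sketch of the usual colour computation, assuming a physically indexed, set-associative cache: the number of colours is the cache size divided by (associativity × page size), and a page's colour is its page frame number modulo that count. `CacheGeometry`, `NumColors`, and `PageColor` are hypothetical names, and the geometry is assumed to yield at least one colour.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of one cache level.
struct CacheGeometry
{
    std::uint64_t SizeBytes; // total cache size
    std::uint64_t Ways;      // associativity
    std::uint64_t PageSize;  // page size in bytes
};

// Number of distinct page colours: how many pages fit in one cache way.
constexpr std::uint64_t NumColors(const CacheGeometry &Cache)
{
    return Cache.SizeBytes / (Cache.Ways * Cache.PageSize);
}

// Pages with the same colour map onto the same group of cache sets.
constexpr std::uint64_t PageColor(const CacheGeometry &Cache, std::uint64_t PhysAddr)
{
    return (PhysAddr / Cache.PageSize) % NumColors(Cache);
}
```

An allocator built on this would try to hand consecutive virtual pages physical frames of different colours, for example by keeping one free list per colour.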

Feature Benefits
List the reasons why this feature would be beneficial

  • Essentially free performance in many cases
  • Allows for near deterministic behavior for userspace programs with regards to cache efficiency

Use case examples
List examples where this feature could be useful for end users.

  • Performance-sensitive code (eg, games, web browsers, database engines) can better predict performance
  • Aid in benchmark suites (fewer swings in performance)

Additional information
Any additional information should be placed here.

[FEATURE] Feature request - Deferred interrupt handler

Issue Checklist

  • A related or similar issue is not already marked as open
  • Another issue describing a similar feature has not already been marked as wontfix or closed
  • This feature is not already present in the software

=====================================================
Feature Description
A mechanism should be in place to handle "heavyweight" interrupt tasks without keeping the system waiting for it.
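
One common shape for such a mechanism is a small queue that the ISR fills and that is drained later, outside interrupt context. The sketch below is single-threaded and host-side: `DeferredQueue`, `Enqueue`, and `Drain` are hypothetical names, and a real kernel would additionally need to mask interrupts or use atomics around the queue state.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// A deferred work item: a plain function pointer plus its argument.
using DeferredFn = void (*)(void *);

struct DeferredItem
{
    DeferredFn Fn;
    void *Arg;
};

// Fixed-capacity ring buffer, so enqueueing never allocates.
template <std::size_t N>
class DeferredQueue
{
public:
    // Called from the ISR: record the work and return immediately.
    bool Enqueue(DeferredFn Fn, void *Arg)
    {
        if (Count == N)
        {
            return false; // queue full; the caller decides what to drop
        }
        Items[(Head + Count) % N] = DeferredItem{Fn, Arg};
        ++Count;
        return true;
    }

    // Called later, outside interrupt context: run queued work in FIFO order.
    std::size_t Drain()
    {
        std::size_t Ran = 0;
        while (Count > 0)
        {
            const DeferredItem Item = Items[Head];
            Head = (Head + 1) % N;
            --Count;
            Item.Fn(Item.Arg);
            ++Ran;
        }
        return Ran;
    }

private:
    std::array<DeferredItem, N> Items{};
    std::size_t Head = 0;
    std::size_t Count = 0;
};
```

With this split, the ISR's cost is bounded by a single `Enqueue`, and the heavyweight part runs whenever the kernel next calls `Drain`.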

Feature Benefits
List the reasons why this feature would be beneficial

  • ISRs can be kept small, and aim for minimizing time spent as much as possible
  • Allows for heavier interrupt processing, such as keyboard events

Use case examples
List examples where this feature could be useful for end users.

  • Processing data from I2C or GPIO pins can avoid conflicting with network events, PCIe, etc. as often
  • System gains more responsiveness, as upper bound on ISRs is lowered

Additional information
Any additional information should be placed here.

[META] - Decide on a license

An open source license should be used for this project, ideally something with few restrictions.
A reasonable license would be something like BSL-1.0, but something like X11, GPL or MPL isn't unreasonable either.

[BUG] - System calls aren't sanitized

Issue Checklist

  • A related or similar issue is not already marked as open
  • The steps to reproduce have been tested, and do produce the issue described
  • If relevant, graphical issues have a screenshot presented as well. Text-only issues have the text and its correct version listed within a Markdown code block section
  • The most recent commit on the master branch the bug is present in, with its commit hash, is listed in this report

=====================================================
Bug Description
A userland process issuing a system call (ie, svcCreateNamedEvent) can pass in arbitrary values to the kernel.
These do not necessarily have to be valid arguments: they could be invalid memory, memory owned by another process, etc.

To Reproduce
Please list the steps to produce the bug below:

  1. Modify a system call such as svcCreateNamedEvent in existing code (ie, sysm) to be invalid
  2. Undesired behavior is now triggered

Screenshots
If relevant, please provide screenshots here.

Expected behavior
The kernel returns an error, or refuses to complete the request

Additional information
This is a very serious bug. Any (and all) system calls need to be checked through some method of copyin/copyout from userland to a temporary kernel buffer to check if it's valid or not. Otherwise, issues like this could occur.
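
A copyin for string arguments might be sketched as below. `CopyInString` and the `RangeCheck` predicate are hypothetical: the predicate stands in for whatever the memory manager provides to say whether a range is readable user memory of the calling process. The per-byte check is chosen for simplicity here, not speed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical predicate: true if [Addr, Addr + Len) is readable user memory
// belonging to the calling process.
using RangeCheck = bool (*)(std::uintptr_t Addr, std::size_t Len);

// Copy a NUL-terminated string from userland into a kernel buffer.
// Returns true only if every byte was readable and the string, including
// its terminator, fit in Dst. Dst is left NUL-terminated when the string
// is too long, and the request is refused.
inline bool CopyInString(RangeCheck IsReadable, const char *UserSrc,
                         char *Dst, std::size_t DstSize)
{
    if (DstSize == 0)
    {
        return false;
    }
    for (std::size_t i = 0; i + 1 < DstSize; ++i)
    {
        // Validate each byte before touching it.
        if (!IsReadable(reinterpret_cast<std::uintptr_t>(UserSrc) + i, 1))
        {
            return false;
        }
        Dst[i] = UserSrc[i];
        if (Dst[i] == '\0')
        {
            return true;
        }
    }
    Dst[DstSize - 1] = '\0';
    return false;
}
```

A system call such as svcCreateNamedEvent would then operate only on the kernel-side copy, and return an error when the copy fails.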

[BUG] - Process page tables do not properly switch on thread switch

Issue Checklist

  • A related or similar issue is not already marked as open
  • The steps to reproduce have been tested, and do produce the issue described
  • If relevant, graphical issues have a screenshot presented as well. Text-only issues have the text and its correct version listed within a Markdown code block section
  • The most recent commit on the master branch the bug is present in, with its commit hash, is listed in this report

=====================================================
Bug Description
Upon swapping to another process, its page tables do not appear to be switched in properly.

To Reproduce
Please list the steps to produce the bug below:

  1. Build and run the vmm branch
  2. Note that the second process always traps on an undefined instruction
  3. But GDB is able to disassemble this instruction fine...?

Screenshots
If relevant, please provide screenshots here.

Expected behavior
Both processes should run correctly

Additional information
Any additional information should be placed here.

[FEATURE] Feature request - Use pantheonSDK for initial processes too

Issue Checklist

  • A related or similar issue is not already marked as open
  • Another issue describing a similar feature has not already been marked as wontfix or closed
  • This feature is not already present in the software

=====================================================
Feature Description
It would be nice to have a single unified (authoritative) SDK for any program that should run under pantheon. As it is right now, there's the content within this repository, and another intended to be used by "real programs". This distinction is somewhat arbitrary.

Feature Benefits
List the reasons why this feature would be beneficial

  • Less development overhead
  • Definitions can be separated from kernel headers nicely

Use case examples
List examples where this feature could be useful for end users.

  • Any programs built will always agree with the same one used for the OS
  • More documentation for all of the services available, since there are no secret libraries

Additional information
The repository in question is https://github.com/bSchnepp/pantheonSDK

[FEATURE] Feature request - Create a slab allocator for all kernel structures

Issue Checklist

  • A related or similar issue is not already marked as open
  • Another issue describing a similar feature has not already been marked as wontfix or closed
  • This feature is not already present in the software

=====================================================
Feature Description
A slab allocator should be used by the kernel to allocate and initialize objects very quickly, and with minimal fragmentation.
Ideally, total fragmentation should not exceed 20% of a page size (roughly 800 bytes).
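
The core of a slab, fixed-size slots plus a free list, can be sketched as follows. This is a single fixed-capacity slab for one object type, written for the host: `Slab`, `Allocate`, `Free`, and `Used` are hypothetical names, and growing by adding further slabs, per-CPU caches, and locking are all left out.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// One slab holding up to N objects of type T in equally sized slots.
template <typename T, std::size_t N>
class Slab
{
public:
    Slab()
    {
        // Chain every slot into the free list; index N marks the end.
        for (std::size_t i = 0; i < N; ++i)
        {
            NextFree[i] = i + 1;
        }
    }

    Slab(const Slab &) = delete;
    Slab &operator=(const Slab &) = delete;

    // Pop a slot off the free list and construct a T in it.
    template <typename... Args>
    T *Allocate(Args &&...Params)
    {
        if (FreeHead == N)
        {
            return nullptr; // slab exhausted
        }
        const std::size_t Index = FreeHead;
        FreeHead = NextFree[Index];
        ++InUse;
        return new (Storage[Index]) T(std::forward<Args>(Params)...);
    }

    // Destroy the object and push its slot back onto the free list.
    // Obj must have come from Allocate on this slab.
    void Free(T *Obj)
    {
        unsigned char *Raw = reinterpret_cast<unsigned char *>(Obj);
        const std::size_t Index =
            static_cast<std::size_t>(Raw - &Storage[0][0]) / sizeof(T);
        Obj->~T();
        NextFree[Index] = FreeHead;
        FreeHead = Index;
        --InUse;
    }

    std::size_t Used() const { return InUse; }

private:
    alignas(T) unsigned char Storage[N][sizeof(T)];
    std::size_t NextFree[N];
    std::size_t FreeHead = 0;
    std::size_t InUse = 0;
};
```

Because every slot has the same size and type, a freed slot can only ever be reused for another object of that type, which is part of why use-after-free bugs become harder to exploit.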

Feature Benefits
List the reasons why this feature would be beneficial

  • Use after free bugs are much more difficult to take advantage of
  • Solaris, L4, and Linux all use this kind of allocator
  • Predictable performance and avoiding mixing kernel memory with userland memory
  • Cache benefits, for iterating over all processes

Use case examples
List examples where this feature could be useful for end users.

  • System performance will be improved
  • Security is improved for free
  • Stronger guarantees about object consistency
  • Can be combined with a generic buddy or simple doubly linked allocator for more fine-grained control (see Linux)

Additional information
Any additional information should be placed here.

[FEATURE] Feature request - Embed initial userland inside final kernel binary

Issue Checklist

  • A related or similar issue is not already marked as open
  • Another issue describing a similar feature has not already been marked as wontfix or closed
  • This feature is not already present in the software

=====================================================
Feature Description
The userland image (either as some kind of initramfs, raw binaries in a table, etc.) should be included in the kernel binary.

Feature Benefits
List the reasons why this feature would be beneficial

  • Allows for a unified, single image, avoiding a mess with second/third/fourth stage bootloaders and keeping filesystem drivers out of the kernel
  • Ensures that if we need to sign the kernel, signing the initial userland comes along with it
  • Embedded objects can be copied and then freed, reclaiming some space within the kernel's known safe area

Additional information
Any additional information should be placed here.

[BUG] - Scheduler isn't always fair

Issue Checklist

  • A related or similar issue is not already marked as open
  • The steps to reproduce have been tested, and do produce the issue described
  • If relevant, graphical issues have a screenshot presented as well. Text-only issues have the text and its correct version listed within a Markdown code block section
  • The most recent commit on the master branch the bug is present in, with its commit hash, is listed in this report

=====================================================
Bug Description
The scheduler currently relies upon a race to reschedule a process, where a different idle thread is found first before it is picked up, as in 4bd7bee.
Correct behavior would be that, for a given core running a given thread, once the number of ticks has expired, the core picks up a different thread rather than the same one. The current setup always ensures there is at least one idle thread at any given time, so this condition should always occur.

To Reproduce
Please list the steps to produce the bug below:

  1. Run the software in qemu, using test-nosd.sh
  2. Observe serial output
  3. Note the frequency for the same thread being scheduled

Screenshots
If relevant, please provide screenshots here.

Expected behavior
The output from kern_idle() should remain fairly similar between all of the threads, varying by no more than a few thousand.

Additional information
Any additional information should be placed here.

[FEATURE] Feature request - Port to Raspberry Pi 4

Issue Checklist

  • A related or similar issue is not already marked as open
  • Another issue describing a similar feature has not already been marked as wontfix or closed
  • This feature is not already present in the software

=====================================================
Feature Description
A port to the Raspberry Pi 4 should be done. This will allow pantheon to boot on an inexpensive arm64 board, which is (relatively) widely available and popular.

Feature Benefits
List the reasons why this feature would be beneficial

  • The Pi is a very popular device, and I already have some to test with (B0 stepping, however). The B0 stepping specifically has some problems: a 3GB PCIe window limitation; eMMC within the first 1G, which is precious and should be used for GPU device code mappings; and C0 is different enough to mess up U-Boot, so we should be extra careful.
  • A Mesa driver already exists: we just need to wrap DRM/DRI/KMS/etc. as userland daemons (or even within one big 'gpu' process) and should be good to go. That's a ton of work, but much easier than reimplementing DRM/KMS from scratch. Or we can reimplement DRM from scratch anyway, and have a useful reference driver.
  • We can see this work on the real thing!

Use case examples
List examples where this feature could be useful for end users.

  • Deployment of pantheon on small CM4 devices (ie, IoT stuff)
  • Deployment of pantheon for normal Pi 4 Model B, as a reasonable alternative to Linux

Additional information
Any additional information should be placed here.

[BUG] - Occasional crash after handling interrupt while within another interrupt context

Issue Checklist

  • A related or similar issue is not already marked as open
  • The steps to reproduce have been tested, and do produce the issue described
  • If relevant, graphical issues have a screenshot presented as well. Text-only issues have the text and its correct version listed within a Markdown code block section
  • The most recent commit on the master branch the bug is present in, with its commit hash, is listed in this report

=====================================================
Bug Description
At some point of execution, usually after a few hours, some condition occurs where code execution is resumed at some point causing a data abort or some other race. This causes the kernel to crash in a subtle way, but without triggering a kernel panic.

To Reproduce
Please list the steps to produce the bug below:

  1. Run sysm and prgm on a VM with SMP
  2. Wait a very long time (2-3 hours)
  3. A crash occurs

Screenshots
If relevant, please provide screenshots here.

Expected behavior
The kernel should never crash under normal operation.

Additional information
This bug typically gets triggered after some condition with sysm causes it to stop being scheduled. It may help to look for race conditions there.

[BUG] - Implement .init_array functionality

Issue Checklist

  • A related or similar issue is not already marked as open
  • The steps to reproduce have been tested, and do produce the issue described
  • If relevant, graphical issues have a screenshot presented as well. Text-only issues have the text and its correct version listed within a Markdown code block section
  • The most recent commit on the master branch the bug is present in, with its commit hash, is listed in this report

=====================================================
Bug Description
The constructors of statically-declared objects are not run by the kernel before executing kern_main.
This is present in 17c5056

To Reproduce
Please list the steps to produce the bug below:

  1. Remove ArrayList assignment code in pantheon::GlobalScheduler::CreateIdleProc
  2. Execute the kernel, as with test_nosd.sh
  3. The serial console will log an undefined instruction error, caused by a null pointer dereference
  4. This behavior causes the given idle process to never be properly created.

Screenshots
If relevant, please provide screenshots here.

Expected behavior
Constructors on all objects declared statically should be executed.
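
The usual approach is to walk the array of function pointers that the toolchain places in `.init_array` and call each one before entering kern_main. The sketch below takes the bounds as parameters so it can be tested on the host; in the kernel they would come from linker-script symbols (commonly named `__init_array_start` and `__init_array_end`, which is an assumption about this project's linker script). `RunInitArray` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>

// Each .init_array entry is a pointer to a function taking no arguments.
using InitFn = void (*)();

// Call every constructor thunk in [Begin, End), in order.
// Returns how many entries were actually invoked.
inline std::size_t RunInitArray(InitFn *Begin, InitFn *End)
{
    std::size_t Ran = 0;
    for (InitFn *Fn = Begin; Fn != End; ++Fn)
    {
        if (*Fn != nullptr)
        {
            (*Fn)();
            ++Ran;
        }
    }
    return Ran;
}
```

This would need to run once, early in boot, before any code that depends on a statically-declared object having been constructed.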

Additional information
Any additional information should be placed here.

[FEATURE] Feature request - Implement drivers and support for ARM Trusted Firmware

Issue Checklist

  • A related or similar issue is not already marked as open
  • Another issue describing a similar feature has not already been marked as wontfix or closed
  • This feature is not already present in the software

=====================================================
Feature Description
Support for interaction with a secure world should be done to allow programs desiring to hold secrets from other programs to be ported.

Feature Benefits
List the reasons why this feature would be beneficial

  • Allows for certain proprietary programs to be ported more easily, such as web browsers with certain extensions. This, of course, requires the program's developer to port the program in the first place.
  • Broader hardware support when ports are done
  • Allows for system management that system escalation won't simply bypass. Ie, a secure way to hold onto filesystem access permissions.

Use case examples
List examples where this feature could be useful for end users.

  • Programs requiring secure services will be possible to port
  • The system may now use hardware to protect against unauthorized access to files, ie, protecting a user's home directory from being viewed by others based on permissions granted by the secure world
  • Additional utilization of hardware

Additional information
Any additional information should be placed here.
