
printbf's Introduction

printbf -- Brainfuck interpreter in printf

Authors

Background

Generic POSIX printf itself can be Turing complete, as shown in Control-Flow Bending. Here we take printf-oriented programming one step further and present a brainfuck interpreter inside a single printf statement.

An attacker can control a printf statement through a format string vulnerability (where an attacker-controlled string is used as the first parameter to a printf-like statement) or by corrupting the first argument to a printf statement through, e.g., a generic memory corruption. See the disclaimer below for practical in-the-wild considerations.
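As a minimal sketch of the first case, the two helpers below differ only in whether caller-supplied text reaches the format-string position. The names render_unsafe and render_safe are hypothetical and do not come from the printbf sources; output goes through snprintf so the result can be inspected.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, buggy on purpose: the caller-supplied text is used
 * as the format string, so any conversion specifiers inside it are
 * interpreted by snprintf. Compilers may warn about this call. */
int render_unsafe(char *out, size_t cap, const char *user_text)
{
    return snprintf(out, cap, user_text);
}

/* Hypothetical helper: the format string is a constant and the
 * caller-supplied text is passed as data only. */
int render_safe(char *out, size_t cap, const char *user_text)
{
    return snprintf(out, cap, "%s", user_text);
}
```

Given the harmless input "100%%", render_unsafe produces "100%" because the doubled percent sign is interpreted, while render_safe copies the text unchanged. Input containing real conversion specifiers would make render_unsafe read or write through arguments that were never passed.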

Brainfuck is a Turing-complete language that has the following commands (and their mapping to format strings):

  • > == dataptr++ (%1$.*1$d %2$hn)
  • < == dataptr-- (%1$65535d%1$.*1$d%2$hn)
  • + == (*dataptr)++ (%3$.*3$d %4$hhn)
  • - == (*dataptr)-- (%3$255d%3$.*3$d%4$hhn -- plus check for ovfl)
  • . == putchar(*dataptr) (%3$.*3$d%5$hn)
  • , == *dataptr = getchar() (%13$.*13$d%4$hn)
  • [ == if (*dataptr == 0) goto ] (%1$.*1$d%10$.*10$d%2$hn)
  • ] == if (*dataptr != 0) goto [ (%1$.*1$d%10$.*10$d%2$hn)
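The + and - rows above can be tried in isolation. The sketch below is an illustration, not the interpreter's code: it restates the two format strings with ordinary (non-positional) arguments, writes into a scratch buffer, and uses the hypothetical helper names cell_inc and cell_dec. It assumes that storing a count larger than 127 through %hhn keeps only the low byte, which is what common implementations do.

```c
#include <stdio.h>

/* "+": print *cell characters, then one space, then store the low byte of
 * the running character count back into the cell via %hhn. */
void cell_inc(unsigned char *cell)
{
    char sink[600];              /* large enough for 255 + 255 characters */
    int v = *cell;
    /* "%.*d" with precision v and value v emits exactly v characters
     * (nothing at all when v is 0). */
    snprintf(sink, sizeof sink, "%.*d %hhn", v, v, (signed char *)cell);
}

/* "-": print 255 padding characters, then *cell characters, and store the
 * low byte of the count; adding 255 modulo 256 is the same as subtracting 1. */
void cell_dec(unsigned char *cell)
{
    char sink[600];
    int v = *cell;
    snprintf(sink, sizeof sink, "%255d%.*d%hhn", v, v, v, (signed char *)cell);
}
```

Starting from a cell holding 41, cell_inc leaves 42; starting from 0, cell_dec wraps to 255. The same wrap-around trick with a 16-bit store underlies the < row.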

Demo and sources

Have a look at the bf_pre.c sources to see what is needed to set up the interpreter, and also look at the tokenizer in toker.py.

Run make in ./src to generate a couple of sample programs (in ./src).

Disclaimer

Keep in mind that this printbf interpreter is supposed to be a fun example of Turing completeness that is available in current programs and not a new generic attack vector. This demo is NOT intended to be a generic FORTIFY_SOURCE bypass.

Current systems often either (i) disable %n (which writes to memory and is allowed by the standard but rarely used in practice) or (ii) apply a set of patches that test for attack-like conditions, e.g., whether the format string is in writable memory.
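For readers who have not met %n before, here is a minimal sketch of what it does on a system that still allows it. The helper name chars_before_n is hypothetical; the format string is a literal, so it sits in read-only memory and should not trip the writable-format check described above.

```c
#include <stdio.h>

/* Hypothetical helper: %n stores the number of characters produced so far
 * into the int that its argument points to. */
int chars_before_n(void)
{
    char out[32];
    int count = -1;
    snprintf(out, sizeof out, "hello%n, world", &count);
    return count;    /* 5: the length of "hello" */
}
```

On a libc that disables %n entirely, this call may fail or abort instead of returning 5.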

To use printbf in the wild, an attacker will either have to disable FORTIFY_SOURCE checking or get around the checks by lining up the format strings and placing them in read-only memory. The FORTIFY_SOURCE mitigations are glibc-specific. The attacker model for printbf assumes that the attacker can use memory corruption vulnerabilities to set up the attack or that the sources are compiled without FORTIFY_SOURCE defenses enabled.

printbf's People

Contributors

andybalaam, benaryorg, gannimo


printbf's Issues

Heavy reliance on undefined behavior

When you invoke UB, you can't say the code will do anything, as the compiler can interpret it in any way it wants to, and it will be technically correct. There was a cryptographic function, srandomdev, which existed in BSD sources for a long time using an uninitialized variable as a source of entropy. Nearly a decade later, they switched to LLVM and after a while, someone noticed that the random seeds were coming out very predictable, and it was very subtle. Turns out the compiler was optimizing out the entire seed computation, because they had invoked undefined behavior.

In the same way, the whole premise of this project relies on illegal reads and writes, which are undefined, and assuming that the compiler will do one specific thing. But the behavior is undefined. There is nothing stopping a compiler from seeing the bad reads/writes, or seeing the many occurrences of signed overflow UB, and simply deleting all of the code that touches them, causing either the whole program to collapse in on itself or the program to break in subtle ways that won't be noticed for years. This is what you sign up for when you violate the standard. Have fun.

LBA not TC

The implementation seems to have a maximum of 2^16 cells, therefore it's a Linear-Bounded Automaton, not a Turing Machine.

I think it's worth adding that to the README
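The 2^16 figure is consistent with the README's use of %hn for the data pointer: on common platforms that conversion stores a 16-bit short, so the pointer wraps modulo 65536. The sketch below models only that arithmetic; the helper names are hypothetical and nothing here is taken from the printbf sources.

```c
#include <stdint.h>

/* Model of "dataptr--" as in the "<" row: adding 65535 and keeping the low
 * 16 bits is the same as subtracting 1. */
uint16_t ptr_dec16(uint16_t p)
{
    return (uint16_t)(p + 65535u);
}

/* Model of "dataptr++": the pointer wraps back to 0 after 65535, so at most
 * 65536 distinct cells are addressable. */
uint16_t ptr_inc16(uint16_t p)
{
    return (uint16_t)(p + 1u);
}
```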

Oops inception on the README

A true double BF interpreter.

Minor issue in your README: the C translations *dataptr++ and *dataptr-- change the pointer, not the cell contents. You want (*dataptr)++ or ++*dataptr.

Edit: Oh dear, and your slides.
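The precedence point in this issue can be checked with a short sketch. The function names are hypothetical; each returns 1 when the behavior described in its comment holds.

```c
/* `*p++` parses as `*(p++)`: the pointer advances and the cell it pointed
 * at is left unchanged. */
int pointer_moves(void)
{
    unsigned char tape[2] = {7, 9};
    unsigned char *p = tape;
    (void)*p++;
    return p == &tape[1] && tape[0] == 7;
}

/* `(*p)++` increments the cell and leaves the pointer where it was. */
int cell_changes(void)
{
    unsigned char tape[2] = {7, 9};
    unsigned char *p = tape;
    (*p)++;
    return p == &tape[0] && tape[0] == 8;
}
```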

#9 was closed without being resolved

Issue #9 was closed, but the undefined behavior was not addressed. There is a clear lack of programmer discipline here. Your program only does what you think it does on a few specific hardware/OS/compiler/library combinations, and it does not prove that printf is Turing complete. To prove printf is TC, you would have to find a security hole within the standard itself that allows for an ACE exploit, not a bug in a specific implementation. The second something upstream updates to patch the hole, boom, the program stops working.

If this were presented as a demonstration of an ACE exploit for the purpose of informing upstream of a security hole (in which case you would be listing the libraries and systems that it works on), then I would not be taking issue. But instead, you seem to be keen on slandering C itself for some internet clout, and making ridiculous claims like your claim that printf is TC, based on some implementations having a vulnerability. This is not okay. You are misrepresenting the language and its library. This issue should remain open until you adjust your priorities away from clout and towards helping major operating system vendors improve their security. Do you have any plans to contribute security code to the Linux kernel, or is this an ego thing?

README < > inverted?

I'm likely to have missed something, but shouldn't it be:

  • > == dataptr++ (%1$.*1$d %2$hn)
  • < == dataptr-- (%1$65535d%1$.*1$d%2$hn)

?

Or was it put this way to go down the stack when > is run?

Doubt about the commands ">", "<", "+", "-"

  • > == dataptr++ (%1$65535d%1$.*1$d%2$hn)
  • < == dataptr-- (%1$.*1$d %2$hn)
  • + == dataptr++ (%3$.*3$d %4$hhn)
  • - == datapr-- (%3$255d%3$.*3$d%4$hhn -- plus check for ovfl)

Why do ">" and "+" map to the same command? The same goes for "<" and "-"!

P.S. The word "datapr" is a typo; you probably mean "dataptr".

Please remove printbf from the Internet

This is too much. printbf is so nasty, it may form a digital black hole, swallowing all the interwebs.

For the sake of humanity and its dependence on Facebook and Instagram, please delete this repository before it's too late!
