Code Monkey home page Code Monkey logo

pholyglot's Introduction

Blog posts

For information about the project, see the following blog posts:

IRC

,cc int $a = 10; printf("%d", $a); // Now I'm curious
http://www.iso-9899.info/n1570.html#J.5.2p1 [Specialized identifiers] Characters other than the underscore _, letters, and digits, that are not part of the basic source
                 character set (such as the dollar sign $, or characters in national character sets) may appear in an identifier (6.4.2).
,cc int \u0024a = 10; // this, on the other hand, is strictly conforming
"Before C++11, C++ didn't allow Unicode escapes with a code point under 0xA0, except for 0x24 ($), 0x40 (@), and 0x60 (`)."
"C still has the same restrictions as C++98."

-w #include <dlfcn.h> void callbyname(const char *name, const char *arg) { ((int (*)(const char *)) dlsym(RTLD_DEFAULT, name))(arg); } callbyname("puts", "YOLO");

olle: for your c target, probably easiest to say: int *temp = &a->b; *temp = 1
iow the code generator treats lvalues uniformly as pointers

Notes

#define \u0024 $
#define \u003C ?
#define \u002F /

#define EMPTY
EMPTY#ifdef

GOTO here;
here:

gcc polyglot.c -I/usr/include/mysql -I/usr/include/mysql/mysql -lmysqlclient

#include "fcgi_stdio.h"

$a = parse_int($_GET['used_id']);

// TODO: This won't deal with different mem strategies
// BUT: Use _Generic to switch on it? Use same new() with different behaviour - alloca, malloc, Boehm etc.
// BUT: Should be able to use one class with many diff allocation strats, /** @alloc stack */ etc
// NB: new (foo()) is valid in PHP when foo() returns string/class name
#define new(x) x ## __constructor(alloca(sizeof(struct x)))

g_printf vs printf?

g_slice instead of malloc?

https://stackoverflow.com/questions/25144963/converting-a-mysql-result-into-a-json-string-in-c/25523088#25523088

https://stackoverflow.com/questions/5451913/how-to-retrieve-form-post-data-via-cgi-bin-program-written-in-c

  • Class can be struct.
    • Pass around $__self explicitly
  • String can use smart strings from PHP core or GNU libc
  • Array can use same array. ** #define array const Body* but also need to pass length as an argument
  • Hashtable gonna be hard. Must use class for that?
  • GC with Boehm, no long-running scripts? Or glib slice/manual ref count
    • libgc-dev on Ubuntu
  • Need docblock for arrays - no collection class built-in in PHP
    • Fixed-size array (as in C); can't change size (at least not without copy mem; but mess up pointers?)
    • Vector, as array but can change size
    • (Double-linked) list
    • Hashtable

Impossible

  • Nested assoc arrays
  • Read/write from PHP session
  • Array init in both C and PHP ** BUT: Maybe possible with #__C__ and sed? Or just sed?
  • UTF8

Snippets

To include PHP which is ignored in C:

// ?> <?php $arr = $arr[0]; ?>

Use-cases

  • Read database config from JSON file
  • Read/write from/to database
  • Curl call somewhere
  • Some mathematical thing (n-body problem?)
  • Export something big to CSV
  • IRC-bot

Open questions

  • String concat
  • Local type inference
  • Assoc array is banned, hash table with string keys is ok

Benchmark

strip binary first

$ time f r a in {1..100}; do php polyglot.php; done
$ time for a in {1..100}; do ./polyglot; done

C version:
real    0m0,532s
user    0m0,266s
sys     0m0,217s

PHP version:
real    0m2,211s
user    0m1,338s
sys     0m0,810s

Transpiler

  • What type information is needed?
function add(int $x, int $t): int {
    return $x + $y;
}

function add_points(Point $p1, Point $p2) {
    return new Point($x->x
}

setVariableAndTokenMappingsForExpressionManager

Notes

Topic is language design and type-inference.

PHP has a weird "array" type that's basically a hash table that also acts as a list. When transpiling PHP to a more machine-close language, one optimization is to infer the "array" type to a more precise type, e.g. to an actual C-like array, linked list, dynamic vector, or a traditional hash table.

Some example code:

$arr = [1, 2, 3];     // Fixed size integer array
/** @var int[42] */
$arr = [];            // Fixed-size int array with size 42
$arr[] = 4;           // Pushing a new element on top of $arr, so either dynamic vector or linked list
$arr["moo"] = 5;      // Now it's a hash table
$tuple = [1, "moo"];  // This could be a tuple, since it has different types

The challenge is to deal with these different alternatives when only limited information is available.

function foo(array $arr): string {
    return $arr[0];  // Assume a string array, that is, string[]
}

function foo(int $a, string $b): array {
    return [$a, $b];  // Tuple with size 2
}

Possible to enforce stack alloc for class? So can't combine with heap alloc classes.

/** @pholyglot stack-only */
class Point
{
}

Arrays

use _ as tuple;

/**
 * @param array<int>  // Fixed or dynamic or list?
 */
function foo(array $arr): void {
}

$foo = /** @alloc stack */ new Foo();
$foo = /** @alloc pool */ new Foo();

/**
 * @param-pholly vect<int>
 * @return-pholly tuple
 */
function bar(array &$a): array
{
    $b = $a[0];  // Fixed-size
    return [$b];
}
/**
 * OK in Psalm
 * @return &array<int>
 */
function &foo(): array
{
    $a = [1, 2, 3];
    return $a;
}
[$a, $b, $c] = &foo();
echo $a;
echo $b;
echo $c;

Error. All good if error message is empty (as in Go).

function bar(): array
{
    //return [128, null];
    return [127, "error message"];
}
if (([$result, $error] = bar()) === [$result, null]) {
    echo 'all good, we got result ' . $result;
} else {
    echo $error;
}

gcc

__cleanup__

http://echorand.me/site/notes/articles/c_cleanup/cleanup_attribute_c.html

Stack alloc vs value types:

/**
 * @return Point@stack
 */
function makePoint() {
    $p = /** @alloc stack */ new Point();
    ...
    return $p;           // Break;
    return new Point();  // Works with value types...?
}

/**
 * @param SplDoublyLinkedList $points
 */
function addPointToList(SplDoublyLinkedList $points) {
    // Pool, Boehm, ref count
    $points->push(new Point());
}

// Can be done with both pool, boehm, ref count?
// Polymorph on alloc kind?
array_slice($arr, 0, 1);

Pool:

$foos = /** @mem pool */ [new Foo(), new Foo(), new Foo()];
$slice = array_slice($foos, 0, 1);
return $slice;  // Pool can't escape, same as stack?

Arena is not the same as a pool.

Make long == double == uintptr_t same size

21:46 size_t a = sizeof(long), b = sizeof(double), c = sizeof(uintptr_t); if (a == b && b == c) {}
22:01 static_assert(sizeof(long) == sizeof(double) == sizeof(uintptr_t));

$ gcc -E -dM - < /dev/null | egrep 'UINTPTR_TYPE|SIZEOF_LONG'
#define __SIZEOF_LONG__ 8
#define __SIZEOF_LONG_DOUBLE__ 16
#define __UINTPTR_TYPE__ long unsigned int
#define __SIZEOF_LONG_LONG__ 8
$ x86_64-w64-mingw32-gcc -E -dM - < /dev/null | egrep 'UINTPTR_TYPE|SIZEOF_LONG'
#define __SIZEOF_LONG__ 4
#define __SIZEOF_LONG_DOUBLE__ 16
#define __UINTPTR_TYPE__ long long unsigned int
#define __SIZEOF_LONG_LONG__ 8

x86_64 tolerates unaligned access but it's slow. Other platforms normally just segfaults.
alignof(max_align_t) is the safest alignment that works for all types.

Single owner as a solution to malloc from mysqli and other external libs?

PHP FFI could be used?

pholyglot's People

Contributors

olleharstedt avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.