Code Monkey home page Code Monkey logo

Comments (11)

mlugg avatar mlugg commented on June 18, 2024 2

To be clear, I agree we should improve on this, it would be great to avoid those allocs. I was more just trying to explain the reason for the complexity to Vexu.

from zig.

mlugg avatar mlugg commented on June 18, 2024 2

Ah yes, that makes sense. We could set up AstRlAnnotate to require result locations for destructures if there are ever multiple peers - that way we avoid the PTR issues. That might be a little tricky to detect in general, but it would be very easy to at least optimize the trivial case of directly destructuring the result of these builtins - that would be a few lines in AstRlAnnotate and a new result location strategy in AstGen. I'll take a look at this soon.

from zig.

IntegratedQuantum avatar IntegratedQuantum commented on June 18, 2024 1

Part of the difference seems to come from tuple destructuring, which seems to add an extra write to the stack.
But there seems to be an even bigger issue here:
@addWithOverflow seems to generate a lot more code than the equivalent C function (godbolt).
Why is the zig version even writing to the stack? Why is it using so many extra instructions?

from zig.

notcancername avatar notcancername commented on June 18, 2024

Part of the difference seems to come from tuple destructuring, which seems to add an extra write to the stack.

Yes, I added the NoDestructure variants to verify this.

Why is the zig version even writing to the stack? Why is it using so many extra instructions?

I blame tuples.

from zig.

Vexu avatar Vexu commented on June 18, 2024

@addWithOverflow seems to generate a lot more code than the equivalent C function (godbolt).

Because at -O3 the C version doesn't check the result:

define dso_local i64 @addInC(i64 noundef %a, i64 noundef %b) local_unnamed_addr {
entry:
  %retval.0 = tail call i64 @llvm.uadd.sat.i64(i64 %a, i64 %b)
  ret i64 %retval.0
}

from zig.

Vexu avatar Vexu commented on June 18, 2024

And the reason for the write is AstGen.assignDestructureMaybeDecls always generating an alloc for const variables.

I may be missing something but I don't see why destructuring into a const would ever require an alloc.

from zig.

mlugg avatar mlugg commented on June 18, 2024

I may be missing something but I don't see why destructuring into a const would ever require an alloc.

Well, firstly, we need one if we want to use RLS, just like with a single const. However, unfortunately, we pretty much need to be consistent and use allocs for all destructures. The reason is that by the current semantics, there is no type T such that all destructurable types can coerce to T. Consider this:

var rt = true;
var a: struct { u64, u64 } = .{ 1, 2 };
var b: [2]u32 = .{ 1, 2 };
_ = .{ &rt, &a, &b };
const c, const d = if (rt) a else b;

If we didn't use allocs here, we would create a block which had breaks for both of these types, and they don't peer resolve. We could restrict the language to disallow this, but that would break the RLS case by default (allowing things that we've disallowed), so we'd need instructions to check this.

from zig.

Vexu avatar Vexu commented on June 18, 2024

Oh right. But shouldn't it still be able to use nodes_need_rl?

from zig.

mlugg avatar mlugg commented on June 18, 2024

For a single const, we can omit the alloc if we don't need to utilize RLS (based on nodes_need_rl). What I was getting at above is that trying to elide allocs for destructuring here would cause an actual difference in semantics, because of the PTR/coercion issue I described, where we would be trying to peer resolve potentially-incompatible types which might nonetheless be individually destructurable.

Also note that even if it did work (we adjusted the language spec in some way or implemented some weird ZIR instructions to coerce tuples how we want) we would have to deal with the weird scenario of some elements of a destructure having result locations when others don't.

from zig.

andrewrk avatar andrewrk commented on June 18, 2024

The whole point of switching @addWithOverflow to tuples was so that the backends could lower it more efficiently. It's pretty lame how it turned out.

from zig.

Vexu avatar Vexu commented on June 18, 2024

Looking at it closer nodes_need_rl isn't enough but shouldn't combining that with a check for control flow that could result in PTR be enough to remove the allocs in most cases?

from zig.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.