
Comments (7)

raphlinus commented on August 31, 2024

Heh, I'm not parsing the transform properly; the --flip flag is present on the command line to work around this. If nothing else, the documentation should be improved.

The font glitches are a known issue, some numerical robustness issues I haven't gotten around to fixing yet.

Thanks for your interest!

from vello.

oleid commented on August 31, 2024

Just discovered flip myself, now I feel stupid :D

How were the benchmarks from the blog post performed? Did you merely run the CLI tool without any additional options? Given the tiger SVG from this repo, my old RX 470 seems to be faster than a 1060, which seems odd.


raphlinus commented on August 31, 2024

Yes, by running the CLI tool. This tends not to trigger a "boost" mode; I've noticed it runs about 20% faster with continuous presentation. Regarding relative performance, the code uses shared memory (LDS) extensively, which is considerably faster (relative to other metrics) on AMD than on other cards. I think those two things together probably explain it, though you always learn things when you explore deeply into performance measurement.


oleid commented on August 31, 2024

Would you like to keep this issue open as a source of information for others, or would you rather close it?


eliasnaur commented on August 31, 2024

> The font glitches are a known issue, some numerical robustness issues I haven't gotten around to fixing yet.

I'd like to work on this. If you have pointers to where the issues are, I'd appreciate it.


raphlinus commented on August 31, 2024

This is a good starter issue: it probably won't require a lot of code to fix (and the changes will likely be localized to path_coarse.comp), but it does require some insight and understanding.

The source of the glitch is a numerical inconsistency between two "orientation problems". One is to detect where the path segment crosses the top edge of a tile (that's "xray" on line 215 of current master), and the other is where it crosses the left edge ("y_edge" on line 234). The answer is to rework the slope calculation so that these two tests are always consistent with each other. One thing that makes this a little tricky is that the segment can be vertical or horizontal, in which case the slope or its inverse is infinite.
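
To illustrate what "consistent" means here, the following is a minimal sketch in Python. It is not the shader's actual code and not Pathfinder's fix; it swaps in a different technique (a single cross-product orientation predicate evaluated at tile corners) purely for illustration. The names `side`, `crosses_left_edge`, `crosses_top_edge`, and the `tile` parameter are hypothetical. Because both edge tests reuse the same predicate value at the shared corner, they cannot disagree about which side of that corner the segment passes, and no slope is ever computed, so vertical and horizontal segments need no special case.

```python
def side(p0, p1, c):
    """Sign test: which side of the line through p0 -> p1 the corner c is on.

    Uses a cross product, so no division and no slope is involved.
    A corner exactly on the line (cross == 0) is assigned to the False side.
    """
    cross = (p1[0] - p0[0]) * (c[1] - p0[1]) - (p1[1] - p0[1]) * (c[0] - p0[0])
    return cross > 0.0


def crosses_left_edge(p0, p1, tx, ty, tile):
    """Does the segment cross the left edge of the tile whose top-left corner is (tx, ty)?"""
    # The endpoints must lie on opposite sides of the vertical line x = tx ...
    straddles = (p0[0] < tx) != (p1[0] < tx)
    # ... and the edge's two corners must lie on opposite sides of the segment's line.
    return straddles and side(p0, p1, (tx, ty)) != side(p0, p1, (tx, ty + tile))


def crosses_top_edge(p0, p1, tx, ty, tile):
    """Does the segment cross the top edge of the tile whose top-left corner is (tx, ty)?"""
    straddles = (p0[1] < ty) != (p1[1] < ty)
    # The corner (tx, ty) is shared with the left-edge test, so both tests
    # see the same answer for it.
    return straddles and side(p0, p1, (tx, ty)) != side(p0, p1, (tx + tile, ty))
```

With this formulation, a segment that passes exactly through a tile corner is claimed by exactly one of the two adjacent edges, for example a vertical segment at `x = 16` with 16-pixel tiles crosses the top edge of only one tile in the row starting at `y = 16`. Whether a predicate of this shape is a good fit for the shader is a separate question.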

I suggest taking a look at the Pathfinder codebase, as Patrick has changed it to use a similar algorithm (also inspired by RAVG) and has addressed the numerical robustness issues. Patrick might be able to point you to the relevant commit in the code if you don't easily find it yourself.

Best of luck!


eliasnaur commented on August 31, 2024

I have made some progress on this. The diff below empirically fixes all visible glitches in a setup where I carefully watch a slowly rotating tiger. The approach is sound in that the changes steer the intersection decisions away from inconsistencies. Unfortunately, the code is now complicated enough that deficiencies are no longer obvious; I'm going to attempt a rewrite of the algorithm to tackle that.

diff --git gpu/shaders/path_coarse.comp gpu/shaders/path_coarse.comp
index 658af0eb..67f79871 100644
--- gpu/shaders/path_coarse.comp
+++ gpu/shaders/path_coarse.comp
@@ -174,10 +174,20 @@ void main() {
                 b = invslope; // Note: assumes square tiles, otherwise scale.
                 a = (p0.x - (p0.y - 0.5 * float(TILE_HEIGHT_PX)) * b) * SX;
 
-                int x0 = int(floor((xmin) * SX));
-                int x1 = int(ceil((xmax) * SX));
-                int y0 = int(floor((ymin) * SY));
-                int y1 = int(ceil((ymax) * SY));
+                float sxmax = xmax * SX;
+                float symax = ymax * SY;
+
+                int x0 = int(floor(xmin * SX));
+                int x1 = int(ceil(sxmax));
+                int y0 = int(floor(ymin * SY));
+                int y1 = int(ceil(symax));
+
+                if (sxmax == ceil(sxmax)) {
+                    x1 += 1;
+                }
+                if (symax == ceil(symax)) {
+                    y1 += 1;
+                }
 
                 x0 = clamp(x0, bbox.x, bbox.z);
                 y0 = clamp(y0, bbox.y, bbox.w);
@@ -191,19 +201,38 @@ void main() {
                 // Consider using subgroups to aggregate atomic add.
                 uint tile_offset = atomicAdd(alloc, n_tile_alloc * TileSeg_size);
                 TileSeg tile_seg;
+
+                int p0_xray = int(floor(p0.x*SX));
+                int p1_xray = int(floor(p1.x*SX));
+                int xray, last_xray;
+                if (p0.y < p1.y) {
+                    xray = p0_xray;
+                    last_xray = p1_xray;
+                } else {
+                    xray = p1_xray;
+                    last_xray = p0_xray;
+                }
                 for (int y = y0; y < y1; y++) {
                     float tile_y0 = float(y * TILE_HEIGHT_PX);
-                    if (tag == PathSeg_FillCubic && min(p0.y, p1.y) <= tile_y0) {
-                        int xray = max(int(ceil(xc - 0.5 * b)), bbox.x);
-                        if (xray < bbox.z) {
-                            int backdrop = p1.y < p0.y ? 1 : -1;
-                            TileRef tile_ref = Tile_index(path.tiles, uint(base + xray));
-                            uint tile_el = tile_ref.offset >> 2;
-                            atomicAdd(tile[tile_el + 1], backdrop);
-                        }
+                    int xbackdrop = max(xray + 1, bbox.x);
+                    if (tag == PathSeg_FillCubic && y > y0 && xbackdrop < bbox.z) {
+                        int backdrop = p1.y < p0.y ? 1 : -1;
+                        TileRef tile_ref = Tile_index(path.tiles, uint(base + xbackdrop));
+                        uint tile_el = tile_ref.offset >> 2;
+                        atomicAdd(tile[tile_el + 1], backdrop);
+                    }
+
+                    int xx0 = int(floor(xc - c));
+                    int xx1 = int(ceil(xc + c));
+                    xx1 = max(xx1, xray + 1);
+                    xx0 = clamp(xx0, x0, x1);
+                    xx1 = clamp(xx1, x0, x1);
+
+                    int next_xray = xray;
+                    if (y == y1 - 1) {
+                        next_xray = last_xray;
                     }
-                    int xx0 = clamp(int(floor(xc - c)), x0, x1);
-                    int xx1 = clamp(int(ceil(xc + c)), x0, x1);
+
                     for (int x = xx0; x < xx1; x++) {
                         float tile_x0 = float(x * TILE_WIDTH_PX);
                         TileRef tile_ref = Tile_index(path.tiles, uint(base + x));
@@ -214,12 +243,36 @@ void main() {
                         float y_edge = 0.0;
                         if (tag == PathSeg_FillCubic) {
                             y_edge = mix(p0.y, p1.y, (tile_x0 - p0.x) / dx);
-                            if (min(p0.x, p1.x) < tile_x0 && y_edge >= tile_y0 && y_edge < tile_y0 + TILE_HEIGHT_PX) {
-                                if (p0.x > p1.x) {
+                            bool intersects = min(p0.x, p1.x) < tile_x0 && y_edge >= tile_y0 && y_edge < tile_y0 + TILE_HEIGHT_PX;
+
+                            if (y < y1 - 1) {
+                                if (intersects && x > next_xray && next_xray >= xray) {
+                                    next_xray = x;
+                                } else if (intersects && x <= next_xray && next_xray <= xray) {
+                                    next_xray = x - 1;
+                                }
+                            }
+                            if (next_xray < xray) {
+                                intersects = next_xray < x && x <= xray;
+                            } else if (next_xray > xray) {
+                                intersects = xray < x && x <= next_xray;
+                            } else {
+                                intersects = false;
+                            }
+
+                            float s = sign(p0.x - p1.x);
+                            if (min(p0.x, p1.x) < tile_x0 && max(p0.x, p1.x) >= tile_x0) {
+                                if (s > 0) {
                                     tile_seg.end = vec2(tile_x0, y_edge);
                                 } else {
                                     tile_seg.start = vec2(tile_x0, y_edge);
                                 }
+                            }
+
+                            if (intersects) {
+                                if (tile_seg.end.x == tile_seg.start.x) {
+                                    tile_seg.start.x += s*1e-4;
+                                }
                             } else {
                                 y_edge = 1e9;
                             }
@@ -231,6 +284,7 @@ void main() {
                     }
                     xc += b;
                     base += stride;
+                    xray = next_xray;
                 }
 
                 n_out += 1;
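
The first hunk of the diff widens the tile range when the scaled maximum lands exactly on a tile boundary. The reason is that `ceil` of an exact integer returns that integer, so a half-open loop such as `for (x = x0; x < x1; x++)` would skip the tile that starts at the boundary. A minimal Python sketch of that rule follows; the function name `tile_range` is hypothetical, and it assumes the scale factor is the reciprocal of the tile size (as the names `SX` and `SY` suggest, though the diff does not show their definitions). The shader additionally clamps the result to the path's bounding box, which is omitted here.

```python
import math


def tile_range(vmin, vmax, scale):
    """Half-open tile index range [lo, hi) covering the interval [vmin, vmax].

    Mirrors the x0/x1 (or y0/y1) computation in the diff: if the scaled
    maximum is exactly an integer, ceil() leaves it unchanged, so the range
    is extended by one to include the tile starting at that boundary.
    """
    lo = math.floor(vmin * scale)
    scaled_max = vmax * scale
    hi = math.ceil(scaled_max)
    if scaled_max == hi:
        hi += 1
    return lo, hi
```

For example, with 16-pixel tiles (`scale = 1 / 16`), the interval from 3 to 31 covers tiles 0 and 1, while the interval from 3 to 32 also touches the boundary of tile 2 and so returns a range that includes it.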
