Comments (9)

JoshuaWise commented on June 7, 2024

I had this problem as well, so I created a new thread pool library specifically for this purpose. Its API is heavily inspired by Piscina. In my tests, I found it to perform an order of magnitude faster than using WHATWG ReadableStream. Here it is: https://github.com/WiseLibs/wise-workers

One caveat, though: I didn't implement atomics for thread communication, because I didn't see any performance improvement in my use-cases.

Maybe my library can serve as inspiration for Piscina. Implementing support for generators wasn't really that complicated. It might seem complicated at first, because you'd think you need a lot of coordination for pausing and resuming the generator across threads, but that behavior isn't actually desirable: the point of using worker threads is to have both threads working at the same time (i.e., one thread producing data while the other consumes it, simultaneously). So in a multi-threaded situation, you never actually want to pause the generator. That's exactly how my library works; it just runs the generator eagerly to maximize parallelism.

It's worth noting that ReadableStream might be more desirable for some use-cases due to its support for backpressure, but in my case it wasn't relevant.

from piscina.

mcollina commented on June 7, 2024

For maximum performance and the ideal user experience, React has built a streaming server side renderer that renders out chunks of HTML that can be streamed to the browser as they are produced, instead of having to wait until the entire render is complete before sending the first byte. This means the browser can start rendering, start prefetching other resources, etc etc, and is no extra work for the people building the React apps to support. See https://reactjs.org/docs/react-dom-server.html#rendertonodestream for more information.

As far as I know, React renderToNodeStream worsens the performance of React SSR at scale. It's an antipattern that's best avoided. renderToNodeStream uses more CPU than renderToString, and trading latency for scalability is never a great choice in Node.js environments.

The problem with React SSR is not initial latency; it's that you can easily engulf your event loop with more tasks than it can process in the given time allocation. As with any CPU-bound activity, you want to lower the amount of CPU being used, not increase it.


Supporting a stream or async iterator as a result would be great nevertheless.


airhorns commented on June 7, 2024

We can talk this out in a different channel if you like, as it isn't super relevant to Piscina, but I do think renderToNodeStream justifies this feature request from Piscina nonetheless. Here's why:

As far as I know, React renderToNodeStream worsens the performance of React SSR at scale. It's an antipattern that's best avoided.

Can you find any documentation that supports this? I am most interested in the best experience for the user, and the best experience for the user would allow streaming a result out to them before it has completed rendering. I prioritize user experience over CPU efficiency -- that's what React SSR is explicitly about. Also -- why would the React team add it to their API and support it if it was such a bad idea?

Especially for big renders, if the render takes 100ms, you've already blown your human-perception-of-instant time budget before dealing with network latency. When sending JS bundles built with a bundler that might emit link preloads or other split chunks that need to be hinted to the browser to get a good TTI, you need to get the browser downloading those assets as soon as possible. You also don't always know exactly what assets to send ahead of time because of React.lazy.

renderToNodeStream uses more CPU than renderToString - trading latency for scalability is never a great choice in Node.js environments.

The problem with React SSR is not initial latency; it's that you can easily engulf your event loop with more tasks than it can process in the given time allocation. As with any CPU-bound activity, you want to lower the amount of CPU being used, not increase it.

Is the extra CPU just because there are a few more promises and stack pops? That seems like pretty small overhead. For big pages, minimizing TTFB makes a lot of sense, and for small pages would it really dominate? It would also let more than one render run at once on a worker thread, which I think wouldn't increase throughput but would increase resource utilization as you wouldn't be forced to have a gazillion threads and kernel level context switching.

Also, as React adds SSR support for Suspense and Concurrent Mode, there are going to be stack pops and promise awaits anyways, and the implementation of renderToNodeStream may improve. React doesn't yet have Suspense support during SSR, but all the existing solutions like react-ssr-prepass are at least async, and I think they could be streaming if they wanted to be. Libraries like styled-components also ship server-side streaming renderers themselves, so they can stream out CSS blocks as they are rendered and style the half-sent document too! Also, I know Node's default high-water mark for streams is 16kb, and many pages will come in under that size, but for the pages over it, which I am building many of, I would like my users to be happy!


jasnell commented on June 7, 2024

So this is something that I've had on my list for a bit and have played around with some. There is some complexity in that achieving the highest performance would require a native addon and a custom transferable-object implementation (currently only possible if done within Node.js core). It is possible to implement a streaming model on top of MessagePort and transferable typed arrays, though. See the example here: https://github.com/piscinajs/piscina/blob/current/examples/stream/stream.mjs


airhorns commented on June 7, 2024

Do you think it'd be worth building support for streams / async iterators over a MessagePort into the library itself? It might be nice to have a battle-tested, shared implementation that bakes in the smart folks' performance knowledge for the not-super-fast way, while we figure out whether Node might add the bits needed to make it really fast.


mcollina commented on June 7, 2024

We can talk this out in a different channel if you like, as it isn't super relevant to Piscina, but I do think renderToNodeStream justifies this feature request from Piscina nonetheless. Here's why:

As far as I know, React renderToNodeStream worsens the performance of React SSR at scale. It's an antipattern that's best avoided.

Can you find any documentation that supports this? I am most interested in the best experience for the user, and the best experience for the user would allow streaming a result out to them before it has completed rendering. I prioritize user experience over CPU efficiency -- that's what React SSR is explicitly about. Also -- why would the React team add it to their API and support it if it was such a bad idea?

I cannot speak for the React team. As somebody working on and developing Node.js core, it does not make any sense to me. The benefit of React SSR comes from caching: I can render some HTML and place it in a CDN forever. That's the sweet spot for React SSR.

Especially for big renders, if the render takes 100ms, you've already blown your human-perception-of-instant time budget before dealing with network latency. When sending JS bundles built with a bundler that might emit link preloads or other split chunks that need to be hinted to the browser to get a good TTI, you need to get the browser downloading those assets as soon as possible. You also don't always know exactly what assets to send ahead of time because of React.lazy.

renderToNodeStream uses more CPU than renderToString - trading latency for scalability is never a great choice in Node.js environments.

The problem with React SSR is not initial latency; it's that you can easily engulf your event loop with more tasks than it can process in the given time allocation. As with any CPU-bound activity, you want to lower the amount of CPU being used, not increase it.

Is the extra CPU just because there are a few more promises and stack pops? That seems like pretty small overhead. For big pages, minimizing TTFB makes a lot of sense, and for small pages would it really dominate? It would also let more than one render run at once on a worker thread, which I think wouldn't increase throughput but would increase resource utilization as you wouldn't be forced to have a gazillion threads and kernel level context switching.

Unfortunately the biggest hit comes from the garbage collector. A large React page allocates a lot of data because there are a lot of DOM nodes. Because the rendering is spread over time, most of that data is retained until the page has finished rendering: it moves to "old space", quickly ballooning memory usage. Data in old space is expensive to collect from a CPU point of view.

As most Node.js applications are deployed with 1 core available, the GC will "steal" CPU cycles from rendering other pages. The result is that it will very often take more server time to send the full page via a stream than as a string. You will need 1.5x-2x the number of servers to use streaming.

The best approach for getting low-latency to the users is to have the pages cached in a CDN.


airhorns commented on June 7, 2024

Because the rendering is spread over time, most of that data is retained until the page has finished rendering: it moves to "old space", quickly ballooning memory usage. Data in old space is expensive to collect from a CPU point of view.

The GC can still move things to the old generation during synchronous calls, can't it? From what I know, minor collection cycles are triggered by growth in new-space allocations, not by time. If streaming rendering and synchronous rendering generate roughly the same amount of garbage, plus or minus some promises and stream chunks, then they should promote roughly the same amount to old space and cause roughly the same number of slow major collections. What am I missing? Streaming rendering should also let the heap bloat less, because chunks can be flushed and GC'd before the whole render is complete.

The best approach for getting low-latency to the users is to have the pages cached in a CDN.

Sure -- if you can do that, though, why not pre-render using next.js or something like that? Even if you can't pre-render and are just using read-through caching by being the origin for a CDN, the extra cost of server-side rendering is amortized across all the other cache-hit requests to the CDN. I am most worried about the situations where you can't pre-render and where the cache is ineffective, because those generate the highest volume of server-side rendering: applications behind auth walls that include some user details in the response, or beta-flag state, or any of the bits that make caching web applications hard. I feel like those situations apply the most pressure to a server-side renderer and should govern the design, and in those situations the user experience matters a lot to me.

I also think you are still prioritizing lower server costs over TTFB, which compromises the experience. If that's the trade-off, someone who cares should benchmark and prove that it really does lower server costs before we trade away experience for it.


airhorns commented on June 7, 2024

Opened gadget-inc/fastify-renderer#13 for us to continue this discussion and not bother the piscina folks!


jasnell commented on June 7, 2024

#170 adds an example that uses the new WHATWG streams support in Node.js to stream data into and out of a running worker. I believe that addresses the need here! :-)

