
Comments (7)

natemoo-re commented on May 12, 2024 11

Also wanted to share some diagrams that describe how we expect to break this project down.

The current build in Astro 3.x is a single bundle step with a large module graph. Referencing astro:content pulls in every module that exists inside of every collection, leading to a huge module graph that Rollup struggles to process.

[Diagram: current]

Phase One of this incremental build project will focus on refactoring Content Collections out to a self-contained build step. Instead of treating astro:content as the entrypoint, the collection items themselves are treated as the entrypoints and astro:content is regenerated after. This keeps the module graph small and efficient, while opening up an opportunity to cache the outputs for unchanged collection items. The rest of the build remains the same.

[Diagram: incremental-one]
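
A minimal sketch of the Phase One idea, assuming a simplified in-memory model: each collection entry is built as its own entrypoint, cached output is reused when the entry's source is unchanged, and the index (standing in for the generated astro:content module) is derived afterwards. The names `fnv1a`, `compileEntry`, and `buildCollection` are hypothetical and are not Astro APIs, and the FNV-1a hash is only a stand-in for whatever checksum a real implementation would use.

```javascript
// Hypothetical sketch: every name here is illustrative, not an Astro API.

// Small non-cryptographic hash (FNV-1a over UTF-16 code units),
// used as a stand-in for a real content checksum.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, '0');
}

// Stand-in for the expensive per-entry compile step.
function compileEntry(source) {
  return `export default ${JSON.stringify(source.trim())};`;
}

// Build each entry as its own entrypoint, reusing cached output when the
// entry's checksum matches, then derive the index once all entries are done.
function buildCollection(entries, cache = new Map()) {
  const outputs = new Map();
  const rebuilt = [];
  for (const [id, source] of Object.entries(entries)) {
    const checksum = fnv1a(source);
    const cached = cache.get(id);
    if (cached && cached.checksum === checksum) {
      outputs.set(id, cached); // unchanged entry: restore from cache
    } else {
      outputs.set(id, { checksum, code: compileEntry(source) });
      rebuilt.push(id);
    }
  }
  const index = [...outputs.keys()].sort();
  return { outputs, rebuilt, index };
}

const first = buildCollection({ 'blog/a.md': '# A', 'blog/b.md': '# B' });
const second = buildCollection(
  { 'blog/a.md': '# A', 'blog/b.md': '# B (edited)' },
  first.outputs
);
console.log(second.rebuilt); // only the edited entry is recompiled
```

In this model the cost of a rebuild scales with the number of changed entries rather than with the size of the whole collection, which is the property the paragraph above is after.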

Phase Two of the incremental build project will build on top of the learnings and patterns established during Phase One. This step will focus on making the main server build more efficient and tracing exactly which pages need to be rebuilt. This will extend the benefits of incremental builds beyond the previously established Content Collections use case. Treating this as a separate phase will allow us to hone our approach before tackling the more generalized solution.

[Diagram: incremental-two]

from roadmap.

natemoo-re commented on May 12, 2024 10

Ideally, this is something that could be solved generically at the Vite / Rollup level so that every framework could benefit from it. I'm really not sure if that's on the table, though, since the ultimate goal is to bypass Vite / Rollup as much as possible. If incremental builds were easy to solve in a generic way, it would have been done already.

My current sketch for an API is very straightforward from the user's perspective:

// astro.config.mjs
import { defineConfig } from 'astro/config'

export default defineConfig({
  build: { incremental: true },
  experimental: { incremental: true } // until this is stable
})

Unfortunately that's where the simplicity ends. To implement this, we'll likely need to:

  • Generate a serializable module graph that contains every possible build input. This will need to track all module relationships (hence the "graph" part).
  • Given the module graph, generate a stable checksum for each file. This will allow us to determine which parts of the graph have changed.
  • On every build, we'll need to generate a new module graph and compare it to the old one. Use the checksum to determine which files need to be invalidated (changes bubble all the way up to an entry point). If any subtree of the module graph has changed shape, that also invalidates relevant portions of the module graph.
  • Pass any invalidated modules as inputs to Vite. Thankfully Astro already controls every Vite input!
  • As Vite generates new outputs for invalid parts of the graph, we can restore valid parts of the graph from our cache (likely in node_modules/.cache/astro or node_modules/.astro).
    • Ideal scenario: the entire module graph is valid, so the entire output is restored.
    • Worst case scenario: the entire module graph is invalid, so the entire output needs to be generated from scratch. This is currently what we do on every build.
  • Any invalidated modules should be removed from the existing cache.
  • Now that our output has been merged into a repaired state, execute it to prerender our .html files. (We can't skip directly to restoring the .html files because the .js output can depend on external data that we don't know about.)
  • Populate the cache with our output for next time.
  • Remove the chunks needed for prerendering from our dist folder.
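
The invalidation step in the list above can be sketched as follows. This is an illustration under stated assumptions, not Astro's implementation: the graph is a plain object mapping each module id to the ids it imports, checksums are precomputed strings, and `findInvalidModules` is a hypothetical helper. It only covers bubbling a changed checksum up to the entry points; caching, restoring output, and modules removed from the graph are left out.

```javascript
// Hypothetical sketch: the graph shape and function names are assumptions,
// not Astro internals.

// graph: module id -> ids of the modules it imports.
// oldChecksums / newChecksums: module id -> checksum of the module's contents.
function findInvalidModules(graph, oldChecksums, newChecksums) {
  // Reverse the edges so a change can bubble up from a module to its importers.
  const importers = new Map();
  for (const [id, deps] of Object.entries(graph)) {
    for (const dep of deps) {
      if (!importers.has(dep)) importers.set(dep, []);
      importers.get(dep).push(id);
    }
  }

  // Seed with every module whose checksum is new or different.
  const queue = Object.keys(graph).filter(
    (id) => oldChecksums[id] !== newChecksums[id]
  );
  const invalid = new Set(queue);

  // Walk upwards until every transitive importer is marked invalid.
  while (queue.length > 0) {
    const id = queue.pop();
    for (const parent of importers.get(id) ?? []) {
      if (!invalid.has(parent)) {
        invalid.add(parent);
        queue.push(parent);
      }
    }
  }
  return invalid;
}

const graph = {
  'pages/index.astro': ['components/Header.astro'],
  'pages/about.astro': ['components/Header.astro', 'components/Bio.astro'],
  'components/Header.astro': [],
  'components/Bio.astro': [],
};
const oldChecksums = {
  'pages/index.astro': 'a1',
  'pages/about.astro': 'b1',
  'components/Header.astro': 'c1',
  'components/Bio.astro': 'd1',
};
// Only Bio.astro changed between builds.
const newChecksums = { ...oldChecksums, 'components/Bio.astro': 'd2' };

const invalid = findInvalidModules(graph, oldChecksums, newChecksums);
const entryPoints = ['pages/index.astro', 'pages/about.astro'];
// Entry points that would be passed back to the bundler as inputs.
console.log(entryPoints.filter((id) => invalid.has(id))); // [ 'pages/about.astro' ]
```

Here only `pages/about.astro` is rebuilt, because it is the only entry point that imports the changed module; `pages/index.astro` could be restored from the cache.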


natemoo-re commented on May 12, 2024 5

Exciting news! I've spent the last month investigating quite a few approaches to this problem and we're ready to move forward with the first phase of our plan.

Pretty immediately, we hit a major problem with the way Content Collections are currently architected. Invalidating a single article would have a waterfall effect: it would invalidate the entire collection the article belonged to, so every page that referenced that collection would need to be rebuilt. We were also able to verify that the size of the module graph was the single biggest contributor to extremely slow builds. This is not particularly surprising, as module graphs have long been identified as the main bottleneck for JS build tools, but it's nice to have confirmation that this holds true for Astro.

Our first step towards incremental builds will be an internal refactor to the way that Content Collections are generated. Instead of treating Content Collection entries as part of the larger module graph, Astro will treat them as individual entrypoints for a separate build process. Not only does this drastically reduce the size of the main module graph, it should allow us to detect and rebuild only the Content Collection entries that change between builds.
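
As a rough illustration of detecting which entries changed between builds, the sketch below compares two manifests that map entry ids to content checksums. The manifest shape and the `diffEntries` helper are assumptions made for this example; the thread does not specify how Astro would track entries.

```javascript
// Hypothetical sketch: manifests map entry ids to content checksums.
// Neither the manifest shape nor diffEntries comes from Astro itself.
function diffEntries(previous, current) {
  const added = [];
  const changed = [];
  const removed = [];
  for (const [id, checksum] of Object.entries(current)) {
    if (!Object.hasOwn(previous, id)) added.push(id);
    else if (previous[id] !== checksum) changed.push(id);
  }
  for (const id of Object.keys(previous)) {
    if (!Object.hasOwn(current, id)) removed.push(id);
  }
  return { added, changed, removed };
}

const previousBuild = { 'blog/a.md': '1f', 'blog/b.md': '2e', 'blog/c.md': '3d' };
const currentBuild = { 'blog/a.md': '1f', 'blog/b.md': '9z', 'blog/d.md': '4c' };

const diff = diffEntries(previousBuild, currentBuild);
console.log(diff);
// { added: [ 'blog/d.md' ], changed: [ 'blog/b.md' ], removed: [ 'blog/c.md' ] }
```

Only the `added` and `changed` entries would need to be rebuilt; `removed` entries would be dropped from the cache, and everything else could be reused.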

Note

To begin, this refactor will only benefit users that make heavy use of Content Collections. We hope to use this effort to develop internal patterns and primitives that will inform later incremental build improvements. Stay tuned!


natemoo-re commented on May 12, 2024 3

Graduating to a full-fledged RFC. #763


fparedlo commented on May 12, 2024 2

This is amazing, hope it comes in the next release!


EyePulp commented on May 12, 2024 1

@natemoo-re Thanks for the clarity, detail, and documentation of your approach. I'm eager for the performance improvements.

More selfishly, I'm hopeful this opens the door to selective page renders. Our use case has a lot of configuration options within a single Astro project, up to and including rendering or not rendering individual pages. We can solve it today by using the dynamic route feature, but something more explicit and declarative would be very welcome, and it feels like incremental builds might offer that.

Regardless, thanks for the work!


heyitsdoodler commented on May 12, 2024

Is there a branch where work on phase 2 is being conducted?

