joist-orm: a TypeScript ORM for Postgres
Home Page: https://joist-orm.io/
In the docs for `EntityManager#findOrCreate`, the documented parameters are not in the same order as the method parameters: `upsert` and `ifNew` are reversed.
As seen in the docs:
```ts
class Author extends AuthorCodegen {
  constructor(em: EntityManager, opts: AuthorOpts) {
    super(em, opts);
  }
}

authorConfig.beforeFlush(async () => ...);
authorConfig.afterCommit(async () => ...);
```
At first glance, this feels confusing. As a developer, I'd assume the hooks should be placed on the model/entity. Something like this could make more sense:
```ts
class Author extends AuthorCodegen {
  static beforeFlush = [
    async () => ...,
    async () => ...,
  ];
}
```
Secondly, coming from Rails, it's easy to end up with hooks scattered across different parts of the app ("hook hell"), losing track of where changes are being made and which hooks fire.
Why not just use the de facto standard connection URI string, e.g. `DATABASE_URI`? Stuffing JSON into an environment variable is difficult in deployments that get the connection information from the environment.
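As a sketch of why the URI form is convenient: a standard connection string splits into discrete options with nothing but the built-in `URL` parser, no JSON-in-env required (the example URI and option names below are illustrative assumptions, not Joist config):

```typescript
// Parse a de-facto-standard Postgres connection URI into discrete options.
// In a real app this string would come from e.g. process.env.DATABASE_URI.
const uri = "postgres://joist:local@db:5432/joist";
const url = new URL(uri);
const connection = {
  host: url.hostname,
  port: Number(url.port || 5432),
  user: decodeURIComponent(url.username),
  password: decodeURIComponent(url.password),
  database: url.pathname.replace(/^\//, ""),
};
console.log(connection);
```

Note that `pg`'s `Client`/`Pool` also accept a `connectionString` option directly, so Joist could likely pass the URI straight through.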
Support things like:
```ts
// Currently what we support
await expect(author).toMatchEntity({
  firstName: "a1",
  books: [{ title: "b1" }],
});

// Currently our version only supports `author: { id: ... }`; we need to
// know "do you expect to drill any further into this entity?"
await expect(book).toMatchEntity({
  author: author,
});

// Add support for lists of entities
await expect(author.books.get).toMatchEntities([
  { title: "b1" },
  { title: "b1" },
]);

// Add support for unloaded collections?
await expect(author.books).toMatchEntities([
  { title: "b1" },
  { title: "b1" },
]);
```
When an m2o changes, it'd be nice to be able to call `entity.changes.originalParent` and have it handle "is this a new/loaded entity?" and "is this just an id?". It would probably need to be async.
In tests, it would be a large DX win to be able to traverse the object graph with all relations already loaded. I.e. we could have factories return `DeepNew<...>` versions of entities that just assume all relations are always loaded.

This would require either:

- the entire DB already being loaded into the `EntityManager` (in tests, the data set is usually small enough for this to be doable), or
- the `DeepNew<...>` type being just a heuristic that is "usually true", but might still fail at runtime if the test / code-under-test did not actually trigger the relation to be loaded. For tests that use a single `EntityManager`, this heuristic would likely always hold. For tests that use multiple `EntityManager`s, i.e. resolver tests, it would probably be "usually true", but not 100%.
Just thinking out loud, we could envision an `EntityManager.loadEntireDatabase` that just pulled the entire db into memory. A naive implementation, that just did one `SELECT` per table plus one `SELECT` per o2m/m2m/etc. relation, seems like it'd be incredibly inefficient and not doable.

A fancier implementation idea would be to leverage the `flush_database` trick and use a `select_all_tables` stored procedure that returned a `SELECT *` across all tables as a `UNION`. This would give us a single wire call + result set with "all the things". It would be a huge mess, but we could have a dedicated/internal method that knew how to decipher the mega-result set into a) individual entities and b) force/preloaded o2m/m2m/o2o collections, appropriately.
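Purely as a sketch (the table names here are assumptions), one way the mega-result set could be shaped is to tag each branch with its source table and serialize rows to json, so heterogeneous row types can be `UNION`ed:

```sql
-- Each branch tags its rows with the source table and serializes them to
-- json, so one result set can carry every table's rows.
SELECT 'authors' AS _table, to_json(a) AS _row FROM authors a
UNION ALL
SELECT 'books', to_json(b) FROM books b
UNION ALL
SELECT 'book_reviews', to_json(br) FROM book_reviews br;
```

The dedicated/internal method would then bucket rows by `_table` and hydrate entities per bucket.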
As a full knex replacement:
https://www.jakso.me/blog/kysely-a-type-safe-sql-query-builder-for-typescript

When auto-batching ~100+ `em.find` calls, the performance gets terrible. Is this inherent, or due to a bad implementation detail?
Currently a `first_name` column that is `not null` in the db is represented as `Author.firstName: string`, i.e. non-optional / non-null / non-undefined in the domain model.

This is the most convenient (and accurate) representation for already-persisted / already-validated entities, as it simplifies business logic that assumes the field is required, as well as mapping to APIs, i.e. GraphQL APIs with `firstName: String!`, where `firstName` is exposed to the outside world as "a required string".

(If `Author.firstName` was modeled as `string | undefined`, then the API layer exposing `firstName` would have to constantly do an `if (!firstName) fail()` check, even though in the common/read case it will always be set.)

Joist generally enforces field required-ness on creation of the entity, i.e. `em.create(Author, { firstName: ... })` requires a `string` (at least at the type system level).

However, while creating/updating the domain model through methods like `Author.setPartial` or `em.createOrUpdatePartial`, the `firstName` is allowed to temporarily become `undefined`, to trigger validation rules about the invalidity of the missing `firstName`.

This means that the types can transiently be incorrect, i.e. `Author.firstName` might return `undefined`.
A few options to avoid this:

Prevent `Author.firstName` from ever becoming `undefined`, by rejecting the `setPartial` or `createOrUpdatePartial` with a validation error (i.e. the "parse, don't validate" approach).

This is probably the ideal, but has some wrinkles:

a) What about bulk updates, i.e. what is the API for bulk updating 20 `Author`s? (Seems solvable.)

b) Would validation rules become split between "pre-`undefined` / pre-mutation checks" (indicated via a `ValidationError` before entities are even mutated with the proposed new values) and "post-`undefined` / post-mutation checks" (i.e. a similar `ValidationError`, but after entities have been mutated (with null-safe values) and then go on to fail other "more than just required" validation rules)?

I.e. more complex validation rules are cross-entity and cross-field, and need a way to access both the incoming/new values as well as the existing/current values (unless those are changing as well).

These sorts of rules seem best expressed in the "not parsing" approach, i.e. by just assuming the entities exist as they are / have been mutated, and then reaching into them to validate the current state.

And if we conclude that "post-mutation" validation rules are generally necessary/good, then it seems simplest to have all validation rules work that way, instead of creating a pre-mutation/post-mutation split.
A more tactical approach is to just return empty states from getters if/while the entity happens to be invalid, i.e.:

```ts
const a = em.create(Author, { firstName: "bob" });
a.setPartial({ firstName: null });
console.log(a.firstName); // prints empty string
```

Should `book.author.get` (which is required) also return a default value while it's transiently unset?

Or: do nothing, and keep the "most of the time"-ness aspect of `Author.firstName`.
To a certain extent, in information management systems (i.e. not missile guidance systems), a tongue-in-cheek observation is: is any field ever really required? I.e. while going through data entry/validation/etc., it is common/normal for a field to a) not be set yet, and b) still be in an invalid state, i.e. don't allow it to persist, but it temporarily exists. Maybe that's fine.

This could also be a pragmatism of JavaScript/TypeScript, in that yes, "it's bad" that `Author.firstName` is `undefined` when the type system thinks it should be `string`; but nothing is going to segfault. At best, a validation error will happen and the transaction will be rolled back; at worst, an NPE will happen.

Such that `Author.firstName: string` becomes not a guarantee, but an assertion of intent (except for persisted data, where it remains an invariant; this issue is solely about transient mutation of domain objects).
Pass hooks a partialized version of entities.

Most of the consumers of "wait, `Author.firstName` is undefined" would likely be internal validation rules and hooks that are observing mid-mutation entities. Given that we pass hooks their entity, we could use a mapped type to partialize all of the fields, so that in `config.beforeFlush(author => ...)` the `author.firstName` would actually be typed as possibly `undefined`.
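A minimal sketch of that mapped type (the `MidFlush` name and `AuthorFields` shape are my own illustrations, not Joist API):

```typescript
// Make every field possibly-undefined, modeling an entity mid-mutation.
type MidFlush<T> = { [K in keyof T]: T[K] | undefined };

interface AuthorFields {
  firstName: string;
  age: number;
}

// A hook written against MidFlush<AuthorFields> is forced to handle the
// transiently-missing case instead of assuming `firstName` is set.
function beforeFlush(author: MidFlush<AuthorFields>): string {
  return author.firstName ?? "(unset)";
}

console.log(beforeFlush({ firstName: undefined, age: 40 })); // "(unset)"
```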
(Also `book.author.get`; we've had actual bugs from this.)

With a constraint that looks like this:

```sql
alter table projects
  add constraint project_has_client_as_collaborator
  FOREIGN KEY (id, client_id)
  REFERENCES project_collaborators (project_id, homeowner_id)
  DEFERRABLE INITIALLY DEFERRED;
```
Results in the generation of incorrect interface/class fields:

```ts
export interface ProjectOpts {
  ...
  id: ProjectCollaborator;
}

export interface ProjectFilter {
  id?: ValueFilter<ProjectId, never>;
  ...
  id?: EntityFilter<ProjectCollaborator, ProjectCollaboratorId, FilterOf<ProjectCollaborator>, never>;
}

export interface ProjectOrder {
  id?: OrderBy;
  ...
  id?: ProjectCollaboratorOrder;
}

export class ProjectCodegen {
  ...
  readonly id: Reference<Project, ProjectCollaborator, never> = new ManyToOneReference<
    Project,
    ProjectCollaborator,
    never
  >(this as any, ProjectCollaborator, "id", "projects", true);
  ...
}
```
The suggested solution is to filter out foreign keys that span multiple columns.
Automatically add `deleted_at: null` to any `em.find`s that don't otherwise include it.

Instead of:

```ts
await Promise.all(
  (await ptm.load((ptm) => ptm.project.stages.projectItems))
    .filter((pi) => pi.isSelection)
    .map((pi) => pi.collaboration.load()),
);
```
Be able to express this as:

```ts
await ptm.load((ptm) => ptm.project.stages.projectItems(pi => pi.isSelection).collaboration);
```

Or maybe even `find`-style clauses that could push down into a `WHERE` clause:

```ts
await ptm.load((ptm) => ptm.project.stages.projectItems({ homeownerSelection: null }).collaboration);
```
Unique constraint violations are often a result of user input, and thus require user-friendly errors on how to resolve them. A nice quality-of-life improvement would be for Joist to catch `unique_violation` Postgres errors and re-raise them as `ValidationError`s (or equivalent) with a generated user-friendly message. This would save the caller from duplicating logic to either pre-query for a value matching the constraint or attempt to parse the error string.

A message using the column name(s) within the constraint (pulled at codegen time), such as `{columnName} has already been taken`, could be generated by default, with an optional config override:
```ts
config.uniqueConstraint("activities_unique_unit_id_sub_milestone_id", {
  message: "This Submilestone has already been added to the parent Milestone.",
});
```
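A hedged sketch of the mechanics: Postgres reports unique violations with SQLSTATE `23505`, and the `pg` driver surfaces the violated `constraint` name on the error, which could key into codegen'd or configured messages. The `ValidationError` class and message table below are illustrative assumptions, not Joist API:

```typescript
// Map constraint names (known at codegen time) to friendly messages.
class ValidationError extends Error {}

const constraintMessages: Record<string, string> = {
  // default generated from the constraint's column names:
  authors_unique_email: "email has already been taken",
  // user-provided override:
  activities_unique_unit_id_sub_milestone_id:
    "This Submilestone has already been added to the parent Milestone.",
};

interface PgErrorFields {
  code?: string;
  constraint?: string;
}

function translateUniqueViolation(err: Error & PgErrorFields): Error {
  // 23505 is Postgres's unique_violation SQLSTATE.
  if (err.code === "23505" && err.constraint && constraintMessages[err.constraint]) {
    return new ValidationError(constraintMessages[err.constraint]);
  }
  // Anything else passes through untranslated.
  return err;
}
```

Joist's flush path could call something like this in its catch block and rethrow the result.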
Follow up to #258.

This is kind of an implementation detail, but currently our `Loaded` type looks like:

```ts
export type Loaded<T extends Entity, H extends LoadHint<T>> = T &
  {
    [K in keyof T]: H extends NestedLoadHint<T>
      ? LoadedIfInNestedHint<T, K, H>
      : H extends ReadonlyArray<infer U>
      ? LoadedIfInKeyHint<T, K, U>
      : LoadedIfInKeyHint<T, K, H>;
  };
```

Note the `T &` at the beginning. This is to work around the fact that, for classes with private fields, a mapped type can never be treated as equivalent to the class.
A potential workaround for this is a `Public<T>` type:

```ts
class FooImpl { ... }
export type Foo = Public<FooImpl>;
```

Such that everyone using `Foo` thinks they're using the class, but they're really using a mapped type which has stripped out the private fields, making it so that we can now have mapped types of `Foo` that can substitute for / polymorphically satisfy `Foo` itself.
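A plausible definition of `Public<T>` (a sketch of the idea, not necessarily what Joist would ship): because `keyof` only sees public members, an identity mapped type already drops private fields.

```typescript
// keyof T only includes public keys, so this mapped type strips privates.
type Public<T> = { [K in keyof T]: T[K] };

class FooImpl {
  private secret = 1;
  name = "foo";
  greet(): string {
    return `hi ${this.name}`;
  }
}

type Foo = Public<FooImpl>;

// A plain object can now satisfy Foo, which it could never do for FooImpl:
// the private `secret` field would block structural assignability.
const f: Foo = { name: "bar", greet: () => "hi bar" };
console.log(f.greet());
```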
Granted, the `FooImpl` is kind of ugly, but given that Joist already controls imports/exports in a very deliberate way (to avoid circular reference issues), we could potentially do something like:

```ts
// Author.ts, use the normal Author name
export class Author extends AuthorCodegen { ... }

// entities.ts
import { Author as AuthorImpl } from "./Author";
export type Author = Public<AuthorImpl>;
```

And because we already have the restriction that all imports of `Author` must go through `import { Author } from "./src/entities"`, the rest of the codebase would get the mapped type version for free.
Note that, for our current `Loaded` type, this indirection/de-`class`-ification is not necessary, but I'm thinking ahead to `ReactiveHint`s, where I think we want to be able to actively remove attributes from the type, and so we would not be able to use the `T &` approach.

I think where this really becomes a problem is code like:

```ts
author.books.add(b);
```

Which is currently typed as `add(Book)`. But if you've got a mapped / loaded / reactive version of the `Book`, that's not the same type. Maybe the relations should be typed with `add(BookLike)`, which, besides being a mapped type, would be so minimal that it is "just the book entity", i.e. the core `id: BookId`.
I.e.:

```ts
const author = await em.load(Author, "a:1", "...8:00am...");

// Maybe too cute
const author = await em.load(Author, "a:1@...8:00am...");
```
I.e. being able to do:

```ts
const count = await task.children.count();
```

Across N tasks in parallel and have it still be 1 SQL query.

Bonus points if this handles WIP changes, i.e. the SQL query says "20 children", but we have local changes that will add 2 new children and remove 1 existing child. Doing this would require not a `COUNT(*)`, but bringing back a list of per-entity children ids, so that we could apply the adds/removes to the list of ids that are currently in the database.
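A sketch of the batching shape (the query runner is stubbed out; none of these names are Joist API): collect every parent id requested in a tick, run one `GROUP BY` query, and fan the results back out, defaulting missing parents to zero:

```typescript
type CountRow = { parentId: number; count: number };

// One SQL round trip for N parents: conceptually
//   SELECT parent_id, COUNT(*) FROM tasks WHERE parent_id IN (...) GROUP BY parent_id
async function batchCounts(
  parentIds: number[],
  runQuery: (ids: number[]) => Promise<CountRow[]>,
): Promise<Map<number, number>> {
  const rows = await runQuery(parentIds);
  const counts = new Map<number, number>(
    rows.map((r) => [r.parentId, r.count] as [number, number]),
  );
  // GROUP BY omits parents with zero children; fill those in explicitly.
  for (const id of parentIds) {
    if (!counts.has(id)) counts.set(id, 0);
  }
  return counts;
}
```

Handling WIP changes would then be a local adjustment per parent: add pending child-adds and subtract pending removes before returning (which, as noted above, is only exact if ids rather than bare counts come back).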
I know I am picking nits here. But I am still wondering: what is the underlying technical limitation for that?

✅ works: `project: { id: args.input.projectId }`. Results in a proper join clause and `where project.id IN (?)`; hits this case.

❌ does not work: `project: { id: { in: args.input.projectId } }`. Results in an incorrect where clause, `where project.id = ?`; hits this case.

The second case should be updated to hit the same code path as the working example.
Take this example:

```ts
const em = newEntityManager();
const company = await em.create(Company, { name: "company 1" });
const person = await em.create(Person, {
  firstName: "asdsd",
  email: "[email protected]",
  company: company, // <---- what does this do?
});
```
I noticed that we cannot set `companyId` on create of the `Product` entity. Is that intended?
- `setupTests`? In `setupTestEnv.js`, set `process.env.STAGE=local` and require `dotenv-stages`.
- `run.sh`? Set `STAGE=local` and pass `-r dotenv-stages`.
- `docker-compose` needs to load the env, done in `docker-compose.yml`? Set `STAGE=local+docker` and `NODE_OPTIONS=-r dotenv-stages`.
- `STAGE=local+docker,test` vs. `NODE_ENV=test`? This is an old RDS-ism that we don't need anymore. We could probably use something from `node-postgres` to do the `DB_...` handling for us.
Currently Joist supports m2m between entities, but not m2m between an entity and an enum.

Note that we do support enum array columns, which can often represent the same model w/o the join table, but it would be nice to support both for good measure. Enum array columns look like:

```ts
b.addColumn("development_commitments", { cost_types: enumArrayColumn("cost_type") });
```
It would be great to add to the docs which queries are supported. Examples:
I.e. support for NoSQL and support for esoteric/legacy schemas.
Currently a populate hint like:

```ts
const a1 = await em.find(Author, { id: "a:1" }, { books: "bookReviews" });
```

Issues three SQL queries:

```sql
SELECT * FROM author WHERE id = 'a:1';
SELECT * FROM books WHERE author_id IN (1);
SELECT * FROM book_reviews WHERE book_id IN (2, 3, 4, 5);
```

Most ORMs that implement preloading will issue a single SQL call:

```sql
SELECT a.*, b.*, br.*
FROM author a
JOIN books b ON b.author_id = a.id
JOIN book_reviews br ON br.book_id = b.id
WHERE a.id = 'a:1';
```
Joist does the 1st for three reasons:

1. Simplicity of implementation. The current 1-query-per-level queries leverage the same auto-batching infra that we get from using dataloader for "avoiding N+1s in a `for` loop", so we essentially got hint-based preloading for free.

2. 1-per-level queries integrate nicely with custom references/collections/async properties. Currently preloading is very generic and just calls `await ...load()` on whatever relation field is being preloaded. This makes it very easy for our other not-strictly-db-relations modeling to integrate with the prepopulate model in a very natural way. If we were to implement `JOIN`-based prepopulation, we'd have to separate the preload hint into `JOIN`-able hints vs. `load`-able hints, and if a preload hint has both, do an initial `JOIN` for the `JOIN`-able data, and then a 2nd pass to call `.load()` on the non-`JOIN`-able properties.

3. Similar to point 1, with the disclaimer that our current app is low-volume, the "1 query per level" approach has been a sufficient solution to the N+1 problem so far.
Currently `joist-graphql`'s integration, i.e. `entityResolver`, does not bother to do GraphQL look-aheads. E.g. when resolving `query { author { books { bookReviews } } }`, the code that loads the `author` or the `books` does not try to figure out that `bookReviews` is also going to be loaded.

This is for three reasons:

1. GraphQL look-aheads are just generally complicated to implement, and we haven't needed them yet.

2. Due to not using `JOIN`s for preloading (see #238), it doesn't actually matter that `entityResolver` doesn't do look-aheads, b/c they'd result in the same non-`JOIN` `.load()` calls that we do today anyway.

3. Due to Joist's / `entityResolver`'s purposeful "non 1-to-1 mapping between DB schema and GQL API", any sub-query that has filters is not something that we can automatically turn into SQL. E.g. for a query like `query { author { books(filter: { title: "a" }) { bookReviews } } }`, we purposefully let the `books.filter` be pretty free-form (to match the exact needs of the UI that uses it), and so cannot auto-`WHERE`-ize it.

Assuming #238 is implemented, the 2nd rationale would change, and it'd be worth investigating pulling in GraphQL look-aheads. Granted, the 3rd rationale is unlikely to change, so look-aheads would probably only work for non-argument / non-filtered relations.
When an entity is first created, include a `config.placeholder()` line like:

```ts
import { someEntityConfig as config } from "./entities";

export class SomeEntity { ... }

config.placeholder();
```

Where `placeholder` is a noop, but it keeps the `config` import in place, so that when the developer is ready to add custom rules/hooks, they don't have to type out / know about the `someEntityConfig as config` convention.
If the DB doesn't start up right away, you'll get a failure:

```
Creating integration-tests_db_1 ... done
docker-compose exec db ./reset.sh
psql: could not connect to server: No such file or directory
        Is the server running locally and accepting
        connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
make: *** [db] Error 2
```

Using `jwilder/dockerize` would allow us to wait until the DB is up and ready.
A large collection is an m2m or o2m that is too large to actually load as entities, so if `publisher.authors` were a `LargeCollection`, it would not have a `.load()` method, and could not be used in a validation populate hint or a reactive hint.
In the past I've generally preferred class-per-table subclasses, but it might be neat to support single-table subclasses with a scheme like: designate a `type` column that determines the subclass in `joist-codegen.json`:

```ts
UserEvent: {
  fields: {
    type: {
      subclasses: {
        COMMENTED: { name: "CommentUserEvent", fields: ["comment"] },
        LIKED: { name: "LikeUserEvent", fields: ["post", "author"] },
      },
    },
  },
}
```
Or maybe re-arranged slightly:

```ts
UserEvent: {
  subclassDiscriminator: "type",
  subclasses: {
    COMMENTED: { name: "CommentUserEvent", fields: ["comment"] },
    LIKED: { name: "LikeUserEvent", fields: ["post", "author"] },
  },
}
```
And then codegen would output:

- `UserEventCodegen.ts` as-is
- `UserEvent.ts` as-is
- `CommentUserEventCodegen.ts`, new, extends `UserEvent`, marks `comment` as required
- `CommentUserEvent.ts`, new, touched once, for subclass-specific business logic.

Questions:

- Can we have `m2o`s to "only a subclass", i.e. only a `CommentUserEvent`? In CPTI (class per table), the `m2o` FK just points to the subclass column.

Add methods like:
```ts
const hasFollower: Promise<boolean> = collaboration.entityFollowers.includes(user);
```

That would do a `SELECT * FROM entity_followers WHERE collaboration_id = 1 AND user_id = 2`, so that we could detect m2m / o2m presence without loading the entire collection.

Also create the `SELECT` via a dataloader, so that multiple field resolvers that call `.includes(...)` will be batched into a single SQL call.
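A sketch of the dataloader-style batching this implies, written without the library (all names here are assumptions, not Joist API): presence checks requested in the same tick are collected and resolved with one batch call, i.e. one `SELECT ... WHERE (collaboration_id, user_id) IN (...)`:

```typescript
type Key = { collaborationId: number; userId: number };

// Returns a per-key loader; all keys requested before the next tick are
// resolved with a single batchFn invocation.
function makeIncludesLoader(
  batchFn: (keys: Key[]) => Promise<boolean[]>,
): (key: Key) => Promise<boolean> {
  let pending: { key: Key; resolve: (v: boolean) => void }[] = [];
  return (key: Key) =>
    new Promise<boolean>((resolve) => {
      pending.push({ key, resolve });
      if (pending.length === 1) {
        // First key of this tick: schedule one flush for the whole batch.
        setTimeout(async () => {
          const batch = pending;
          pending = [];
          const results = await batchFn(batch.map((p) => p.key));
          batch.forEach((p, i) => p.resolve(results[i]));
        }, 0);
      }
    });
}
```

Two field resolvers calling `includes` in the same tick would then produce a single `batchFn` (i.e. single SQL) call.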
Basic idea:

- In `joist-codegen.json`, add a `lazy: true` flag to a given column (primitive only?)
- The field becomes a `someBigColumn: LazyColumn`
- It is loaded via `await author.someBigColumn.load()` or `await em.populate(author, [..., "someBigColumn"])`, and then accessed synchronously via `author.someBigColumn.get`
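A minimal sketch of the wrapper this implies, modeled on Joist's load-then-`.get` convention (the `LazyColumn` shape is an assumption, not existing API):

```typescript
// A lazily-loaded column: async `.load()` once, then synchronous `.get`.
class LazyColumn<T> {
  private loaded = false;
  private value: T | undefined;

  constructor(private loader: () => Promise<T>) {}

  async load(): Promise<T> {
    if (!this.loaded) {
      this.value = await this.loader();
      this.loaded = true;
    }
    return this.value as T;
  }

  get get(): T {
    if (!this.loaded) {
      throw new Error("column is not loaded; call .load() or populate it first");
    }
    return this.value as T;
  }
}
```

In codegen, the `loader` would be a batched `SELECT some_big_column FROM ... WHERE id IN (...)` keyed by entity id.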
Currently, if there are multiple reverse rules, they are treated as separate entries, e.g.:

```ts
// BookReview.ts
bookReviewConfig.addRule({ book: "author" }, () => ...);

// Book.ts
bookConfig.addRule("author", () => ...);
```

Both of these end up as reverse hints in the `Author` config, e.g.:

```ts
["books", "reviews"]
["books"]
```

But then in `followReverseHint` we invoke `books.load()` multiple times. Really, instead of multiple separate reverse hint arrays, we need a single tree of hints, i.e.:

* books
  * reviews
  * ...other collections...

And use immutable.js or immer.js for easy deep modifications.
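A sketch of that merge with plain objects (immutable.js/immer would just make the building nicer; names here are illustrative):

```typescript
// Merge multiple reverse-hint paths into one tree, so each relation in a
// shared prefix (e.g. "books") is only loaded once.
type HintTree = { [relation: string]: HintTree };

function mergeHints(paths: string[][]): HintTree {
  const root: HintTree = {};
  for (const path of paths) {
    let node = root;
    for (const relation of path) {
      node = node[relation] ??= {};
    }
  }
  return root;
}
```

`mergeHints([["books", "reviews"], ["books"]])` yields `{ books: { reviews: {} } }`, so `followReverseHint` could call `books.load()` once and then recurse into `reviews`.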
Running into this:

```
TypeError: typeKinds[kind] is not a constructor
    at /Users/mo/Desktop/code/playing-around/orm-comparison/node_modules/pg-structure/src/main.ts:171:12
    at Array.forEach (<anonymous>)
    at addTypes (/Users/mo/Desktop/code/playing-around/orm-comparison/node_modules/pg-structure/src/main.ts:166:8)
    at addObjects (/Users/mo/Desktop/code/playing-around/orm-comparison/node_modules/pg-structure/src/main.ts:368:3)
    at pgStructure (/Users/mo/Desktop/code/playing-around/orm-comparison/node_modules/pg-structure/src/main.ts:465:3)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
    at async /Users/mo/Desktop/code/playing-around/orm-comparison/node_modules/joist-codegen/src/index.ts:190:16
```

When console logging, I see that `datemultirange` is not supported by pg-structure:
```
{
  oid: 4535,
  arrayOid: 6155,
  schemaOid: 11,
  classOid: 0,
  kind: 'm',
  category: 'R',
  notNull: false,
  default: null,
  sqlType: null,
  arrayDimension: 0,
  name: 'datemultirange',
  values: '{}',
  comment: 'multirange of dates',
  relationKind: null
} m
```
Currently we have the following hooks:

With a few thoughts:

- `afterValidation`
- `afterCommit`s that push data into external systems
- `beforeFlush` running before `beforeCreate`, b/c we want to apply defaults in `beforeCreate` that are seen by `beforeFlush`

Rough proposal: remove the `before`/`after` naming convention, and fix the ordering:

- `onCreate` (fired only on create, and before `onSave`)
- `onUpdate` (fired only on update, and before `onSave`)
- `onSave` (fired on either create or update, and after `onCreate` / `onUpdate`)
- `onDelete` (fired only on delete)
- `onCommit` (the new `afterCommit`)
AFAIU, Postgres itself (the server) has great support for one connection making multiple simultaneous in-flight queries, but the client driver, libpq, only recently gained this functionality (todo add link). And so I'm not sure if the `pg` package we use is leveraging this capability, or if it's artificially serializing queries.

I.e. for Joist, we could send `UPDATE books ...`, `INSERT authors ...`, `UPDATE book_reviews ...`, basically all of our mutations, essentially all simultaneously, but I believe/assume the driver currently serializes them.
If a field is not required, and not specified in the opts, then `newTestInstance` should not associate it with an existing entity.

If I have:

```ts
Author {
  book?: Book
}
```

Then call:

```ts
const a1 = newTestInstance(em, Author, { name: "A1", book: { title: "I'm a book by A1" } });
const a2 = newTestInstance(em, Author, { name: "A2" });

// This will fail
expect(a1.book).not.toEqual(a2.book);
```
I.e. a `beforeCreate` hook is passed an `Author` with all of the required fields typed as-is; but b/c the hook sees the entity before the validation rules have verified the types are correct, something like `author.name.startsWith` could actually fail.

Seems like we should pass the hooks a "more correct" version of the entity, with the required fields turned optional.

Note that we've solved this for validation rules, b/c the "required fields" rules run before other rules. The wrinkle is that hooks run even before the required-field rules, b/c they can set defaults for fields.