
ferretdb's Introduction

FerretDB


FerretDB was founded to become the de-facto open-source substitute for MongoDB. FerretDB is an open-source proxy that converts MongoDB 5.0+ wire protocol queries to SQL, using PostgreSQL or SQLite as the database engine.

flowchart LR
  A["Any application\nAny MongoDB driver"]
  F{{FerretDB}}
  P[(PostgreSQL)]
  S[("SQLite")]

  A -- "MongoDB protocol\nBSON" --> F
  F -- "PostgreSQL protocol\nSQL" --> P
  F -. "SQLite library\nSQL" .-> S

Why do we need FerretDB?

MongoDB was originally an eye-opening technology for many of us developers, empowering us to build applications faster than using relational databases. In its early days, its ease of use and well-documented drivers made MongoDB one of the simplest database solutions available. However, as time passed, MongoDB abandoned its open-source roots, changing the license to SSPL and making it unusable for many open source and early-stage commercial projects.

Most MongoDB users do not require any advanced features offered by MongoDB; however, they need an easy-to-use open-source document database solution. Recognizing this, FerretDB is here to fill that gap.

Scope and current state

FerretDB is compatible with MongoDB drivers and popular MongoDB tools. It functions as a drop-in replacement for MongoDB 5.0+ in many cases. Features are constantly being added to further increase compatibility and performance.

We welcome all contributors. See our public roadmap, a list of known differences with MongoDB, and contributing guidelines.

Quickstart

Run this command to start FerretDB with PostgreSQL backend:

docker run -d --rm --name ferretdb -p 27017:27017 ghcr.io/ferretdb/all-in-one

Alternatively, run this command to start FerretDB with SQLite backend:

docker run -d --rm --name ferretdb -p 27017:27017 -e FERRETDB_HANDLER=sqlite ghcr.io/ferretdb/all-in-one

This command will start a container with FerretDB, PostgreSQL/SQLite, and MongoDB Shell for quick testing and experiments. However, it is unsuitable for production use cases because it keeps all data inside and loses it on shutdown. See our Docker quickstart guide for instructions that don't have those problems.

With that container running, you can:

  • Connect to it with any MongoDB client application using MongoDB URI mongodb://127.0.0.1:27017/.
  • Connect to it using MongoDB Shell by just running mongosh. If you don't have it installed locally, you can run docker exec -it ferretdb mongosh.
  • For the PostgreSQL backend, connect to it by running docker exec -it ferretdb psql -U username ferretdb. FerretDB uses PostgreSQL schemas for MongoDB databases. So, if you created some collections in the test database using any MongoDB client, you can switch to it by running SET search_path = 'test'; query and see a list of PostgreSQL tables by running \d psql command.
  • For the SQLite backend, connect to it by running docker exec -it ferretdb sqlite3 /state/<database>.sqlite. So, if you created some collections in the test database using any MongoDB client, run docker exec -it ferretdb sqlite3 /state/test.sqlite and see a list of SQLite tables by running .tables command.

You can stop the container with docker stop ferretdb.

We also provide binaries and packages for various Linux distributions, as well as a Go library package that embeds FerretDB into your application. See our documentation for more details.

Building and packaging

Note

We strongly advise users not to build FerretDB themselves. Instead, use binaries, Docker images, or packages provided by us.

FerretDB can be built like any other Go program, but a few generated files and build tags can affect the result. See there for more details.

Managed FerretDB at cloud providers

Documentation

Community

If you want to contact FerretDB Inc., please use this form.

ferretdb's People

Contributors

adetunjii, aleksi, alphab, b1ron, chilagrow, codenoid, dependabot[bot], doodgematvey, ekalinin, fadyat, fashander, fenogentov, gevorgyg, henvic, krishnasindhur, kropidlowsky, noisersup, opensauce, pboros, polyal, princejha95, ptrfarkas, raeidish, ribaraka, rumyantseva, seeforschauer, slavabobik, thuan1412, wazir-ahmed, wqhhust

ferretdb's Issues

Support embedded use-case

We should provide a non-internal package that would allow Go developers to embed FerretDB functionality into their programs. They would use an official MongoDB Go driver to connect to an address (typically, but not necessarily, on localhost) that our code would listen on.

Let's put it into /ferretdb directory. Let's start with the most basic public interface:

  • no way to configure mode (it should be "normal");
  • no way to enable or configure logging (it should be disabled);
  • no way to configure metrics (they should be disabled);
  • etc.

To get a connection string (URI), they would call some method after starting the embedded instance.

We will add more functionality based on the community requests.

Let's not move some existing code into that package but copy the smallest possible amount of code and provide the smallest public API possible. Of course, we will still be able to make breaking changes in it before 1.0, but we should not do them lightly.

One possible implementation could be:

  • configuration type Config struct { GoodName string }
  • a single Run(ctx context.Context, config *Config) error function that runs embedded proxy until ctx is canceled.
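The proposed shape can be sketched roughly as follows. This is only an illustration of the "run until ctx is canceled" contract, not the actual package: the `ListenAddr` field is a hypothetical stand-in for the placeholder `GoodName`, and the loop just accepts and closes TCP connections where a real implementation would speak the MongoDB wire protocol.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// Config is a hypothetical configuration type.
// ListenAddr is an assumed field name; the issue only says "GoodName".
type Config struct {
	ListenAddr string // for example "127.0.0.1:0" to pick a free port
}

// Run listens on config.ListenAddr and blocks until ctx is canceled.
// This sketch only accepts connections and closes them immediately.
func Run(ctx context.Context, config *Config) error {
	var lc net.ListenConfig

	l, err := lc.Listen(ctx, "tcp", config.ListenAddr)
	if err != nil {
		return err
	}

	// Closing the listener unblocks Accept once ctx is canceled.
	go func() {
		<-ctx.Done()
		l.Close()
	}()

	for {
		conn, err := l.Accept()
		if err != nil {
			if ctx.Err() != nil {
				return nil // normal shutdown requested by the caller
			}

			return err
		}

		conn.Close()
	}
}

// demo runs the embedded listener for a short time and returns Run's result.
func demo() error {
	ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
	defer cancel()

	return Run(ctx, &Config{ListenAddr: "127.0.0.1:0"})
}

func main() {
	if err := demo(); err != nil {
		fmt.Println("error:", err)
		return
	}

	fmt.Println("stopped cleanly")
}
```

Returning nil on cancellation keeps the caller's error handling simple: any non-nil error means something other than a requested shutdown.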

See #733.

`make env-up` fails on fast computers

PostgreSQL Docker image starts the temporary server, initializes it, then stops it. Waiting for port availability is not enough as we can hit the temporary server, and then the test would fail when the temporary server is stopped.

Instead, we should test with psql query like SELECT 1. Additionally, our psql line actually uses a UNIX socket, not a port; that should be fixed too.
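The readiness check could look something like the sketch below. The `waitReady` helper and its `probe` parameter are hypothetical names; in the real tool the probe would run `SELECT 1` over the TCP port, while here it is simulated so the example stays self-contained.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// waitReady calls probe until it succeeds or ctx is done.
// probe stands in for "run SELECT 1 against the server over TCP".
func waitReady(ctx context.Context, interval time.Duration, probe func(context.Context) error) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		err := probe(ctx)
		if err == nil {
			return nil
		}

		select {
		case <-ctx.Done():
			return fmt.Errorf("not ready: %w (last probe error: %v)", ctx.Err(), err)
		case <-ticker.C:
			// try again
		}
	}
}

// demo simulates a server that rejects the first two probes.
func demo() (attempts int, err error) {
	probe := func(context.Context) error {
		attempts++
		if attempts < 3 {
			return errors.New("connection refused")
		}

		return nil
	}

	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()

	err = waitReady(ctx, 10*time.Millisecond, probe)

	return attempts, err
}

func main() {
	attempts, err := demo()
	fmt.Println(attempts, err)
}
```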

Return sizes in `listDatabases`

listDatabases command should return sizes of each database (PostgreSQL schema) as well as PostgreSQL database total size on disk.

The latter is easy: just call SELECT pg_database_size(current_database()) and convert result. pg_database_size is documented there.

The former is only slightly harder. Let's call pg_total_relation_size on each table. Getting a table list is already implemented in the MsgListCollections method; small refactoring to remove duplication might be needed.

Tests should be added too.

See linked commit for code locations.

Support limit parameter for delete operations

We should support limit: 0 for deleting all matching documents and limit: 1 for deleting only one. Since PostgreSQL does not support DELETE … LIMIT …, we should instead do a subquery:
DELETE FROM xxx WHERE _jsonb->'_id' IN (SELECT _jsonb->'_id' FROM xxx WHERE xxx LIMIT 1).
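A minimal sketch of how the two cases could be turned into SQL text is shown below. The function name `deleteSQL` and its `where` parameter (an already-built, trusted condition) are assumptions made for illustration; the `_jsonb` column name comes from the query above.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent quotes a PostgreSQL identifier by doubling embedded double quotes.
func quoteIdent(s string) string {
	return `"` + strings.ReplaceAll(s, `"`, `""`) + `"`
}

// deleteSQL builds a DELETE statement for the given table.
// where is a trusted, already-built condition (hypothetical parameter).
// limit 0 deletes all matching rows; limit 1 deletes a single row via a subquery.
func deleteSQL(table, where string, limit int) (string, error) {
	t := quoteIdent(table)

	switch limit {
	case 0:
		return fmt.Sprintf(`DELETE FROM %s WHERE %s`, t, where), nil
	case 1:
		return fmt.Sprintf(
			`DELETE FROM %s WHERE _jsonb->'_id' IN (SELECT _jsonb->'_id' FROM %s WHERE %s LIMIT 1)`,
			t, t, where,
		), nil
	default:
		return "", fmt.Errorf("unsupported limit %d", limit)
	}
}

func main() {
	for _, limit := range []int{0, 1, 2} {
		q, err := deleteSQL("users", "_jsonb->>'name' = $1", limit)
		fmt.Println(q, err)
	}
}
```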

Support querying documents

  • Support exact matches (including fields order): db.collection.find({'value': {'foo': 'bar', 'baz': 42}})
  • Support several ANDed conditions
  • Add more tests for dot notation and deeply nested documents ({"foo.bar.baz": {"qux.qux2": 42}})

We might already support some of those points, but we should add integration tests (moving them from handler tests and dance), and maybe add more tests for weird cases and fix them.
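To make the "including fields order" requirement concrete, here is a small sketch of an order-sensitive comparison. The `field` and `document` types are hypothetical and only for illustration; the sketch also ignores MongoDB's cross-type number comparison (for example, 42 and 42.0), which a real implementation would have to handle.

```go
package main

import (
	"fmt"
	"reflect"
)

// field is one key/value pair; document keeps fields in insertion order.
// Both types are hypothetical and exist only for this sketch.
type field struct {
	key   string
	value any
}

type document []field

// exactMatch reports whether a and b have the same keys, in the same order,
// with equal values. Nested documents are compared recursively.
func exactMatch(a, b document) bool {
	if len(a) != len(b) {
		return false
	}

	for i := range a {
		if a[i].key != b[i].key {
			return false
		}

		ad, aIsDoc := a[i].value.(document)
		bd, bIsDoc := b[i].value.(document)

		if aIsDoc != bIsDoc {
			return false
		}

		if aIsDoc {
			if !exactMatch(ad, bd) {
				return false
			}

			continue
		}

		if !reflect.DeepEqual(a[i].value, b[i].value) {
			return false
		}
	}

	return true
}

func main() {
	stored := document{{"foo", "bar"}, {"baz", 42}}
	sameOrder := document{{"foo", "bar"}, {"baz", 42}}
	swapped := document{{"baz", 42}, {"foo", "bar"}}

	fmt.Println(exactMatch(stored, sameOrder)) // true
	fmt.Println(exactMatch(stored, swapped))   // false
}
```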

Support basic cursors commands

Tasks

  1. b1ron w84thesun
  2. code/chore
    noisersup
  3. area/cursors code/feature not ready
    chilagrow
  4. area/aggregations area/cursors code/enhancement
    chilagrow
  5. area/cursors code/chore not ready
    chilagrow
  6. area/cursors code/enhancement
    chilagrow

Add basic TLS support

It should be possible to make FerretDB listen on a TLS port.

Let's add flags:

  • --listen-tls to specify a listening address;
  • --listen-tls-cert-file to specify a PEM file containing a certificate chain;
  • --listen-tls-key-file to specify a PEM file containing an unencrypted private key;

Integration tests should have -target-tls boolean flag that would switch them to use insecure (accepting wrong hostname and invalid or self-signed certificates) TLS connections for both in-process and external target ports. For the in-process listener, a self-signed certificate (keep it in the memory) should be generated only once, not for each test.
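The "generate once, keep in memory" part could be sketched with the standard library alone. All names below (`testCertificate`, `generateCertificate`, `handshake`) are hypothetical, and the certificate parameters are arbitrary choices for a test setup, not project decisions.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"sync"
	"time"
)

var (
	certOnce sync.Once
	cert     tls.Certificate
	certErr  error
)

// testCertificate returns the same in-memory self-signed certificate on every call.
func testCertificate() (tls.Certificate, error) {
	certOnce.Do(func() {
		cert, certErr = generateCertificate()
	})

	return cert, certErr
}

// generateCertificate creates a short-lived self-signed server certificate.
func generateCertificate() (tls.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return tls.Certificate{}, err
	}

	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "ferretdb-test"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(24 * time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}

	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return tls.Certificate{}, err
	}

	return tls.Certificate{Certificate: [][]byte{der}, PrivateKey: key}, nil
}

// handshake starts a TLS listener on a free local port and connects to it
// with certificate verification disabled, as an insecure test client would.
func handshake() error {
	c, err := testCertificate()
	if err != nil {
		return err
	}

	ln, err := tls.Listen("tcp", "127.0.0.1:0", &tls.Config{Certificates: []tls.Certificate{c}})
	if err != nil {
		return err
	}
	defer ln.Close()

	errCh := make(chan error, 1)

	go func() {
		conn, err := ln.Accept()
		if err != nil {
			errCh <- err
			return
		}
		defer conn.Close()

		errCh <- conn.(*tls.Conn).Handshake()
	}()

	conn, err := tls.Dial("tcp", ln.Addr().String(), &tls.Config{InsecureSkipVerify: true})
	if err != nil {
		return err
	}
	defer conn.Close()

	return <-errCh
}

func main() {
	fmt.Println("handshake error:", handshake())
}
```

`InsecureSkipVerify` is acceptable here only because the target is a test listener; it matches the "accepting wrong hostname and invalid or self-signed certificates" behavior described above.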

Not in the scope:

  • TLS over Unix sockets;
  • client certificates;
  • embeddable ferretdb package.

Support basic aggregation pipelines

Tasks

  1. code/chore
    AlekSi
  2. code/feature
    chilagrow
  3. code/feature
    chilagrow
  4. code/feature
    chilagrow
  5. code/feature
    chilagrow
  6. code/feature
    chilagrow
  7. code/feature
    chilagrow
  8. code/enhancement
    chilagrow noisersup

MANGOOOOOOS WOOOO

SOMEONE FINALLY DID IT LEZGOOOOOO 🥭🥭🥭🥭🥭

Support OpLog

We want to support a subset of MongoDB's OpLog that is enough for applications and frameworks tailing it, such as Meteor.
It is not a goal (yet) to make it good enough for replication.

See:

We should support the "old" format, with full $set for updates, not the new format with diff.

Tasks

  1. code/chore
    rumyantseva
  2. code/chore
    rumyantseva
  3. area/oplog code/chore
  4. area/oplog code/chore
    rumyantseva
  5. code/chore not ready
    rumyantseva

Add basic metrics

Let's track the number of incoming requests by message type, command, and response/error status. Expose them as Prometheus metrics on debug endpoint, and dump them on MangoDB termination. That will be useful for testing applications in proxy-diff mode.
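As a rough illustration of the label set, the sketch below counts requests in a plain mutex-protected map and dumps them in a Prometheus-like text form. It deliberately uses only the standard library as a stand-in; it is not the Prometheus client API, and the metric name `requests_total` and all type names are assumptions.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// labels identifies one counter: message type, command, and outcome.
type labels struct {
	opcode  string // for example "OP_MSG"
	command string // for example "find"
	result  string // "ok" or an error name
}

// requestCounter is a stdlib stand-in for a labeled counter metric.
type requestCounter struct {
	mu     sync.Mutex
	counts map[labels]int
}

func newRequestCounter() *requestCounter {
	return &requestCounter{counts: make(map[labels]int)}
}

// inc records one handled request.
func (c *requestCounter) inc(opcode, command, result string) {
	c.mu.Lock()
	defer c.mu.Unlock()

	c.counts[labels{opcode: opcode, command: command, result: result}]++
}

// dump returns all counters as sorted text lines, e.g. for logging on shutdown.
func (c *requestCounter) dump() []string {
	c.mu.Lock()
	defer c.mu.Unlock()

	lines := make([]string, 0, len(c.counts))
	for l, n := range c.counts {
		lines = append(lines, fmt.Sprintf(
			"requests_total{opcode=%q,command=%q,result=%q} %d",
			l.opcode, l.command, l.result, n,
		))
	}

	sort.Strings(lines)

	return lines
}

func main() {
	c := newRequestCounter()
	c.inc("OP_MSG", "find", "ok")
	c.inc("OP_MSG", "find", "ok")
	c.inc("OP_MSG", "insert", "error")

	for _, line := range c.dump() {
		fmt.Println(line)
	}
}
```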

Misleading claims about MongoDB License

MongoDB abandoned its open-source roots, changing the license to SSPL - making it unusable for many open source and commercial projects.

SSPL is not OSI approved. ( https://opensource.org/node/1099 )

But the claim that SSPL-licensed software is unusable for many open source / commercial projects is not true. MongoDB, or any SSPL-licensed software, is totally usable as-is for 99% of projects. The only exception is where the database itself is provided as a service. It is usable even in that case too, but brings extra open sourcing requirements. That's all.

Claiming that MongoDB is not usable for "open source and commercial projects" is quite wrong IMO.

Please check.

(Also, thanks for the interesting project)

Support multi-sectional OP_MSGs

  • Remove Documents field.
  • Unexport Sections field.
  • Add Document method that could be used to implement lazy reading in the future.
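One possible reading of these three points is sketched below. In OP_MSG, a kind 0 section carries the single body document and a kind 1 section carries an identifier plus a sequence of documents. Merging each sequence into the body under its identifier is an assumption about what `Document` would return, and `map[string]any` is used only to keep the sketch short (it loses field order).

```go
package main

import (
	"errors"
	"fmt"
)

// section is one OP_MSG section.
type section struct {
	kind       byte   // 0: body, 1: document sequence
	identifier string // set only for kind 1
	documents  []map[string]any
}

// OpMsg keeps its sections unexported; callers go through Document.
type OpMsg struct {
	sections []section
}

// Document returns the body with every kind 1 sequence attached
// as an array under the sequence identifier.
func (m *OpMsg) Document() (map[string]any, error) {
	var body map[string]any

	for _, s := range m.sections {
		if s.kind != 0 {
			continue
		}

		if body != nil {
			return nil, errors.New("more than one kind 0 section")
		}

		if len(s.documents) != 1 {
			return nil, errors.New("kind 0 section must contain exactly one document")
		}

		body = make(map[string]any, len(s.documents[0]))
		for k, v := range s.documents[0] {
			body[k] = v
		}
	}

	if body == nil {
		return nil, errors.New("no kind 0 section")
	}

	for _, s := range m.sections {
		if s.kind != 1 {
			continue
		}

		if _, ok := body[s.identifier]; ok {
			return nil, fmt.Errorf("duplicate field %q", s.identifier)
		}

		docs := make([]any, len(s.documents))
		for i, d := range s.documents {
			docs[i] = d
		}

		body[s.identifier] = docs
	}

	return body, nil
}

func main() {
	msg := &OpMsg{sections: []section{
		{kind: 0, documents: []map[string]any{{"insert": "users", "$db": "test"}}},
		{kind: 1, identifier: "documents", documents: []map[string]any{{"_id": 1}, {"_id": 2}}},
	}}

	doc, err := msg.Document()
	fmt.Println(doc, err)
}
```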

Improve compatibility with real specific apps and libraries

Tasks

  1. 2 of 3
    not ready
  2. 0 of 2
    not ready
  3. 8 of 12
    not ready
  4. 9 of 12
    not ready
  5. 7 of 9
    not ready
  6. 5 of 7
    not ready
  7. 1 of 4
    not ready
    b1ron
  8. 8 of 23
    not ready
  9. 2 of 7
    not ready
    b1ron
  10. 8 of 18
    not ready
  11. 8 of 9
    not ready
  12. 7 of 10
    not ready
  13. 5 of 19
    not ready
    AlekSi
  14. 0 of 1
    not ready
  15. 0 of 1
    not ready
  16. 2 of 4
    not ready
  17. 1 of 1
    not ready
  18. 1 of 2
    not ready
    ferretdb-bot
  19. 9 of 13
    b1ron
  20. 0 of 8
    not ready
    b1ron
  21. 1 of 1
    not ready
  22. 0 of 1
    not ready
  23. 0 of 1
    not ready
  24. 0 of 2
    not ready
  25. 9 of 11
    not ready
  26. 0 of 3
    not ready
  27. 0 of 1
    not ready
  28. 0 of 1
    not ready
    ferretdb-bot
  29. 0 of 1
    not ready
    ferretdb-bot
  30. 0 of 1
    not ready
    ferretdb-bot
  31. not ready

Support capped collections and tailable cursors

https://www.mongodb.com/docs/manual/core/capped-collections/
https://www.mongodb.com/docs/manual/core/tailable-cursors/

Blocks #76.

Tasks

  1. area/diag backend/pg backend/sqlite code/chore not ready
    chilagrow
  2. code/chore
    b1ron
  3. area/admin area/oplog backend/sqlite code/chore community good first issue hacktoberfest
    Akhil-2001 AlekSi
  4. area/admin area/oplog backend/pg code/chore
    rumyantseva
  5. area/admin area/oplog code/feature community good first issue hacktoberfest
    adetunjii chilagrow
  6. area/cursors area/oplog code/chore
    noisersup
  7. area/diag code/chore community good first issue hacktoberfest
    Aditya1404Sal chilagrow
  8. area/oplog backend/pg backend/sqlite code/feature
    chilagrow rumyantseva
  9. area/oplog backend/pg backend/sqlite badly estimated code/feature
    chilagrow
  10. area/oplog code/feature
    noisersup
  11. code/chore
    b1ron
  12. area/oplog backend/sqlite code/chore
    rumyantseva
  13. area/cursors area/indexes area/oplog code/enhancement
    henvic
  14. area/diag code/feature good first issue
    b1ron
  15. area/oplog code/feature not ready
    noisersup
  16. area/oplog code/feature
    rumyantseva
  17. 2 of 2
    area/cursors code/chore
    noisersup
  18. 3 of 3
    area/cursors area/oplog code/feature not ready
    noisersup
  19. area/cursors code/chore
    noisersup
  20. area/cursors area/oplog code/enhancement not ready
    noisersup
  21. area/cursors area/oplog code/enhancement not ready
    noisersup

Pushdown more queries

Tasks

  1. code/feature code/tigris
    w84thesun
  2. code/chore
    noisersup
  3. code/feature
    noisersup
  4. code/feature
    noisersup
  5. code/enhancement
    rumyantseva
  6. code/chore
    noisersup
  7. code/feature
    noisersup
  8. code/enhancement
    chilagrow noisersup
  9. code/enhancement
    noisersup
  10. code/enhancement
    noisersup
  11. code/enhancement
    noisersup

Pushdown sorting too.

Support user management commands

FerretDB should manage users by itself and not rely on the backend to do that:

  • FerretDB should store credentials in the admin.system.users collection;
  • FerretDB should enforce authentication (check passed username and password against stored) itself;
  • FerretDB should not change the backend connection string (PostgreSQL URI, etc) it was started with;
  • FerretDB should not create users in the backend.

Among other things, this feature is required for some authentication mechanisms like SCRAM-SHA-256, which in turn is needed for some applications and drivers.
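For orientation, here is a sketch of the values a server would typically store for SCRAM-SHA-256, derived as RFC 5802 and RFC 7677 describe (salted password, StoredKey, ServerKey). The `credentials` type and the `checkPassword` helper are hypothetical. `checkPassword` models the PLAIN-style case where the server receives the password itself; a real SCRAM exchange verifies a client proof instead, and SASLprep normalization of the password is omitted here.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"crypto/subtle"
	"fmt"
)

// hmacSHA256 returns HMAC-SHA-256 of msg with the given key.
func hmacSHA256(key, msg []byte) []byte {
	m := hmac.New(sha256.New, key)
	m.Write(msg)

	return m.Sum(nil)
}

// hi implements Hi() from RFC 5802 (PBKDF2 with a single output block).
func hi(password, salt []byte, iterations int) []byte {
	u := hmacSHA256(password, append(append([]byte(nil), salt...), 0, 0, 0, 1))
	result := append([]byte(nil), u...)

	for i := 1; i < iterations; i++ {
		u = hmacSHA256(password, u)
		for j := range result {
			result[j] ^= u[j]
		}
	}

	return result
}

// credentials is a hypothetical shape for what could be stored per user.
type credentials struct {
	salt       []byte
	iterations int
	storedKey  []byte
	serverKey  []byte
}

// newCredentials derives the stored values; the password itself is not kept.
func newCredentials(password string, salt []byte, iterations int) credentials {
	salted := hi([]byte(password), salt, iterations)
	clientKey := hmacSHA256(salted, []byte("Client Key"))
	stored := sha256.Sum256(clientKey)

	return credentials{
		salt:       salt,
		iterations: iterations,
		storedKey:  stored[:],
		serverKey:  hmacSHA256(salted, []byte("Server Key")),
	}
}

// checkPassword re-derives the stored key from a presented password
// and compares it in constant time.
func (c credentials) checkPassword(password string) bool {
	other := newCredentials(password, c.salt, c.iterations)

	return subtle.ConstantTimeCompare(c.storedKey, other.storedKey) == 1
}

func main() {
	// A fixed salt keeps the example deterministic; real code would use crypto/rand.
	c := newCredentials("pencil", []byte("0123456789abcdef"), 4096)

	fmt.Println(c.checkPassword("pencil"), c.checkPassword("wrong"))
}
```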

Note that this issue covers authentication only; authorization is #174.

Links

SASL / SCRAM

PLAIN


Tasks

  1. area/auth code/feature
    henvic
  2. area/auth code/feature
    henvic
  3. area/auth code/feature
    henvic
  4. area/auth code/feature
    henvic
  5. area/auth code/chore
    henvic
  6. area/auth code/chore not ready
  7. area/auth code/feature
    henvic
  8. code/chore
    b1ron
  9. area/auth code/chore
    b1ron henvic
  10. area/auth code/feature not ready
    henvic
  11. area/auth code/chore not ready
    chilagrow henvic
  12. area/auth code/chore
    AlekSi
  13. code/chore not ready
    chilagrow
  14. area/auth backend/sqlite code/feature not ready
    ferretdb-bot
  15. area/auth code/chore
    chilagrow henvic
  16. code/chore not ready
    ferretdb-bot
  17. area/auth code/chore not ready
    ferretdb-bot
  • Update listCommands to include requiresAuth.
  • Update connectionStatus to include authInfo.

Build Docker images for git tags (releases) on CI

We already build Docker images on CI for PRs and pushes to main, but not for tags.

See also #323.

Support transaction commands

Support transactions and the following commands:

  • abortTransaction
  • commitTransaction

Implicit transactions for inserts are covered by #200. See also #866.

Blocked by #153.

Please vote for that issue if you use MongoDB transactions.

Tasks

  1. code/chore
    b1ron

Test with CockroachDB

We should run our tests (both locally and on CI) with CockroachDB. There is no separate handler for it; instead, a regular pg handler should be used. We start CockroachDB side-by-side with PostgreSQL (and Tigris); we don't try to replace the latter with the former.

Our changes in Taskfile(s) and CI configuration should be more or less copy&pasted. We might want to do small refactoring, but that's not required. We will revisit that at #85.

But changes in the envtool, if any, should be good; its refactoring is in scope.

DoD:

  • tests could be run against CockroachDB both locally (with task) and on CI;
  • it should be possible to run FerretDB with CockroachDB with task run-cockroach;
  • if we see that CockroachDB is not fully compatible, let's document what doesn't work.

Improve query performance

There is room for query performance improvements.

Some of them will not require us to break strict compatibility with MongoDB. For example, we could move document sorting to a stored procedure that would do everything on the PostgreSQL side but (with a lot of logic and tests) can still be fully compatible with MongoDB. Or we could write a custom PostgreSQL operator class / data type / extension to do more work on the backend side.

Other performance improvements may require us to break strict compatibility while keeping query results sane (and even saner than MongoDB). For example, we could push down ORDER BY query part as is. It will not sort some things the same way as MongoDB, but maybe that's fine.

All that functionality should be under testing flags until the next major version.

Tasks

  1. code/chore deps
    AlekSi craigpastro
  2. code/chore good first issue
    craigpastro rumyantseva
  3. code/chore
    noisersup
  4. code/feature
    b1ron

Windows Instructions

Forgive me for being that guy, I know we have unix and mac covered, what about windows? 😄

Support array querying

It should be possible to Query an Array. Research dot notation support for arrays. Add more tests.

As a result of this task we should also have a better estimation where dot notation is supported and where it isn't.
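The basic equality semantics for array fields can be sketched as follows: a query value matches either the whole field value (for arrays, same elements in the same order) or any single element of the array. The function name `matchesEqual` is hypothetical, and the sketch leaves out dot notation and cross-type number comparison.

```go
package main

import (
	"fmt"
	"reflect"
)

// matchesEqual reports whether a field value satisfies an equality query.
// For array fields, the query matches the whole array exactly
// or any one of its elements.
func matchesEqual(fieldValue, query any) bool {
	if reflect.DeepEqual(fieldValue, query) {
		return true
	}

	arr, ok := fieldValue.([]any)
	if !ok {
		return false
	}

	for _, elem := range arr {
		if reflect.DeepEqual(elem, query) {
			return true
		}
	}

	return false
}

func main() {
	tags := []any{"red", "blank"}

	fmt.Println(matchesEqual(tags, "red"))                   // element match
	fmt.Println(matchesEqual(tags, []any{"red", "blank"}))   // exact array match
	fmt.Println(matchesEqual(tags, []any{"blank", "red"}))   // order differs
	fmt.Println(matchesEqual("red", "red"))                  // scalar field
}
```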

Duplicate keys in JSON and fuzzing problem

dvyukov/go-fuzz-corpus#3
golang/go#14750
golang/go#48298

Something like that might be needed in the future:

// check if j contains objects with duplicate keys that are not handled by encoding/json yet:
// * https://github.com/dvyukov/go-fuzz-corpus/issues/3
// * https://github.com/golang/go/issues/14750
// * https://github.com/golang/go/issues/48298
{
	var o1 interface{}
	if err := json.Unmarshal([]byte(j), &o1); err != nil {
		t.Skip(err)
	}

	b, err := json.Marshal(o1)
	if err != nil {
		t.Skip(err)
	}

	var o2 interface{}
	if err := json.Unmarshal(b, &o2); err != nil {
		t.Skip(err)
	}

	if !reflect.DeepEqual(o1, o2) {
		t.Skip("j contains objects with duplicate keys that are not handled by encoding/json yet")
	}
}

Test coverage?

Hi,

Great project. I have heard that the MongoDB team has a test suite that is also used by the Amazon and Azure teams to test their coverage of MongoDB features.

Have you tested this and what is the current coverage of the feature set?
