Chain Index questions and improvement proposals about plutus-apps HOT 6 CLOSED

input-output-hk commented on May 3, 2024

Chain Index questions and improvement proposals

from plutus-apps.

Comments (6)

j-mueller commented on May 3, 2024 1

Thanks for this list. The points are very sensible and it would be great to work together on this. The question is just what order should we do the items in, so that we don't do the same work twice.

At IOG we are currently looking at the resource usage (touching on item 9). It would probably make sense for you to start with error handling (2). Then you could look at the way scripts are stored (4) - maybe we just need a new column in the scripts table to record the type of script. If you're ok with that then we should open a new github issue for each item and track the work there.

Some comments on individual points:

> get all transactions of a script in a single query

One of the design goals of the chain index is to only store an amount of data proportional to the ledger state, not to the size of the blockchain. We achieve this by a kind of reference counting on the UTXO set - we only store data of transactions that still have at least one unspent output in the UTXO set. If all outputs of a transaction are spent then it will eventually be deleted by the garbage collect function. Because of this, the query would only give you all transactions of a script that still have unspent outputs in the UTXO set.
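To make the reference-counting idea concrete, here is a minimal sketch. It is not the chain index implementation: `Tx`, `OutRef`, `Index`, `applyTx` and `collectGarbage` are hypothetical, simplified names used only for illustration.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical, simplified stand-ins; not the plutus-apps types.
type TxId = String

data OutRef = OutRef { refTx :: TxId, refIdx :: Int }
  deriving (Eq, Ord, Show)

data Tx = Tx
  { txId      :: TxId
  , txInputs  :: [OutRef]  -- outputs this transaction spends
  , txNumOuts :: Int       -- number of outputs it creates
  } deriving (Show)

data Index = Index
  { utxo   :: Set.Set OutRef   -- live (unspent) outputs
  , stored :: Map.Map TxId Tx  -- transaction data kept in the database
  } deriving (Show)

emptyIndex :: Index
emptyIndex = Index Set.empty Map.empty

-- Apply a transaction: spend its inputs, add its outputs, store its data.
applyTx :: Tx -> Index -> Index
applyTx tx (Index u s) = Index u' (Map.insert (txId tx) tx s)
  where
    spent   = Set.fromList (txInputs tx)
    created = Set.fromList [OutRef (txId tx) i | i <- [0 .. txNumOuts tx - 1]]
    u'      = Set.union created (Set.difference u spent)

-- Drop every stored transaction that has no output left in the UTXO set.
collectGarbage :: Index -> Index
collectGarbage (Index u s) = Index u (Map.filterWithKey hasLiveOutput s)
  where
    live = Set.map refTx u
    hasLiveOutput tid _ = Set.member tid live
```

In this sketch, a transaction whose only output is spent by a later transaction disappears from `stored` after `collectGarbage`, so a per-script query over `stored` can only ever see transactions with live outputs.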

> Why are transactions (ChainIndexTx) stored in full in the database?

Generally, disk space is cheap (esp. when considering that the space needed by the chain index is proportional to the ledger state, not to the length of the blockchain). And being able to look up a ChainIndexTx from a given TxOutRef is quite useful.

The _citxCardanoTx field of the ChainIndexTx is an escape hatch, in case that a dapp developer is interested in data that's present in the transaction but not currently stored in the chain index database. One example of this would be transaction metadata. It would be fine to add a configuration flag to decide whether to store the full transaction in _citxCardanoTx.
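A rough sketch of what such a flag could look like, assuming `_citxCardanoTx` is an optional field. The types are simplified stand-ins, and `StoreConfig`, `storeFullTx` and `prepareForStorage` are hypothetical names, not existing chain index options.

```haskell
-- Hypothetical, simplified types; the real ChainIndexTx has more fields.
newtype CardanoTx = CardanoTx String
  deriving (Eq, Show)

data ChainIndexTx = ChainIndexTx
  { _citxTxId      :: String
  , _citxCardanoTx :: Maybe CardanoTx  -- assumed optional for this sketch
  } deriving (Eq, Show)

-- Hypothetical configuration flag; no such option is confirmed to exist.
newtype StoreConfig = StoreConfig { storeFullTx :: Bool }

-- Strip the full transaction before writing to the database
-- unless the flag asks to keep it.
prepareForStorage :: StoreConfig -> ChainIndexTx -> ChainIndexTx
prepareForStorage cfg tx
  | storeFullTx cfg = tx
  | otherwise       = tx { _citxCardanoTx = Nothing }
```

With the flag off, everything else in the record is stored as before and only the escape hatch is dropped.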

> Add an extension/customization interface for developers to structure their caches/UTXO pools for faster queries and UTXO discovery

Yes I think that's the logical next step for the chain index. We need to think about how best to do it though - it would be nice if we could re-use the FingerTree and beam code as much as possible. Happy to discuss your idea in more detail in a separate issue.

> every performance improvement is still significant

Agree, and we're actively working on improving this.

sjoerdvisscher commented on May 3, 2024 1

@kk-hainq I think item 1 and in particular Swagger support would be really helpful!

kk-hainq commented on May 3, 2024

> Thanks for this list. The points are very sensible and it would be great to work together on this. The question is just what order should we do the items in, so that we don't do the same work twice.

Thank you for the reply. It would be our pleasure to contribute!

> At IOG we are currently looking at the resource usage (touching on item 9). It would probably make sense for you to start with error handling (2). Then you could look at the way scripts are stored (4) - maybe we just need a new column in the scripts table to record the type of script. If you're ok with that then we should open a new github issue for each item and track the work there.

I agree that (2) and (4) would make a sensible start. At the other end, we can leave (1), (3), and (10) behind until things are stable enough to avoid constant updates. (7) and (8) can be further broken down as well.

We'll open two new issues for (2) and (4) tomorrow, and hopefully we can close them by the end of the week. Then we'll start working on user configurations and customized behaviors next week. Just tell us if there is anything more urgent that needs a helping hand.

> One of the design goals of the chain index is to only store an amount of data proportional to the ledger state, not to the size of the blockchain. We achieve this by a kind of reference counting on the UTXO set - we only store data of transactions that still have at least one unspent output in the UTXO set. If all outputs of a transaction are spent then it will eventually be deleted by the garbage collect function. Because of this, the query would only give you all transactions of a script that still have unspent outputs in the UTXO set.

I agree that we need to be very conscious about this and that the current design makes sense. I just wonder if we could make it more configurable, given that both our security work and our dApp work need historical data (most apps I work with require several stats endpoints). It would be super useful if:

- The chain index can be configured to only store and scale with preset scripts. These can be a dApp's own scripts and its competitors'. The rest of the blockchain might as well be irrelevant and hence dropped.
- The dApp can define what is "garbage" to be collected. For example, to only store and track transactions to the preset scripts in the last 12 months.
- The dApp can extend the chain index to fold data. This allows the chain index to cheaply maintain and rapidly serve statistical data without actually storing historical data.
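To illustrate the three ideas above, here is a small sketch under stated assumptions. All names (`ScriptTx`, `Stats`, `tracked`, `isGarbage`, `foldStats`) are hypothetical, and the retention window is expressed in slots purely for simplicity.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set
import Data.List (foldl')

type ScriptHash = String

-- Hypothetical event: one transaction paying to a script address.
data ScriptTx = ScriptTx
  { stScript :: ScriptHash
  , stSlot   :: Integer  -- slot in which the transaction was included
  , stValue  :: Integer  -- lovelace sent to the script by this transaction
  } deriving (Show)

-- Running statistics kept per script instead of the transactions themselves.
data Stats = Stats { txCount :: !Int, totalValue :: !Integer }
  deriving (Eq, Show)

-- Only scripts in the preset set are tracked at all.
tracked :: Set.Set ScriptHash -> ScriptTx -> Bool
tracked presets tx = Set.member (stScript tx) presets

-- dApp-defined garbage rule: anything older than the retention window.
isGarbage :: Integer -> Integer -> ScriptTx -> Bool
isGarbage window currentSlot tx = stSlot tx < currentSlot - window

-- Fold one transaction into the per-script statistics.
step :: Map.Map ScriptHash Stats -> ScriptTx -> Map.Map ScriptHash Stats
step acc tx = Map.insertWith merge (stScript tx) (Stats 1 (stValue tx)) acc
  where
    merge (Stats c1 v1) (Stats c2 v2) = Stats (c1 + c2) (v1 + v2)

-- Aggregate statistics for the preset scripts only.
foldStats :: Set.Set ScriptHash -> [ScriptTx] -> Map.Map ScriptHash Stats
foldStats presets = foldl' step Map.empty . filter (tracked presets)
```

The point of the fold is that `Stats` stays constant-size per script, so serving a stats endpoint would not require keeping spent transactions around.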

But yeah, we could say that these are nice-to-haves. The priority should always be for efficient UTXO discovery, which indeed focuses on the live UTXO set.

> Generally, disk space is cheap (esp. when considering that the space needed by the chain index is proportional to the ledger state, not to the length of the blockchain). And being able to look up a ChainIndexTx from a given TxOutRef is quite useful.

True, I agree with both. In that case, I guess it is worth adding a few more columns to make the database more queryable in general? We'll keep integrating the chain index with our work to find more common use cases.

> The _citxCardanoTx field of the ChainIndexTx is an escape hatch, in case that a dapp developer is interested in data that's present in the transaction but not currently stored in the chain index database. One example of this would be transaction metadata. It would be fine to add a configuration flag to decide whether to store the full transaction in _citxCardanoTx.

It should also simplify a lot of related code, which is indeed nice. As we scale, we would love to fine-tune this to make sure every field of a stored transaction is useful, which helps not only with storage but also with processing speed. But yeah, this should just be a nice-to-have that can wait.

> Yes I think that's the logical next step for the chain index. We need to think about how best to do it though - it would be nice if we could re-use the FingerTree and beam code as much as possible. Happy to discuss your idea in more detail in a separate issue.

Yeah, I think we should keep most of the chain index intact even for customization. Customizing the DB schema and effect structure would be very helpful, but we can always start with callback functions and a lot of ... -> Bool-like functions for the dApp to configure. For straightforward configurations, we should just embed them in the config file. We'll open a dedicated issue for this by next week.
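As a starting point for that discussion, the callback and predicate approach could look roughly like this. It is only a sketch: `Hooks`, `shouldStore`, `onStored`, `onlyScript` and `processTxs` are hypothetical names, and `Tx` is a simplified stand-in rather than the real ChainIndexTx.

```haskell
-- Hypothetical, simplified transaction type; not the real ChainIndexTx.
data Tx = Tx
  { txId      :: String
  , txScripts :: [String]  -- hashes of the scripts the transaction touches
  } deriving (Show)

-- Hypothetical hook record a dApp could pass to the chain index.
data Hooks = Hooks
  { shouldStore :: Tx -> Bool   -- keep this transaction at all?
  , onStored    :: Tx -> IO ()  -- callback run after a transaction is stored
  }

-- Default behaviour: store everything, do nothing extra.
defaultHooks :: Hooks
defaultHooks = Hooks { shouldStore = const True, onStored = const (pure ()) }

-- Example customization: only track transactions touching one script.
onlyScript :: String -> Hooks
onlyScript h = defaultHooks { shouldStore = elem h . txScripts }

-- Process a batch, returning the transactions that were stored.
processTxs :: Hooks -> [Tx] -> IO [Tx]
processTxs hooks txs = do
  let kept = filter (shouldStore hooks) txs
  mapM_ (onStored hooks) kept
  pure kept
```

A dApp would start from `defaultHooks` and override only the fields it cares about, which keeps the rest of the chain index untouched.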

kk-hainq commented on May 3, 2024

@j-mueller @silky @sjoerdvisscher Sorry for the random tagging but what should we prioritize next? We have two drafts in #71, #72 that can proceed if you think they make sense. We are ready to work on anything well-scoped enough too.

kk-hainq commented on May 3, 2024

@sjoerdvisscher Sure, I'll try to get the Swagger thing done this week.

kk-hainq commented on May 3, 2024

While many of these issues are still relevant, we have since taken a different direction, which makes this content outdated. We'll just open more issues when needed.
