As mentioned in the SPEC for TicKV, one huge limitation of TicKV is Fragmentation. Thi

TicKV Fragmentation about tock HOT 4 OPEN

AndrewImwalle commented on May 28, 2024

TicKV Fragmentation

from tock.

Comments (4)

alistair23 commented on May 28, 2024

Thanks for raising this.

You are correct that fragmentation is an issue. It's a hard problem to solve though. Two of the key goals of TicKV are:

 * Power loss resilient
 * Maintain data integrity and detect media errors

So implementing a garbage collection is difficult as we want to ensure that no data is lost on power loss.

The first solution I would like to propose would be a per-region operation.
* Called when a key is invalidated/zeroized.

I don't think this is a good idea. This would add over head to removing keys, which the user doesn't necessarily want. If we do add a defrag option it should be something that is explicitly called.

* Checks how much space of the region would be recovered by performing the operation.
  This would likely be a % of the region.

* Valid keys that are inside the region would be appended to nearby regions.
  This would be done as it would if the current region was already full.

* After all valid keys are appended elsewhere, the region is erased.

Do we move the keys back? If not that increases lookup time for future gets. If we do move them back we are fragmenting other parts.

The benefit of this solution is that it would maintain uniform wear in the regions and would not require a reserved region. The drawback is that the keys will exist outside their original regions, which would raise the lookup time. Additionally, I am unsure as to what threshold would be best. I would appreciate feedback on what the optimal threshold would be.

Exactly!

The threshold really is going to be application specific. We could make it an option, but that seems confusing. It might be best to just defrag the first region that can be and then return. Allowing an application to call the function until all regions have been searched. That way applications can do it in the background when required, without taking the entire overhead of going through all of flash.

from tock.

alistair23 commented on May 28, 2024

The second solution I would like to propose would be a part-wide operation. This one is similar to the idea listed in the SPEC, with a few optimizations to limit the amount of wear on the chip.
* Move the valid keys across multiple regions to the reserved region

* When the reserved region is full, erase the regions that have their valid keys in the reserved region

* Move the valid objects back to their original regions

* Erase the reserved region
This still ensures that we don’t lose data on power loss, but still has the same problems as the solution in the SPEC. There would have to be a garbage collecting state to restore after power loss, and the requirement on wear leveling would still be broken on the reserved region. I also don’t know when this could be called.

That's a good improvement but still lots of drawbacks

from tock.

alistair23 commented on May 28, 2024

I don't have any other solutions.

Is fragmentation an issue for you? If it's blocking a use case of yours then we should focus on a solution. I would lean towards your first option. One big worry is what happens if we loose power after copying the keys. We end up with two copies of the data on resume.

I do think having a defrag function that is specifically called and only runs on one or a small number of regions is the way to go. That way we don't block flash accesses for long while doing this.

Either way we somehow need to track state to resume the operation if we get interrupted. That might be easier with a reserved region

from tock.

AndrewImwalle commented on May 28, 2024

Fragmentation is a blocking use case for me.

I am going to assume that we would utilize parts of my first solution proposal, but applying it to one region and allowing it to be explicitly called. For power loss, setting the state and tracking the region that we are defragmenting to restart/resume the process would prevent copying the key as trying to append a key that exists would return an error code KeyAlreadyExists. Similar functionality for region tracking is built into the garbage collect with the State and Rubbish State.

The threshold will have to be application specific, but we will need a threshold else we could end up calling this function every time there in an invalidate/zeroise key which would cause unnecessary wear on the part. This threshold would just be a % of the region that is able to be recovered.

from tock.

TicKV Fragmentation about tock HOT 4 OPEN

Comments (4)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent