async-append-only-log's People

Contributors

arj03, barbarrosa, cryptix, kylemaas, staltz

async-append-only-log's Issues

Compaction

This is going to be a moving target, as I explore solutions for this.

Basic premise is: once there have been a couple of deletes (i.e. overwrites with zero bytes), we can produce a new log that has all the real records, minus the zero bytes.

The algorithm is an idea Dominic wrote here: ssbc/ssb-db2#306 (comment)


  • log.bipf.compaction as WAL existing only during compactions
  • firstUncompactedBlock: first block that may have holes and needs compaction
  • firstUnshiftedRecord: first record that has still not been "moved left" during compaction
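
The basic premise can be illustrated with a minimal in-memory sketch. This is not the library's implementation: it ignores blocks, the WAL file, and in-place shifting. It assumes each record is a uint16LE length header followed by data, that a deleted record keeps its header but has all-zero data, and that a zero header marks the end of the used space. `makeRecord`, `isZero`, and `compact` are hypothetical names.

```javascript
const HEADER_SIZE = 2 // uint16LE data length (assumed record layout)

// Hypothetical helper: build one record from a data buffer.
function makeRecord(dataBuf) {
  const rec = Buffer.alloc(HEADER_SIZE + dataBuf.length)
  rec.writeUInt16LE(dataBuf.length, 0)
  dataBuf.copy(rec, HEADER_SIZE)
  return rec
}

// True if every byte in buf[start, end) is zero.
function isZero(buf, start, end) {
  for (let i = start; i < end; i++) if (buf[i] !== 0) return false
  return true
}

// Copy every live record into a fresh buffer, skipping deleted (zeroed) ones.
function compact(logBuf) {
  const kept = []
  let offset = 0
  while (offset + HEADER_SIZE <= logBuf.length) {
    const dataLength = logBuf.readUInt16LE(offset)
    if (dataLength === 0) break // assumed end-of-data marker
    const end = offset + HEADER_SIZE + dataLength
    if (end > logBuf.length) break // truncated record, stop
    if (!isZero(logBuf, offset + HEADER_SIZE, end)) {
      kept.push(logBuf.slice(offset, end))
    }
    offset = end
  }
  return Buffer.concat(kept)
}

const a = makeRecord(Buffer.from('aaa'))
const b = makeRecord(Buffer.from('bbbb'))
const c = makeRecord(Buffer.from('cc'))
b.fill(0, HEADER_SIZE) // "delete" b: keep the header, zero the data
const compacted = compact(Buffer.concat([a, b, c]))
```

Here `compacted` contains only records `a` and `c`, back to back. A real compaction would also have to survive a crash halfway through, which is presumably what the state listed above is for.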

Improve performance of isBufferZero

I had an idea to improve the performance of this.

If we end up "merging" deleted records, we'll have huge records, and it'll be slow to check whether, say, the next 50 MB are all zero bytes.

Instead, we can accept an option to customize isBufferZero, similar to how opts.validateRecord is passed in. That way ssb-db2 could pass in a custom isBufferZero that only checks whether the first 10 bytes are zero, because I suspect we never have an ssb-db2 record in the log whose first 10 bytes are zero and whose remaining bytes are non-zero.
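
A rough sketch of what such a custom check could look like, next to a full scan for comparison. `makePrefixZeroCheck` is a hypothetical name, and the option that would receive it does not exist yet; this only illustrates the idea.

```javascript
// Full scan: cost grows with the size of the (possibly merged) deleted record.
function isBufferZero(buf) {
  for (let i = 0; i < buf.length; i++) if (buf[i] !== 0) return false
  return true
}

// Hypothetical factory: only inspect the first `prefixLength` bytes.
function makePrefixZeroCheck(prefixLength) {
  return function isBufferZeroPrefix(buf) {
    const end = Math.min(prefixLength, buf.length)
    for (let i = 0; i < end; i++) if (buf[i] !== 0) return false
    return true
  }
}

const quickCheck = makePrefixZeroCheck(10)
const deleted = Buffer.alloc(1024) // all zero, like a deleted record
const live = Buffer.from('{"key":"%abc"}') // starts with non-zero bytes

// A buffer the prefix check gets wrong: zero prefix, non-zero tail.
const tricky = Buffer.alloc(32)
tricky[20] = 1
```

The trade-off is visible with `tricky`: `quickCheck(tricky)` returns true while `isBufferZero(tricky)` returns false, so the prefix check is only safe if the caller can guarantee that no live record starts with that many zero bytes.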

Cannot read property 'slice' of undefined

I got this in manyverse:

01-04 19:57:01.221  7006  7063 E NODEJS-MOBILE: TypeError: Cannot read property 'slice' of undefined
01-04 19:57:01.221  7006  7063 E NODEJS-MOBILE:     at Request._callback (/data/data/se.manyver/files/nodejs-project/index.js:41590:68)
01-04 19:57:01.221  7006  7063 E NODEJS-MOBILE:     at Request.Trwq+KqeWzxsbUet+Omb+wC9uH2QMBcZNBUiXOCqCv4=.Request.callback (/data/data/se.manyver/files/nodejs-project/index.js:45141:8)
01-04 19:57:01.221  7006  7063 E NODEJS-MOBILE:     at onwrite (/data/data/se.manyver/files/nodejs-project/index.js:75540:31)
01-04 19:57:01.221  7006  7063 E NODEJS-MOBILE:     at FSReqCallback.wrapper [as oncomplete] (fs.js:615:5)

https://github.com/ssb-ngi-pointer/async-append-only-log/blob/8d6626831cd0c61ab97c45d31a0a4717f2723f26/index.js#L280

Compaction state written even if not used

I was checking out the latest db2 changes in the browser and things are not working, because the compaction module has a few things that need fixing first. The first is stateFileExists(), which uses fs directly. I was working around that; it requires the function to be async, but that seems to be okay. Then I ran into another problem: writeUInt32LE is not available (the call stack is save() and stop()). This seems to be because browserify is using an old version of the buffer module. In any case, I was wondering why it was even trying to write this file in the first place. I haven't used the API, so I didn't expect this to do anything. Maybe it would be good to change it so that it doesn't write the file if the file doesn't already exist? Just checking my assumptions about this before making changes :)

RangeError [ERR_OUT_OF_RANGE]

I was doing some not-so-careful experiments, so I don't remember the reproduction steps perfectly, but I do know that I killed the app and then resumed the db2 migration a couple of times. Then this happened suddenly:

RangeError [ERR_OUT_OF_RANGE]: The value of "offset" is out of range. It must be >= 0 and <= 65534. Received 80228
    at boundsError (internal/buffer.js:81:9)
    at Buffer.readUInt16LE (internal/buffer.js:238:5)
    at Object.getDataNextOffset (/data/data/se.manyver/files/nodejs-project/index.js:25310:31)
    at Stream.SyceLWe5Nm2mmym/v2ltZ2hAhKqlddxQJQd6z2avDTE=.Stream._handleBlock (/data/data/se.manyver/files/nodejs-project/index.js:41959:40)
    at /data/data/se.manyver/files/nodejs-project/index.js:42027:26
    at Request._callback (/data/data/se.manyver/files/nodejs-project/index.js:25269:9)
    at Request.Trwq+KqeWzxsbUet+Omb+wC9uH2QMBcZNBUiXOCqCv4=.Request.callback (/data/data/se.manyver/files/nodejs-project/index.js:42692:8)
    at onread (/data/data/se.manyver/files/nodejs-project/index.js:27150:31)
    at FSReqCallback.wrapper [as oncomplete] (fs.js:520:5) {
  code: 'ERR_OUT_OF_RANGE'
}

Make multiprocess-safe

Right now if two processes open the same log file and are both able to write, they can garble the log by writing over each others' data due to each using a write offset held in non-shared RAM.

Looking at the code, I'm not even sure how you'd fix this. I'll have to study this some more. It'd be good to at least document this behavior so people know not to do it. But it'd sure be nice if it had some method of actually fixing this.
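
One common way to at least refuse a second writer, sketched here as an assumption rather than anything the library does, is an exclusive lock file created next to the log. `acquireLock` and the `.lock` suffix are hypothetical; the only real API relied on is the `'wx'` flag of `fs.openSync`, which fails with `EEXIST` if the file already exists.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Hypothetical helper: take an exclusive lock next to the log file.
// Returns a release function, or null if another writer holds the lock.
function acquireLock(logPath) {
  const lockPath = logPath + '.lock'
  let fd
  try {
    fd = fs.openSync(lockPath, 'wx') // fails if the lock file already exists
  } catch (err) {
    if (err.code === 'EEXIST') return null
    throw err
  }
  fs.writeSync(fd, String(process.pid)) // record the owner for debugging
  fs.closeSync(fd)
  return function release() {
    fs.unlinkSync(lockPath)
  }
}

const logPath = path.join(os.tmpdir(), `aaol-lock-demo-${process.pid}.log`)
const release = acquireLock(logPath)
const second = acquireLock(logPath) // null while the first lock is held
release()
```

This does not make concurrent writes safe; it only turns silent corruption into an explicit failure to open. A process that crashes without calling `release()` leaves a stale lock behind, which a real implementation would need to detect.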

RangeError when reading record data length

I got this crash report from Manyverse. I still don't know what the reproduction steps are, but reporting all the details I can, maybe it helps figure this out.

Versions

  • App: Manyverse 0.2210.3
  • OS: iOS 21.6.0 (is this an actual iOS version? that's what the crash claims)
  • Node runtime: 12.19.0
  • ssb-db2: 4.2.1
  • jitdb: 7.0.5
  • async-append-only-log: 4.3.7
  • random-access-file: 2.2.1

Resources

  • RAM: 3.6 GiB of which only 146.6 MiB free
  • Processor count: 6
  • Architecture: arm64

Code block

const HEADER_SIZE = 2 // uint16
function size(dataBuf) {
  return HEADER_SIZE + dataBuf.length
}
function readDataLength(blockBuf, offsetInBlock) {
  return blockBuf.readUInt16LE(offsetInBlock) //             <-----------------------
}
function readSize(blockBuf, offsetInBlock) {
  const dataLength = readDataLength(blockBuf, offsetInBlock)
  return HEADER_SIZE + dataLength
}

Stack trace

RangeError: The value of "offset" is out of range. It must be >= 0 and <= 65534. Received 95084
  File "internal/buffer.js", line 81, col 9, in boundsError
  File "internal/buffer.js", line 238, col 5, in Buffer.readUInt16LE
  File "nodejs-project/index.js", line 109582, col 19, in Object.readDataLength
  File "nodejs-project/index.js", line 111380, col 31, in Object.getDataNextOffset
  File "nodejs-project/index.js", line 47642, col 44, in Stream._handleBlock
  File "nodejs-project/index.js", line 47701, col 27, in Stream._resumeCallback
  File "nodejs-project/index.js", line 111353, col 9, in Request.onRAFReadDone [as _callback]
  File "nodejs-project/index.js", line 138474, col 8, in Request.callback
  File "nodejs-project/index.js", line 24593, col 31, in onread
  File "fs.js", line 520, col 5, in FSReqCallback.wrapper [as oncomplete]
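
The upper bound of 65534 in the message is what readUInt16LE allows on a 65536-byte buffer (length minus 2), so the offsetInBlock passed in pointed past the end of the block. A guarded variant of readDataLength might look like the sketch below. `readDataLengthChecked` is a hypothetical name, the 64 KiB block size is inferred from the error message, and returning -1 is just one possible way to signal the problem; none of this addresses how the offset became invalid in the first place.

```javascript
const HEADER_SIZE = 2 // uint16

// Hypothetical guarded variant: returns -1 instead of throwing when the
// header would lie outside the block buffer.
function readDataLengthChecked(blockBuf, offsetInBlock) {
  if (
    !Number.isInteger(offsetInBlock) ||
    offsetInBlock < 0 ||
    offsetInBlock + HEADER_SIZE > blockBuf.length
  ) {
    return -1
  }
  return blockBuf.readUInt16LE(offsetInBlock)
}

const blockBuf = Buffer.alloc(65536) // assumed 64 KiB block
blockBuf.writeUInt16LE(300, 0)
const ok = readDataLengthChecked(blockBuf, 0) // valid header at offset 0
const bad = readDataLengthChecked(blockBuf, 95084) // offset from the crash
const nan = readDataLengthChecked(blockBuf, NaN) // non-integer offset
```

With this guard, the caller can turn the condition into an error passed to a callback rather than an uncaught RangeError that takes down the app.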

Crash: RangeError [ERR_OUT_OF_RANGE]

Got this crash report from Manyverse, so it means async-append-only-log version 3.0.1

App: se.manyver 0.2101.5-beta-googlePlay (95)
Device: Xiaomi Redmi Note 4 (arm64-v8a | armeabi-v7a | armeabi)
OS: Android 9 (SDK 28)
User comment: the app just crashed

The value of "offset" is out of range. It must be an integer. Received NaN
RangeError [ERR_OUT_OF_RANGE]: The value of "offset" is out of range. It must be an integer. Received NaN
    at boundsError (internal/buffer.js:75:11)
    at Buffer.readUInt16LE (internal/buffer.js:238:5)
    at getData (/data/data/se.manyver/files/nodejs-project/index.js:63060:25)
    at /data/data/se.manyver/files/nodejs-project/index.js:63075:7
    at Request._callback (/data/data/se.manyver/files/nodejs-project/index.js:63054:9)
    at Request.Trwq+KqeWzxsbUet+Omb+wC9uH2QMBcZNBUiXOCqCv4=.Request.callback (/data/data/se.manyver/files/nodejs-project/index.js:44790:8)
    at onread (/data/data/se.manyver/files/nodejs-project/index.js:75871:31)
    at FSReqCallback.wrapper [as oncomplete] (fs.js:520:5)

Manually interpreting the modules in the stack trace:

The value of "offset" is out of range. It must be an integer. Received NaN
RangeError [ERR_OUT_OF_RANGE]: The value of "offset" is out of range. It must be an integer. Received NaN
    at boundsError (internal/buffer.js:75:11)
    at Buffer.readUInt16LE (internal/buffer.js:238:5)
    at getData (async-append-only-log)
    at get (async-append-only-log)
    at Request._callback (raf.read callback in async-append-only-log)
    at Request.prototype.callback (random-access-storage)
    at RandomAccessFile.prototype._read callback onread (random-access-file)
    at FSReqCallback.wrapper [as oncomplete] (fs.js:520:5)

Update: And same crash for an entirely different user.

App: se.manyver 0.2101.5-beta-googlePlay (95)
Device: Google Pixel 3a (arm64-v8a | armeabi-v7a | armeabi)
OS: Android 11 (SDK 30)
User comment: N/A


And another

App: se.manyver 0.2101.5-beta-googlePlay (95)
Device: Google Pixel 5 (arm64-v8a | armeabi-v7a | armeabi)
OS: Android 11 (SDK 30)
User comment: crashes on startup every time

NaN RangeError when deleting records

Similar to #89 but seems different enough to be its own issue

Versions

  • Manyverse desktop on master branch today (2022-10-24), maybe reproducible on 0.2210.3
  • OS: Linux
  • Node runtime: 12.19.0
  • ssb-db2: 6.2.5
  • jitdb: 7.0.5
  • async-append-only-log: 4.3.7
  • random-access-file: 2.2.1

Reproduction

Block someone and wait for Manyverse to use ssb-friends-purge to call db2 deleteFeed() and, in turn, AAOL del().

Stack trace

RangeError [ERR_OUT_OF_RANGE]: The value of "position" is out of range. It must be an integer. Received NaN
    at Object.read (node:fs:653:3)
    at RandomAccessFile._read (/home/staltz/oss/manyverse/desktop/index.js:20533:6)
    at Request._run (/home/staltz/oss/manyverse/desktop/index.js:114867:40)
    at RandomAccessFile.RandomAccess.run (/home/staltz/oss/manyverse/desktop/index.js:114775:12)
    at RandomAccessFile.RandomAccess.read (/home/staltz/oss/manyverse/desktop/index.js:114720:8)
    at getBlock (/home/staltz/oss/manyverse/desktop/index.js:93605:11)
    at del (/home/staltz/oss/manyverse/desktop/index.js:93665:7)
    at Object.waitForLogLoaded [as del] (/home/staltz/oss/manyverse/desktop/index.js:93984:12)
    at Timeout._onTimeout (/home/staltz/oss/manyverse/desktop/index.js:5522:38)
    at listOnTimeout (node:internal/timers:557:17)

Stack trace expanded

(not an actual git diff, I'm just using diff syntax to show the line that throws)

1 random-access-file

 RandomAccessFile.prototype._read = function (req) {
   var self = this
   var data = req.data || this._alloc(req.size)
   var fd = this.fd
 
   if (!req.size) return process.nextTick(readEmpty, req)
-  fs.read(fd, data, 0, req.size, req.offset, onread)
 
   function onread (err, read) {
     if (err) return req.callback(err)
     if (!read) return req.callback(createReadError(self.filename, req.offset, req.size))
 
     req.size -= read
     req.offset += read
 
     if (!req.size) return req.callback(null, data)
     fs.read(fd, data, data.length - req.size, req.size, req.offset, onread)
   }
 }

2 random-access-storage

 Request.prototype._run = function () {
   var ra = this.storage
   ra._pending++
 
   this._sync = true
 
   switch (this.type) {
     case READ_OP:
-      if (this._openAndNotClosed()) ra._read(this)
       break

3 random-access-storage

 RandomAccess.prototype.run = function (req) {
   if (this._needsOpen) this.open(noop)
   if (this._queued.length) this._queued.push(req)
-  else req._run()
 }

4 random-access-storage

 RandomAccess.prototype.read = function (offset, size, cb) {
-  this.run(new Request(this, READ_OP, offset, size, null, cb))
 }

5 async-append-only-log

  function getBlock(offset, cb) {
    const blockIndex = getBlockIndex(offset)

    if (cache.has(blockIndex)) {
      debug('getting offset %d from cache', offset)
      const cachedBlockBuf = cache.get(blockIndex)
      cb(null, cachedBlockBuf)
    } else {
      debug('getting offset %d from disc', offset)
      const blockStart = getBlockStart(offset)
-     raf.read(blockStart, blockSize, function onRAFReadDone(err, blockBuf) {
        cache.set(blockIndex, blockBuf)
        cb(err, blockBuf)
      })
    }
  }

6 async-append-only-log

  function del(offset, cb) {
    if (compaction) {
      cb(delDuringCompactErr())
      return
    }
    const blockIndex = getBlockIndex(offset)
    if (blocksToBeWritten.has(blockIndex)) {
      onDrain(function delAfterDrained() {
        del(offset, cb)
      })
      return
    }

    if (blocksWithDeletables.has(blockIndex)) {
      const blockBuf = blocksWithDeletables.get(blockIndex)
      gotBlockForDelete(null, blockBuf)
    } else {
-     getBlock(offset, gotBlockForDelete)
    }
    function gotBlockForDelete(err, blockBuf) {
      if (err) return cb(err)
      const actualBlockBuf = blocksWithDeletables.get(blockIndex) || blockBuf
      Record.overwriteWithZeroes(actualBlockBuf, getOffsetInBlock(offset))
      deletedBytes += Record.readSize(actualBlockBuf, getOffsetInBlock(offset))
      blocksWithDeletables.set(blockIndex, actualBlockBuf)
      scheduleFlushDelete()
      cb()
    }
  }
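
Since fs.read complains about a NaN "position", one plausible reading is that del() received an offset that was not a number, and the block-start arithmetic carried the NaN through to raf.read. The sketch below shows that propagation and a guard that could run at the top of del(). The body of `getBlockStart` and the block size are assumptions about the helper's shape, and `checkOffset` is a hypothetical name.

```javascript
const blockSize = 65536 // assumed block size

// Assumed shape of the helper; the library's own version may differ.
function getBlockStart(offset) {
  return offset - (offset % blockSize)
}

// Hypothetical guard to run at the top of del(offset, cb).
function checkOffset(offset) {
  if (!Number.isInteger(offset) || offset < 0) {
    return new Error('del() called with invalid offset: ' + String(offset))
  }
  return null
}

const nanStart = getBlockStart(NaN) // NaN in, NaN out
const undefStart = getBlockStart(undefined) // undefined also becomes NaN
const err = checkOffset(undefined) // an Error the caller can pass to cb
const fine = checkOffset(131074) // null: offset is usable
```

Passing `err` to the callback would move the failure to the point where the bad offset enters the log, where the stack trace still shows who supplied it.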

offset is out of range

We got a crash in production (Manyverse):

The value of "offset" is out of range. It must be >= 0 and <= 65534. Received 74465
RangeError [ERR_OUT_OF_RANGE]: The value of "offset" is out of range. It must be >= 0 and <= 65534. Received 74465
    at boundsError (internal/buffer.js:81:9)
    at Buffer.readUInt16LE (internal/buffer.js:238:5)
    at Object.getDataNextOffset (/data/data/se.manyver/files/nodejs-project/index.js:57782:31)
    at Stream._handleBlock (/data/data/se.manyver/files/nodejs-project/index.js:39782:32)
    at Stream._resume (/data/data/se.manyver/files/nodejs-project/index.js:39844:14)
    at Request._callback (/data/data/se.manyver/files/nodejs-project/index.js:57746:9)
    at random-access-storage Request.callback (/data/data/se.manyver/files/nodejs-project/index.js:41966:8)
    at random-access-file onread (/data/data/se.manyver/files/nodejs-project/index.js:68823:31)
    at FSReqCallback.wrapper [as oncomplete] (fs.js:520:5)

@arj03

Crash writeWithFSync on a null block

I am not sure exactly what triggered this crash but I'll document it as well as I can.

I'm running AAOL 4.3.5 and jitdb 7.0.1 and ssb-db2 4.2.1 in Manyverse. Some 30s after app startup (and all other queries working as normally), in the middle of making several ssb-db2 deleteFeed calls (no compaction yet), I got this crash for the ~20th deleteFeed call:

Screenshot from 2022-08-03 11-46-12

The lines of code are in random-access-storage v1.4.3:

https://github.com/random-access-storage/random-access-storage/blob/50dd02fd07d5d1690a3346684f8aac74dd76d017/index.js#L65

and

raf.write(blockStart, blockBuf, function onRAFWriteDone(err) {

which can only mean we tried to write a blockBuf that was null, not a Buffer. It seems very strange that we would end up with a null blockBuf.

Add cache flag to stream

This allows a stream to avoid blowing up the cache of a running application when doing a full index.

More pausing stream bugs

@arj03 By the way, I just did a quick check: I updated asyncAOL in db2, then replaced tooHot with an always-true promise, so that it pauses after every delivery of a record. Then I ran the db2 tests; some passed, but others got stuck in an infinite loop. So I'll work on fixing these pausing bugs until all db2 tests pass under this always-pause behavior. That would hopefully significantly improve the reliability of db2 in maxCpu mode.
