unread = await dexie_db.email .where("folder_name") .equals("INBOX") .filt

<div class="highlight highlight-source-ts notranslate position-relative overflow-auto"

Is the 'count' method executed in the database or in RAM? about dexie.js HOT 9 OPEN

Bloodcapillary commented on June 1, 2024

Is the 'count' method executed in the database or in RAM?

from dexie.js.

Comments (9)

dfahlander commented on June 1, 2024

when only using a 'where' clause count is executed natively in IndexedDB. In your example there is a filter attached and this makes dexie have to count the rows manually instead. It's doing this by iterating every row that matches the where clause and testing it against the provided JS filter.

If there are thousands of unread emails in user's inbox, the count could take some time (500 ms or so).

To optimize this you could flag unread messages differently. Right now you are using 'flags'. Even if it could be indexed using multiEntry index, it cannot be a part of a compound index. So if you could complement this flag with an indexed property such as 'last_opened' (a timestamp), you could have a compound index of [folder_name+last_opened] and query the count as such:

// count all unseen emails
unread = await dexie_db.email
.where({
  folder_name: "INBOX",
  last_opened: 0 // make sure upgrade db and set this property on all items.
})
.count();

// count all seen emails:
unread = await dexie_db.email
.where("[folder_name+last_opened]")
.between(["INBOX", 1], ["INBOX", Dexie.maxKey])
.count();

You could also use a property like "seen" and set it to 1 or 0. The reason I picked a timestamp was just in case this info could be relevant in other scenarios such as listing the 10 most recently opened emails with a natively index based query (where-clause based with a .limit(10)).

from dexie.js.

Bloodcapillary commented on June 1, 2024

when only using a 'where' clause count is executed natively in IndexedDB. In your example there is a filter attached and this makes dexie have to count the rows manually instead. It's doing this by iterating every row that matches the where clause and testing it against the provided JS filter.

If there are thousands of unread emails in user's inbox, the count could take some time (500 ms or so).

To optimize this you could flag unread messages differently. Right now you are using 'flags'. Even if it could be indexed using multiEntry index, it cannot be a part of a compound index. So if you could complement this flag with an indexed property such as 'last_opened' (a timestamp), you could have a compound index of [folder_name+last_opened] and query the count as such:
// count all unseen emails
unread = await dexie_db.email
.where({
  folder_name: "INBOX",
  last_opened: 0 // make sure upgrade db and set this property on all items.
})
.count();

// count all seen emails:
unread = await dexie_db.email
.where("[folder_name+last_opened]")
.between(["INBOX", 1], ["INBOX", Dexie.maxKey])
.count();
You could also use a property like "seen" and set it to 1 or 0. The reason I picked a timestamp was just in case this info could be relevant in other scenarios such as listing the 10 most recently opened emails with a natively index based query (where-clause based with a .limit(10)).

In the example above, flags are also an index, but I'm not sure if there's a possibility to optimize the search for folder_name while also searching if flags contain 'SEEN'.

from dexie.js.

dfahlander commented on June 1, 2024

No, as I explained, it's not possible to combine a multiEntry index (flags) with a normal index (folder_name) for the purpose of counting, but it is with a compound index containing two single-valued properties.

from dexie.js.

Bloodcapillary commented on June 1, 2024

No, as I explained, it's not possible to combine a multiEntry index (flags) with a normal index (folder_name) for the purpose of counting, but it is with a compound index containing two single-valued properties.

Does this mean that later on, if I want to perform fuzzy searches on the content serving as the index, I can only retrieve it through filters in RAM?

from dexie.js.

dfahlander commented on June 1, 2024

I don't understand the question. You can still use the index for flags as you've been doing before.

from dexie.js.

Bloodcapillary commented on June 1, 2024

I don't understand the question. You can still use the index for flags as you've been doing before.

Assuming I have an index named 'content',And 'content' is an index containing a piece of text.Now, I want to perform a fuzzy search using a word or a sentence,In this scenario, does it necessitate pulling all email data into memory for computation?

from dexie.js.

Bloodcapillary commented on June 1, 2024

I don't understand the question. You can still use the index for flags as you've been doing before.

Sorry, my English is not very good.The description of the issue might be somewhat inaccurate.

from dexie.js.

dfahlander commented on June 1, 2024

I don't understand the question. You can still use the index for flags as you've been doing before.

Assuming I have an index named 'content',And 'content' is an index containing a piece of text.Now, I want to perform a fuzzy search using a word or a sentence,In this scenario, does it necessitate pulling all email data into memory for computation?

If doing .filter() you can do fuzzy search without loading all records into RAM but loading them one-by-one and testing against your filter. This will not load all records to RAM but it is not as fast as loading several records into RAM and filtering afterwards - but again - that takes more memory.

To optimize both RAM and memory load the table in chunks of 100 or so (depending on how much RAM you can allow and how big each object is) and search in the array chunks. To do that you need to loop through the chunks as follows:

const CHUNKSIZE = 100; // Could be 1000 also depending on how much RAM you are willing to sacrifice.
const result = [];
let startKey = -Infinity;
while (true) {
  const chunk = await db.yourTable
    .where(':id').above(startKey)
    .limit(CHUNKSIZE)
    .toArray();
  result.push(...chunk.filter(yourFilteringFunction));
  if (chunk.length < CHUNKSIZE) break; // done
  startKey = chunk.at(-1).id; // Replace .id with your primary key property.
}

This way you never keep more than CHUNKZISE records in memory at the same time but still do a pretty fast fuzzy search.

from dexie.js.

Bloodcapillary commented on June 1, 2024

const CHUNKSIZE = 100; // Could be 1000 also depending on how much RAM you are willing to sacrifice.
const result = [];
let startKey = -Infinity;
while (true) {
  const chunk = await db.yourTable
    .where(':id').above(startKey)
    .limit(CHUNKSIZE)
    .toArray();
  result.push(...chunk.filter(yourFilteringFunction));
  if (chunk.length < CHUNKSIZE) break; // done
  startKey = chunk.at(-1).id; // Replace .id with your primary key property.
}

I don't understand the question. You can still use the index for flags as you've been doing before.

Assuming I have an index named 'content',And 'content' is an index containing a piece of text.Now, I want to perform a fuzzy search using a word or a sentence,In this scenario, does it necessitate pulling all email data into memory for computation?

If doing .filter() you can do fuzzy search without loading all records into RAM but loading them one-by-one and testing against your filter. This will not load all records to RAM but it is not as fast as loading several records into RAM and filtering afterwards - but again - that takes more memory.

To optimize both RAM and memory load the table in chunks of 100 or so (depending on how much RAM you can allow and how big each object is) and search in the array chunks. To do that you need to loop through the chunks as follows:
const CHUNKSIZE = 100; // Could be 1000 also depending on how much RAM you are willing to sacrifice.
const result = [];
let startKey = -Infinity;
while (true) {
  const chunk = await db.yourTable
    .where(':id').above(startKey)
    .limit(CHUNKSIZE)
    .toArray();
  result.push(...chunk.filter(yourFilteringFunction));
  if (chunk.length < CHUNKSIZE) break; // done
  startKey = chunk.at(-1).id; // Replace .id with your primary key property.
}
This way you never keep more than CHUNKZISE records in memory at the same time but still do a pretty fast fuzzy search.

Thank you for your help

from dexie.js.

Is the 'count' method executed in the database or in RAM? about dexie.js HOT 9 OPEN

Comments (9)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent