Code Monkey home page Code Monkey logo

Comments (9)

dfahlander avatar dfahlander commented on June 1, 2024

when only using a 'where' clause count is executed natively in IndexedDB. In your example there is a filter attached and this makes dexie have to count the rows manually instead. It's doing this by iterating every row that matches the where clause and testing it against the provided JS filter.

If there are thousands of unread emails in user's inbox, the count could take some time (500 ms or so).

To optimize this you could flag unread messages differently. Right now you are using 'flags'. Even if it could be indexed using multiEntry index, it cannot be a part of a compound index. So if you could complement this flag with an indexed property such as 'last_opened' (a timestamp), you could have a compound index of [folder_name+last_opened] and query the count as such:

// count all unseen emails
unread = await dexie_db.email
.where({
  folder_name: "INBOX",
  last_opened: 0 // make sure upgrade db and set this property on all items.
})
.count();

// count all seen emails:
unread = await dexie_db.email
.where("[folder_name+last_opened]")
.between(["INBOX", 1], ["INBOX", Dexie.maxKey])
.count();

You could also use a property like "seen" and set it to 1 or 0. The reason I picked a timestamp was just in case this info could be relevant in other scenarios such as listing the 10 most recently opened emails with a natively index based query (where-clause based with a .limit(10)).

from dexie.js.

Bloodcapillary avatar Bloodcapillary commented on June 1, 2024

when only using a 'where' clause count is executed natively in IndexedDB. In your example there is a filter attached and this makes dexie have to count the rows manually instead. It's doing this by iterating every row that matches the where clause and testing it against the provided JS filter.

If there are thousands of unread emails in user's inbox, the count could take some time (500 ms or so).

To optimize this you could flag unread messages differently. Right now you are using 'flags'. Even if it could be indexed using multiEntry index, it cannot be a part of a compound index. So if you could complement this flag with an indexed property such as 'last_opened' (a timestamp), you could have a compound index of [folder_name+last_opened] and query the count as such:

// count all unseen emails
unread = await dexie_db.email
.where({
  folder_name: "INBOX",
  last_opened: 0 // make sure upgrade db and set this property on all items.
})
.count();

// count all seen emails:
unread = await dexie_db.email
.where("[folder_name+last_opened]")
.between(["INBOX", 1], ["INBOX", Dexie.maxKey])
.count();

You could also use a property like "seen" and set it to 1 or 0. The reason I picked a timestamp was just in case this info could be relevant in other scenarios such as listing the 10 most recently opened emails with a natively index based query (where-clause based with a .limit(10)).

In the example above, flags are also an index, but I'm not sure if there's a possibility to optimize the search for folder_name while also searching if flags contain 'SEEN'.

from dexie.js.

dfahlander avatar dfahlander commented on June 1, 2024

No, as I explained, it's not possible to combine a multiEntry index (flags) with a normal index (folder_name) for the purpose of counting, but it is with a compound index containing two single-valued properties.

from dexie.js.

Bloodcapillary avatar Bloodcapillary commented on June 1, 2024

No, as I explained, it's not possible to combine a multiEntry index (flags) with a normal index (folder_name) for the purpose of counting, but it is with a compound index containing two single-valued properties.

Does this mean that later on, if I want to perform fuzzy searches on the content serving as the index, I can only retrieve it through filters in RAM?

from dexie.js.

dfahlander avatar dfahlander commented on June 1, 2024

I don't understand the question. You can still use the index for flags as you've been doing before.

from dexie.js.

Bloodcapillary avatar Bloodcapillary commented on June 1, 2024

I don't understand the question. You can still use the index for flags as you've been doing before.

Assuming I have an index named 'content',And 'content' is an index containing a piece of text.Now, I want to perform a fuzzy search using a word or a sentence,In this scenario, does it necessitate pulling all email data into memory for computation?

from dexie.js.

Bloodcapillary avatar Bloodcapillary commented on June 1, 2024

I don't understand the question. You can still use the index for flags as you've been doing before.

Sorry, my English is not very good.The description of the issue might be somewhat inaccurate.

from dexie.js.

dfahlander avatar dfahlander commented on June 1, 2024

I don't understand the question. You can still use the index for flags as you've been doing before.

Assuming I have an index named 'content',And 'content' is an index containing a piece of text.Now, I want to perform a fuzzy search using a word or a sentence,In this scenario, does it necessitate pulling all email data into memory for computation?

If doing .filter() you can do fuzzy search without loading all records into RAM but loading them one-by-one and testing against your filter. This will not load all records to RAM but it is not as fast as loading several records into RAM and filtering afterwards - but again - that takes more memory.

To optimize both RAM and memory load the table in chunks of 100 or so (depending on how much RAM you can allow and how big each object is) and search in the array chunks. To do that you need to loop through the chunks as follows:

const CHUNKSIZE = 100; // Could be 1000 also depending on how much RAM you are willing to sacrifice.
const result = [];
let startKey = -Infinity;
while (true) {
  const chunk = await db.yourTable
    .where(':id').above(startKey)
    .limit(CHUNKSIZE)
    .toArray();
  result.push(...chunk.filter(yourFilteringFunction));
  if (chunk.length < CHUNKSIZE) break; // done
  startKey = chunk.at(-1).id; // Replace .id with your primary key property.
}

This way you never keep more than CHUNKZISE records in memory at the same time but still do a pretty fast fuzzy search.

from dexie.js.

Bloodcapillary avatar Bloodcapillary commented on June 1, 2024
const CHUNKSIZE = 100; // Could be 1000 also depending on how much RAM you are willing to sacrifice.
const result = [];
let startKey = -Infinity;
while (true) {
  const chunk = await db.yourTable
    .where(':id').above(startKey)
    .limit(CHUNKSIZE)
    .toArray();
  result.push(...chunk.filter(yourFilteringFunction));
  if (chunk.length < CHUNKSIZE) break; // done
  startKey = chunk.at(-1).id; // Replace .id with your primary key property.
}

I don't understand the question. You can still use the index for flags as you've been doing before.

Assuming I have an index named 'content',And 'content' is an index containing a piece of text.Now, I want to perform a fuzzy search using a word or a sentence,In this scenario, does it necessitate pulling all email data into memory for computation?

If doing .filter() you can do fuzzy search without loading all records into RAM but loading them one-by-one and testing against your filter. This will not load all records to RAM but it is not as fast as loading several records into RAM and filtering afterwards - but again - that takes more memory.

To optimize both RAM and memory load the table in chunks of 100 or so (depending on how much RAM you can allow and how big each object is) and search in the array chunks. To do that you need to loop through the chunks as follows:

const CHUNKSIZE = 100; // Could be 1000 also depending on how much RAM you are willing to sacrifice.
const result = [];
let startKey = -Infinity;
while (true) {
  const chunk = await db.yourTable
    .where(':id').above(startKey)
    .limit(CHUNKSIZE)
    .toArray();
  result.push(...chunk.filter(yourFilteringFunction));
  if (chunk.length < CHUNKSIZE) break; // done
  startKey = chunk.at(-1).id; // Replace .id with your primary key property.
}

This way you never keep more than CHUNKZISE records in memory at the same time but still do a pretty fast fuzzy search.

Thank you for your help

from dexie.js.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.