Comments (4)
1 - Yes. Indexing may not work as well outside of git repositories for now, so we recommend using it inside a git repository for the time being. That said, if this workaround works for you, then it works for me!
2 - Hmm, I am not sure exactly what is going on, since I do not know which commands you were running, but I suspect that you are not passing the path/s to the chat command? If no path is specified, then it is basically ChatGPT in the command line. Running the chat command with a question or query and a path or paths will automatically index all of the files within the path/s. To query them later, you need to pass the path/s again, or some files contained within them. It will reuse the index previously created, but only the portion of the index that falls within the path/s in the query.
3 - Currently, a small amount of data is stored in a JSON file for each file you index. It would add up to a large amount of data if you index many files, but it probably works out to 50-500 characters per file, depending on the file's size, plus some metadata about the file. Entries are stored indefinitely, but you can view or delete portions of your index with mf inspect *path* and mf delete *path*.
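To make point 2 a bit more concrete, here is a rough Python sketch of the "only the portion of the index contained within the path" idea. It is not mindflow's actual code: the in-memory index dict, the file names, and the index_portion helper are all made up for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical index: file path -> stored summary (not mindflow's real format).
index = {
    "project/src/app.py": "summary of app.py",
    "project/src/utils/io.py": "summary of io.py",
    "project/docs/readme.md": "summary of readme.md",
}


def index_portion(index, query_paths):
    """Return only the index entries located at or below one of the query paths."""
    roots = [PurePosixPath(p) for p in query_paths]
    selected = {}
    for file_path, entry in index.items():
        path = PurePosixPath(file_path)
        # Keep the entry if it is a queried file itself or lives under a queried folder.
        if any(path == root or root in path.parents for root in roots):
            selected[file_path] = entry
    return selected


# Querying with "project/src" ignores the entry under "project/docs".
print(sorted(index_portion(index, ["project/src"])))
```

With no paths passed, nothing from the index is selected, which matches the "basically ChatGPT in the command line" behavior described above.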
I hope this answers your questions. If you have any more, please ask. We'll be working on improving the documentation moving forward. As a heads up though, we are looking to rework the indexing and querying mechanism soon. It should end up much cheaper and faster, but it will work differently.
from mindflow.
- For what purpose is that JSON stored? How is it used? I was expecting the full file to be in the context.
If I have several files indexed in the folder, and I make a change in one of the files, can I re-index only that file?
from mindflow.
The file text is what is ultimately used as context. When querying, however, we use a vector embedding similarity approach. We found that we got much better results when comparing the query's embedding against an embedding of a summary of each file instead of the file text/code directly. Because the summaries were slow and sometimes costly to generate, we save them here. What is ultimately given to ChatGPT is not the summary but the text of the file, and we don't need to save that.
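Roughly, the flow described above could be sketched like this in Python. This is only an illustration under assumptions: the embed function is a toy bag-of-words stand-in for a real embedding model, and the file names, summaries, and best_context helper are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical data: file text is read from disk, summaries come from the saved index.
files = {
    "auth.py": "def login(user, pw): ...",
    "plot.py": "def draw(points): ...",
}
summaries = {
    "auth.py": "checks a user password and creates a login session",
    "plot.py": "draws a chart from a list of points",
}


def best_context(query):
    """Rank files by query-to-summary similarity, then return the file text."""
    q = embed(query)
    best = max(summaries, key=lambda name: cosine(q, embed(summaries[name])))
    return files[best]


print(best_context("how does the login password check work"))
```

The summaries are only used for ranking. What gets returned as context is the text of the best-matching file.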
from mindflow.
To your second point, yes. This is how it should work. It will re-index only the file that changed.
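One common way to get that behavior is to store a content hash next to each summary and regenerate the summary only when the hash changes. The Python sketch below assumes that approach. It is not necessarily how mindflow does it, and the reindex and summarize names are hypothetical.

```python
import hashlib


def content_hash(text):
    """Hash the file text so that changes can be detected cheaply."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def reindex(files, index, summarize):
    """Refresh entries whose content hash changed; return the names re-indexed."""
    changed = []
    for name, text in files.items():
        digest = content_hash(text)
        entry = index.get(name)
        if entry is None or entry["hash"] != digest:
            # Only new or modified files pay the cost of a fresh summary.
            index[name] = {"hash": digest, "summary": summarize(text)}
            changed.append(name)
    return changed


def summarize(text):
    """Placeholder for the slow, costly summary step."""
    return text[:20]


files = {"a.py": "print('a')", "b.py": "print('b')"}
index = {}
first = reindex(files, index, summarize)   # both files are new
files["b.py"] = "print('b changed')"
second = reindex(files, index, summarize)  # only b.py changed
print(first, second)
```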
from mindflow.