Comments (8)

vladak avatar vladak commented on May 23, 2024

How is the indexer run? Was this an initial or an incremental reindex? Is the directory in question part of a repository?

from opengrok.

tarangchikhalia avatar tarangchikhalia commented on May 23, 2024

This is an incremental reindex. The directory is part of a repository that is copied from the remote server to the OpenGrok server (no SCM), but I have seen this error in many Git repositories.

vladak avatar vladak commented on May 23, 2024

Can you raise the indexer log level to FINER or higher and post the logs around the entries that start with "Starting file collection" for a run that hits the directory problem? That line and any subsequent lines that contain DefaultIndexChangedListener would help.

tarangchikhalia avatar tarangchikhalia commented on May 23, 2024

Here are the logs with the log level set to FINEST.

Jan 11, 2024 3:21:46 PM org.opengrok.indexer.index.IndexDatabase logIgnoredUid
FINEST: ignoring deleted document for '/<project>/version.json' at 20240106111117766                                                                                                                          
Jan 11, 2024 3:21:46 PM org.opengrok.indexer.index.DefaultIndexChangedListener fileRemove                                                                                                                                           
FINE: Remove: '/<project>/version.json'                                                                                                                                                                       
Jan 11, 2024 3:21:46 PM org.opengrok.indexer.index.DefaultIndexChangedListener fileRemoved                                                                                                                                          
FINER: Removed: '/<project>/version.json'                                                                                                                                                                     
Jan 11, 2024 3:21:46 PM org.opengrok.indexer.util.Statistics logIt                                                                                                                                                                  
INFO: Done file collection for directory '/<project>' (took 15 ms)                                                                                                                                            
Jan 11, 2024 3:21:46 PM org.opengrok.indexer.index.IndexDatabase update                                                                                                                                                             
INFO: Starting indexing of directory '/<project>'                                                                                                                                                             
Jan 11, 2024 3:21:46 PM org.opengrok.indexer.index.IndexDatabase lambda$indexParallel$4                                                                                                                                             
WARNING: ERROR addFile(): '/var/opt/opengrok/<dir_path>'                                                                                                                               
java.io.FileNotFoundException: /var/opt/opengrok/<dir_path> (Is a directory)                                                                                                           
        at java.base/java.io.FileInputStream.open0(Native Method)                                                                                                                                                                   
        at java.base/java.io.FileInputStream.open(FileInputStream.java:219)                                                                                                                                                         
        at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)                                                                                                                                                       
        at org.opengrok.indexer.index.IndexDatabase.getAnalyzerFor(IndexDatabase.java:1217)                                                                                                                                         
        at org.opengrok.indexer.index.IndexDatabase.addFile(IndexDatabase.java:1129)                                                                                                                                                
        at org.opengrok.indexer.index.IndexDatabase.lambda$indexParallel$4(IndexDatabase.java:1781)                                                                                                                                 
        at java.base/java.util.stream.Collectors.lambda$groupingByConcurrent$59(Collectors.java:1304)                                                                                                                               
        at java.base/java.util.stream.ReferencePipeline.lambda$collect$1(ReferencePipeline.java:575)                                                                                                                                
        at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)                                                                                                                                        
        at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1655)                                                                                                                                 
        at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)                                                                                                                                          
        at java.base/java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:290)                                                                                                                                           
        at java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:746)                                                                                                                                          
        at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)                                                                                                                                                
        at java.base/java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:408)                                                                                                                                              
        at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:736)                                                                                                                                                
        at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateParallel(ForEachOps.java:159)                                                                                                                                    
        at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(ForEachOps.java:173)                                                                                                                              
        at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
        at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
        at java.base/java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:661)
        at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:575)
        at org.opengrok.indexer.index.IndexDatabase.lambda$indexParallel$5(IndexDatabase.java:1770)
        at java.base/java.util.concurrent.ForkJoinTask$AdaptedCallable.exec(ForkJoinTask.java:1448)
        at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
        at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
        at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
        at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
        at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)

Jan 11, 2024 3:21:46 PM org.opengrok.indexer.index.IndexDatabase lambda$indexParallel$4

vladak avatar vladak commented on May 23, 2024

Can you also provide the line that contains "Starting file collection"?

vladak avatar vladak commented on May 23, 2024

I went through the related code in IndexDatabase and for the initial reindex I don't see a way there can be an entry in the IndexDownArgs that would correspond to a directory. The indexDown() recursive function that is executed when reindexing from scratch (or when history based reindex is off for some reason) traverses the directory tree like this:

    for (File file : files) {
        String path = parent + File.separator + file.getName();
        if (!accept(dir, file, ret)) {
            handleSymlink(path, ret);
        } else {
            if (file.isDirectory()) {
                indexDown(file, path, args, progress);
            } else {
                processFile(args, file, path);
                progress.increment();
            }
        }
    }

The accept() call detects any allowed symlinks. isDirectory() follows symlinks, so even if the file is a forbidden symlink, it will still be processed in the else branch as a directory, i.e. indexDown() will recursively descend into that directory. Within this code path, IndexDownArgs is modified only in the processFile() method, and this method is always called for non-directory entries.
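To illustrate the point about isDirectory() following symlinks, here is a minimal standalone sketch. It is not OpenGrok code: the class SymlinkCheck and its helper methods are hypothetical names used only for this demonstration, and it assumes a file system that permits creating symbolic links.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;

public class SymlinkCheck {
    /** Same check as File.isDirectory(): the symlink is followed to its target. */
    static boolean resolvesToDirectory(Path p) {
        return p.toFile().isDirectory();
    }

    /** True only when the entry itself is a symlink and its target is a directory. */
    static boolean isSymlinkToDirectory(Path p) {
        return Files.isSymbolicLink(p) && Files.isDirectory(p);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("symlink-demo");
        Path realDir = Files.createDirectory(root.resolve("real"));
        Path link = Files.createSymbolicLink(root.resolve("link"), realDir);

        // Follows the link, so the symlink is reported as a directory.
        System.out.println("File.isDirectory(): " + resolvesToDirectory(link));
        // Does not follow the link, so the symlink itself is not a directory.
        System.out.println("NOFOLLOW_LINKS:     "
                + Files.isDirectory(link, LinkOption.NOFOLLOW_LINKS));
        System.out.println("symlink to dir:     " + isSymlinkToDirectory(link));
    }
}
```

Any code path that relies on a link-following check will treat a symlinked directory like a real one, which is why the traversal above never hands such an entry to processFile().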

IndexDownArgs is further modified in processTrailingTerms() from within update(); however, that only happens for pre-existing index documents.

The history based reindex (which is always non-initial), done in indexDownUsingHistory(), is a different story. There, the accept() call that identifies allowed symlinks is not used, so it could happen that processFileIncremental(), the workhorse for this indexing mode, adds an IndexDownArgs entry that is a directory. For Git specifically, I don't think the Git file tree traversal could contain directories (since in Git a directory can be added to the Git index only if it is non-empty); however, if the entry is a symlink pointing to a directory, that is possible.
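The failure mode described here can be reproduced outside OpenGrok, and one possible guard is easy to sketch. The code below is only an illustration under stated assumptions: DirectoryEntryFilter, regularFilesOnly() and failsToOpen() are hypothetical names, not OpenGrok API, and the actual fix in the indexer may look quite different.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

public class DirectoryEntryFilter {
    /**
     * Hypothetical guard: keeps only entries that resolve to regular files.
     * Files.isRegularFile() follows symlinks by default, so a symlink that
     * points to a directory is dropped along with plain directories.
     */
    static List<Path> regularFilesOnly(List<Path> candidates) {
        return candidates.stream()
                .filter(Files::isRegularFile)
                .collect(Collectors.toList());
    }

    /** Returns true if opening the path with FileInputStream fails with FileNotFoundException. */
    static boolean failsToOpen(Path p) throws IOException {
        try (FileInputStream in = new FileInputStream(p.toFile())) {
            return false;
        } catch (FileNotFoundException e) {
            // This is the exception type seen in the addFile() stack trace.
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("filter-demo");
        Path file = Files.createFile(root.resolve("version.json"));
        Path dir = Files.createDirectory(root.resolve("subdir"));
        Path link = Files.createSymbolicLink(root.resolve("link"), dir);

        System.out.println("opening symlinked directory fails: " + failsToOpen(link));
        System.out.println("entries kept: " + regularFilesOnly(List.of(file, dir, link)));
    }
}
```

Filtering on the resolved file type rather than on the entry type recorded in the changeset is a design choice of this sketch; it covers both plain directories and symlinks to directories with a single check.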

That's why I asked about the Starting file collection log entry so that I can see for which indexing mode this happens.

tarangchikhalia avatar tarangchikhalia commented on May 23, 2024

Sorry for the delay. The project that was encountering this issue isn't showing it now. I am trying to reproduce it in a test environment.

vladak avatar vladak commented on May 23, 2024

> Sorry for the delay. The project that was encountering this issue isn't showing it now. I am trying to reproduce it in a test environment.

It definitely depends on the changes made since the last reindex. For history based reindex, that would be the file trees in the newly added changesets.
