
Comments (34)

vsoch avatar vsoch commented on September 27, 2024

ah I think I see the error - you have a branch without commits.

from baseos-containers.

vsoch avatar vsoch commented on September 27, 2024

okay, I fixed that potential bug - let me know the details / if further work is needed, and feel free to close the issue if not. Happy Friday!


gipert avatar gipert commented on September 27, 2024

Ciao @vsoch!
I was about to write to you. I don't really know what could have triggered the error; I only remember that I was trying to switch to manual Trigger Builds when everything broke (a branch without commits? Are you sure? That sounds strange to me). I'll let you know if I run into it again.

Happy Friday too!


gipert avatar gipert commented on September 27, 2024

Btw, I have another problem: the Hub cannot build any image successfully. The status on the dashboard is "Error", but when I look into the log I can't find any actual error; indeed, the build seems to have completed successfully! Of course, I managed to build the images on my laptop.


vsoch avatar vsoch commented on September 27, 2024

Let me look into this, I don't see any indication of error either, so some miscommunication must have happened between the builder and application.


vsoch avatar vsoch commented on September 27, 2024

I can't find any indication of why this might be in the log, other than a chance weird order of user-issued commands. For example, it looks like you cancelled a build or two and then probably tried to restart? What might have happened is that the builder finished, received the "kill" message, but then sent back the log to indicate ERROR (because no final image post was received first). The best way to debug this would be to delete all current containers, and then either trigger a new build or re-connect the collection freshly. The container and metadata do exist, but the hub doesn't know about them for some reason. Did you happen to click rebuild? I might need to disable this feature until I can be sure it is working correctly.


gipert avatar gipert commented on September 27, 2024

I’ve never seen any of my containers built, nor any “rebuild” button in the dashboard, and yes, at the beginning (with the builds triggered by new commits) maybe I messed something up by continuously reissuing rebuilds. However, the last thing I remember was setting manual triggering, deleting all containers with the “trash” button and starting new builds... Should I try again?


vsoch avatar vsoch commented on September 27, 2024

let me test an active instance before that - if there is still mystery then what we can do is have you redo, and I'll watch the streaming server logs for any hints.


vsoch avatar vsoch commented on September 27, 2024

ok, I found and fixed a bug I think was the issue. Please try again! Thanks for helping with this, it's hard to predict some of these edge cases.


gipert avatar gipert commented on September 27, 2024

Hey @vsoch, it seems like nothing changed. I deleted the containers and then restarted the build, but I still get the same error...


vsoch avatar vsoch commented on September 27, 2024

hey @luigipertoldi - I watched it re-build on its own, and the image is produced ok but the POST authentication fails. What this means is that there is a mismatch in the metadata, and I need to figure out what that is. I still think it's something to do with changing things around (and having a wrong identifier) but this shouldn't happen. So for now please don't delete / cancel the containers - I have it running manually now and I'm walking through the steps until I get to the 400 error. Yes, I've been working on this since I woke up at 7:30am!


gipert avatar gipert commented on September 27, 2024

This is what is called user support ahah! I hope you'll find the culprit and fix it without wasting your whole weekend...


vsoch avatar vsoch commented on September 27, 2024

okay, so I found the reason, but I haven't a clue why it got that way. Well I think it was still an issue with a server hiccup when metadata needed to be set, and then it was never set. The issue was that the container id (important to validate the payload) was set as the collection payload. When I changed this running manually, it went through smoothly and the container is done.

So - what I think we should do (but please don't do this today, or even tomorrow, I want at least a one day weekend!) is to start a build, fresh, and write down exactly the buttons you push, steps you do. If it's a one time hiccup, it won't happen again. If there is somewhere that a collection id is being set for the container id, it will reproduce, and I'll need that careful list of steps (and specific selections. and status of collection before you did it) so I can walk through the whole process. It does take a few hours to replicate what happens magically in under a minute, so yes, please let's not do this tomorrow :)


gipert avatar gipert commented on September 27, 2024

No worries! Tomorrow I'll be as far as possible from my laptop, Sunday is sacred here in Europe too :) Also, for me there's no urgency to fix this within a few hours.

Many many many thanks for your time, I feel kinda guilty for not being able to help in debugging and maybe let you save some of your time. Have a nice one-day-weekend!


vsoch avatar vsoch commented on September 27, 2024

No worries! I wouldn't truly be an open source software engineer if I were required to do it. Fulfillment comes best this way! :)


gipert avatar gipert commented on September 27, 2024


vsoch avatar vsoch commented on September 27, 2024

Exactly, and maybe thrown a door at you or a projectile turkey.


gipert avatar gipert commented on September 27, 2024

Ciao Vanessa, today I restarted from scratch: I deleted the whole collection and then recreated, no more, no less than these two actions (so I retained all the default settings, such as builds triggered by commit). Again, the build process fails as before...


vsoch avatar vsoch commented on September 27, 2024

And so it begins... again! It must be something different with your collection because I'm seeing 200 responses on other builds, but the post to receive the builder doesn't even happen (suggesting that it's not getting past nginx or something like that). Here is what I see for the instance - the receive gets status 400 (bad request) and then the finish ping stops everything:

uwsgi_1   | [pid: 522|app: 0|req: 17812/17810] 35.197.1.169 () {42 vars in 739 bytes} [Sat Nov 25 08:32:12 2017] POST /hooks/build/gce/receive => generated 26 bytes in 22 msecs (HTTP/1.1 400) 4 headers in 116 bytes (1 switches on core 0)
uwsgi_1   | [pid: 522|app: 0|req: 17816/17814] 35.197.1.169 () {42 vars in 733 bytes} [Sat Nov 25 08:32:41 2017] POST /hooks/build/gce/finish => generated 0 bytes in 966 msecs (HTTP/1.1 200) 3 headers in 107 bytes (1 switches on core 0)
...
uwsgi_1   | [pid: 522|app: 0|req: 18363/18361] 35.197.1.169 () {42 vars in 739 bytes} [Sat Nov 25 12:39:40 2017] POST /hooks/build/gce/receive => generated 26 bytes in 16 msecs (HTTP/1.1 400) 4 headers in 116 bytes (1 switches on core 0)
uwsgi_1   | [pid: 522|app: 0|req: 18364/18362] 35.197.1.169 () {42 vars in 733 bytes} [Sat Nov 25 12:40:11 2017] POST /hooks/build/gce/finish => generated 0 bytes in 12811 msecs (HTTP/1.1 200) 3 headers in 107 bytes (1 switches on core 0)

Anyway, I'm running it to test (again) so please don't delete / change the collection currently as is. I am also going to be updating the builders for 2.4.1 soon and can do further testing if needed. Sorry about the trouble! Messages that end in "..." always spell to me "hanging disappointment."
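
For anyone scanning similar logs, here is a minimal sketch of pulling the failing requests out of uwsgi lines like the ones above. The regular expression and the `failed_requests` helper are illustrative assumptions written against the sample lines in this comment, not part of Singularity Hub:

```python
import re

# Matches the timestamp, method, path and status of a uwsgi request-log line
# in the format shown in this thread (an assumption based on those samples).
LINE = re.compile(
    r"\[(?P<when>[A-Z][a-z]{2} [A-Z][a-z]{2} +\d+ [\d:]+ \d{4})\] "
    r"(?P<method>[A-Z]+) (?P<path>\S+) => .*?\(HTTP/\d\.\d (?P<status>\d{3})\)"
)


def failed_requests(lines):
    """Return (timestamp, method, path, status) for responses with status >= 400."""
    hits = []
    for line in lines:
        match = LINE.search(line)
        if match and int(match["status"]) >= 400:
            hits.append(
                (match["when"], match["method"], match["path"], int(match["status"]))
            )
    return hits


sample = [
    "uwsgi_1   | [pid: 522|app: 0|req: 17812/17810] 35.197.1.169 () {42 vars in 739 bytes} "
    "[Sat Nov 25 08:32:12 2017] POST /hooks/build/gce/receive => generated 26 bytes in 22 msecs "
    "(HTTP/1.1 400) 4 headers in 116 bytes (1 switches on core 0)",
    "uwsgi_1   | [pid: 522|app: 0|req: 17816/17814] 35.197.1.169 () {42 vars in 733 bytes} "
    "[Sat Nov 25 08:32:41 2017] POST /hooks/build/gce/finish => generated 0 bytes in 966 msecs "
    "(HTTP/1.1 200) 3 headers in 107 bytes (1 switches on core 0)",
]

for hit in failed_requests(sample):
    print(hit)
```

Only the `receive` request is reported here, which matches the pattern described above: the receive step is rejected while the finish ping succeeds.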


vsoch avatar vsoch commented on September 27, 2024

Also, it's not called a Singularityfile, just "Singularity.<tag>". It seems to work ok because technically I check whether the name contains "Singularity," but just for future note! And I'm not 100% sure it's not involved with the current issue, but probably 90% :)
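
To make the naming convention concrete, here is a small sketch of a stricter check than "name contains Singularity". The `recipe_tag` helper is hypothetical, and treating a bare "Singularity" file as the tag "latest" is an assumption, not something confirmed in this thread:

```python
import os


def recipe_tag(path, default="latest"):
    """Return the tag implied by a recipe filename, or None if the name does not fit.

    Assumed convention: "Singularity" maps to the default tag and
    "Singularity.<tag>" maps to <tag>.
    """
    name = os.path.basename(path)
    prefix = "Singularity."
    if name == "Singularity":
        return default
    if name.startswith(prefix) and len(name) > len(prefix):
        return name[len(prefix):]
    return None


print(recipe_tag("Singularity.g4.9.6"))  # tag taken from the filename
print(recipe_tag("Singularityfile"))     # does not follow the convention
```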


vsoch avatar vsoch commented on September 27, 2024

okay great news! I definitely figured it out. The issue is that the POST was actually too big given the apps. The fix was to remove the "files" added to the apps (so they won't be included in the view).

https://singularity-hub.org/containers/949

When I update the builder I will make some changes to not include the full list of files (maybe just the top level directories) to keep the post to a reasonable response. Building new images (for you) won't work until I do this, but I should be able to in the next few days because 2.4.1 is out. Thank goodness.
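
As a rough illustration of the "top level directories only" idea, the sketch below collapses a full file listing for one app into its first-level entries. The function name, the example paths, and the metadata layout are assumptions for illustration; the real builder may structure this differently:

```python
from pathlib import PurePosixPath


def top_level_entries(files, app_root):
    """Collapse full file paths under app_root into unique first-level names."""
    root = PurePosixPath(app_root)
    seen = []
    for path in files:
        try:
            relative = PurePosixPath(path).relative_to(root)
        except ValueError:
            continue  # path is not under this app, so it is not reported
        if relative.parts and relative.parts[0] not in seen:
            seen.append(relative.parts[0])
    return seen


files = [
    "/scif/apps/MGDO/bin/tool",
    "/scif/apps/MGDO/lib/liba.so",
    "/scif/apps/MGDO/lib/libb.so",
    "/scif/apps/root/bin/root",
]
print(top_level_entries(files, "/scif/apps/MGDO"))
```

The payload then grows with the number of top-level folders rather than with the number of installed files.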


gipert avatar gipert commented on September 27, 2024

Great! I'll wait for it.... (I added a dot so now it means "anxiously waiting")

Thanks so much, you finally did it!


vsoch avatar vsoch commented on September 27, 2024

hey @luigipertoldi ! I updated the builder and singularity python to remove a (hopefully) large enough component of the app files, and triggered a build for your other recipe: https://singularity-hub.org/collections/297. This also meant I changed your collection builder to be using 2.4.1. The build had a return value of 1, and it looks like some (new?) error with pandas indexing:

Traceback (most recent call last):
  File "", line 1, in 
  File "/usr/local/lib/python3.5/dist-packages/singularity-2.4.1-py3.5.egg/singularity/build/google.py", line 237, in run_build
    params=params)
  File "/usr/local/lib/python3.5/dist-packages/singularity-2.4.1-py3.5.egg/singularity/build/main.py", line 182, in run_build
    most_similar = os_sims['SCORE'].idxmax()
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/series.py", line 1357, in idxmax
    i = nanops.nanargmax(_values_from_object(self), skipna=skipna)
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/nanops.py", line 74, in _f
    raise TypeError(msg.format(name=f.__name__.replace('nan', '')))
TypeError: reduction operation 'argmax' not allowed for this dtype

in the log here https://singularity-hub.org/containers/950/log

So I'll need to (again) manually debug this build to figure out what particulars for your image are in the index (that is disliked) and then a fix for it.
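
pandas raises this error when `idxmax` runs on a column whose dtype does not support argmax, which typically means the scores are stored as generic objects instead of numbers. The actual cause for this build is not stated here, so the following is only a plain-Python sketch of the kind of guard that avoids it: coerce each score to a float, skip anything unusable, and return nothing when no score is left. The `most_similar` helper and the example labels are hypothetical:

```python
import math


def most_similar(scores):
    """Return the label with the highest numeric score, or None if none is usable."""
    best_label, best_score = None, None
    for label, raw in scores.items():
        try:
            value = float(raw)
        except (TypeError, ValueError):
            continue  # non-numeric entries (None, "n/a", ...) are skipped
        if math.isnan(value):
            continue
        if best_score is None or value > best_score:
            best_label, best_score = label, value
    return best_label


print(most_similar({"ubuntu:16.04": "0.91", "centos:7": 0.42, "debian:9": None}))
```

With pandas itself, the analogous step would be to pass the column through `pd.to_numeric(..., errors="coerce")` and drop missing values before calling `idxmax()`.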


vsoch avatar vsoch commented on September 27, 2024

okay, I fixed the build, re-ran, it looks good! https://singularity-hub.org/collections/297

Please mess around / take a look, and when you are satisfied I'll finalize the tag update, make the builder default for new collections, and notify everyone about the builder.


gipert avatar gipert commented on September 27, 2024

Great! Seems to work but later I'll do some tests and let you know.

And now for something completely different (yes, build bots hate me): I added another collection that uses the present one as a base and some private submodules. The build fails and I suppose this is because of missing read permissions, as for the Docker Hub. If this is the case, what should I do?


vsoch avatar vsoch commented on September 27, 2024

Can you share the log with me? The collection is private.


gipert avatar gipert commented on September 27, 2024

Now it's public!


vsoch avatar vsoch commented on September 27, 2024

There are known permissions issues when doing an extraction, I've seen messages like this before:

Exporting contents of shub://luigipertoldi/baseos-containers:g4.9.6 to /root/build
tar: ./etc/gshadow: Warning: Cannot open: Permission denied
tar: ./etc/gshadow-: Warning: Cannot open: Permission denied
tar: ./etc/shadow: Warning: Cannot open: Permission denied
tar: ./etc/shadow-: Warning: Cannot open: Permission denied
tar: ./etc/tcsd.conf: Warning: Cannot open: Permission denied
tar: ./usr/bin/ssh-agent: Warning: Cannot open: Permission denied
tar: ./usr/bin/staprun: Warning: Cannot open: Permission denied
tar: ./usr/libexec/openssh/ssh-keysign: Warning: Cannot open: Permission denied
tar: ./var/lib/tpm: Warning: Cannot open: Permission denied

Are any of those files required for the build? That would be the first thing to address. For the permission denied error, I would post an issue on the Singularity proper board and reference the "secure build" image on Singularity Hub. My guess is that the secure build doesn't allow changes outside of a scoped build tree, and the extraction violates that. If it's some issue with build from shub itself, the developer who implemented that bit should have a look too.

This error is ok, it just means a label was redefined:

ERROR org.label-schema.usage.singularity.runscript.help found in /usr/local/var/singularity/mnt/container/.singularity.d/labels.json and overwrite set to False.

Quick question - I see two Singularity files in your build history, but only one in the repo. Do you know why?

For the MGDODIR, are these variables needed to compile?

%appenv MGDO
    export LD_LIBRARY_PATH="/scif/apps/root/lib:/scif/apps/clhep/lib:$LD_LIBRARY_PATH"
    export MGDODIR="/scif/src/MGDO"

These won't be exported / defined until runtime, so if you need them to build you need to export them in an install section.

+ ./configure --enable-tam --enable-streamers CXXFLAGS=-std=c++11 --with-clhep=/scif/apps/clhep --with-rootsys=/scif/apps/root --prefix=/scif/apps/MGDO
/bin/sh: line 3: ./configure: No such file or directory

I am wondering if the %files section copied files as expected? Did you try an ls of that folder before issuing the command? Is there any reason you are creating another subfolder "src" under /scif/apps instead of maintaining each in their respective folders? The idea is that we would want to find the dependency files in /scif/apps/MGDO because it's more clear they belong to that tool.


gipert avatar gipert commented on September 27, 2024

I can build the images on my laptop, so every error we see here is related to the new building environment.

> Are any of those files required for the build?

No AFAIK, as you can see from the recipe file I did not touch them.

> Quick question - I see two Singularity files in your build history, but only one in the repo. Do you know why?

I have a Singularityfile for each of my two branches, so I enabled the manual build triggers and specified the actual names of the recipe files.

> For the MGDODIR, are these variables needed to compile?

No, I need them run-time. BTW as I said at the beginning I can successfully build the images on my laptop.

> I am wondering if the %files section copied files as expected?

This is the actual point, if it works like in the Docker Hub, I expect a failure in cloning the private submodules (well, I don't actually know if the bot caches the whole repo...)

> Is there any reason you are creating another subfolder "src" under /scif/apps?

Yes, I need to keep the source code in the image, so I decided to put it under /scif/src. Then I install the compiled code in the standard location under /scif/apps/MGDO.

> Is there any reason you are creating another subfolder "src" under /scif/apps instead of maintaining each in their respective folders? The idea is that we would want to find the dependency files in /scif/apps/MGDO because it's more clear they belong to that tool.

I don't really understand what you are saying here. Do you mean that keeping the source files under, for example, /scif/apps/MGDO/src/ would be better?


vsoch avatar vsoch commented on September 27, 2024

> I can build the images on my laptop, so every error we see here is related to the new building environment.

ok cool, it might be good to be sure you are using 2.4.1 too.

> I don't really understand what are you saying here, do you mean that keeping the source files under, for example, /scif/apps/MGDO/src/ would be better?

Yes, I think so. If someone has an algorithm to parse over files related to one of your apps, they will miss the entire code base that was built from. There is no rule about "only bin and lib" belonging in the app folder.

> This is the actual point, if it works like in the Docker Hub, I expect a failure in cloning the private submodules (well, I don't actually know if the bot caches the whole repo...)

Can you make them (not private) or use some other method?


gipert avatar gipert commented on September 27, 2024

> ok cool, it might be good to be sure you are using 2.4.1 too.

Yep!

> Can you make them (not private) or use some other method?

Unfortunately no, it's closed-source software (yep, absolutely awful, and this is still a reality in today's science).

Should I post an issue about this somewhere else? I really suspect it's the same as for the Docker Hub.


vsoch avatar vsoch commented on September 27, 2024

Generally, if this is truly closed source then you should possibly not be using the community / open source infrastructure, because even a "private" repository is being handled by the tool. I think what would help is to show a minimal working example for what would make your repo work. I can't comment on where to post there - Singularity Hub thus far is me so I would be the main person you would chat with. For Docker Hub, have you tried their issue boards or even something like Twitter?


gipert avatar gipert commented on September 27, 2024

Hi @vsoch, sorry for the late reply but I was kinda busy.
I tested the new builder a bit and it seems to work well; I got no more failures after your last fix.

It would be nice to further debug the issue with private submodules, I could try to set up a mwe but I really don't have time to invest in it. So for now I'll find some alternative solution :/

Anyway, thanks so much for your help, at least you fixed a lot of things with the new builder!

P.S. Wouldn't it be nice to give users the possibility to simply upload pre-built Singularity images to the hub, just like on Docker Hub?


vsoch avatar vsoch commented on September 27, 2024

Yes it would! For Singularity Hub, allowing that kind of access is (currently anyway) outside of the scope of what I can manage for an application. However I liked the idea so much that I did implement a solution - this is how a Singularity Registry works --> https://singularityhub.github.io/sregistry/ and then different registries can serve images available command line via singularity, and have complete control of the build, and pushing. Here are the tiny set of registries we have --> https://singularityhub.github.io/containers/

