Comments (22)

geerlingguy commented on June 3, 2024

Hmm... this is a tough one; I've hit this from time to time as well, and it seems that it's an issue with Fedora's servers not being as reliable as I'd like them to be :(

We could add a failed_when: false and catch the failed condition, then re-run the download or something like that, but it feels hacky. Is there any other simple way to do this? For now, I just run the playbook a second time if this happens, and it usually is fine after that.

from ansible-role-repo-epel.

solomongifford commented on June 3, 2024

Some of our playbooks take 1.5 hours to run, so that's why we're looking for a better solution.

Wonder if we can do something like:

register: result
until: result.somecondition
retries: 5
delay: 10

solomongifford commented on June 3, 2024

Here's the output when it errors:

failed: [vagrantstaging.blackmesh.com] => {"attempts": 0, "changed": false, "failed": true, "rc": 1}
msg: Package at http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm could not be installed

Here's the least hacky way I was able to solve it:

- name: Install EPEL repo.
  yum:
    name: "{{ epel_repo_url }}"
    state: present
  when: ansible_distribution == 'CentOS'
  register: result
  until: '"failed" not in result'
  retries: 5
  delay: 10

I've verified this works in all three conditions (already installed, new install, and failed install that worked on a retry).

rrauenza commented on June 3, 2024

It could also skip the install if it is already installed -- it seems to attempt the download even when the rpm is already installed.

anirudh-wa commented on June 3, 2024

Has there been a proper fix for this issue? I see this very often now and don't know what to do :(

anirudh-wa commented on June 3, 2024

I tried with retries set to 5. It still fails, and I get the following message:

msg: Task failed as maximum retries was encountered

geerlingguy commented on June 3, 2024

@anirudh-wa - you may be able to add in a failed_when: false for the task so it effectively never fails.

anirudh-wa commented on June 3, 2024

@geerlingguy I added the following and I still see the same error.

when: ansible_distribution == 'CentOS'
register: result
until: '"failed" not in result'
retries: 5
delay: 10

Is there anything I am missing?

geerlingguy commented on June 3, 2024

Try with the last line added as well (the 'failed_when' line):

when: ansible_distribution == 'CentOS'
register: result
until: '"failed" not in result'
retries: 5
delay: 10
failed_when: false

anirudh-wa commented on June 3, 2024

Thanks. I just added it. I hope it works. Will let you guys know how it went.

geerlingguy commented on June 3, 2024

fingers crossed

anirudh-wa commented on June 3, 2024

It still fails for me :(

geerlingguy commented on June 3, 2024

@anirudh-wa - Maybe you have an old version of the role? Can you delete the geerlingguy.epel role entirely, and reinstall it through galaxy? It could be the URL is from an old version of the role and outdated.

Also, please run the playbook with -vvvv for maximum output, and paste the failed task output here so we can look into it further.

I haven't been having any issues for a while now.

anirudh-wa commented on June 3, 2024

I downloaded the latest source last night and tested with it. I still get that error sometimes.

I ran it with -vvv and just got this message.

"FAILED - RETRYING: TASK: Install packages with YUM (4 retries left). Result was: {u'msg': u'Package at http://xxx.abcd.rpm could not be installed', u'failed': True, u'changed': False, u'rc': 1}"

I stopped it there because it was taking too much time.

solomongifford commented on June 3, 2024

@anirudh-wa - Have you tried actually running the yum commands that the ansible role is running? Is it failing ever? Always? Sometimes? It appears from this conversation that the ansible role isn't at fault - the original "fix" I proposed above simply retries the yum task multiple times so that if it's a timeout issue it will eventually (hopefully) succeed. If you're still seeing it succeed sometimes but fail other times, you could simply up the number of retries from 5 to 50 - but if it's failing 100% of the time, then the retry won't fix the underlying issue.

---
- name: Install EPEL repo.
  yum:
    name: "{{ epel_repo_url }}"
    state: present
  when: ansible_distribution == 'CentOS'
  register: result
  until: '"failed" not in result'
  retries: 5
  delay: 10
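
To answer the "ever, always, or sometimes" question outside Ansible, one option is to repeat a command by hand and count how often it fails. The following is only a rough sketch: the function name count_failures is made up for illustration, and the curl invocation in the usage comment is an assumption about how one might probe the package URL from the error message, not what the yum module does internally.

```shell
#!/bin/sh
# count_failures N CMD...: run CMD N times and print how many runs failed.
# (Hypothetical helper, not part of the role.)
count_failures() {
    attempts=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Discard the command's own output; only its exit status matters here.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# Hypothetical usage (assumes curl is installed): request only the headers
# of the EPEL package URL ten times and print the number of failed attempts.
# count_failures 10 curl -fsIL http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
```

A result of 0 or 10 out of 10 would point away from an intermittent network problem, while something in between would fit the "unreliable server" theory.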

anirudh-wa commented on June 3, 2024

It is failing sometimes. I also tried running the yum commands that the ansible role runs, and they work fine. If they had failed, my package installation would always fail, which is not the case. I feel that waiting for 5 retries is already too long. No one is going to wait for 50 retries. That would be the end of the day for ansible if people had to wait that long.

solomongifford commented on June 3, 2024

Fair enough. I've come to the conclusion that either the ansible yum module is timing out prematurely - or - the epel repo is unreliable. I've not dug deeper - but in any case, I don't think that the issue is this role. The bandaid I suggest just masks the underlying issue.

anirudh-wa commented on June 3, 2024

Yeah. I hope this is resolved soon. I am using a workaround, which I hate: I relaunch my EC2 instance and try to install the RPM again, because you never know how long the retries are going to take.

anirudh-wa commented on June 3, 2024

Thanks a lot guys for all the help. You were quick to respond :)

solomongifford commented on June 3, 2024

Relaunching the entire ec2-instance on fail is essentially the same conceptual bandaid as doing the retry, only with a LOT more impact.

Anyway, no problem.

anirudh-wa commented on June 3, 2024

I need help with one more thing, though I don't know if I can ask here. I am running the ansible-playbook command from a shell script. I just wanted to know if ansible-playbook returns anything after it finishes. I would like to capture that true/false status in a shell variable (if it returns one) and act accordingly.

geerlingguy commented on June 3, 2024

You can test this pretty quickly by doing the following on your local machine:

  1. Create a test.yml playbook with the following contents:

    - hosts: localhost
      connection: local

      tasks:
        - command: "false"

  2. Run ansible-playbook test.yml

  3. After the playbook fails, run echo $? (gives status of previous command).

  4. Switch the false to true, and do steps 2-3 again.

Judging by that output, ansible-playbook returns an exit code of 2 when the playbook fails, and 0 on success.
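
To capture that status in a shell script, a minimal sketch might look like the following. The function name run_and_report and the printed messages are made up for illustration; the only thing it relies on is the behaviour described above, namely that the command exits with 0 on success and a non-zero code on failure.

```shell
#!/bin/sh
# run_and_report CMD...: run CMD, print a short summary, and pass the
# exit code back to the caller. (Hypothetical helper for illustration.)
run_and_report() {
    rc=0
    # "$@" is the full command line, e.g. ansible-playbook test.yml
    "$@" || rc=$?
    if [ "$rc" -eq 0 ]; then
        echo "playbook succeeded"
    else
        echo "playbook failed with exit code $rc"
    fi
    return "$rc"
}

# Hypothetical usage (assumes ansible-playbook is on PATH and test.yml exists):
# if run_and_report ansible-playbook test.yml; then
#     echo "continue with the next step"
# else
#     echo "stop here or clean up"
# fi
```

Writing the call as `"$@" || rc=$?` keeps the exit code in a variable even if the calling script runs with set -e.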
