Comments (12)

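buggithubs commented on August 27, 2024

Hi, according to the error message, the node_exporter service failed to start.

  • Check whether deploy_user is root
  • On the target host, look at the {{ deploy_dir }}/log/node_export.log log
  • On the target host, try starting the service manually as deploy_user
  • Check whether the network between the hosts has latency or packet loss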

from docs-cn.

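qinix commented on August 27, 2024

@buggithubs

1. deploy_user is root
2. The corresponding log file does not exist on the target host
3. When I run node_exporter manually, the port it listens on is reachable
4. All hosts are on an internal network; there is no latency or packet loss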
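
For anyone repeating the reachability test in item 3, here is a minimal sketch of a TCP port check. The helper name port_is_open is hypothetical, and the host and port in the example are assumptions (9100 is node_exporter's usual default port; use whatever your inventory configures).

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


if __name__ == "__main__":
    # Assumed values for illustration only; replace with your own host and port.
    print(port_is_open("127.0.0.1", 9100))
```

A successful connection only shows that something is listening on the port; it does not confirm that the process was started by the deployment scripts.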

qinix commented on August 27, 2024

I managed to get a log out. The error is:

/mnt/vdb1/deploy/scripts/run_node_exporter.sh: line 3: ulimit: open files: cannot modify limit: Operation not permitted

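buggithubs commented on August 27, 2024

@qinix

  1. Starting the service as root is not recommended; the startup script exits if it detects that it is being run as root
  2. The log says the script has no permission to run ulimit -n 1000000

  1. Start the service as an ordinary (non-root) user, and check file and directory ownership and permissions
  2. Check that user's ulimit configuration; the open files limit must be greater than 40960.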

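qinix commented on August 27, 2024

@buggithubs

  1. deploy_user is root, but become_user is tidb, which is what your documentation recommends
  2. The ansible installation already modified /etc/security/limits.conf automatically and raised the ulimit limits. I also raised every limit by hand and still get the same error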

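buggithubs commented on August 27, 2024

@qinix

  1. Check that the file ownership matches the user that starts the service: are the files owned by tidb?
  2. In the tidb user's environment, run ulimit -Hn and check whether it reports 1000000. If it does, comment out line 3 of run_node_exporter.sh and start the service manually again.
  3. What you are hitting is a system-level issue; please check the relevant settings based on the error message.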
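
To illustrate the ulimit -Hn check in item 2, here is a minimal sketch using Python's standard resource module (Unix only). It relies on the general rule that an unprivileged process may change its open files limit only up to the current hard limit, so a request above it fails with "Operation not permitted". The helper name unprivileged_can_set_nofile is hypothetical and not part of tidb-ansible.

```python
import resource


def unprivileged_can_set_nofile(target: int, hard_limit: int) -> bool:
    """Return True if a non-root process could set its open files limit to target.

    A non-root process cannot raise the hard limit, so any target above the
    current hard limit is rejected by the kernel.
    """
    if hard_limit == resource.RLIM_INFINITY:
        return True
    return target <= hard_limit


if __name__ == "__main__":
    # Limits of the current process, as (soft, hard).
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"soft={soft} hard={hard}")
    print("ulimit -n 1000000 would be allowed:",
          unprivileged_can_set_nofile(1_000_000, hard))
```

Run it as the tidb user in the same way the service is started. Settings in /etc/security/limits.conf are applied when a login session is opened, so a process launched by another route may see different limits than an interactive shell does.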

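qinix commented on August 27, 2024

@buggithubs

  1. File permissions are fine.
  2. I tried rebooting the system and it still fails, which is strange.
  3. Running ulimit -a as the tidb user shows that the settings in /etc/security/limits.conf have taken effect.
  4. I have now deployed successfully by commenting out the ulimit command in the script.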

an520184 commented on August 27, 2024

fatal: [tidb-node-3]: FAILED! => {"changed": false, "elapsed": 300, "failed": true, "msg": "Timeout when waiting for search string 200 OK in tidb-node-3:2379"}
fatal: [tidb-node-2]: FAILED! => {"changed": false, "elapsed": 300, "failed": true, "msg": "Timeout when waiting for search string 200 OK in tidb-node-2:2379"}
fatal: [tidb-node-1]: FAILED! => {"changed": false, "elapsed": 300, "failed": true, "msg": "Timeout when waiting for search string 200 OK in tidb-node-1:2379"}

I ran into the same problem and also commented out the ulimit command in the script,
but PD never comes up.
tidb-node-1 : ok=21 changed=3 unreachable=0 failed=1
tidb-node-2 : ok=18 changed=2 unreachable=0 failed=1
tidb-node-3 : ok=18 changed=2 unreachable=0 failed=1
tidb-node-4 : ok=20 changed=3 unreachable=0 failed=0
tidb-node-5 : ok=20 changed=3 unreachable=0 failed=0
tidb-node-6 : ok=20 changed=3 unreachable=0 failed=0
tidb-node-7 : ok=20 changed=3 unreachable=0 failed=0

The PD log contains this line:
run server failed: expected IP in URL for binding (http://tidb-node-1:2380)

I don't know what is causing this.

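buggithubs commented on August 27, 2024

@an520184

  1. Using hostnames in place of IP addresses in inventory.ini is not recommended.
  2. Has static resolution for the hostnames been set up on every host?
  3. Judging from the current logs, startup fails because the hosts cannot connect to each other.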
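
The PD error quoted above ("expected IP in URL for binding") complains that the bind URL contains a hostname rather than an IP address. Here is a minimal sketch of that kind of check. It is an illustration, not PD's actual code, and the helper name url_host_is_ip and the sample IP address are hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def url_host_is_ip(url: str) -> bool:
    """Return True if the host part of url is an IPv4 or IPv6 literal."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        # The host is a name such as tidb-node-1, not an IP address.
        return False


if __name__ == "__main__":
    print(url_host_is_ip("http://tidb-node-1:2380"))  # hostname from the log above
    print(url_host_is_ip("http://10.0.0.1:2380"))     # assumed example address
```

Running a check like this over the values in inventory.ini before deploying would flag hostname entries early.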

an520184 commented on August 27, 2024

@buggithubs
The SSH port on my machines is not 22, so I added the corresponding host mappings in ~/.ssh/config
and also added entries in /etc/hosts. If I don't use hostnames in place of IPs, how do I specify the SSH port?

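buggithubs commented on August 27, 2024

@an520184

  1. You can set the remote SSH port in ansible.cfg under the tidb-ansible directory; see the line #remote_port = 22
  2. Static resolution means every host must have entries for all of the hostnames, not just an entry for itself.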
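
As a rough way to verify item 2, here is a minimal sketch that reads the text of a hosts file and reports which required hostnames have no entry. The helper name missing_hosts and the sample addresses are hypothetical; the hostnames are the ones from the play recap above.

```python
from typing import List


def missing_hosts(hosts_text: str, required: List[str]) -> List[str]:
    """Return the names in required that have no entry in hosts_text."""
    known = set()
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # The first field is the address; the rest are names and aliases.
        known.update(fields[1:])
    return [name for name in required if name not in known]


if __name__ == "__main__":
    with open("/etc/hosts") as f:
        text = f.read()
    nodes = [f"tidb-node-{i}" for i in range(1, 8)]
    print(missing_hosts(text, nodes))
```

Run it on every host in the cluster; an empty list on each one means all hostnames have static entries there.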

an520184 commented on August 27, 2024

@buggithubs
OK, I'll try again with IPs. Thanks for the reply.
