Comments (10)

Homura2333 commented on September 25, 2024

The log shows:
image
It looks fine.

from graph-learn.

Homura2333 commented on September 25, 2024

By the way, running

export DgsServiceIP=$(kubectl get ingress --namespace default dgs-u2i-frontend-ingress --output jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo $DgsServiceIP

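also fails to print the service IP; the output is empty.
Running kubectl get ingress dgs-u2i-frontend-ingress returns the following:
image
The ADDRESS field is empty. Why is that? Could it be because I have not configured the nginx controller correctly?
Running kubectl get svc returns the following:
image
Can anyone help me?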
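
For anyone who hits the same empty output: an empty ADDRESS usually means nothing has assigned an address to the ingress yet, and a missing or misconfigured ingress controller is one common cause. Below is a minimal sketch of a guard around the lookup above, so that a script stops instead of exporting an empty variable. The function name require_ingress_ip is hypothetical and not part of DGS.

```shell
#!/bin/sh
# Hypothetical helper (not part of DGS): print the address and succeed
# only when it is non-empty; otherwise explain and return non-zero.
require_ingress_ip() {
  if [ -z "$1" ]; then
    echo "ingress has no ADDRESS yet; is an ingress controller running?" >&2
    return 1
  fi
  printf '%s\n' "$1"
}

# Usage, reusing the lookup from the comment above:
#   ip=$(kubectl get ingress --namespace default dgs-u2i-frontend-ingress \
#        --output jsonpath='{.status.loadBalancer.ingress[0].ip}')
#   DgsServiceIP=$(require_ingress_ip "$ip") && export DgsServiceIP
```

If the guard fails, kubectl describe ingress dgs-u2i-frontend-ingress and kubectl get ingressclass are reasonable next places to look.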

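goldenleaves commented on September 25, 2024

Could you provide more detailed logs from the serving worker? The log you posted so far only shows that the serving worker registered with the coordinator; it does not show the worker reaching the ready state. The service port only becomes accessible once the serving worker is ready. @Homura2333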
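
As a rough first check from the Kubernetes side, the sketch below lists pods whose READY column is not n/n. It assumes the chart defines readiness probes that reflect the worker's ready state, which this thread does not confirm; the function name not_ready_pods is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pods --no-headers` output on stdin
# and print the name of every pod whose READY column (field 2) is not n/n.
not_ready_pods() {
  awk '{ split($2, ready, "/"); if (ready[1] != ready[2]) print $1 }'
}

# Usage against a live cluster:
#   kubectl get pods --no-headers | not_ready_pods
```

Any pod it prints can then be inspected with kubectl describe pod and kubectl logs.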

Homura2333 commented on September 25, 2024

Even the more detailed log shows only this:
image
Is there any other way to get more useful logs?

Homura2333 commented on September 25, 2024

Inside the container, the log under /serving_workdir/package/bin is as follows:
image

goldenleaves commented on September 25, 2024

@Homura2333 It looks like the serving worker only registered with the coordinator but never received the init info, so it never actually started. Could you post the coordinator's logs as well?

Homura2333 commented on September 25, 2024

image

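goldenleaves commented on September 25, 2024

@Homura2333 Judging from the coordinator's logs, the serving worker's startup flow is normal, but nothing appears after init, which suggests the serving worker got stuck or hit an error during the init stage. I suggest checking whether the configured ports or the k8s storage are conflicting, then deleting this release and bringing it up again.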

KaisennHu commented on September 25, 2024

I am also setting up DGS. After running the command below, all DGS-related pods go into the Error state.
The command was:
helm install dgs-u2i DGS/dgs --set frontend.ingressHostName="dynamic-graph-service.info" --set-file graphSchema=./conf/u2i/schema.u2i.json --set kafka.dl2spl.servers=[localhost:9092] --set kafka.dl2spl.topic="record-batches" --set kafka.dl2spl.partitions=4 --set kafka.spl2srv.servers=[localhost:9092] --set kafka.spl2srv.topic="sample-batches" --set kafka.spl2srv.partitions=4 --set glog.toConsole=true
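The pod status is as follows:
image
The pod logs show that the expected directories, such as /coordinator_workdir, /serving_workdir and /sampling_workdir, do not exist:
image
How should I resolve this? Could the container image be at fault?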

goldenleaves commented on September 25, 2024

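@Homura2333, it looks like the containers in the pods failed to download the package they need to run. From within your k8s cluster, you can run wget https://graphlearn.oss-cn-hangzhou.aliyuncs.com/package/dgs-built-1.0.0.tgz to confirm whether this package can be downloaded, or investigate the network further.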
