Comments (14)

kahkhang commented on August 16, 2024

Can I have more information about how this error occurred? Which step did this happen at? What OS are you using to run this script? How did you manage to get these logs? Thanks!

c835722 commented on August 16, 2024

Sure.

I'm running this script on the following operating system and version:
ProductName: Mac OS X
ProductVersion: 10.12.6
BuildVersion: 16G29

The job gets up to the step:
⠴ [3551780] Provisioning master node (might take a while)

The logs came from Lish, via an SSH shell connection to the CoreOS master. N.B. the worker never got instantiated.

Expanding on the log snippet above, the following may also be significant (SSH host keys removed).

localhost login: [   58.181279] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
[   58.188999] Bridge firewalling registered
[   58.211967] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
[   58.346774] Initializing XFRM netlink socket


This is localhost (Linux x86_64 4.12.7-coreos) 10:42:15
eth0: 45.33.63.159 2600:3c01::f03c:91ff:fe92:9cab

localhost login: [   58.459666] IPv6: ADDRCONF(NETDEV_UP): docker0: link is not ready


This is localhost (Linux x86_64 4.12.7-coreos) 10:42:16
eth0: 45.33.63.159 2600:3c01::f03c:91ff:fe92:9cab

localhost login:

This is localhost (Linux x86_64 4.12.7-coreos) 10:42:21
eth0: 45.33.63.159 2600:3c01::f03c:91ff:fe92:9cab

localhost login: [   66.667355] SELinux: mount invalid.  Same superblock, different security settings for (dev mqueue, type mqueue)
[   66.735262] SELinux: mount invalid.  Same superblock, different security settings for (dev mqueue, type mqueue)
[   66.762174] SELinux: mount invalid.  Same superblock, different security settings for (dev mqueue, type mqueue)
[   66.784263] SELinux: mount invalid.  Same superblock, different security settings for (dev mqueue, type mqueue)

Let me know what other debugging information might assist, or whether this is user error on my part. :-)

Cheers.

kahkhang commented on August 16, 2024

Hmm, it seems like this might be an error associated with Lish. Do you mean that the master node failed to provision and keeps reinstalling? Would it be possible for you to run mktemp in a terminal, cd to that directory, and then, as [3551780] Provisioning master node (might take a while) appears, tail the latest file in the temp directory? The output of the running command is piped to that file, though I should make a more user-friendly way of getting these logs when I find time :)
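Something like this, roughly (the exact temp file names will differ):

tmpfile=$(mktemp)            # prints a path under your per-user temp directory
cd "$(dirname "$tmpfile")"   # the script's log files should appear in this same directory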

kahkhang commented on August 16, 2024

Just to add: if you do an ssh core@IP, where IP is the IP of your Linode, you should be able to SSH into your CoreOS instance :) You can also run bootstrap.sh manually, which is what this step does. Sorry I can't be more specific, as I'm unable to replicate this issue, but please feel free to ask if you need further help :)
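Roughly (assuming bootstrap.sh has already been copied into the core user's home directory by the script):

ssh core@45.33.63.159   # substitute your master node's IP
./bootstrap.sh          # re-runs the provisioning step by hand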

c835722 commented on August 16, 2024

If I run

./kube-linode.sh > /var/folders/qd/53s75cq93057_4tb4nvy84x80000gn/T/tmp.srQ6E3z0

then all I can see at the tail of the file is the spinner character in its various forms.

The CoreOS Linode does appear in the web console.

I can SSH in far enough to see its host signature.

Where do I find the initial root password?

kahkhang commented on August 16, 2024

Ah, sorry, what I meant is that when the Provisioning master node (might take a while) message appears, a file will be generated in the /var/folders/qd/53s75cq93057_4tb4nvy84x80000gn/T/ directory containing the logs of that particular command. You can just look at the latest file generated in that directory and tail it for the logs. The core user is in the sudoers list and so should have root access. If you are referring to the root password of the install partition, it's randomly generated, so you might have to add an echo command in create_install_disk under linode-utilities.sh to view it. Does the master node still fail to provision? Does it get stuck at that command and then reinstall again?
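To follow the most recent log file there, something along these lines should work:

cd /var/folders/qd/53s75cq93057_4tb4nvy84x80000gn/T/
tail -f "$(ls -t | head -n 1)"   # follow the newest file in the temp directory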

c835722 commented on August 16, 2024

I am able to SSH into the CoreOS instance (visible in the web console) as the core user.
The worker instance appears not to get instantiated (for me).
I will try to surface more debugging information later in the day.
Thank you for your assistance in these matters.
I can see from the amount of effort you have put in here that you have created a very useful k8s implementation, which is why I am keen to get this issue sorted and get off the ground :)

kahkhang commented on August 16, 2024

Ah okay, so the master node did get instantiated? If you run kubectl get nodes, do you see any information? Regarding the worker node, 1 GB of RAM seems too little for a node; if you increase it to 2 GB (plan 2) and run the script again, does it complete?
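A quick way to check, assuming your ~/.kube/config now points at the new cluster:

kubectl cluster-info        # should print the API server endpoint
kubectl get nodes -o wide   # the master should be listed here once it registers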

kahkhang commented on August 16, 2024

You're welcome! I haven't had the time to thoroughly test this, so thanks so much for helping me iron out these issues!

c835722 commented on August 16, 2024

~/.usr/local/products/kubernetes/linode/kube-linode   master  kubectl get nodes                                                                   Mon 21 Aug 12:21:17 2017
The connection to the server 172.104.13.41:6443 was refused - did you specify the right host or port?
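Some basic reachability checks against that endpoint (using the master IP from the error above) would be:

nc -vz 172.104.13.41 6443                    # is anything listening on the apiserver port?
curl -k https://172.104.13.41:6443/healthz   # any HTTP response (even 401/403) means the apiserver is reachable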

When you are trying to reproduce the issue, are you using the following settings?

MASTER_PLAN=2
WORKER_PLAN=1
NO_OF_WORKERS=1

kahkhang commented on August 16, 2024

Unfortunately, I haven't been able to reproduce this issue with the settings you provided. Could you tail the logs in the manner I've described above? You should get something like this:

/var/folders/lv/vxlglhv50gn7znp3z1r_ft240000gn/T $ tail -f tmp.KLqNhAa3
Writing asset: /home/core/assets/tls/etcd-client.crt
Writing asset: /home/core/assets/manifests/etcd-peer-tls.yaml
Writing asset: /home/core/assets/manifests/etcd-server-tls.yaml
Writing asset: /home/core/assets/manifests/etcd-client-tls.yaml
Writing asset: /home/core/assets/auth/kubeconfig
Writing asset: /home/core/assets/manifests/kube-apiserver-secret.yaml
Writing asset: /home/core/assets/manifests/kube-controller-manager-secret.yaml
Starting temporary bootstrap control plane...
Waiting for api-server...
W0821 02:50:38.663502    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:50:43.662912    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:50:48.661615    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:50:53.661564    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:50:58.661661    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:51:03.661501    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:51:08.673669    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:51:13.661709    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:51:18.661548    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:51:23.661459    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:51:28.661671    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:51:33.661662    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:51:38.661491    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:51:43.661499    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:51:48.661573    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:51:53.661518    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:51:58.661510    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:52:03.661680    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:52:08.661632    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:52:13.661648    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:52:18.666760    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:52:23.661691    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:52:28.661627    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:52:33.661624    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:52:38.661656    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:52:43.661550    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:52:48.661578    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:52:53.661363    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:52:58.661616    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:53:03.661625    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:53:08.661667    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:53:13.661662    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 0
W0821 02:53:18.793107    1122 create.go:31] Unable to determine api-server readiness: API Server http status: 500
Creating self-hosted assets...
I0821 02:53:23.825501    1122 log.go:19] secret "etcd-client-tls" created
	created         etcd-client-tls secret
I0821 02:53:23.905127    1122 log.go:19] deployment "etcd-operator" created
	created           etcd-operator deployment
I0821 02:53:23.979532    1122 log.go:19] secret "etcd-peer-tls" created
	created           etcd-peer-tls secret
I0821 02:53:24.046840    1122 log.go:19] secret "etcd-server-tls" created
	created         etcd-server-tls secret
I0821 02:53:24.111702    1122 log.go:19] service "etcd-service" created
	created            etcd-service service
I0821 02:53:24.186263    1122 log.go:19] secret "kube-apiserver" created
	created          kube-apiserver secret
I0821 02:53:24.241166    1122 log.go:19] daemonset "kube-apiserver" created
	created          kube-apiserver daemonset
I0821 02:53:24.256999    1122 log.go:19] poddisruptionbudget "kube-controller-manager" created
	created kube-controller-manager poddisruptionbudget
I0821 02:53:24.327549    1122 log.go:19] secret "kube-controller-manager" created
	created kube-controller-manager secret
I0821 02:53:24.375573    1122 log.go:19] deployment "kube-controller-manager" created
	created kube-controller-manager deployment
I0821 02:53:24.433138    1122 log.go:19] deployment "kube-dns" created
	created                kube-dns deployment
I0821 02:53:24.503136    1122 log.go:19] service "kube-dns" created
	created                kube-dns service
I0821 02:53:24.554288    1122 log.go:19] daemonset "kube-etcd-network-checkpointer" created
	created kube-etcd-network-checkpointer daemonset
I0821 02:53:24.628999    1122 log.go:19] configmap "kube-flannel-cfg" created
	created        kube-flannel-cfg configmap
I0821 02:53:24.686609    1122 log.go:19] daemonset "kube-flannel" created
	created            kube-flannel daemonset
I0821 02:53:24.734972    1122 log.go:19] daemonset "kube-proxy" created
	created              kube-proxy daemonset
I0821 02:53:24.745900    1122 log.go:19] poddisruptionbudget "kube-scheduler" created
	created          kube-scheduler poddisruptionbudget
I0821 02:53:24.796120    1122 log.go:19] deployment "kube-scheduler" created
	created          kube-scheduler deployment
I0821 02:53:24.816849    1122 log.go:19] clusterrolebinding "system:default-sa" created
	created       system:default-sa clusterrolebinding
I0821 02:53:24.922822    1122 log.go:19] daemonset "pod-checkpointer" created
	created        pod-checkpointer daemonset
	Pod Status:           etcd-operator	Pending
	Pod Status:                kube-dns	Pending
	Pod Status:        pod-checkpointer	Pending
	Pod Status:          kube-apiserver	Pending
	Pod Status:          kube-scheduler	Pending
	Pod Status: kube-controller-manager	Pending
	Pod Status:        pod-checkpointer	Pending
	Pod Status:          kube-apiserver	Running
	Pod Status:          kube-scheduler	Pending
	Pod Status: kube-controller-manager	Pending
	Pod Status:           etcd-operator	Pending
	Pod Status:                kube-dns	Pending
	Pod Status:        pod-checkpointer	Running
	Pod Status:          kube-apiserver	Running
	Pod Status:          kube-scheduler	Pending
	Pod Status: kube-controller-manager	Pending
	Pod Status:           etcd-operator	Pending
	Pod Status:                kube-dns	Pending
	Pod Status:        pod-checkpointer	Pending
	Pod Status:          kube-apiserver	Running
	Pod Status:          kube-scheduler	Pending
	Pod Status: kube-controller-manager	Pending
	Pod Status:           etcd-operator	Pending
	Pod Status:                kube-dns	Pending
	Pod Status:        pod-checkpointer	Running
	Pod Status:          kube-apiserver	Running
	Pod Status:          kube-scheduler	Pending
	Pod Status: kube-controller-manager	Pending
	Pod Status:           etcd-operator	Pending
	Pod Status:                kube-dns	Pending
	Pod Status:          kube-apiserver	Running
	Pod Status:          kube-scheduler	Running
	Pod Status: kube-controller-manager	Running
	Pod Status:           etcd-operator	Pending
	Pod Status:                kube-dns	Pending
	Pod Status:        pod-checkpointer	Running
	Pod Status:           etcd-operator	Running
	Pod Status:                kube-dns	Pending
	Pod Status:        pod-checkpointer	Running
	Pod Status:          kube-apiserver	Running
	Pod Status:          kube-scheduler	Running
	Pod Status: kube-controller-manager	Running
	Pod Status:                kube-dns	Running
	Pod Status:        pod-checkpointer	Running
	Pod Status:          kube-apiserver	Running
	Pod Status:          kube-scheduler	Running
	Pod Status: kube-controller-manager	Running
	Pod Status:           etcd-operator	Running
All self-hosted control plane components successfully started
Migrating to self-hosted etcd cluster...
I0821 02:55:04.926015    1122 log.go:19] warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
I0821 02:55:19.929521    1122 migrate.go:65] created etcd cluster TPR
I0821 02:55:24.999739    1122 migrate.go:76] etcd-service IP is 10.3.0.15
I0821 02:55:25.004748    1122 migrate.go:81] created etcd cluster for migration
I0821 02:55:35.007952    1122 migrate.go:86] etcd cluster for migration is now running
I0821 02:55:40.022566    1122 migrate.go:215] still waiting for boot-etcd to be deleted...
I0821 02:55:45.021494    1122 migrate.go:215] still waiting for boot-etcd to be deleted...
I0821 02:55:50.024298    1122 migrate.go:215] still waiting for boot-etcd to be deleted...
I0821 02:55:55.027517    1122 migrate.go:215] still waiting for boot-etcd to be deleted...
I0821 02:56:00.030381    1122 migrate.go:215] still waiting for boot-etcd to be deleted...
I0821 02:56:05.022997    1122 migrate.go:215] still waiting for boot-etcd to be deleted...
I0821 02:56:10.021504    1122 migrate.go:215] still waiting for boot-etcd to be deleted...
I0821 02:56:15.021660    1122 migrate.go:215] still waiting for boot-etcd to be deleted...
I0821 02:56:20.022081    1122 migrate.go:215] still waiting for boot-etcd to be deleted...
I0821 02:56:25.022520    1122 migrate.go:215] still waiting for boot-etcd to be deleted...
I0821 02:56:30.022191    1122 migrate.go:215] still waiting for boot-etcd to be deleted...
I0821 02:56:35.022433    1122 migrate.go:91] removed boot-etcd from the etcd cluster
Tearing down temporary bootstrap control plane...
secret "kubesecret" created
namespace "monitoring" created
secret "kubesecret" created
deployment "heapster" created
service "heapster" created
serviceaccount "heapster" created
clusterrolebinding "heapster" created
replicationcontroller "kubernetes-dashboard-v1.6.2" created
ingress "dashboard-ingress" created
service "kubernetes-dashboard" created
service "traefik" created
service "traefik-console" created
configmap "traefik-conf" created
deployment "traefik-ingress-controller" created
provisioned master

If the logs terminate halfway, that means some step failed. I suspect it might be a timeout issue, as I did not introduce a timeout mechanism, but it works fine for me in the US with the Fremont datacenter. Thanks for your patience!

Related tickets to streamline debugging and prevent timeout errors: #38, #19.
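Purely as an illustration of what such a timeout mechanism could look like (not the script's actual code; $MASTER_IP is a placeholder):

# poll the apiserver for up to ~10 minutes instead of waiting indefinitely
for i in $(seq 1 60); do
  curl -ksf "https://$MASTER_IP:6443/healthz" >/dev/null 2>&1 && break
  sleep 10
done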

c835722 commented on August 16, 2024

Yes, it would seem it's a timing issue rather than an outright bug.
I just ran the install again and was able to get both hosts to provision. 😄
Unfortunately, the DNS configuration for the domain did not get created.
I have reset all hosts and will retry again from the start.
Also note that my local ~/.kube/config appears to have lost my other kube contexts (minikube and gce). Is it backed up anywhere by the kube-linode script?

c835722 commented on August 16, 2024

OK. My execution time to create the 2 nodes was 47 minutes.
Both were created successfully. 👏 👏 👏
It looks like 2 of the pods (prometheus and alertmanager) failed to come up due to insufficient pod memory (roughly 120 MB more seems to be needed on a 1 GB worker), and CPU utilisation also looks a little high on the worker.
So some fine tuning is required, as on any project.
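The scheduling failures should be visible with something along these lines (exact pod names will differ):

kubectl --namespace monitoring get pods
kubectl --namespace monitoring describe pod <prometheus-pod-name>   # Events should show the "Insufficient memory" reason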

I commend you on your work on this project. There is much to like here. Well done. I particularly like the k8s "service mesh" you are pulling together at the front of the kube. These pre-configured utility services were what attracted me to check out this project. 👍

Given #38 and #19 (yes, the debugging was a little challenging!), I'll close this item. Cheers.

kahkhang commented on August 16, 2024

You're welcome! I'm glad it worked out! Unfortunately, ~/.kube/config is backed up only once, to ~/.kube/config.bak, when the script runs, so if you've run it multiple times it might have already been overwritten :( I'll change the behavior to add a timestamp to the backup so this doesn't occur again.
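Something along the lines of, for each run:

cp ~/.kube/config ~/.kube/config.bak.$(date +%Y%m%d%H%M%S)   # keep a timestamped copy per run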
