
ceph-helm's Introduction

Helm Charts

Use this repository to submit official Charts for Kubernetes Helm. Charts are curated application definitions for Helm. For more information about installing and using Helm, see its README.md. For a quick introduction to Charts, see the chart documentation.

How do I install these charts?

Just run helm install stable/<chart>. The stable repository is Helm's default repository, located at https://kubernetes-charts.storage.googleapis.com/, and is configured by default.

For more information on using Helm, refer to Helm's documentation.

How do I enable the Incubator repository?

To add the Incubator charts for your local client, run helm repo add:

$ helm repo add incubator https://kubernetes-charts-incubator.storage.googleapis.com/
"incubator" has been added to your repositories

You can then run helm search incubator to see the charts.

Chart Format

Take a look at the alpine example chart and the nginx example chart for reference when you're writing your first few charts.

Before contributing a Chart, become familiar with the format. Note that the project is still under active development and the format may still evolve a bit.
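
For orientation, a minimal chart looks roughly like the following. This is only a sketch (the name mychart is a placeholder); the alpine and nginx example charts above are the authoritative references.

    mychart/
      Chart.yaml       # chart metadata
      values.yaml      # default configuration values
      templates/       # Kubernetes manifest templates rendered against values.yaml

    # Chart.yaml
    name: mychart
    version: 0.1.0
    description: One-line description of the application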

Repository Structure

This GitHub repository contains the source for the packaged and versioned charts released in the gs://kubernetes-charts Google Storage bucket (the Chart Repository).

The Charts in the stable/ directory in the master branch of this repository match the latest packaged Chart in the Chart Repository, though there may be previous versions of a Chart available in that Chart Repository.

The purpose of this repository is to provide a place for maintaining and contributing official Charts, with CI processes in place for managing the releasing of Charts into the Chart Repository.

The Charts in this repository are organized into two folders:

  • stable
  • incubator

Stable Charts meet the criteria in the technical requirements.

Incubator Charts are those that do not meet these criteria. Having the incubator folder allows charts to be shared and improved on until they are ready to be moved into the stable folder. The charts in the incubator/ directory can be found in the gs://kubernetes-charts-incubator Google Storage Bucket.

To promote a Chart from incubator to stable, Chart maintainers should open a pull request that moves the chart folder.

Contributing a Chart

We'd love for you to contribute a Chart that provides a useful application or service for Kubernetes. Please read our Contribution Guide for more information on how you can contribute Charts.

Note: We use the same workflow, License and Contributor License Agreement as the main Kubernetes repository.

Review Process

The following outlines the review procedure used by the Chart repository maintainers. GitHub labels are used to indicate state changes during the review process.

  • AWAITING REVIEW - Initial triage state indicating that the PR is ready for review by the maintainers team. The CLA must be signed and e2e tests must pass in order to move to this state.
  • CHANGES NEEDED - Review completed by at least one maintainer and changes are needed from the contributor (made explicit even when using GitHub's review feature).
  • CODE REVIEWED - The chart structure has been reviewed and found satisfactory given the technical requirements (may happen in parallel to UX REVIEWED).
  • UX REVIEWED - The chart installation UX has been reviewed and found satisfactory (may happen in parallel to CODE REVIEWED).
  • LGTM - Added ONLY once both CODE REVIEWED and UX REVIEWED are present. The merge must be handled by someone OTHER than the maintainer who added the LGTM label. This label indicates that, given a quick pass of the comments, the change is ready to merge.

Stale Pull Requests

After initial review feedback, if no updates have been made to the pull request for 1 week, the stale label will be added. If after another week there are still no updates it will be closed. Please re-open if/when you have made the proper adjustments.

Status of the Project

This project is still under active development, so you might run into issues. If you do, please don't be shy about letting us know, or better yet, contribute a fix or feature.

ceph-helm's People

Contributors

a-robinson, amandacameron, andresbono, edsiper, electroma, flah00, foxish, gtaylor, h0tbird, jackzampolin, jainishshah17, jotadrilo, kevinschumacher, kfox1111, lachie83, linki, mgoodness, migmartri, nitisht, prydonius, rimusz, rootfs, scottrigby, sebgoa, sstarcher, technosophos, tompizmor, unguiculus, viglesiasce, yuvipanda

ceph-helm's Issues

ceph-helm for single node deployment

Is this a request for help?: yes

I am looking to use the ceph-helm chart to deploy on a single node kubernetes cluster. I have followed this tutorial and successfully got ceph running without kubernetes on a single node. Now I want to get it running within kubernetes using ceph-helm.

To clarify my understanding: I can basically follow the tutorial in the ceph documentation for kubernetes + helm. The only differences are:

  1. the single node must be labeled with both
    • ceph-monitor: ceph-mon=enabled ceph-mgr=enabled
    • ceph-osd: ceph-osd=enabled ceph-osd-device-dev-sdb=enabled ceph-osd-device-dev-sdc=enabled ...
  2. I must set osd pool default size = 2 and osd crush chooseleaf type = 0 in ceph-overrides.yaml

How would I complete step 2? Please correct me if step 1 is not correct.
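
For reference, a hedged sketch of both steps follows. The node and device names are examples, and the [global] options are the plain ceph.conf equivalents of the step 2 settings; how they map into ceph-overrides.yaml depends on the chart's override keys, so check the chart's values.yaml rather than treating this as the chart's actual interface.

    # Step 1: label the single node for both roles
    kubectl label node <node-name> ceph-mon=enabled ceph-mgr=enabled
    kubectl label node <node-name> ceph-osd=enabled ceph-osd-device-dev-sdb=enabled ceph-osd-device-dev-sdc=enabled

    # Step 2: ceph.conf equivalents of the requested settings
    [global]
    osd pool default size = 2
    osd crush chooseleaf type = 0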

Secrets generate error

Error from server (BadRequest): error when creating "STDIN": Secret in version "v1" cannot be handled as a Secret: v1.Secret: ObjectMeta: v1.ObjectMeta: TypeMeta: Kind: Data: decode base64: illegal base64 data at input byte 156, parsing 202 ...2QiCgo=\n"... at {"apiVersion":"v1","data":
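
A hedged note on this error: the trailing \n inside the quoted data suggests a newline embedded in a base64-encoded value, which is the usual cause of "illegal base64 data". When producing secret material by hand, encoding without wrapping or a trailing newline avoids it (KEYRING_CONTENT and the secret name are placeholders):

    # Encode without a trailing newline and without line wrapping
    echo -n "$KEYRING_CONTENT" | base64 -w0

    # Inspect the generated secret's data fields for stray newlines
    kubectl -n ceph get secret <secret-name> -o yaml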

Unable to mount volumes : timeout expired waiting for volumes to attach/mount

Is this a request for help?: Yes


Is this a BUG REPORT or FEATURE REQUEST? Bug report

Version of Helm and Kubernetes:

kubectl version
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.3", GitCommit:"2bba0127d85d5a46ab4b778548be28623b32d0b0", GitTreeState:"clean", BuildDate:"2018-05-21T09:17:39Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"} 
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.7", GitCommit:"dd5e1a2978fd0b97d9b78e1564398aeea7e7fe92", GitTreeState:"clean", BuildDate:"2018-04-18T23:58:35Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"} 
helm version                                                                                                                                     root@kubernetes
Client: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}

Which chart: ceph-helm

What happened:

Unable to mount volumes for pod "mypod_default(e68c8e3e-6578-11e8-87c4-e83935e84dc8)": timeout expired waiting for volumes to attach/mount for pod "default"/"mypod". list of unattached/unmounted volumes=[vol1]

How to reproduce it (as minimally and precisely as possible):
http://docs.ceph.com/docs/master/start/kube-helm/

Anything else we need to know:

The ceph cluster is working fine

  ceph -s
  cluster:
    id:     88596d9e-b478-47a9-8208-3a6cea33d1d4
    health: HEALTH_OK
 
  services:
    mon: 1 daemons, quorum kubernetes
    mgr: kubernetes(active)
    mds: cephfs-1/1/1 up  {0=mds-ceph-mds-5696f9df5d-jbsgz=up:active}
    osd: 1 osds: 1 up, 1 in
    rgw: 1 daemon active
 
  data:
    pools:   7 pools, 176 pgs
    objects: 213 objects, 3391 bytes
    usage:   108 MB used, 27134 MB / 27243 MB avail
    pgs:     176 active+clean

Everything in the ceph namespace works fine.
In the mon pod I can see an image created for the PVC:

rbd ls
kubernetes-dynamic-pvc-0077fdf9-6578-11e8-b1f8-b63c3e9e1eaa
kubectl get pvc                                                                                                                                  root@kubernetes
NAME                  STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
ceph-pvc              Bound     pvc-c9d07cf9-6578-11e8-87c4-e83935e84dc8   1Gi        RWO            ceph-rbd       29m

I have changed resolv.conf and added kube-dns as a nameserver; I can resolve
ceph-mon.ceph and ceph-mon.ceph.svc.local from the host node.

Some kubelet logs that I found related:
juin 01 11:24:19 kubernetes kubelet[32612]: E0601 11:24:19.587800 32612 nestedpendingoperations.go:263] Operation for "\"kubernetes.io/rbd/[ceph-mon.ceph.svc.cluster.local:6789]:kubernetes-dynamic-pvc-0077fdf9-6578-11e8-b1f8-b63c3e9e1eaa\"" failed. No retries permitted until 2018-06-01 11:24:51.582365588 +0200 CEST m=+162261.330642194 (durationBeforeRetry 32s). Error: "MountVolume.WaitForAttach failed for volume \"pvc-004d66b7-6578-11e8-87c4-e83935e84dc8\" (UniqueName: \"kubernetes.io/rbd/[ceph-mon.ceph.svc.cluster.local:6789]:kubernetes-dynamic-pvc-0077fdf9-6578-11e8-b1f8-b63c3e9e1eaa\") pod \"ldap-ss-0\" (UID: \"f63432e0-6579-11e8-87c4-e83935e84dc8\") : error: exit status 1, rbd output: 2018-06-01 11:19:19.513914 7f1cf1f227c0 -1 did not load config file, using default settings.\n2018-06-01 11:19:19.579955 7f1cf1f20700 0 -- IP@:0/1002573 >> IP@:6789/0 pipe(0x3a2a3f0 sd=3 :53578 s=1 pgs=0 cs=0 l=1 c=0x3a2e6e0).connect protocol feature mismatch, my 83ffffffffffff < peer 481dff8eea4fffb missing 400000000000000\n2018-06-01 11:19:19.580065 7f1cf1f20700 0 -- IP@:0/1002573 >> IP@:6789/0 pipe(0x3a2a3f0 sd=3 :53578 s=1 pgs=0 cs=0 l=1 c=0x3a2e6e0).fault\n2018-06-01 11:19:19.580437 7f1cf1f20700 0 -- IP@:0/1002573 >> 10.1.0.146:6789/0 pipe(0x3a2a3f0 sd=3 :53580 s=1 pgs=0 cs=0 l=1 c=0x3a2e6e0).connect protocol feature mismatch, my 83ffffffffffff < peer 481dff8eea4fffb missing 400000000000000\n2018-06-01 11:19:19.781427 7f1cf1f20700 0 -- 10.1.0.146:0/1002573 >> 10.1.0.146:6789/0 pipe(0x3a2a3f0 sd=3 :53584 s=1 pgs=0 cs=0 l=1 c=0x3a2e6e0).**connect protocol feature mismatch**, my 83ffffffffffff < peer 481dff8eea4fffb missing 400000000000000\n2018-06-01 11:19:20.182401 7f1cf1f20700 0 -- 10.1.0.146:0/1002573 >> 10.1.0.146:6789/0 pipe(0x3a2a3f0 sd=3 :53588 s=1 pgs=0 cs=0 l=1 c=0x3a2e6e0).**connect protocol feature mismatch**, my 83ffffffffffff < peer 481dff8eea4fffb missing 400000000000000\n2018-06-01 11:19:20.983428 7f1cf1f20700 0 -- IP@:0/1002573 >> ip@:6789/0 pipe(0x3a2a3f0 sd=3 :53610 s=1 pgs=0 cs=0 l=1 c=0x3a2e6e0).conne

I don't know why it tries to connect to my Kubernetes node's external IP on port 6789; that port is only exposed by the ceph-mon headless service, which is:

kubectl get svc -n ceph                                                                                                                    root@kubernetes
NAME       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
ceph-mon   ClusterIP   None            <none>        6789/TCP   1h

From the kubernetes node I can telnet to the port 6789

telnet ceph-mon.ceph 6789                                                                                                                  root@kubernetes 
Trying IP@ ... 
Connected to ceph-mon.ceph. 

The "connect protocol feature mismatch" in the kubelet logs could have something to do with this note in the ceph-helm doc:

Important: Kubernetes uses the RBD kernel module to map RBDs to hosts. Luminous requires CRUSH_TUNABLES 5 (Jewel). The minimal kernel version for these tunables is 4.5. If your kernel does not support these tunables, run ceph osd crush tunables hammer.
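
If the node kernel really is too old for the Luminous tunables, the workaround quoted above can be applied from inside the running mon pod. This is only a sketch; the pod and container names are placeholders:

    kubectl -n ceph exec -it <ceph-mon-pod> -c <container> -- ceph osd crush tunables hammer
    kubectl -n ceph exec -it <ceph-mon-pod> -c <container> -- ceph osd crush show-tunables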

OSD init container fails on minikube

Is this a request for help?:
Yes

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
Bug?

Version of Helm and Kubernetes:

Client Version: version.Info{Major:"1", Minor:"7", GitVersion:"v1.7.5", GitCommit:"17d7182a7ccbb167074be7a87f0a68bd00d58d97", GitTreeState:"clean", BuildDate:"2017-08-31T09:14:02Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.0", GitCommit:"0b9efaeb34a2fc51ff8e4d34ad9bc6375459c4a4", GitTreeState:"clean", BuildDate:"2017-11-29T22:43:34Z", GoVersion:"go1.9.1", Compiler:"gc", Platform:"linux/amd64"}

Which chart:

ceph

What happened:

The OSD container fails on this line:

timeout 10 ceph ${CLI_OPTS} --name client.bootstrap-osd --keyring $OSD_BOOTSTRAP_KEYRING health || exit 1

What you expected to happen:

Success

How to reproduce it (as minimally and precisely as possible):

  • Start minikube in virtualbox mode. Attach a disk which becomes /dev/sdb. Deploy ceph-helm.

Anything else we need to know:

MON is also failing, but it appears to be failing due to a lack of storage.

$ kubectl logs -n ceph ceph-osd-sdb-jhfd8 -c osd-prepare-pod
+ export LC_ALL=C
+ LC_ALL=C
+ source variables_entrypoint.sh
++ ALL_SCENARIOS='osd osd_directory osd_directory_single osd_ceph_disk osd_ceph_disk_prepare osd_ceph_disk_activate osd_ceph_activate_journal mgr'
++ : ceph
++ : ceph-config/ceph
++ :
++ : osd_ceph_disk_prepare
++ : 1
++ : minikube
++ : minikube
++ : /etc/ceph/monmap-ceph
++ : /var/lib/ceph/mon/ceph-minikube
++ : 0
++ : 0
++ : mds-minikube
++ : 0
++ : 100
++ : 0
++ : 0
+++ uuidgen
++ : 9c0528e0-da2b-4801-a920-4e476b4c6a69
+++ uuidgen
++ : 19e4efd2-405f-4c70-b9fd-0c18d937fbdb
++ : root=default host=minikube
++ : 0
++ : cephfs
++ : cephfs_data
++ : 8
++ : cephfs_metadata
++ : 8
++ : minikube
++ :
++ :
++ : 8080
++ : 0
++ : 9000
++ : 0.0.0.0
++ : cephnfs
++ : minikube
++ : 0.0.0.0
++ CLI_OPTS='--cluster ceph'
++ DAEMON_OPTS='--cluster ceph --setuser ceph --setgroup ceph -d'
++ MOUNT_OPTS='-t xfs -o noatime,inode64'
++ MDS_KEYRING=/var/lib/ceph/mds/ceph-mds-minikube/keyring
++ ADMIN_KEYRING=/etc/ceph/ceph.client.admin.keyring
++ MON_KEYRING=/etc/ceph/ceph.mon.keyring
++ RGW_KEYRING=/var/lib/ceph/radosgw/minikube/keyring
++ MGR_KEYRING=/var/lib/ceph/mgr/ceph-minikube/keyring
++ MDS_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-mds/ceph.keyring
++ RGW_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-rgw/ceph.keyring
++ OSD_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-osd/ceph.keyring
++ OSD_PATH_BASE=/var/lib/ceph/osd/ceph
+ source common_functions.sh
++ set -ex
+ is_available rpm
+ command -v rpm
+ is_available dpkg
+ command -v dpkg
+ OS_VENDOR=ubuntu
+ source /etc/default/ceph
++ TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728
+ case "$CEPH_DAEMON" in
+ OSD_TYPE=prepare
+ start_osd
+ [[ ! -e /etc/ceph/ceph.conf ]]
+ '[' 1 -eq 1 ']'
+ [[ ! -e /etc/ceph/ceph.client.admin.keyring ]]
+ case "$OSD_TYPE" in
+ source osd_disk_prepare.sh
++ set -ex
+ osd_disk_prepare
+ [[ -z /dev/sdb ]]
+ [[ ! -e /dev/sdb ]]
+ '[' '!' -e /var/lib/ceph/bootstrap-osd/ceph.keyring ']'
+ timeout 10 ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring health
+ exit 1
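
The trace stops at the timeout 10 ceph ... health check, which usually means the prepare container cannot reach the mon with the bootstrap-osd keyring within 10 seconds. If the container is still running, re-running the same check without the timeout shows the real error; the pod and container names below are taken from this log and may differ in other environments:

    kubectl -n ceph exec ceph-osd-sdb-jhfd8 -c osd-prepare-pod -- \
      ceph --cluster ceph --name client.bootstrap-osd \
           --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring health

    # and confirm the mon service actually has endpoints to connect to
    kubectl -n ceph get endpoints ceph-mon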

unable to delete helm ceph namespace

Is this a request for help?:
yes

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT

Version of Helm and Kubernetes:

[root@admin-node ~]# helm version
Client: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}

[root@admin-node ~]# kubectl version
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.0", GitCommit:"91e7b4fd31fcd3d5f436da26c980becec37ceefe", GitTreeState:"clean", BuildDate:"2018-06-27T20:17:28Z", GoVersion:"go1.10.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.0", GitCommit:"91e7b4fd31fcd3d5f436da26c980becec37ceefe", GitTreeState:"clean", BuildDate:"2018-06-27T20:08:34Z", GoVersion:"go1.10.2", Compiler:"gc", Platform:"linux/amd64"}
[root@admin-node ~]#

Which chart:
Ceph

What happened:
Unable to delete the ceph release from Helm after a few objects went missing in helm status:

[root@admin-node ~]# helm list
NAME    REVISION        UPDATED                         STATUS          CHART           NAMESPACE
ceph    1               Sat Jul  7 02:46:40 2018        DEPLOYED        ceph-0.1.0      ceph
[root@admin-node ~]#
[root@admin-node ~]# helm status ceph
LAST DEPLOYED: Sat Jul  7 02:46:40 2018
NAMESPACE: ceph
STATUS: DEPLOYED

RESOURCES:
==> v1/StorageClass
NAME      PROVISIONER   AGE
ceph-rbd  ceph.com/rbd  5d

==> MISSING
KIND           NAME
secrets        ceph-keystone-user-rgw
configmaps     ceph-bin-clients
configmaps     ceph-bin
configmaps     ceph-etc
configmaps     ceph-templates
services       ceph-mon
services       ceph-rgw
daemonsets     ceph-mon
daemonsets     ceph-osd-dev-vdc
deployments    ceph-mds
deployments    ceph-mgr
deployments    ceph-mon-check
deployments    ceph-rbd-provisioner
deployments    ceph-rgw
jobs           ceph-rgw-keyring-generator
jobs           ceph-osd-keyring-generator
jobs           ceph-mgr-keyring-generator
jobs           ceph-mon-keyring-generator
jobs           ceph-mds-keyring-generator
jobs           ceph-namespace-client-key-generator
jobs           ceph-storage-keys-generator
[root@admin-node ~]# helm delete ceph --purge --debug
[debug] Created tunnel using local port: '42271'

[debug] SERVER: "127.0.0.1:42271"

Error: namespaces "ceph" is forbidden: User "system:serviceaccount:kube-system:default" cannot get namespaces in the namespace "ceph"
[root@admin-node ~]#

There are no resources at the kubernetes level. I think they got deleted when I first ran helm delete ceph.

[root@admin-node ~]# kubectl get all --namespace ceph
No resources found.
[root@admin-node ~]#

Also tried removing helm tiller and doing re-init

kubectl delete deployment -n=kube-system tiller-deploy
helm init --upgrade
kubectl get po -n kube-system
helm delete ceph --purge --debug

What you expected to happen:
helm delete --purge should delete the helm namespace

How to reproduce it (as minimally and precisely as possible):

  • Deploy ceph using helm chart
  • helm delete ceph
  • helm delete ceph --purge

Anything else we need to know:
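
A hedged note on the error above: "User "system:serviceaccount:kube-system:default" cannot get namespaces" indicates that Tiller is running under the default service account without the RBAC permissions it needs. A common Helm v2 setup gives Tiller its own service account; adjust the binding to your cluster's policy rather than treating cluster-admin as required:

    kubectl -n kube-system create serviceaccount tiller
    kubectl create clusterrolebinding tiller-cluster-admin \
      --clusterrole=cluster-admin \
      --serviceaccount=kube-system:tiller
    helm init --upgrade --service-account tiller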

No secrets found in helm-toolkit during "make" and ceph-mon pod is going in "CrashLoopBackOff"

Is this a request for help?:
Yes

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT

Version of Helm and Kubernetes:

Kubernetes v1.13.0
Client Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.0-dirty", GitCommit:"ddf47ac13c1a9483ea035a79cd7c10005ff21a6d", GitTreeState:"dirty", BuildDate:"2019-01-31T06:07:25Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"13+", GitVersion:"v1.13.0-dirty", GitCommit:"ddf47ac13c1a9483ea035a79cd7c10005ff21a6d", GitTreeState:"dirty", BuildDate:"2019-01-31T06:07:25Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"}
Helm
Client: &version.Version{SemVer:"v2.12.3", GitCommit:"eecf22f77df5f65c823aacd2dbd30ae6c65f186e", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.12.3", GitCommit:"eecf22f77df5f65c823aacd2dbd30ae6c65f186e", GitTreeState:"clean"}

What happened:
I am trying to install Ceph using helm charts in a k8s cluster. I followed this document http://docs.ceph.com/docs/master/start/kube-helm/ and am facing these 2 major issues:

  1. If we run "make" it shows no secrets found in helm-toolkit

  2. After the install step, i.e.

     helm install --name=ceph local/ceph --namespace=ceph -f  ceph-overrides.yaml
    

the ceph-mon pod goes into the "CrashLoopBackOff" state:

    NAMESPACE          NAME                                        READY   STATUS             RESTARTS   AGE
   ceph               ceph-mds-85b4fbb478-26sw8                   0/1     Pending            0          4h56m
   ceph               ceph-mds-keyring-generator-w6xqz            0/1     Completed          0          4h56m
   ceph               ceph-mgr-588577d89f-rrd84                   0/1     Init:0/2           0          4h56m
   ceph               ceph-mgr-keyring-generator-sg75h            0/1     Completed          0          4h56m
   ceph               ceph-mon-82rtj                              2/3     CrashLoopBackOff   57         4h56m
   ceph               ceph-mon-check-549b886885-x4m7d             0/1     Init:0/2           0                 4h56m
   ceph               ceph-mon-keyring-generator-d5txp            0/1     Completed          0          4h56m
   ceph               ceph-namespace-client-key-generator-rqd2m   0/1     Completed          0          4h56m
   ceph               ceph-osd-dev-sdb-9fpd9                      0/1     Init:0/3           0          4h56m
   ceph               ceph-osd-keyring-generator-m44l4            0/1     Completed          0          4h56m
   ceph               ceph-rbd-provisioner-5cf47cf8d5-gwfnj       1/1     Running            0          4h56m
   ceph               ceph-rbd-provisioner-5cf47cf8d5-s9vvg       1/1     Running            0          4h56m
   ceph               ceph-rgw-7b9677854f-9tdwt                   0/1     Pending            0          4h56m
   ceph               ceph-rgw-keyring-generator-chm89            0/1     Completed          0          4h56m
   ceph               ceph-storage-keys-generator-sqwb2           0/1     Completed          0          4h56m
   kube-system        kube-dns-8f7866879-28pq7                    3/3     Running            0          6h2m
   kube-system        tiller-deploy-dbb85cb99-68xmk               1/1     Running            0          104m

What you expected to happen:

We want the ceph-mon pod in a running state so that we can create secrets and keyrings; without ceph-mon running, we can't create them. Please let me know if I am missing anything.

Anything else we need to know:
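
A hedged first debugging pass for a mon in CrashLoopBackOff is to confirm the keyring secrets exist and read the crashing container's previous log. The pod name is taken from the listing above; container names can be read from kubectl describe:

    kubectl -n ceph get jobs
    kubectl -n ceph get secrets
    kubectl -n ceph describe pod ceph-mon-82rtj
    kubectl -n ceph logs ceph-mon-82rtj -c <container> --previous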

Bad Header Magic on two of three OSD nodes

Is this a request for help?: YES


Is this a BUG REPORT? (choose one):BUG REPORT

Version of Helm and Kubernetes:

Client: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}
chrisp@px-chrisp1:~/APICv2018DevInstall$ kubectl version
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.1", GitCommit:"b1b29978270dc22fecc592ac55d903350454310a", GitTreeState:"clean", BuildDate:"2018-07-17T18:53:20Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.1", GitCommit:"b1b29978270dc22fecc592ac55d903350454310a", GitTreeState:"clean", BuildDate:"2018-07-17T18:43:26Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}

Which chart:
helm-ceph

What happened:
When deploying with three OSD nodes, two of the nodes (and it changes each time) fail to start with a "bad header magic" error. See the log below.

ceph-osd-dev-sdb-6mjwq                      0/1       Running     1          2m
ceph-osd-dev-sdb-9rgmd                      1/1       Running     0          2m
ceph-osd-dev-sdb-nrv8v                      0/1       Running     1          2m
chrisp@px-chrisp1:~/$ kubectl logs -nceph po/ceph-osd-dev-sdb-nrv8v  osd-activate-pod
+ export LC_ALL=C
+ LC_ALL=C
+ source variables_entrypoint.sh
++ ALL_SCENARIOS='osd osd_directory osd_directory_single osd_ceph_disk osd_ceph_disk_prepare osd_ceph_disk_activate osd_ceph_activate_journal mgr'
++ : ceph
++ : ceph-config/ceph
++ :
++ : osd_ceph_disk_activate
++ : 1
++ : px-chrisp3
++ : px-chrisp3
++ : /etc/ceph/monmap-ceph
++ : /var/lib/ceph/mon/ceph-px-chrisp3
++ : 0
++ : 0
++ : mds-px-chrisp3
++ : 0
++ : 100
++ : 0
++ : 0
+++ uuidgen
++ : 57ea3535-932a-410f-bf05-f6386e6f9b54
+++ uuidgen
++ : 472f01d3-5053-4ab4-9aef-9435fc48c484
++ : root=default host=px-chrisp3
++ : 0
++ : cephfs
++ : cephfs_data
++ : 8
++ : cephfs_metadata
++ : 8
++ : px-chrisp3
++ :
++ :
++ : 8080
++ : 0
++ : 9000
++ : 0.0.0.0
++ : cephnfs
++ : px-chrisp3
++ : 0.0.0.0
++ CLI_OPTS='--cluster ceph'
++ DAEMON_OPTS='--cluster ceph --setuser ceph --setgroup ceph -d'
++ MOUNT_OPTS='-t xfs -o noatime,inode64'
++ MDS_KEYRING=/var/lib/ceph/mds/ceph-mds-px-chrisp3/keyring
++ ADMIN_KEYRING=/etc/ceph/ceph.client.admin.keyring
++ MON_KEYRING=/etc/ceph/ceph.mon.keyring
++ RGW_KEYRING=/var/lib/ceph/radosgw/px-chrisp3/keyring
++ MGR_KEYRING=/var/lib/ceph/mgr/ceph-px-chrisp3/keyring
++ MDS_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-mds/ceph.keyring
++ RGW_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-rgw/ceph.keyring
++ OSD_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-osd/ceph.keyring
++ OSD_PATH_BASE=/var/lib/ceph/osd/ceph
+ source common_functions.sh
++ set -ex
+ is_available rpm
+ command -v rpm
+ is_available dpkg
+ command -v dpkg
+ OS_VENDOR=ubuntu
+ source /etc/default/ceph
++ TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728
+ case "$CEPH_DAEMON" in
+ OSD_TYPE=activate
+ start_osd
+ [[ ! -e /etc/ceph/ceph.conf ]]
+ '[' 1 -eq 1 ']'
+ [[ ! -e /etc/ceph/ceph.client.admin.keyring ]]
+ case "$OSD_TYPE" in
+ source osd_disk_activate.sh
++ set -ex
+ osd_activate
+ [[ -z /dev/sdb ]]
+ CEPH_DISK_OPTIONS=
+ CEPH_OSD_OPTIONS=
++ blkid -o value -s PARTUUID /dev/sdb1
+ DATA_UUID=5ada2967-155e-4208-86c4-21e7edfae0f1
++ blkid -o value -s PARTUUID /dev/sdb3
++ true
+ LOCKBOX_UUID=
++ dev_part /dev/sdb 2
++ local osd_device=/dev/sdb
++ local osd_partition=2
++ [[ -L /dev/sdb ]]
++ [[ b == [0-9] ]]
++ echo /dev/sdb2
+ JOURNAL_PART=/dev/sdb2
++ readlink -f /dev/sdb
+ ACTUAL_OSD_DEVICE=/dev/sdb
+ udevadm settle --timeout=600
+ [[ -n '' ]]
++ dev_part /dev/sdb 1
++ local osd_device=/dev/sdb
++ local osd_partition=1
++ [[ -L /dev/sdb ]]
++ [[ b == [0-9] ]]
++ echo /dev/sdb1
+ wait_for_file /dev/sdb1
+ timeout 10 bash -c 'while [ ! -e /dev/sdb1 ]; do echo '\''Waiting for /dev/sdb1 to show up'\'' && sleep 1 ; done'
+ chown ceph. /dev/sdb2
+ chown ceph. /var/log/ceph
++ dev_part /dev/sdb 1
++ local osd_device=/dev/sdb
++ local osd_partition=1
++ [[ -L /dev/sdb ]]
++ [[ b == [0-9] ]]
++ echo /dev/sdb1
+ DATA_PART=/dev/sdb1
+ MOUNTED_PART=/dev/sdb1
+ [[ 0 -eq 1 ]]
+ ceph-disk -v --setuser ceph --setgroup disk activate --no-start-daemon /dev/sdb1
main_activate: path = /dev/sdb1
get_dm_uuid: get_dm_uuid /dev/sdb1 uuid path is /sys/dev/block/8:17/dm/uuid
command: Running command: /sbin/blkid -o udev -p /dev/sdb1
command: Running command: /sbin/blkid -p -s TYPE -o value -- /dev/sdb1
command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_mount_options_xfs
command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup osd_fs_mount_options_xfs
mount: Mounting /dev/sdb1 on /var/lib/ceph/tmp/mnt.BPKzys with options noatime,inode64
command_check_call: Running command: /bin/mount -t xfs -o noatime,inode64 -- /dev/sdb1 /var/lib/ceph/tmp/mnt.BPKzys
activate: Cluster uuid is 56d8e493-f75d-43b0-af75-b5e9ed708416
command: Running command: /usr/bin/ceph-osd --cluster=ceph --show-config-value=fsid
activate: Cluster name is ceph
activate: OSD uuid is 5ada2967-155e-4208-86c4-21e7edfae0f1
activate: OSD id is 1
command: Running command: /usr/bin/ceph-conf --cluster=ceph --name=osd. --lookup init
command: Running command: /usr/bin/ceph-detect-init --default sysvinit
activate: Marking with init system none
command: Running command: /bin/chown -R ceph:ceph /var/lib/ceph/tmp/mnt.BPKzys/none
activate: ceph osd.1 data dir is ready at /var/lib/ceph/tmp/mnt.BPKzys
move_mount: Moving mount to final location...
command_check_call: Running command: /bin/mount -o noatime,inode64 -- /dev/sdb1 /var/lib/ceph/osd/ceph-1
command_check_call: Running command: /bin/umount -l -- /var/lib/ceph/tmp/mnt.BPKzys
++ grep /dev/sdb1 /proc/mounts
++ awk '{print $2}'
++ grep -oh '[0-9]*'
+ OSD_ID=1
++ get_osd_path 1
++ echo /var/lib/ceph/osd/ceph-1/
+ OSD_PATH=/var/lib/ceph/osd/ceph-1/
+ OSD_KEYRING=/var/lib/ceph/osd/ceph-1//keyring
++ df -P -k /var/lib/ceph/osd/ceph-1/
++ tail -1
++ awk '{ d= $2/1073741824 ; r = sprintf("%.2f", d); print r }'
+ OSD_WEIGHT=0.09
+ ceph --cluster ceph --name=osd.1 --keyring=/var/lib/ceph/osd/ceph-1//keyring osd crush create-or-move -- 1 0.09 root=default host=px-chrisp3
create-or-move updated item name 'osd.1' weight 0.09 at location {host=px-chrisp3,root=default} to crush map
+ log SUCCESS
+ '[' -z SUCCESS ']'
++ date '+%F %T'
+ TIMESTAMP='2018-08-05 15:05:13'
+ echo '2018-08-05 15:05:13  /start_osd.sh: SUCCESS'
+ return 0
+ exec /usr/bin/ceph-osd --cluster ceph -f -i 1 --setuser ceph --setgroup disk
2018-08-05 15:05:13  /start_osd.sh: SUCCESS
starting osd.1 at - osd_data /var/lib/ceph/osd/ceph-1 /var/lib/ceph/osd/ceph-1/journal
2018-08-05 15:05:13.822997 7fa44b9aae00 -1 journal do_read_entry(323584): bad header magic
2018-08-05 15:05:13.823010 7fa44b9aae00 -1 journal do_read_entry(323584): bad header magic
2018-08-05 15:05:13.835202 7fa44b9aae00 -1 osd.1 10 log_to_monitors {default=true} 

I have a disk on /dev/sdb with no partitions on all nodes; I even remove the partitions on install to ensure they are not there. I also remove /var/lib/ceph-helm.

What you expected to happen:
I would expect all three pods to start.

How to reproduce it (as minimally and precisely as possible):
I followed these instructions https://github.com/helm/helm#docs
Samples from my make script
CleanUp

        ssh $(workerNode) sudo kubeadm reset --force 
        ssh $(workerNode) sudo rm -rf /var/lib/ceph-helm
        ssh $(workerNode) sudo rm -rf /var/kubernetes
        ssh $(workerNode) "( echo d ; echo 1 ; echo d ; echo w ) | sudo fdisk /dev/sdb"
        
        ssh $(workerNode2) sudo kubeadm reset --force  
        ssh $(workerNode2) sudo rm -rf   /var/lib/ceph-helm 
        ssh $(workerNode2) sudo rm -rf /var/kubernetes
        ssh $(workerNode2) "( echo d ; echo 1 ; echo d ; echo w ) | sudo fdisk /dev/sdb"
        ( echo d ; echo 1 ; echo d ; echo w ) | sudo fdisk /dev/sdb
        sudo rm -rf ~/.kube
        sudo rm -rf ~/.helm
        sudo rm -rf /var/kubernetes
        sudo rm -rf  /var/lib/ceph-helm

installCeph

        kubectl create namespace ceph
        $(MAKE) -C ceph-helm/ceph/
        kubectl create -f ceph-helm/ceph/rbac.yaml
        kubectl label node px-chrisp1 ceph-osd=enabled ceph-osd-device-dev-sdb=enabled ceph-mon=enabled ceph-mgr=enabled
        kubectl label node px-chrisp2 ceph-osd=enabled ceph-osd-device-dev-sdb=enabled ceph-mon=enabled ceph-mgr=enabled
        kubectl label node px-chrisp3 ceph-osd=enabled ceph-osd-device-dev-sdb=enabled ceph-mon=enabled ceph-mgr=enabled
        helm install --name=ceph local/ceph --namespace=ceph -f ~/ceph-overrides.yaml || helm upgrade ceph local/ceph -f ~/ceph-overrides.yaml --recreate-pods

Anything else we need to know:
K8s over three nodes with the master on one of the nodes. Trying to set up an HA K8s cluster (management will be done after ceph).

I have also tried with the latest images from docker hub to no avail.

Full k8s artifacts

NAME                                            READY     STATUS      RESTARTS   AGE
pod/ceph-mds-666578c5f5-plknd                   0/1       Pending     0          3m
pod/ceph-mds-keyring-generator-9qvq9            0/1       Completed   0          3m
pod/ceph-mgr-69c4b4d4bb-ptwv5                   1/1       Running     1          3m
pod/ceph-mgr-keyring-generator-bvqjm            0/1       Completed   0          3m
pod/ceph-mon-7lrrk                              3/3       Running     0          3m
pod/ceph-mon-check-59499b664d-c95nf             1/1       Running     0          3m
pod/ceph-mon-fk2qx                              3/3       Running     0          3m
pod/ceph-mon-h727g                              3/3       Running     0          3m
pod/ceph-mon-keyring-generator-stlsf            0/1       Completed   0          3m
pod/ceph-namespace-client-key-generator-hdqs8   0/1       Completed   0          3m
pod/ceph-osd-dev-sdb-pjw7l                      0/1       Running     1          3m
pod/ceph-osd-dev-sdb-rtgnb                      1/1       Running     0          3m
pod/ceph-osd-dev-sdb-vzbp5                      0/1       Running     1          3m
pod/ceph-osd-keyring-generator-jzj2x            0/1       Completed   0          3m
pod/ceph-rbd-provisioner-5bc57f5f64-2x5kp       1/1       Running     0          3m
pod/ceph-rbd-provisioner-5bc57f5f64-x45mz       1/1       Running     0          3m
pod/ceph-rgw-58c67497fb-sdvkp                   0/1       Pending     0          3m
pod/ceph-rgw-keyring-generator-5b4gr            0/1       Completed   0          3m
pod/ceph-storage-keys-generator-qvhzw           0/1       Completed   0          3m

NAME               TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)    AGE
service/ceph-mon   ClusterIP   None          <none>        6789/TCP   3m
service/ceph-rgw   ClusterIP   10.110.4.11   <none>        8088/TCP   3m

NAME                              DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR                                      AGE
daemonset.apps/ceph-mon           3         3         3         3            3           ceph-mon=enabled                                   3m
daemonset.apps/ceph-osd-dev-sdb   3         3         1         3            1           ceph-osd-device-dev-sdb=enabled,ceph-osd=enabled   3m

NAME                                   DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/ceph-mds               1         1         1            0           3m
deployment.apps/ceph-mgr               1         1         1            1           3m
deployment.apps/ceph-mon-check         1         1         1            1           3m
deployment.apps/ceph-rbd-provisioner   2         2         2            2           3m
deployment.apps/ceph-rgw               1         1         1            0           3m

NAME                                              DESIRED   CURRENT   READY     AGE
replicaset.apps/ceph-mds-666578c5f5               1         1         0         3m
replicaset.apps/ceph-mgr-69c4b4d4bb               1         1         1         3m
replicaset.apps/ceph-mon-check-59499b664d         1         1         1         3m
replicaset.apps/ceph-rbd-provisioner-5bc57f5f64   2         2         2         3m
replicaset.apps/ceph-rgw-58c67497fb               1         1         0         3m

NAME                                            DESIRED   SUCCESSFUL   AGE
job.batch/ceph-mds-keyring-generator            1         1            3m
job.batch/ceph-mgr-keyring-generator            1         1            3m
job.batch/ceph-mon-keyring-generator            1         1            3m
job.batch/ceph-namespace-client-key-generator   1         1            3m
job.batch/ceph-osd-keyring-generator            1         1            3m
job.batch/ceph-rgw-keyring-generator            1         1            3m
job.batch/ceph-storage-keys-generator           1         1            3m

Ceph helm not running

My env:
CentOS 7, Kubernetes 1.11.1

ceph-overrides.yaml

network:
  public: 192.168.105.0/24
  cluster: 192.168.105.0/24

osd_devices:
  - name: dev-sdb
    device: /dev/sdb
    zap: "1"

storageclass:
  name: ceph-rbd
  pool: rbd
  #user_id: admin
  user_id: k8s

The ceph/values.yaml settings I modified:

deployment:
  ceph: true
  storage_secrets: true
  client_secrets: true
  rbd_provisioner: true
  rgw_keystone_user_and_endpoints: false

images:
  ks_user: docker.io/kolla/centos-source-heat-engine:3.0.3
  ks_service: docker.io/kolla/centos-source-heat-engine:3.0.3
  ks_endpoints: docker.io/kolla/centos-source-heat-engine:3.0.3
  bootstrap: docker.io/ceph/daemon:v3.0.5-stable-3.0-luminous-centos-7
  dep_check: docker.io/kolla/centos-source-kubernetes-entrypoint:4.0.0
  daemon: docker.io/ceph/daemon:v3.0.5-stable-3.0-luminous-centos-7
  ceph_config_helper: docker.io/port/ceph-config-helper:v1.7.5
  rbd_provisioner: quay.io/external_storage/rbd-provisioner:v0.1.1
  minimal: docker.io/alpine:latest
  pull_policy: "IfNotPresent"

kubectl get pod -n ceph

NAME                                        READY     STATUS                  RESTARTS   AGE
ceph-mds-c5c856bb8-rw2vq                    0/1       Pending                 0          13m
ceph-mds-keyring-generator-llhcl            0/1       Completed               0          13m
ceph-mgr-566969ff9f-bhnsz                   0/1       CrashLoopBackOff        6          7m
ceph-mgr-keyring-generator-gplx2            0/1       Completed               0          13m
ceph-mon-check-9fd5797bc-nb5l6              1/1       Running                 0          11m
ceph-mon-fpd6w                              3/3       Running                 0          13m
ceph-mon-keyring-generator-kvsgc            0/1       Completed               0          13m
ceph-namespace-client-key-generator-fg9nv   0/1       Completed               0          7m
ceph-osd-dev-sdb-4qnd9                      0/1       Init:CrashLoopBackOff   6          13m
ceph-osd-dev-sdb-glk52                      0/1       Init:CrashLoopBackOff   6          13m
ceph-osd-keyring-generator-9ztc7            0/1       Completed               0          13m
ceph-rbd-provisioner-5bc57f5f64-pmnr6       1/1       Running                 0          13m
ceph-rbd-provisioner-5bc57f5f64-sllbc       1/1       Running                 0          13m
ceph-rgw-597dcb57f7-9nzrz                   0/1       Pending                 0          13m
ceph-rgw-keyring-generator-5j844            0/1       Completed               0          13m
ceph-storage-keys-generator-t22q7           0/1       Completed               0          13m

kubectl describe pod/ceph-mon-fpd6w -n ceph

Events:
  Type     Reason       Age                From           Message
  ----     ------       ----               ----           -------
  Warning  FailedMount  16m (x5 over 16m)  kubelet, lab1  MountVolume.SetUp failed for volume "ceph-mon-keyring" : secrets "ceph-mon-keyring" not found
  Warning  FailedMount  16m (x5 over 16m)  kubelet, lab1  MountVolume.SetUp failed for volume "ceph-bootstrap-mds-keyring" : secrets "ceph-bootstrap-mds-keyring" not found
  Warning  FailedMount  16m (x5 over 16m)  kubelet, lab1  MountVolume.SetUp failed for volume "ceph-bootstrap-rgw-keyring" : secrets "ceph-bootstrap-rgw-keyring" not found
  Warning  FailedMount  16m (x5 over 16m)  kubelet, lab1  MountVolume.SetUp failed for volume "ceph-bootstrap-osd-keyring" : secrets "ceph-bootstrap-osd-keyring" not found
  Warning  FailedMount  16m (x5 over 16m)  kubelet, lab1  MountVolume.SetUp failed for volume "ceph-client-admin-keyring" : secrets "ceph-client-admin-keyring" not found

But I can get the secrets:

# kubectl get secret -n ceph
NAME                                  TYPE                                  DATA      AGE
ceph-bootstrap-mds-keyring            Opaque                                1         16m
ceph-bootstrap-mgr-keyring            Opaque                                1         16m
ceph-bootstrap-osd-keyring            Opaque                                1         16m
ceph-bootstrap-rgw-keyring            Opaque                                1         16m
ceph-client-admin-keyring             Opaque                                1         16m
ceph-keystone-user-rgw                Opaque                                7         16m
ceph-mon-keyring                      Opaque                                1         16m
default-token-htx2q                   kubernetes.io/service-account-token   3         16m
pvc-ceph-client-key                   kubernetes.io/rbd                     1         10m
pvc-ceph-conf-combined-storageclass   kubernetes.io/rbd                     1         16m

kubectl logs -f ceph-mon-check-9fd5797bc-nb5l6 -n ceph

+ echo '2018-08-16 03:43:15  /watch_mon_health.sh: sleep 30 sec'
+ return 0
+ sleep 30
+ '[' true ']'
+ log 'checking for zombie mons'
+ '[' -z 'checking for zombie mons' ']'
++ date '+%F %T'
2018-08-16 03:43:45  /watch_mon_health.sh: checking for zombie mons
+ TIMESTAMP='2018-08-16 03:43:45'
+ echo '2018-08-16 03:43:45  /watch_mon_health.sh: checking for zombie mons'
+ return 0
+ CLUSTER=ceph
+ /check_zombie_mons.py
2018-08-16 03:43:46.122705 7fb2f3994700  0 librados: client.admin authentication error (1) Operation not permitted
[errno 1] error connecting to the cluster
Traceback (most recent call last):
  File "/check_zombie_mons.py", line 30, in <module>
    current_mons = extract_mons_from_monmap()
  File "/check_zombie_mons.py", line 18, in extract_mons_from_monmap
    monmap = subprocess.check_output(monmap_command, shell=True)
  File "/usr/lib64/python2.7/subprocess.py", line 575, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command 'ceph --cluster=${CLUSTER} mon getmap > /tmp/monmap && monmaptool -f /tmp/monmap --print' returned non-zero exit status 1
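
A hedged note: the client.admin authentication error from check_zombie_mons.py suggests the admin keyring mounted from the secret no longer matches the keys the monitor holds (for example, after redeploying over an existing mon store). One way to compare the two, with the pod name taken from the listing above and the container name left as a placeholder:

    # Admin keyring as stored in the secret (base64-encoded under .data)
    kubectl -n ceph get secret ceph-client-admin-keyring -o yaml

    # Keyring actually mounted into the mon pod
    kubectl -n ceph exec ceph-mon-fpd6w -c <container> -- cat /etc/ceph/ceph.client.admin.keyring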

MountVolume.SetUp failed for volume "ceph-bootstrap-osd-keyring" : secret "ceph-bootstrap-osd-keyring" not found

Is this a request for help?: yes


Is this a BUG REPORT or FEATURE REQUEST? (choose one):BUG REPORT

Version of Helm and Kubernetes:
kubeadm version 1.12.1
helm 2.9.1

Which chart:
I just followed http://docs.ceph.com/docs/master/start/kube-helm/
and would like to have an OSD on every node's /dev/sdb.

What happened:
tried many times; the OSD pod can't start. Events:

Type Reason Age From Message
Normal Scheduled 17m default-scheduler Successfully assigned ceph/ceph-osd-dev-sdb-4fd4c to pro-docker-2-64
Warning FailedMount 17m (x5 over 17m) kubelet, pro-docker-2-64 MountVolume.SetUp failed for volume "ceph-bootstrap-osd-keyring" : secret "ceph-bootstrap-osd-keyring" not found
Warning FailedMount 17m (x5 over 17m) kubelet, pro-docker-2-64 MountVolume.SetUp failed for volume "ceph-bootstrap-mds-keyring" : secret "ceph-bootstrap-mds-keyring" not found
Warning FailedMount 17m (x5 over 17m) kubelet, pro-docker-2-64 MountVolume.SetUp failed for volume "ceph-mon-keyring" : secret "ceph-mon-keyring" not found
Warning FailedMount 11m (x11 over 17m) kubelet, pro-docker-2-64 MountVolume.SetUp failed for volume "ceph-client-admin-keyring" : secret "ceph-client-admin-keyring" not found
Warning FailedMount 7m8s (x13 over 17m) kubelet, pro-docker-2-64 MountVolume.SetUp failed for volume "ceph-bootstrap-rgw-keyring" : secret "ceph-bootstrap-rgw-keyring" not found
Warning FailedMount 103s (x7 over 15m) kubelet, pro-docker-2-64 Unable to mount volumes for pod "ceph-osd-dev-sdb-4fd4c_ceph(ede3590e-d027-11e8-a389-005056b224f1)": timeout expired waiting for volumes to attach or mount for pod "ceph"/"ceph-osd-dev-sdb-4fd4c". list of unmounted volumes=[ceph-client-admin-keyring ceph-mon-keyring ceph-bootstrap-osd-keyring ceph-bootstrap-mds-keyring ceph-bootstrap-rgw-keyring]. list of unattached volumes=[devices pod-var-lib-ceph pod-run ceph-bin ceph-etc ceph-client-admin-keyring ceph-mon-keyring ceph-bootstrap-osd-keyring ceph-bootstrap-mds-keyring ceph-bootstrap-rgw-keyring run-udev default-token-k2x7h]

What you expected to happen:
I expect to run 'helm status ceph' and see ceph deployed and ready.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know:
I am using VMware to add the virtual disk /dev/sdb. The entire k8s master and node run on Ubuntu 18.04 with the latest updates.

Many thanks,
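
A hedged check: these keyring secrets are created by the chart's generator jobs, so verifying that the jobs completed and the secrets exist is usually the first step. The job name below appears elsewhere on this page and is used as an example:

    kubectl -n ceph get jobs
    kubectl -n ceph get secrets
    kubectl -n ceph logs job/ceph-osd-keyring-generator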

secrets must be read-only

Is this a request for help?: no

Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

Version of Helm and Kubernetes:

$ helm version
Client: &version.Version{SemVer:"v2.8+unreleased", GitCommit:"c5f2174f264554c62278c0695d58f250d3e207c8", GitTreeState:"clean"}
Server: &version.Version{SemVer:"canary+unreleased", GitCommit:"fe9d36533901b71923c49142f5cf007f93fa926f", GitTreeState:"clean"}

Kubernetes master > 1.9

Which chart: ceph

What happened:

I compiled k8s master from source (commit 04634cb19843195) and brought up a local cluster with:

RUNTIME_CONFIG=storage.k8s.io/v1alpha1=true ALLOW_PRIVILEGED=1 FEATURE_GATES="BlockVolume=true,MountPropagation=true,CSIPersistentVolume=true," hack/local-up-cluster.sh -O

Then I followed http://docs.ceph.com/docs/master/start/kube-helm/#configure-your-ceph-cluster to install the ceph chart.

In that installation, the start_mon.sh script in the ceph-mon pod fails with:

+ ceph-authtool /etc/ceph/ceph.mon.keyring --import-keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
importing contents of /var/lib/ceph/bootstrap-osd/ceph.keyring into /etc/ceph/ceph.mon.keyring
bufferlist::write_file(/etc/ceph/ceph.mon.keyring): failed to open file: (30) Read-only file system
could not write /etc/ceph/ceph.mon.keyring

What you expected to happen:

The script shouldn't write into a secret. The modification is not stored permanently in older Kubernetes releases and starting with 1.10, the default will be to mount secrets as read-only, even if "readonly: false" is used - see kubernetes/kubernetes#58720.

Anything else we need to know:

@intlabs said on Slack that he's going to fix this for openstack-helm/ceph. In the meantime one can use ReadOnlyAPIDataVolumes=false in FEATURE_GATES to restore the old behavior.

Here's a fix that worked for me. It's intentionally very minimal, perhaps the right solution also has to clean up the usage of secret in other pods:

diff --git a/ceph/ceph/templates/bin/_start_mon.sh.tpl b/ceph/ceph/templates/bin/_start_mon.sh.tpl
index 50e4bfd..5b3330c 100644
--- a/ceph/ceph/templates/bin/_start_mon.sh.tpl
+++ b/ceph/ceph/templates/bin/_start_mon.sh.tpl
@@ -62,8 +62,7 @@ chown ceph. /var/log/ceph
 # If we don't have a monitor keyring, this is a new monitor
 if [ ! -e "$MON_DATA_DIR/keyring" ]; then
   if [ ! -e $MON_KEYRING ]; then
-    log "ERROR- $MON_KEYRING must exist.  You can extract it from your current monitor by running 'ceph auth get mon. -o $MON_KEYRING' or use a KV Store"
-    exit 1
+    touch $MON_KEYRING
   fi
 
   if [ ! -e $MONMAP ]; then
diff --git a/ceph/ceph/templates/daemonset-mon.yaml b/ceph/ceph/templates/daemonset-mon.yaml
index 4b9c90d..3c26211 100644
--- a/ceph/ceph/templates/daemonset-mon.yaml
+++ b/ceph/ceph/templates/daemonset-mon.yaml
@@ -141,10 +141,6 @@ spec:
               mountPath: /etc/ceph/ceph.client.admin.keyring
               subPath: ceph.client.admin.keyring
               readOnly: true
-            - name: ceph-mon-keyring
-              mountPath: /etc/ceph/ceph.mon.keyring
-              subPath: ceph.mon.keyring
-              readOnly: false
             - name: ceph-bin
               mountPath: /variables_entrypoint.sh
               subPath: variables_entrypoint.sh
@@ -195,9 +191,6 @@ spec:
         - name: ceph-client-admin-keyring
           secret:
             secretName: {{ .Values.secrets.keyrings.admin }}
-        - name: ceph-mon-keyring
-          secret:
-            secretName: {{ .Values.secrets.keyrings.mon }}
         - name: ceph-bootstrap-osd-keyring
           secret:
             secretName: {{ .Values.secrets.keyrings.osd }}

the DNS pod cannot resolve the ceph-monitor name ceph-mon.ceph.svc.cluster.local

Is this a request for help?: yes


Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

The kubedns container in pod kube-dns-85bc874cc5-mdzhb always shows: "dns.go:555] Could not find endpoints for service "ceph-mon" in namespace "ceph". DNS records will be created once endpoints show up."

[root@master ceph]# helm install --name=ceph local/ceph --namespace=ceph
NAME: ceph
LAST DEPLOYED: Tue Jun 12 09:53:41 2018
NAMESPACE: ceph
STATUS: DEPLOYED

RESOURCES:
==> v1beta1/DaemonSet
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
ceph-mon 1 1 0 1 0 ceph-mon=enabled 1s
ceph-osd-dev-sda 1 1 0 1 0 ceph-osd-device-dev-sda=enabled,ceph-osd=enabled 1s

==> v1beta1/Deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
ceph-mds 1 1 1 0 1s
ceph-mgr 1 1 1 0 1s
ceph-mon-check 1 1 1 0 1s
ceph-rbd-provisioner 2 2 2 0 1s
ceph-rgw 1 1 1 0 1s

==> v1/Job
NAME DESIRED SUCCESSFUL AGE
ceph-mon-keyring-generator 1 0 1s
ceph-mds-keyring-generator 1 0 1s
ceph-osd-keyring-generator 1 0 1s
ceph-mgr-keyring-generator 1 0 1s
ceph-rgw-keyring-generator 1 0 1s
ceph-namespace-client-key-generator 1 0 1s
ceph-storage-keys-generator 1 0 1s

==> v1/Pod(related)
NAME READY STATUS RESTARTS AGE
ceph-mon-rsjkn 0/3 Init:0/2 0 1s
ceph-osd-dev-sda-jb8s7 0/1 Init:0/3 0 1s
ceph-mds-696bd98bdb-92tj2 0/1 Init:0/2 0 1s
ceph-mgr-56f45bb99c-pmpfm 0/1 Pending 0 1s
ceph-mon-check-74d98c5b95-k5xc5 0/1 Pending 0 1s
ceph-rbd-provisioner-b58659dc9-llllj 0/1 Pending 0 1s
ceph-rbd-provisioner-b58659dc9-rh4zd 0/1 ContainerCreating 0 1s
ceph-rgw-5bd9dd66c5-q5vzp 0/1 Pending 0 1s
ceph-mon-keyring-generator-nzg2l 0/1 Pending 0 1s
ceph-mds-keyring-generator-cr8ql 0/1 Pending 0 1s
ceph-osd-keyring-generator-z5jrq 0/1 Pending 0 1s
ceph-mgr-keyring-generator-kw2wj 0/1 Pending 0 1s
ceph-rgw-keyring-generator-6kghm 0/1 Pending 0 1s
ceph-namespace-client-key-generator-dk968 0/1 Pending 0 1s
ceph-storage-keys-generator-4mhhk 0/1 Pending 0 1s

==> v1/Secret
NAME TYPE DATA AGE
ceph-keystone-user-rgw Opaque 7 1s

==> v1/ConfigMap
NAME DATA AGE
ceph-bin-clients 2 1s
ceph-bin 26 1s
ceph-etc 1 1s
ceph-templates 5 1s

==> v1/StorageClass
NAME PROVISIONER AGE
ceph-rbd ceph.com/rbd 1s

==> v1/Service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ceph-mon ClusterIP None 6789/TCP 1s
ceph-rgw ClusterIP 10.109.46.173 8088/TCP 1s

[root@master ceph]# kubectl exec kube-dns-85bc874cc5-mdzhb -ti -n kube-system -c kubedns -- sh
/ # ps
PID USER TIME COMMAND
1 root 3:19 /kube-dns --domain=172.16.34.88. --dns-port=10053 --config-dir=/kube-dns-config --v=2
26 root 0:35 ping ceph-mon.ceph.svc.cluster.local
157 root 0:00 sh
161 root 0:00 sh
165 root 0:00 sh
/ #
/ # ping ceph-mon.ceph.svc.cluster.local
ping: bad address 'ceph-mon.ceph.svc.cluster.local'
/ #

[root@master ceph]# kubectl get pod --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
ceph ceph-mds-696bd98bdb-rvq42 0/1 CrashLoopBackOff 6 11m
ceph ceph-mds-keyring-generator-6nrct 0/1 Completed 0 11m
ceph ceph-mgr-56f45bb99c-smqqj 0/1 CrashLoopBackOff 6 11m
ceph ceph-mgr-keyring-generator-kdjd4 0/1 Completed 0 11m
ceph ceph-mon-check-74d98c5b95-nqhmg 1/1 Running 0 11m
ceph ceph-mon-keyring-generator-7xmd8 0/1 Completed 0 11m
ceph ceph-mon-m72hp 3/3 Running 0 11m
ceph ceph-namespace-client-key-generator-cvnpw 0/1 Completed 0 11m
ceph ceph-osd-dev-sda-kzn65 0/1 Init:CrashLoopBackOff 6 11m
ceph ceph-osd-keyring-generator-48gb6 0/1 Completed 0 11m
ceph ceph-rbd-provisioner-b58659dc9-7jsnk 1/1 Running 0 11m
ceph ceph-rbd-provisioner-b58659dc9-sf6hr 1/1 Running 0 11m
ceph ceph-rgw-5bd9dd66c5-n25bn 0/1 CrashLoopBackOff 6 11m
ceph ceph-rgw-keyring-generator-vs8th 0/1 Completed 0 11m
ceph ceph-storage-keys-generator-ww7hn 0/1 Completed 0 11m
default busybox 1/1 Running 113 4d
kube-system etcd-master 1/1 Running 8 24d
kube-system heapster-69b5d4974d-9g96p 1/1 Running 10 24d
kube-system kube-apiserver-master 1/1 Running 8 24d
kube-system kube-controller-manager-master 1/1 Running 8 24d
kube-system kube-dns-85bc874cc5-mdzhb 3/3 Running 27 24d
kube-system kube-flannel-ds-b94c4 1/1 Running 12 24d
kube-system kube-flannel-ds-sqzwv 1/1 Running 10 24d
kube-system kube-proxy-9j6sq 1/1 Running 10 24d
kube-system kube-proxy-znkxj 1/1 Running 7 24d
kube-system kube-scheduler-master 1/1 Running 8 24d
kube-system kubernetes-dashboard-7d5dcdb6d9-c2sz6 1/1 Running 10 24d
kube-system monitoring-grafana-69df66f668-fpgn5 1/1 Running 10 24d
kube-system monitoring-influxdb-78d4c6f5b6-hnjg2 1/1 Running 50 24d
kube-system tiller-deploy-f9b8476d-trtml 1/1 Running 0 4d

Version of Helm and Kubernetes:

[root@master ceph]# helm version
Client: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}

[root@master ceph]# kubectl version
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.2", GitCommit:"81753b10df112992bf51bbc2c2f85208aad78335", GitTreeState:"clean", BuildDate:"2018-04-27T09:22:21Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.2", GitCommit:"81753b10df112992bf51bbc2c2f85208aad78335", GitTreeState:"clean", BuildDate:"2018-04-27T09:10:24Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

Which chart:

What happened:
DNS pod can not resolve ceph-mon.ceph.svc.cluster.local

What you expected to happen:
DNS pod can resolve ceph-mon.ceph.svc.cluster.local

How to reproduce it (as minimally and precisely as possible):
always

Anything else we need to know:
None
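
A hedged observation: kube-dns only creates records for a headless service once it has ready endpoints, and the ps output above also shows kube-dns running with --domain=172.16.34.88. rather than a domain like cluster.local, which the name ceph-mon.ceph.svc.cluster.local assumes. Both are worth checking:

    # Does the headless mon service have endpoints yet?
    kubectl -n ceph get endpoints ceph-mon

    # What cluster domain was kube-dns actually started with?
    kubectl -n kube-system get pod kube-dns-85bc874cc5-mdzhb -o yaml | grep -- --domain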

Not compatible with latest kubernetes

This is a BUG REPORT: the deployment isn't compatible with Kubernetes 1.16.

What happened:

secret/ceph-keystone-user-rgw unchanged
configmap/ceph-bin-clients unchanged
configmap/ceph-bin unchanged
configmap/ceph-etc configured
configmap/ceph-templates unchanged
storageclass.storage.k8s.io/general unchanged
service/ceph-mon unchanged
service/ceph-rgw unchanged
job.batch/ceph-storage-admin-key-cleaner-z0qki created
job.batch/ceph-mds-keyring-generator unchanged
job.batch/ceph-osd-keyring-generator unchanged
job.batch/ceph-rgw-keyring-generator unchanged
job.batch/ceph-mon-keyring-generator unchanged
job.batch/ceph-mgr-keyring-generator unchanged
job.batch/ceph-namespace-client-key-cleaner-xkx0p created
job.batch/ceph-namespace-client-key-generator unchanged
job.batch/ceph-storage-keys-generator unchanged
unable to recognize "STDIN": no matches for kind "DaemonSet" in version "extensions/v1beta1"
unable to recognize "STDIN": no matches for kind "Deployment" in version "apps/v1beta1"
unable to recognize "STDIN": no matches for kind "Deployment" in version "apps/v1beta1"
unable to recognize "STDIN": no matches for kind "Deployment" in version "apps/v1beta1"
unable to recognize "STDIN": no matches for kind "Deployment" in version "extensions/v1beta1"
unable to recognize "STDIN": no matches for kind "Deployment" in version "apps/v1beta1"

What you expected to happen:
A cluster should be deployed.

How to reproduce it (as minimally and precisely as possible):
Install the latest Kubernetes or k3s, download the charts, and run them.
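
For context, Kubernetes 1.16 stopped serving the extensions/v1beta1 and apps/v1beta1 APIs for Deployments and DaemonSets, so the chart's templates would need to target apps/v1 and add the now-required selector. A hedged sketch of the kind of change involved, not an actual patch from this repository (the label key is illustrative):

    # before: no longer served in 1.16+
    apiVersion: extensions/v1beta1
    kind: DaemonSet

    # after
    apiVersion: apps/v1
    kind: DaemonSet
    spec:
      selector:
        matchLabels:
          application: ceph   # must match the pod template's labels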

Behavior when removing an OSD node

When removing an OSD node from the cluster:
$ kubectl label node mira115 ceph-osd=disabled --overwrite

The ceph-osd pods are deleted; however, the OSDs are marked down and are still present in the cluster.
Since this is a node removal, the expected behavior should include running the following commands (a command sketch follows the listing below):

  • for each OSD: ceph osd purge <osd>
  • Then: ceph osd crush rm <hostname>
  • maybe: zap the OSD drives?

Otherwise, adding back this node, we end up with:

-2       12.59991     host mira115
 1   hdd  0.89999         osd.1      down        0 1.00000
 2   hdd  0.89999         osd.2      down        0 1.00000
 3   hdd  0.89999         osd.3      down        0 1.00000
 5   hdd  0.89999         osd.5      down        0 1.00000
 6   hdd  0.89999         osd.6      down        0 1.00000
17   hdd  0.89999         osd.17     down        0 1.00000
18   hdd  0.89999         osd.18     down  1.00000 1.00000
20   hdd  0.89999         osd.20       up  1.00000 1.00000
21   hdd  0.89999         osd.21       up  1.00000 1.00000
22   hdd  0.89999         osd.22       up  1.00000 1.00000
23   hdd  0.89999         osd.23       up  1.00000 1.00000
24   hdd  0.89999         osd.24       up  1.00000 1.00000
25   hdd  0.89999         osd.25       up  1.00000 1.00000
26   hdd  0.89999         osd.26       up  1.00000 1.00000
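
A hedged sketch of the cleanup sequence suggested above, run with an admin keyring against the cluster; the OSD IDs and hostname are taken from this listing:

    # for each OSD that belonged to the removed node
    ceph osd purge 1 --yes-i-really-mean-it
    ceph osd purge 2 --yes-i-really-mean-it
    # ... repeat for the remaining down OSDs ...

    # then remove the now-empty host bucket from the CRUSH map
    ceph osd crush rm mira115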

Deploying ceph multiple times causes authentication error.

Is this a request for help?:
Yes

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT

Version of Helm and Kubernetes:
kubernetes 1.8.3

Which chart:
ceph

What happened:
I have a Kubernetes deployment running. When I deploy Ceph for the first time, it works fine.
But when I purge the Ceph Helm chart and redeploy it, ceph-mon and ceph-mon-check work fine, while ceph-mds and ceph-mgr hit an authentication failure and time out connecting to ceph-mon.

What you expected to happen:
I would expect all the services to be able to connect to the monitor without any authentication issues.

How to reproduce it (as minimally and precisely as possible):
Deploy ceph, purge ceph chart, deploy ceph again.

Anything else we need to know:
Something to note: I am not deleting the hostPath folders on the nodes on which Ceph is being redeployed. I was expecting that redeploying would preserve the existing Ceph data.
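
For what it's worth, when the intent is a clean redeploy rather than reuse of the old cluster, one workaround is to clear the leftover state first so the newly generated keyrings and the old on-disk monitor data cannot disagree. A sketch, assuming the chart's default hostPath location and the ceph namespace:

# Remove the release and any keyring secrets left behind in the namespace.
helm delete --purge ceph
kubectl -n ceph delete secrets --all

# On every node that ran ceph-mon/ceph-osd, wipe the hostPath data used by
# the chart (default path; adjust if osd_directory/mon_directory were overridden).
sudo rm -rf /var/lib/ceph-helm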

Change osd and mon path on minikube

Is this a request for help?:

no

Is this a BUG REPORT or FEATURE REQUEST? (choose one):

FEATURE REQUEST

Version of Helm and Kubernetes:

Client: &version.Version{SemVer:"v2.12.0", GitCommit:"d325d2a9c179b33af1a024cdb5a4472b6288016a", GitTreeState:"clean"}
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.1", GitCommit:"eec55b9ba98609a46fee712359c7b5b365bdd920", GitTreeState:"clean", BuildDate:"2018-12-13T10:39:04Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"}

Which chart:

ceph-helm

What happened:

does not work on minikube

What you expected to happen:

work on minikube

How to reproduce it (as minimally and precisely as possible):

follow the install process

Anything else we need to know:

In order to fix this, update:

osd_directory: /var/lib/ceph-helm
mon_directory: /var/lib/ceph-helm

to

osd_directory: /data/ceph-helm
mon_directory: /data/lib/ceph-helm

Perhaps this should be specified in the documentation.

diff --git a/ceph/ceph/values.yaml b/ceph/ceph/values.yaml
index 5831c53..72a74b7 100644
--- a/ceph/ceph/values.yaml
+++ b/ceph/ceph/values.yaml
@@ -254,8 +254,8 @@ ceph:
     mgr: true
   storage:
     # will have $NAMESPACE/{osd,mon} appended
-    osd_directory: /var/lib/ceph-helm
-    mon_directory: /var/lib/ceph-helm
+    osd_directory: /data/ceph-helm
+    mon_directory: /data/lib/ceph-helm
     # use /var/log for fluentd to collect ceph log
     # mon_log: /var/log/ceph/mon
     # osd_log: /var/log/ceph/osd
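
Instead of patching values.yaml, the same paths can presumably be overridden at install time; a sketch using --set with the key names from the diff above:

# Point the chart's hostPath storage at /data, which persists across minikube
# restarts (the default /var/lib/ceph-helm path generally does not).
helm install --name=ceph local/ceph --namespace=ceph \
  --set ceph.storage.osd_directory=/data/ceph-helm \
  --set ceph.storage.mon_directory=/data/lib/ceph-helm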

ceph-mgr and ceph-osd are not starting

Is this a request for help?: Yes


Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

Version of Helm and Kubernetes:

$ helm version
Client: &version.Version{SemVer:"v2.12.3", GitCommit:"eecf22f77df5f65c823aacd2dbd30ae6c65f186e", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.12.3", GitCommit:"eecf22f77df5f65c823aacd2dbd30ae6c65f186e", GitTreeState:"clean"}
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.4", GitCommit:"c27b913fddd1a6c480c229191a087698aa92f0b1", GitTreeState:"clean", BuildDate:"2019-02-28T13:37:52Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.4", GitCommit:"c27b913fddd1a6c480c229191a087698aa92f0b1", GitTreeState:"clean", BuildDate:"2019-02-28T13:30:26Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}

Which chart: ceph

What happened: ceph-mgr and ceph-osd won't start up.
Output of ceph-mgr:

+ source variables_entrypoint.sh 
++ ALL_SCENARIOS='osd osd_directory osd_directory_single osd_ceph_disk osd_ceph_disk_prepare osd_ceph_disk_activate osd_ceph_activate_journal mgr' 
++ : ceph 
++ : ceph-config/ceph 
++ : 
++ : 
++ : 0 
++ : dockerblade-slot6-oben.example.com 
++ : dockerblade-slot6-oben.example.com 
++ : /etc/ceph/monmap-ceph 
++ : /var/lib/ceph/mon/ceph-dockerblade-slot6-oben.example.com 
++ : 0 
++ : 0 
++ : mds-dockerblade-slot6-oben.example.com 
++ : 0 
++ : 100 
++ : 0 
++ : 0 
+++ uuidgen 
++ : 5700ffd2-02f6-4212-8a76-8a57f3fe2a04 
+++ uuidgen 
++ : 38e7aef4-c42b-457a-af33-fa8dc3ff1eb7 
++ : root=default host=dockerblade-slot6-oben.example.com 
++ : 0 
++ : cephfs 
++ : cephfs_data 
++ : 8 
++ : cephfs_metadata 
++ : 8 
++ : dockerblade-slot6-oben.example.com 
++ : 
++ : 
++ : 8080 
++ : 0 
++ : 9000 
++ : 0.0.0.0 
++ : cephnfs 
++ : dockerblade-slot6-oben.example.com 
++ : 0.0.0.0 
++ CLI_OPTS='--cluster ceph' 
++ DAEMON_OPTS='--cluster ceph --setuser ceph --setgroup ceph -d' 
++ MOUNT_OPTS='-t xfs -o noatime,inode64' 
++ MDS_KEYRING=/var/lib/ceph/mds/ceph-mds-dockerblade-slot6-oben.example.com/keyring 
++ ADMIN_KEYRING=/etc/ceph/ceph.client.admin.keyring 
++ MON_KEYRING=/etc/ceph/ceph.mon.keyring 
++ RGW_KEYRING=/var/lib/ceph/radosgw/dockerblade-slot6-oben.example.com/keyring 
++ MGR_KEYRING=/var/lib/ceph/mgr/ceph-dockerblade-slot6-oben.example.com/keyring 
++ MDS_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-mds/ceph.keyring 
++ RGW_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-rgw/ceph.keyring 
++ OSD_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-osd/ceph.keyring 
++ OSD_PATH_BASE=/var/lib/ceph/osd/ceph 
+ source common_functions.sh 
++ set -ex 
+ [[ ! -e /usr/bin/ceph-mgr ]] 
+ [[ ! -e /etc/ceph/ceph.conf ]] 
+ '[' 0 -eq 1 ']' 
+ '[' '!' -e /var/lib/ceph/mgr/ceph-dockerblade-slot6-oben.example.com/keyring ']' 
+ timeout 10 ceph --cluster ceph auth get-or-create mgr.dockerblade-slot6-oben.example.com mon 'allow profile mgr' osd 'allow *' mds 'allow *' -o /var/lib/ceph/mgr/ceph-dockerblade-slot6-oben.example.com/keyring 

and the output of ceph-osd's osd-prepare-pod:

+ export LC_ALL=C 
+ LC_ALL=C 
+ source variables_entrypoint.sh 
++ ALL_SCENARIOS='osd osd_directory osd_directory_single osd_ceph_disk osd_ceph_disk_prepare osd_ceph_disk_activate osd_ceph_activate_journal mgr' 
++ : ceph 
++ : ceph-config/ceph 
++ : 
++ : osd_ceph_disk_prepare 
++ : 1 
++ : dockerblade-slot5-unten 
++ : dockerblade-slot5-unten 
++ : /etc/ceph/monmap-ceph 
++ : /var/lib/ceph/mon/ceph-dockerblade-slot5-unten 
++ : 0 
++ : 0 
++ : mds-dockerblade-slot5-unten 
++ : 0 
++ : 100 
++ : 0 
++ : 0 
+++ uuidgen 
++ : e101933b-67b3-4267-824f-173d2ef7a47b 
+++ uuidgen 
++ : 10dd57d2-f3c7-4cab-88ea-8e3771baeaa7 
++ : root=default host=dockerblade-slot5-unten 
++ : 0 
++ : cephfs 
++ : cephfs_data 
++ : 8 
++ : cephfs_metadata 
++ : 8 
++ : dockerblade-slot5-unten 
++ : 
++ : 
++ : 8080 
++ : 0 
++ : 9000 
++ : 0.0.0.0 
++ : cephnfs 
++ : dockerblade-slot5-unten 
++ : 0.0.0.0 
++ CLI_OPTS='--cluster ceph' 
++ DAEMON_OPTS='--cluster ceph --setuser ceph --setgroup ceph -d' 
++ MOUNT_OPTS='-t xfs -o noatime,inode64' 
++ MDS_KEYRING=/var/lib/ceph/mds/ceph-mds-dockerblade-slot5-unten/keyring 
++ ADMIN_KEYRING=/etc/ceph/ceph.client.admin.keyring 
++ MON_KEYRING=/etc/ceph/ceph.mon.keyring 
++ RGW_KEYRING=/var/lib/ceph/radosgw/dockerblade-slot5-unten/keyring 
++ MGR_KEYRING=/var/lib/ceph/mgr/ceph-dockerblade-slot5-unten/keyring 
++ MDS_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-mds/ceph.keyring 
++ RGW_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-rgw/ceph.keyring 
++ OSD_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-osd/ceph.keyring 
++ OSD_PATH_BASE=/var/lib/ceph/osd/ceph 
+ source common_functions.sh 
++ set -ex 
+ is_available rpm 
+ command -v rpm 
+ is_available dpkg 
+ command -v dpkg 
+ OS_VENDOR=ubuntu 
+ source /etc/default/ceph 
++ TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728 
+ case "$CEPH_DAEMON" in 
+ OSD_TYPE=prepare 
+ start_osd 
+ [[ ! -e /etc/ceph/ceph.conf ]] 
+ '[' 1 -eq 1 ']' 
+ [[ ! -e /etc/ceph/ceph.client.admin.keyring ]] 
+ case "$OSD_TYPE" in 
+ source osd_disk_prepare.sh 
++ set -ex 
+ osd_disk_prepare 
+ [[ -z /dev/container/block-data ]] 
+ [[ ! -e /dev/container/block-data ]] 
+ '[' '!' -e /var/lib/ceph/bootstrap-osd/ceph.keyring ']' 
+ timeout 10 ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring health 
+ exit 1 

What you expected to happen: to start up flawlessly

Anything else we need to know:
Here is my overrides.yml:

network:
  public: 10.42.0.0/16
  cluster: 10.42.0.0/16

osd_devices:
  - name: block-data
    device: /dev/container/block-data
    zap: "1"

storageclass:
  name: ceph-rbd
  pool: rbd
  user_id: k8s

I am using Rancher 2 / RKE on bare metal. I am unsure about the network setup; maybe I have some issues here:

  • All nodes (6) can see and reach each other by IPv4 address only. Although the nodes have names, there is no DNS set up outside of the cluster.
  • Rancher/RKE sets up a flannel network with CIDR 10.42.0.0/16, which is what I used as network.public and network.cluster.
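
Both failing pods stop at a 10-second timeout talking to the monitors, so a first check (a generic sketch; pod names are placeholders) is whether a mon is reachable at all, and whether the exact call from the prepare script returns a real error when run by hand:

# Ask a running mon for cluster status.
kubectl -n ceph exec -it <ceph-mon-pod> -- ceph -s

# Re-run the call that timed out in the osd-prepare log (if the container is
# still up) to see the underlying error instead of just "exit 1".
kubectl -n ceph exec -it <ceph-osd-prepare-pod> -- \
  ceph --cluster ceph --name client.bootstrap-osd \
       --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring health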

Ceph OSD's readiness probe fails.

Is this a request for help?:
Yes

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT

Version of Helm and Kubernetes:
Latest Master Branch.

Which chart:
Ceph-Helm

What happened:
I have deployed the Ceph Helm chart onto a Kubernetes cluster with the following overrides.
I am running MDS too.
network:
  public: 10.244.0.0/16
  cluster: 10.244.0.0/16

ceph_mgr_modules_config:
  dashboard:
    port: 7000

osd_directory:
  enabled: true

manifests:
  deployment_rgw: false
  service_rgw: false
  daemonset_osd: true

storageclass:
  name: ceph-rbd
  pool: rbd
  user_id: k8s

All the services run correctly except the OSDs. I am using osd_directory: enabled.
The OSDs fail their readiness probe.

Readiness probe failed: dial tcp 10.211.55.186:6800: getsockopt: connection refused Back-off restarting failed container Error syncing pod

What you expected to happen:
I expected all the OSDs to be running too, along with mon, mgr, and mds.

How to reproduce it (as minimally and precisely as possible):
My cluster setup.
1 Master node
2 Worker nodes

  1. Use the following overrides file.

network:
  public: 10.244.0.0/16
  cluster: 10.244.0.0/16

ceph_mgr_modules_config:
  dashboard:
    port: 7000

osd_directory:
  enabled: true

manifests:
  deployment_rgw: false
  service_rgw: false
  daemonset_osd: true

storageclass:
  name: ceph-rbd
  pool: rbd
  user_id: k8s

  2. Replace the public/cluster networks with whatever is applicable in your Kubernetes cluster.

  3. Add ceph-mon=enabled,ceph-mds=enabled,ceph-mgr=enabled to the master node,
    and ceph-osd=enabled to the two worker nodes (see the label commands sketched below).
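
A sketch of those label commands (node names are placeholders):

# Label the master for mon/mds/mgr and each worker for OSDs.
kubectl label node <master-node> ceph-mon=enabled ceph-mds=enabled ceph-mgr=enabled
kubectl label node <worker-node-1> ceph-osd=enabled
kubectl label node <worker-node-2> ceph-osd=enabled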

Follow the deploy instructions at http://docs.ceph.com/docs/master/start/kube-helm/ with the above changes.

Anything else we need to know:

Error on OSD pod: "MountVolume.SetUp failed for volume ..." using ceph-helm on GKE

I'm trying to deploy Ceph on GKE (k8s) using ceph-helm, but after running "helm install ..." the OSD pods can't be created due to the error "MountVolume.SetUp failed for volume ...". The full error is below.

GKE env:
2 nodes: n1-standard-1 (1 vCPU, 3.75 GB RAM, 100 GB HDD, 2 mounted 375 GB SSDs)
Kubernetes: 1.9.7-gke.6 or 1.10.7-gke.6

I followed these instructions (my script is below):
http://docs.ceph.com/docs/mimic/start/kube-helm/
but it fails on the "helm install ..." command.

kubectl get pods -n ceph

NAME                                        READY     STATUS                  RESTARTS   AGE
ceph-mds-5696f9df5d-nmgtb                   0/1       Pending                 0          12m
ceph-mds-keyring-generator-rh9tr            0/1       Completed               0          12m
ceph-mgr-8656b978df-w4mt6                   1/1       Running                 2          12m
ceph-mgr-keyring-generator-t2x7j            0/1       Completed               0          12m
ceph-mon-check-7d49bd686c-nmpw5             1/1       Running                 0          12m
ceph-mon-keyring-generator-hpbcg            0/1       Completed               0          12m
ceph-mon-xjjs4                              3/3       Running                 0          12m
ceph-namespace-client-key-generator-np2kv   0/1       Completed               0          12m
ceph-osd-dev-sdb-5wzs6                      0/1       Init:CrashLoopBackOff   6          12m
ceph-osd-dev-sdb-zwldd                      0/1       Init:CrashLoopBackOff   6          12m
ceph-osd-dev-sdc-qsqpl                      0/1       Init:CrashLoopBackOff   6          12m
ceph-osd-dev-sdc-x4722                      0/1       Init:CrashLoopBackOff   6          12m
ceph-osd-keyring-generator-xlmmb            0/1       Completed               0          12m
ceph-rbd-provisioner-5544dcbcf5-gb9ws       1/1       Running                 0          12m
ceph-rbd-provisioner-5544dcbcf5-hnmjm       1/1       Running                 0          12m
ceph-rgw-65b4bd8cc5-24fxz                   0/1       Pending                 0          12m
ceph-rgw-keyring-generator-4fp2j            0/1       Completed               0          12m
ceph-storage-keys-generator-x5nzl           0/1       Completed               0          12m

Describing the failed OSD pod shows:

Events:
  Type     Reason                 Age                From                                                        Message
  ----     ------                 ----               ----                                                        -------
  Normal   SuccessfulMountVolume  14m                kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  MountVolume.SetUp succeeded for volume "run-udev"
  Normal   SuccessfulMountVolume  14m                kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  MountVolume.SetUp succeeded for volume "pod-var-lib-ceph"
  Normal   SuccessfulMountVolume  14m                kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  MountVolume.SetUp succeeded for volume "pod-run"
  Normal   SuccessfulMountVolume  14m                kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  MountVolume.SetUp succeeded for volume "ceph-etc"
  Normal   SuccessfulMountVolume  14m                kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  MountVolume.SetUp succeeded for volume "ceph-bin"
  Normal   SuccessfulMountVolume  14m                kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  MountVolume.SetUp succeeded for volume "devices"
  Normal   SuccessfulMountVolume  14m                kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  MountVolume.SetUp succeeded for volume "default-token-mdmzq"
  Warning  FailedMount            14m (x3 over 14m)  kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  MountVolume.SetUp failed for volume "ceph-client-admin-keyring" : secrets "ceph-client-admin-keyring" not found
  Warning  FailedMount            14m (x3 over 14m)  kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  MountVolume.SetUp failed for volume "ceph-bootstrap-mds-keyring" : secrets "ceph-bootstrap-mds-keyring" not found
  Warning  FailedMount            14m (x4 over 14m)  kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  MountVolume.SetUp failed for volume "ceph-bootstrap-osd-keyring" : secrets "ceph-bootstrap-osd-keyring" not found
  Warning  FailedMount            14m (x4 over 14m)  kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  MountVolume.SetUp failed for volume "ceph-mon-keyring" : secrets "ceph-mon-keyring" not found
  Warning  FailedMount            14m (x4 over 14m)  kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  MountVolume.SetUp failed for volume "ceph-bootstrap-rgw-keyring" : secrets "ceph-bootstrap-rgw-keyring" not found
  Normal   Pulled                 9m (x5 over 11m)   kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  Container image "docker.io/ceph/daemon:tag-build-master-luminous-ubuntu-16.04" already present on machine
  Warning  BackOff                4m (x29 over 11m)  kubelet, gke-standard-cluster-2-default-pool-8b55990f-s264  Back-off restarting failed container

Command "lsblk -f" on both nodes shows

NAME   FSTYPE LABEL           UUID                                 MOUNTPOINT
sdb                                                                /mnt/disks/ssd0
├─sdb2
└─sdb1
sdc                                                                /mnt/disks/ssd1
├─sdc2
└─sdc1
sda                                                                
โ””โ”€sda1 ext4   cloudimg-rootfs 819b0621-c9ea-4d69-b955-966a1b7c9cff /

Command "gdisk -l /dev/sdb" (and sdc) on osd nodes shows

GPT fdisk (gdisk) version 1.0.1

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sdc: 98304000 sectors, 375.0 GiB
Logical sector size: 4096 bytes
Disk identifier (GUID): 86E73D28-8AD8-4F5A-B58C-12C61E508C96
Partition table holds up to 128 entries
First usable sector is 6, last usable sector is 98303994
Partitions will be aligned on 256-sector boundaries
Total free space is 250 sectors (1000.0 KiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1         1310976        98303994   370.0 GiB   F804  ceph data
   2             256         1310975   5.0 GiB     F802  ceph journal

The commands below show no errors:
kubectl logs -n ceph pod/ceph-mon-xjjs4 -c ceph-mon | grep error
kubectl logs -n ceph pod/ceph-mon-xjjs4 -c cluster-log-tailer | grep error
kubectl logs -n ceph pod/ceph-mon-xjjs4 -c cluster-audit-log-tailer | grep error

Output "kubectl logs -n ceph pod/ceph-mon-xjjs4 -c ceph-mon"

+ export LC_ALL=C
+ LC_ALL=C
+ source variables_entrypoint.sh
++ ALL_SCENARIOS='osd osd_directory osd_directory_single osd_ceph_disk osd_ceph_disk_prepare osd_ceph_disk_activate osd_ceph_activate_journal mgr'
++ : ceph
++ : ceph-config/ceph
++ : 172.21.0.0/20
++ : mon
++ : 0
++ : gke-standard-cluster-2-default-pool-8b55990f-bj85
++ : gke-standard-cluster-2-default-pool-8b55990f-bj85
++ : /var/lib/ceph/mon/monmap
++ : /var/lib/ceph/mon/ceph-gke-standard-cluster-2-default-pool-8b55990f-bj85
++ : 1
++ : 0
++ : mds-gke-standard-cluster-2-default-pool-8b55990f-bj85
++ : 0
++ : 100
++ : 0
++ : 0
+++ uuidgen
++ : 63d877a7-1d14-4882-bdde-a6995abbf4a3
+++ uuidgen
++ : d7796d8c-7c16-4765-bcb4-e49b4e34c8cf
++ : root=default host=gke-standard-cluster-2-default-pool-8b55990f-bj85
++ : 0
++ : cephfs
++ : cephfs_data
++ : 8
++ : cephfs_metadata
++ : 8
++ : gke-standard-cluster-2-default-pool-8b55990f-bj85
++ :
++ :
++ : 8080
++ : 0
++ : 9000
++ : 0.0.0.0
++ : cephnfs
++ : gke-standard-cluster-2-default-pool-8b55990f-bj85
++ : 0.0.0.0
++ CLI_OPTS='--cluster ceph'
++ DAEMON_OPTS='--cluster ceph --setuser ceph --setgroup ceph -d'
++ MOUNT_OPTS='-t xfs -o noatime,inode64'
++ MDS_KEYRING=/var/lib/ceph/mds/ceph-mds-gke-standard-cluster-2-default-pool-8b55990f-bj85/keyring
++ ADMIN_KEYRING=/etc/ceph/ceph.client.admin.keyring
++ MON_KEYRING=/etc/ceph/ceph.mon.keyring
++ RGW_KEYRING=/var/lib/ceph/radosgw/gke-standard-cluster-2-default-pool-8b55990f-bj85/keyring
++ MGR_KEYRING=/var/lib/ceph/mgr/ceph-gke-standard-cluster-2-default-pool-8b55990f-bj85/keyring
++ MDS_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-mds/ceph.keyring
++ RGW_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-rgw/ceph.keyring
++ OSD_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-osd/ceph.keyring
++ OSD_PATH_BASE=/var/lib/ceph/osd/ceph
+ source common_functions.sh
++ set -ex
+ [[ -z 172.21.0.0/20 ]]
+ [[ -z 10.140.0.2 ]]
+ [[ -z 10.140.0.2 ]]
+ [[ -z 172.21.0.0/20 ]]
+ get_mon_config
++ ceph-conf --lookup fsid -c /etc/ceph/ceph.conf
+ local fsid=ba3982a0-3a07-45b6-b69b-02bc37deeb00
+ timeout=10
+ MONMAP_ADD=
+ [[ -z '' ]]
+ [[ 10 -gt 0 ]]
+ [[ 1 -eq 0 ]]
++ kubectl get pods --namespace=ceph -l application=ceph -l component=mon -o template '--template={{range .items}}{{if .status.podIP}}--add {{.spec.nodeName}} {{.status.podIP}} {{end}} {{end}}'
+ MONMAP_ADD='--add gke-standard-cluster-2-default-pool-8b55990f-bj85 10.140.0.2  '
+ ((  timeout--  ))
+ sleep 1
+ [[ -z --addgke-standard-cluster-2-default-pool-8b55990f-bj8510.140.0.2 ]]
+ [[ -z --addgke-standard-cluster-2-default-pool-8b55990f-bj8510.140.0.2 ]]
+ '[' -f /var/lib/ceph/mon/monmap ']'
+ monmaptool --create --add gke-standard-cluster-2-default-pool-8b55990f-bj85 10.140.0.2 --fsid ba3982a0-3a07-45b6-b69b-02bc37deeb00 /var/lib/ceph/mon/monmap --clobber
monmaptool: monmap file /var/lib/ceph/mon/monmap
monmaptool: set fsid to ba3982a0-3a07-45b6-b69b-02bc37deeb00
monmaptool: writing epoch 0 to /var/lib/ceph/mon/monmap (1 monitors)
+ chown ceph. /var/log/ceph
+ [[ ! -e /var/lib/ceph/mon/ceph-gke-standard-cluster-2-default-pool-8b55990f-bj85/keyring ]]
+ [[ ! -e /var/lib/ceph/mon/ceph-gke-standard-cluster-2-default-pool-8b55990f-bj85/done ]]
+ '[' '!' -e /etc/ceph/ceph.mon.keyring.seed ']'
+ cp -vf /etc/ceph/ceph.mon.keyring.seed /etc/ceph/ceph.mon.keyring
'/etc/ceph/ceph.mon.keyring.seed' -> '/etc/ceph/ceph.mon.keyring'
+ '[' '!' -e /var/lib/ceph/mon/monmap ']'
+ for keyring in '$OSD_BOOTSTRAP_KEYRING' '$MDS_BOOTSTRAP_KEYRING' '$RGW_BOOTSTRAP_KEYRING' '$ADMIN_KEYRING'
+ ceph-authtool /etc/ceph/ceph.mon.keyring --import-keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
importing contents of /var/lib/ceph/bootstrap-osd/ceph.keyring into /etc/ceph/ceph.mon.keyring
+ for keyring in '$OSD_BOOTSTRAP_KEYRING' '$MDS_BOOTSTRAP_KEYRING' '$RGW_BOOTSTRAP_KEYRING' '$ADMIN_KEYRING'
+ ceph-authtool /etc/ceph/ceph.mon.keyring --import-keyring /var/lib/ceph/bootstrap-mds/ceph.keyring
importing contents of /var/lib/ceph/bootstrap-mds/ceph.keyring into /etc/ceph/ceph.mon.keyring
+ for keyring in '$OSD_BOOTSTRAP_KEYRING' '$MDS_BOOTSTRAP_KEYRING' '$RGW_BOOTSTRAP_KEYRING' '$ADMIN_KEYRING'
+ ceph-authtool /etc/ceph/ceph.mon.keyring --import-keyring /var/lib/ceph/bootstrap-rgw/ceph.keyring
importing contents of /var/lib/ceph/bootstrap-rgw/ceph.keyring into /etc/ceph/ceph.mon.keyring
+ for keyring in '$OSD_BOOTSTRAP_KEYRING' '$MDS_BOOTSTRAP_KEYRING' '$RGW_BOOTSTRAP_KEYRING' '$ADMIN_KEYRING'
+ ceph-authtool /etc/ceph/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring
importing contents of /etc/ceph/ceph.client.admin.keyring into /etc/ceph/ceph.mon.keyring
+ ceph-mon --setuser ceph --setgroup ceph --cluster ceph --mkfs -i gke-standard-cluster-2-default-pool-8b55990f-bj85 --monmap /var/lib/ceph/mon/monmap --keyring /etc/ceph/ceph.mon.keyring --mon-data /var/lib/ceph/mon/ceph-gke-standard-cluster-2-default-pool-8b55990f-bj85
+ touch /var/lib/ceph/mon/ceph-gke-standard-cluster-2-default-pool-8b55990f-bj85/done
+ log SUCCESS
+ '[' -z SUCCESS ']'
++ date '+%F %T'
2018-10-19 18:44:44  /start_mon.sh: SUCCESS
+ TIMESTAMP='2018-10-19 18:44:44'
+ echo '2018-10-19 18:44:44  /start_mon.sh: SUCCESS'
+ return 0
+ exec /usr/bin/ceph-mon --cluster ceph --setuser ceph --setgroup ceph -d -i gke-standard-cluster-2-default-pool-8b55990f-bj85 --mon-data /var/lib/ceph/mon/ceph-gke-standard-cluster-2-default-pool-8b55990f-bj85 --public-addr 10.140.0.2:6789
2018-10-19 18:44:44.694633 7fcdefd43f00  0 set uid:gid to 64045:64045 (ceph:ceph)
2018-10-19 18:44:44.694866 7fcdefd43f00  0 ceph version 12.2.3 (2dab17a455c09584f2a85e6b10888337d1ec8949) luminous (stable), process (unknown), pid 1
2018-10-19 18:44:44.695016 7fcdefd43f00  0 pidfile_write: ignore empty --pid-file
2018-10-19 18:44:44.702045 7fcdefd43f00  0 load: jerasure load: lrc load: isa
2018-10-19 18:44:44.702378 7fcdefd43f00  0  set rocksdb option compression = kNoCompression
2018-10-19 18:44:44.702472 7fcdefd43f00  0  set rocksdb option write_buffer_size = 33554432
2018-10-19 18:44:44.702548 7fcdefd43f00  0  set rocksdb option compression = kNoCompression
2018-10-19 18:44:44.702609 7fcdefd43f00  0  set rocksdb option write_buffer_size = 33554432
2018-10-19 18:44:44.702874 7fcdefd43f00  4 rocksdb: RocksDB version: 5.4.0

2018-10-19 18:44:44.702934 7fcdefd43f00  4 rocksdb: Git sha rocksdb_build_git_sha:@0@
2018-10-19 18:44:44.702970 7fcdefd43f00  4 rocksdb: Compile date Feb 19 2018
2018-10-19 18:44:44.703023 7fcdefd43f00  4 rocksdb: DB SUMMARY

2018-10-19 18:44:44.703128 7fcdefd43f00  4 rocksdb: CURRENT file:  CURRENT

2018-10-19 18:44:44.703183 7fcdefd43f00  4 rocksdb: IDENTITY file:  IDENTITY

2018-10-19 18:44:44.703225 7fcdefd43f00  4 rocksdb: MANIFEST file:  MANIFEST-000001 size: 13 Bytes

2018-10-19 18:44:44.703281 7fcdefd43f00  4 rocksdb: SST files in /var/lib/ceph/mon/ceph-gke-standard-cluster-2-default-pool-8b55990f-bj85/store.db dir, Total Num: 0, files:

2018-10-19 18:44:44.703333 7fcdefd43f00  4 rocksdb: Write Ahead Log file in /var/lib/ceph/mon/ceph-gke-standard-cluster-2-default-pool-8b55990f-bj85/store.db: 000003.log size: 1103 ;

2018-10-19 18:44:44.703369 7fcdefd43f00  4 rocksdb:                         Options.error_if_exists: 0
2018-10-19 18:44:44.703422 7fcdefd43f00  4 rocksdb:                       Options.create_if_missing: 0
2018-10-19 18:44:44.703457 7fcdefd43f00  4 rocksdb:                         Options.paranoid_checks: 1
...

My script for ceph installation

#INSTALL AND START HELM
sudo apt-get update
mkdir ~/helm
cd ~/helm
wget https://storage.googleapis.com/kubernetes-helm/helm-v2.11.0-linux-386.tar.gz
tar -zxvf helm-v2.11.0-linux-386.tar.gz
sudo mv linux-386/helm /usr/local/bin/helm

# RUN PROXY
kubectl proxy --port=8080

#CONFIGURE YOUR CEPH CLUSTER
cd ~
cat<<EOF>tiller-rbac-config.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: tiller
  namespace: kube-system
EOF
kubectl create -f tiller-rbac-config.yaml
helm init --service-account tiller

helm serve
helm repo add local http://localhost:8879/charts

# ADD CEPH-HELM TO HELM LOCAL REPOS
git clone https://github.com/ceph/ceph-helm
cd ceph-helm/ceph
make

#CONFIGURE YOUR CEPH CLUSTER
cd ~
cat<<EOF>ceph-overrides.yaml
network:
  public:   172.21.0.0/20
  cluster:   172.21.0.0/20
osd_devices:
  - name: dev-sdb
    device: /dev/sdb
    zap: "1"
  - name: dev-sdc
    device: /dev/sdc
    zap: "1"
storageclass:
  name: ceph-rbd
  pool: rbd
  user_id: k8s
EOF

#CREATE THE CEPH CLUSTER NAMESPACE
kubectl create namespace ceph

#CONFIGURE RBAC PERMISSIONS
kubectl create clusterrolebinding test --clusterrole=cluster-admin --user=[email protected]
kubectl create -f ~/ceph-helm/ceph/rbac.yaml

#LABEL KUBELETS
kubectl label node gke-standard-cluster-2-default-pool-8b55990f-bj85 ceph-mon=enabled ceph-mgr=enabled
kubectl label node gke-standard-cluster-2-default-pool-8b55990f-bj85 ceph-osd=enabled ceph-osd-device-dev-sdb=enabled ceph-osd-device-dev-sdc=enabled
kubectl label node gke-standard-cluster-2-default-pool-8b55990f-s264 ceph-osd=enabled ceph-osd-device-dev-sdb=enabled ceph-osd-device-dev-sdc=enabled

#CEPH DEPLOYMENT
helm install --name=ceph local/ceph --namespace=ceph -f ~/ceph-overrides.yaml 
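
Since every FailedMount event above points at a missing keyring secret, a quick sanity check after the install (a sketch) is whether the *-keyring-generator jobs actually produced them:

# The keyring-generator jobs are expected to create these secrets; if any are
# missing, look at the corresponding job's logs.
kubectl -n ceph get secrets | grep keyring
kubectl -n ceph get jobs
kubectl -n ceph logs job/ceph-osd-keyring-generator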

P.S.
I have read these issues, but they don't seem to match my case:
#55
#51
#48
#45

And if somebody knows another (tested and working) way to deploy Ceph on K8s, please tell me.

ceph-etc configmap changes after adding, removing, or changing OSDs

Is this a request for help?:
No

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
It's a bug report

Version of Helm and Kubernetes:
helm version
2.8.0

kubernetes version:
1.9.2

Which chart:
ceph-helm

What happened:
After the deployment, I added a new disk to ceph-overrides.yml and ran:
helm upgrade ceph local/ceph --namespace=ceph -f ~/ceph-overrides.yaml
This added the new disk, but it also changed the ceph-etc configmap, which caused the cluster fsid to change.
If the machine running the OSD containers gets rebooted, the new containers come up with the new fsid while the mons are still running with the old fsid, which puts the OSD containers into CrashLoopBackOff.

What you expected to happen:
Configmap should have stayed the same.

How to reproduce it (as minimally and precisely as possible):
Finish the deployment, add a new disk or change an existing OSD mapping to a new disk in the ceph-overrides.yml file, then issue the command:
helm upgrade ceph local/ceph --namespace=ceph -f ~/ceph-overrides.yaml
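
One way to see whether an upgrade is about to change the fsid (a sketch, assuming the fsid lives in the ceph.conf key of the ceph-etc configmap, as in a default install) is to capture it before and after the upgrade:

# Record the fsid the running cluster uses, upgrade, then compare.
kubectl -n ceph get configmap ceph-etc -o jsonpath='{.data.ceph\.conf}' | grep fsid
helm upgrade ceph local/ceph --namespace=ceph -f ~/ceph-overrides.yaml
kubectl -n ceph get configmap ceph-etc -o jsonpath='{.data.ceph\.conf}' | grep fsid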

Anything else we need to know:

Can we push ceph-helm to offical helm/charts?

This repo was forked from the official helm/charts and has drifted too far from the main repo. Please push it back upstream so that we can all develop and test there, and finally make the Helm Ceph deployment stable.

Integration of Ceph Exporter

Hi,
I was looking for a Prometheus metrics exporter and found a general one at: ceph/ceph-container/examples/helm/ceph

Would it be possible to integrate the exporter into this project? I am not very good at writing Helm charts, and the structure of the two charts seems to differ a lot.

Thanks,
Stefan

ceph-mon failing: unable to load initial keyring

Is this a request for help?: yes

Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

Version of Helm and Kubernetes:

$ helm version
Client: &version.Version{SemVer:"v2.8+unreleased", GitCommit:"c5f2174f264554c62278c0695d58f250d3e207c8", GitTreeState:"clean"}
Server: &version.Version{SemVer:"canary+unreleased", GitCommit:"fe9d36533901b71923c49142f5cf007f93fa926f", GitTreeState:"clean"}

Kubernetes master > 1.9

Which chart: ceph

What happened:

I compiled k8s master from source (commit 04634cb19843195) and brought up a local cluster with:

RUNTIME_CONFIG=storage.k8s.io/v1alpha1=true ALLOW_PRIVILEGED=1 FEATURE_GATES="BlockVolume=true,MountPropagation=true,CSIPersistentVolume=true,ReadOnlyAPIDataVolumes=false"  hack/local-up-cluster.sh -O

Then I installed Helm and ceph. For Helm I use:

helm init --upgrade --canary-image

#  https://github.com/kubernetes/helm/issues/2224#issuecomment-356344286
kubectl create serviceaccount --namespace kube-system tiller
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}'

For ceph I use these commands:

kubectl create namespace ceph
kubectl create -f /fast/work/ceph-helm/ceph/rbac.yaml
kubectl label node 127.0.0.1 ceph-mon=enabled ceph-mgr=enabled ceph-osd=enabled ceph-osd-device-loop=enabled

truncate -s 10G /fast/work/ceph-loop-storage
sudo touch /dev/ceph-loop-storage
sudo mount -obind /fast/work/ceph-loop-storage /dev/ceph-loop-storage

# The local helm repository contains the ceph chart.
if ! pidof helm >/dev/null; then
    helm serve &
fi

cat >/tmp/ceph-overrides.yaml <<EOF
network:
  public:   192.168.0/20
  cluster:   192.168.0.0/20

osd_devices:
  - name: loop
    device: /dev/ceph-loop-storage
    zap: "1"

storageclass:
  name: ceph-rbd
  pool: rbd
  user_id: k8s
EOF

sudo rm -rf /varlib/ceph-helm
# in ceph-helm
helm install --name=ceph ceph/ceph --namespace=ceph -f /tmp/ceph-overrides.yaml

After this, some pods never start and ceph-mon fails:

$ kubectl -n ceph get pods
NAME                                   READY     STATUS             RESTARTS   AGE
ceph-mds-6f9cb6bd69-dlrgm              0/1       Pending            0          2h
ceph-mgr-776957b4cb-wxlc8              0/1       Init:0/2           0          2h
ceph-mon-check-57b6ddf49d-dg8j6        0/1       Init:0/2           0          2h
ceph-mon-rfzr6                         2/3       CrashLoopBackOff   28         2h
ceph-osd-loop-ctzdc                    0/1       Init:0/3           0          2h
ceph-rbd-provisioner-b58659dc9-m68qv   1/1       Running            0          2h
ceph-rbd-provisioner-b58659dc9-trzmm   1/1       Running            0          2h
ceph-rgw-dcbb695d6-tfhjg               0/1       Pending            0          2h

$ kubectl -n ceph describe pod/ceph-mon-rfzr6 
Name:           ceph-mon-rfzr6
Namespace:      ceph
Node:           127.0.0.1/127.0.0.1
Start Time:     Tue, 13 Mar 2018 17:56:58 +0100
Labels:         application=ceph
                component=mon
                controller-revision-hash=3731361699
                pod-template-generation=1
                release_group=ceph
Annotations:    <none>
Status:         Running
IP:             127.0.0.1
Controlled By:  DaemonSet/ceph-mon
Init Containers:
  init:
    Container ID:  docker://32678bca76683c38ce5cfa6b174037285ecbde2148e97b036f56c70586d34044
    Image:         docker.io/kolla/ubuntu-source-kubernetes-entrypoint:4.0.0
    Image ID:      docker-pullable://kolla/ubuntu-source-kubernetes-entrypoint@sha256:75116ab2f9f65c5fc078e68ce7facd66c1c57496947f37b7209b32f94925e53b
    Port:          <none>
    Command:
      kubernetes-entrypoint
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 13 Mar 2018 17:57:25 +0100
      Finished:     Tue, 13 Mar 2018 17:57:25 +0100
    Ready:          True
    Restart Count:  0
    Environment:
      POD_NAME:              ceph-mon-rfzr6 (v1:metadata.name)
      NAMESPACE:             ceph (v1:metadata.namespace)
      INTERFACE_NAME:        eth0
      DEPENDENCY_SERVICE:    
      DEPENDENCY_JOBS:       
      DEPENDENCY_DAEMONSET:  
      DEPENDENCY_CONTAINER:  
      COMMAND:               echo done
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-rn798 (ro)
  ceph-init-dirs:
    Container ID:  docker://92f77fe2d747af0d51d78916fcba20503a171910fdd89d29346b2cdd20788324
    Image:         docker.io/ceph/daemon:tag-build-master-luminous-ubuntu-16.04
    Image ID:      docker-pullable://ceph/daemon@sha256:687056228e899ecbfd311854e3864db0b46dd4a9a6d4eb4b47c815ca413f25ee
    Port:          <none>
    Command:
      /tmp/init_dirs.sh
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 13 Mar 2018 17:57:27 +0100
      Finished:     Tue, 13 Mar 2018 17:57:27 +0100
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /run from pod-run (rw)
      /tmp/init_dirs.sh from ceph-bin (ro)
      /var/lib/ceph from pod-var-lib-ceph (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-rn798 (ro)
      /variables_entrypoint.sh from ceph-bin (ro)
Containers:
  cluster-audit-log-tailer:
    Container ID:  docker://928570a1606a2611e94429ceaa2973155652365c03f79b58e2d88d3471e3a731
    Image:         docker.io/alpine:latest
    Image ID:      docker-pullable://alpine@sha256:7b848083f93822dd21b0a2f14a110bd99f6efb4b838d499df6d04a49d0debf8b
    Port:          <none>
    Command:
      /tmp/log_handler.sh
    Args:
      /var/log/ceph/ceph.audit.log
    State:          Running
      Started:      Tue, 13 Mar 2018 17:57:28 +0100
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /tmp/log_handler.sh from ceph-bin (rw)
      /var/log/ceph from pod-var-log-ceph (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-rn798 (ro)
  cluster-log-tailer:
    Container ID:  docker://7dc91378bf88b366ca4e3f8eb731ca142a702bd9a5b4125ca5b2bd6adaa7eb1c
    Image:         docker.io/alpine:latest
    Image ID:      docker-pullable://alpine@sha256:7b848083f93822dd21b0a2f14a110bd99f6efb4b838d499df6d04a49d0debf8b
    Port:          <none>
    Command:
      /tmp/log_handler.sh
    Args:
      /var/log/ceph/ceph.log
    State:          Running
      Started:      Tue, 13 Mar 2018 17:57:28 +0100
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /tmp/log_handler.sh from ceph-bin (rw)
      /var/log/ceph from pod-var-log-ceph (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-rn798 (ro)
  ceph-mon:
    Container ID:  docker://49c26f3a425ac86fcaebbe6a19ed6f17fcfa79f8564c3019185326d162fb5a57
    Image:         docker.io/ceph/daemon:tag-build-master-luminous-ubuntu-16.04
    Image ID:      docker-pullable://ceph/daemon@sha256:687056228e899ecbfd311854e3864db0b46dd4a9a6d4eb4b47c815ca413f25ee
    Port:          6789/TCP
    Command:
      /start_mon.sh
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Tue, 13 Mar 2018 20:04:54 +0100
      Finished:     Tue, 13 Mar 2018 20:05:00 +0100
    Ready:          False
    Restart Count:  29
    Liveness:       tcp-socket :6789 delay=60s timeout=5s period=10s #success=1 #failure=3
    Readiness:      tcp-socket :6789 delay=0s timeout=5s period=10s #success=1 #failure=3
    Environment:
      K8S_HOST_NETWORK:     1
      MONMAP:               /var/lib/ceph/mon/monmap
      NAMESPACE:            ceph (v1:metadata.namespace)
      CEPH_DAEMON:          mon
      CEPH_PUBLIC_NETWORK:  192.168.0/20
      KUBECTL_PARAM:        -l application=ceph -l component=mon
      MON_IP:                (v1:status.podIP)
    Mounts:
      /common_functions.sh from ceph-bin (ro)
      /etc/ceph/ceph.client.admin.keyring from ceph-client-admin-keyring (ro)
      /etc/ceph/ceph.conf from ceph-etc (ro)
      /run from pod-run (rw)
      /start_mon.sh from ceph-bin (ro)
      /var/lib/ceph from pod-var-lib-ceph (rw)
      /var/lib/ceph/bootstrap-mds/ceph.keyring from ceph-bootstrap-mds-keyring (rw)
      /var/lib/ceph/bootstrap-osd/ceph.keyring from ceph-bootstrap-osd-keyring (rw)
      /var/lib/ceph/bootstrap-rgw/ceph.keyring from ceph-bootstrap-rgw-keyring (rw)
      /var/log/ceph from pod-var-log-ceph (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-rn798 (ro)
      /variables_entrypoint.sh from ceph-bin (ro)
Conditions:
  Type           Status
  Initialized    True 
  Ready          False 
  PodScheduled   True 
Volumes:
  ceph-bin:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      ceph-bin
    Optional:  false
  ceph-etc:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      ceph-etc
    Optional:  false
  pod-var-log-ceph:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:  
  pod-var-lib-ceph:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/ceph-helm/ceph/mon
    HostPathType:  
  pod-run:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:  Memory
  ceph-client-admin-keyring:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ceph-client-admin-keyring
    Optional:    false
  ceph-bootstrap-osd-keyring:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ceph-bootstrap-osd-keyring
    Optional:    false
  ceph-bootstrap-mds-keyring:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ceph-bootstrap-mds-keyring
    Optional:    false
  ceph-bootstrap-rgw-keyring:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ceph-bootstrap-rgw-keyring
    Optional:    false
  default-token-rn798:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-rn798
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  ceph-mon=enabled
Tolerations:     node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/unreachable:NoExecute
Events:
  Type     Reason   Age                From                Message
  ----     ------   ----               ----                -------
  Normal   Pulled   58m (x19 over 2h)  kubelet, 127.0.0.1  Container image "docker.io/ceph/daemon:tag-build-master-luminous-ubuntu-16.04" already present on machine
  Warning  BackOff  3m (x560 over 2h)  kubelet, 127.0.0.1  Back-off restarting failed container


$ kubectl -n ceph logs ceph-mon-rfzr6 ceph-mon
+ export LC_ALL=C
+ LC_ALL=C
+ source variables_entrypoint.sh
++ ALL_SCENARIOS='osd osd_directory osd_directory_single osd_ceph_disk osd_ceph_disk_prepare osd_ceph_disk_activate osd_ceph_activate_journal mgr'
++ : ceph
++ : ceph-config/ceph
++ : 192.168.0/20
++ : mon
++ : 0
++ : pohly-desktop
++ : pohly-desktop
++ : /var/lib/ceph/mon/monmap
++ : /var/lib/ceph/mon/ceph-pohly-desktop
++ : 1
++ : 0
++ : mds-pohly-desktop
++ : 0
++ : 100
++ : 0
++ : 0
+++ uuidgen
++ : a22cb2e2-eac5-49ae-85aa-8abf231ee5c1
+++ uuidgen
++ : 03e59aae-9eb7-47c9-9d4e-7348ce4ead15
++ : root=default host=pohly-desktop
++ : 0
++ : cephfs
++ : cephfs_data
++ : 8
++ : cephfs_metadata
++ : 8
++ : pohly-desktop
++ :
++ :
++ : 8080
++ : 0
++ : 9000
++ : 0.0.0.0
++ : cephnfs
++ : pohly-desktop
++ : 0.0.0.0
++ CLI_OPTS='--cluster ceph'
++ DAEMON_OPTS='--cluster ceph --setuser ceph --setgroup ceph -d'
++ MOUNT_OPTS='-t xfs -o noatime,inode64'
++ MDS_KEYRING=/var/lib/ceph/mds/ceph-mds-pohly-desktop/keyring
++ ADMIN_KEYRING=/etc/ceph/ceph.client.admin.keyring
++ MON_KEYRING=/etc/ceph/ceph.mon.keyring
++ RGW_KEYRING=/var/lib/ceph/radosgw/pohly-desktop/keyring
++ MGR_KEYRING=/var/lib/ceph/mgr/ceph-pohly-desktop/keyring
++ MDS_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-mds/ceph.keyring
++ RGW_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-rgw/ceph.keyring
++ OSD_BOOTSTRAP_KEYRING=/var/lib/ceph/bootstrap-osd/ceph.keyring
++ OSD_PATH_BASE=/var/lib/ceph/osd/ceph
+ source common_functions.sh
++ set -ex
+ [[ -z 192.168.0/20 ]]
+ [[ -z 127.0.0.1 ]]
+ [[ -z 127.0.0.1 ]]
+ [[ -z 192.168.0/20 ]]
+ get_mon_config
++ ceph-conf --lookup fsid -c /etc/ceph/ceph.conf
+ local fsid=35f6bbc7-9f08-4984-a8c0-e690f95059ca
+ timeout=10
+ MONMAP_ADD=
+ [[ -z '' ]]
+ [[ 10 -gt 0 ]]
+ [[ 1 -eq 0 ]]
++ kubectl get pods --namespace=ceph -l application=ceph -l component=mon -o template '--template={{range .items}}{{if .status.podIP}}--add {{.spec.nodeName}} {{.status.podIP}} {{end}} {{end}}'
+ MONMAP_ADD='--add 127.0.0.1 127.0.0.1  '
+ ((  timeout--  ))
+ sleep 1
+ [[ -z --add127.0.0.1127.0.0.1 ]]
+ [[ -z --add127.0.0.1127.0.0.1 ]]
+ '[' -f /var/lib/ceph/mon/monmap ']'
+ monmaptool --print /var/lib/ceph/mon/monmap
+ grep -q 127.0.0.1:6789
+ '[' 0 -eq 0 ']'
+ log '127.0.0.1 already exists in monmap /var/lib/ceph/mon/monmap'
+ '[' -z '127.0.0.1 already exists in monmap /var/lib/ceph/mon/monmap' ']'
++ date '+%F %T'
+ TIMESTAMP='2018-03-13 18:59:39'
+ echo '2018-03-13 18:59:39  /start_mon.sh: 127.0.0.1 already exists in monmap /var/lib/ceph/mon/monmap'
+ return 0
+ return
2018-03-13 18:59:39  /start_mon.sh: 127.0.0.1 already exists in monmap /var/lib/ceph/mon/monmap
+ chown ceph. /var/log/ceph
+ '[' '!' -e /var/lib/ceph/mon/ceph-pohly-desktop/keyring ']'
+ '[' '!' -e /etc/ceph/ceph.mon.keyring ']'
+ touch /etc/ceph/ceph.mon.keyring
+ '[' '!' -e /var/lib/ceph/mon/monmap ']'
+ for keyring in '$OSD_BOOTSTRAP_KEYRING' '$MDS_BOOTSTRAP_KEYRING' '$RGW_BOOTSTRAP_KEYRING' '$ADMIN_KEYRING'
+ ceph-authtool /etc/ceph/ceph.mon.keyring --import-keyring /var/lib/ceph/bootstrap-osd/ceph.keyring
importing contents of /var/lib/ceph/bootstrap-osd/ceph.keyring into /etc/ceph/ceph.mon.keyring
+ for keyring in '$OSD_BOOTSTRAP_KEYRING' '$MDS_BOOTSTRAP_KEYRING' '$RGW_BOOTSTRAP_KEYRING' '$ADMIN_KEYRING'
+ ceph-authtool /etc/ceph/ceph.mon.keyring --import-keyring /var/lib/ceph/bootstrap-mds/ceph.keyring
importing contents of /var/lib/ceph/bootstrap-mds/ceph.keyring into /etc/ceph/ceph.mon.keyring
+ for keyring in '$OSD_BOOTSTRAP_KEYRING' '$MDS_BOOTSTRAP_KEYRING' '$RGW_BOOTSTRAP_KEYRING' '$ADMIN_KEYRING'
+ ceph-authtool /etc/ceph/ceph.mon.keyring --import-keyring /var/lib/ceph/bootstrap-rgw/ceph.keyring
importing contents of /var/lib/ceph/bootstrap-rgw/ceph.keyring into /etc/ceph/ceph.mon.keyring
+ for keyring in '$OSD_BOOTSTRAP_KEYRING' '$MDS_BOOTSTRAP_KEYRING' '$RGW_BOOTSTRAP_KEYRING' '$ADMIN_KEYRING'
+ ceph-authtool /etc/ceph/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring
importing contents of /etc/ceph/ceph.client.admin.keyring into /etc/ceph/ceph.mon.keyring
+ ceph-mon --setuser ceph --setgroup ceph --cluster ceph --mkfs -i pohly-desktop --inject-monmap /var/lib/ceph/mon/monmap --keyring /etc/ceph/ceph.mon.keyring --mon-data /var/lib/ceph/mon/ceph-pohly-desktop
2018-03-13 18:59:39.715279 7fa991335f00 -1 '/var/lib/ceph/mon/ceph-pohly-desktop' already exists and is not empty: monitor may already exist
+ log SUCCESS
+ '[' -z SUCCESS ']'
++ date '+%F %T'
+ TIMESTAMP='2018-03-13 18:59:39'
+ echo '2018-03-13 18:59:39  /start_mon.sh: SUCCESS'
+ return 0
+ exec /usr/bin/ceph-mon --cluster ceph --setuser ceph --setgroup ceph -d -i pohly-desktop --mon-data /var/lib/ceph/mon/ceph-pohly-desktop --public-addr 127.0.0.1:6789
2018-03-13 18:59:39  /start_mon.sh: SUCCESS
2018-03-13 18:59:39.756113 7f3f7a7fff00  0 set uid:gid to 64045:64045 (ceph:ceph)
2018-03-13 18:59:39.756139 7f3f7a7fff00  0 ceph version 12.2.3 (2dab17a455c09584f2a85e6b10888337d1ec8949) luminous (stable), process (unknown), pid 1
2018-03-13 18:59:39.756228 7f3f7a7fff00  0 pidfile_write: ignore empty --pid-file
2018-03-13 18:59:39.764761 7f3f7a7fff00  0 load: jerasure load: lrc load: isa 
2018-03-13 18:59:39.764863 7f3f7a7fff00  0  set rocksdb option compression = kNoCompression
2018-03-13 18:59:39.764877 7f3f7a7fff00  0  set rocksdb option write_buffer_size = 33554432
2018-03-13 18:59:39.764899 7f3f7a7fff00  0  set rocksdb option compression = kNoCompression
2018-03-13 18:59:39.764904 7f3f7a7fff00  0  set rocksdb option write_buffer_size = 33554432
2018-03-13 18:59:39.765053 7f3f7a7fff00  4 rocksdb: RocksDB version: 5.4.0

2018-03-13 18:59:39.765065 7f3f7a7fff00  4 rocksdb: Git sha rocksdb_build_git_sha:@0@
2018-03-13 18:59:39.765067 7f3f7a7fff00  4 rocksdb: Compile date Feb 19 2018
2018-03-13 18:59:39.765068 7f3f7a7fff00  4 rocksdb: DB SUMMARY

2018-03-13 18:59:39.765124 7f3f7a7fff00  4 rocksdb: CURRENT file:  CURRENT

2018-03-13 18:59:39.765131 7f3f7a7fff00  4 rocksdb: IDENTITY file:  IDENTITY

2018-03-13 18:59:39.765136 7f3f7a7fff00  4 rocksdb: MANIFEST file:  MANIFEST-000110 size: 109 Bytes

2018-03-13 18:59:39.765139 7f3f7a7fff00  4 rocksdb: SST files in /var/lib/ceph/mon/ceph-pohly-desktop/store.db dir, Total Num: 1, files: 000004.sst 

2018-03-13 18:59:39.765142 7f3f7a7fff00  4 rocksdb: Write Ahead Log file in /var/lib/ceph/mon/ceph-pohly-desktop/store.db: 000111.log size: 0 ; 

2018-03-13 18:59:39.765144 7f3f7a7fff00  4 rocksdb:                         Options.error_if_exists: 0
2018-03-13 18:59:39.765145 7f3f7a7fff00  4 rocksdb:                       Options.create_if_missing: 0
2018-03-13 18:59:39.765146 7f3f7a7fff00  4 rocksdb:                         Options.paranoid_checks: 1
2018-03-13 18:59:39.765147 7f3f7a7fff00  4 rocksdb:                                     Options.env: 0x557b73957f20
2018-03-13 18:59:39.765149 7f3f7a7fff00  4 rocksdb:                                Options.info_log: 0x557b757689a0
2018-03-13 18:59:39.765150 7f3f7a7fff00  4 rocksdb:                          Options.max_open_files: -1
2018-03-13 18:59:39.765151 7f3f7a7fff00  4 rocksdb:                Options.max_file_opening_threads: 16
2018-03-13 18:59:39.765152 7f3f7a7fff00  4 rocksdb:                               Options.use_fsync: 0
2018-03-13 18:59:39.765153 7f3f7a7fff00  4 rocksdb:                       Options.max_log_file_size: 0
2018-03-13 18:59:39.765155 7f3f7a7fff00  4 rocksdb:                  Options.max_manifest_file_size: 18446744073709551615
2018-03-13 18:59:39.765156 7f3f7a7fff00  4 rocksdb:                   Options.log_file_time_to_roll: 0
2018-03-13 18:59:39.765157 7f3f7a7fff00  4 rocksdb:                       Options.keep_log_file_num: 1000
2018-03-13 18:59:39.765158 7f3f7a7fff00  4 rocksdb:                    Options.recycle_log_file_num: 0
2018-03-13 18:59:39.765159 7f3f7a7fff00  4 rocksdb:                         Options.allow_fallocate: 1
2018-03-13 18:59:39.765160 7f3f7a7fff00  4 rocksdb:                        Options.allow_mmap_reads: 0
2018-03-13 18:59:39.765161 7f3f7a7fff00  4 rocksdb:                       Options.allow_mmap_writes: 0
2018-03-13 18:59:39.765162 7f3f7a7fff00  4 rocksdb:                        Options.use_direct_reads: 0
2018-03-13 18:59:39.765178 7f3f7a7fff00  4 rocksdb:                        Options.use_direct_io_for_flush_and_compaction: 0
2018-03-13 18:59:39.765179 7f3f7a7fff00  4 rocksdb:          Options.create_missing_column_families: 0
2018-03-13 18:59:39.765180 7f3f7a7fff00  4 rocksdb:                              Options.db_log_dir: 
2018-03-13 18:59:39.765181 7f3f7a7fff00  4 rocksdb:                                 Options.wal_dir: /var/lib/ceph/mon/ceph-pohly-desktop/store.db
2018-03-13 18:59:39.765182 7f3f7a7fff00  4 rocksdb:                Options.table_cache_numshardbits: 6
2018-03-13 18:59:39.765183 7f3f7a7fff00  4 rocksdb:                      Options.max_subcompactions: 1
2018-03-13 18:59:39.765185 7f3f7a7fff00  4 rocksdb:                  Options.max_background_flushes: 1
2018-03-13 18:59:39.765186 7f3f7a7fff00  4 rocksdb:                         Options.WAL_ttl_seconds: 0
2018-03-13 18:59:39.765186 7f3f7a7fff00  4 rocksdb:                       Options.WAL_size_limit_MB: 0
2018-03-13 18:59:39.765187 7f3f7a7fff00  4 rocksdb:             Options.manifest_preallocation_size: 4194304
2018-03-13 18:59:39.765189 7f3f7a7fff00  4 rocksdb:                     Options.is_fd_close_on_exec: 1
2018-03-13 18:59:39.765190 7f3f7a7fff00  4 rocksdb:                   Options.advise_random_on_open: 1
2018-03-13 18:59:39.765191 7f3f7a7fff00  4 rocksdb:                    Options.db_write_buffer_size: 0
2018-03-13 18:59:39.765192 7f3f7a7fff00  4 rocksdb:         Options.access_hint_on_compaction_start: 1
2018-03-13 18:59:39.765193 7f3f7a7fff00  4 rocksdb:  Options.new_table_reader_for_compaction_inputs: 0
2018-03-13 18:59:39.765194 7f3f7a7fff00  4 rocksdb:               Options.compaction_readahead_size: 0
2018-03-13 18:59:39.765195 7f3f7a7fff00  4 rocksdb:           Options.random_access_max_buffer_size: 1048576
2018-03-13 18:59:39.765196 7f3f7a7fff00  4 rocksdb:           Options.writable_file_max_buffer_size: 1048576
2018-03-13 18:59:39.765197 7f3f7a7fff00  4 rocksdb:                      Options.use_adaptive_mutex: 0
2018-03-13 18:59:39.765198 7f3f7a7fff00  4 rocksdb:                            Options.rate_limiter: (nil)
2018-03-13 18:59:39.765199 7f3f7a7fff00  4 rocksdb:     Options.sst_file_manager.rate_bytes_per_sec: 0
2018-03-13 18:59:39.765200 7f3f7a7fff00  4 rocksdb:                          Options.bytes_per_sync: 0
2018-03-13 18:59:39.765201 7f3f7a7fff00  4 rocksdb:                      Options.wal_bytes_per_sync: 0
2018-03-13 18:59:39.765202 7f3f7a7fff00  4 rocksdb:                       Options.wal_recovery_mode: 2
2018-03-13 18:59:39.765203 7f3f7a7fff00  4 rocksdb:                  Options.enable_thread_tracking: 0
2018-03-13 18:59:39.765204 7f3f7a7fff00  4 rocksdb:         Options.allow_concurrent_memtable_write: 1
2018-03-13 18:59:39.765205 7f3f7a7fff00  4 rocksdb:      Options.enable_write_thread_adaptive_yield: 1
2018-03-13 18:59:39.765206 7f3f7a7fff00  4 rocksdb:             Options.write_thread_max_yield_usec: 100
2018-03-13 18:59:39.765207 7f3f7a7fff00  4 rocksdb:            Options.write_thread_slow_yield_usec: 3
2018-03-13 18:59:39.765208 7f3f7a7fff00  4 rocksdb:                               Options.row_cache: None
2018-03-13 18:59:39.765209 7f3f7a7fff00  4 rocksdb:                              Options.wal_filter: None
2018-03-13 18:59:39.765210 7f3f7a7fff00  4 rocksdb:             Options.avoid_flush_during_recovery: 0
2018-03-13 18:59:39.765211 7f3f7a7fff00  4 rocksdb:             Options.base_background_compactions: 1
2018-03-13 18:59:39.765212 7f3f7a7fff00  4 rocksdb:             Options.max_background_compactions: 1
2018-03-13 18:59:39.765214 7f3f7a7fff00  4 rocksdb:             Options.avoid_flush_during_shutdown: 0
2018-03-13 18:59:39.765215 7f3f7a7fff00  4 rocksdb:             Options.delayed_write_rate : 16777216
2018-03-13 18:59:39.765216 7f3f7a7fff00  4 rocksdb:             Options.max_total_wal_size: 0
2018-03-13 18:59:39.765217 7f3f7a7fff00  4 rocksdb:             Options.delete_obsolete_files_period_micros: 21600000000
2018-03-13 18:59:39.765218 7f3f7a7fff00  4 rocksdb:                   Options.stats_dump_period_sec: 600
2018-03-13 18:59:39.765219 7f3f7a7fff00  4 rocksdb: Compression algorithms supported:
2018-03-13 18:59:39.765220 7f3f7a7fff00  4 rocksdb: 	Snappy supported: 0
2018-03-13 18:59:39.765221 7f3f7a7fff00  4 rocksdb: 	Zlib supported: 0
2018-03-13 18:59:39.765223 7f3f7a7fff00  4 rocksdb: 	Bzip supported: 0
2018-03-13 18:59:39.765224 7f3f7a7fff00  4 rocksdb: 	LZ4 supported: 0
2018-03-13 18:59:39.765224 7f3f7a7fff00  4 rocksdb: 	ZSTD supported: 0
2018-03-13 18:59:39.765225 7f3f7a7fff00  4 rocksdb: Fast CRC32 supported: 0
2018-03-13 18:59:39.765616 7f3f7a7fff00  4 rocksdb: [/build/ceph-12.2.3/src/rocksdb/db/version_set.cc:2609] Recovering from manifest file: MANIFEST-000110

2018-03-13 18:59:39.765753 7f3f7a7fff00  4 rocksdb: [/build/ceph-12.2.3/src/rocksdb/db/column_family.cc:407] --------------- Options for column family [default]:

2018-03-13 18:59:39.765776 7f3f7a7fff00  4 rocksdb:               Options.comparator: leveldb.BytewiseComparator
2018-03-13 18:59:39.765778 7f3f7a7fff00  4 rocksdb:           Options.merge_operator: 
2018-03-13 18:59:39.765780 7f3f7a7fff00  4 rocksdb:        Options.compaction_filter: None
2018-03-13 18:59:39.765782 7f3f7a7fff00  4 rocksdb:        Options.compaction_filter_factory: None
2018-03-13 18:59:39.765783 7f3f7a7fff00  4 rocksdb:         Options.memtable_factory: SkipListFactory
2018-03-13 18:59:39.765785 7f3f7a7fff00  4 rocksdb:            Options.table_factory: BlockBasedTable
2018-03-13 18:59:39.765815 7f3f7a7fff00  4 rocksdb:            table_factory options:   flush_block_policy_factory: FlushBlockBySizePolicyFactory (0x557b754480f8)
  cache_index_and_filter_blocks: 1
  cache_index_and_filter_blocks_with_high_priority: 1
  pin_l0_filter_and_index_blocks_in_cache: 1
  index_type: 0
  hash_index_allow_collision: 1
  checksum: 1
  no_block_cache: 0
  block_cache: 0x557b75754400
  block_cache_name: LRUCache
  block_cache_options:
    capacity : 134217728
    num_shard_bits : 4
    strict_capacity_limit : 0
    high_pri_pool_ratio: 0.000
  block_cache_compressed: (nil)
  persistent_cache: (nil)
  block_size: 4096
  block_size_deviation: 10
  block_restart_interval: 16
  index_block_restart_interval: 1
  filter_policy: rocksdb.BuiltinBloomFilter
  whole_key_filtering: 1
  format_version: 2

2018-03-13 18:59:39.765873 7f3f7a7fff00  4 rocksdb:        Options.write_buffer_size: 33554432
2018-03-13 18:59:39.765879 7f3f7a7fff00  4 rocksdb:  Options.max_write_buffer_number: 2
2018-03-13 18:59:39.765881 7f3f7a7fff00  4 rocksdb:          Options.compression: NoCompression
2018-03-13 18:59:39.765884 7f3f7a7fff00  4 rocksdb:                  Options.bottommost_compression: Disabled
2018-03-13 18:59:39.765886 7f3f7a7fff00  4 rocksdb:       Options.prefix_extractor: nullptr
2018-03-13 18:59:39.765888 7f3f7a7fff00  4 rocksdb:   Options.memtable_insert_with_hint_prefix_extractor: nullptr
2018-03-13 18:59:39.765889 7f3f7a7fff00  4 rocksdb:             Options.num_levels: 7
2018-03-13 18:59:39.765890 7f3f7a7fff00  4 rocksdb:        Options.min_write_buffer_number_to_merge: 1
2018-03-13 18:59:39.765892 7f3f7a7fff00  4 rocksdb:     Options.max_write_buffer_number_to_maintain: 0
2018-03-13 18:59:39.765893 7f3f7a7fff00  4 rocksdb:            Options.compression_opts.window_bits: -14
2018-03-13 18:59:39.765894 7f3f7a7fff00  4 rocksdb:                  Options.compression_opts.level: -1
2018-03-13 18:59:39.765895 7f3f7a7fff00  4 rocksdb:               Options.compression_opts.strategy: 0
2018-03-13 18:59:39.765897 7f3f7a7fff00  4 rocksdb:         Options.compression_opts.max_dict_bytes: 0
2018-03-13 18:59:39.765898 7f3f7a7fff00  4 rocksdb:      Options.level0_file_num_compaction_trigger: 4
2018-03-13 18:59:39.765899 7f3f7a7fff00  4 rocksdb:          Options.level0_slowdown_writes_trigger: 20
2018-03-13 18:59:39.765901 7f3f7a7fff00  4 rocksdb:              Options.level0_stop_writes_trigger: 36
2018-03-13 18:59:39.765902 7f3f7a7fff00  4 rocksdb:                   Options.target_file_size_base: 67108864
2018-03-13 18:59:39.765903 7f3f7a7fff00  4 rocksdb:             Options.target_file_size_multiplier: 1
2018-03-13 18:59:39.765905 7f3f7a7fff00  4 rocksdb:                Options.max_bytes_for_level_base: 268435456
2018-03-13 18:59:39.765916 7f3f7a7fff00  4 rocksdb: Options.level_compaction_dynamic_level_bytes: 0
2018-03-13 18:59:39.765917 7f3f7a7fff00  4 rocksdb:          Options.max_bytes_for_level_multiplier: 10.000000
2018-03-13 18:59:39.765921 7f3f7a7fff00  4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[0]: 1
2018-03-13 18:59:39.765923 7f3f7a7fff00  4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[1]: 1
2018-03-13 18:59:39.765925 7f3f7a7fff00  4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[2]: 1
2018-03-13 18:59:39.765926 7f3f7a7fff00  4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[3]: 1
2018-03-13 18:59:39.765927 7f3f7a7fff00  4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[4]: 1
2018-03-13 18:59:39.765928 7f3f7a7fff00  4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[5]: 1
2018-03-13 18:59:39.765930 7f3f7a7fff00  4 rocksdb: Options.max_bytes_for_level_multiplier_addtl[6]: 1
2018-03-13 18:59:39.765931 7f3f7a7fff00  4 rocksdb:       Options.max_sequential_skip_in_iterations: 8
2018-03-13 18:59:39.765932 7f3f7a7fff00  4 rocksdb:                    Options.max_compaction_bytes: 1677721600
2018-03-13 18:59:39.765933 7f3f7a7fff00  4 rocksdb:                        Options.arena_block_size: 4194304
2018-03-13 18:59:39.765934 7f3f7a7fff00  4 rocksdb:   Options.soft_pending_compaction_bytes_limit: 68719476736
2018-03-13 18:59:39.765935 7f3f7a7fff00  4 rocksdb:   Options.hard_pending_compaction_bytes_limit: 274877906944
2018-03-13 18:59:39.765937 7f3f7a7fff00  4 rocksdb:       Options.rate_limit_delay_max_milliseconds: 100
2018-03-13 18:59:39.765938 7f3f7a7fff00  4 rocksdb:                Options.disable_auto_compactions: 0
2018-03-13 18:59:39.765940 7f3f7a7fff00  4 rocksdb:                         Options.compaction_style: kCompactionStyleLevel
2018-03-13 18:59:39.765951 7f3f7a7fff00  4 rocksdb:                           Options.compaction_pri: kByCompensatedSize
2018-03-13 18:59:39.765952 7f3f7a7fff00  4 rocksdb:  Options.compaction_options_universal.size_ratio: 1
2018-03-13 18:59:39.765954 7f3f7a7fff00  4 rocksdb: Options.compaction_options_universal.min_merge_width: 2
2018-03-13 18:59:39.765955 7f3f7a7fff00  4 rocksdb: Options.compaction_options_universal.max_merge_width: 4294967295
2018-03-13 18:59:39.765957 7f3f7a7fff00  4 rocksdb: Options.compaction_options_universal.max_size_amplification_percent: 200
2018-03-13 18:59:39.765958 7f3f7a7fff00  4 rocksdb: Options.compaction_options_universal.compression_size_percent: -1
2018-03-13 18:59:39.765959 7f3f7a7fff00  4 rocksdb: Options.compaction_options_fifo.max_table_files_size: 1073741824
2018-03-13 18:59:39.765961 7f3f7a7fff00  4 rocksdb:                   Options.table_properties_collectors: 
2018-03-13 18:59:39.765962 7f3f7a7fff00  4 rocksdb:                   Options.inplace_update_support: 0
2018-03-13 18:59:39.765963 7f3f7a7fff00  4 rocksdb:                 Options.inplace_update_num_locks: 10000
2018-03-13 18:59:39.765965 7f3f7a7fff00  4 rocksdb:               Options.memtable_prefix_bloom_size_ratio: 0.000000
2018-03-13 18:59:39.765967 7f3f7a7fff00  4 rocksdb:   Options.memtable_huge_page_size: 0
2018-03-13 18:59:39.765968 7f3f7a7fff00  4 rocksdb:                           Options.bloom_locality: 0
2018-03-13 18:59:39.765969 7f3f7a7fff00  4 rocksdb:                    Options.max_successive_merges: 0
2018-03-13 18:59:39.765970 7f3f7a7fff00  4 rocksdb:                Options.optimize_filters_for_hits: 0
2018-03-13 18:59:39.765971 7f3f7a7fff00  4 rocksdb:                Options.paranoid_file_checks: 0
2018-03-13 18:59:39.765972 7f3f7a7fff00  4 rocksdb:                Options.force_consistency_checks: 0
2018-03-13 18:59:39.765983 7f3f7a7fff00  4 rocksdb:                Options.report_bg_io_stats: 0
2018-03-13 18:59:39.767128 7f3f7a7fff00  4 rocksdb: [/build/ceph-12.2.3/src/rocksdb/db/version_set.cc:2859] Recovered from manifest file:/var/lib/ceph/mon/ceph-pohly-desktop/store.db/MANIFEST-000110 succeeded,manifest_file_number is 110, next_file_number is 112, last_sequence is 5, log_number is 0,prev_log_number is 0,max_column_family is 0

2018-03-13 18:59:39.767145 7f3f7a7fff00  4 rocksdb: [/build/ceph-12.2.3/src/rocksdb/db/version_set.cc:2867] Column family [default] (ID 0), log number is 109

2018-03-13 18:59:39.767239 7f3f7a7fff00  4 rocksdb: EVENT_LOG_v1 {"time_micros": 1520967579767228, "job": 1, "event": "recovery_started", "log_files": [111]}
2018-03-13 18:59:39.767250 7f3f7a7fff00  4 rocksdb: [/build/ceph-12.2.3/src/rocksdb/db/db_impl_open.cc:482] Recovering log #111 mode 2
2018-03-13 18:59:39.767358 7f3f7a7fff00  4 rocksdb: [/build/ceph-12.2.3/src/rocksdb/db/version_set.cc:2395] Creating manifest 113

2018-03-13 18:59:39.769145 7f3f7a7fff00  4 rocksdb: EVENT_LOG_v1 {"time_micros": 1520967579769139, "job": 1, "event": "recovery_finished"}
2018-03-13 18:59:39.773075 7f3f7a7fff00  4 rocksdb: [/build/ceph-12.2.3/src/rocksdb/db/db_impl_open.cc:1063] DB pointer 0x557b75868000
2018-03-13 18:59:39.773292 7f3f7a7fff00  0 mon.pohly-desktop does not exist in monmap, will attempt to join an existing cluster
2018-03-13 18:59:39.773699 7f3f7a7fff00  0 using public_addr 127.0.0.1:6789/0 -> 127.0.0.1:6789/0
2018-03-13 18:59:39.774640 7f3f7a7fff00  0 starting mon.pohly-desktop rank -1 at public addr 127.0.0.1:6789/0 at bind addr 127.0.0.1:6789/0 mon_data /var/lib/ceph/mon/ceph-pohly-desktop fsid 8811b121-9e17-43f7-a276-87246527f00a
2018-03-13 18:59:39.774806 7f3f7a7fff00  0 starting mon.pohly-desktop rank -1 at 127.0.0.1:6789/0 mon_data /var/lib/ceph/mon/ceph-pohly-desktop fsid 8811b121-9e17-43f7-a276-87246527f00a
2018-03-13 18:59:39.775574 7f3f7a7fff00  1 mon.pohly-desktop@-1(probing) e0 preinit fsid 8811b121-9e17-43f7-a276-87246527f00a
2018-03-13 18:59:39.775702 7f3f7a7fff00  1 mon.pohly-desktop@-1(probing).mds e0 Unable to load 'last_metadata'
2018-03-13 18:59:39.776171 7f3f7a7fff00 -1 auth: error reading file: /var/lib/ceph/mon/ceph-pohly-desktop/keyring: can't open /var/lib/ceph/mon/ceph-pohly-desktop/keyring: (2) No such file or directory
2018-03-13 18:59:39.776181 7f3f7a7fff00 -1 mon.pohly-desktop@-1(probing) e0 unable to load initial keyring /etc/ceph/ceph.mon.pohly-desktop.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,
2018-03-13 18:59:39.776185 7f3f7a7fff00 -1 failed to initialize

Note the failure: e0 unable to load initial keyring /etc/ceph/ceph.mon.pohly-desktop.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,
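A quick way to confirm whether this is the missing-secret problem rather than a mon bug is to check the chart's keyring-generator jobs and the mon keyring secret they are supposed to create (a diagnostic sketch, assuming the default secret name ceph-mon-keyring used elsewhere in these reports):

# Did the keyring-generator jobs complete, and does the mon keyring secret exist?
kubectl -n ceph get jobs
kubectl -n ceph get secret ceph-mon-keyring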

What you expected to happen:

I expect all pods to run normally, except for rgw and mds (no host for those).

How to reproduce it (as minimally and precisely as possible):

See above.

Anything else we need to know:

Note that I am working around issue #50 with ReadOnlyAPIDataVolumes=false
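For reference, that workaround is (as far as I understand) a kubelet feature gate; a sketch of where it goes, with the exact placement depending on how your kubelet is configured:

# Added to the kubelet arguments on each node (sketch only).
--feature-gates=ReadOnlyAPIDataVolumes=false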

Cannot generate keyring

Is this a request for help?:


BUG REPORT:

When I run "kubectl get pods -n ceph" after "helm install ...", I get:

NAME                                   READY   STATUS     RESTARTS   AGE
ceph-mds-75dc968dc7-kvp67              0/1     Init:0/2   0          18m
ceph-mgr-75f4c4dc76-x5r74              0/1     Init:0/2   0          18m
ceph-mon-check-85d59b5fd4-xjczs        0/1     Init:0/2   0          18m
ceph-mon-zjzpp                         0/3     Init:0/2   0          18m
ceph-osd-dev-sdd-clhpt                 0/1     Init:0/3   0          18m
ceph-osd-dev-sde-2fkmt                 0/1     Init:0/3   0          18m
ceph-rbd-provisioner-d59d65f74-nkj48   1/1     Running    0          18m
ceph-rbd-provisioner-d59d65f74-xqhmt   1/1     Running    0          18m
ceph-rgw-7598c7788-mbjt9               0/1     Init:0/2   0          18m

Then I described one of these failed pods and got:

Events:
  Type     Reason       Age                      From               Message
  ----     ------       ----                     ----               -------
  Normal   Scheduled    21m                      default-scheduler  Successfully assigned ceph/ceph-rgw-7598c7788-mbjt9 to minion
  Warning  FailedMount  14m (x5 over 14m)        kubelet, minion    MountVolume.SetUp failed for volume "ceph-bootstrap-rgw-keyring" : secret "ceph-bootstrap-rgw-keyring" not found
  Warning  FailedMount  14m (x5 over 14m)        kubelet, minion    MountVolume.SetUp failed for volume "ceph-mon-keyring" : secret "ceph-mon-keyring" not found
  Warning  FailedMount  14m (x5 over 14m)        kubelet, minion    MountVolume.SetUp failed for volume "ceph-bootstrap-mds-keyring" : secret "ceph-bootstrap-mds-keyring" not found
  Warning  FailedMount  8m15s (x11 over 14m)     kubelet, minion    MountVolume.SetUp failed for volume "ceph-bootstrap-osd-keyring" : secret "ceph-bootstrap-osd-keyring" not found
  Warning  FailedMount  4m11s (x13 over 14m)     kubelet, minion    MountVolume.SetUp failed for volume "ceph-client-admin-keyring" : secret "ceph-client-admin-keyring" not found
  Warning  FailedMount  <invalid> (x9 over 12m)  kubelet, minion    Unable to mount volumes for pod "ceph-rgw-7598c7788-mbjt9_ceph(44a7d63a-9230-11e9-9dd2-5820b10bbe42)": timeout expired waiting for volumes to attach or mount for pod "ceph"/"ceph-rgw-7598c7788-mbjt9". list of unmounted volumes=[ceph-client-admin-keyring ceph-mon-keyring ceph-bootstrap-osd-keyring ceph-bootstrap-mds-keyring ceph-bootstrap-rgw-keyring]. list of unattached volumes=[pod-etc-ceph ceph-bin ceph-etc pod-var-lib-ceph pod-run ceph-client-admin-keyring ceph-mon-keyring ceph-bootstrap-osd-keyring ceph-bootstrap-mds-keyring ceph-bootstrap-rgw-keyring default-token-rbjr6]

When I run "kubectl get job -n ceph", I get:

NAME                                  COMPLETIONS   DURATION   AGE
ceph-mds-keyring-generator            0/1           17m        17m
ceph-mgr-keyring-generator            0/1           17m        17m
ceph-mon-keyring-generator            0/1           17m        17m
ceph-namespace-client-key-generator   0/1           17m        17m
ceph-osd-keyring-generator            0/1           17m        17m
ceph-rgw-keyring-generator            0/1           17m        17m
ceph-storage-keys-generator           0/1           17m        17m
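
Since none of the generator jobs completed, their pod logs are the next place to look, for example:

kubectl -n ceph logs job/ceph-mon-keyring-generator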

I think the format of the file "ceph/ceph/templates/job-keyring.yaml" is wrong for k8s v1.14.1.

{{- if .Values.manifests.job_keyring }}
{{- $envAll := . }}
{{- if .Values.deployment.storage_secrets }}
{{- range $key1, $cephBootstrapKey := tuple "mds" "osd" "rgw" "mon" "mgr"}}
{{- $jobName := print $cephBootstrapKey "-keyring-generator" }}
---
apiVersion: batch/v1
kind: Job 
metadata:
  name: ceph-{{ $jobName }}
spec:
  template:
    metadata:
      labels:
{{ tuple $envAll "ceph" $jobName | include "helm-toolkit.snippets.kubernetes_metadata_labels" | indent 8 }}
    spec:
      restartPolicy: OnFailure
      nodeSelector:
        {{ $envAll.Values.labels.jobs.node_selector_key }}: {{ $envAll.Values.labels.jobs.node_selector_value }}
      containers:
        - name:  ceph-{{ $jobName }}
          image: {{ $envAll.Values.images.ceph_config_helper }}
          imagePullPolicy: {{ $envAll.Values.images.pull_policy }}
{{ tuple $envAll $envAll.Values.pod.resources.jobs.secret_provisioning | include "helm-toolkit.snippets.kubernetes_resources" | indent 10 }}
          env:
            - name: DEPLOYMENT_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: CEPH_GEN_DIR 
              value: /opt/ceph
            - name: CEPH_TEMPLATES_DIR
              value: /opt/ceph/templates
            {{- if eq $cephBootstrapKey "mon"}}
            - name: CEPH_KEYRING_NAME
              value: ceph.mon.keyring
            - name: CEPH_KEYRING_TEMPLATE
              value: mon.keyring
            {{- else }}
            - name: CEPH_KEYRING_NAME
              value: ceph.keyring
            - name: CEPH_KEYRING_TEMPLATE
              value: bootstrap.keyring.{{ $cephBootstrapKey }}
            {{- end }}
            - name: KUBE_SECRET_NAME
              value: {{  index $envAll.Values.secrets.keyrings $cephBootstrapKey }}
          command:
            - /opt/ceph/ceph-key.sh
          volumeMounts:
            - name: ceph-bin
              mountPath: /opt/ceph/ceph-key.sh
              subPath: ceph-key.sh
              readOnly: true
            - name: ceph-bin
              mountPath: /opt/ceph/ceph-key.py
              subPath: ceph-key.py
              readOnly: true
            - name: ceph-templates
              mountPath: /opt/ceph/templates
              readOnly: true
      volumes:
        - name: ceph-bin
          configMap:
            name: ceph-bin
            defaultMode: 0555
        - name: ceph-templates
          configMap:
            name: ceph-templates
            defaultMode: 0444
{{- end }}
{{- end }}
{{- end }}
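
One way to test whether the rendered job manifests themselves are malformed for this Kubernetes version (a debugging sketch; the chart source path is assumed to be a local checkout of ceph-helm) is to render them locally and dry-run them against the API server:

# Render the chart with the same overrides and validate the output
# without creating anything (client-side dry run).
helm template ./ceph-helm/ceph/ceph --namespace ceph -f ~/ceph-overrides.yaml > /tmp/ceph-rendered.yaml
kubectl apply --dry-run -f /tmp/ceph-rendered.yaml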

Version of Helm and Kubernetes:

Helm

Client: &version.Version{SemVer:"v2.14.1", GitCommit:"5270352a09c7e8b6e8c9593002a73535276507c0", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.14.1", GitCommit:"5270352a09c7e8b6e8c9593002a73535276507c0", GitTreeState:"clean"}

Kubernetes

Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:11:31Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.1", GitCommit:"b7394102d6ef778017f2ca4046abbaa23b88c290", GitTreeState:"clean", BuildDate:"2019-04-08T17:02:58Z", GoVersion:"go1.12.1", Compiler:"gc", Platform:"linux/amd64"}

Which chart:
local/ceph

What happened:

kubectl describe pod ceph-storage-keys-generator-9x6z8 -n ceph
++ ceph_gen_key
++ python /opt/ceph/ceph-key.py
+ CEPH_CLIENT_KEY=AQBhjwldAAAAABAAO9uRzdtFukGfCKVFFFPM1w==
+ create_kube_key AQBhjwldAAAAABAAO9uRzdtFukGfCKVFFFPM1w== ceph.client.admin.keyring admin.keyring ceph-client-admin-keyring
+ CEPH_KEYRING=AQBhjwldAAAAABAAO9uRzdtFukGfCKVFFFPM1w==
+ CEPH_KEYRING_NAME=ceph.client.admin.keyring
+ CEPH_KEYRING_TEMPLATE=admin.keyring
+ KUBE_SECRET_NAME=ceph-client-admin-keyring
+ kubectl get --namespace ceph secrets ceph-client-admin-keyring
Error from server (NotFound): secrets "ceph-client-admin-keyring" not found
+ cat
+ kubectl create --namespace ceph -f -
++ kube_ceph_keyring_gen AQBhjwldAAAAABAAO9uRzdtFukGfCKVFFFPM1w== admin.keyring
++ CEPH_KEY=AQBhjwldAAAAABAAO9uRzdtFukGfCKVFFFPM1w==
++ CEPH_KEY_TEMPLATE=admin.keyring
++ sed 's|{{ key }}|AQBhjwldAAAAABAAO9uRzdtFukGfCKVFFFPM1w==|' /opt/ceph/templates/admin.keyring
++ base64
++ tr -d '\n'
error: error validating "STDIN": error validating data: unknown; if you choose to ignore these errors, turn validation off with --validate=false

What you expected to happen:

When I run the command below, all jobs should be completed.

kubectl get job -n ceph

And after the command below, I should be able to see the various keyrings.

kubectl get secret -n ceph

How to reproduce it (as minimally and precisely as possible):

Just run command

helm install --name=ceph local/ceph --namespace=ceph -f ~/ceph-overrides.yaml

Anything else we need to know:

Waiting for you to resolve it online. Thanks! @rootfs @zmc @alfredodeza @liewegas @jecluis @ktdreyer

ceph-rbd-provisioner failing to create ceph-rbd storage class

Is this a request for help?: Yes


Is this a BUG REPORT or FEATURE REQUEST? (choose one): Bug Report

Version of Helm and Kubernetes: Helm 2.11 Minikube v0.24.1

Which chart: Not sure

What happened: Trying to create PVC after all pods come up

What you expected to happen: I create my PVC and everything works

How to reproduce it (as minimally and precisely as possible): Not sure

Anything else we need to know: I create a cluster based on your setup guide but when I go to create a PVC in Kubernetes

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ceph-pvc
spec:
  accessModes:
   - ReadWriteOnce
  resources:
    requests:
       storage: 20Gi
  storageClassName: ceph-rbd

It stays pending forever. This is the error I'm seeing in the logs.

  Warning  ProvisioningFailed  6s                ceph.com/rbd ceph-rbd-provisioner-85d57d8799-j6vbn cb6ca04b-3094-11e9-9c4b-0242ac110006  Failed to provision volume with StorageClass "ceph-rbd": failed to create rbd image: signal: segmentation fault (core dumped), command output: 2019-02-14 20:28:34.084001 7f94dca8fd80 -1 did not load config file, using default settings.
2019-02-14 20:28:34.143846 7f94dca8fd80 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin: (2) No such file or directory
*** Caught signal (Segmentation fault) **
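
The "unable to find a keyring" line suggests the provisioner never received admin credentials, so before digging into the segfault it is worth checking the storage class definition and the secrets it references (a diagnostic sketch; the secret names are whatever your storage class actually points at):

# Inspect the storage class and confirm the referenced admin/user secrets exist.
kubectl get storageclass ceph-rbd -o yaml
kubectl -n ceph get secrets | grep -i key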

ceph-osd fails to initialize when using non-bluestore and journals

Is this a request for help?:NO


Is this a BUG REPORT or FEATURE REQUEST? (choose one):BUG REPORT

Version of Helm and Kubernetes:Kubernetes 1.11, helm: 2.11

Which chart:ceph-helm

What happened:
tried to use non-bluestore configuration with separate journals.

What you expected to happen:
OSD should activate.

How to reproduce it (as minimally and precisely as possible):
Simple two drive system. Set up values.yaml to use non-bluestore object storage.

Anything else we need to know:
Attaching potential "fix" for ceph/ceph/templates/bin/_osd_disk_activate.sh.tpl

#!/bin/bash
set -ex

function osd_activate {
  if [[ -z "${OSD_DEVICE}" ]]; then
    log "ERROR- You must provide a device to build your OSD ie: /dev/sdb"
    exit 1
  fi

  CEPH_DISK_OPTIONS=""
  CEPH_OSD_OPTIONS=""

  DATA_UUID=$(blkid -o value -s PARTUUID ${OSD_DEVICE}1)
  LOCKBOX_UUID=$(blkid -o value -s PARTUUID ${OSD_DEVICE}3 || true)
  JOURNAL_PART=$(dev_part ${OSD_DEVICE} 2)
  ACTUAL_OSD_DEVICE=$(readlink -f ${OSD_DEVICE}) # resolve /dev/disk/by-* names

  # watch the udev event queue, and exit if all current events are handled
  udevadm settle --timeout=600

  # wait till partition exists then activate it
  if [[ -n "${OSD_JOURNAL}" ]]; then
    #wait_for_file /dev/disk/by-partuuid/${OSD_JOURNAL_UUID}
    #chown ceph. /dev/disk/by-partuuid/${OSD_JOURNAL_UUID}
    #CEPH_OSD_OPTIONS="${CEPH_OSD_OPTIONS} --osd-journal /dev/disk/by-partuuid/${OSD_JOURNAL_UUID}"
    CEPH_OSD_OPTIONS="${CEPH_OSD_OPTIONS}"
  else
    wait_for_file $(dev_part ${OSD_DEVICE} 1)
    chown ceph. $JOURNAL_PART
  fi

  chown ceph. /var/log/ceph

  DATA_PART=$(dev_part ${OSD_DEVICE} 1)
  MOUNTED_PART=${DATA_PART}

  if [[ ${OSD_DMCRYPT} -eq 1 ]]; then
    echo "Mounting LOCKBOX directory"
    # NOTE(leseb): adding || true so when this bug will be fixed the entrypoint will not fail
    # Ceph bug tracker: http://tracker.ceph.com/issues/18945
    mkdir -p /var/lib/ceph/osd-lockbox/${DATA_UUID}
    mount /dev/disk/by-partuuid/${LOCKBOX_UUID} /var/lib/ceph/osd-lockbox/${DATA_UUID} || true
    CEPH_DISK_OPTIONS="$CEPH_DISK_OPTIONS --dmcrypt"
    MOUNTED_PART="/dev/mapper/${DATA_UUID}"
  fi

  ceph-disk -v --setuser ceph --setgroup disk activate ${CEPH_DISK_OPTIONS} --no-start-daemon ${DATA_PART}

  OSD_ID=$(grep "${MOUNTED_PART}" /proc/mounts | awk '{print $2}' | grep -oh '[0-9]*')
  OSD_PATH=$(get_osd_path $OSD_ID)
  OSD_KEYRING="$OSD_PATH/keyring"
  OSD_WEIGHT=$(df -P -k $OSD_PATH | tail -1 | awk '{ d= $2/1073741824 ; r = sprintf("%.2f", d); print r }')
  ceph ${CLI_OPTS} --name=osd.${OSD_ID} --keyring=$OSD_KEYRING osd crush create-or-move -- ${OSD_ID} ${OSD_WEIGHT} ${CRUSH_LOCATION}

  log "SUCCESS"
  exec /usr/bin/ceph-osd ${CLI_OPTS} ${CEPH_OSD_OPTIONS} -f -i ${OSD_ID} --setuser ceph --setgroup disk
}

Secret not found for ceph helm

Describe the bug

Secret not found

Version of Helm and Kubernetes:
k8s: v1.15.7
Helm: v2.16.1

Which chart:
helm-ceph

What happened:
Secret is missing

What you expected to happen
I expected the outcome to look like the output below, but nothing is running except the rbd pods.

$ kubectl -n ceph get pods
NAME                                    READY   STATUS    RESTARTS   AGE
ceph-mds-3804776627-976z9               0/1     Pending   0          1m
ceph-mgr-3367933990-b368c               1/1     Running   0          1m
ceph-mon-check-1818208419-0vkb7         1/1     Running   0          1m
ceph-mon-cppdk                          3/3     Running   0          1m
ceph-mon-t4stn                          3/3     Running   0          1m
ceph-mon-vqzl0                          3/3     Running   0          1m
ceph-osd-dev-sdd-6dphp                  1/1     Running   0          1m
ceph-osd-dev-sdd-6w7ng                  1/1     Running   0          1m
ceph-osd-dev-sdd-l80vv                  1/1     Running   0          1m
ceph-osd-dev-sde-6dq6w                  1/1     Running   0          1m
ceph-osd-dev-sde-kqt0r                  1/1     Running   0          1m
ceph-osd-dev-sde-lp2pf                  1/1     Running   0          1m
ceph-rbd-provisioner-2099367036-4prvt   1/1     Running   0          1m
ceph-rbd-provisioner-2099367036-h9kw7   1/1     Running   0          1m
ceph-rgw-3375847861-4wr74               0/1     Pending   0          1m

How to reproduce it (as minimally and precisely as possible):
cd /tmp
curl https://raw.githubusercontent.com/kubernetes/helm/master/scripts/get > install-helm.sh
bash install-helm.sh
helm init
helm serve
helm repo add local http://localhost:8879/charts
git clone https://github.com/ceph/ceph-helm
cd ceph-helm/ceph
sudo apt install -y make
make

sudo nano ceph-overrides.yaml

network:
  public: 172.16.0.0/16
  cluster: 172.16.0.0/16

osd_devices:
  - name: dev-sdb1
    device: /dev/sdb
    zap: "1"

storageclass:
  name: ceph-rbd
  pool: rbd
  user_id: k8s

kubectl create namespace ceph
kubectl create -f /home/vagrant/ceph-helm/ceph/rbac.yaml
kubectl label node node-1 ceph-mon=enabled ceph-mgr=enabled
kubectl label node node-1 ceph-osd=enabled ceph-osd-device-dev-sdb=enabled
kubectl label node node-2 ceph-osd=enabled ceph-osd-device-dev-sdb=enabled
kubectl label node node-3 ceph-osd=enabled ceph-osd-device-dev-sdb=enabled

helm install --name=ceph /home/vagrant/ceph-helm/ceph/ceph --namespace=ceph -f /home/vagrant/ceph-helm/ceph/ceph-overrides.yaml

Anything else we need to know:
I followed the steps on https://docs.ceph.com/docs/mimic/start/kube-helm/

RGW doesn't work

Is this a request for help?:
Yes

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
This is a BUG REPORT

Version of Helm and Kubernetes:
k8s - 1.8.7
helm - v2.8.2

Which chart:
ceph

What happened:
The rgw container doesn't work. It writes this log:

Initialization timeout, failed to initialize

What you expected to happen:
rgw works and binds port 8088

How to reproduce it (as minimally and precisely as possible):
just run

helm install --name=ceph local/ceph --namespace=ceph -f ./ceph-overrides.yaml

Anything else we need to know:

How can I adjust the number of pgs ?

I have a running Ceph deployment in Kubernetes. What is the suggested way to adjust the number of PGs dynamically (as more OSDs/nodes get added) with the ceph-helm deployment?
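
For reference, one common manual approach (not necessarily the chart's intended workflow) is to raise pg_num/pgp_num on the affected pool from any pod that has the admin keyring, e.g. the ceph-mon pod:

# Example only: raise the rbd pool to 128 PGs (pg_num can only be increased).
ceph osd pool set rbd pg_num 128
ceph osd pool set rbd pgp_num 128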

can't use md0 as device

Is this a request for help?: no

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT

Version of Helm and Kubernetes:
helm

Client: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.9.1", GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}

kubectl

Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.1", GitCommit:"b1b29978270dc22fecc592ac55d903350454310a", GitTreeState:"clean", BuildDate:"2018-07-17T18:53:20Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"10+", GitVersion:"v1.10.5-gke.4", GitCommit:"6265b9797fc8680c8395abeab12c1e3bad14069a", GitTreeState:"clean", BuildDate:"2018-08-04T03:47:40Z", GoVersion:"go1.9.3b4", Compiler:"gc", Platform:"linux/amd64"}

Which chart:
ceph-helm

What happened:
Tried to set up an mdadm device to stripe 2 disks in RAID 0 and handle them as a single OSD.
It does not finish the osd-device setup properly.

What you expected to happen:
for the setup to finish and work as well as it does with any other sdb/sdc/sdd...

How to reproduce it (as minimally and precisely as possible):
Create an md0 device and use it as you would any other sdX device (OSD device).
The setup fails because the osd-activate-pod crashes with:

2018-08-13T17:14:53.260443418Z command_check_call: Running command: /usr/bin/ceph-osd --cluster ceph --mkfs -i 9 --monmap /var/lib/ceph/tmp/mnt.eawl9p/activate.monmap --osd-data /var/lib/ceph/tmp/mnt.eawl9p --osd-journal /var/lib/ceph/tmp/mnt.eawl9p/journal --osd-uuid 36356a07-a91f-4625-8b6f-864dd991de5f --setuser ceph --setgroup disk
2018-08-13T17:14:53.324461741Z 2018-08-13 17:14:53.324174 7fc7564cee00 -1 filestore(/var/lib/ceph/tmp/mnt.eawl9p) mkjournal(1066): error creating journal on /var/lib/ceph/tmp/mnt.eawl9p/journal: (2) No such file or directory
2018-08-13T17:14:53.324483608Z 2018-08-13 17:14:53.324256 7fc7564cee00 -1 OSD::mkfs: ObjectStore::mkfs failed with error (2) No such file or directory
2018-08-13T17:14:53.324849587Z 2018-08-13 17:14:53.324610 7fc7564cee00 -1  ** ERROR: error creating empty object store in /var/lib/ceph/tmp/mnt.eawl9p: (2) No such file or directory
2018-08-13T17:14:53.329122347Z mount_activate: Failed to activate
2018-08-13T17:14:53.329225389Z unmount: Unmounting /var/lib/ceph/tmp/mnt.eawl9p
2018-08-13T17:14:53.3294495Z command_check_call: Running command: /bin/umount -- /var/lib/ceph/tmp/mnt.eawl9p
2018-08-13T17:14:53.375884854Z Traceback (most recent call last):
2018-08-13T17:14:53.375907887Z   File "/usr/sbin/ceph-disk", line 9, in <module>
2018-08-13T17:14:53.375913364Z     load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
2018-08-13T17:14:53.375918173Z   File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 5717, in run
2018-08-13T17:14:53.377208587Z     main(sys.argv[1:])
2018-08-13T17:14:53.377222215Z   File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 5668, in main
2018-08-13T17:14:53.37842527Z     args.func(args)
2018-08-13T17:14:53.378439165Z   File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 3758, in main_activate
2018-08-13T17:14:53.379145782Z     reactivate=args.reactivate,
2018-08-13T17:14:53.379156768Z   File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 3521, in mount_activate
2018-08-13T17:14:53.379899211Z     (osd_id, cluster) = activate(path, activate_key_template, init)
2018-08-13T17:14:53.379910301Z   File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 3698, in activate
2018-08-13T17:14:53.380577968Z     keyring=keyring,
2018-08-13T17:14:53.380589482Z   File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 3165, in mkfs
2018-08-13T17:14:53.381196441Z     '--setgroup', get_ceph_group(),
2018-08-13T17:14:53.381206848Z   File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 566, in command_check_call
2018-08-13T17:14:53.381212315Z     return subprocess.check_call(arguments)
2018-08-13T17:14:53.381216601Z   File "/usr/lib/python2.7/subprocess.py", line 541, in check_call
2018-08-13T17:14:53.381482189Z     raise CalledProcessError(retcode, cmd)
2018-08-13T17:14:53.381659138Z subprocess.CalledProcessError: Command '['/usr/bin/ceph-osd', '--cluster', 'ceph', '--mkfs', '-i', u'9', '--monmap', '/var/lib/ceph/tmp/mnt.eawl9p/activate.monmap', '--osd-data', '/var/lib/ceph/tmp/mnt.eawl9p', '--osd-journal', '/var/lib/ceph/tmp/mnt.eawl9p/journal', '--osd-uuid', u'36356a07-a91f-4625-8b6f-864dd991de5f', '--setuser', 'ceph', '--setgroup', 'disk']' returned non-zero exit status 1

Anything else we need to know:
I found a place where it is assumed that partition N of a device can be named by just appending the number to the device name. This is true for sdX1, for example, but not for mdXp1.
I applied the following patch, but it still doesn't work.

diff --git a/ceph/ceph/templates/bin/_osd_disk_prepare.sh.tpl b/ceph/ceph/templates/bin/_osd_disk_prepare.sh.tpl
index eda2b3f..88cf800 100644
--- a/ceph/ceph/templates/bin/_osd_disk_prepare.sh.tpl
+++ b/ceph/ceph/templates/bin/_osd_disk_prepare.sh.tpl
@@ -27,7 +27,7 @@ function osd_disk_prepare {
     log "Checking if it belongs to this cluster"
     tmp_osd_mount="/var/lib/ceph/tmp/`echo $RANDOM`/"
     mkdir -p $tmp_osd_mount
-    mount ${OSD_DEVICE}1 ${tmp_osd_mount}
+    mount $(dev_part ${OSD_DEVICE} 1) ${tmp_osd_mount}
     osd_cluster_fsid=`cat ${tmp_osd_mount}/ceph_fsid`
     umount ${tmp_osd_mount} && rmdir ${tmp_osd_mount}
     cluster_fsid=`ceph ${CLI_OPTS} --name client.bootstrap-osd --keyring $OSD_BOOTSTRAP_KEYRING fsid`
@@ -56,7 +56,7 @@ function osd_disk_prepare {
     echo "Unmounting LOCKBOX directory"
     # NOTE(leseb): adding || true so when this bug will be fixed the entrypoint will not fail
     # Ceph bug tracker: http://tracker.ceph.com/issues/18944
-    DATA_UUID=$(blkid -o value -s PARTUUID ${OSD_DEVICE}1)
+    DATA_UUID=$(blkid -o value -s PARTUUID $(dev_part ${OSD_DEVICE} 1))
     umount /var/lib/ceph/osd-lockbox/${DATA_UUID} || true
   else
     ceph-disk -v prepare ${CLI_OPTS} --journal-uuid ${OSD_JOURNAL_UUID} ${OSD_DEVICE} ${OSD_JOURNAL}
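
For context, dev_part is the helper referenced in the patch above (it lives in the chart's common functions); a hypothetical sketch of the idea, not the chart's actual implementation, is to insert a "p" separator when the base device name ends in a digit, which is what md0/nvme0n1/loop0 partitions need:

# Hypothetical helper sketch: md0 -> md0p1, sdb -> sdb1
dev_part() {
  local device=$1
  local part_nb=$2
  if [[ "${device}" =~ [0-9]$ ]]; then
    echo "${device}p${part_nb}"
  else
    echo "${device}${part_nb}"
  fi
}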

Mounting CephFS is giving an Error.

Request for Help.
I am using ceph-helm and am also running an MDS with it.
ceph -s shows that everything is up and running.

I am unable to mount ceph fs to a pod.

I am using this ceph-test-fs.yaml

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: cephfs
  name: ceph-cephfs-test
  namespace: ceph
spec:
  nodeSelector:
    node-type: storage
  containers:
  - name: cephfs-rw
    image: busybox
    command:
    - sh
    - -c
    - while true; do sleep 1; done
    volumeMounts:
    - mountPath: "/mnt/cephfs"
      name: cephfs
  volumes:
  - name: cephfs
    cephfs:
      monitors:
# This only works if you have skyDNS resolvable from the kubernetes node. Otherwise you must manually put in one or more mon pod IPs.
      - ceph-mon.ceph:6789
      user: admin
      secretRef:
        name: ??

I am not sure which keyring/secret to use to be able to mount.
I also tried all the secrets generated by the ceph-helm deployment.

Can anyone help me with mounting ceph fs?

Here is the error log:

MountVolume.SetUp failed for volume "cephfs" : CephFS: mount failed: mount failed: exit status 32 Mounting command: systemd-run Mounting arguments: --description=Kubernetes transient mount for /var/lib/kubelet/pods/7d82c944-e01f-11e7-bf03-001c42d61047/volumes/kubernetes.io~cephfs/cephfs --scope -- mount -t ceph -o name=admin,secretfile=/etc/ceph/admin.secret ceph-mon.ceph:6789:/ /var/lib/kubelet/pods/7d82c944-e01f-11e7-bf03-001c42d61047/volumes/kubernetes.io~cephfs/cephfs Output: Running scope as unit run-10734.scope. mount: wrong fs type, bad option, bad superblock on ceph-mon.ceph:6789:/, missing codepage or helper program, or other error In some cases useful info is found in syslog - try dmesg | tail or so.
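
Regarding the secretRef question above: the in-tree cephfs volume plugin expects a secret whose data contains the raw Ceph key under the name "key", not a full keyring file, so one possible approach (an untested sketch, assuming the chart created the ceph-client-admin-keyring secret as in the other reports here) is to extract the admin key and wrap it in a new secret:

# Pull the admin key out of the chart-generated keyring secret and create a
# "key"-style secret that the cephfs volume's secretRef can point at.
ADMIN_KEY=$(kubectl -n ceph get secret ceph-client-admin-keyring \
  -o jsonpath='{.data.ceph\.client\.admin\.keyring}' | base64 -d \
  | awk '/key =/ {print $3}')
kubectl -n ceph create secret generic ceph-admin-secret --from-literal=key="${ADMIN_KEY}"

Then set secretRef.name to ceph-admin-secret in the pod spec above.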

ceph-mon, ceph-osd-*, ceph-mgr do not start on k8s v1.9.5

Is this a BUG REPORT or FEATURE REQUEST? (choose one):
bug

Version of Helm and Kubernetes:

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.2", GitCommit:"81753b10df112992bf51bbc2c2f85208aad78335", GitTreeState:"clean", BuildDate:"2018-04-27T09:22:21Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.5", GitCommit:"f01a2bf98249a4db383560443a59bed0c13575df", GitTreeState:"clean", BuildDate:"2018-03-19T15:50:45Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"linux/amd64"}

$ helm version
Client: &version.Version{SemVer:"v2.9.0", GitCommit:"f6025bb9ee7daf9fee0026541c90a6f557a3e0bc", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.9.0", GitCommit:"f6025bb9ee7daf9fee0026541c90a6f557a3e0bc", GitTreeState:"clean"}

Which chart:

$ helm ls
NAME    REVISION        UPDATED                         STATUS          CHART           NAMESPACE
ceph    1               Wed May  9 00:34:18 2018        DEPLOYED        ceph-0.1.0      ceph     

What happened:
ceph-mon, ceph-osd-*, ceph-mgr do not start. Fails with error:

container_linux.go:247: starting container process caused "process_linux.go:359: container init caused rootfs_linux.go:54: mounting "/var/lib/kubelet/pods/.../volume-subpaths/ceph-bootstrap-rgw-keyring/ceph-mon/8" to rootfs "/var/lib/docker/aufs/mnt/..." at "/var/lib/docker/aufs/mnt/.../var/lib/ceph/bootstrap-rgw/ceph.keyring" caused "not a directory"

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):
Run the steps from http://docs.ceph.com/docs/master/start/kube-helm/ on k8s 1.9.5

Anything else we need to know:
Looks like related to kubernetes/kubernetes#62417

It's an expected error due to a security fix.

procMount for k8s 1.12

The current version of ceph-helm fails to install on k8s v1.12, which requires procMount to be set in the security context.

The templates for the OSD set a securityContext with privileged: true; procMount: Default is now required as well. There is one instance in daemonset-osd.yaml and two instances in daemonset-osd-devices.yaml. I made the edits locally to verify, but I am not set up to cleanly make the edits within the git repo.
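
A rough sketch of the kind of local edit described above (illustrative only, not a tested patch against the actual templates), added to each OSD container's securityContext:

        securityContext:
          privileged: true
          procMount: Default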

ceph osd fails due to /dev/sdX being changed across reboots.

Is this a request for help?:No


Is this a BUG REPORT or FEATURE REQUEST? (choose one):BUG REPORT

Version of Helm and Kubernetes:Helm: 2.11.0, Kubernetes 1.11.6

Which chart: ceph-helm

What happened:
Servers were configured with SAS controllers and an onboard ATA controller, i.e. two sets of SSD/HDD controllers. Across reboots the /dev/ drive names changed, e.g. the drive on SAS controller port 1 became /dev/sdc after a reboot when it had previously been /dev/sda. This is not uncommon.
The values.yaml file was configured to avoid this situation by using by-path names rather than /dev/sdX values:

osd_devices:
  - name: nvsedcog-osd-1
    device: /dev/disk/by-path/pci-0000:00:11.4-ata-1
    journal: /dev/disk/by-path/pci-0000:00:1f.2-ata-2
  - name: nvsedcog-osd-2
    device: /dev/disk/by-path/pci-0000:00:11.4-ata-3
    journal: /dev/disk/by-path/pci-0000:00:1f.2-ata-2

What you expected to happen:
_osd_disk_activate.sh.tpl, _osd_disk_prepare.sh.tpl should have found the correct device name using readlink and used the corresponding /dev/sdX device.
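
The kind of resolution being suggested, as a standalone sketch (hypothetical snippet, not the chart's current code):

# Resolve a stable by-path name from values.yaml to whatever /dev/sdX it is today.
OSD_DEVICE=/dev/disk/by-path/pci-0000:00:11.4-ata-1
ACTUAL_OSD_DEVICE=$(readlink -f "${OSD_DEVICE}")
echo "Activating ${ACTUAL_OSD_DEVICE} (resolved from ${OSD_DEVICE})"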

How to reproduce it (as minimally and precisely as possible):

A SAS controller is not necessary - given 3 drives, /dev/sda, /dev/sdb, /dev/sdc, install ceph on /dev/sda and /dev/sdc.
Shutdown the server and remove /dev/sdb.
On restart, osd1 or the osd attached to /dev/sdc will fail.

Anything else we need to know:
I'm attaching the "fixes" I made to support by-path names in the values.yaml file:

_osd_disk_prepare.sh.tpl.txt
_osd_disk_activate.sh.tpl.txt

[secrets "ceph-bootstrap-osd-keyring" not found] ceph-osd fail on k8s 1.10.1 install by kubeadm


kubectl get po -n ceph
NAME                                            READY     STATUS                  RESTARTS   AGE
ceph-mds-696bd98bdb-bnvpg                       0/1       Pending                 0          18m
ceph-mds-keyring-generator-q679r                0/1       Completed               0          18m
ceph-mgr-6d5f86d9c4-nr76h                       1/1       Running                 1          18m
ceph-mgr-keyring-generator-v825z                0/1       Completed               0          18m
ceph-mon-86lth                                  1/1       Running                 0          18m
ceph-mon-check-74d98c5b95-wf9tm                 1/1       Running                 0          18m
ceph-mon-keyring-generator-rfg8j                0/1       Completed               0          18m
ceph-mon-pp5hc                                  1/1       Running                 0          18m
ceph-namespace-client-key-cleaner-g9dri-sjmqd   0/1       Completed               0          1h
ceph-namespace-client-key-cleaner-qwkee-pdkh6   0/1       Completed               0          21m
ceph-namespace-client-key-cleaner-t25ui-5gkb7   0/1       Completed               0          2d
ceph-namespace-client-key-generator-xk4w6       0/1       Completed               0          18m
ceph-osd-dev-sda-6jbgd                          0/1       Init:CrashLoopBackOff   8          18m
ceph-osd-dev-sda-khfhw                          0/1       Init:CrashLoopBackOff   8          18m
ceph-osd-dev-sda-krkjf                          0/1       Init:CrashLoopBackOff   8          18m
ceph-osd-keyring-generator-mvktj                0/1       Completed               0          18m
ceph-rbd-provisioner-b58659dc9-nhx2q            1/1       Running                 0          18m
ceph-rbd-provisioner-b58659dc9-nnlh2            1/1       Running                 0          18m
ceph-rgw-5bd9dd66c5-gh946                       0/1       Pending                 0          18m
ceph-rgw-keyring-generator-dz9kd                0/1       Completed               0          18m
ceph-storage-admin-key-cleaner-1as0t-fq589      0/1       Completed               0          1h
ceph-storage-admin-key-cleaner-oayjp-fglzr      0/1       Completed               0          2d
ceph-storage-admin-key-cleaner-zemvx-jxn7c      0/1       Completed               0          21m
ceph-storage-keys-generator-szps9               0/1       Completed               0          18m

Version of Helm and Kubernetes:

  • k8s 1.10.1
  • helm
helm version
Client: &version.Version{SemVer:"v2.8.2", GitCommit:"a80231648a1473929271764b920a8e346f6de844", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.8.2", GitCommit:"a80231648a1473929271764b920a8e346f6de844", GitTreeState:"clean"}

Which chart:

commit 70681c8218e75bb10acc7dc210b791b69545ce0d
Merge: a4fa8a1 6cb2e1e
Author: Huamin Chen <[email protected]>
Date:   Thu Apr 26 12:11:43 2018 -0400

  • pod describe
kubectl describe -n ceph po  ceph-osd-dev-sda-6jbgd
Name:           ceph-osd-dev-sda-6jbgd
Namespace:      ceph
Node:           k8s-2/192.168.16.40
Start Time:     Sat, 28 Apr 2018 15:54:08 +0800
Labels:         application=ceph
                component=osd
                controller-revision-hash=1450926272
                pod-template-generation=1
                release_group=ceph
Annotations:    <none>
Status:         Pending
IP:             192.168.16.40
Controlled By:  DaemonSet/ceph-osd-dev-sda
Init Containers:
  init:
    Container ID:  docker://8c9536cc4c5d811f57ba6349c87245121651841f52db682f858ae0ac70555856
    Image:         docker.io/kolla/ubuntu-source-kubernetes-entrypoint:4.0.0
    Image ID:      docker-pullable://kolla/ubuntu-source-kubernetes-entrypoint@sha256:75116ab2f9f65c5fc078e68ce7facd66c1c57496947f37b7209b32f94925e53b
    Port:          <none>
    Host Port:     <none>
    Command:
      kubernetes-entrypoint
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Sat, 28 Apr 2018 15:54:34 +0800
      Finished:     Sat, 28 Apr 2018 15:54:36 +0800
    Ready:          True
    Restart Count:  0
    Environment:
      POD_NAME:              ceph-osd-dev-sda-6jbgd (v1:metadata.name)
      NAMESPACE:             ceph (v1:metadata.namespace)
      INTERFACE_NAME:        eth0
      DEPENDENCY_SERVICE:    ceph-mon
      DEPENDENCY_JOBS:
      DEPENDENCY_DAEMONSET:
      DEPENDENCY_CONTAINER:
      COMMAND:               echo done
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-z5m75 (ro)
  ceph-init-dirs:
    Container ID:  docker://1562879ebbc52c47cfd9fb292339e548d26450207846ff6eeb38594569d5ec5f
    Image:         docker.io/ceph/daemon:tag-build-master-luminous-ubuntu-16.04
    Image ID:      docker-pullable://ceph/daemon@sha256:687056228e899ecbfd311854e3864db0b46dd4a9a6d4eb4b47c815ca413f25ee
    Port:          <none>
    Host Port:     <none>
    Command:
      /tmp/init_dirs.sh
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Sat, 28 Apr 2018 15:54:38 +0800
      Finished:     Sat, 28 Apr 2018 15:54:39 +0800
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /run from pod-run (rw)
      /tmp/init_dirs.sh from ceph-bin (ro)
      /var/lib/ceph from pod-var-lib-ceph (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-z5m75 (ro)
      /variables_entrypoint.sh from ceph-bin (ro)
  osd-prepare-pod:
    Container ID:  docker://2b5bed33de8f35533eb72ef3208010153b904a8ed34c527a4916b88f549d5f6b
    Image:         docker.io/ceph/daemon:tag-build-master-luminous-ubuntu-16.04
    Image ID:      docker-pullable://ceph/daemon@sha256:687056228e899ecbfd311854e3864db0b46dd4a9a6d4eb4b47c815ca413f25ee
    Port:          <none>
    Host Port:     <none>
    Command:
      /start_osd.sh
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Sat, 28 Apr 2018 16:11:06 +0800
      Finished:     Sat, 28 Apr 2018 16:11:07 +0800
    Ready:          False
    Restart Count:  8
    Environment:
      CEPH_DAEMON:         osd_ceph_disk_prepare
      KV_TYPE:             k8s
      CLUSTER:             ceph
      CEPH_GET_ADMIN_KEY:  1
      OSD_DEVICE:          /dev/mapper/centos-root
      HOSTNAME:             (v1:spec.nodeName)
    Mounts:
      /common_functions.sh from ceph-bin (ro)
      /dev from devices (rw)
      /etc/ceph/ceph.client.admin.keyring from ceph-client-admin-keyring (ro)
      /etc/ceph/ceph.conf from ceph-etc (ro)
      /etc/ceph/ceph.mon.keyring from ceph-mon-keyring (ro)
      /osd_activate_journal.sh from ceph-bin (ro)
      /osd_disk_activate.sh from ceph-bin (ro)
      /osd_disk_prepare.sh from ceph-bin (ro)
      /osd_disks.sh from ceph-bin (ro)
      /run from pod-run (rw)
      /start_osd.sh from ceph-bin (ro)
      /var/lib/ceph from pod-var-lib-ceph (rw)
      /var/lib/ceph/bootstrap-mds/ceph.keyring from ceph-bootstrap-mds-keyring (ro)
      /var/lib/ceph/bootstrap-osd/ceph.keyring from ceph-bootstrap-osd-keyring (ro)
      /var/lib/ceph/bootstrap-rgw/ceph.keyring from ceph-bootstrap-rgw-keyring (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-z5m75 (ro)
      /variables_entrypoint.sh from ceph-bin (ro)
Containers:
  osd-activate-pod:
    Container ID:
    Image:         docker.io/ceph/daemon:tag-build-master-luminous-ubuntu-16.04
    Image ID:
    Port:          <none>
    Host Port:     <none>
    Command:
      /start_osd.sh
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Liveness:       tcp-socket :6800 delay=60s timeout=5s period=10s #success=1 #failure=3
    Readiness:      tcp-socket :6800 delay=0s timeout=5s period=10s #success=1 #failure=3
    Environment:
      CEPH_DAEMON:         osd_ceph_disk_activate
      KV_TYPE:             k8s
      CLUSTER:             ceph
      CEPH_GET_ADMIN_KEY:  1
      OSD_DEVICE:          /dev/mapper/centos-root
      HOSTNAME:             (v1:spec.nodeName)
    Mounts:
      /common_functions.sh from ceph-bin (ro)
      /dev from devices (rw)
      /etc/ceph/ceph.client.admin.keyring from ceph-client-admin-keyring (ro)
      /etc/ceph/ceph.conf from ceph-etc (ro)
      /etc/ceph/ceph.mon.keyring from ceph-mon-keyring (ro)
      /osd_activate_journal.sh from ceph-bin (ro)
      /osd_disk_activate.sh from ceph-bin (ro)
      /osd_disk_prepare.sh from ceph-bin (ro)
      /osd_disks.sh from ceph-bin (ro)
      /run from pod-run (rw)
      /start_osd.sh from ceph-bin (ro)
      /var/lib/ceph from pod-var-lib-ceph (rw)
      /var/lib/ceph/bootstrap-mds/ceph.keyring from ceph-bootstrap-mds-keyring (ro)
      /var/lib/ceph/bootstrap-osd/ceph.keyring from ceph-bootstrap-osd-keyring (ro)
      /var/lib/ceph/bootstrap-rgw/ceph.keyring from ceph-bootstrap-rgw-keyring (ro)
      /var/log/ceph from pod-var-log-ceph (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-z5m75 (ro)
      /variables_entrypoint.sh from ceph-bin (ro)
Conditions:
  Type           Status
  Initialized    False
  Ready          False
  PodScheduled   True
Volumes:
  devices:
    Type:          HostPath (bare host directory volume)
    Path:          /dev
    HostPathType:
  pod-var-lib-ceph:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  pod-var-log-ceph:
    Type:          HostPath (bare host directory volume)
    Path:          /var/log/ceph/osd
    HostPathType:
  pod-run:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:  Memory
  ceph-bin:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      ceph-bin
    Optional:  false
  ceph-etc:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      ceph-etc
    Optional:  false
  ceph-client-admin-keyring:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ceph-client-admin-keyring
    Optional:    false
  ceph-mon-keyring:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ceph-mon-keyring
    Optional:    false
  ceph-bootstrap-osd-keyring:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ceph-bootstrap-osd-keyring
    Optional:    false
  ceph-bootstrap-mds-keyring:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ceph-bootstrap-mds-keyring
    Optional:    false
  ceph-bootstrap-rgw-keyring:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ceph-bootstrap-rgw-keyring
    Optional:    false
  default-token-z5m75:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-z5m75
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  ceph-osd=enabled
                 ceph-osd-device-dev-sda=enabled
Tolerations:     node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute
                 node.kubernetes.io/unreachable:NoExecute
Events:
  Type     Reason                 Age                From            Message
  ----     ------                 ----               ----            -------
  Normal   SuccessfulMountVolume  19m                kubelet, k8s-2  MountVolume.SetUp succeeded for volume "pod-var-log-ceph"
  Normal   SuccessfulMountVolume  19m                kubelet, k8s-2  MountVolume.SetUp succeeded for volume "devices"
  Normal   SuccessfulMountVolume  19m                kubelet, k8s-2  MountVolume.SetUp succeeded for volume "pod-var-lib-ceph"
  Normal   SuccessfulMountVolume  19m                kubelet, k8s-2  MountVolume.SetUp succeeded for volume "pod-run"
  Normal   SuccessfulMountVolume  19m                kubelet, k8s-2  MountVolume.SetUp succeeded for volume "ceph-bin"
  Normal   SuccessfulMountVolume  19m                kubelet, k8s-2  MountVolume.SetUp succeeded for volume "ceph-etc"
  Normal   SuccessfulMountVolume  19m                kubelet, k8s-2  MountVolume.SetUp succeeded for volume "default-token-z5m75"
  Warning  FailedMount            19m (x2 over 19m)  kubelet, k8s-2  MountVolume.SetUp failed for volume "ceph-bootstrap-osd-keyring" : secrets "ceph-bootstrap-osd-keyring" not found
  Warning  FailedMount            19m (x2 over 19m)  kubelet, k8s-2  MountVolume.SetUp failed for volume "ceph-bootstrap-rgw-keyring" : secrets "ceph-bootstrap-rgw-keyring" not found
  Warning  FailedMount            19m (x2 over 19m)  kubelet, k8s-2  MountVolume.SetUp failed for volume "ceph-client-admin-keyring" : secrets "ceph-client-admin-keyring" not found
  Normal   SuccessfulMountVolume  19m                kubelet, k8s-2  MountVolume.SetUp succeeded for volume "ceph-client-admin-keyring"
  Normal   SuccessfulMountVolume  19m                kubelet, k8s-2  MountVolume.SetUp succeeded for volume "ceph-bootstrap-osd-keyring"
  Warning  FailedMount            19m (x4 over 19m)  kubelet, k8s-2  MountVolume.SetUp failed for volume "ceph-mon-keyring" : secrets "ceph-mon-keyring" not found
  Warning  FailedMount            19m (x4 over 19m)  kubelet, k8s-2  MountVolume.SetUp failed for volume "ceph-bootstrap-mds-keyring" : secrets "ceph-bootstrap-mds-keyring" not found
  Normal   SuccessfulMountVolume  19m (x2 over 19m)  kubelet, k8s-2  (combined from similar events): MountVolume.SetUp succeeded for volume "ceph-mon-keyring"
  Warning  BackOff                4m (x65 over 18m)  kubelet, k8s-2  Back-off restarting failed container

  • pod log
kubectl  logs  -n ceph   ceph-osd-dev-sda-6jbgd
Error from server (BadRequest): container "osd-activate-pod" in pod "ceph-osd-dev-sda-6jbgd" is waiting to start: PodInitializing

PVC stuck in pending state

below are the logs of rbd provisioner
kubectl logs -f -n ceph ceph-rbd-provisioner-5b9bfb859d-2wvd9

exec /usr/local/bin/rbd-provisioner -id ceph-rbd-provisioner-5b9bfb859d-2wvd9
    I0514 08:59:52.174721 1 main.go:84] Creating RBD provisioner with identity: ceph-rbd-provisioner-5b9bfb859d-2wvd9
    I0514 08:59:52.225339 1 controller.go:407] Starting provisioner controller a2b23724-7626-11e9-9f90-0242ac1e4803!
    I0514 09:13:36.115718 1 controller.go:1068] scheduleOperation[lock-provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:13:36.222196 1 controller.go:1068] scheduleOperation[lock-provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:13:36.300492 1 leaderelection.go:156] attempting to acquire leader lease...
    E0514 09:13:36.371424 1 leaderelection.go:273] Failed to update lock: Operation cannot be fulfilled on persistentvolumeclaims "ceph-pvc": the object has been modified; please apply your changes to the latest version and try again
    I0514 09:13:37.253079 1 controller.go:1068] scheduleOperation[lock-provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:13:38.408582 1 controller.go:1068] scheduleOperation[lock-provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:13:52.253423 1 controller.go:1068] scheduleOperation[lock-provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:14:07.254063 1 controller.go:1068] scheduleOperation[lock-provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:14:22.254484 1 controller.go:1068] scheduleOperation[lock-provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:14:24.849756 1 leaderelection.go:178] successfully acquired lease to provision for pvc default/ceph-pvc
    I0514 09:14:24.850364 1 controller.go:1068] scheduleOperation[provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:14:37.254732 1 controller.go:1068] scheduleOperation[provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:14:52.255130 1 controller.go:1068] scheduleOperation[provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:14:55.219073 1 leaderelection.go:204] stopped trying to renew lease to provision for pvc default/ceph-pvc, timeout reached
    I0514 09:15:07.255355 1 controller.go:1068] scheduleOperation[provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:15:22.255579 1 controller.go:1068] scheduleOperation[lock-provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:15:22.267831 1 leaderelection.go:156] attempting to acquire leader lease...
    I0514 09:15:37.256247 1 controller.go:1068] scheduleOperation[lock-provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:15:52.256693 1 controller.go:1068] scheduleOperation[lock-provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:16:04.686503 1 leaderelection.go:178] successfully acquired lease to provision for pvc default/ceph-pvc
    I0514 09:16:04.686626 1 controller.go:1068] scheduleOperation[provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:16:07.256853 1 controller.go:1068] scheduleOperation[provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:16:22.257263 1 controller.go:1068] scheduleOperation[provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:16:35.201268 1 leaderelection.go:204] stopped trying to renew lease to provision for pvc default/ceph-pvc, timeout reached
    I0514 09:16:37.257458 1 controller.go:1068] scheduleOperation[provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:16:52.258008 1 controller.go:1068] scheduleOperation[lock-provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:16:52.275487 1 leaderelection.go:156] attempting to acquire leader lease...
    I0514 09:16:52.302071 1 leaderelection.go:178] successfully acquired lease to provision for pvc default/ceph-pvc
    I0514 09:16:52.302256 1 controller.go:1068] scheduleOperation[provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:17:07.258228 1 controller.go:1068] scheduleOperation[provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:17:22.258531 1 controller.go:1068] scheduleOperation[provision-default/ceph-pvc[8dc8b0cc-7628-11e9-bef5-080027176a7d]]
    I0514 09:17:22.658162 1 leaderelection.go:204] stopped trying to renew lease to provision for pvc default/ceph-pvc, timeout reached

All pods are running well
kubectl get pods -n ceph
NAME                                        READY   STATUS      RESTARTS   AGE
ceph-mds-68c79b5cc-jt2m9                    0/1     Pending     0          47m
ceph-mds-keyring-generator-jmj5g            0/1     Completed   0          47m
ceph-mgr-6c687f5964-7g8fq                   1/1     Running     1          47m
ceph-mgr-keyring-generator-4jtbm            0/1     Completed   0          47m
ceph-mon-check-676d984874-mnscz             1/1     Running     0          47m
ceph-mon-fl2jl                              3/3     Running     0          48m
ceph-mon-keyring-generator-cv8ff            0/1     Completed   0          47m
ceph-namespace-client-key-generator-klnmt   0/1     Completed   0          47m
ceph-osd-dev-sdb-hnkd4                      1/1     Running     0          47m
ceph-osd-keyring-generator-7h7dz            0/1     Completed   0          47m
ceph-rbd-provisioner-5b9bfb859d-2wvd9       1/1     Running     0          47m
ceph-rbd-provisioner-5b9bfb859d-r7khc       1/1     Running     0          47m
ceph-rgw-6d946b-ztwnx                       0/1     Pending     0          47m
ceph-rgw-keyring-generator-ndzbz            0/1     Completed   0          47m
ceph-storage-keys-generator-2wnrq           0/1     Completed   0          47m

Three nodes: one node for the OSD, one for the MON, and one Kubernetes master node.

pod "ceph-osd-dev-" blocked after the k8s recovered from a crash

mount ${OSD_DEVICE}1 ${tmp_osd_mount}

I installed a Ceph cluster using ceph-helm, but my Kubernetes cluster crashed for some reason. After I restored the k8s cluster, all the pods recovered except the ceph-osd-dev-* pods.

The initContainer osd-prepare-pod logs:

2018-08-25 04:37:41  /start_osd.sh: Checking if it belongs to this cluster
++ echo 2712
+ tmp_osd_mount=/var/lib/ceph/tmp/2712/
+ mkdir -p /var/lib/ceph/tmp/2712/
+ mount /dev/loop01 /var/lib/ceph/tmp/2712/
mount: special device /dev/loop01 does not exist

I checked the devices in /dev/:

$ ls /dev/loop*
/dev/loop0  /dev/loop0p1  /dev/loop0p2  /dev/loop-control

I suspect /dev/loop0p1 is the device it wants, so I made a symbolic link, and IT WORKS!
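
The workaround amounts to something like the following (device names taken from the listing above):

# Point the name the prepare script constructs at the partition that actually exists.
ln -s /dev/loop0p1 /dev/loop01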

I think there is something wrong with it. The ceph daemon image is

registry.docker-cn.com/ceph/daemon                                         tag-build-master-luminous-ubuntu-16.04   c48fa6936ae5        6 months ago        445 MB
