
dynatrace-operator's Introduction

Dynatrace Operator


The Dynatrace Operator supports rollout and lifecycle management of various Dynatrace components in Kubernetes and OpenShift.

  • OneAgent
    • classicFullStack rolls out a OneAgent pod per node to monitor pods on it and the node itself
    • applicationMonitoring is a webhook-based injection mechanism for automatic app-only injection
      • CSI Driver can be enabled to cache OneAgent downloads per node
    • hostMonitoring monitors only the hosts (i.e. nodes) in the cluster, without app-only injection
      • CSI Driver is used to provide a writable volume for the OneAgent, as it runs in read-only mode
    • cloudNativeFullStack is a combination of applicationMonitoring and hostMonitoring
      • CSI Driver is used for both features
  • ActiveGate
    • routing routes OneAgent traffic through the ActiveGate
    • kubernetes-monitoring allows monitoring of the Kubernetes API
    • metrics-ingest routes enriched metrics through ActiveGate

For more information please have a look at our DynaKube Custom Resource examples and our official help page.

Support lifecycle

As the Dynatrace Operator is provided by Dynatrace Incorporated, support is provided by the Dynatrace Support team, as described on the support page. GitHub issues will also be considered on a case-by-case basis, regardless of support contracts and commercial relationships with Dynatrace.

The Dynatrace support lifecycle for Kubernetes and OpenShift can be found on the official technology support pages.

Quick Start

The Dynatrace Operator runs in its own dedicated namespace, dynatrace. This namespace holds the operator deployment and all dependent objects such as permissions, custom resources, and the corresponding StatefulSets.

Installation

For install instructions on OpenShift, head to the official help page.

To create the namespace and apply the operator, run the following commands:

kubectl create namespace dynatrace
kubectl apply -f https://github.com/Dynatrace/dynatrace-operator/releases/latest/download/kubernetes.yaml

If using cloudNativeFullStack or applicationMonitoring with CSI driver, the following command is required as well:

kubectl apply -f https://github.com/Dynatrace/dynatrace-operator/releases/latest/download/kubernetes-csi.yaml

A secret holding tokens for authenticating to the Dynatrace cluster needs to be created upfront. Create access tokens of type Dynatrace API and use their values in the following command. For assistance, please refer to Create user-generated access tokens.

The token scopes required by the Dynatrace Operator are documented on our official help page.

kubectl -n dynatrace create secret generic dynakube --from-literal="apiToken=DYNATRACE_API_TOKEN" --from-literal="dataIngestToken=DATA_INGEST_TOKEN"

Create DynaKube custom resource for ActiveGate and OneAgent rollout

The rollout of the Dynatrace components is governed by a custom resource of type DynaKube. This custom resource will contain parameters for various Dynatrace capabilities (OneAgent deployment mode, ActiveGate capabilities, etc.)

Note: .spec.tokens denotes the name of the secret holding access tokens.

If not specified, the Dynatrace Operator searches for a secret with the same name as the DynaKube custom resource (.metadata.name).

The recommended approach is using classic full stack injection to roll out Dynatrace to your cluster, available as the classicFullStack sample. If you want to make adjustments, please have a look at our DynaKube Custom Resource examples.

Save one of the sample configurations, change the API URL to your environment, and apply it to your cluster.
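For orientation, a minimal classicFullStack DynaKube could look roughly like the following sketch; the apiUrl value is a placeholder for your environment, and the sample files mentioned above remain the authoritative reference for available fields.

```yaml
apiVersion: dynatrace.com/v1beta1
kind: DynaKube
metadata:
  # The token secret is looked up by this name
  # unless .spec.tokens is set explicitly.
  name: dynakube
  namespace: dynatrace
spec:
  # Replace with the API URL of your Dynatrace environment.
  apiUrl: https://ENVIRONMENTID.live.dynatrace.com/api
  oneAgent:
    classicFullStack: {}
```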

kubectl apply -f cr.yaml

For detailed instructions see our official help page.

Uninstall dynatrace-operator

For instructions on how to uninstall the dynatrace-operator on OpenShift, head to the official help page.

Clean-up all Dynatrace Operator specific objects:

kubectl delete -f https://github.com/Dynatrace/dynatrace-operator/releases/latest/download/kubernetes.yaml

If the CSI driver was installed, the following command is required as well:

kubectl delete -f https://github.com/Dynatrace/dynatrace-operator/releases/latest/download/kubernetes-csi.yaml

Hacking

See HACKING for details on how to get started enhancing Dynatrace Operator.

Contributing

See CONTRIBUTING for details on submitting changes.

License

Dynatrace Operator is under Apache 2.0 license. See LICENSE for details.


dynatrace-operator's Issues

Version 0.3.0 missing from Operatorhub catalog in Openshift

Bug Report

What did you do?

Updating the Dynatrace operator to 0.3.0 does not seem to be possible using the operatorhub available in OpenShift. It's the same experience when looking up the operator on operatorhub.io: https://operatorhub.io/operator/dynatrace-operator

What did you expect to see?

The Dynatrace operator should be available to update using the means available in OpenShift.

Environment

  • OpenShift version information (if applicable):
    openshift 4.7.31

Access to Kubernetes API

I tried to configure access to the Kubernetes API but the required certificates are not wired up:

dynakube-kubemon-0 dynatrace-operator 2021-03-26 00:05:40 UTC INFO [<mud10346>] [<com.compuware.apm.debug.apiconnector.kubernetes>, KubernetesFastCheck] Fast check failed for endpoint https://kubernetes.default.svc/api with SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target [Suppressing further identical messages for 1 hour]

Adding trusted Root CAs to activegate

Hello, I am following your guide here to connect my Dynatrace SaaS platform to our EKS cluster: https://github.com/Dynatrace/dynatrace-operator. However, on the last step in the Dynatrace UI, I am getting "There was an error with the TLS handshake: The specified certificate is not trusted". I know that under normal circumstances on an ActiveGate I can do this: https://www.dynatrace.com/support/help/setup-and-configuration/dynatrace-activegate/configuration/configure-trusted-root-certificates-on-activegate/, but how can I mount a keystore to my ActiveGate pod? There does not seem to be an option to add a volume to it in the cr.yaml.

Host Group parameter not applied

The Host Group parameter for the OneAgent isn't applied due to a typo in the install.sh script.

Modify the host group (currently line 160) in install.sh file, remove the double quotes.
OLD:
- --set-host-group="${CLUSTER_NAME}"
NEW:
- --set-host-group=${CLUSTER_NAME}
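As an illustrative sketch (with a hypothetical CLUSTER_NAME value) of why the quotes cause trouble: when the argument string is not re-parsed by a shell before reaching the agent, double quotes embedded in it become part of the literal host group value.

```shell
# Hypothetical illustration: quotes embedded in the argument string itself
# (rather than consumed by a shell) are passed on verbatim.
CLUSTER_NAME="my-cluster"
arg='--set-host-group="'${CLUSTER_NAME}'"'
echo "$arg"  # prints --set-host-group="my-cluster"
```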

Latest version of oneagent (1.55.1000) doesn't seem to support unprivileged any more.

It looks like the latest dynatrace oneagent container doesn't support unprivileged mode any more.
This suddenly popped up because the operator uses the 'latest' tag.

As a workaround I had to hard-set oneagent.image to version 1.54.1001.

(I also tried turning off 'useUnprivilegedMode' but it still complains. It only works when I hand-edit the daemonset to include hostIPC: true.)

Looks like the 'latest' image (as of writing, 1.55.1000) gives this error:

15:19:09 Warning: ONEAGENT_INSTALLER_SCRIPT_URL does not match expected pattern, please verify its correctness. See the following help page for instructions how to retrieve the URL: https://www.dynatrace.com/support/help/shortlink/oneagent-docker#locate-your-oneagent-installer-url. If installer download succeeded, then this message may safely be ignored
15:19:09 Error: Container was launched without --ipc parameter
15:19:09 Error: Dynatrace OneAgent requires parameters: --privileged set to 'true' and --pid, --net, --ipc set to 'host'.
15:19:09 If you are not sure how to launch the container please visit: https://www.dynatrace.com/support/help/shortlink/oneagent-docker
15:19:09 Error: Initialization procedure failed
15:19:09 Error: ----- Begin container init log -----
2021-09-06 15:19:09 UTC [INFO]  ONEAGENT_INSTALLER_SCRIPT_URL=https://dynatrace.europe.stater.corp/e/acceptance/api/v1/deployment/installer/agent/unix/default/latest?Api-Token=***&arch=x86&flavor=default
2021-09-06 15:19:09 UTC [INFO]  ONEAGENT_INSTALLER_DOWNLOAD_TOKEN=
2021-09-06 15:19:09 UTC [INFO]  ONEAGENT_INSTALLER_DOWNLOAD_VERBOSE=
2021-09-06 15:19:09 UTC [INFO]  ONEAGENT_INSTALLER_SKIP_CERT_CHECK=true
2021-09-06 15:19:09 UTC [INFO]  ONEAGENT_ENABLE_VOLUME_STORAGE=
2021-09-06 15:19:09 UTC [INFO]  ONEAGENT_CONTAINER_STORAGE_PATH=
2021-09-06 15:19:09 UTC [INFO]  ONEAGENT_NO_REMOUNT_ROOT=
2021-09-06 15:19:09 UTC [INFO]  ONEAGENT_ADDITIONAL_UNMOUNT_PATTERN=
2021-09-06 15:19:09 UTC [INFO]  ONEAGENT_DISABLE_CONTAINER_INJECTION=
2021-09-06 15:19:09 UTC [INFO]  AGENT_CONTAINER_IMAGE_VERSION=1.207.229.20210115-085810
2021-09-06 15:19:09 UTC [INFO]  Path: /usr/sbin:/usr/bin:/sbin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
2021-09-06 15:19:09 UTC [WARN]  ONEAGENT_INSTALLER_SCRIPT_URL does not match expected pattern, please verify its correctness. See the following help page for instructions how to retrieve the URL: https://www.dynatrace.com/support/help/shortlink/oneagent-docker#locate-your-oneagent-installer-url. If installer download succeeded, then this message may safely be ignored
2021-09-06 15:19:09 UTC [ERROR] Container was launched without --ipc parameter
2021-09-06 15:19:09 UTC [ERROR] Dynatrace OneAgent requires parameters: --privileged set to 'true' and --pid, --net, --ipc set to 'host'.
2021-09-06 15:19:09 UTC [INFO]  If you are not sure how to launch the container please visit: https://www.dynatrace.com/support/help/shortlink/oneagent-docker
2021-09-06 15:19:09 UTC [ERROR] Initialization procedure failed
15:19:09 Error: ----- End of container init log -----

NB: Maybe this should be raised on the oneagent GitHub, but I was unable to find it.

Adding podAnnotations

Hello, I would like to attach an IAM role using kube2iam, so it would be nice to have the ability to append some annotations to the ActiveGate StatefulSet.

Custom Resource - metadata name

I have copied the example Custom Resource. When applying the CRs on OpenShift, the 'metadata: name:' must be dynakube.
If you change the name, the CR will not create the DaemonSets.
This was discovered while trying to deploy the multiDynakube.yaml; its metadata/name values are dynakube-application-monitoring and dynakube-cloud-native. If you apply the CR, nothing happens.
I applied cloudNativeFullStack.yaml and the DaemonSet was created. Change the metadata/name to dynatrace-full and nothing happens, no DaemonSet or additional Pods. (dynakube-oneagent)

arm64 does not work

Hi. I have a mixed architecture cluster (amd64 and arm64). On my arm64 node (raspberry pi 4, ubuntu 20.04), OneAgent does not start properly:

20:43:14 Started agent deployment as a container, PID 33396.
20:43:14 Downloading agent to /tmp/Dynatrace-OneAgent-Linux.sh via https://{my-env}/api/v1/deployment/installer/agent/unix/default/latest?Api-Token=***&arch=x86&flavor=default
20:43:15 Download was skipped as the agent version did not change
20:43:15 Verifying agent installer signature
20:43:20 Verification successful
20:43:20 Deploying to: /mnt/host_root
20:43:20 Starting installer...
20:43:23 Checking root privileges...
20:43:23 OK
20:43:23 Error: Cannot determine architecture or architecture not supported: <AARCH64>

Obviously the agent is downloaded for the x86 architecture, so it cannot start.

Also, if ActiveGate is getting scheduled on the ARM node, it fails.

standard_init_linux.go:219: exec user process caused: exec format error

AFAIK ActiveGate does not support the ARM architecture. It would be nice if the operator would help by scheduling it on x86 nodes only.

secret/dynatrace-helm looks for ProxyKey

27m Warning Failed pod/dynakube-helm-kubemon-0 Error: couldn't find key ProxyKey in Secret dynatrace/dynakube-helm

The kubemon and routing pods are looking for a key in the secret called ProxyKey, but the secret only contains these data fields:

apiVersion: v1
data:
  apiToken:
  paasToken:
  proxy:

So if you configure a proxy in the base configs, it will fail.

Auto discovery of dashboards/alert rules/ other configurations by the dynatrace operator

Hi All,

Grafana dashboards are automatically discovered from the ConfigMaps in the deployed namespace. If a ConfigMap has been created with the grafana_dashboard label set to "1" , then the JSON encoded dashboard in the ConfigMap will be imported into Grafana.

In the same way, if the Dynatrace operator detected and automatically imported configurations based on a certain label, it would ease our life for automation.

Are there any plans to support this feature ?

Host Group Does not apply on latest release

Just worked with a customer and noticed that the host group value automatically set by the operator includes double quotes when describing the daemon set. This is not accepted by the agent and is not usable in the UI. Editing the daemonset and removing the quotes on the host group arg resolves the problem.


Operator version for OC 3.11

Hi team, good day. As per https://github.com/Dynatrace/dynatrace-operator#supported-platforms, the proper operator version for OpenShift 3.11 should be v0.2.2. But when prompted for the install command on the "Deploy Dynatrace" page, we get the same install.sh as for Kubernetes, with no flag like "--openshift". This way we always get the latest version instead of v0.2.2, and the installation fails at some point due to incompatible commands/actions.
Please let me know how to properly select v0.2.2 on the "Deploy Dynatrace" page, or even the proper install.sh for v0.2.2, since even there it still points to a 404 file: https://github.com/Dynatrace/dynatrace-operator/releases/latest/download/openshift3.11.yaml (line 108).
Thanks.

oneagent Daemonset pods interfering with aws-vpc-cni node startup

We have dynatrace-operator running in our EKS clusters using the helm chart: https://github.com/Dynatrace/helm-charts/tree/master/dynatrace-operator/chart/default

Recently, when we updated the aws-vpc-cni addon, we noticed that it wasn't able to start up. The cause appeared to be an injected Dynatrace library.

Events:
  Type     Reason     Age               From               Message
  ----     ------     ----              ----               -------
  Normal   Scheduled  26s               default-scheduler  Successfully assigned kube-system/aws-node-587pv to ip-*-*-*-*.ap-southeast-2.compute.internal
  Normal   Pulled     25s               kubelet            Container image "602401143452.dkr.ecr.ap-southeast-2.amazonaws.com/amazon-k8s-cni-init:v1.9.3" already present on machine
  Normal   Created    25s               kubelet            Created container aws-vpc-cni-init
  Normal   Started    25s               kubelet            Started container aws-vpc-cni-init
  Normal   Pulled     24s               kubelet            Container image "602401143452.dkr.ecr.ap-southeast-2.amazonaws.com/amazon-k8s-cni:v1.9.3" already present on machine
  Normal   Created    24s               kubelet            Created container aws-node
  Normal   Started    24s               kubelet            Started container aws-node
  Warning  Unhealthy  4s (x2 over 14s)  kubelet            Readiness probe failed: /app/grpc-health-probe: symbol lookup error: /opt/dynatrace/oneagent/agent/bin/current/linux-x86-64/liboneagentproc.so: undefined symbol: __environ

We had to do the following steps to workaround this issue:

  • Uninstall dynatrace-operator
  • Delete the oneagent daemonset (because that doesn't happen as part of operator uninstallation)
  • Wait for the aws-vpc-cni Daemonset to progress
  • Reinstall dynatrace-operator

This does not manifest for new nodes, because aws-node Pods start (and have to) before any other Pods, including oneagent Pods. This only manifests when we update the cni Daemonset.

Versions:
EKS: 1.20
dynatrace-operator: 0.3.0
OneAgent: 1.231.245

[dynatrace-operator] dynakube pods not running on AWS EKS having Bottlerocket AMI

Hi Team,

We are trying to run the dynatrace-operator on nodes with the Bottlerocket AMI, but are getting the below error for the pods/dynakube-classic-*:

12:17:51 Bootstrapping regular deployment
2021-11-17 12:17:51 UTC [INFO] ONEAGENT_INSTALLER_SCRIPT_URL=<ENDPOINT>/api/v1/deployment/installer/agent/unix/default/latest?Api-Token=***&arch=x86&flavor=default
2021-11-17 12:17:51 UTC [INFO] ONEAGENT_INSTALLER_DOWNLOAD_TOKEN=
2021-11-17 12:17:51 UTC [INFO] ONEAGENT_INSTALLER_DOWNLOAD_VERBOSE=
2021-11-17 12:17:51 UTC [INFO] ONEAGENT_INSTALLER_SKIP_CERT_CHECK=false
2021-11-17 12:17:51 UTC [INFO] ONEAGENT_ENABLE_VOLUME_STORAGE=
2021-11-17 12:17:51 UTC [INFO] ONEAGENT_CONTAINER_STORAGE_PATH=
2021-11-17 12:17:51 UTC [INFO] ONEAGENT_NO_REMOUNT_ROOT=
2021-11-17 12:17:51 UTC [INFO] ONEAGENT_ADDITIONAL_UNMOUNT_PATTERN=
2021-11-17 12:17:51 UTC [INFO] ONEAGENT_DISABLE_CONTAINER_INJECTION=
2021-11-17 12:17:51 UTC [INFO] ONEAGENT_READ_ONLY_MODE=
2021-11-17 12:17:51 UTC [INFO] AGENT_CONTAINER_IMAGE_VERSION=1.227.97.20210927-151339
2021-11-17 12:17:51 UTC [INFO] Path: /usr/sbin:/usr/bin:/sbin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
2021-11-17 12:17:51 UTC [INFO] Started with capabilities: Capabilities for `self': = cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_net_admin,cap_net_raw,cap_sys_chroot,cap_sys_ptrace,cap_sys_admin,cap_sys_resource,cap_setfcap+eip
2021-11-17 12:17:51 UTC [INFO] Started with user: uid=0(root) gid=0(root) groups=0(root)
2021-11-17 12:17:52 UTC [INFO] Creating directory /mnt/host_root with rights 755
2021-11-17 12:17:52 UTC [INFO] Stored installation path as visible from host's mount namespace: /opt/dynatrace/oneagent
2021-11-17 12:17:52 UTC [INFO] Stored data storage dir as visible from host's mount namespace: /var/lib/dynatrace/oneagent
2021-11-17 12:17:52 UTC [INFO] Stored log dir as visible from host's mount namespace: /var/log/dynatrace/oneagent
2021-11-17 12:17:52 UTC [INFO] Mounting basic directories
2021-11-17 12:17:52 UTC [INFO] Bind mounting /mnt/root//bin to /mnt/host_root//bin
2021-11-17 12:17:52 UTC [INFO] Creating directory /mnt/host_root//bin with rights 755
Error: mount failed, error code: 13, error message: Permission denied
12:17:52 Error: Initialization procedure failed
2021-11-17 12:17:52 UTC [ERROR] Initialization procedure failed
12:17:52 Error: ----- Begin container init log -----
NAME					READY		STATUS			RESTARTS	AGE
pod/dynakube-classic-65p5j		0/1		CrashloopBackoff	65		5h11m
pod/dynakube-classic-6rmrs		0/1		CrashloopBackoff	65		5h11m
pod/dynakube-classic-77rvm		0/1		CrashloopBackoff	65		5h11m
pod/dynakube-classic-7vjhg		0/1		CrashloopBackoff	65		5h11m
pod/dynakube-classic-827k2		0/1		CrashloopBackoff	65		5h11m
pod/dynakube-classic-9bged		0/1		CrashloopBackoff	65		5h11m
pod/dynakube-kubemon-0			1/1		Running			12		5h11m
pod/dynakube-routing-0			1/1		Running			12		5h11m
pod/dynatrace-operator-7999b4f5d-g7ยฃcp	1/1		Running			0		5h11m

Client Version: v1.21.2-13+d2965f0db10712
Server Version: v1.21.2-eks-06eac09

Documentation Followed:
https://artifacthub.io/packages/helm/dynatrace/dynatrace-operator
https://www.dynatrace.com/support/help/technology-support/

ActiveGate crashes on IPv6 only mode

FYI, I noticed that ActiveGate is not able to run on a dual-stack cluster where the dynatrace namespace is set to use IPv6 only via a Kyverno policy like this (as described in the official dual-stack documentation https://kubernetes.io/docs/concepts/services-networking/dual-stack/):

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: ipv6-only
spec:
  validationFailureAction: enforce
  background: false
  rules:
  - name: ipv6-only
    match:
      resources:
        kinds:
        - Service
        namespaces:
        - "dynatrace"
    preconditions:
      all:
      - key: "{{request.operation}}"
        operator: In
        value:
        - CREATE
    mutate:
      patchStrategicMerge:
        spec:
          ipFamilyPolicy: SingleStack
          ipFamilies:
          - IPv6

The pod ends up in a boot loop with an error like this, because the DNS entrypoint is parsed incorrectly (it should be https://[fd::7a15]:443/communication):

2021-11-29 09:26:52 UTC INFO    [<collector>] [<collector.core>, CollectorImpl] Initiating Collector Startup...
2021-11-29 09:26:52 UTC INFO    [<collector>] [<collector.core>, CollectorImpl] No keyfile detected, starting without crypto subsystem.
2021-11-29 09:26:52 UTC INFO    [<mvl43013>] [<collector.core>, CollectorImpl] Working mode property set to PRIVATE
2021-11-29 09:26:52 UTC INFO    [<mvl43013>] [<collector.core>, CollectorImpl] ActiveGate not configured, will configure now.
2021-11-29 09:26:52 UTC INFO    [<mvl43013>] [<collector.core>, CollectorImpl] Gateway token is configured
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.compuware.apm.util.shared.api.connectivity.UriBuilder (file:/opt/dynatrace/gateway/lib/util.shared-1.229.163.20211109-103203.jar) to field java.net.URI.schemeSpecificPart
WARNING: Please consider reporting this to the maintainers of com.compuware.apm.util.shared.api.connectivity.UriBuilder
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
2021-11-29 09:26:53 UTC SEVERE  [<mvl43013>] [<collector.core>, CollectorImpl] Connectivity detection failed. Collector startup cancelled
com.compuware.apm.platform.shared.api.connectivity.ConnectivityDetectorException: DNS Entrypoint URI: https://fd::7a15:443/communication is invalid.
        at com.compuware.apm.platform.shared.connectivity.ConnectivityDetectorImpl.retrieveAndValidateDnsEntryPointUris(ConnectivityDetectorImpl.java:461)
        at com.compuware.apm.platform.shared.connectivity.ConnectivityDetectorImpl.newInstance(ConnectivityDetectorImpl.java:86)
        at com.compuware.apm.platform.shared.api.connectivity.ConnectivityDetectorFactory.createCollectorConnectivityDetector(ConnectivityDetectorFactory.java:27)
        at com.compuware.apm.collector.core.CollectorImpl.populateCollectorUris(CollectorImpl.java:1216)
        at com.compuware.apm.collector.core.CollectorImpl.activate(CollectorImpl.java:988)
        at com.compuware.apm.collector.core.CollectorImpl.main(CollectorImpl.java:767)
Caused by: java.lang.IllegalArgumentException: URI string: https:/communication - Host component for given scheme https is missing.
        at com.compuware.apm.util.shared.api.connectivity.UriBuilder.validateComponentsOfURI(UriBuilder.java:292)
        at com.compuware.apm.util.shared.api.connectivity.UriBuilder.build(UriBuilder.java:283)
        at com.compuware.apm.platform.shared.connectivity.ConnectivityDetectorImpl.retrieveAndValidateDnsEntryPointUris(ConnectivityDetectorImpl.java:432)
        ... 5 more
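As a side note on the expected URL quoted above: per RFC 3986, an IPv6 literal used as a URL host must be wrapped in square brackets, since otherwise the colons in the address clash with the host/port separator. A trivial sketch:

```shell
# Sketch of the bracket rule for IPv6 URL hosts (RFC 3986):
# without brackets, the address colons are ambiguous with the port separator.
addr="fd::7a15"
bad="https://${addr}:443/communication"     # ambiguous host/port split
good="https://[${addr}]:443/communication"  # brackets disambiguate the host
echo "$good"  # prints https://[fd::7a15]:443/communication
```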

install.sh checkTokenScopes does not check for revocation status

The install.sh checkTokenScopes() does not look for revoked = true

{
  "id": "dt0c01.ABC123",
  "name": "Download Token (generated at 2021-08-13 13:11)",
  "userId": "xyz",
  "revoked": true,
  "created": 1628824267605,
  "lastUse": 1633475218208,
  "scopes": [
    "InstallerDownload",
    "SupportAlert"
  ],
  "personalAccessToken": false
}

I suggest improving this with a check that determines whether the token is disabled or not.
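As an illustrative sketch (not the actual install.sh code), such a check could be as small as grepping the token metadata for the revoked flag; the token_info value here is hypothetical sample data standing in for the API response the script already fetches.

```shell
# Hypothetical sketch for checkTokenScopes(): warn if the token metadata
# returned by the token lookup marks the token as revoked.
# $token_info stands in for the JSON response the script already has.
token_info='{"id": "dt0c01.EXAMPLE", "revoked": true, "scopes": ["InstallerDownload"]}'

if printf '%s' "$token_info" | grep -q '"revoked": true'; then
  echo "Error: the provided token has been revoked" >&2
fi
```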

customPullSecret value not in ActiveGate StatefulSets

I'm using a private Registry to host ActiveGate container images and setting customPullSecret in Dynakube CR that contains the credentials to authenticate to that registry and pull the images.

I'm getting an ImagePullBackOff error (unauthorized).

I checked the StatefulSet definition and it is still pointing to the default pull secret:

    spec:
      nodeSelector:
        node-role.kubernetes.io/worker: ''
      restartPolicy: Always
      serviceAccountName: dynatrace-kubernetes-monitoring
      imagePullSecrets:
        - name: dynakube-pull-secret
      schedulerName: default-scheduler

This is what I have in Dynakube CR:

customPullSecret: dynaregcred

support for extension module, log analytics, memory dump active gate

Can we set these custom properties in classic full stack or cloud native?

customProperties:
  value: |
    [collector]
    MSGrouter = true
    restInterface = true
    DumpSupported = true
    [extension_controller]
    extension_controller_enabled = true
    [log_analytics_collector]
    log_analytics_collector_enabled = true

Installing via Helm to custom namespace conflicts with CRD since 0.3.0

The dynatrace-operator Helm chart lets you install to a custom namespace, including deploying all the webhook configuration to the specified namespace: https://github.com/Dynatrace/helm-charts/tree/dynatrace-operator-v0.3.0/dynatrace-operator/chart/default/templates/Common/webhook

The CRD however hardcodes the expected namespace of the webhook to dynatrace: https://github.com/Dynatrace/dynatrace-operator/blob/master/config/crd/patches/webhook_in_dynakubes.yaml#L13. This causes the update to that version of the operator to fail.

I understand the CRD and the chart are decoupled but it may be worth documenting that this is a possible problem. Our solution is to deploy the CRD alongside the operator with our own Helm chart and modify the offending line to say {{.Release.Namespace}} instead.

Can "useImmutableImage" be relaxed for pulling the OneAgent?

From what I could figure out, the useImmutableImage flag is used to pull both the dynatrace operator and the oneagent images from a private Docker registry with an optional custom pull secret. I wonder if it makes sense to allow this optionally only for the operator image and relax the constraint for the oneagent image.
If the useImmutableImage is not set, the operator tries to pull the image from Docker Hub - wouldn't it make sense to pull directly from the Dynatrace environment instead?

Issues with custom resource

Applying the latest CR from master produces this error. Is there any temporary workaround until the fix, or an estimated time for a fix, to consume this latest CR?

v1beta1
unable to recognize "/Users/I501201/Downloads/cr.yaml": no matches for kind "DynaKube" in version "dynatrace.com/v1beta1"

Changing to v1alpha1 produces this error:

error validating data: [ValidationError(DynaKube.spec.activeGate): unknown field "capabilities" in com.dynatrace.v1beta1.DynaKube.spec.activeGate, ValidationError(DynaKube.spec.oneAgent): unknown field "cloudNativeFullStack" in com.dynatrace.v1beta1.DynaKube.spec.oneAgent]; if you choose to ignore these errors, turn validation off with --validate=false

install.sh release consistency

Hi,

can the install.sh artefact be updated to point to the actual release when being released (instead of latest)? (on previous, current, and future releases)

e.g.
0.2.2 : ocp311 installation will fail, because the specific resources were removed from latest release
0.3.0 : isocp311() function should actually exit the installer, and instruct the user to use the (fixed) installer script from 0.2.2

WKR

Dynatrace images appear to be missing from Docker, Quay, and other sources

A bunch of Dynatrace infrastructure, including the oneagent-operator, appears to have gone missing from both Docker Hub and Quay recently.

E.g.:

https://hub.docker.com/r/dynatrace/dynatrace-oneagent-operator
https://hub.docker.com/r/dynatrace/dynatrace-operator
https://quay.io/repository/dynatrace/dynatrace-oneagent-operator?tab=tags
https://quay.io/repository/dynatrace/dynatrace-operator?tab=tags

These originally existed, and are suddenly missing with no real word on what's happened to them. My initial assumption was that these are old images, but I see them still being referenced in the latest release of Dynatrace:

      containers:
      - args:
        - operator
        env:
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        image: docker.io/dynatrace/dynatrace-operator:v0.2.2

Via this release: https://github.com/Dynatrace/dynatrace-operator/releases/tag/v0.2.2 inside of the kubernetes.yaml file

Comment in sample CR is incorrect

In dynatrace-operator/config/samples/hostMonitoring.yaml, the image section lists the default image source as Docker Hub. In discussion with the container team, it's actually pulled from the cluster instead. The comment should be updated to reflect this.

Dynatrace Operator Is Causing a Tekton Buildah Pod to Fail on OpenShift / AWS with "unshare(CLONE_NEWUSER): Invalid argument"

Steps Used to Recreate Issue:

YAML to use in the steps below:

apiVersion: v1
kind: Pod
metadata:
  name: buildah-bare-test
spec:
  serviceAccountName: pipeline
  restartPolicy: Never
  containers:
    - name: buildah
      image: registry.redhat.io/rhel8/buildah@sha256:6a68ece207bc5fd8db2dd5cc2d0b53136236fb5178eb5b71eebe5d07a3c33d13
      command: ["buildah", "from", "scratch"]
      volumeMounts:
      - mountPath: /var/lib/containers
        name: varlibcontainers
  volumes:
  - emptyDir: {}
    name: varlibcontainers
  1. Create a new Red Hat OpenShift cluster on AWS (small, 3 nodes) which starts at ver 4.7.9
  2. Upgrade OCP to ver 4.7.13 (may not be necessary)
  3. Upgrade OCP to ver 4.8.12 (may not be necessary)
  4. Install Red Hat OpenShift Pipelines Operator ver 1.5.2 (aka Tekton) from OperatorHub
  5. Create namespace thru console: buildah-test
  6. Create pod with YAML - which works
    Status: Completed
    Logs:
working-container
  7. Delete pod
  8. Install Dynatrace Operator ver 0.2.2
  9. Try same YAML - which still works
    Status: Completed
    Logs:
working-container
  10. Delete pod
  11. Create DynaKube from within Dynatrace Operator
  12. Try same YAML - which now fails
    Status: Error
    Logs:
Error during unshare(CLONE_NEWUSER): Invalid argument
level=error msg="error parsing PID \"\": strconv.Atoi: parsing \"\": invalid syntax"
level=error msg="(unable to determine exit status)"
  13. Delete pod
  14. Delete DynaKube from Dynatrace Operator
  15. Delete Dynatrace Operator
  16. Delete namespace central-monitoring
  17. Try same YAML - which still fails
    Status: Error
    Logs:
Error during unshare(CLONE_NEWUSER): Invalid argument
level=error msg="error parsing PID \"\": strconv.Atoi: parsing \"\": invalid syntax"
level=error msg="(unable to determine exit status)"
  18. Delete pod
  19. Recreate all 3 OpenShift nodes; by deleting existing nodes one by one, then rolling machinesets
  20. Try same YAML - which now works again
    Status: Completed
    Logs:
working-container

Expectations:

  • Step 12 - Would expect pod to be successfully created, or at least be given a reasonable explanation for why it is now failing.
  • Step 17 - Would expect Operator to clean up the removal so that pod would be created successfully again.

At least Step 19 is a viable workaround to purge the issue, though it seems extreme.

See: https://github.com/containers/buildah/issues/3455 and https://github.com/tektoncd/operator/issues/381
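For anyone hitting the same error, a quick check for whether user namespaces are available on a node (a diagnostic sketch to run from a node debug shell, not an official procedure):

```shell
# Sketch: unshare(CLONE_NEWUSER) fails when user namespaces are unavailable;
# on many distros, user.max_user_namespaces=0 disables them entirely.
max_ns=$(sysctl -n user.max_user_namespaces 2>/dev/null || echo 0)
if [ "$max_ns" -eq 0 ]; then
  echo "user namespaces disabled"
else
  echo "user namespaces enabled (limit: $max_ns)"
fi
```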

invalid capability=metrics-ingest on classicFullStack

Hello,

In file: dynatrace-operator/config/samples/classicFullStack.yaml

In the snippet below, metrics-ingest is an invalid parameter. According to the docs, the only possible values are the top two.

  # Configuration for ActiveGate instances.
  activeGate:
    # Enables listed ActiveGate capabilities
    capabilities:
      - routing
      - kubernetes-monitoring
      - metrics-ingest

After removing the last parameter, the file applies perfectly with kubectl apply -f "<name of yaml>"
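For reference, the activeGate block that applied cleanly for me (the sample above minus the invalid entry):

```yaml
activeGate:
  capabilities:
    - routing
    - kubernetes-monitoring
```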

To reproduce: follow standard docs here.

Let me know if I am doing something wrong here, which is entirely possible 😄.

Thanks!

Remove "dynatrace" namespace naming requirement

The current documentation and associated deployment manifests should not assume it's always possible to deploy the operator into a namespace called "dynatrace" on the target cluster.

Some users (especially larger enterprises) are running their workloads in fully managed multi-tenant K8s/OpenShift PaaS clusters which may enforce strict naming rules for namespaces.

In our case we had to do a search & replace of any occurrences of the string "namespace: dynatrace" in the openshift.yaml manifest before applying it.

While this is relatively simple and seems to do the trick I think at least the documentation should be improved in this regard.
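The search & replace mentioned above can be scripted. A sketch, assuming the manifest is openshift.yaml and the allowed namespace is called monitoring (a placeholder); other occurrences of the namespace string (e.g. in webhook service references) may need the same treatment:

```shell
# Demo input: in practice this is the full openshift.yaml manifest.
printf 'metadata:\n  namespace: dynatrace\n' > openshift.yaml
# Rewrite every "namespace: dynatrace" reference to the allowed namespace
# and write the result to a new file, leaving the original intact.
sed 's/namespace: dynatrace/namespace: monitoring/g' openshift.yaml > openshift-patched.yaml
cat openshift-patched.yaml
```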

install.sh with option --skip-ssl-verification succeeded, but pods dynakube-oneagent and activegate are in Pending status

Good morning,

I tried to install Dynatrace on EKS. It worked fine a few weeks ago, but I had to redo the install, and now I encountered this error with v0.2.2:

Applying DynaKube CustomResource...
Error from server (InternalError): error when creating "STDIN": Internal error occurred: conversion webhook for dynatrace.com/v1alpha1, Kind=DynaKube failed: Post "https://dynatrace-webhook.dynatrace.svc:443/convert?timeout=30s": no endpoints available for service "dynatrace-webhook"

So I downloaded the latest install.sh release with wget and the install went fine.

Kubernetes monitoring successfully setup.

But on my Kubernetes cluster, I encounter these ImagePullBackOff errors:

Failed to pull image "MY_API_URL/linux/oneagent:latest": rpc error: code = Unknown desc = Error response from daemon: Get https://MY_API_URL_SHORT/v2/: x509: certificate signed by unknown authority


And why is the operator installing dynakube-oneagent instead of dynakube-classic?

Dynakube definition includes properties not yet released

The Kubernetes installation defined in the README references:

apiVersion: dynatrace.com/v1beta1
kind: DynaKube
metadata:
  name: dynakube
  namespace: dynatrace

which does not exist in the latest release of the CRD; it only contains v1alpha1.

This also creates issues with the samples defined here:

https://github.com/Dynatrace/dynatrace-operator/tree/master/config/samples

Particularly the missing attributes:

capabilities on DynaKube.spec.activeGate
cloudNativeFullStack on DynaKube.spec.oneAgent

apiextensions.k8s.io/v1beta1/CustomResourceDefinition is deprecated

Installing the Operator on newer Kubernetes environments prints the following warning:

 12:09:43.375044 207360 warnings.go:67] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
 12:09:43.428453 207360 warnings.go:67] apiextensions.k8s.io/v1beta1 CustomResourceDefinition is deprecated in v1.16+, unavailable in v1.22+; use apiextensions.k8s.io/v1 CustomResourceDefinition
customresourcedefinition.apiextensions.k8s.io/dynakubes.dynatrace.com created

This issue is to track the migration of the apiVersion from apiextensions.k8s.io/v1beta1 to apiextensions.k8s.io/v1.

Reference:
https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/
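The migration mostly amounts to moving the schema under each entry in spec.versions, since v1 requires a structural schema per version. A minimal sketch of the v1 shape (the operator's real schema is elided; x-kubernetes-preserve-unknown-fields stands in for it here):

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: dynakubes.dynatrace.com
spec:
  group: dynatrace.com
  names:
    kind: DynaKube
    listKind: DynaKubeList
    plural: dynakubes
    singular: dynakube
  scope: Namespaced
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          # Placeholder: the real CRD needs its full structural schema here.
          x-kubernetes-preserve-unknown-fields: true
```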

Feature request: Don't alert for scaled down nodes

I'm not sure if this already exists, but it would be nice if we didn't get "Host or monitoring unavailable" alerts for scaling down events.

An example alert:
[screenshot: alert from 2021-08-06, 13:51]

But it was for a normal scaling-down event:
[screenshot: scale-down event from 2021-08-06, 13:54]

Activegate image not working on ARM64

Hi,

When executing a 'standard' operator install (i.e. using install.sh) on an arm64 cluster, I get a CrashLoopBackOff on the ActiveGate pod, with this image:

Image: XXXXX.live.dynatrace.com/linux/activegate:latest

Is this image multi-arch? Does it support ARM64?

kubectl apply --dry-run is deprecated in v1.18

Should replace

"${CLI}" -n dynatrace create secret generic dynakube --from-literal="apiToken=${API_TOKEN}" --from-literal="paasToken=${PAAS_TOKEN}" --dry-run -o yaml | "${CLI}" apply -f -

https://github.com/Dynatrace/dynatrace-operator/blob/master/deploy/install.sh#L114

With something like

"${CLI}" -n dynatrace create secret generic dynakube --from-literal="apiToken=${API_TOKEN}" --from-literal="paasToken=${PAAS_TOKEN}" --dry-run=client -o yaml | "${CLI}" apply -f -

https://kubernetes.io/blog/2019/01/14/apiserver-dry-run-and-kubectl-diff/

Process/container restart post deploying OA operator

This is more of a question than an issue. I tried both the automated way of deploying the OA operator from the Dynatrace Hub in the UI and the manual way using the documentation.

Why was a process/container restart not required when I tried the automated way? Is the code taking care of it?

arm64 issue: exec user process caused: exec format error

I had previously gotten my ARM setup to work properly, but recently, after upgrading to operator 0.3 and refactoring my YAML, it does not work anymore. It looks like the image itself is not ARM64 compatible. Inspecting the logs, I'm getting only this line:

standard_init_linux.go:228: exec user process caused: exec format error


Is something wrong with my config, or is this an operator bug?

apiVersion: dynatrace.com/v1beta1
kind: DynaKube
metadata:
  name: dynakube
  namespace: dynatrace
spec:
  apiUrl: https://<...>/api

  networkZone: home-lab

  activeGate:
    capabilities:
      - routing
      - kubernetes-monitoring
#      - metrics-ingest
    
    nodeSelector:
      kubernetes.io/arch: amd64

  oneAgent:
    classicFullStack:

      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
        operator: Exists
      
      nodeSelector:
        kubernetes.io/arch: amd64

  oneAgent:
    classicFullStack:
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
        operator: Exists
      
      nodeSelector:
        kubernetes.io/arch: arm64

      env:
        - name: ONEAGENT_INSTALLER_SCRIPT_URL
          value: https://<...>/api/v1/deployment/installer/agent/unix/default/latest?arch=arm&flavor=default&Api-Token=<...>

Lightweight OneAgent image does not start

Following the Private Registries Instructions for a lightweight image, the Docker Hub image given at docker.io/dynatrace/oneagent:latest does not start for a Dynatrace Managed environment when set via the classic full stack monitoring CR, giving this startup error:

01:39:57 Bootstrapping regular deployment
01:39:57 Started agent deployment as a container, PID 15738.
01:39:57 System version: Linux ec2amaz-redacted 5.4.156-83.273.amzn2.x86_64 #1 SMP Sat Oct 30 12:59:07 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
01:39:57 Command line: --set-host-property=OperatorVersion=v0.3.0 --set-deployment-metadata=orchestration_tech=Operator-classic_fullstack --set-deployment-metadata=script_version=v0.3.0 --set-deployment-metadata=orchestrator_id=REDACTED --set-host-id-source=auto
01:39:57 Installed version:
01:39:57 ONEAGENT_INSTALLER_SCRIPT_URL=
01:39:57 ONEAGENT_INSTALLER_DOWNLOAD_TOKEN=
01:39:57 ONEAGENT_INSTALLER_DOWNLOAD_VERBOSE=
01:39:57 ONEAGENT_INSTALLER_SKIP_CERT_CHECK=
01:39:57 ONEAGENT_ENABLE_VOLUME_STORAGE=
01:39:57 ONEAGENT_CONTAINER_STORAGE_PATH=
01:39:57 ONEAGENT_NO_REMOUNT_ROOT=
01:39:57 ONEAGENT_ADDITIONAL_UNMOUNT_PATTERN=
01:39:57 ONEAGENT_DISABLE_CONTAINER_INJECTION=
01:39:57 ONEAGENT_READ_ONLY_MODE=
01:39:57 AGENT_CONTAINER_IMAGE_VERSION=1.229.65.20211018-111427
01:39:57 Path: /usr/sbin:/usr/bin:/sbin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
01:39:57 Started with capabilities: Capabilities for `self': = cap_chown,cap_dac_override,cap_dac_read_search,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_net_admin,cap_net_raw,cap_sys_chroot,cap_sys_ptrace,cap_sys_admin,cap_sys_resource,cap_setfcap+eip
01:39:57 Started with user: uid=0(root) gid=0(root) groups=0(root)
01:39:57 Error: The ONEAGENT_INSTALLER_SCRIPT_URL environment variable must be initialized with your cluster's agent download location (to be obtained via "Deploy Dynatrace" in the Dynatrace UI). Example:
01:39:57 If you are not sure how to launch the container please visit: https://www.dynatrace.com/support/help/shortlink/oneagent-docker
01:39:57 Error: Initialization procedure failed

The default Dynatrace environment-hosted OneAgent image (or a private-hosted version of it per the "immutable" directions) works as expected. If this particular option has been intentionally superseded by the new monitoring modes, then the private registries documentation likely needs updating.
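If the lightweight image is not intentionally superseded, the container's own error message suggests a workaround: set ONEAGENT_INSTALLER_SCRIPT_URL via the CR's env field. A sketch with placeholders (I have not verified this against a Managed environment; the URL shape follows the installer link from "Deploy Dynatrace" in the UI):

```yaml
oneAgent:
  classicFullStack:
    image: docker.io/dynatrace/oneagent:latest
    env:
      # Placeholder URL: obtain the real one via "Deploy Dynatrace" in the UI.
      - name: ONEAGENT_INSTALLER_SCRIPT_URL
        value: https://<your-environment>/api/v1/deployment/installer/agent/unix/default/latest?arch=x86&flavor=default&Api-Token=<token>
```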

skipCertCheck: false

Hello guys,

We have faced a problem regarding the Docker registry certificate.
Our setup is a Dynatrace cluster (as well as its Docker registry) protected with a self-signed CA (internal CA).
The skipCertCheck flag affects only the Dynatrace server, not the Docker registry (sometimes called an insecure registry).
We cannot (and do not want to) modify your Docker image, and there is no other way to get the ActiveGate image (OneAgent images can be pulled from Docker Hub, or from OpenShift).
One workaround is to manually pull from the Dynatrace registry and push to another private Docker registry that is not protected with an internal-CA-issued certificate.
Another workaround is to use the managed Dynatrace DNS, but IMHO this is not a good solution.

Please extend your image to be able to use an "insecure" (internal-CA-issued, TLS-protected) registry.

Thanks in advance.
Viktor
