
tableau-server-in-kubernetes's Introduction

Tableau Server In Kubernetes

Community Supported

This project consists of documentation and examples demonstrating how to run Tableau Server in an existing Kubernetes cluster. Make sure the Tableau Server image used is sourced from the Tableau Server in a Container Project.

Project Resources

Kubernetes example templates are stored in the templates/ directory of this project. There are single-node, multi-node, and upgrade job templates that can be used as starting points.

Requirements

Kubernetes deployments come in many different flavors, and those differences affect how Tableau Server can be deployed in a container orchestration system. To account for this variability and help you make the best decision for your environment, this section covers the high-level requirements Tableau Server needs in order to run properly in a container orchestration system.

Single Node Requirements

Network

Hostnames inside the container must be static and consistent. This means on every restart of the container or pod, the hostname in the container must stay the same. This is one of the primary reasons we recommend using the StatefulSet workload API object to deploy the Tableau Server pod. Persistent volumes that store Tableau Server state also expect to be used with the same container hostname. Tableau Server containers receive client traffic on port 8080 by default (8443 for TLS).
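For illustration only, here is a minimal sketch of a headless Service plus StatefulSet that keeps the container hostname stable across restarts and exposes port 8080. The names (tableau-server, tableau) and the image reference are placeholders, not values from the project templates; see the templates/ directory for the complete, supported examples.

apiVersion: v1
kind: Service
metadata:
  name: tableau-server
spec:
  clusterIP: None            # headless Service: each pod gets a stable DNS record
  selector:
    app: tableau-server
  ports:
  - name: http
    port: 8080
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: tableau
spec:
  serviceName: tableau-server   # pods get stable names such as tableau-0
  replicas: 1
  selector:
    matchLabels:
      app: tableau-server
  template:
    metadata:
      labels:
        app: tableau-server
    spec:
      containers:
      - name: tableau-server
        image: example.registry/tableau-server:latest   # placeholder image reference
        ports:
        - containerPort: 8080   # 8443 if TLS is configured

Because StatefulSet pods keep the same name (and therefore the same hostname) across restarts, this satisfies the static-hostname requirement described above.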

Multi-Node Requirements

Network

Container short hostnames must be DNS resolvable; Tableau Server containers in a cluster self-register and discover each other by their short hostnames. This means a standard Kubernetes deployment using CoreDNS or kube-dns will require customizing the container's DNS policy. Check the example Kubernetes multinode configuration to see one way of handling this. The Kubernetes documentation on DNS lookups provides more information on this topic. Using a headless Service is recommended because every Tableau Server pod then gets its own DNS entry. The Kubernetes documentation on headless services provides more details on how this works.
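As a hedged sketch of one way this can look (the Service name tableau-server and namespace tableau-ns are placeholders, not values from the project templates), the StatefulSet pod template can add the headless Service's DNS suffix to the search list so that short pod hostnames such as tableau-1 resolve:

  dnsPolicy: ClusterFirst
  dnsConfig:
    searches:
    - tableau-server.tableau-ns.svc.cluster.local   # placeholder namespace; makes short pod names resolvable

The headless Service itself (clusterIP: None, as in the single-node sketch above) is what creates the per-pod DNS records that these short-name lookups resolve to.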

Bootstrap file

A bootstrap file must be shared between the initial node and all subsequent worker nodes. We recommend using an NFS mount to share this file, which enables the multi-node deployment to be fully automated. If you are using AWS, you can use EFS to the same effect. Check the Multinode bootstrap file section for more details. The Kubernetes multinode template shows one possible way of handling this.
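As a rough sketch, a ReadWriteMany volume (NFS, or EFS on AWS) can be claimed once and mounted into every Tableau Server pod; the storage class name, claim name, and mount path below are placeholders and must match what your container configuration actually expects:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: bootstrap-share
spec:
  accessModes:
  - ReadWriteMany            # initial node and workers read/write the same bootstrap file
  storageClassName: nfs      # placeholder: any RWX-capable class (for example EFS on AWS)
  resources:
    requests:
      storage: 1Gi
---
# In each Tableau Server pod template (initial node and workers):
#   volumeMounts:
#   - name: bootstrap
#     mountPath: /docker/config/bootstrap   # placeholder path
#   volumes:
#   - name: bootstrap
#     persistentVolumeClaim:
#       claimName: bootstrap-share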

Documentation

Status

There are two status checks provided for Tableau Server. These can be used by an orchestration system, like Kubernetes, to determine whether Tableau Server is still starting up or is running and ready to serve traffic.

Aliveness Check

The aliveness check indicates whether the TSM services are running, that is, whether the orchestrated services of Tableau Server are operating and functional. The check can be called at:

/docker/alive-check

Another option is to expose the TSM Controller service (running on port 8850), which provides administrative functions through a web browser, and periodically verify its health with TCP health checks.
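For example, a liveness probe sketch; the delay, period, and timeout values are illustrative, not prescriptive:

livenessProbe:
  exec:
    command: ["/bin/sh", "-c", "/docker/alive-check"]
  initialDelaySeconds: 600   # Tableau Server can take several minutes to start
  periodSeconds: 60
  timeoutSeconds: 10
# Or, if the TSM Controller port is exposed, a plain TCP check:
# livenessProbe:
#   tcpSocket:
#     port: 8850
#   periodSeconds: 60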

Readiness Check

The readiness check indicates whether Tableau Server is running and business services are ready to receive traffic. This can be determined using the following script:

/docker/server-ready-check

Another option is to use TCP health checks against port 8080 (or whatever port Tableau Server is bound to for receiving traffic). This kind of TCP health check can be more reliable than server-ready-check, because server-ready-check is based on service status reported to TSM, which can lag behind the actual service state.
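For example, two readiness probe variants; the values are illustrative:

readinessProbe:
  exec:
    command: ["/bin/sh", "-c", "/docker/server-ready-check"]
  initialDelaySeconds: 360
  periodSeconds: 30
  timeoutSeconds: 10
# TCP alternative described above:
# readinessProbe:
#   tcpSocket:
#     port: 8080    # or whatever port Tableau Server is bound to
#   periodSeconds: 30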

Resource Limits

We strongly recommend that Kubernetes deployments set appropriate resource limits for the deployed containers. Tableau Server has significant resource requirements; make sure the resource limits you specify meet at least the Tableau Server resource requirements for testing and production.
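For example, a container resources stanza; the numbers below are illustrative placeholders, not Tableau's published minimums, so size them against the documented Tableau Server hardware requirements for your workload:

resources:
  requests:
    cpu: "8"          # illustrative; match Tableau Server's documented requirements
    memory: 32Gi
  limits:
    cpu: "8"
    memory: 32Gi

Setting requests equal to limits also gives the pod the Guaranteed QoS class.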

Network Properties

Tableau Server does not handle container hostname changes well, so it is important to set the container's internal hostname so that it stays consistent between container runs.

Tableau Server nodes in a cluster communicate by registering their container hostnames with the other Tableau Server nodes. This means each container hostname must be resolvable by DNS.

Deployment Properties

We recommend using StatefulSets and persistent volume claims when deploying Tableau Server in Kubernetes. At the moment Tableau Server is a stateful application, so appropriate measures should be taken to preserve and back up Tableau's application state. Future advancements will loosen these requirements.
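For example, a sketch of a StatefulSet volumeClaimTemplates entry persisting Tableau's data directory /var/opt/tableau; the claim name and size are placeholders:

  volumeClaimTemplates:
  - metadata:
      name: datadir
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 128Gi         # placeholder size
# and in the container spec:
#   volumeMounts:
#   - name: datadir
#     mountPath: /var/opt/tableau   # Tableau Server state lives here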

tableau-server-in-kubernetes's People

Contributors

bhushantableau, nbrandes-tableau, rbrewer, seanmakesgames


tableau-server-in-kubernetes's Issues

single node yaml runs into an error connecting to zookeeper

When deploying the single-node YAML onto EKS 1.21 with the latest version of Tableau Server and the Tableau Docker image setup repo, I get an error when initializing the topology. The node has 8 CPUs and 32GB memory, which I imagine should be enough to at least start Tableau.

tail -n 20 /var/opt/tableau/tableau_server/supervisord/run-tableau-server.log
42% - Waiting for services to reconfigure.    
Running - Initializing the topology.
An error occurred retrieving job status. The job may still be running, but TSM could not retrieve its status. You may be able to check the job status using the 'tsm jobs list' or 'tsm jobs reconnect' commands.
+ delete_files_on_exit
+ '[' 0 -gt 0 ']'
[tableau@tableau-0 /]$ tsm jobs reconnect --id 1
Reconnecting to asynchronous job...
2% - Checking control plane services.
5% - Validating that there are no pending changes.
7% - Generating new asset key.
10% - Saving asset key.
12% - Generating passwords.
15% - Generating secret keys.
17% - Generating apigateway mutual SSL certificates.
20% - Generating Apache Gateway Internal mutual SSL certificates.
22% - Generating Unique Cluster Identifier.
25% - Generating Search Server SSL certificate.
27% - Generating Index And Search Server SSL certificate.
30% - Generating ActiveMQ Server SSL certificate.
32% - Generating internal Metadata API mutual SSL certificates.
35% - Generating key store.
37% - Generating Hyper SSL certificate.
40% - Promoting configuration.
42% - Waiting for services to reconfigure.
Running - Initializing the topology.
Job id is '1', timeout is 150 minutes.

Server initialization was unsuccessful.
This job failed due to unexpected error: '{0}'

If "{0}'" could be replaced with a more descriptive error that would be great. Anyways, continuing on:

In /var/opt/tableau/tableau_server/data/tabsvc/logs/tabadmincontroller/tabadmincontroller_node1-0.log I saw

2022-05-12 02:29:23.331 +0000  pool-20-thread-1 : ERROR com.tableausoftware.tabadmin.webapp.asyncjobs.JobStepRunner - Running step InitializeTopology failed

and

2022-05-12 02:42:21.787 +0000  main-SendThread(tableau-0:8976) : INFO  org.apache.zookeeper.ClientCnxn - Opening socket connection to server tableau-0/10.10.43.33:8976. Will not attempt to authenticate using SASL (unknown error)
2022-05-12 02:42:21.856 +0000  nioEventLoopGroup-4-1 : INFO  org.apache.zookeeper.ClientCnxnSocketNetty - SSL handler added for channel: [id: 0x5047ef5d]
2022-05-12 02:42:21.864 +0000  nioEventLoopGroup-4-1 : INFO  org.apache.zookeeper.ClientCnxnSocketNetty - future isn't success, cause:
io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: tableau-0/10.10.43.33:8976
Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
        at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source) ~[?:?]
        at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330) ~[netty-transport-4.1.73.Final.jar:4.1.73.Final]
        at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:710) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:658) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:584) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496) [netty-transport-4.1.73.Final.jar:4.1.73.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986) [netty-common-4.1.73.Final.jar:4.1.73.Final]
        at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.73.Final.jar:4.1.73.Final]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.73.Final.jar:4.1.73.Final]
        at java.lang.Thread.run(Unknown Source) [?:?]

Note the "Will not attempt to authenticate using SASL (unknown error)" message. Not sure if that's of concern.

ZooKeeper status says active. There is an error listed, which may or may not be the source of the issue; I have not made any modifications to ZooKeeper.

[tableau@tableau-0 /]$ /var/opt/tableau/tableau_server/data/tabsvc/services/appzookeeper_0.20221.22.0415.1144/appzookeeper/appzookeeper status
[584] [INFO] 2022-05-12 17:58:22.233 +0000 : Loading configuration from /var/opt/tableau/tableau_server/data/tabsvc/services/appzookeeper_0.20221.22.0415.1144/appzookeeper/appzookeeper.runjavaservice.json
[584] [INFO] 2022-05-12 17:58:22.234 +0000 : Loading configuration from /var/opt/tableau/tableau_server/data/tabsvc/services/appzookeeper_0.20221.22.0415.1144/config/appzookeeper.runjavaservice.json
[584] [INFO] 2022-05-12 17:58:22.234 +0000 : Loading manifest from /var/opt/tableau/tableau_server/data/tabsvc/services/appzookeeper_0.20221.22.0415.1144/appzookeeper/appzookeeper.jar
[584] [INFO] 2022-05-12 17:58:22.234 +0000 : Starting malloc_trim thread. Run every 60 sec. Heap pad MB: 1
[584] [INFO] 2022-05-12 17:58:22.239 +0000 : Loading JVM library /var/opt/tableau/tableau_server/data/tabsvc/services/appzookeeper_0.20221.22.0415.1144/repository/jre/lib/server/libjvm.so
[584] [INFO] 2022-05-12 17:58:22.513 +0000 : Java class name: com.tableausoftware.zookeeper.Zookeeper; Method name: main; Arguments: status
	ERROR org.apache.zookeeper.server.quorum.QuorumPeerConfig - Invalid configuration, only one server specified (ignoring)
{
  "currentDeploymentState" : "NONE",
  "details" : {
    "code" : "imok"
  },
  "name" : "appzookeeper_0",
  "processStatus" : "ACTIVE",
  "timestampUtc" : 1652378307447,
  "version" : "20221.22.0415.1144"
}

Here's our /etc/hosts file:

[tableau@tableau-0 /]$ cat /etc/hosts
# Kubernetes-managed hosts file.
127.0.0.1	localhost
::1	localhost ip6-localhost ip6-loopback
fe00::0	ip6-localnet
fe00::0	ip6-mcastprefix
fe00::1	ip6-allnodes
fe00::2	ip6-allrouters
10.10.43.33	tableau-0

[bug] hyper fails due to incorrect permissions

hyper.root.key has the wrong permissions (0660) when the pod is created or recreated. This causes Hyper to fail. The workaround is to manually fix the permissions to 0600 (see the sketch at the end of this issue). This has already been officially reported (case 08354366) and was not solved. I'm posting publicly so other people can easily find the workaround.

[tableau@censored-hostname /]$ tsm status -v
node1: localhost
        Status: DEGRADED
        'Tableau Server Application Server 0' is in an error state.
        'Tableau Server Data Engine 0' is in an error state.
        'Tableau Server Ask Data 0' is in an error state.

[tableau@censored-hostname /]$ # note that /var/opt/tableau is an EBS volume mounted onto the image
[tableau@censored-hostname /]$ stat /var/opt/tableau/tableau_server/data/tabsvc/config/hyper_0.20221.22.0415.1144/hyperSecurity/hyper.root.key
  File: ‘/var/opt/tableau/tableau_server/data/tabsvc/config/hyper_0.20221.22.0415.1144/hyperSecurity/hyper.root.key’
  Size: 1704       Blocks: 8          IO Block: 4096   regular file
Device: 10303h/66307d Inode: 9961693     Links: 1
Access: (0660/-rw-rw----)  Uid: (  999/ tableau)   Gid: (  998/ tableau)
Access: 2022-08-01 15:40:06.450123093 +0000
Modify: 2022-06-21 00:21:46.211158153 +0000
Change: 2022-08-06 06:01:13.096799104 +0000
 Birth: -

[tableau@censored-hostname /]$ tail /var/opt/tableau/tableau_server/data/tabsvc/services/hyper_0.20221.22.0415.1144/logs/stdout_hyper_0.log
Permissions for 'ssl_key' have to be 0600
LogFile: /var/opt/tableau/tableau_server/data/tabsvc/logs/hyper/hyper_0_2022_08_06_00_00_00.log
hyperd server version 9.1.0 build version 2022.1.0.14040.rf1eaa7d9
Permissions for 'ssl_key' have to be 0600

[tableau@censored-hostname /]$ chmod 0600 /var/opt/tableau/tableau_server/data/tabsvc/config/hyper_0.20221.22.0415.1144/hyperSecurity/hyper.root.key

[tableau@censored-hostname /]$ lsb_release -a
Distributor ID: CentOS
Description: CentOS Linux release 7
Release: 7

[tableau@censored-hostname /]$ tsm status -v
node1: localhost
        Status: RUNNING
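Until the bug is fixed, one possible way to automate the workaround is a postStart hook in the pod spec. This is only a sketch: it assumes the key already exists on the persistent volume when the container starts, and the glob is there to cover the version-specific directory name.

lifecycle:
  postStart:
    exec:
      command:
      - /bin/sh
      - -c
      # fix hyper.root.key permissions if the file is present; ignore the error if it is not there yet
      - chmod 0600 /var/opt/tableau/tableau_server/data/tabsvc/config/hyper_0.*/hyperSecurity/hyper.root.key || true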

Upgrade Job Documentation

Hi!

I'm wondering how one would use the upgrade job in production. Is there any documentation for how one would use this? It seems like upgrades should be handled by creating new images and rolling out new pods; I'm just not sure how the job template helps with that.

Tableau deployment is failing at initialization phase in kubernetes platform when using single-node and three-node yamls

Hi Team,

We are creating the Tableau Server image using the build-image tool described at https://help.tableau.com/current/server-linux/en-us/server-in-container_setup-tool.htm, as there is no official image for it on hub.docker.com.
We built the image on Ubuntu 20.04.
After image creation, we deployed it in a Kubernetes cluster using the single-node YAML at https://github.com/tableau/tableau-server-in-kubernetes/blob/main/templates/single-node.yml.

It is stuck at the initialization phase. We are using high-spec machines: 16 vCPUs, 64GB RAM, and 128GB of disk.
We are unable to move forward because of this issue. Attaching logs for your further reference.
Kindly help.

Tableau Kubernetes Deployment Issue

Can anyone assist me with this issue I'm facing while deploying Tableau in Kubernetes?

I'm trying to deploy a single node using the template. Unfortunately, when I run the deployment I get insufficient CPU and memory errors. I'm using an m5.4xlarge Amazon Linux instance that has 16 vCPUs and 64GB memory, and I specified the same in my YAML file.

Error:
"Unable to schedule pod; no fit; waiting" pod="tableau-server-ns/tableau-deployment-7f8b674dbb-qfql9" err="0/1 nodes are available: 1 Insufficient cpu, 1 Insufficient memory."

Name: tableau-deployment-7f8b674dbb-qfql9
Namespace: tableau-server-ns
Priority: 0
Node:
Labels: app=tableau-server
pod-template-hash=7f8b674dbb
Annotations: kubernetes.io/psp: eks.privileged
Status: Pending
IP:
IPs:
Controlled By: ReplicaSet/tableau-deployment-7f8b674dbb
Containers:
tableau-repo:
Image:
Port: 8080/TCP
Host Port: 0/TCP
Limits:
cpu: 16
memory: 64Gi
Requests:
cpu: 16
memory: 64Gi
Liveness: exec [/bin/sh -c /docker/alive-check] delay=600s timeout=1s period=60s #success=1 #failure=3
Readiness: exec [/bin/sh -c /docker/server-ready-check] delay=360s timeout=1s period=30s #success=1 #failure=3
Environment:
LICENSE_KEY: <set to the key 'LICENSE_KEY' in secret 'tableau-secrets'> Optional: false
TABLEAU_USERNAME: <set to the key 'TABLEAU_USERNAME' in secret 'tableau-secrets'> Optional: false
TABLEAU_PASSWORD: <set to the key 'TABLEAU_PASSWORD' in secret 'tableau-secrets'> Optional: false
TSM_REMOTE_PASSWORD: <set to the key 'TSM_REMOTE_PASSWORD' in secret 'tableau-secrets'> Optional: false
Mounts:
/docker/config/config.json from configmount (rw,path="config.json")
/var/opt/tableau from datamount (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-82wtz (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
configmount:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: configfile
Optional: false
datamount:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: datadir
ReadOnly: false
kube-api-access-82wtz:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional:
DownwardAPI: true
QoS Class: Guaranteed
Node-Selectors:
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message


Warning FailedScheduling 24s (x66 over 65m) default-scheduler 0/1 nodes are available: 1 Insufficient cpu, 1 Insufficient memory.

Am I out of luck for a disconnected OpenShift instance?

Per this documentation, it seems that Tableau Server does not support license activation on an air-gapped network while running in a container.

Is this still true? As my title states, I want to deploy Tableau Server on an air-gapped OpenShift instance, and I want to know whether I am on a fool's errand before I go too far down the path of attempting to deploy it.

If air-gapped deployment on OCP is truly not possible and no workarounds exist, are there any plans to add this capability in the future?

Thank you!

Trusted IP and hostname

I have a pod that requests a trusted token from the Tableau pod.
Since pods are ephemeral and the IP will likely change often, what do I put in the field for the whitelisted sources for trusted authentication?
It would be perfect if Tableau accepted a CIDR instead of a static IP; unfortunately, that's not the case.

Failing to install/configure Tabadmin Controller

Hi, I am facing an issue when initializing Tableau on Kubernetes.

From tableau/tableau_server/data/tabsvc/logs/tabadmincontroller/control_tabadmincontroller_node1-0.log:

2022-01-28 14:26:22.523 +0000 205 main : DEBUG com.tableausoftware.tabadmin.configuration.builder.AppConfigurationBuilder - Writing connections properties /var/opt/tableau/tableau_server/data/tabsvc/services/connections.properties
2022-01-28 14:26:22.524 +0000 205 main : DEBUG com.tableausoftware.tabadmin.configuration.builder.AppConfigurationBuilder - Picking connection settings from dataengine over vizqlserver.
2022-01-28 14:26:22.537 +0000 205 main : ERROR com.tableausoftware.tabadmin.webapp.TabadminController - Exception while configuring process.
java.nio.file.FileSystemException: /var/opt/tableau/tableau_server/data/tabsvc/services/connections.properties: Operation not supported
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:100) ~[?:?]
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111) ~[?:?]
	at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116) ~[?:?]
	at sun.nio.fs.LinuxDosFileAttributeView.readAttributes(LinuxDosFileAttributeView.java:182) ~[?:?]
	at com.tableausoftware.files.PropertiesFile.cloneSecurity(PropertiesFile.java:83) ~[file-utils-20214.0.10.jar:?]
	at com.tableausoftware.files.PropertiesFile.storeSorted(PropertiesFile.java:118) ~[file-utils-20214.0.10.jar:?]
	at com.tableausoftware.tabadmin.configuration.builder.AppConfigurationBuilder.writeConnectionProperties(AppConfigurationBuilder.java:486) ~[tab-tabadmin-config-latest.jar:?]
	at com.tableausoftware.tabadmin.configuration.builder.AppConfigurationBuilder.writeConfigurationFiles(AppConfigurationBuilder.java:635) ~[tab-tabadmin-config-latest.jar:?]
	at com.tableausoftware.tabadmin.configuration.builder.AppConfigurationBuilder.build(AppConfigurationBuilder.java:328) ~[tab-tabadmin-config-latest.jar:?]
	at com.tableausoftware.tabadmin.configuration.builder.AppConfigurationBuilder.buildAndWriteConfigurations(AppConfigurationBuilder.java:104) ~[tab-tabadmin-config-latest.jar:?]
	at com.tableausoftware.service.control.BaseTableauServiceCommands.buildAndWriteWorkgroupConfig(BaseTableauServiceCommands.java:359) ~[control-shared-latest.jar:?]
	at com.tableausoftware.service.control.BaseTableauServiceCommands.generateLocalConfigs(BaseTableauServiceCommands.java:315) ~[control-shared-latest.jar:?]
	at com.tableausoftware.tabadmin.webapp.TabadminController$Commands.configureImpl(TabadminController.java:299) [control-tabadmincontroller.jar:?]
	at com.tableausoftware.tabadmin.webapp.TabadminController$Commands.install(TabadminController.java:446) [control-tabadmincontroller.jar:?]
	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
	at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
	at com.tableausoftware.commandline.SwitchCommand$1.run(SwitchCommand.java:174) [tab-commandline-jewel-cli-latest.jar:?]
	at com.tableausoftware.commandline.SimpleCommand.execute(SimpleCommand.java:47) [tab-commandline-jewel-cli-latest.jar:?]
	at com.tableausoftware.commandline.SwitchCommand.execute(SwitchCommand.java:129) [tab-commandline-jewel-cli-latest.jar:?]
	at com.tableausoftware.tabadmin.webapp.TabadminController.main(TabadminController.java:171) [control-tabadmincontroller.jar:?]

Do you have any idea what might be causing this?

Tableau container logging secrets

As part of the startup of Tableau's container, it prints all variables in the .env file, which may contain secrets. That's probably because the script has -x set.

I didn't know where to open this issue, so I'm sorry if that's the wrong place.

insufficient timeout for readiness probe

The default timeout for a readiness probe is 1 second.
[ref] https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#configure-probes

The server-ready-check script on my deployment takes just over 5 seconds to finish running:

[tableau@bi-test docker]$ time ./server-ready-check
node1: localhost
        Status: RUNNING
        'Tableau Server Gateway 0' is running.
        'Tableau Server Application Server 0' is running.
        'Tableau Server Interactive Microservice Container 0' is running.
                'MessageBus Microservice 0' is running.
                'Relationship Query Microservice 0' is running.
                'Credentials Service 0' is running.
        'Tableau Server VizQL Server 0' is running.
        'Tableau Server VizQL Server 1' is running.
        'Tableau Server Cache Server 0' is running.
        'Tableau Server Cache Server 1' is running.
        'Tableau Server Coordination Service 0' is running.
        'Tableau Server Cluster Controller 0' is running.
        'Tableau Server Search And Browse 0' is running.
        'Tableau Server Backgrounder 0' is running.
        'Tableau Server Backgrounder 1' is running.
        'Tableau Server Non-Interactive Microservice Container 0' is running.
        'Tableau Server Data Server 0' is running.
        'Tableau Server Data Server 1' is running.
        'Tableau Server Data Engine 0' is running.
        'Tableau Server File Store 0' is running.
        'Tableau Server Repository 0' is running (Active Repository).
        'Tableau Server Tableau Prep Conductor 0' is running.
        'Tableau Server Tableau Prep Flow Authoring 0' is running.
        'Tableau Server Tableau Prep Minerva Service 0' is running.
        'Tableau Server Index And Search Server 0' is running.
        'Tableau Server Ask Data 0' is running.
        'Tableau Server Administration Agent 0' is running.
        'Tableau Server Administration Controller 0' is running.
        'Tableau Server License Manager 0' is running.
        'Tableau Server Activation Service 0' is running.
        'Tableau Server Client File Service 0' is running.
        'Tableau Server Database Maintenance 0' is stopped.
        'Tableau Server Backup/Restore 0' is stopped.
        'Tableau Server Site Import/Export 0' is stopped.
        'Tableau Server Collections Service 0' is running.
        'Tableau Server Content Exploration Service 0' is running.
        'Tableau Server Webhooks 0' is running.
        'Tableau Server Authentication 0' is running.
        'Tableau Server API Gateway 0' is running.
        'Tableau Server Analytics Extensions Microservice 0' is running.
        'Tableau Server Messaging Service 0' is running.
        'Tableau Server Data Source Properties Service 0' is running.
        'Tableau Server Internal Data Source Properties Service 0' is running.
        'Tableau Server Metrics Service 0' is running.
        'Tableau Server Resource Limits Manager 0' is running.
        'Tableau Server Statistical Service 0' is running.
        'Tableau Server NonRelational Storage Service 0' is running.
        'Tableau Server Data Stories Service 0' is running.

real	0m5.019s

Suggestion: add timeoutSeconds: 10 to both probes.

readinessProbe:
  exec:
    command:
    - /bin/sh
    - -c
    - /docker/server-ready-check
  initialDelaySeconds: 360
  periodSeconds: 30
  timeoutSeconds: 10
livenessProbe:
  exec:
    command:
    - /bin/sh
    - -c
    - /docker/alive-check
  initialDelaySeconds: 600
  periodSeconds: 60
  timeoutSeconds: 10

three-node setup, pgsql (repository) does not survive restart if the ip of the pod changes

Tableau version: tableau-server-20214.22.0420.0834-20214-22.0420.0834.x86_64 (official Docker image)

We have been experimenting with a three-pod StatefulSet setup. The installation went smoothly and we got it up and running, with an active/passive local pgsql (repository) running on the primary and one of the worker nodes respectively.
The problem appears when we reboot any of the pods that holds the pgsql role: when the pod comes back up, if its IP has changed, some services like pgsql never recover.
I've tried different conditions:

  1. Just reboot the pod - issue occurs
  2. tsm stop, then reboot the pod - the same issue
  3. tsm topology failover-repository, then reboot the passive pod - the same issue.

Some exceptions from the logs:
/var/opt/tableau/tableau_server/data/tabsvc/logs/pgsql/spawn.log

pg_basebackup: error: FATAL:  no pg_hba.conf entry for replication connection from host "10.xx.xx.xx", user "tblwgadmin", SSL on
FATAL:  no pg_hba.conf entry for replication connection from host "10.xx.xx.xx", user "tblwgadmin", SSL off

/var/opt/tableau/tableau_server/data/tabsvc/logs/vizqlserver/vizqlserver_node1-0.log

NativeApiLifecycleThread : ERROR wgsessionId= com.tableausoftware.resource.maps.GlobalGeocodingInfoProvider - Exception while reading geocoding paths
org.springframework.transaction.CannotCreateTransactionException: Could not open JPA EntityManager for transaction; nested exception is org.hibernate.exception.GenericJDBCException: Unable to acquire JDBC Connection
	at org.springframework.orm.jpa.JpaTransactionManager.doBegin(JpaTransactionManager.java:467) ~[spring-orm-5.3.18.jar:5.3.18]
	at org.springframework.transaction.support.AbstractPlatformTransactionManager.startTransaction(AbstractPlatformTransactionManager.java:400) ~[spring-tx-5.3.18.jar:5.3.18]
	at org.springframework.transaction.support.AbstractPlatformTransactionManager.getTransaction(AbstractPlatformTransactionManager.java:373) ~[spring-tx-5.3.18.jar:5.3.18]
	at org.springframework.transaction.interceptor.TransactionAspectSupport.createTransactionIfNecessary(TransactionAspectSupport.java:595) ~[spring-tx-5.3.18.jar:5.3.18]
	at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(TransactionAspectSupport.java:382) ~[spring-tx-5.3.18.jar:5.3.18]
	at org.springframework.transaction.aspectj.AbstractTransactionAspect.ajc$around$org_springframework_transaction_aspectj_AbstractTransactionAspect$1$2a73e96c(AbstractTransactionAspect.aj:71) ~[spring-aspects-5.3.18.jar:5.3.18]
	at com.tableausoftware.resource.maps.dao.GlobalGeocodingDaoJpa.getAllActiveEntries(GlobalGeocodingDaoJpa.java:80) ~[tab-resource-maps-latest.jar:?]
	at com.tableausoftware.resource.maps.GeocodingInfoService.getAllActiveEntries_aroundBody14(GeocodingInfoService.java:133) ~[tab-resource-maps-latest.jar:?]
	at com.tableausoftware.resource.maps.GeocodingInfoService$AjcClosure15.run(GeocodingInfoService.java:1) ~[tab-resource-maps-latest.jar:?]
	at org.aspectj.runtime.reflect.JoinPointImpl.proceed(JoinPointImpl.java:167) ~[aspectjrt-1.9.5.jar:?]
	at com.tableausoftware.aspects.search.TransactionLoggerAspect.trackLongRunningTransaction(TransactionLoggerAspect.java:71) ~[tab-domain-indexing-interfaces-latest.jar:?]
	at com.tableausoftware.resource.maps.GeocodingInfoService.getActiveGlobalEntries(GeocodingInfoService.java:133) ~[tab-resource-maps-latest.jar:?]
	at com.tableausoftware.resource.maps.GlobalGeocodingInfoProvider.getVersionedGeocodingPath(GlobalGeocodingInfoProvider.java:202) ~[tab-resource-maps-latest.jar:?]
	at com.tableausoftware.resource.maps.GlobalGeocodingInfoProvider.ensureCurrent(GlobalGeocodingInfoProvider.java:154) ~[tab-resource-maps-latest.jar:?]
	at com.tableausoftware.resource.maps.GlobalGeocodingInfoProvider.runPreInitHook(GlobalGeocodingInfoProvider.java:97) ~[tab-resource-maps-latest.jar:?]
	at com.tableausoftware.nativeapi.NativeApiManager.invokePreInitHooks(NativeApiManager.java:203) ~[tab-nativeapi-latest.jar:?]
	at com.tableausoftware.nativeapi.NativeApiManager.serverInit(NativeApiManager.java:208) ~[tab-nativeapi-latest.jar:?]
	at com.tableausoftware.nativeapi.NativeApiManager.lambda$start$0(NativeApiManager.java:355) ~[tab-nativeapi-latest.jar:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [?:?]
	at java.util.concurrent.FutureTask.run(Unknown Source) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
	at java.lang.Thread.run(Unknown Source) [?:?]
Caused by: org.hibernate.exception.GenericJDBCException: Unable to acquire JDBC Connection
	at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:47) ~[hibernate-core-5.4.27.Final.jar:5.4.27.Final]
	at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:113) ~[hibernate-core-5.4.27.Final.jar:5.4.27.Final]
	at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:99) ~[hibernate-core-5.4.27.Final.jar:5.4.27.Final]
	at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded(LogicalConnectionManagedImpl.java:111) ~[hibernate-core-5.4.27.Final.jar:5.4.27.Final]
	at org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getPhysicalConnection(LogicalConnectionManagedImpl.java:138) ~[hibernate-core-5.4.27.Final.jar:5.4.27.Final]
	at org.hibernate.internal.SessionImpl.connection(SessionImpl.java:480) ~[hibernate-core-5.4.27.Final.jar:5.4.27.Final]
	at org.springframework.orm.jpa.vendor.HibernateJpaDialect.beginTransaction(HibernateJpaDialect.java:152) ~[spring-orm-5.3.18.jar:5.3.18]
	at org.springframework.orm.jpa.JpaTransactionManager.doBegin(JpaTransactionManager.java:421) ~[spring-orm-5.3.18.jar:5.3.18]
	... 22 more
Caused by: org.postgresql.util.PSQLException: FATAL: no pg_hba.conf entry for host "10.xx.xx.xx", user "rails", database "workgroup", SSL on

Does this mean we need static internal IPs to make it work properly?
Is there any workaround for this other than rebuilding the cluster?

I also see that /var/opt/tableau/tableau_server/data/tabsvc/config/pgsql_0.20214.22.0420.0834/pg_hba.conf
contains the outdated IPs; maybe there is a way to reload it alongside the other configs?

Getting an error when attempting to build the Kubernetes container; might you know the reason, or be able to assist?

attempt:

[~/kube/apps/tableau/tableau-server-container-setup-tool-2021.4.0]
$ ./build-image --accepteula -i ../tableau-server-2021-4-2.x86_64.rpm -f
Copying installer to docker context directory: /home/lknite/kube/apps/tableau/tableau-server-container-setup-tool-2021.4.0/image/tableau-server-2021-4-2.x86_64.rpm
ERROR: Installer file corrupt: ../tableau-server-2021-4-2.x86_64.rpm. Correct the file and re-run command with -f flag.

I've re-downloaded the file a few times using wget.

rpm --checksig thinks everything is OK:

$ rpm --checksig ../tableau-server-2021-4-2.x86_64.rpm
../tableau-server-2021-4-2.x86_64.rpm: digests OK

Running the latest centos-8-stream with docker 20.20.11.

Failed to pull image: no space left on device

When pulling the image I got an error:

Failed to pull image "CENSORED": rpc error: code = Unknown desc = failed to register layer: Error processing tar file(exit status 1): write /tableau-server-CENSORED.rpm: no space left on device
