Cloudgene

A framework to build Software as a Service (SaaS) platforms for data analysis pipelines.

Features

  • 🔧 Build your analysis pipeline in your favorite language or use Hadoop-based technologies (MapReduce, Spark, Pig)
  • 📄 Integrate your analysis pipeline into Cloudgene by writing a simple configuration file (a minimal sketch follows this list)
  • 💡 Get a powerful web application with user management, data transfer, error handling and more
  • Deploy your application with one click to any Hadoop cluster or to public clouds like Amazon AWS
  • ☁️ Provide your application as SaaS to other scientists and handle thousands of jobs like a pro
  • 🌎 Share your application and enable everyone to clone your service to their own hardware or private cloud instance
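
A minimal sketch of such a configuration file (a cloudgene.yaml), modeled on the YAML excerpt in the issues below; treat the exact schema as version-dependent:

name: hello-world
version: 1.0.0
workflow:
    steps:
      - name: Say hello
        cmd: /bin/bash hello.sh $name > $output
        bash: true
    inputs:
      - id: name
        description: Your name
        type: text
    outputs:
      - id: output
        description: Greeting file
        type: local_file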

Requirements

You will need a Java runtime properly installed on your computer (Cloudgene itself is a Java application).

Installation

You can install Cloudgene via our install script:

mkdir cloudgene
cd cloudgene
curl -s install.cloudgene.io | bash

Test the installation with the following command:

./cloudgene version

We provide a Docker image to get a fully working Cloudgene instance running in minutes, without any installation.
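
For example, the following starts a throwaway instance (assuming the image is published as genepi/cloudgene; check the documentation for the current image name and port mapping):

docker run -it -p 8082:8082 genepi/cloudgene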

Getting started

The hello-cloudgene application can be installed by using the following command:

./cloudgene github-install lukfor/hello-cloudgene

The webserver can be started with the following command:

./cloudgene server

The web service is available at http://localhost:8082. Open this address in your web browser and log in with username admin and password admin1978.

Click on Run to start the application.

A job can be started by filling out the form and clicking the blue submit button. The hello-cloudgene application displays several inspiring quotes:

The documentation is available at http://docs.cloudgene.io.

More examples can be found in genepi/cloudgene-examples.

Cloudgene and Genomics

See Cloudgene in action in services built on it, such as the imputation servers referenced in the issues below.

Developing

More about how to build Cloudgene from source can be found in DEVELOPING.md.

Contact

  • Lukas Forer @lukfor
  • Sebastian Schönherr @seppinho

License

Cloudgene is licensed under AGPL-3.0.

Contributors

abought, dependabot[bot], haansi, jdpleiness, lukfor, seppinho, snyk-bot

Issues

html output

I have a step that renders an HTML file as output. It would be great to add it as an HTML widget, but I didn't succeed: I tried setting the template to $output/myfile.html, but it doesn't resolve to the correct path. Any ideas?

Feature Request : Dynamic visibility of options

Hi,

Great application !

If I may, I think adding dynamic visibility to the optional inputs might be a helpful addition. For instance, making an input's visibility conditional on the value of another input would allow a parameter such as 'rsq Filter' to be shown only for the appropriate Mode input (within an imputation pipeline).

Thank you !

Cancel jobs when a user is deleted

Summary

Currently, if a user deletes their account while a job is in the queue, the job will continue running even though there may be no way to deliver the results (since the user can't log in to get download links).

This is wasteful in times of peak load. We've recently started warning users with multiple accounts, and have seen some examples of people deleting accounts while jobs are running.

Proposed change

The user deletion code currently reassigns jobs to the user "public", but does not check whether any jobs are still running.

It makes sense to keep some record of jobs for tracking purposes, but ideally, running jobs should be canceled before being reassigned.

A much longer-term solution would pair this with "soft deletion", where the user record in the DB is retained with a new field is_deleted = true that blocks login, etc. This would allow auditing of user behavior that is not possible when the entire row is removed from the DB.
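
A minimal SQL sketch of the soft-deletion idea (the table and column names here are hypothetical, not Cloudgene's actual schema):

-- Hypothetical: keep the user row, but flag it as deleted.
ALTER TABLE users ADD COLUMN is_deleted BOOLEAN NOT NULL DEFAULT FALSE;

-- Login (and similar) queries would then exclude deleted accounts:
SELECT * FROM users WHERE username = ? AND is_deleted = FALSE;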

Jobs Wrong URL

When viewing a job as an admin, the "Jobs" link in the upper left is linked incorrectly to /admin.html#!pages/jobs instead of to /admin.html#!pages/admin-jobs and clicking it results in a 404.

Edit: Version: 1.30.5 (built by travis on 2018-09-14T07:42:13Z)

Kerberos authentication

After submitting a VCF for imputation on Cloudgene, the following error occurs:

org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]

The log is below. This Hadoop cluster uses Kerberos for authentication. Can ImputationServer be configured to authenticate with Kerberos?

There is also a warning: "Cluster seems unreachable. Hadoop support disabled." Is this related? Any suggestions?

./cloudgene server  --user farrell  --conf /usr/hdp/2.4.0.0-169/hadoop/conf

Cloudgene 1.30.3
http://www.cloudgene.io
(c) 2009-2018 Lukas Forer and Sebastian Schoenherr
Built by travis on 2018-02-12T14:59:50Z

Use Haddop configuration folder /usr/hdp/2.4.0.0-169/hadoop/conf with username farrell
18/05/26 19:44:55 INFO mapred.Main: Cloudgene 1.30.3
18/05/26 19:44:55 INFO mapred.Main: Built by travis on 2018-02-12T14:59:50Z
18/05/26 19:44:55 INFO mapred.Main: Establish connection successful
18/05/26 19:44:55 INFO mapred.Main: Database is uptodate.
18/05/26 19:44:55 INFO util.Fixtures: User admin already exists.
18/05/26 19:44:55 INFO util.Fixtures: Template MAINTENANCE_MESSAGE already exists.
18/05/26 19:44:55 INFO util.Fixtures: Template FOOTER already exists.
18/05/26 19:44:55 INFO util.Fixtures: Template REGISTER_MAIL already exists.
18/05/26 19:44:55 INFO util.Fixtures: Template RETIRE_JOB_MAIL already exists.
18/05/26 19:44:55 INFO util.Fixtures: Template RECOVERY_MAIL already exists.
18/05/26 19:44:55 INFO jobs.PersistentWorkflowEngine: Init Counters....
18/05/26 19:44:55 INFO mapred.Main: Starting web server at port 8082
18/05/26 19:44:55 INFO mapred.WebServer: Start CronJobScheduler...
[WARN]  Cluster seems unreachable. Hadoop support disabled.
[WARN]  Docker not found. Docker support disabled.

Server is running on http://localhost:8082

Please press ctrl-c to stop.
18/05/26 19:45:08 INFO LogService: 2018-05-26   19:45:08        10.48.225.55    -       10.48.225.55    8082    GET     /api/v2/server/apps/genepi-imputationserver     -       200     2973    0       341     http://scc-hadoop.bu.edu:8082   Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0     http://scc-hadoop.bu.edu:8082/start.html
 Parsed: refpanel
 Parsed: mode
 Parsed: check1
 Parsed: aesEncryption
 Parsed: check2
 Parsed: job-name
 Parsed: r2Filter
 Parsed: build
 Parsed: phasing
 Parsed: files
 Parsed: tool
 Parsed: population
org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled.  Available:[TOKEN, KERBEROS]
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
...

Potential Data Leak

I noticed that the Results tab for a job has an option (on the far right of the table) for users to share their data, which produces a direct download link. Isn't this a security risk? The link is accessible without being logged into the account. Can this be easily disabled? Is there some undocumented option for it in the application's .yaml file? I tested the "download" option: setting download to false does make the link unavailable, but it also makes the result unavailable to the user who submitted the job.

Or is the link inactive until the user clicks to share it? In that case, it might be prudent to add a second button that revokes access, in case a user clicks share by mistake or without understanding that the link is publicly accessible.

Version: 1.30.5 (built by travis on 2018-09-14T07:42:13Z)

Dockerfile uses unsupported Cloudera 5 / Ubuntu 14

Dockerfile:

FROM genepi/cdh5-hadoop-mrv1:latest

cdh5-hadoop-mrv1 Dockerfile (https://github.com/genepi/cdh5-hadoop-mrv1/blob/master/Dockerfile):

FROM ubuntu:14.04

It looks like Cloudera has dropped access to later versions of CDH:

As of February 1, 2021, all downloads of CDH and Cloudera Manager require a username and password and use a modified URL. You must use the modified URL, including the username and password when downloading the repository contents described below. You may need to upgrade Cloudera Manager to a newer version that uses the modified URLs.

This can affect new installations, upgrades, adding new hosts to a cluster, downloading a new parcel, and adding a cluster.

If I don't need Hadoop, can I simply use a supported version of Ubuntu and avoid the Cloudera deps?

Cloudgene seems not to work with Yarn

Hi,

I found that Cloudgene depends on the JobTracker and TaskTracker, which have been replaced by YARN. I installed Hadoop 3.1.0 and set it up as follows:

# mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

# yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>hdfs://localhost:8021</value>
  </property>
</configuration>

But in cloudgene.err it displayed:

Unknown rpc kind in rpc headerRPC_WRITABLE

This is caused by an incompatibility between MR1 and MR2 (YARN). Is there any way I can fix it, or do I have to downgrade Hadoop to an older version?

How to output a file ?

I don't know how to specify the output filename. Using type: local_file creates an output file inside the output folder, but I want a file named "merged.vcf".

workflow:
    steps:
      - name: Step1 
        cmd: /bin/bash run.sh $input > $output
        bash: true

    inputs:
      - id: input
        description: a zip folder with many VCF 
        type: local_file

    outputs:
      - id: output
        description: Merged vcf file 
        type: local_file

Some job counters are prone to overflow/wraparound

Summary

Certain job metrics are not being stored correctly in the database. This makes it more difficult to investigate system performance questions like "how many people would be affected if we put a limit on number of SNPs submitted".

There may be alternate ways to eventually recover the data from the job logs, but none as convenient. It's not an urgent fix, but definitely a "gotcha".

Tracking in case this surprises anyone else!

Description/ root cause

  • A counter value like genotypes is calculated by multiplying two large ints ("genotypes * samples"). The result can exceed the maximum Java value for that type (2,147,483,647 for a signed int)
  • Java silently wraps the result around to a much smaller (possibly negative) number
  • The correct value is shown in the UI / job logs (which are stored separately as a pre-constructed text string), but the wrong value is stored in the DB table.

This affects both the initial calculation and the incCounters method (which accepts an int); a minimal illustration follows.
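
A minimal Java illustration of the wraparound and the standard fix (the variable names are illustrative):

int snps = 2_500_000;
int samples = 15_000;

// int * int is computed in 32 bits: the true product 37,500,000,000
// wraps around to -1,154,705,664 before it is ever stored.
int wrong = snps * samples;

// Widening one operand to long before multiplying preserves the value.
long right = (long) snps * samples;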

Example

A recent job submitted 2.5M SNPs with 15k samples, i.e. ~3.75e10 genotypes. The Java max value for an int is ~2.1B, so the stored value wraps around to a meaningless number. The correct numbers of SNPs and samples are shown in the job report (where they are represented separately), but the values in the report do not match the numbers stored in the database (which are multiplied together).

In practice, this is usually not obvious until one needs to query to find big jobs. A subtler sign of an issue is that in TIS, ~10% of "genotypes" counters are < 0.

select count(*) from counters where name='genotypes' and value <0;

Note: the MySQL table definition already supports bigger numbers (counters.value is a bigint column). The issue appears to be in the Java code.

Valid Gruntfile could not be found

I tried to build the Webinterface and used the following steps according to the docs:

cd src/main/html/webapp
npm install
mkdir tmp
grunt

However, the grunt command reports that a valid Gruntfile could not be found, and indeed there is no Gruntfile in src/main/html/webapp. Any suggestions for resolving this?

Docker image in this repo can't be built - missing file

Hello, when I try to build the Dockerfile in this repo I get this error:

 > [15/22] COPY target/cloudgene-installer.sh /opt/cloudgene/cloudgene-installer.sh:
------
failed to compute cache key: "/target/cloudgene-installer.sh" not found: not found

There is no file target/cloudgene-installer.sh.

Can you help?
Thanks
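
For what it's worth, the missing file is presumably produced by a Maven build of the project before the Docker build is run (this is an assumption; see DEVELOPING.md for the actual build steps):

mvn clean package
docker build -t cloudgene .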

Docker Support is Disabled When Running as a Daemon

Perhaps there isn't much you can do about this. Running the server normally with sudo reports Docker support as available ("Docker is installed and running (Client version: 17.12.1-ce, Client API Version: 1.35)"), while running the daemon version shows it as disabled. I suspect the daemon drops to a regular user after binding the port, so it no longer has the permissions to start Docker containers. This could be overcome by allowing normal users to run Docker containers, but that is perhaps worse from a security standpoint than leaving the daemon running with sudo. Maybe the daemon could at least present a more specific error message: I noticed the cause quickly, but I can imagine someone trying to get it to pick up their Docker installation without realizing it is a permissions issue.

Complete Documentation

Several areas of cloudgene.io have incomplete documentation, e.g. http://docs.cloudgene.io/daemon/administration/

I can't find anything about how to modify the web UI other than that it uses some combination of JavaScript and Java and is built with Node.js, Grunt, and Maven. It is still very unclear to me.

I found this tutorial: https://scotch.io/tutorials/use-ejs-to-template-your-node-application

It suggests to me that placing .ejs files in /data/pages (based on the cloudgene.conf file) should replace the defaults. That doesn't seem to work for me, though. Does the web UI need to be completely rebuilt, as described in the DEVELOPING.md file?

I may just be in over my head with this framework.

[deprecation] Google Analytics: old style no longer collects data

Summary

Analytics features may be broken due to a vendor change.

TIS doesn't presently use it, but might in the future. Does it impact MIS?

Detail

Cloudgene has a feature to use Google Analytics, but the code was last updated about five years ago.

As of July 1 (two weeks ago), Google stopped collecting data from old-style Google Analytics properties unless they are upgraded. This will almost certainly break the Google Analytics feature in Cloudgene.

Action

If at least one site needs it, we can investigate updating the analytics code and the backend project settings. This might make more sense after Cloudgene v3, since the frontend and backend are under an active rewrite.

Hard-Coded Rscript Path

In the Cloudgene 2.0.0 RC6 release, the Rscript path is hard-coded as RSCRIPT_PATH = "/usr/bin/Rscript" in src/main/java/cloudgene/mapred/plugins/rscript/RScriptBinary.java. Could this be made configurable, or could Cloudgene check whether Rscript is present on the PATH and use that? On our HPCC we use modules to switch between software versions, so Rscript is not in a standard path and its location varies with the version of R loaded.
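
A hedged Java sketch of one possible resolution order, an environment override first and then the PATH (these names are hypothetical, not the project's actual code):

// Hypothetical: let an environment variable override the hard-coded path.
String rscript = System.getenv("RSCRIPT_PATH");
if (rscript == null || rscript.isEmpty()) {
    rscript = "Rscript"; // ProcessBuilder resolves a bare name via PATH
}
ProcessBuilder rs = new ProcessBuilder(rscript, "--version");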

[bug] Uncaught exceptions lead to unclear job failures

Summary

Certain kinds of QC errors are not handled in the code, and lead to mysterious job failures that a user cannot diagnose or fix. This is one of our most frequent helpdesk inquiries.

Actual behavior

Non-admin users are not allowed to see the full job logs tab, so they cannot inspect the stack trace to see the error description. The actual information they are presented with is rather opaque.
[screenshots of the opaque failure messages omitted]

Common scenarios

  • htsjdk does not support VCF4.3, and files in this format fail to parse.

Task 'Calculating QC Statistics' failed.
Exception:java.lang.IllegalArgumentException: Writing VCF version VCF4_3 is not implemented
at htsjdk.variant.variantcontext.writer.VCFWriter.rejectVCFV43Headers(VCFWriter.java:275)

  • It appears that certain VCF fields are required. This isn't captured in the data preparation docs, and some users have triggered an error they cannot see.

Task 'Calculating QC Statistics' failed.
Exception:java.io.IOException: /mnt/jobs/job-20230623-145718-031/input/files/split.chr1.vcf.gz: Line 7812: No GT field found in FORMAT column.
at genepi.io.text.AbstractLineReader.next(AbstractLineReader.java:46)

  • In the newest Minimac 4.1.x series, Minimac stops the job if too many allele swaps are detected. QC does not check for this, and the error is indicated only in Minimac's stdout (which is not captured by the admin- or user-level job logs)

Imputing chrREDACTED:x-y ...
Loading target haplotypes ...
Loading target haplotypes took 0 seconds
Loading reference haplotypes ...
Loading reference haplotypes took 22 seconds
Typed sites to imputed sites ratio: 0 (0/redacted)
Error: not enough target variants are available to impute this chunk. The --min-ratio, --chunk, or --region options may need to be altered.

Expected behavior

  • Document required VCF fields such as GT in the data preparation docs (if appropriate)
  • Handle the two exception cases noted above, and provide helpful messages that appear in the part of the job report visible to regular users
  • Provide a fallback message for any other unhandled error type, indicating that the user should reach out to the helpdesk (a sketch of this follows the list)
  • Consider adding some sort of logging event for unhandled errors that stop the QC flow, so that developers can identify future confusing edge cases.
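
A minimal Java sketch of the fallback idea (runWithFallback and the messages are hypothetical, not Cloudgene's actual API):

// Hypothetical wrapper: known input problems get a specific user-visible
// message; anything unexpected gets a generic, actionable one.
public static void runWithFallback(Runnable qcStep, java.util.function.Consumer<String> reportToUser) {
    try {
        qcStep.run();
    } catch (IllegalArgumentException e) {
        // Known input problem (e.g. unsupported VCF version): surface the cause.
        reportToUser.accept("Your input could not be processed: " + e.getMessage());
    } catch (RuntimeException e) {
        // Unexpected failure: full details belong in the developer-visible log;
        // the user gets a clear next step instead of a bare "Execution failed".
        reportToUser.accept("An unexpected error occurred. Please contact the helpdesk.");
    }
}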

How to install from local file?

The installation of 1000genomes fails before the zip download is finished:

genepi/imputationserver-docker#5

I'm downloading the zip now to inspect why it's complaining about not finding the zip headers, but the download seems to take a few hours for 12.6 GB.

Is there a way to download the zip and then install using that local copy?

Hard-coded database connection pool size too small, requests failing

Problem

During periods of high traffic, users are randomly unable to connect to the website. Logs trace this to a connection pooling problem: although our database can support hundreds of simultaneous connections (our default value: 304), the pool size is capped at 10.

This results in elevated 500 errors, interrupted job uploads, and general breakage. As the site gets more traffic, this becomes more common.

Proposed solution(s)

  1. Make the maxActive connection pool size configurable when the database connection is first opened (see the sketch after this list)
  2. ...or at least increase the default value (based on metrics, 20-50 open connections would be sufficient; -1 means unlimited)
  3. (fixed, release is pending) Database indexing PR should reduce the time spent on each request, allowing open connections to be re-used sooner
  4. TIS will be increasing the time between "job status" refresh from 5 -> 20 sec in the next release. This should cut the load from our most common server request by ~75%.
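
A hedged Java sketch of proposal 1, assuming the pool is built with commons-dbcp 1.4's BasicDataSource (the configurable limit is the part that would need to be added):

import org.apache.commons.dbcp.BasicDataSource;

public final class PoolFactory {
    // maxActive comes from configuration instead of a hard-coded 10.
    public static BasicDataSource create(String jdbcUrl, int maxActive) {
        BasicDataSource ds = new BasicDataSource();
        ds.setUrl(jdbcUrl);
        ds.setMaxActive(maxActive); // dbcp 1.4 name; setMaxTotal in dbcp2
        ds.setMaxWait(10_000);      // ms: fail fast when the pool is exhausted
        return ds;
    }
}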

Root cause

Many types of API endpoints require a database connection. A small pool of connections is re-used across all requests. If the pool is full, a request will time out without ever getting a connection. Our logs contain many stack traces like the following; the error can be traced to the maxActive setting of dbcp (renamed to maxTotal in newer versions).

2024-03-08 23:10:06,057 [qtp911501858-581359] ERROR cloudgene.mapred.database.DownloadDao - update download failed.
org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:114) ~[commons-dbcp-1.4.jar:1.4]
at org.apache.commons.dbcp.BasicDataSource.getConnection(BasicDataSource.java:1044) ~[commons-dbcp-1.4.jar:1.4]
at org.apache.commons.dbutils.AbstractQueryRunner.prepareConnection(AbstractQueryRunner.java:319) ~[commons-dbutils-1.7.jar:1.7]
at org.apache.commons.dbutils.QueryRunner.update(QueryRunner.java:495) ~[commons-dbutils-1.7.jar:1.7]
at cloudgene.mapred.database.util.JdbcDataAccessObject.update(JdbcDataAccessObject.java:93) ~[cloudgene.jar:?]
at cloudgene.mapred.database.DownloadDao.update(DownloadDao.java:63) ~[cloudgene.jar:?]
at cloudgene.mapred.api.v2.jobs.ShareResults.get(ShareResults.java:48) ~[cloudgene.jar:?]
at sun.reflect.GeneratedMethodAccessor210.invoke(Unknown Source) ~[?:?]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_402]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_402]
at org.restlet.resource.ServerResource.doHandle(ServerResource.java:511) ~[org.restlet-2.3.12.jar:?]
at org.restlet.resource.ServerResource.get(ServerResource.java:723) ~[org.restlet-2.3.12.jar:?]
at org.restlet.resource.ServerResource.doHandle(ServerResource.java:603) ~[org.restlet-2.3.12.jar:?]
at org.restlet.resource.ServerResource.doNegotiatedHandle(ServerResource.java:662) ~[org.restlet-2.3.12.jar:?]
at org.restlet.resource.ServerResource.doConditionalHandle(ServerResource.java:348) ~[org.restlet-2.3.12.jar:?]
at org.restlet.resource.ServerResource.handle(ServerResource.java:1020) ~[org.restlet-2.3.12.jar:?]
at org.restlet.resource.Finder.handle(Finder.java:236) ~[org.restlet-2.3.12.jar:?]
at org.restlet.routing.Filter.doHandle(Filter.java:150) ~[org.restlet-2.3.12.jar:?]
at org.restlet.routing.Filter.handle(Filter.java:197) ~[org.restlet-2.3.12.jar:?]
at org.restlet.routing.Router.doHandle(Router.java:422) ~[org.restlet-2.3.12.jar:?]
at org.restlet.routing.Router.handle(Router.java:641) ~[org.restlet-2.3.12.jar:?]
at org.restlet.routing.Filter.doHandle(Filter.java:150) ~[org.restlet-2.3.12.jar:?]
at org.restlet.routing.Filter.handle(Filter.java:197) ~[org.restlet-2.3.12.jar:?]
at org.restlet.routing.Filter.doHandle(Filter.java:150) ~[org.restlet-2.3.12.jar:?]
at org.restlet.routing.Filter.handle(Filter.java:197) ~[org.restlet-2.3.12.jar:?]
at org.restlet.routing.Filter.doHandle(Filter.java:150) ~[org.restlet-2.3.12.jar:?]
at org.restlet.engine.application.StatusFilter.doHandle(StatusFilter.java:140) ~[org.restlet-2.3.12.jar:?]
at org.restlet.routing.Filter.handle(Filter.java:197) ~[org.restlet-2.3.12.jar:?]
at org.restlet.routing.Filter.doHandle(Filter.java:150) ~[org.restlet-2.3.12.jar:?]
at org.restlet.routing.Filter.handle(Filter.java:197) ~[org.restlet-2.3.12.jar:?]
at org.restlet.engine.CompositeHelper.handle(CompositeHelper.java:202) ~[org.restlet-2.3.12.jar:?]
at org.restlet.engine.application.ApplicationHelper.handle(ApplicationHelper.java:77) ~[org.restlet-2.3.12.jar:?]
at org.restlet.Application.handle(Application.java:385) ~[org.restlet-2.3.12.jar:?]
at org.restlet.routing.Filter.doHandle(Filter.java:150) ~[org.restlet-2.3.12.jar:?]
at org.restlet.routing.Filter.handle(Filter.java:197) ~[org.restlet-2.3.12.jar:?]
at org.restlet.routing.Router.doHandle(Router.java:422) ~[org.restlet-2.3.12.jar:?]
at org.restlet.routing.Router.handle(Router.java:641) ~[org.restlet-2.3.12.jar:?]
at org.restlet.routing.Filter.doHandle(Filter.java:150) ~[org.restlet-2.3.12.jar:?]
at org.restlet.routing.Filter.handle(Filter.java:197) ~[org.restlet-2.3.12.jar:?]
at org.restlet.routing.Router.doHandle(Router.java:422) ~[org.restlet-2.3.12.jar:?]
at org.restlet.routing.Router.handle(Router.java:641) ~[org.restlet-2.3.12.jar:?]
at org.restlet.routing.Filter.doHandle(Filter.java:150) ~[org.restlet-2.3.12.jar:?]
at org.restlet.engine.application.StatusFilter.doHandle(StatusFilter.java:140) ~[org.restlet-2.3.12.jar:?]
at org.restlet.routing.Filter.handle(Filter.java:197) ~[org.restlet-2.3.12.jar:?]
at org.restlet.routing.Filter.doHandle(Filter.java:150) ~[org.restlet-2.3.12.jar:?]
at org.restlet.routing.Filter.handle(Filter.java:197) ~[org.restlet-2.3.12.jar:?]
at org.restlet.engine.CompositeHelper.handle(CompositeHelper.java:202) ~[org.restlet-2.3.12.jar:?]
at org.restlet.Component.handle(Component.java:408) ~[org.restlet-2.3.12.jar:?]
at org.restlet.Server.handle(Server.java:507) ~[org.restlet-2.3.12.jar:?]
at org.restlet.engine.connector.ServerHelper.handle(ServerHelper.java:63) ~[org.restlet-2.3.12.jar:?]
at org.restlet.engine.adapter.HttpServerHelper.handle(HttpServerHelper.java:143) ~[org.restlet-2.3.12.jar:?]
at org.restlet.ext.jetty.JettyServerHelper$WrappedServer.handle(JettyServerHelper.java:256) ~[org.restlet.ext.jetty-2.3.12.jar:?]
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:311) ~[jetty-server-9.2.14.v20151106.jar:9.2.14.v20151106]
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257) ~[jetty-server-9.2.14.v20151106.jar:9.2.14.v20151106]
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544) ~[jetty-io-9.2.14.v20151106.jar:9.2.14.v20151106]
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635) ~[jetty-util-9.2.14.v20151106.jar:9.2.14.v20151106]
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555) ~[jetty-util-9.2.14.v20151106.jar:9.2.14.v20151106]
at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_402]
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1134) ~[commons-pool-1.5.4.jar:1.5.4]
at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106) ~[commons-dbcp-1.4.jar:1.4]
... 57 more

Manually Confirming User Accounts

We run a few Cloudgene instances, and quite often the confirmation emails end up in junk mail or are not delivered at all by the big webmail providers. Edu accounts seem especially prone to silent non-delivery, which is a problem since most of our userbase registers with .edu addresses. There are things we can potentially do about delivery rates going forward, but in the meantime I want to manually confirm user accounts. (We don't want to give up the confirmation email: we have been hit with spam attacks on other services in the past, where bots create an account and then plaster text everywhere for SEO, even in job names.) The proportion of unconfirmed accounts is now probably 30-40%, and users who fail to confirm often retry with two or three different emails; it has been becoming more of a problem over time.

Maybe I'm missing something obvious, but I can't figure out how to do this. I don't see a way in the interface, though the check mark on the left side of the users list seems to indicate whether an account has been confirmed. I might be missing something in the UI; I almost feel like I've asked this before, but I can't find it now.

I'm using the Docker container and am still on the H2 database. I thought that wasn't a big deal because there isn't much user info and the dataset is small, but H2 seems quite obnoxious now that I'm looking at this. My plan was to manually mark the accounts in the database. I'm not unfamiliar with databases (MariaDB, MySQL, Oracle), but the H2 setup has me scratching my head about how to even start its console, especially since it is running in embedded mode. Is it even exposed outside of the Cloudgene app?

Alternatively, is there a way to migrate the database from H2 to MySQL? After that, I should be able to make the appropriate modifications there.
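
For reference, a hedged sketch of opening an embedded H2 database file with H2's bundled Shell tool (the database path and credentials are assumptions, and the Cloudgene server must be stopped first, since an embedded H2 file allows only one connected process):

java -cp h2-*.jar org.h2.tools.Shell \
    -url jdbc:h2:/path/to/cloudgene/database -user sa -password ""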

Non-Admin Job Logfiles

When a job fails, the message given to the user is: "Execution failed. Please have a look at the logfile for details."

Non-admin users don't seem to have access to the logfiles, so this message should probably be changed for them, perhaps to something like: "Execution failed. Please contact the server administrators for help if you believe this job should have completed successfully."

Alternatively, give them access to the logfiles, though I would assume that could be a security risk.

Edit: Version: 1.30.5 (built by travis on 2018-09-14T07:42:13Z)
