andreasheger / cgatreport

A report generator using sphinx.

Home Page: https://www.cgat.org/downloads/public/CGATReport/documentation/

License: Other

Python 69.32% Makefile 0.36% JavaScript 29.65% R 0.02% Shell 0.15% HTML 0.49%

cgatreport's People

iansudbery tim-hu sudlab andreashegergenomics stjordanis

cgatreport's Issues

caching of Tracker code is broken

... or unreliable:

“make html” doesn’t seem to notice when I’ve updated my tracker code. The program always says “0 trackers changed”; despite this it sometimes re-runs the tracker, but not always, even if the code has in fact changed. I now always delete the _build directory (deleting _cache alone isn’t enough) before rerunning; it’s a fine workaround but non-intuitive.

Python 3 Numeric types need redefinition

  File "/Users/gerton/anaconda/lib/python3.4/site-packages/CGATReport-0.5.1-py3.4.egg/CGATReport/DataTree.py", line 6, in <module>
    from CGATReport import Utils
  File "/Users/gerton/anaconda/lib/python3.4/site-packages/CGATReport-0.5.1-py3.4.egg/CGATReport/Utils.py", line 47, in <module>
    NumberTypes = (int, long, float, int, type(None),
NameError: name 'long' is not defined
make: *** [html] Error 1
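
A minimal sketch of one possible fix, assuming the tuple in Utils.py only needs to enumerate numeric types. Python 3 merged `long` into `int`, so the name can be aliased where it is missing. The tuple below is a shortened, hypothetical version: the original continues past `type(None),` and the rest is not shown in the traceback. `is_number` is an illustrative helper, not part of CGATReport.

```python
import sys

# Python 3 merged `long` into `int`; alias it only where the name is missing.
if sys.version_info[0] >= 3:
    long = int

# Hypothetical, shortened version of the tuple from the traceback;
# the original continues past `type(None),` and is not reproduced here.
NumberTypes = (int, long, float, type(None))


def is_number(value):
    """Return True if value is an instance of one of the accepted types."""
    return isinstance(value, NumberTypes)
```

With the alias in place the module imports on both Python 2 and Python 3 without a NameError.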

Rendering of separate tables as html files

I have a tracker that renders tables beyond the length at which tables are inserted in the report page.
The tracker works via cgatreport-test, but when included in a page gives the following error:

CGATReportError

    stage: collection

    exception: IOError

    message: [Errno 2] No such file or directory: ‘_static/report_directive/Motifs_DREMELocations@table@f723a5d056535c5ec0c0a9e7d7141e20_Alyref-FLAG-R3/GMAS.html’

    traceback:

        File “/ifs/devel/Ian/python/lib/python2.7/site-packages/CGATReport-0.2.1-py2.7.egg/CGATReport/report_directive.py”, line 370, in run

            links=links))
        File “/ifs/devel/Ian/python/lib/python2.7/site-packages/CGATReport-0.2.1-py2.7.egg/CGATReportPlugins/HTMLPlugin/__init__.py”, line 49, in collect

            outf = open(outputpath, “w”)

I assume this is because the track/slice using '/' as the separator is being interpreted as a directory: the render will fail to save the file because the directory won't exist.
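
If that assumption is right, one way around it is to sanitise the track/slice name before using it as a file name, and to make sure the target directory exists. The sketch below illustrates the idea only; `safe_component` and `open_for_writing` are hypothetical helpers, not functions in CGATReport, and it assumes Python 3 (for `exist_ok`).

```python
import os
import tempfile


def safe_component(name, replacement="_"):
    """Replace path separators so a track/slice name maps to one file name."""
    for sep in ("/", "\\"):
        name = name.replace(sep, replacement)
    return name


def open_for_writing(directory, name, suffix=".html"):
    """Open <directory>/<sanitised name><suffix> for writing.

    The directory is created first if it does not exist yet.
    """
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, safe_component(name) + suffix)
    return open(path, "w")


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "report_directive")
    with open_for_writing(target, "Alyref-FLAG-R3/GMAS") as outf:
        outf.write("<html></html>")
    print(os.listdir(target))
```

The alternative is to keep the '/' and create the intermediate directory, which preserves the name but changes the layout of the output directory.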

Ian

Setting titles in excel spreadsheets

Add functionality to set titles in spreadsheets. Currently, if the whole data set is one spreadsheet,
the title is empty.

AttributeError: 'tuple' object has no attribute 'append'

I'm getting this error on running one of my trackers:

Traceback (most recent call last):
  File "/ifs/devel/Ian/python/bin/cgatreport-test", line 9, in <module>
    load_entry_point('CGATReport', 'console_scripts', 'cgatreport-test')()
  File "/ifs/apps/apps/python-2.7.11-gcc-5.2.1/lib/python2.7/site-packages/CGATReport/test.py", line 508, in main
    result = dispatcher(**kwargs)
  File "/ifs/apps/apps/python-2.7.11-gcc-5.2.1/lib/python2.7/site-packages/CGATReport/Dispatcher.py", line 677, in __call__
    self.data = DataTree.asDataFrame(self.tree)
  File "/ifs/apps/apps/python-2.7.11-gcc-5.2.1/lib/python2.7/site-packages/CGATReport/DataTree.py", line 468, in asDataFrame
    df = concatDataFrames(dataframes, index_tuples)
  File "/ifs/apps/apps/python-2.7.11-gcc-5.2.1/lib/python2.7/site-packages/CGATReport/DataTree.py", line 239, in concatDataFrames
    df = pandas.concat(dataframes, keys=index_tuples)
  File "/ifs/devel/Ian/python/lib/python2.7/site-packages/pandas/tools/merge.py", line 1325, in concat
    copy=copy)
  File "/ifs/devel/Ian/python/lib/python2.7/site-packages/pandas/tools/merge.py", line 1464, in __init__
    self.new_axes = self._get_new_axes()
  File "/ifs/devel/Ian/python/lib/python2.7/site-packages/pandas/tools/merge.py", line 1552, in _get_new_axes
    new_axes[self.axis] = self._get_concat_axis()
  File "/ifs/devel/Ian/python/lib/python2.7/site-packages/pandas/tools/merge.py", line 1609, in _get_concat_axis
    self.levels, self.names)
  File "/ifs/devel/Ian/python/lib/python2.7/site-packages/pandas/tools/merge.py", line 1675, in _make_concat_multiindex
    levels.append(categories)
AttributeError: 'tuple' object has no attribute 'append'

To replicate run

cgatreport-test --path=../src/pipeline_docs/pipeline_proj028/trackers/ -t FirstLastExonCount

/ifs/projects/proj028/project_pipeline_iCLIP5

code of the tracker is here:

/ifs/devel/projects/proj028/pipeline_docs/pipeline_proj028/trackers/NormalisedProfiles.py

Additional data files do not work with ~

When running a report from a script whose path contains ~, downloadable content such as code will not appear.

Make path names absolute and resolve user path.
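
A minimal sketch of that resolution step using only the standard library. `resolve_path` is a hypothetical helper name, not an existing CGATReport function.

```python
import os


def resolve_path(path):
    """Expand a leading ~ and return an absolute, normalised path."""
    return os.path.abspath(os.path.expanduser(path))


if __name__ == "__main__":
    print(resolve_path("~/reports/trackers/MyTracker.py"))
    print(resolve_path("trackers/MyTracker.py"))
```

`os.path.expanduser` has to run before `os.path.abspath`; otherwise the ~ is treated as an ordinary directory name relative to the current working directory.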

Extension error

Hello,

Using cgat-report version 0.7.6.1 with sphinx version 1.7.0 generates the following error:

Extension error:
Could not import extension CGATReport.errors_directive (exception: No module named 'sphinx.util.compat')

The problem doesn't seem to happen with sphinx version 1.6.7

Best regards,
Sebastian

table renders have no output when large

I don't know when this started happening, but table renders above the max_rows are not outputting the linked html or xls, and the outputted rst is missing the anchor to the external file:

e.g.:

$ cat "_static/report_directive/RetainedIntrons_DetainedChunkSplicing@[email protected]"
`803205 x 12 table <#$html $#>`__




::

Aggregate transformers for histograms

Aggregate transformers for histograms are not functioning.

This is because the "converters" in TransformerAggregate expect lists/Vectors/Series and are being passed full dataframes by TransformerHistogram.

I suggest that this is fixed by using apply to apply the "converter" to each of the columns of the dataframe (excluding bin)

a la:

def transform(self, data):
    debug("%s: called" % (str(self)))

    df = self.toHistogram(data)
    # move "bin" into the index so the converters only see the value columns
    df = df.set_index("bin", append=True)
    for converter in self.mConverters:
        # apply each converter to one column at a time
        df = df.apply(converter, axis=0)

    df.reset_index(level="bin", inplace=True)
    debug("%s: completed" % (str(self)))
    return df

Pull request incoming...

output of GalleryStatus table does not render images

fix, seems to be an issue that output is prefixed by index in dataframe.

Missing templates

The templates are missing from my version:
(copied by typing, running in VM...)

cgatreport-quickstart -d newproject
.....
IOerror: [Errno 2] No such file .... /templates/Makefile

The whole CGATReport/templates directory is missing.
I was trying to install on Debian/testing.
I installed most of the required packages through apt-get, then "pip install cgatreport".
I end up with an empty Makefile in the newproject directory.

Dataframe shape when using getValues

If I have a tracker that returns a list of values, via self.getValues, I usually get a DataFrame with a multi-level index (track, slice) and a single column. This can then be passed by track or by slice to various downstream transforms or renders. However, if by any chance all the track, slice combinations have the same length (e.g. I have one track and two slices, and each slice returns 100 values) then I get a DataFrame with a single level index (track) and a column for each slice.

This makes it very difficult to know what sort of dataframe my tracker is going to return: sometimes it's a multi-indexed series and sometimes a multi-columned dataframe, and there is no way of telling which will be returned.
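
To illustrate what a predictable shape could look like, here is a sketch that always produces a long-format frame with a (track, slice) index and a single "value" column, whether or not the lists have equal length. It is not CGATReport's conversion code; `values_to_frame` and its input format (a dict keyed by (track, slice) tuples) are assumptions made for the example.

```python
import pandas


def values_to_frame(values_by_key):
    """Build a long-format frame from {(track, slice): [values, ...]}.

    The result always has a (track, slice) index and one "value" column,
    regardless of how many values each combination contributes.
    """
    records = [
        (track, slice_, value)
        for (track, slice_), values in values_by_key.items()
        for value in values
    ]
    frame = pandas.DataFrame(records, columns=["track", "slice", "value"])
    return frame.set_index(["track", "slice"])


if __name__ == "__main__":
    equal = {("t1", "s1"): [1, 2, 3], ("t1", "s2"): [4, 5, 6]}
    ragged = {("t1", "s1"): [1, 2, 3], ("t1", "s2"): [4]}
    print(values_to_frame(equal))
    print(values_to_frame(ragged))
```

Both calls return the same layout, so a downstream transform can group by track or by slice without checking which form it received.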

Family 'sans' not included in postscript() device

I got a new error I've not seen before in my report. It's from a rather complicated user renderer, actually something I was working on as a possible tracker plugin. The tracker creates plots using the Gviz genomic visualisation library in R. The error is:


CGATReportError

        stage: collection
        exception: rpy2.rinterface.RRuntimeError
        message: Error in grid.Call.graphics(L_text, as.graphicsAnnot(x$label), x$x, x$y, :

family ‘sans’ not included in postscript() device

        traceback:

            File “/ifs/apps/apps/python-2.7.1/lib/python2.7/site-packages/CGATReport-0.2-py2.7.egg/CGATReport/report_directive.py”, line 358, in run

                ‘notebook_url’: linked_notebookname}))
            File “/ifs/apps/apps/python-2.7.1/lib/python2.7/site-packages/CGATReport-0.2-py2.7.egg/CGATReportPlugins/RPlotPlugin/__init__.py”, line 114, in collect

                onefile=True)
            File “/ifs/apps/apps/python-2.7.1/lib/python2.7/site-packages/rpy2/robjects/functions.py”, line 166, in __call__

                return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
            File “/ifs/apps/apps/python-2.7.1/lib/python2.7/site-packages/rpy2/robjects/functions.py”, line 99, in __call__

                res = super(Function, self).__call__(*new_args, **new_kwargs)

I assume that this happens when CGATReport is trying to save the user-generated plot as an eps file, and it contains a font that isn't available.

You can see this for yourself by running the CircularCandidates tracker in the CircularCandidates module in project 28 (for e.g. /ifs/projects/proj028/project_pipeline_full2).

enable parallel native sphinx support, deprecate build pre-processor

Database backend logic in TrackerSQL

Currently the __init__ routine of TrackerSQL uses the following logic when deciding what to use as the database:

Checks if a backend argument is passed to __init__.
If not, checks if an sql_backend parameter is passed to __init__.
If not, checks if there is a class member self.backend; if there isn't, it uses PARAMS["report_sql_backend"].
However, if the last check fails (i.e. there IS a self.backend set), it sets the backend to sqlite:///./csvdb.

This logic means that if a self.backend is set by a mixin class, it will always be overwritten with sqlite:///./csvdb. What is more, sqlite:///./csvdb will only ever be used if there IS a value specified by self.backend.
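
A sketch of what a corrected precedence might look like, written as a standalone function so it can be read in isolation. The order used here (backend argument, then sql_backend, then the class member, then PARAMS, then the sqlite default) is an assumption about the intended behaviour, and `choose_backend` is a hypothetical name rather than code from TrackerSQL.

```python
DEFAULT_BACKEND = "sqlite:///./csvdb"


def choose_backend(backend=None, sql_backend=None, class_backend=None,
                   params=None):
    """Return the first backend that is set, in order of precedence.

    class_backend stands in for self.backend (e.g. set by a mixin class);
    params stands in for the PARAMS dictionary.
    """
    params = params or {}
    candidates = (
        backend,
        sql_backend,
        class_backend,
        params.get("report_sql_backend"),
    )
    for candidate in candidates:
        if candidate is not None:
            return candidate
    # Only reached when nothing at all has been configured.
    return DEFAULT_BACKEND


if __name__ == "__main__":
    # A backend set by a mixin class is no longer overwritten.
    print(choose_backend(class_backend="sqlite:///./other.db"))
    # The default applies only when every other source is empty.
    print(choose_backend())
```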

Pull request incoming.

Dealing with factors in pandas to R conversion.

By default, pandas to R dataframe conversion leaves columns with an AsIs class. This was causing problems, so we put in code to remove this. Unfortunately our code did this in the following manner:

Check if the type of the column is integer or double. If so, set class to numeric.
Check if the type of the column is character. If so, set class to character.
Otherwise return unchanged.

Unfortunately the new pandas to R converter converts strings into R factors. R factors have the type integer and the class factor, meaning our UnAsIs converts factors to integers, losing the text.

I propose we change the UnAsIs logic to simply remove the "AsIs" class from the list of classes for each column.
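
The proposed rule can be sketched in plain Python by modelling a column's R class vector as a list of strings. This only illustrates the logic; it makes no rpy2 calls, and `remove_asis` is a hypothetical name.

```python
def remove_asis(classes):
    """Return the class names without "AsIs", keeping the others in order."""
    return [name for name in classes if name != "AsIs"]


if __name__ == "__main__":
    # A factor column keeps its "factor" class, so its text levels survive.
    print(remove_asis(["AsIs", "factor"]))
    # A column without "AsIs" is returned unchanged.
    print(remove_asis(["numeric"]))
```

Because the rule never inspects the storage type, it cannot mistake a factor for an integer column.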

Pull request incoming.

Column alignment in scrollable tables

Hi Andreas,

If table type is "scrollable", then data columns don't always line up under their titles.

See for example, the mapping contexts table in the mapping report. Looks terribly unprofessional.

Ian

r-ggplot render in latest update.

[ians@cgath2 project_pipeline_iCLIP5]$ cgatreport-test --path=../src/pipeline_docs/pipeline_proj028/trackers/ -t ChimericReadProportions -r r-ggplot --o statement='aes(protein, fraction, col=replicate, ymin=fraction-1.96*se, ymax=fraction+1.96*se) + geom_bar(position="dodge", stat="identity") + theme_bw()' --o groupby=all
/ifs/apps/apps/python-2.7.11-gcc-5.2.1/lib/python2.7/site-packages/rpy2/robjects/functions.py:106: UserWarning: Find out what's changed in ggplot2 at
http://github.com/tidyverse/ggplot2/releases.

  res = super(Function, self).__call__(*new_args, **new_kwargs)
WARNING:root:could not find module ChimericReadProportions in None: msg=No module named ChimericReadProportions
.. ---- TEMPLATE START --------

.. report:: ChimericReads.ChimericReadProportions
   :render: r-ggplot
   :groupby: all
   :statement: aes(protein, fraction, col=replicate, ymin=fraction-1.96*se, ymax=fraction+1.96*se) + geom_bar(position="dodge", stat="identity") + theme_bw()

   add caption here

.. ---- TEMPLATE END ----------

.. ---- OUTPUT-----------------
b''
b''
b''
b''
b'#$ggplot $#'
b''
b''
Traceback (most recent call last):
  File "/ifs/devel/Ian/python/bin/cgatreport-test", line 9, in <module>
    load_entry_point('CGATReport', 'console_scripts', 'cgatreport-test')()
  File "/ifs/devel/Ian/python/lib/python2.7/site-packages/CGATReport-0.7.6-py2.7.egg/CGATReport/test.py", line 590, in main
    for r in rr:
TypeError: 'ResultBlock' object is not iterable

Remove plugin system to make CGATReport better packagable

Column ordering/returning dataframes

In sphinx report v1, returning an odict of odicts would result in a table with row and column order preserved.

However, since the move to dataframes, the conversion of the odict of odicts to Dataframe causes the loss of column order.

One solution to this is to build the result directly as a dataframe in the tracker like so:

result = pandas.DataFrame(columns=["col1","col2","col3"])

for line in data:
    # DataFrame.append returns a new frame, so the result must be reassigned
    result = result.append(odict([("col1", do_stuff1(line[1])),
                                  ("col2", line[2].some_attribute),
                                  ("col3", line[3] + line[0])]),
                           ignore_index=True)

return result

However, if for some track, data is empty, then an empty dataframe is returned and you get this error:

Traceback (most recent call last):
  File "/ifs/devel/Ian/python/bin/cgatreport-test", line 9, in <module>
    load_entry_point('CGATReport==0.2.1', 'console_scripts', 'cgatreport-test')()
  File "/ifs/devel/Ian/python/lib/python2.7/site-packages/CGATReport-0.2.1-py2.7.egg/CGATReport/test.py", line 488, in main
    result = dispatcher(**kwargs)
  File "/ifs/devel/Ian/python/lib/python2.7/site-packages/CGATReport-0.2.1-py2.7.egg/CGATReport/Dispatcher.py", line 645, in __call__
    self.data = DataTree.asDataFrame(self.tree)
  File "/ifs/devel/Ian/python/lib/python2.7/site-packages/CGATReport-0.2.1-py2.7.egg/CGATReport/DataTree.py", line 421, in asDataFrame
    assert min(levels) == max(levels)
AssertionError

This can be solved by changing return result to

if result.shape[0] > 0:
    return result

But this seems awfully clunky. Is there a better way? Can the conversion from odict to DataFrame be rewritten to preserve column order?
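
One alternative on the tracker side, sketched under assumptions: collect the rows in a list and build the DataFrame once with an explicit column list. This fixes the column order even when data is empty, and it avoids DataFrame.append, which was removed in pandas 2.0. `build_result` and the simplified row contents are hypothetical; whether CGATReport's asDataFrame then accepts an empty frame is a separate question that this sketch does not answer.

```python
from collections import OrderedDict

import pandas

COLUMNS = ["col1", "col2", "col3"]


def build_result(data):
    """Build a frame with a fixed column order from an iterable of rows."""
    rows = [
        OrderedDict([
            ("col1", line[1]),
            ("col2", line[2]),
            ("col3", line[3] + line[0]),
        ])
        for line in data
    ]
    # Passing columns= fixes the order, including for an empty row list.
    return pandas.DataFrame(rows, columns=COLUMNS)


if __name__ == "__main__":
    print(build_result([(1, 2, 3, 4), (5, 6, 7, 8)]))
    print(list(build_result([]).columns))
```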

enumerate over index in Renderer.py

I get this issue for multiple pipeline reports:

stage: rendering
exception: AttributeError
message: ‘Index’ object has no attribute ‘labels’

traceback:
File “/ifs/apps/apps/python-2.7.11-gcc-5.2.1/lib/python2.7/site-packages/CGATReport/Dispatcher.py”, line 758, in __call__
result = self.render()

File “/ifs/apps/apps/python-2.7.11-gcc-5.2.1/lib/python2.7/site-packages/CGATReport/Dispatcher.py”, line 595, in render
results.append(self.renderer(dataframe, path=()))

File “/ifs/apps/apps/python-2.7.11-gcc-5.2.1/lib/python2.7/site-packages/CGATReportPlugins/Renderer.py”, line 117, in __call__
result.extend(self.render(dataframe, path))

File “/ifs/apps/apps/python-2.7.11-gcc-5.2.1/lib/python2.7/site-packages/CGATReportPlugins/Plotter.py”, line 2265, in render
matrix, rows, columns = self.buildMatrix(work)

File “/ifs/apps/apps/python-2.7.11-gcc-5.2.1/lib/python2.7/site-packages/CGATReportPlugins/Renderer.py”, line 1229, in buildMatrix
drop = [x for x, y in enumerate(dataframe.index.labels)
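
The message suggests that buildMatrix received a dataframe with a flat Index, which has no per-level labels. A defensive sketch, assuming the caller can handle a flat index separately: check for a MultiIndex first, and prefer `codes`, the name that replaced `labels` in pandas 0.24. `index_codes` is a hypothetical helper, not part of CGATReport.

```python
import pandas


def index_codes(index):
    """Return per-level integer codes for a MultiIndex, or None otherwise."""
    if not isinstance(index, pandas.MultiIndex):
        # A flat Index has neither `labels` nor `codes`.
        return None
    # `labels` was renamed to `codes` in pandas 0.24; try the new name first.
    codes = getattr(index, "codes", None)
    if codes is None:
        codes = index.labels
    return codes


if __name__ == "__main__":
    flat = pandas.Index(["a", "b"])
    multi = pandas.MultiIndex.from_tuples([("a", 1), ("b", 2)])
    print(index_codes(flat))
    print(index_codes(multi))
```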

Need clarification for "transpose" flag in rst-table renderer

Hi,

I would like to ask: Is "transpose" flag in rst-table renderer working? Or otherwise... what is it doing?

My intention was to swap rows and columns in the table, so that each track (slice) would be rendered in its own column. But it seems that this flag has no effect.

Thanks.