scikit-hep / uproot3-methods

Pythonic behaviors for non-I/O related ROOT classes.

License: BSD 3-Clause "New" or "Revised" License

uproot3-methods

Pythonic mix-ins for ROOT classes.

This package is typically used as a dependency for uproot 3.x, to define methods on the classes that are automatically generated from ROOT files. This includes histograms (TH*) and physics objects like TLorentzVectors. The reason it's a separate library is so that we can add physics-specific functionality on a shorter timescale than we can update Uproot 3 itself, which is purely an I/O package.

Occasionally, this library is used without Uproot 3, as a way to make arrays of TLorentzVectors.

Note: this package is incompatible with awkward>=1.0 and uproot>=4.0! For Lorentz vectors, use vector. Since the versions of Awkward Array and Uproot that this is compatible with are deprecated, this library is deprecated as well.

Installation

Install uproot3-methods like any other Python package:

pip install uproot3-methods               # maybe with sudo or --user, or in virtualenv

Dependencies:

numpy (1.13.1+)
Awkward Array 0.x

Reference documentation

TBD.

Acknowledgements

Support for this work was provided by NSF cooperative agreement OAC-1836650 (IRIS-HEP), grant OAC-1450377 (DIANA/HEP) and PHY-1520942 (US-CMS LHC Ops).

Thanks especially to the gracious help of uproot3-methods contributors!

uproot3-methods's Issues

Bug in rotatex and family

If d is a TVector3 or TVector3Array:

d.rotatex(np.pi/2)

gives:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-52-a7bcb9559055> in <module>()
----> 1 d.rotatex(np.pi/2)

~/anaconda3/lib/python3.7/site-packages/uproot_methods/classes/TVector3.py in rotatex(self, angle)
     98 
     99     def rotatex(self, angle):
--> 100         return self.rotate_axis(Methods(1.0, 0.0, 0.0), angle)
    101 
    102     def rotatey(self, angle):

TypeError: Methods() takes no arguments

Also, there's no way to use a rotation matrix.
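
For reference, the rotation that rotatex is meant to perform can be written out by hand. The sketch below works on plain (x, y, z) tuples rather than TVector3 objects, and assumes the standard right-handed convention; `rotate_x` and `apply_matrix` are hypothetical helper names, not part of uproot_methods.

```python
import math


def rotate_x(vector, angle):
    """Rotate an (x, y, z) tuple about the x axis by `angle` radians."""
    x, y, z = vector
    c, s = math.cos(angle), math.sin(angle)
    # Rows of the right-handed rotation matrix about x:
    # [1, 0, 0], [0, c, -s], [0, s, c]
    return (x, c * y - s * z, s * y + c * z)


def apply_matrix(matrix, vector):
    """Multiply a 3x3 matrix (nested sequences) by an (x, y, z) tuple."""
    return tuple(sum(m * v for m, v in zip(row, vector)) for row in matrix)


# Rotating the y unit vector by 90 degrees about x gives the z unit vector
rotated = rotate_x((0.0, 1.0, 0.0), math.pi / 2)
```

A general rotation matrix can be applied the same way with `apply_matrix`, which is the kind of entry point the last sentence of this issue asks for.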

Package has undocumented dependency on `python-awkward-array`

As the title says. Please include in README.

ROOT::Math::LorentzVector

The documentation for ROOT master recommends using ROOT::Math::LorentzVector specializations instead of TLorentzVector (https://root.cern.ch/doc/master/classTLorentzVector.html). We can't do this with uproot until these are added to uproot_methods.

from_ptetaphi() should be from_ptetaphie()

TLorentzVector.from_ptetaphi()'s signature is actually pt, eta, phi, energy. There's from_ptetaphim() for pt, eta, phi, mass, so one would expect the "e" to be in the first function's name. This would be more consistent with ROOT's TLorentzVector. Can we just add from_ptetaphie() as a synonym?
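
To make the difference between the two signatures concrete, here is a minimal sketch of both conversions to Cartesian components using only the standard library. The function names are hypothetical and the code is not taken from uproot_methods; it also assumes a non-negative mass.

```python
import math


def cartesian_from_ptetaphie(pt, eta, phi, energy):
    """Convert (pt, eta, phi, energy) to (px, py, pz, energy)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    return (px, py, pz, energy)


def cartesian_from_ptetaphim(pt, eta, phi, mass):
    """Convert (pt, eta, phi, mass) to (px, py, pz, energy), assuming mass >= 0."""
    px, py, pz, _ = cartesian_from_ptetaphie(pt, eta, phi, 0.0)
    # Energy follows from E^2 = |p|^2 + m^2
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (px, py, pz, energy)
```

The only difference is how the fourth argument is interpreted, which is why a name ending in "e" would be less surprising for the energy variant.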

NameError numbers.Real in TH1.fillallw

TH1.fillallw uses numbers.Real but doesn't import the numbers module

In [33]: x.fillallw(np.random.randn(10),np.random.randn(10))
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-33-ddea8ee61580> in <module>()
----> 1 x.fillallw(np.random.randn(10),np.random.randn(10))

~/Software/Python/anaconda3/lib/python3.6/site-packages/uproot_methods/classes/TH1.py in fillallw(self, data, weights)
    183             data = numpy.array(data)
    184
--> 185         if isinstance(weights, numbers.Real):
    186             weights = numpy.empty_like(data)
    187

NameError: name 'numbers' is not defined

Also, out of curiosity, for a non array argument should empty_like be replaced with ones_like multiplied by the number?

weights = numpy.ones_like(data) * weights
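
A minimal sketch of the behaviour suggested above, written as a standalone function rather than as a patch to TH1.py. `broadcast_weights` is a hypothetical name; the point is that `numbers` must be imported and that `ones_like` (unlike `empty_like`, which leaves the memory uninitialised) gives every entry the requested weight.

```python
import numbers

import numpy


def broadcast_weights(data, weights):
    """Return one float weight per entry of `data`."""
    data = numpy.asarray(data)
    if isinstance(weights, numbers.Real):
        # A single number becomes the same weight for every entry
        return numpy.ones_like(data, dtype=float) * weights
    return numpy.asarray(weights, dtype=float)


w = broadcast_weights([0.1, 0.5, 0.9], 2.0)
```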

Deprecation warning due to invalid escape sequences

Deprecation warnings are raised due to invalid escape sequences. This can be fixed by using raw strings or by escaping the backslashes. pyupgrade can also do the conversion automatically: https://github.com/asottile/pyupgrade/

find . -iname '*.py' | grep -Ev 'test.py' | xargs -P4 -I{} python3.8 -Wall -m py_compile {}
./uproot_methods/profiles/__init__.py:9: DeprecationWarning: invalid escape sequence \.
  m = re.match("^([a-zA-Z_][a-zA-Z_0-9]*)(\.[a-zA-Z_][a-zA-Z_0-9]*)*$", name)
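
The raw-string form of the same pattern is sketched below. The regular expression is copied from the warning above; `NAME_PATTERN` and `is_dotted_name` are hypothetical names used only for this illustration.

```python
import re

# The r"" prefix keeps the backslash in "\." literal, so no
# DeprecationWarning is raised when the module is compiled.
NAME_PATTERN = re.compile(r"^([a-zA-Z_][a-zA-Z_0-9]*)(\.[a-zA-Z_][a-zA-Z_0-9]*)*$")


def is_dotted_name(name):
    """Return True if `name` is a dotted sequence of Python identifiers."""
    return NAME_PATTERN.match(name) is not None
```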

Setters for TH1?

Could TH1 (or other classes) also grow some setters, so that things like this become possible?

aTH1 = TH1()
aTH1.values = np.array([2,3,2])
aTH1.edges = np.array([0,1,2,3])
aTH1.variances = np.array([4,5,1])

Consistent method resolution order (MRO) issue with TLorentzVector

I'm unsure if this belongs here or perhaps with awkward (though I think it is here).

The following reduced snippet:

import awkward
import uproot_methods
ex = awkward.JaggedArray.fromiter([ [1.0], [2.0, 3.0] ])
p4 = uproot_methods.TLorentzVectorArray.from_ptetaphim(ex, ex, ex, ex)
p4crossp4 = p4.cross(p4)
awkward.topandas(p4crossp4)

Throws

Traceback (most recent call last):
  File "test.py", line 9, in <module>
    awkward.topandas(p4crossp4)
  File "/home/meloam/00-projects/tauid/venv/lib/python3.6/site-packages/awkward/util.py", line 225, in topandas
    return pandas.DataFrame({n: out[n] for n in out.columns}, columns=out.columns)
  File "/home/meloam/00-projects/tauid/venv/lib/python3.6/site-packages/awkward/util.py", line 225, in <dictcomp>
    return pandas.DataFrame({n: out[n] for n in out.columns}, columns=out.columns)
  File "/home/meloam/00-projects/tauid/venv/lib/python3.6/site-packages/awkward/array/jagged.py", line 514, in __getitem__
    cls = awkward.array.objects.Methods.maybemixin(type(content), self.JaggedArray)
  File "/home/meloam/00-projects/tauid/venv/lib/python3.6/site-packages/awkward/array/objects.py", line 29, in maybemixin
    return type(awkwardtype.__name__ + "Methods", allbases, {})
TypeError: Cannot create a consistent method resolution
order (MRO) for bases AwkwardSeries, ExtensionArray, JaggedSeries

with

awkward==0.12.21
awkward-numba==0.12.21
uproot==3.11.7
uproot-methods==0.7.4

There's a couple of different ways to trigger whatever's happening, but they all involve taking the cross product of a TLorentzVector (the same operation with floats succeeds).

Recursion error with *=

If you use *, the following works:

a = TVector3Array([1,2], [2,3], [3,4])
b = np.array([2,3])
a = a * b

But, with *=, an odd error occurs:

a *= b

---------------------------------------------------------------------------
RecursionError                            Traceback (most recent call last)
<ipython-input-71-0d04f348f081> in <module>()
----> 1 a *= b

~/anaconda3/lib/python3.7/site-packages/numpy/lib/mixins.py in func(self, other)
     41     """Implement an in-place binary method with a ufunc, e.g., __iadd__."""
     42     def func(self, other):
---> 43         return ufunc(self, other, out=(self,))
     44     func.__name__ = '__i{}__'.format(name)
     45     return func

~/anaconda3/lib/python3.7/site-packages/uproot_methods/classes/TVector3.py in __array_ufunc__(self, ufunc, method, *inputs, **kwargs)
    187 
    188         else:
--> 189             return super(ArrayMethods, self).__array_ufunc__(ufunc, method, *inputs, **kwargs)
    190 
    191 class Methods(Common, uproot_methods.common.TVector.Methods, uproot_methods.base.ROOTMethods):

~/anaconda3/lib/python3.7/site-packages/awkward/array/objects.py in __array_ufunc__(self, ufunc, method, *inputs, **kwargs)
    216                 contents.append(x)
    217 
--> 218         result = getattr(ufunc, method)(*contents, **kwargs)
    219 
    220         if awkward.util.iscomparison(ufunc):

~/anaconda3/lib/python3.7/site-packages/awkward/array/table.py in __array_ufunc__(self, ufunc, method, *inputs, **kwargs)
    623         tuplelen = None
    624         for n, x in inputsdict.items():
--> 625             newcolumns[n] = getattr(ufunc, method)(*x, **kwargs)
    626 
    627             if tuplelen is None:

~/anaconda3/lib/python3.7/site-packages/uproot_methods/classes/TVector3.py in __array_ufunc__(self, ufunc, method, *inputs, **kwargs)
    187 
    188         else:
--> 189             return super(ArrayMethods, self).__array_ufunc__(ufunc, method, *inputs, **kwargs)
    190 
    191 class Methods(Common, uproot_methods.common.TVector.Methods, uproot_methods.base.ROOTMethods):

~/anaconda3/lib/python3.7/site-packages/awkward/array/objects.py in __array_ufunc__(self, ufunc, method, *inputs, **kwargs)
    216                 contents.append(x)
    217 
--> 218         result = getattr(ufunc, method)(*contents, **kwargs)
    219 
    220         if awkward.util.iscomparison(ufunc):

... last 2 frames repeated, from the frame below ...

~/anaconda3/lib/python3.7/site-packages/uproot_methods/classes/TVector3.py in __array_ufunc__(self, ufunc, method, *inputs, **kwargs)
    187 
    188         else:
--> 189             return super(ArrayMethods, self).__array_ufunc__(ufunc, method, *inputs, **kwargs)
    190 
    191 class Methods(Common, uproot_methods.common.TVector.Methods, uproot_methods.base.ROOTMethods):

RecursionError: maximum recursion depth exceeded in comparison

broken repr on ObjectArrays from ChunkedArrays and inconsistent behavior of choose()

~~i'm not sure whether this is expected or not, but when reading lazily from a tree, and thus operating with ChunkedArrays I can't seem to create ObjectArrays~~

Rather than the above, it seems that the __repr__ method is broken for object arrays created from chunked arrays.

Also, I noticed that .choose(n) seems to behave differently depending on whether the ObjectArray was created from ChunkedArrays vs JaggedArrays. Is this expected?

TLorentzVectorArray.from_ptetaphim without unpacking

Right now I have a structured array which includes pt, eta, phi, and mass. I'm using this to build TLorentzVectors by unpacking

pt, eta, phi, m = [jets[x] for x in ['pt', 'eta', 'phi', 'm']]
flat = TLorentzVector.TLorentzVectorArray.from_ptetaphim(pt, eta, phi, m)

This seems to create an intermediate copy unnecessarily. Is there some way to build this directly from an N × 4 numpy array?
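
As a side note on the copy concern: in NumPy, selecting a single field of a structured array, or a column of a 2D array, returns a view rather than a copy. The sketch below checks this with `numpy.shares_memory`; the array contents are made up, and it says nothing about whether from_ptetaphim copies its inputs internally.

```python
import numpy

# Structured array with the four fields mentioned in the question
jets = numpy.zeros(3, dtype=[("pt", "f8"), ("eta", "f8"), ("phi", "f8"), ("m", "f8")])
pt = jets["pt"]
pt_is_view = numpy.shares_memory(jets, pt)

# Plain N x 4 array: transposing gives four column views that can be unpacked
table = numpy.arange(12, dtype=float).reshape(3, 4)
pt2, eta2, phi2, m2 = table.T
column_is_view = numpy.shares_memory(table, pt2)
```

The views are strided rather than contiguous, so a downstream function may still copy them if it needs contiguous memory.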

Multidimensional TLorentzVectorArray doesn't work

I can create them:

x = np.ones((5,2))
y = np.ones((5,2))
z = np.ones((5,2))
t = 5*np.ones((5,2))
arr = TLorentzVectorArray(x,y,z,t)

And print their mass:

arr.mass

[[4.69041576 4.69041576]
 [4.69041576 4.69041576]
 [4.69041576 4.69041576]
 [4.69041576 4.69041576]
 [4.69041576 4.69041576]]

But I can't print the array itself:

print(arr)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-59-e3eb363b4781> in <module>
----> 1 print(arr)

/usr/lib/python3.7/site-packages/awkward/array/base.py in __str__(self)
     96     def __str__(self):
     97         if len(self) <= 6:
---> 98             return "[{0}]".format(" ".join(self._util_arraystr(x) for x in self.__iter__(checkiter=False)))
     99 
    100         else:

/usr/lib/python3.7/site-packages/awkward/array/base.py in <genexpr>(.0)
     96     def __str__(self):
     97         if len(self) <= 6:
---> 98             return "[{0}]".format(" ".join(self._util_arraystr(x) for x in self.__iter__(checkiter=False)))
     99 
    100         else:

/usr/lib/python3.7/site-packages/awkward/array/objects.py in __iter__(self, checkiter)
    176             self._checkiter()
    177         for x in self._content:
--> 178             yield self.generator(x, *self._args, **self._kwargs)
    179 
    180     def __getitem__(self, where):

/usr/lib/python3.7/site-packages/uproot_methods/classes/TLorentzVector.py in <lambda>(row)
    124 class ArrayMethods(Common, uproot_methods.base.ROOTMethods):
    125     def _initObjectArray(self, table):
--> 126         self.awkward.ObjectArray.__init__(self, table, lambda row: TLorentzVector(row["fX"], row["fY"], row["fZ"], row["fE"]))
    127 
    128     def __awkward_serialize__(self, serializer):

/usr/lib/python3.7/site-packages/uproot_methods/classes/TLorentzVector.py in __init__(self, x, y, z, t)
    897 class TLorentzVector(Methods):
    898     def __init__(self, x, y, z, t):
--> 899         self._fP = uproot_methods.classes.TVector3.TVector3(float(x), float(y), float(z))
    900         self._fE = float(t)
    901 

TypeError: only size-1 arrays can be converted to Python scalars

Nor can I add along only the inner axis, since sum takes no axis argument.

rotate method for TVector2 and TVector2Array

I see rotate method for TVector2 and TVector2Array missing and marked as TODO:
https://github.com/scikit-hep/uproot-methods/blob/master/uproot_methods/classes/TVector2.py#L20-L21

Is it possible to add that? I see you mentioned it in #27. Thanks!

TH3 not yet implemented

Hi,
Is there a deep reason why TH3 is not yet implemented, or was there simply no interest until now?
Matt

Status bits for a TH1?

Hi! I was wondering if there is any way to get status bits for a TH1? I'm particularly interested in the value of kIsAverage. Thanks!

TGraphAsymmErrors unusable.

Hi,

Currently, with uproot, I cannot do anything with a TGraphAsymmErrors object. Probably TGraph objects are also not handled by uproot yet (?). Would it be possible to add support for this class?

Matt

TLorentzVectorArray.p3 and TLorentzVectorArray.boostp3 broken, TLorentzVectorArray.empty_like returns invalid

Calling empty_like on a TLorentzVectorArray gives invalid results. The returning object's _valid method raises exceptions.
Example:

import awkward
import uproot_methods

a = awkward.fromiter([[1, 2], [3]])
t = uproot_methods.TLorentzVectorArray.from_cartesian(a, a, a, a * 10)
print(t.empty_like())

This causes p3 and boostp3 to break, however it might not be visible in all cases. Slicing for example will make it visible:

t[[True, True]].p3

raises an exception.

CMS NanoAOD interface

There should be a mechanism that recognizes a TTree as NanoAOD and presents a virtual, formatted view of the data, using knowledge of NanoAOD idioms. For instance, Muon_* should be collected into a single jagged table called muons with the muon branches as its columns. It should use VirtualArrays, so that you can carry an array of muons around without having loaded all of the branches. References between particles and jets—expressed as integer indexes in NanoAOD—should be IndexedArrays. I'm on the fence about making them ChunkedArrays at the basket level—that may be too small. Perhaps they could be ChunkedArrays at the file level (or a function for loading them that takes the chunking size as an option).

This was inspired by scikit-hep/awkward-0.x#95.

TH1 physt() method doesn't preserve errors

If the physt() method is used to convert a TH1 to a Physt histogram, the errors aren't preserved (they're just the sqrt(N) errors).

Unexpected behaviour of cartesian product of TVector3Arrays

Hello,

I met an error when trying to get a cartesian product of two TVector3Arrays. Below is how to reproduce the error:

from awkward import JaggedArray # v0.7.0
from uproot_methods import TVector3Array # v0.3.1
import numpy as np # v1.14.2

_x = JaggedArray.fromoffsets([0,3,3,5,10], np.zeros(10))
_y = JaggedArray.fromoffsets([0,3,3,5,10], np.array([1]*10))
_z = JaggedArray.fromoffsets([0, 3,3,5,10], np.array([2]*10))
v = TVector3Array.from_cartesian(_x, _y, _z)

_xx = JaggedArray.fromoffsets([0,2,7,8,10], np.array([4]*10))
_yy = JaggedArray.fromoffsets([0,2,7,8,10], np.array([5]*10))
_zz= JaggedArray.fromoffsets([0,2,7,8,10], np.array([6]*10))
vv = TVector3Array.from_cartesian(_xx, _yy, _zz)

v.cross(vv)
# or v.cross(vv, nested=True)

The error seems to be related to TVector3's .cross() method. I didn't dig deep.
Could you please take a look? Thanks!

It's an awesome set of packages! 👍🏼

angle between two vectors

Is there a reason why the angle between two vectors is not implemented?

TH1 from_numpy does not fill _fSumw2

The from_numpy function in TH1.py does not fill the field _fSumw2, therefore calling the method variances() on a TH1 generated in this way fails with an error.

Bug -> uproot_methods 0.3.2 does not read in two dimensional histograms correctly

Using uproot-methods 0.3.2 causes the binning of the pt axis for the histogram here to be incorrect:
https://github.com/CoffeaTeam/fnal-column-analysis-tools/blob/master/tests/samples/testSF2d.histo.root

Output with uproot-methods 0.3.1 (verified as correct in ROOT):

mac-129479:fnal-column-analysis-tools lagray$ pip install uproot-methods==0.3.1
Collecting uproot-methods==0.3.1
  Using cached https://files.pythonhosted.org/packages/f4/30/de9f8a9d7b380c5a9bcd9177836deccf18ddb9cb88533c8683e02af4bb66/uproot_methods-0.3.1-py2.py3-none-any.whl
Requirement already satisfied: awkward>=0.7.0 in /anaconda2/lib/python2.7/site-packages (from uproot-methods==0.3.1) (0.7.1)
Requirement already satisfied: numpy>=1.13.1 in /anaconda2/lib/python2.7/site-packages (from uproot-methods==0.3.1) (1.15.4)
Installing collected packages: uproot-methods
  Found existing installation: uproot-methods 0.3.2
    Uninstalling uproot-methods-0.3.2:
      Successfully uninstalled uproot-methods-0.3.2
Successfully installed uproot-methods-0.3.1
mac-129479:fnal-column-analysis-tools lagray$ python
Python 2.7.15 |Anaconda, Inc.| (default, Dec 14 2018, 13:10:39) 
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import uproot
>>> hist = uproot.open('tests/samples/testSF2d.histo.root')['scalefactors_Tight_Electron;1']
>>> hist.numpy()
(array([[0.80655736, 0.82857144, 1.0328639 , 1.007752  , 0.94072163,
        0.9458763 , 0.9897698 , 1.0339806 , 0.8274854 , 0.79701495],
       [0.88245934, 0.9274874 , 1.007595  , 0.97203946, 0.9529984 ,
        0.9819967 , 0.9752066 , 0.9748744 , 0.90893763, 0.8633218 ],
       [0.9188406 , 0.9669876 , 0.9881956 , 0.97466666, 0.9534575 ,
        0.97981155, 0.97463286, 0.96638656, 0.95782316, 0.9078014 ],
       [0.9403973 , 0.9809645 , 0.9953917 , 0.97167486, 0.95308644,
        0.9775281 , 0.97893435, 0.9799073 , 0.9685535 , 0.93766236],
       [1.0510856 , 1.0059242 , 1.1041056 , 0.989486  , 0.97549593,
        1.0118906 , 1.010727  , 1.0071633 , 0.9882629 , 1.0213568 ],
       [1.0510856 , 1.0059242 , 1.1041056 , 0.989486  , 0.97549593,
        1.0118906 , 1.010727  , 1.0071633 , 0.9882629 , 1.0213568 ]],
      dtype=float32), (array([-2.5  , -2.   , -1.566, -1.444, -0.8  ,  0.   ,  0.8  ,  1.444,
        1.566,  2.   ,  2.5  ]), array([ 10.,  20.,  35.,  50.,  90., 150., 500.])))
>>>

Output with uproot-methods 0.3.2 (second axis is totally wrong):

mac-129479:fnal-column-analysis-tools lagray$ pip install uproot-methods==0.3.2
Collecting uproot-methods==0.3.2
  Using cached https://files.pythonhosted.org/packages/d2/e4/8294ce0ead0ecc2b1d42bcb35409c8b9361c25139c1fc2e887494dd2f0f9/uproot_methods-0.3.2-py2.py3-none-any.whl
Requirement already satisfied: awkward>=0.7.0 in /anaconda2/lib/python2.7/site-packages (from uproot-methods==0.3.2) (0.7.1)
Requirement already satisfied: numpy>=1.13.1 in /anaconda2/lib/python2.7/site-packages (from uproot-methods==0.3.2) (1.15.4)
Installing collected packages: uproot-methods
  Found existing installation: uproot-methods 0.3.1
    Uninstalling uproot-methods-0.3.1:
      Successfully uninstalled uproot-methods-0.3.1
Successfully installed uproot-methods-0.3.2
mac-129479:fnal-column-analysis-tools lagray$ python
Python 2.7.15 |Anaconda, Inc.| (default, Dec 14 2018, 13:10:39) 
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import uproot
>>> hist = uproot.open('tests/samples/testSF2d.histo.root')['scalefactors_Tight_Electron;1']
>>> hist.numpy()
(array([[0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        ],
       [0.        , 0.80655736, 0.82857144, 1.0328639 , 1.007752  ,
        0.94072163, 0.9458763 , 0.9897698 , 1.0339806 , 0.8274854 ,
        0.79701495, 0.        ],
       [0.        , 0.88245934, 0.9274874 , 1.007595  , 0.97203946,
        0.9529984 , 0.9819967 , 0.9752066 , 0.9748744 , 0.90893763,
        0.8633218 , 0.        ],
       [0.        , 0.9188406 , 0.9669876 , 0.9881956 , 0.97466666,
        0.9534575 , 0.97981155, 0.97463286, 0.96638656, 0.95782316,
        0.9078014 , 0.        ],
       [0.        , 0.9403973 , 0.9809645 , 0.9953917 , 0.97167486,
        0.95308644, 0.9775281 , 0.97893435, 0.9799073 , 0.9685535 ,
        0.93766236, 0.        ],
       [0.        , 1.0510856 , 1.0059242 , 1.1041056 , 0.989486  ,
        0.97549593, 1.0118906 , 1.010727  , 1.0071633 , 0.9882629 ,
        1.0213568 , 0.        ],
       [0.        , 1.0510856 , 1.0059242 , 1.1041056 , 0.989486  ,
        0.97549593, 1.0118906 , 1.010727  , 1.0071633 , 0.9882629 ,
        1.0213568 , 0.        ],
       [0.        , 1.0510856 , 1.0059242 , 1.1041056 , 0.989486  ,
        0.97549593, 1.0118906 , 1.010727  , 1.0071633 , 0.9882629 ,
        1.0213568 , 0.        ]], dtype=float32), array([  -inf, -2.5  , -2.   , -1.566, -1.444, -0.8  ,  0.   ,  0.8  ,
        1.444,  1.566,  2.   ,  2.5  ,    inf]), array([        -inf,  10.        ,  91.66666667, 173.33333333,
       255.        , 336.66666667, 418.33333333, 500.        ,
                inf]))
>>>

Store additional information in TLorentzVector

I asked earlier about constructing the Lorentz vector objects directly from numpy arrays. That works really well! But I'd also like to be able to add additional information to the objects, i.e. in some cases we have other jet characteristics like b-tagging scores. Is there some method to append additional information (which is stored in another array) to each object?

awkward.array module not found on importing uproot_methods

I recently installed uproot and uproot_methods:

conda config --add channels conda-forge  
conda install uproot
conda install uproot-methods

When trying to import uproot_methods I get the following error message:

python
>>> import uproot
>>> import awkward
>>> import uproot_methods
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usera/wfawcett/software/miniconda3/lib/python3.8/site-packages/uproot_methods/__init__.py", line 5, in <module>
    from uproot_methods.classes.TVector2 import TVector2, TVector2Array
  File "/usera/wfawcett/software/miniconda3/lib/python3.8/site-packages/uproot_methods/classes/TVector2.py", line 8, in <module>
    import awkward.array.jagged
ModuleNotFoundError: No module named 'awkward.array'

Note that I am using

>>> awkward.__version__
'1.1.2'
>>> uproot.__version__
'4.0.4'

as well as Python 3.8.6

My apologies if this isn't the right use of this tool.

TH1 methods `values`, `variances`, `allvalues`, `allvariances` should return numpy arrays, not lists

These methods currently return copies of internal lists. It would be more convenient and potentially faster to return numpy arrays instead. Arrays are faster to allocate and are what users expect in numerical computing.

Tests are missing for histograms

Histograms are completely untested, and the current implementation of TH1 methods has bugs.

Issue with the numpy() method for TH2

I have seen a comment saying that the values and allvalues methods of TH1 were mixed up and that this has been fixed. However, there is still an issue with TH2 when calling the numpy() method. I get:

File "data_MC_plots.py", line 18, in
arr, bins= histo_pythia.numpy()
ValueError: too many values to unpack (expected 2)

I have an older uproot version in my Python 2 installation, and this used to work there. Now, for some reason, it does not!

uproot-methods v0.5.1 breaking pyhf

Hi @jpivarski, in the pyhf test suite we're seeing problems caused by uproot-methods release v0.5.1 (The test-suite passes if I pin uproot-methods at v0.5.0). The main culprit seems to be

E   ModuleNotFoundError: No module named 'uproot.write.objects.TH'

I'm taking a look at this now, but I just wanted to bring it to your attention.

License file not included on PyPI for 0.5.1

The licence file was included in the PyPI tarball for previous releases.

Bug in TH1.py, line 146, in show method

If you have an object of type <class 'uproot.rootio.TH2D'> and try to use the 'show' method, an error is raised.
The error comes from:
TH1.py file line 146:
intervals = ["[{0:<.5g}, {1:<.5g})".format(l, h) for l, h in [self.interval(i) for i in range(len(self))]]

ValueError/KeyError after boosting PtEtaPhiMassLorentzVectorArray

When boosting a PtEtaPhiMassLorentzVectorArray, the resulting object is in an invalid state, such that all of the properties (such as x, y, pt, eta) raise errors.

import uproot_methods
vec = uproot_methods.TLorentzVectorArray.from_ptetaphim([1], [1], [1], [1])
boost = uproot_methods.TVector3Array.from_cartesian([0.1],[0.1],[0.1])
vecboosted = vec.boost(boost)
print(vecboosted.pt)

Same is true for rotate_axis and rotate_euler.

TH2 variances returned as a list not a 2 by 2 array

It is not possible to use the variances method effectively for TH2 histograms, as it does not return an array with the correct dimensions.

boosting 4-vectors does not work on TLorentzVectorArray

Hi,

I am stumbling over a problem w/ 4-vectors in uproot_methods:

Consider the problem of having events with Lorentz vectors and wanting to boost all of them into their rest frame. The operation works per event, but not on the full (batched) set of events.

I'm looking into what the issue could be but perhaps someone is faster

x = [
    [
        [0,0,12,13],
        [0,0,4,5]
    ],
    [
        [0,0,15,17],
        [0,0,24,25]
    ],
]

mom = awkward.fromiter(x)
v = uproot_methods.TLorentzVectorArray.from_cartesian(mom[:,:,0],mom[:,:,1],mom[:,:,2],mom[:,:,3])

boost_back = (v.p3/v.p3.mag)*v.beta
print(
    [vv.boost(bb) for vv,bb in zip(v,-boost_back)]
)
print(
    v.boost(-boost_back)
)

The first print works as expected; the second does not and fails with

in boost(self, p3)
    227         gamma2 = self.awkward.numpy.zeros(b2.shape, dtype=self.awkward.numpy.float64)
    228         mask = (b2 != 0)
--> 229         gamma2[mask] = (gamma[mask] - 1) / b2[mask]
    230         del mask
    231 

IndexError: too many indices for array

TH1.py index method wrongly assumes a regular binning

The index method in TH1.py's Methods assumes that the binning is always regular, which is not the case.

https://github.com/scikit-hep/uproot-methods/blob/0aae83f4fc16875332fa85ea08013cde7b840f0a/uproot_methods/classes/TH1.py#L140

I think that uproot-methods should not provide this interface at all, since it tries to emulate the behavior of a ROOT histogram. I think uproot and uproot-methods should focus on giving access to the data of ROOT objects, and not try to imitate their behavior.
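
For comparison, a bin lookup that makes no assumption about regular spacing can be sketched with the standard library's `bisect` module. `find_bin` is a hypothetical helper, not the library's index method; it assumes sorted edges and the ROOT-style numbering in which 0 is underflow and nbins + 1 is overflow.

```python
import bisect


def find_bin(edges, x):
    """Return 0 for underflow, 1..nbins for regular bins, nbins + 1 for overflow.

    Bins are closed on the low edge and open on the high edge.
    """
    return bisect.bisect_right(edges, x)


# Irregular example edges
pt_edges = [10.0, 20.0, 35.0, 50.0, 90.0, 150.0, 500.0]
```

For these edges, `find_bin(pt_edges, 25.0)` lands in the second bin, [20, 35), regardless of how wide the neighbouring bins are.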

TLorentzVector ValueError in repr

When using from_ptetaphim to create a TLorentzVector, passing a large value for eta makes pz and e equal to np.inf, because sinh grows exponentially with eta. This isn't too much of an issue in itself, but printing the TLorentzVector then fails: the __repr__ function raises a ValueError saying that format code 'g' is unknown.

TVector3 is likely to suffer from a similar issue.

Error:

File "uproot_methods/classes/TLorentzVector.py", line 336, in __repr__
    return "TLorentzVector({0:.5g}, {1:.5g}, {2:.5g}, {3:.5g})".format(self._fP._fX, self._fP._fY, self._fP._fZ, self._fE)
ValueError: Unknown format code 'g' for object of type 'str'
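
The error message points at a str rather than a float, which can be checked in isolation. The sketch below only demonstrates how Python's format codes behave; it does not identify where a string would come from inside uproot_methods, and `format_component` is a hypothetical helper.

```python
def format_component(value):
    """Format one vector component, coercing to float first."""
    return "{0:.5g}".format(float(value))


# A float infinity is accepted by the 'g' format code
inf_text = "{0:.5g}".format(float("inf"))

# A string is not, and produces the same kind of error as the traceback above
try:
    "{0:.5g}".format("inf")
    str_error = None
except ValueError as err:
    str_error = str(err)
```

Coercing each component with `float(...)` before formatting, as `format_component` does, is one possible way to make such a repr robust.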

New version doesn't allow add columns to JaggedTLorentzVectorArray

The JaggedTLorentzVectorArray is defined as

JaggedTLorentzVectorArray = awkward.Methods.mixin(uproot_methods.classes.TLorentzVector.ArrayMethods, awkward.JaggedArray)

It was later treated as an awkward.Table, with new columns added by
jaggedarray[att] = arrays[k]

This used to work fine, but started to break down with the latest update. I got errors as

  File "/uscms_data/d3/benwu/temp/CMSSW_10_2_11/src/NanoUpTools/test/../../NanoUpTools/framework/datamodel.py", line 73, in Object
    jaggedarray[att] =arrays[k]
  File "/uscms/home/benwu/.local/lib/python2.7/site-packages/awkward/array/jagged.py", line 756, in __setitem__
    self._content[where] = self.tojagged(what)._content
  File "/uscms/home/benwu/.local/lib/python2.7/site-packages/awkward/array/jagged.py", line 774, in tojagged
    return self.copy(content=data._content[self.IndexedArray.invert((self.index + self._starts)._content)])
  File "/uscms/home/benwu/.local/lib/python2.7/site-packages/numpy/lib/mixins.py", line 25, in func
    return ufunc(self, other)
  File "/uscms/home/benwu/.local/lib/python2.7/site-packages/uproot_methods/classes/TLorentzVector.py", line 301, in __array_ufunc__
    raise TypeError("(arrays of) TLorentzVector can only be added to/subtracted from other (arrays of) TLorentzVector")
TypeError: (arrays of) TLorentzVector can only be added to/subtracted from other (arrays of) TLorentzVector

Adding TLorentzVectorArray

Hi, I am defining two TLorentzVectorArray and wanting to take the sum. I can add mu_p4 to itself, or met_p4 to itself, but I cannot add them to each other without it blowing up. I'm using awkward1 and uproot_methods3. What am I doing wrong?

>>> candidatemuon = ak.firsts(events.Muon)
>>> mu_p4 = TLorentzVectorArray.from_ptetaphim(ak.fill_none(candidatemuon.pt,0),ak.fill_none(candidatemuon.eta,0),ak.fill_none(candidatemuon.phi,0),ak.fill_none(candidatemuon.mass,0))
>>> met_p4 = TLorentzVectorArray.from_ptetaphim(ak.from_iter([[v] for v in events.MET.pt]),ak.from_iter( [[v] for v in np.zeros(len(events))]), ak.from_iter([[v] for v in events.MET.phi]), ak.from_iter([[v] for v in np.zeros(len(events))]) )
>>> mu_p4.shape, met_p4.shape
((100000,), (100000,))
>>> mu_p4 + met_p4
MemoryError: Unable to allocate 74.5 GiB for an array with shape (100000, 100000) and data type float64
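
The requested shape (100000, 100000) of float64 values is 8 × 10^10 bytes, which matches the 74.5 GiB in the message and looks like an outer broadcast. One plausible cause, stated here as an assumption rather than a diagnosis, is that one operand has components of shape (N,) while the other, built from `[[v] for v in ...]`, has components of shape (N, 1). The plain NumPy sketch below reproduces that pattern at a small size.

```python
import numpy

n = 4
flat = numpy.ones(n)          # shape (n,)
nested = numpy.ones((n, 1))   # shape (n, 1), like [[v] for v in ...]

# NumPy broadcasting turns (n,) + (n, 1) into an (n, n) result
outer = flat + nested

# Dropping the inner length-1 axis keeps the sum element-wise
aligned = flat + nested[:, 0]
```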

Question on behavior of weights and variances

From the discussions on Gitter today motivated by matthewfeickert/heputils#24 I think I am confused by the behavior of uproot3-methods treatment of weights and variances. Here is a short example

$ cat requirements.txt 
uproot~=4.0.6
uproot3~=3.14.4
$ pip list | grep "uproot"
uproot            4.0.6
uproot3           3.14.4
uproot3-methods   0.10.0

# issue.py
import numpy as np
import uproot
import uproot3

if __name__ == "__main__":
    bins = np.arange(0, 8)
    counts = np.array([2 ** x for x in range(len(bins[:-1]))])
    with uproot3.recreate("test.root", compression=uproot3.ZLIB(4)) as outfile:
        outfile["data"] = (counts, bins)

    root_file = uproot.open("test.root")
    hist = root_file["data"]

    print(f"hist has weights: {hist.weighted}")
    print(f"hist values: {hist.values()}")
    print("\nvariances are square of uncertainties\n")
    print(f"hist errors: {hist.errors()}")
    print(f"hist variances: {hist.variances()}")
    assert hist.variances().tolist() == np.square(hist.errors()).tolist()
    print("\nbut errors are not sqrt of values\n")

    print(f"expected errors to be: {np.sqrt(hist.values())}")

$ python issue.py 
hist has weights: True
hist values: [ 1  2  4  8 16 32 64]

variances are square of uncertainties

hist errors: [ 1.  2.  4.  8. 16. 32. 64.]
hist variances: [1.000e+00 4.000e+00 1.600e+01 6.400e+01 2.560e+02 1.024e+03 4.096e+03]

but errors are not sqrt of values

expected errors to be: [1.         1.41421356 2.         2.82842712 4.         5.65685425
 8.        ]

As help for .errors gives

Help on method errors in module uproot.behaviors.TH1:

errors(flow=False) method of uproot.dynamic.Model_TH1I_v3 instance
    Args:
        flow (bool): If True, include underflow and overflow bins before and
            after the normal (finite-width) bins.
    
    Errors (uncertainties) in the :ref:`uproot.behaviors.TH1.Histogram.values`
    as a 1, 2, or 3 dimensional ``numpy.ndarray`` of ``numpy.float64``.
    
    If ``fSumw2`` (weights) are available, they will be used in the
    calculation of the errors. If not, errors are assumed to be the square
    root of the values.
    
    Setting ``flow=True`` increases the length of each dimension by two.

I think(?) this is due to the behavior of

uproot3-methods/uproot3_methods/classes/TH1.py

Lines 347 to 353 in b722ee6

    
    valuesarray = numpy.empty(len(content) + 2, dtype=content.dtype)
    valuesarray[1:-1] = content
    valuesarray[0] = 0
    valuesarray[-1] = 0
    out.extend(valuesarray)
    out._fSumw2 = valuesarray ** 2

where it seems that the weights are taken to be the values of the NumPy array — this is not what I would have expected.

What is the proper way to create an uproot3 TH1 with Poisson uncertainties?
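
The arithmetic behind the question can be shown in plain NumPy. The sketch below does not show how to make uproot3 write such a histogram; it only contrasts the sum of squared weights produced by the quoted lines with the one expected for unit-weight (Poisson) fills, where the variance of each bin equals its count. `pad_with_flow` is a hypothetical helper.

```python
import numpy

counts = numpy.array([1, 2, 4, 8, 16, 32, 64])


def pad_with_flow(values):
    """Add empty underflow and overflow bins around `values`."""
    padded = numpy.zeros(len(values) + 2, dtype=float)
    padded[1:-1] = values
    return padded


# What the quoted lines compute: sumw2 = values ** 2, so errors equal the values
sumw2_as_written = pad_with_flow(counts) ** 2
errors_as_written = numpy.sqrt(sumw2_as_written[1:-1])

# Unit-weight fills: sumw2 = values, so errors are sqrt(values)
sumw2_poisson = pad_with_flow(counts)
errors_poisson = numpy.sqrt(sumw2_poisson[1:-1])
```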

Documentation of histogram methods

I've moved the issue here from scikit-hep/uproot3#193. It could be part of a long README, a Binder notebook, a wiki, or reST for uproot-methods.readthedocs.org; I'm not sure. Ideas, @HDembinski?

Behavior of eta if x, y = 0 in TLorentzVector

Hi,

I'm wondering if we want to reproduce the behavior of ROOT.TLorentzVector for eta in the case that x, y = 0 (See https://root.cern.ch/doc/master/TVector3_8cxx_source.html#l00320). That is, ROOT returns sign(z)*10e10.

Currently, uproot-methods throws a ZeroDivisionError:

import uproot_methods
lv = uproot_methods.TLorentzVector(0,0,0,0)
lv.eta # throws ZeroDivisionError

But maybe this behavior is desired?

Thanks,
Javier
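
A minimal sketch of the fallback described in this issue, using only the standard library. The ±10e10 values follow the description above; returning 0 for the zero vector is an assumption modelled on the linked ROOT source, and `pseudorapidity` is a hypothetical function, not the library's eta property.

```python
import math


def pseudorapidity(x, y, z):
    """Pseudorapidity with a fallback when the transverse momentum is zero."""
    pt = math.hypot(x, y)
    if pt != 0.0:
        return math.asinh(z / pt)
    if z == 0.0:
        # Assumption: the zero vector maps to eta = 0
        return 0.0
    # Along the beam axis, return a large value with the sign of z
    return 10e10 if z > 0.0 else -10e10
```

Whether a sentinel value or an exception is preferable is exactly the design question raised here; the sketch only shows what the sentinel variant would look like.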

TH3 not writeable

When I try to write a TH3 to a file, I get
TypeError: type TH3 from module __main__ is not writeable by uproot

I took a look at convert.py from uproot_methods and it seems there are lines for TH1 and TH2 but not for TH3. I added the two lines

elif any(x == ("uproot_methods.classes.TH3", "Methods") or x == ("TH3", "Methods") for x in types(obj.__class__, obj)):
 return (None, None, "uproot.write.objects.TH", "TH")

and it worked. Is there any reason why these lines aren't in there? Could they be added?
