
libpysal's Introduction

Python Spatial Analysis Library Core


libpysal modules

  • libpysal.cg – Computational geometry
  • libpysal.examples – Built-in example datasets
  • libpysal.graph – Graph class encoding spatial weights matrices
  • libpysal.io – Input and output
  • libpysal.weights – Spatial weights

Example Notebooks

Development

libpysal development is hosted on GitHub.

Discussion of development occurs on the developer list as well as on Discord.

Contributing

PySAL-libpysal is under active development and contributors are welcome. If you have any suggestions, feature requests, or bug reports, please open new issues on GitHub. To submit patches, please review PySAL's documentation for developers, the PySAL development guidelines, and the libpysal contributing guidelines before opening a pull request. Once your changes get merged, you’ll automatically be added to the Contributors List.

Bug reports

To search for or report bugs, please see libpysal's issues.

License information

See LICENSE.txt for information on the history of this software, terms & conditions for usage, and a DISCLAIMER OF ALL WARRANTIES.

libpysal's People

Contributors

andrewwinslow, conceptron, darribas, dependabot[bot], dfolch, fbdtemme, fgregg, jgaboardi, jlaura, jo-tham, knaaptime, lanselin, ljwolf, makosak, martinfleis, mgeeeek, mhwang4, nmalizia, pastephens, pedrovma, pre-commit-ci[bot], qulogic, schmidtc, shaohu, siddharths8212376, sjsrey, slumnitz, tayloroshan, u3ks, weikang9009

libpysal's Issues

to_networkx argument name changed

networkx changed the name of ebunch to ebunch_to_add in their Graph.add_weighted_edges_from method.

This means that adding weighted edges to the graph representing the W object now fails, since our call passes the argument by the old keyword name, ebunch.
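
A hedged sketch of a workaround: pass the edge bunch positionally, so the call works under both the old (ebunch) and new (ebunch_to_add) keyword names. The W here is just an illustrative lattice.

import networkx as nx
from libpysal import weights

w = weights.lat2W(3, 3)                            # any W object
edges = [(i, j, wij)
         for i, js in w.neighbors.items()
         for j, wij in zip(js, w.weights[i])]
G = nx.Graph()
G.add_weighted_edges_from(edges)                   # positional, so no keyword-name mismatch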

Nightli.es build permissions

For some reason I have permission to set up Nightli.es builds for both spint and spglm, but not for the actual package I maintain, spaghetti. Is there a quick solution for this? I haven't found one. The screenshot below was taken from pysal's Travis profile.

[screenshot of the Nightli.es settings on pysal's Travis profile]

Kernel docstring does not mention unique Gaussian kernel behavior

I keep getting bitten by this.

Our Gaussian kernel only computes for observations within the bandwidth distance.

But, in theory, this isn't necessary, since observations are still connected in the Gaussian kernel past this bandwidth.

Thus, in quite a few cases, this cutoff can result in truncation at pretty high w_{ij}; I get truncation at around .25 with an adaptive bandwidth on the Berlin neighborhoods data from geopython...

Since this isn't going to be fixed (I recall @TaylorOshan running into this when trying to build GWR on top of existing PySAL stuff), we need to disclaim that we force all kernels to be truncated at the bandwidth.
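
A quick back-of-the-envelope check of the truncation level mentioned above, assuming the Gaussian kernel has the form (2*pi)**-0.5 * exp(-z**2 / 2) with z = d / bandwidth (my recollection of the Kernel code, so treat it as an assumption):

import numpy as np

# kernel value at the bandwidth itself (z = 1), i.e. where the truncation happens
w_at_bandwidth = (2 * np.pi) ** -0.5 * np.exp(-1 / 2)
print(round(w_at_bandwidth, 3))   # ~0.242, consistent with the ~.25 truncation reported above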

Pulling example datasets from Carto

Opening this ticket to explore an idea that @ljwolf and I had briefly chatted about. For the example datasets that are used in pysal, could these be maintained externally and just pulled by the library when required and cached locally? It's really easy to pull a Carto table directly into a pandas dataframe using our SQL API, so it might be a natural fit to store some of those data sources in Carto.

This would be similar to the approach scikit-learn takes for fetching example datasets.

nonplanar_neighbors fails when sindex is not constructed.

nonplanar_neighbors uses the sindex attribute to avoid unnecessarily fuzzing some observations. We assume sindex exists and, if it does not, the computation fails with an AttributeError.

Right now, if geopandas is installed using pip, it does not bring along libspatialindex, which is a C library. If you install geopandas with conda, you do get libspatialindex by default. On Travis we use pip, so that's failing.

So, we need to

  1. bail on nonplanar_neighbors when geodataframe.sindex is None.
  2. install geopandas on Travis from GitHub.

I'm doing 2. on #58
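
A minimal sketch of the guard in 1., assuming a GeoDataFrame gdf; the error message is illustrative:

def require_sindex(gdf):
    # geopandas builds the spatial index lazily; without libspatialindex it is unavailable
    if getattr(gdf, 'sindex', None) is None:
        raise ValueError("nonplanar_neighbors needs gdf.sindex; "
                         "install rtree/libspatialindex and try again")
    return gdf.sindex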

alphashapes & n<4

In theory, the "Delaunay" triangulation for cases where the number of points is less than 4 is still "known": n=3 is the triangle, n=2 is the line, and n=1 is the point.

As it stands, Qhull will error out when passed a collection with fewer than four points. It'd be nice if we checked this and returned something sane in these cases (see the sketch below), rather than bailing out from Qhull.
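
A hedged sketch of such a guard; the return values for the degenerate cases are illustrative, not a proposed API:

from shapely.geometry import LineString, Point, Polygon

def degenerate_alpha_shape(points):
    """Return something sane for n < 4 instead of handing the points to Qhull."""
    if len(points) == 1:
        return Point(points[0])
    if len(points) == 2:
        return LineString(points)
    if len(points) == 3:
        return Polygon(points)    # the triangle itself
    return None                   # n >= 4: fall through to the Qhull-based code path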

Rook & KNN take different id lists from the dataframe

KNN correctly takes the index of the dataframe as the index of the weights, but rook does not.

from libpysal import weights
from libpysal import examples
import geopandas

columbus = geopandas.read_file(examples.get_path('columbus.shp'))
columbus_sub = columbus.sample(frac=1)

print(columbus_sub.index[0:5]) # should not be (0,1,2,3,4,) but instead random

Wr = weights.Rook.from_dataframe(columbus_sub)
Wknn1 = weights.KNN.from_dataframe(columbus_sub)

print(Wr.id_order) # is (0,1,2,3...)
print(Wknn1.id_order) # matches columbus_sub.index

These should behave the same everywhere, and I think they should behave like KNN, taking the indices from the dataframe directly.

weights.plot does not handle named observations

Right now, if you have weights constructed using an idVariable or ids list, the plot method fails.

This is because it is using the iloc method to do lookups based on iteration indices.

It should be possible to rework this to use names and only names.
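
A hedged sketch of the label-based lookup, using columbus and its POLYID column as the named ids:

import geopandas
from libpysal import examples, weights

gdf = geopandas.read_file(examples.get_path('columbus.shp'))
w = weights.Queen.from_dataframe(gdf, idVariable='POLYID')      # named observations

named = gdf.set_index('POLYID')
focal_id = w.id_order[0]                                        # a POLYID, not a position
focal_geom = named.geometry.loc[focal_id]                       # .loc on names, not .iloc
neighbor_geoms = named.geometry.loc[w.neighbors[focal_id]]      # neighbor labels -> geometries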

Edit island neighbor values in a spatial W object (PySAL)

I am working on clustering a map with some islands in it. These islands have no neighbors in the W object, so I can't run the Maxp clustering function; I get the message 'No initial solution found'. What I tried is to calculate neighbors for the islands using distance-based spatial weights and then substitute those values into the contiguity-based spatial weights. The new W object still causes the same problem I used to get before, and I am not sure where the problem comes from. Here is the piece of code:

import numpy as np
import pysal as ps
from sklearn.preprocessing import MinMaxScaler

# read the attribute, coerce to float, and zero out missing values
shape = ps.open('file.dbf')
shape = shape.by_col_array('Solar')
shape = shape.astype(float)
shape[np.isnan(shape)] = 0
scaler = MinMaxScaler()
shape1 = scaler.fit_transform(shape)

# contiguity weights (with islands) plus KNN weights used to patch them
w = ps.weights.Queen.from_shapefile('file.shp')
knn5 = ps.knnW_from_shapefile('file.shp')
aa = w.islands
mtrx, idx = w.full()

# attach each island to its nearest neighbor, in both directions
for indx, val in enumerate(aa):
    w.neighbors[aa[indx]] = [knn5.neighbors[aa[indx]][0]]
    w.neighbors[knn5.neighbors[aa[indx]][0]] = w.neighbors[knn5.neighbors[aa[indx]][0]] + [aa[indx]]

# rebuild the W from the edited neighbors dict and run Maxp
w1 = ps.weights.weights.W(w.neighbors, id_order=w.id_order, ids=w.id_order)
thr = 0.1 * sum(shape1[:, 0])
np.random.seed(1234)
r = ps.region.maxp.Maxp(w1, shape1, floor=thr, floor_variable=shape1[:, 0], initial=4000)

Then I get the message 'No initial solution found'. Any help, please?

on-the-fly W

This is originally a question, but if not implemented, it could be a nice additional user method to have. Is it possible to build a W from a subset of polygons from a shapefile?

Test-case example: calculate W for counties in only one state using a shapefile of all counties in the US. What I have in mind is something like ps.queen_from_shapefile but, instead of taking a path to a shapefile, it takes an iterable of polygons.

This would be useful for my example case, but also in many more contexts. Take the hypothetical case in which a user extracts a subset of polygons from a large database (e.g. spatialite) into memory and needs to build a W matrix from there (assuming the user has a way of converting the polygons from spatialite to PySAL geometries).
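
A hedged sketch of the GeoDataFrame route that covers the test case today; the file path and the state-identifier column are hypothetical:

import geopandas
from libpysal import weights

counties = geopandas.read_file('us_counties.shp')          # hypothetical shapefile of all US counties
one_state = counties[counties['STATEFP'] == '06']          # hypothetical column name / state code
w_state = weights.Queen.from_dataframe(one_state)          # W for just that subset of polygons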

deprecate or test shapely_ext

The shapely extension was and is a very handy piece of code, wrapping correct interfaces between PySAL and shapely.

It's also used by geotable, my shapely-less mimic of geopandas, and by the testing suite for the io.wkb module.

Currently, tests are not run for this code. Further, we usually teach geopandas directly for this functionality. There is also some loss of information moving from shapely to PySAL, since we do not maintain the order in which holes are nested inside exterior rings (pysal/#820, pysal/#852).

So, we should either:

  1. deprecate the shapely extension. This would not affect geotable, since that dependency is soft. It would affect the tests for wkb, which should be simple to switch over using shapely.geometry.shape (sketched below).
  2. write tests for (and commit to maintain) this namespace.
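
A hedged sketch of what the wkb test switch in 1. might look like, assuming the pysal shapes expose __geo_interface__ and shapely is available in the test environment:

from shapely import wkb as shapely_wkb
from shapely.geometry import shape

def roundtrips(pysal_geom, wkb_bytes):
    """Compare a pysal geometry against shapely's reading of the same WKB."""
    expected = shapely_wkb.loads(wkb_bytes)
    return shape(pysal_geom.__geo_interface__).equals(expected)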

error importing v3.0.7

In [1]: import libpysal
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-1-b416346c1a00> in <module>()
----> 1 import libpysal

~/libpysal/libpysal/__init__.py in <module>()
     26 """
     27 from . import cg
---> 28 from . import io
     29 from . import weights
     30 from . import examples

~/libpysal/libpysal/io/__init__.py in <module>()
      1 from . import fileio
      2 from .tables import *
----> 3 from .iohandlers import *
      4 from .util import *
      5 open = fileio.FileIO

ModuleNotFoundError: No module named 'libpysal.io.iohandlers'

inconsistency in api?

from . import util

I wonder why not

from .util import *

The old api.py has it like this:

from .weights.util import lat2W, block_weights, comb, order, higher_order, shimbel, remap_ids, full2W, full, WSP2W, insert_diagonal, get_ids, get_points_array_from_shapefile, min_threshold_distance, lat2SW, w_local_cluster, higher_order_sp, hexLat2W, regime_weights, attach_islands, nonplanar_neighbors, fuzzy_contiguity

Weights for circles, spheres, and other shapes connected at their borders

Hi,
I wanted to know if there is a way of generating weights for circles, spheres and such in a way similar to

w=pysal.lat2W(3,1)?

This generates

w.full()
Out[62]: (array([[ 0.,  1.,  0.],
[ 1.,  0.,  1.],
[ 0.,  1.,  0.]]), [0, 1, 2])

But on a circle it should be

(array([[ 0.,  1.,  1.],
[ 1.,  0.,  1.],
[ 1.,  1.,  0.]]), [0, 1, 2])

Thanks!
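
One way to get the wrap-around structure shown above is to build the W directly from a neighbors dictionary; this is a sketch, not an existing lat2W option as far as I know:

from libpysal import weights

n = 3
neighbors = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}   # ring: the two ends wrap around
w_circle = weights.W(neighbors)
print(w_circle.full())    # off-diagonal ones, matching the circular matrix above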

MGWR_Georgia_example.ipynb fails due to different sample data shapes

  • Platform information:
    nt win32

  • Python version:
    3.6.6 |Anaconda, Inc.| (default, Jun 28 2018, 11:27:44) [MSC v.1900 64 bit (AMD64)]

  • SciPy version:
    1.1.0

  • NumPy version:
    1.14.2

This step fails using Georgia sample data:
#Add GWR parameters to GeoDataframe
georgia_shp['gwr_intercept'] = gwr_results.params[:,0]

ValueError: Length of values does not match length of index

Initial DataFrame information:
GData_utm.csv shape is (172, 18)
G_utm.shp shape is (159, 13)

two modules "Wsets.py" and "util.py" depend on each other

The function WSP2W is defined in util.py, which imports the module Wsets.py. However, in Wsets.py, WSP2W is called in the function w_clip without prior declaration or import. We encounter the following error if outSP is set to False:

[screenshot of the resulting traceback]

Since util.py depends on Wsets.py, we cannot simply import WSP2W from util.py at module level. Any idea how to resolve this?
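
A sketch of the usual workaround: import WSP2W lazily inside w_clip, so the two modules can keep referencing each other. The body below is illustrative (_clip_to_sparse is a hypothetical stand-in for the existing clipping logic), not the real implementation:

def w_clip(w1, w2, outSP=True):
    wc = _clip_to_sparse(w1, w2)                     # hypothetical helper returning a WSP
    if not outSP:
        from libpysal.weights.util import WSP2W      # deferred import breaks the Wsets <-> util cycle
        wc = WSP2W(wc)
    return wc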

import not working after local install

After installing with

python setup.py install

imports are failing

Python 2.7.13 |Continuum Analytics, Inc.| (default, Dec 20 2016, 23:09:15)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> import libpysal
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named libpysal

attach_islands assumes transform='b' & assumes k=1

It'd be nice to be able to use more than 1 nearest neighbor for the islands, use different kinds of supplemental weights, or use non-binary transforms.

Functionally, I think this conceptually involves:

  1. Ensuring the "supplement" weights match the transform of the target weights (probably usually binary, since this seems to be used mostly for rook/queen with islands).
  2. Merging the neighbors/weights of islands in the target with their corresponding neighbors/weights in the supplement:

I believe this mostly looks the same as what's implemented:

import copy

from libpysal.weights import W

def attach_islands(target, supplement):
    # the supplement must index the same observations and share the target's transform
    assert supplement.id_order == target.id_order
    supplement.transform = target.transform
    neighbors, weights = copy.deepcopy(target.neighbors), copy.deepcopy(target.weights)
    for island_ix in target.islands:
        # replace the island's empty entries with those from the supplement
        neighbors[island_ix] = supplement.neighbors[island_ix]
        weights[island_ix] = supplement.weights[island_ix]
    return W(neighbors, weights, id_order=target.id_order)

BUG: test_weights_IO.py is using pysal and hard-coded paths

This test file has two issues that will raise failures if the tests are run:

  1. It is using pysal, not libpysal
  2. Directory paths are hard-coded to `C:1st`

I’m not sure why Travis didn’t flag these when the merge was made?

When I run the tests locally I get:

ERROR: Failure: ModuleNotFoundError (No module named 'pysal')
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/serge/anaconda3/lib/python3.6/site-packages/nose/failure.py", line 39, in runTest
    raise self.exc_val.with_traceback(self.tb)
  File "/home/serge/anaconda3/lib/python3.6/site-packages/nose/loader.py", line 417, in loadTestsFromName
    addr.filename, addr.module)
  File "/home/serge/anaconda3/lib/python3.6/site-packages/nose/importer.py", line 47, in importFromPath
    return self.importFromDir(dir_path, fqname)
  File "/home/serge/anaconda3/lib/python3.6/site-packages/nose/importer.py", line 94, in importFromDir
    mod = load_module(part_fqname, fh, filename, desc)
  File "/home/serge/anaconda3/lib/python3.6/imp.py", line 234, in load_module
    return load_source(name, filename, file)
  File "/home/serge/anaconda3/lib/python3.6/imp.py", line 172, in load_source
    module = _load(spec)
  File "<frozen importlib._bootstrap>", line 675, in _load
  File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 205, in _call_with_frames_removed
  File "/home/serge/Documents/p/pysal/src/libpysal/libpysal/weights/tests/test_weights_IO.py", line 2, in <module>
    import pysal
ModuleNotFoundError: No module named 'pysal'

----------------------------------------------------------------------
Ran 165 tests in 11.996s

FAILED (SKIP=9, errors=1)

`contains_point` cannot handle nested exterior rings.

I hit this bug working on the WKT serialization/deserialization for the labelled array GSoC.

Take the following example with two polygons. The first multipolygon is a double-torus defined by a square ring between 1 and .75, and another square ring between .5 and .25. The other multipolygon is a single torus with an island, defined by a square between 1 and .75, and an inner square at .5.

Our contains_point implementation bails when any hole contains the point, which is accurate for the double-torus, but incorrect for the single-torus-with-island:

>>> import pysal as ps
>>> double_torus = ps.cg.Polygon(parts= [[(-1, -1), (-1,1),(1,1),(1,-1)],  #exterior square at 1
                                        [(-.5,-.5), (-.5, .5),(.5,.5),(.5,-.5)]],  #exterior square at .5
                                 holes= [[(-.75, -.75),(-.75,.75),(.75,.75),(.75,-.75)], #hole square at .75
                                        [(-.25,-.25),(-.25,.25),(.25, .25),(.25,-.25)]]) #hole square at .25
>>> single_island_in_torus =  ps.cg.Polygon(parts= [[(-1, -1), (-1,1),(1,1),(1,-1)],  #exterior square at 1
                                        [(-.5,-.5), (-.5, .5),(.5,.5),(.5,-.5)]],  #exterior square at .5
                                 holes= [[(-.75, -.75),(-.75,.75),(.75,.75),(.75,-.75)], #hole square at .75
                                         ]) #inner square is solid
>>> double_torus.contains_point((0,0)) #is hollow, so origin is not contained
False
>>> single_island_in_torus.contains_point((0,0)) #should contain origin, since the inner square is solid
False

Again, if any hole contains the point, the algorithm bails, meaning any concentric multi-polygon will have incorrect results for point-in-polygon searches.

In fact, I don't think we can do a correct point-in-polygon search on multipolygons without establishing a ring-hole nesting.

For example, take a point P and a multipolygon M composed of two exterior rings, A and B, and one hole, H. Let P be contained in B. Stating the rings in OGC style, if M := ((A H), B), then P in H is not sufficient to exclude P from M, since P in (B intersection H) implies P in M.

Also, no clear even-odd rule exists for rings with no topological sorting, since checking ring/hole membership of P in either M := ((A H), B) or M := (A, (B H)) is indeterminate; "P is contained in two exterior rings and one hole" is ambiguous about P and M, since we don't know whether H contains B.

Solution

Sort the rings of a polygon topologically and record them in OGC style. Then, conduct a level-set membership test, where a naive point-in-polygon ring test can be applied walking down the topological sorting. The "top" ring governs membership; if the "top" ring is a hole, the point is not inside. Otherwise, the point is inside.
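
A self-contained sketch of the idea: with properly nested, non-crossing rings that alternate exterior/hole in OGC style, membership reduces to an even-odd count of how many rings (exterior or hole alike) contain the point. The function names are illustrative, not the library's API:

def ring_contains_point(ring, point):
    """Even-odd (ray casting) test for a single ring given as a list of (x, y) tuples."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        (x1, y1), (x2, y2) = ring[i], ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):                      # edge crosses the horizontal ray through the point
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def multipolygon_contains_point(exteriors, holes, point):
    """The point is inside iff an odd number of rings contain it (assumes proper nesting)."""
    depth = sum(ring_contains_point(r, point) for r in list(exteriors) + list(holes))
    return depth % 2 == 1

Run against the two examples above, this returns False for the double torus (the origin sits inside four rings) and True for the single torus with an island (three rings).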

Import libpysal failed on python 3.5 and 3.6

In both Python 3.5 and 3.6 environments (on a Mac), an attempt to import libpysal after pip install libpysal results in the following error:

In [1]: import libpysal
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-1-3a2ef2a3cd6f> in <module>()
----> 1 import libpysal

/Users/weikang/anaconda/envs/py3/lib/python3.6/site-packages/libpysal/__init__.py in <module>()
     21     Tools for creating and manipulating weights
     22 """
---> 23 import cg
     24 import io
     25 import weights

ModuleNotFoundError: No module named 'cg'

Import is fine in a python 2.7 environment.

ENH: shared perimeter contiguity weighting.

When we're coming from geodataframes, we can probably implement a shared-perimeter weighting scheme pretty easily. This means that, instead of contiguity being a binary relation, we'd assign each weight in the adjacency graph according to how much of the focal polygon's perimeter lies along the shared edge.

For Queen weights this would necessarily reduce them to Rook weights, since vertex-only neighbors share a zero-length boundary, so we would implement this only for Rook.

  1. do a first pass for binary contiguity
  2. for neighbors:
    1. poly_focal.intersection(poly_neighbor).length / poly_focal.boundary.length would give the asymmetric perimeter weight
    2. poly_focal.intersection(poly_neighbor).length / (poly_focal.boundary.length + poly_neighbor.length) would give a symmetrized perimeter weight

This would be an interesting addition to the library. It could be implemented as a function in util and then applied at the end of initialisation for Rook.from_shapefile or from_dataframe if perimeter=True; a sketch follows.
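
A hedged sketch of that post-processing pass, using columbus and the default (positional) ids from from_dataframe:

import geopandas
from libpysal import examples, weights

gdf = geopandas.read_file(examples.get_path('columbus.shp'))
w = weights.Rook.from_dataframe(gdf)                            # step 1: binary contiguity

perimeter_weights = {}
for focal, neighbors in w.neighbors.items():
    focal_geom = gdf.geometry.iloc[focal]                       # ids are positional here
    perimeter_weights[focal] = [
        focal_geom.intersection(gdf.geometry.iloc[j]).length / focal_geom.boundary.length
        for j in neighbors                                      # step 2.1: asymmetric weight
    ]
w_perimeter = weights.W(w.neighbors, perimeter_weights, id_order=w.id_order)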

libpysal/libpysal/cg/__init__.py not importing `rtree`

  • Platform information:
    posix darwin
  • Python version:
    3.6.6
  • SciPy version:
    1.1.0
  • NumPy version:
    1.15.0

rtree is currently not being imported during __init__ (though it is being deleted), causing failures in spaghetti.utils.py.

from .shapes import *
from .standalone import *
from .locators import *
from .kdtree import *
from .sphere import *
from .voronoi import *
from .alpha_shapes import alpha_shape, alpha_shape_auto

del rtree
del kdtree
del locators
del voronoi
del standalone
del alpha_shapes
del shapes
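
Presumably the fix is to restore the missing star import alongside the others (which also gives the trailing del rtree something to delete); this is a guess based on the issue text, not a confirmed patch:

from .rtree import *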

weights.Voronoi is a function, not a class.

This thread discusses this rough edge.

It is an open question whether we should document this for the 2.0 release and then change it (mandating another subsequent change to the documentation), or change it first (attempting to avoid it spilling out into the rest of the library) and then document the 2.0 release.

quadtree files

It looks like there is some additional material that isn't integrated with the rest of the repo structure.

There's unique/distinct data in libpysal/cg/tests/data and some images in libpysal/cg/tests/img. The data should live in examples, and we shouldn't ship the images at all. Also, those are used in an ipynb file inside of the test directory which should also not be shipped in libpysal.

core.util.WKTParser.fromWKT does not correctly determine holes

AFAICT, the WKTParser in core.util.wkt never constructs polygons with holes, since it never passes a holes keyword argument.

This came up because I've gotten a MultiPolygon parser working (I think) on the examples at the bottom of that file. In testing it, I was correctly parsing the Multipolygon, but the Polygon parser was never correctly determining what was the exterior ring and what was the interior ring.

Page 2-8 of the OGC Simple Features specification says Polygons should be:

1 exterior ring and 0 or more interior boundaries.

And, as far as I can tell, the exterior ring is always listed first, with an arbitrary number of holes afterwards; I can't find that stated explicitly in the spec.
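
A hedged sketch of the fix: treat the first parsed ring as the exterior and pass the rest through the holes keyword (keyword usage mirrors the Polygon calls in the contains_point issue above):

from libpysal.cg import Polygon

def polygon_from_rings(rings):
    """rings: list of coordinate lists, exterior ring first per the OGC layout."""
    exterior, interiors = rings[0], list(rings[1:])
    if interiors:
        return Polygon(exterior, holes=interiors)
    return Polygon(exterior)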

Request a new release of libpysal

libpysal relies on shapely as a dependency, but its installation procedure does not automatically install shapely.

libpysal's api.py collects functions/classes from across libpysal, both those that depend on shapely and those that do not. For users who only want the non-shapely-dependent functions and do not have shapely installed, import libpysal.api fails in the currently released version of libpysal. This is also a vital issue for giddy, which uses non-shapely-dependent functions via import libpysal.api.

Fortunately, this commit (only importing the shapely extension if shapely is available) resolves the issue. Thus it would be very important to release a new version of libpysal with this commit integrated.

Support for missing data and pandas dataframes

Original author: [email protected] (January 31, 2013 20:49:46)

It would be very useful for pysal to recognize and handle NaN values in NumPy arrays and/or pandas dataframes. Sometimes, it is not desirable to simply drop all observations with missing data, as these observations can be important when calculating spatial lags.

Related, it would also be helpful to use pandas indexing to align the spatial weights matrix or matrices with the variables. Again, this is primarily an issue because of missing data.

Thanks!

Original issue: http://code.google.com/p/pysal/issues/detail?id=239

redirect pysal/#934 to libpysal

pysal's #934 involves a fix for unsupported filetypes that removes a class of errors we didn't know we were making in core.

We need to port that commit here and also add a unit test with TestCase.assertRaises to ensure that the error is raised for unsupported filetypes.
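
A sketch of the kind of unit test meant here; the file extension and the broad Exception are placeholders until the ported fix settles the concrete error class:

import unittest
import libpysal

class TestUnsupportedFiletype(unittest.TestCase):
    def test_unsupported_extension_raises(self):
        with self.assertRaises(Exception):            # tighten once the ported error class is known
            libpysal.io.open('some_file.unsupported')

if __name__ == '__main__':
    unittest.main()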

only warn on disconnected components if there are no islands

When a graph has an island, we warn the user both that the graph is disconnected and that there is an island.

We should only warn about disconnected components when there is no island, since islands always result in disconnected components.
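
A minimal sketch of the proposed guard; the attribute names follow what I recall of the W class, so treat them as assumptions:

import warnings

def warn_on_connectivity(w):
    if w.islands:
        warnings.warn("There are %d islands (observations with no neighbors)." % len(w.islands))
    elif w.n_components > 1:
        # only reached when there are no islands, per the proposal above
        warnings.warn("The weights matrix is not fully connected: %d components." % w.n_components)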
